Download presentation

Presentation is loading. Please wait.

Published byAlisha Langmaid Modified over 2 years ago

1
Planting Your Rows: Using SAS ® Formats to Make the Generation of Zero-Filled Rows in Tables Less Thorny Kathy Hardis Fraeman HASUG Q3 2012

2
Planting your rows with SAS ® Formats One of many clever tricks that can be done with SAS formats Techniques used for this trick are not unique to this trick, can be applied to other programming tricks I SAS formats!

3
When to use this trick with SAS formats Need tables with all possible values of a variable or variables All possible values of the variable dont exist in the data 3

4
Solution without using SAS Formats Hard code all possible values of a variable in the program Disadvantages –Tedious –Potential for large number of variables and/or values –Possible values could change over time 4

5
Solution using SAS Formats Attach a format to each variable with all possible values described Advantages –Dynamic –Only need to include variable values in the SAS format 5

6
SAS Programming Techniques 1. Saving a permanent SAS format library 2. Attach a format to a variable 3. Determine the format attached to a variable –SCL –SQL 4. Identify the data values defined in the attached format using CNTLOUT 6

7
Technique 1 Saving a permanent SAS format library libname library ">>pathname<<"; proc format library = library;... run; 7

8
Permanent SAS format library 8 Named formats Stored as a SAS Catalog

9
SAMPLE DATA Obs employee year num dollar 1 Hall FY 2008 10 $10,000.00 2 Hall FY 2010 15 $15,500.00 3 Oates FY 2008 8 $500.00 4 Brooks FY 2008 15 $11,111.00 5 Brooks FY 2010 20 $12,345.67 6 Abbott FY 2008 50 $75,757.00 7 Abbott FY 2010 75 $99,999.99 8 Costello FY 2008 33 $33,333.00 9 Costello FY 2010 44 $44,444.44 9

10
Variables with attached formats EMPLOYEE and YEAR are not text variables Variables have attached formats –EMPLOYEE has an attached format emplfmt. –YEAR has an attached format yearfmt. 10

11
SAS Formats attached to variables proc format library = library; value emplfmt 1 = "Hall" 2 = "Oates" 3 = "Brooks" 4 = "Dunn" 5 = "Abbott" 6 = "Costello" ; value yearfmt 2008 = "FY 2008" 2009 = "FY 2009" 2010 = "FY 2010" ; run; 11

12
Technique 2 Attach format to the variable data in.sales; set sales; format employee emplfmt. year yearfmt. ; run; 12

13
Obs employee year num dollar 1 Hall FY 2008 10 $10,000.00 2 Hall FY 2010 15 $15,500.00 3 Oates FY 2008 8 $500.00 4 Brooks FY 2008 15 $11,111.00 5 Brooks FY 2010 20 $12,345.67 6 Abbott FY 2008 50 $75,757.00 7 Abbott FY 2010 75 $99,999.99 8 Costello FY 2008 33 $33,333.00 9 Costello FY 2010 44 $44,444.44 Dunn hasnt sold anything No sales by anyone in FY 2009 Oates didnt sell anything in 2010 13

14
REPORT WITH MISSING ROWS Number Amount of Year Employee of Sales Sales ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ FY 2008 Hall 10 $10,000.00 Oates 8 $500.00 Brooks 15 $11,111.00 Abbott 50 $75,757.00 Costello 33 $33,333.00 FY 2010 Hall 15 $15,500.00 Brooks 20 $12,345.67 Abbott 75 $99,999.99 Costello 44 $44,444.44 14

15
Technique 3 Determine a Variables Format Dont need to hard code the name of a variables format unless in a FORMAT statement Can use two techniques to determine a variables format –%SYSFUNC with SCL –A dictionary table with PROC SQL 15

16
Method 1 -- %SYSFUNC with SAS SCL %let dsid = %sysfunc(open(IN.SALES, i)); %let varnum = %sysfunc(varnum(&dsid,EMPLOYEE)); %let format = %sysfunc(varfmt(&dsid, &varnum)); %let rc = %sysfunc(close(&dsid)); %put EMPLOYEE VARIABLE FORMAT = &format; 999 %put EMPLOYEE VARIABLE FORMAT = &format; EMPLOYEE VARIABLE FORMAT = EMPLFMT. Note that the. in the format name is included in the macro variable &format 16

17
Method 2 – PROC SQL DICTIONARY TABLES proc sql; create table formats as select format from dictionary.columns where upcase(libname) = 'IN' and upcase(memname) = 'SALES' and upcase(name) = ('EMPLOYEE') ; quit; 17

18
PROC SQL Dictionary Table Output proc print data=formats; run; Obs format 1 EMPLFMT. Again note that the. in the format name is included in the value of the variable 18

19
Technique 4 Identify all values of a SAS format CNTLOUT Option of PROC FORMAT –FMTNAME – name of the format –START – starting value of the format –LABEL – descriptive label associated with the value of START 19

20
PROC FORMAT Code proc format library = library cntlout = formatlib (keep = fmtname start label); run; proc print data = formatlib; run; 20

21
Output data set FORMATLIB Obs FMTNAME START LABEL 1 EMPLFMT 1 Hall 2 EMPLFMT 2 Oates 3 EMPLFMT 3 Brooks 4 EMPLFMT 4 Dunn 5 EMPLFMT 5 Abbott 6 EMPLFMT 6 Costello 7 YEARFMT 2008 FY 2008 8 YEARFMT 2009 FY 2009 9 YEARFMT 2010 FY 2010 21

22
Output data set FORMATLIB Note that the variable with the name of the format FMTNAME does not have a. at the end Can add a. to the end fmtname = cats(fmtname,.); 22

23
Putting it all together – finding all possible values for variables using attached formats Uses previously described techniques Can be done with SCL or SQL Complete code is provided 23

24
SAS Macro to put it all together Two macros, one for each technique –%SYSFUNC with SCL –PROC SQL Dictionary tables Both macros use CNTLOUT Both macros input name of variable, output values of variable in a data set %getfmt1(var=employee, outvals=empvals); %getfmt1(var=year, outvals=yearvals); 24

25
Method 1 – SCL %macro getfmt1(var=, outvals=); %let dsid = %sysfunc(open(IN.SALES,i)); /* Input data is hard coded*/ %let varnum = %sysfunc(varnum(&dsid, &var)); %let varfmt = %sysfunc(varfmt(&dsid, &varnum)); %let rc = %sysfunc(close(&dsid)); %put &varfmt; proc format library = library cntlout = &outvals (keep = fmtname start label where = (cats(fmtname,'.') = "&varfmt")); run; title "Data set &outvals from macro GETFMT1 -- SYSFUNC and SCL"; proc print data = &outvals; run; %mend getfmt1; %getfmt1(var=employee, outvals=empvals); %getfmt1(var=year, outvals=yearvals); 25

26
Method 2 – SQL (part 1) %macro getfmt2(var=, outvals=); proc sql; create table &var.fmt as select format from dictionary.columns where upcase(libname) = 'IN' and /* Input data is hard coded*/ upcase(memname) = 'SALES' and upcase(name) = upcase("&var") ; quit; data _null_; set &var.fmt; call symputX("varfmt", format, 'L'); run; %put &varfmt; 26

27
Method 2 – SQL (part 2) proc format library = library cntlout = &outvals (keep = fmtname start label where = (cats(fmtname,'.') = "&varfmt")); run; title "Data set &outvals from macro GETFMT2 SQL Dictionary Table"; proc print data = &outvals; run; %mend getfmt2; %getfmt2(var=employee, outvals=empvals); %getfmt2(var=year, outvals=yearvals); 27

28
Output Data from %GETFMT1 (SCL) Data set empvals from macro GETFMT1 -- SYSFUNC and SCL Obs FMTNAME START LABEL 1 EMPLFMT 1 Hall 2 EMPLFMT 2 Oates 3 EMPLFMT 3 Brooks 4 EMPLFMT 4 Dunn 5 EMPLFMT 5 Abbot 6 EMPLFMT 6 Costello Data set yearvals from macro GETFMT1 -- SYSFUNC and SCL Obs FMTNAME START LABEL 1 YEARFMT 2008 FY 2008 2 YEARFMT 2009 FY 2009 3 YEARFMT 2010 FY 2010 28

29
Output Data from %GETFMT2 (SQL) Data set empvals from macro GETFMT2 SQL Dictionary Table Obs FMTNAME START LABEL 1 EMPLFMT 1 Hall 2 EMPLFMT 2 Oates 3 EMPLFMT 3 Brooks 4 EMPLFMT 4 Dunn 5 EMPLFMT 5 Abbot 6 EMPLFMT 6 Costello Data set yearvals from macro GETFMT2 SQL Dictionary Table Obs FMTNAME START LABEL 1 YEARFMT 2008 FY 2008 2 YEARFMT 2009 FY 2009 3 YEARFMT 2010 FY 2010 29

30
Data set with all possible values of both variables Uses data sets created by macros –EMPVALS – all possible values of EMPLOYEE –YEARVALS – all possible values of YEAR 30

31
Data set with all possible values of both variables Combined with SQL join to create a data set will all 6 possible values of the variable EMPLOYEE with all 3 possible values of the variable YEAR Name of combined data set is ALLROWS ALLROWS has 6 x 3 = 18 observations 31

32
SQL Join to create ALLROWS data yearvals_mod (keep = year a); set yearvals; year = input(trim(left(start)),8.); a = 1; run; data empvals_mod (keep = employee b); set empvals; employee = input(trim(left(start)),8.); b = 1; run; proc sql; create table allrows as select year, employee from yearvals_mod y, empvals_mod e where y.a = e.b; quit; proc sort data=allrows; by year employee; run; 32

33
Data set ALLROWS All rows needed for table Formats will automatically be added later in the merge Obs year employee 1 2008 1 2 2008 2 3 2008 3 4 2008 4 5 2008 5 6 2008 6 7 2009 1 8 2009 2 9 2009 3 10 2009 4 11 2009 5 12 2009 6 13 2010 1 14 2010 2 15 2010 3 16 2010 4 17 2010 5 18 2010 6 33

34
Create data set for report proc sort data=in.sales out=sales; by year employee; run; data sales_all; merge allrows (in=a) sales (in=s); by year employee; /*------------------------------*/ /* Zero-fill the missing rows /*------------------------------*/ if a and ^s then do; num = 0; dollar = 0; end; run; 34

35
Data set SALES_ALL Input data for report Obs year employee num dollar 1 FY 2008 Hall 10 $10,000.00 2 FY 2008 Oates 8 $500.00 3 FY 2008 Brooks 15 $11,111.00 4 FY 2008 Dunn 0 $0.00 5 FY 2008 Abbott 50 $75,757.00 6 FY 2008 Costello 33 $33,333.00 7 FY 2009 Hall 0 $0.00 8 FY 2009 Oates 0 $0.00 9 FY 2009 Brooks 0 $0.00 10 FY 2009 Dunn 0 $0.00 11 FY 2009 Abbott 0 $0.00 12 FY 2009 Costello 0 $0.00 13 FY 2010 Hall 15 $15,500.00 14 FY 2010 Oates 0 $0.00 15 FY 2010 Brooks 20 $12,345.67 16 FY 2010 Dunn 0 $0.00 17 FY 2010 Abbott 75 $99,999.99 18 FY 2010 Costello 44 $44,444.44 35

36
REPORT WITHOUT MISSING ROWS Number Amount of Year Employee Sales Sales ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ FY 2008 Hall 10 $10,000.00 Oates 8 $500.00 Brooks 15 $11,111.00 Dunn 0 $0.00 Abbott 50 $75,757.00 Costello 33 $33,333.00 FY 2009 Hall 0 $0.00 Oates 0 $0.00 Brooks 0 $0.00 Dunn 0 $0.00 Abbott 0 $0.00 Costello 0 $0.00 FY 2010 Hall 15 $15,500.00 Oates 0 $0.00 Brooks 20 $12,345.67 Dunn 0 $0.00 Abbott 75 $99,999.99 Costello 44 $44,444.44 36

37
Conclusions Use of SAS formats can improve the programming process Techniques used for this programming trick can be adapted to other programming tricks 37

Similar presentations

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google