Presentation is loading. Please wait.

Presentation is loading. Please wait.

SAS Programming Techniques for Decoding Variables on the Database Level By Chris Speck PAREXEL International RTSUG – Wednesday, March 23, 2011.

Similar presentations


Presentation on theme: "SAS Programming Techniques for Decoding Variables on the Database Level By Chris Speck PAREXEL International RTSUG – Wednesday, March 23, 2011."— Presentation transcript:

1 SAS Programming Techniques for Decoding Variables on the Database Level By Chris Speck PAREXEL International RTSUG – Wednesday, March 23, 2011

2 Libraries can be viewed as discrete units. May require a change in perspective for the beginning and intermediate programmer. Crucial now that programmers are under greater pressure to apply regulatory standards to clinical research data. Programmers must now deal with metadata on the library level, which can be especially difficult with legacy data. One tool for this is the Meta-Engine

3 You’re given a library of 55 datasets of varying quality from 2001 which you must convert to submission ready data. Many datasets have variables with different kinds of non-native SAS formats. New variables equivalent to the decode of these formatted variables must be made. All non-native formats need to be stripped. What are you going to do?

4 One solution is to make one program per dataset. Another is to create one massive program that updates datasets one at a time. Some of the obvious flaws to this approach include Specific only to one project Involves much unnecessary rework Disorganized Difficult to debug

5 SAS program that manipulates metadata of entire libraries. Portable Streamlined Easy to understand and debug Loops through a library one dataset at a time to make quick and uniform changes. Relies much on the SAS macro facility Dictionary tables Proc format

6 Meta-Engine Overview Decode( ) Macro Input: A WORK dataset with a formatted variable Output: A WORK dataset with the decode of this variable Meta-Engine Structure

7 Meta-Engine Overview Decode( ) Macro Input: A WORK dataset with a formatted variable Output: A WORK dataset with the decode of this variable Meta-Engine Structure %DO Looping Mechanism Uses PROC SQL and Dictionary Tables. %DO Looping Mechanism

8 Meta-Engine Overview Decode( ) Macro Input: A WORK dataset with a formatted variable Output: A WORK dataset with the decode of this variable Meta-Engine Structure %DO Looping Mechanism Uses PROC SQL and Dictionary Tables. %DO Looping Mechanism Code to Adjust Dataset labels

9 Meta-Engine Overview Decode( ) Macro Input: A WORK dataset with a formatted variable Output: A WORK dataset with the decode of this variable Meta-Engine Structure %DO Looping Mechanism Uses PROC SQL and Dictionary Tables. %DO Looping Mechanism Code to Adjust Dataset labels Code Calling the Decode() macro Does this once for every variable needing decoding.

10 Meta-Engine Overview Decode( ) Macro Input: A WORK dataset with a formatted variable Output: A WORK dataset with the decode of this variable Meta-Engine Structure %DO Looping Mechanism Uses PROC SQL and Dictionary Tables. %DO Looping Mechanism Code to Adjust Dataset labels Code Calling the Decode() macro Does this once for every variable needing decoding. Code to make further changes

11 Meta-Engine Overview The Meta-Engine macro asks for two library names: The one that contains the existing database, and the one that will contain the corrected, submission ready database. For example: %macro MetaEngine(lib=, outlib=); %mend MetaEngine; %MetaEngine(lib=MYLIB, outlib=MYNEWLIB); >

12 %DO Looping Mechanism proc sql noprint; select memname into :dsnames separated by '~' from dictionary.columns where libname="&LIB" and varnum=1; quit;

13 %DO Looping Mechanism proc sql noprint; select memname into :dsnames separated by '~' from dictionary.columns where libname="&LIB" and varnum=1; quit; Using Dictionary Tables to produce a list of datasets in the library separated by tildes (~)

14 %DO Looping Mechanism proc sql noprint; select memname into :dsnames separated by '~' from dictionary.columns where libname="&LIB" and varnum=1; quit; %let i = %eval(1); %do %while (%scan(&dsnames,&i,~) ne ); %let thisds =%scan(&dsnames,%eval(&i),~); %let i = %eval(&i+1); %end; Using Dictionary Tables to produce a list of datasets in the library separated by tildes (~)

15 %DO Looping Mechanism proc sql noprint; select memname into :dsnames separated by '~' from dictionary.columns where libname="&LIB" and varnum=1; quit; %let i = %eval(1); %do %while (%scan(&dsnames,&i,~) ne ); %let thisds =%scan(&dsnames,%eval(&i),~); %let i = %eval(&i+1); %end; Using Dictionary Tables to produce a list of datasets in the library separated by tildes (~) Macro variable THISDS will represent a dataset name for every loop iteration

16 %DO Looping Mechanism proc sql noprint; select memname into :dsnames separated by '~' from dictionary.columns where libname="&LIB" and varnum=1; quit; %let i = %eval(1); %do %while (%scan(&dsnames,&i,~) ne ); %let thisds =%scan(&dsnames,%eval(&i),~); %let i = %eval(&i+1); %end; Using Dictionary Tables to produce a list of datasets in the library separated by tildes (~) Macro variable THISDS will represent a dataset name for every loop iteration Loops till you run out of tildes

17 %DO Looping Mechanism proc sql noprint; select memname into :dsnames separated by '~' from dictionary.columns where libname="&LIB" and varnum=1; quit; %let i = %eval(1); %do %while (%scan(&dsnames,&i,~) ne ); %let thisds =%scan(&dsnames,%eval(&i),~); %let i = %eval(&i+1); %end; Using Dictionary Tables to produce a list of datasets in the library separated by tildes (~) Macro variable THISDS will represent a dataset name for every loop iteration Loops till you run out of tildes >

18 Adjusting Dataset Labels %let dslabel=; proc sql noprint; select memlabel into :dslabel from dictionary.tables where libname="&lib" and memname="&thisds"; quit; %let dslabel=%trim(&dslabel);

19 Adjusting Dataset Labels %let dslabel=; proc sql noprint; select memlabel into :dslabel from dictionary.tables where libname="&lib" and memname="&thisds"; quit; %let dslabel=%trim(&dslabel); Data Steps won’t save them

20 Adjusting Dataset Labels %let dslabel=; proc sql noprint; select memlabel into :dslabel from dictionary.tables where libname="&lib" and memname="&thisds"; quit; %let dslabel=%trim(&dslabel); Data Steps won’t save them Resetting label macro variable before each loop iteration

21 Adjusting Dataset Labels %let dslabel=; proc sql noprint; select memlabel into :dslabel from dictionary.tables where libname="&lib" and memname="&thisds"; quit; %let dslabel=%trim(&dslabel); Data Steps won’t save them Resetting label macro variable before each loop iteration Dictionary Table assigns label of THISDS to macro variable DSLABEL. Used to assign dataset label to final dataset.

22 Adjusting Dataset Labels %let dslabel=; proc sql noprint; select memlabel into :dslabel from dictionary.tables where libname="&lib" and memname="&thisds"; quit; %let dslabel=%trim(&dslabel); %if &thisds=DATA1 %then %let dslabel=Label for DATA1; %else %if &thisds=DATA2 %then %let dslabel=Label for DATA2; Data Steps won’t save them Resetting label macro variable before each loop iteration In case you want to manually adjust dataset labels (not in paper) Dictionary Table assigns label of THISDS to macro variable DSLABEL. Used to assign dataset label to final dataset.

23 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end;

24 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS

25 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS Will equal list of all decoded variables for later processing

26 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS DECODE MACRO Parameters: Will equal list of all decoded variables for later processing

27 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS DECODE MACRO Parameters: Format library Will equal list of all decoded variables for later processing

28 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS DECODE MACRO Parameters: Format library Input dataset Will equal list of all decoded variables for later processing

29 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS DECODE MACRO Parameters: Format library Input dataset Output dataset (with 1 new decode variable) Will equal list of all decoded variables for later processing

30 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS DECODE MACRO Parameters: Format library Input dataset Output dataset (with 1 new decode variable) Variable to be decoded Will equal list of all decoded variables for later processing

31 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS DECODE MACRO Parameters: Format library Input dataset Output dataset (with 1 new decode variable) Variable to be decoded Decode variable name Will equal list of all decoded variables for later processing

32 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS DECODE MACRO Parameters: Format library Input dataset Output dataset (with 1 new decode variable) Variable to be decoded Decode variable name Will equal list of all decoded variables for later processing Builds decode list

33 Calling the Decode Macro data ds0; set &lib..&thisds; run; %let fv=; %if &thisds=DATA1 %then %do; %decode(fmtlib=&lib, ds=ds0, newds=ds0_00, var=D1VAR1, newvar=D1VAR1C); %let fv=&fv D1VAR1; %end; Base dataset equal to THISDS DECODE MACRO Parameters: Format library Input dataset Output dataset (with 1 new decode variable) Variable to be decoded Decode variable name Will equal list of all decoded variables for later processing Builds decode list Used DS0_00 numbering scheme with NEWDS parameter because it will be the parameter DS in the next macro call, producing DS0_001. Final product should be DS1.

34 Calling the Decode Macro How it would appear in real code

35 Decode Macro proc sql noprint; select format into :fmt from dictionary.columns where libname="WORK" and memname=%upcase("&ds") and name="&var"; quit; 1. Finds variable’s format using SAS Dictionary table COLUMNS and assigns it to the FMT macro variable.

36 Decode Macro proc sql noprint; select format into :fmt from dictionary.columns where libname="WORK" and memname=%upcase("&ds") and name="&var"; quit; proc format noprint cntlout=fmt (keep=length) library=%upcase(&fmtlib) fmtlib; select %substr(&fmt,1,%length(&fmt)-1); run; 2. Gets full range of &FMT format using FMTLIB option. Saves to dataset FMT. LENGTH is max length of format value. 1. Finds variable’s format using SAS Dictionary table COLUMNS and assigns it to the FMT macro variable.

37 Decode Macro proc sql noprint; select format into :fmt from dictionary.columns where libname="WORK" and memname=%upcase("&ds") and name="&var"; quit; proc format noprint cntlout=fmt (keep=length) library=%upcase(&fmtlib) fmtlib; select %substr(&fmt,1,%length(&fmt)-1); run; Gets full range of &FMT format using FMTLIB option. Saves to dataset FMT. 1. Finds variable’s format using SAS Dictionary table COLUMNS and assigns it to the FMT macro variable. Why do we use a substring function here? 2. Gets full range of &FMT format using FMTLIB option. Saves to dataset FMT. LENGTH is max length of format value.

38 Decode Macro proc sql noprint; select format into :fmt from dictionary.columns where libname="WORK" and memname=%upcase("&ds") and name="&var"; quit; proc format noprint cntlout=fmt (keep=length) library=%upcase(&fmtlib) fmtlib; select %substr(&fmt,1,%length(&fmt)-1); run; Gets full range of &FMT format using FMTLIB option. Saves to dataset FMT. 1. Finds variable’s format using SAS Dictionary table COLUMNS and assigns it to the FMT macro variable. Why do we use a substring function here? To remove the trailing period 2. Gets full range of &FMT format using FMTLIB option. Saves to dataset FMT. LENGTH is max length of format value.

39 Decode Macro data _null_; set fmt; if _n_=1 then call symput('len',cats(put(length,best.))); run; 2.5. Assigns max length of format to &LEN to prevent truncation.

40 Decode Macro data _null_; set &ds; if _n_=1 then do; if length(vlabel(&var))<=38 then call symput('newlabel',cats(vlabel(&var))||"-C"); else if length(vlabel(&var))=39 then call symput('newlabel',cats(vlabel(&var))||"C"); else if length(vlabel(&var))>=40 then call symput('newlabel',substr(cats(vlabel(&var)), 1,length(vlabel(&var))-1)||"C"); end; run; 3. Retrieves variable label with VLABEL

41 Decode Macro data _null_; set &ds; if _n_=1 then do; if length(vlabel(&var))<=38 then call symput('newlabel',cats(vlabel(&var))||"-C"); else if length(vlabel(&var))=39 then call symput('newlabel',cats(vlabel(&var))||"C"); else if length(vlabel(&var))>=40 then call symput('newlabel',substr(cats(vlabel(&var)), 1,length(vlabel(&var))-1)||"C"); end; run; 3. Retrieves variable label with VLABEL Adjusts label so decode variables will have unique labels. Truncates label if >40 characters. Assigns to macro variable &NEWLABEL

42 Decode Macro data &newds; length &newvar $&len; set &ds; &newvar=put(&var,&fmt); label &newvar="&newlabel"; run; 4. Creates output dataset (&NEWDS). Derives decode variable (&NEWVAR) with variable format (&FMT). Assigns it a length (&LEN) and a label (&NEWLABEL).

43 Decode Macro data &newds; length &newvar $&len; set &ds; &newvar=put(&var,&fmt); label &newvar="&newlabel"; run; proc datasets nolist lib=work memtype=data; delete fmt &ds; run; quit; 5. Garbage collection 4. Creates output dataset (&NEWDS). Derives decode variable (&NEWVAR) with variable format (&FMT). Assigns it a length (&LEN) and a label (&NEWLABEL).

44 Further Adjustments data ds2; set ds1; %if &thisds=DATA3 %then %do; label D3VAR1C="Trunc. decode label which was too long-C"; %end; run; For further information see my previous paper SAS Programming Techniques for Adjusting Metadata on the Database Level.

45 Completing Database Loop data &outlib..&thisds %if %length(&dslabel)>0 %then (label="&dslabel");; set ds2; format &fv; run; Creates final dataset in output library Assigns dataset label DS2 could be DS3 or any number depending on the adjustments performed. Strips formats off of decoded variables. Process repeats for every dataset in library.

46 Other Possibilities Possible to automate variable decoding. PROC SQL produces list of variables with non- native formats. List informs inner loop calling %DECODE() once for each variable. A gain in automation, a loss in adaptability. Not all formats exist in the same catalog. Not all variable names may be 8 characters long. Not all formatted variables may require decodes.

47 Other Possibilities The Meta-Engine can be tweaked depending on the task. Some ideas include: Testing libraries for SAS 5 compliance Excluding certain datasets Renaming datasets Splitting a dataset into two if it takes up too much memory. Adjusting dataset and variable metadata. See my previous paper SAS Programming Techniques for Adjusting Metadata on the Database Level.

48 The Meta-Engine offers a quick and streamlined approach for a programmer to begin thinking about metadata on the library or database level. Programmers can begin to manipulate whole libraries intuitively as if they were datasets. The Meta-Engine in its entirety plus further information can be found in my paper SAS Programming Techniques for Decoding Variables on the Database Level. Conclusion

49 Chris Speck, Senior Programmer PAREXEL International 2520 Meridian Parkway, Suite 200 Durham, NC 27713 Work Phone: 919 294 5018 Fax: 919 544 3698 chris.speck@parexel.com Contact Information


Download ppt "SAS Programming Techniques for Decoding Variables on the Database Level By Chris Speck PAREXEL International RTSUG – Wednesday, March 23, 2011."

Similar presentations


Ads by Google