Download presentation
Presentation is loading. Please wait.
Published byEvangeline Cannon Modified over 10 years ago
1
This is the Modern World: Simple, Overlooked SAS® Enhancements Bruce Gilsen Federal Reserve Board June 2009
2
Introduction Smaller, less dramatic enhancements fall thru cracks at Federal Reserve (SAS consultant for 24 years) SAS-L Conferences Users still use more cumbersome methods Enhancements from V92, V9, V8, V7, even V6! Impetus: yymmddn8. in earlier paper
3
Underutilized enhancements Write date values in form yyyymmdd Increment date values w/INTNX function Transport files: PROC CPORT/CIMPORT vs PROC COPY w/EXPORT engine Count # times character or substring occurs in a character string Concatenate character strings Sort by numeric part of character values Retrieve DB2 data on z/OS mainframes
4
1. Write date values in form yyyymmdd Lots of informats, formats to read/write date values 9.1.3: 22 date informats, 46 date formats Most common input form: yyyymmdd YYMMDD8. informat reads dates in form yyyymmdd Most common output form (overwhelmingly): yyyymmdd YYMMDD8. format writes dates as yy-mm-dd YYMMDD10. format writes dates as yyyy-mm-dd
5
Writing dates as yyyymmdd, pre-V8 Some date output data one; date1 = "18oct2008"d; * 10/18/2008 date value; put date1 yymmdd8.; * writes 08-10-18 ; put date1 yymmdd10.; * writes 2008-10-18 ; Before Version 8 PUT: character string of form yyyy-mm-dd COMPRESS: remove dashes datetemp = compress(put(date1,yymmdd10.),'-'); put datetemp $8. ; * writes 20081018 ;
6
Writing dates as yyyymmdd, V8 New format, YYMMDDxw x: optional separator w: number of bytes x, one of: B blank N none C colon P period D dash S slash date1 = "18oct2008"d; * 10/18/2008 date value; put date1 yymmddn8.; * writes 20081018 ;
7
YYMMDDxw format, examples data _null_ ; date1 = "18oct2008"d; * 10/18/08 date value ; put date1 yymmddb8.; * 08 10 18 ; put date1 yymmddc8.; * 08:10:18 ; put date1 yymmddd10.; * 2008-10-18 ; put date1 yymmddn8.; * 20081018 ; put date1 yymmddp8.; * 08.10.18 ; put date1 yymmdds10.; * 2008/10/18 ; put date1 yymmddn6.; * 081018 ;
8
2. Increment date values with INTNX INTNX returns SAS ® date value incremented by number of intervals (days, months, quarters, years,…) SAS date value: integer, number of days before or after January 1, 1960 December 31, 1959 -1 January 1, 1960 0 January 2, 1960 1... October 18, 2008 16,727
9
INTNX Syntax (informal, incomplete) INTNX(interval, start-from, increment ) interval: unit (years, months, days,...) start-from is incremented by start-from: SAS date value to increment increment: # of intervals start-from is incremented by alignment: where start-from aligned after incrementing. Values: BEGINNING, MIDDLE, END, SAMEDAY (v9). Default: BEGINNING.
10
Before Version 9: 2 INTNX issues Alignment BEGINNING (start of period) Leap year handling
11
INTNX issue 1: alignment to start of period Increment by 2 days, months, years Unexpected result for months, years data one; date1 = ‘18oct2008'd; dateplus2day = intnx(‘day',date1,2); dateplus2month = intnx(‘month',date1,2); dateplus2year = intnx(‘year',date1,2); run; Variable Description Expect Actual dateplus2day 10/18/08+2 days 10/20/08 10/20/08 dateplus2month 10/18/08+2 months 12/18/08 12/1/08 dateplus2year 10/18/08+2 years 10/18/10 1/1/10
12
dateplus2day = intnx(‘day',date1,2); date1 = ‘18oct2008'd; = 17,823, date value for 10/18/2008 Increment by 2 days First, increment by 2 days Then, align to start of 10/20/2008 (beginning of interval, DAY), no effect DATEPLUS2DAY = 17,825, date value for 10/20/2008 as expected
13
dateplus2month = intnx(‘month',date1,2); date1 = ‘18oct2008'd; = 17,823, date value for 10/18/2008 Increment by 2 months First, increment by 2 months Then, align to start of 12/2008 (beginning of interval, MONTH) DATEPLUS2MONTH = 17,867, date value for 12/1/2008
14
dateplus2year = intnx(‘year',date1,2); date1 = ‘18oct2008'd; = 17,823, date value for 10/18/2008 Increment by 2 years First, increment by 2 years Then, align to start of 2010 (beginning of interval, YEAR) DATEPLUS2MONTH = 18,263, date value for 1/1/2010
15
INTNX issue 2: leap years Example of shifting by years offset = 2; * number of years to increment; if day(date1) = 29 and month(date1) = 2 and mod(offset,4) ne 0 and mod(offset, 400) ne 0 then date1 = mdy(2, 28, year(date1)+offset); else date1 = mdy(month(date1), day(date1), year(date1)+offset) ; Need more generalized code for all frequencies
16
Solution: SAMEDAY New alignment value in V9 Preserves date value’s alignment in interval after it’s incremented No effect when interval = DAY Use for all intervals except DAY
17
Solution: SAMEDAY dateplus2day = intnx(‘day’, date1, 2, “sameday”) 2 days after 10/18/2008 Result: 17,825, SAS date value for 10/20/2008 dateplus2month = intnx(‘month’, date1, 2, “sameday”) 2 months after 10/18/2008 Result: 17,884, SAS date value for 12/18/2008 dateplus2year = intnx(‘year’, date1, 2, “sameday”) 2 years after 10/18/2008 Result: 18,553, SAS date value for 10/18/2010
18
Some interesting dates date2 = intnx(‘year’, ’29feb2000’d, 1, “sameday”) 1 year after 2/29/2000 Result: 15,034, SAS date value for 2/28/2001 date3 = intnx(‘year’, ’29feb2000’d, 4, “sameday”) 4 years after 2/29/2000 Result: 16,130, SAS date value for 2/29/2004 2000, 2004 = leap years, 2001 not leap year
19
Some interesting dates date4 = intnx(‘month’, ’31mar2003’d, -1, “sameday”) 1 month before 3/31/2003 Result: 15,764, SAS date value for 2/28/2003 date5 = intnx(‘month’, ’31mar2004’d, -1, “sameday”) 1 month before 3/31/2004 Result: 16,130, SAS date value for 2/29/2004 2000, 2004 = leap years, 2003 not leap year
20
SAMEDAY until 9.2: single, non-shifted date intervals only Following intervals might return wrong answer, no error or warning multiple (e.g., month2 = two-month interval) shifted (e.g., month.4 = month interval starting on April 1) time datetime Interval = DAY, WEEK, WEEKDAY, TENDAY, SEMIMONTH, MONTH, QTR, SEMIYEAR, YEAR okay Fixed in 9.2 http://support.sas.com/techsup/unotes/SN/016/016184.html
21
3. Transport files: PROC CPORT/CIMPORT vs PROC COPY w/EXPORT engine Transport file File organized in machine-independent format SAS can read on any operating system Use to copy SAS data sets from one operating system to another Starting w/ Version 6, two types of transport files XPORT-style transport files CPORT-style transport files
22
XPORT-style transport files: 3 steps 1. On original host, create transport file with data set(s) libname inn 'directory_with_data_set(s)' ; libname trans xport 'transport_file_to_create'; proc copy in=inn out=trans ; run ; 2. Copy transport file to destination host with ftp command or another method 3. On destination host, transport file ---> SAS data set(s) libname trans xport 'transport_file' ; libname sasdata1 ‘directory_for_data_sets‘; proc copy in=trans out=sasdata1 ; run ;
23
CPORT-style transport files: 3 steps 1. On original host, create transport file with data set(s) libname sasdata1 'directory_with_SAS_data_set(s)'; filename trans 'transport_file_to_create'; proc cport library=sasdata1 file=trans ; run ; 2. Copy transport file to destination host with ftp command or another method 3. On destination host, transport file ---> SAS data set(s) filename trans 'transport_file' ; libname sasdata1 ‘directory_for_data_sets' ; proc cimport infile=trans library=sasdata1 ; run ;
24
Version 7 changes and transport files Version 7 changes Variable name maximum length: 8 32 Character variable maximum length: 200 32768 Variable names display based on “first reference” Transport files XPORT-style created in Version 6 format CPORT-style understands Version 7 changes Always use CPORT-style
25
XPORT-style transport files, Version 7 and beyond options validvarname=v6; Documented in V8, undocumented but works in V9 Empty data set if any character variable length more than 200, error in log Variable names longer than 8 characters truncated, unique Variable names converted to upper case Otherwise Empty data set if any character variable length more than 200 or variable name longer than 8 characters, error in log Variable names converted to upper case
26
4. Count number of times a character or substring occurs in a character string Typical objective, count number of times "ab" occurs in "ab1aba344444abb555ab4“ (4 times) "a” or “b" occur in "ab1aba344444abb555ab4“ (10 times) Pre-Version 9: iterative methods (loops) Version 9 COUNT: # times substring occurs in a character string COUNTC: # times 1 or more characters occur in a character string
27
Count occurrences of substring "ab" in "ab1aba344444abb555ab4" data one; char1 = "ab1aba344444abb555ab4"; * character variable to check; * Before Version 9 ; num_ab = 0; * counter for number of occurrences ; loc=index(char1,"ab"); * find 1st occurrence ; do while (loc ne 0); num_ab+1; * count occurrences of "ab" ; char1 = substr(char1,loc+2); * resume search after "ab" ; loc = index(char1, "ab"); * find next occurrence ; end; * Or, a single statement in Version 9 ; num_ab = count(char1,"ab"); * Version 9 ; run;
28
Count occurrences of characters "a" or "b" in "ab1aba344444abb555ab4" data one; char1 = "ab1aba344444abb555ab4"; * character variable to check; * Before Version 9 ; num_a_or_b = 0; * counter for number of occurrences ; loc=indexc(char1,"ab"); * find 1st occurrence ; do while (loc ne 0); num_a_or_b+1; * count occurrences of "a” or “b“ ; char1 = substr(char1,loc+1); * resume search after "a” or “b" ; loc = indexc(char1, "ab"); * find next occurrence ; end; * Or, a single statement in Version 9 ; num_a_or_b = countc(char1,"ab"); * Version 9 ; run;
29
Countw: Version 9.2 Counts number of words in a character string Can also replace iterative programming
30
5. Concatenate character strings Pre-Version 9 || to concatenate TRIM removes trailing blanks TRIM(LEFT) removes leading and trailing blanks Version 9, new functions concatenate character strings CAT does not remove leading or trailing blanks CATT removes trailing blanks CATS removes leading and trailing blanks CATX removes leading and trailing blanks, inserts delimiter (separator) between each argument Note difference in length of result
31
data one; length char1 char2 char3 $8.; char1="xxxxxxxx"; char2=" yyy"; * 1 leading blank ; char3="zzzz"; * Leading and trailing blanks included; cat1 = char1 || char2 || char3; cat1 = cat(char1, char2, char3); * V9 ; * Remove trailing blanks ; cat2 = trim(char1) || trim(char2) || trim(char3); cat2 = catt(char1, char2, char3); * V9 ; cat1=xxxxxxxx yyy zzzz cat2=xxxxxxxx yyyzzzz
32
data one; length char1 char2 char3 $8.; char1="xxxxxxxx"; char2=" yyy"; * 1 leading blank ; char3="zzzz"; * Remove leading and trailing blanks ; cat3 = trim(left(char1)) || trim(left(char2)) || trim(left(char3)); cat3 = cats(char1, char2, char3); cat3=xxxxxxxxyyyzzzz
33
Important difference: length of result data one; length char1 char2 char3 $8.; char1="xxxxxxxx"; char2=" yyy"; * 1 leading blank ; char3="zzzz"; cat1 = char1 || char2 || char3; cat1 = cat(char1, char2, char3); etc. If length of result (e.g., CAT1) not defined ||: sum of length of contributing variables: 8+8+8=24 CAT, CATT, CATS, CATX: 200
34
data one; length char1 char2 char3 $8.; char1="xxxxxxxx"; char2=" yyy"; * 1 leading blank ; char3="zzzz"; aa = '123'; * Delimiter: value of AA, can vary by observation ; char_catx1 = catx(aa, char1, char2); * Delimiter: comma ; char_catx2 = catx(',', char1, char2, char3); * Delimiter: '123' ; char_catx3 = catx('123', char1, char2, char3); * Delimiter: single blank ; char_catx4 = catx(' ', char1, char2, char3); run; char_catx1=xxxxxxxx123yyy char_catx2=xxxxxxxx,yyy,zzzz char_catx3=xxxxxxxx123yyy123zzzz char_catx4=xxxxxxxx yyy zzzz
35
6. Sort by the numeric part of character values Variable BANK: bank2 bank100 bank1 bank10 Conventional PROC SORT left to right character by character comparison "1" in "bank100" less than "2" in "bank2“ Wanted: sort integer values w/in text by numeric value Conventional sort order Desired sort order bank1 bank10 bank2 bank100 bank10 bank2 bank100
36
Before Version 9.2: one method Make new data set with variable w/numeric part of BANK Sort by new variable This example assumes numeric part of value always starts at same location (5th character) at most three digits More general case slightly more complicated data two; set one; banktemp = input(substr(bank,5),3.); run; proc sort data=two out=three; by banktemp; run;
37
Version 9.2: linguistic collation SORTSEQ=LINGUISTIC + NUMERIC_COLLATION sorts integer values w/in text by numeric part of value Don’t create extra data set proc sort data=one out=two sortseq=linguistic (numeric_collation=on); by bank; run; Sort order for BANK: bank1 bank2 bank10 bank100
38
Numeric collation not just at end of character values ADDRESS values East 14th Street East 2nd Street Conventional sort: “1” less than “2” Sortseq=linguistic (numeric_collation=on): 2 less than 14 Conventional sort order Linguistic collation order East 14th Street East 2nd Street East 2nd Street East 14th Street
39
7. Retrieve DB2 data on z/OS mainframes Old methods PROC DB2EXT PROC ACCESS Modern methods SAS/ACCESS LIBNAME statement Pass-Through Facility
40
PROC DB2EXT Since Version 6 (1990), undocumented but works In Version 9, message NOTE: Beginning with SAS version 10, DB2EXT will no longer be available. “Grandfathered” to Version 6 Long DB2 column names truncated to 8 characters Algorithm ensures unique variable names
41
PROC ACCESS Introduced in Version 6 (1990) Use access and view descriptors Documented and supported, not recommended PROC CV2VIEW: convert access and view descriptors to SQL views “Grandfathered” to Version 6 Long DB2 column names truncated to 8 characters Algorithm ensures unique variable names
42
SAS/ACCESS LIBNAME statement Introduced in Version 7 Assign libref directly to DB2 database Easiest method Long DB2 column names processed directly
43
Pass-through facility Introduced in Version 6 Retrieve DB2 data by sending DBMS-specific SQL statements to DB2 DBMS = Database Management System Long DB2 column names Processed directly unless options validvarname=v6; options validvarname=v6; Long name truncated to 8 characters, with unique names as per PROC DB2EXT, PROC ACCESS Documented in V8, undocumented but works in V9
44
Retrieving data: 4 methods options db2ssid=dsn; /* 1. PROC DB2EXT */ run; proc db2ext out=new1; select cntry_nm from nicua.cuv_country_nm; run; proc access dbms=db2; /* 2. PROC ACCESS */ create work.new1.access; table = nicua.cuv_country_nm; ssid = dsn; create work.new1.view; select cntry_nm; run;
45
Retrieving data: 4 methods /* 3. SAS/ACCESS to DB2 LIBNAME statement */ libname in1 db2 ssid=dsn authid=nicua; data new1; set in1.cuv_country_nm (keep = cntry_nm); run; proc sql; /* 4. Pass_Through Facility */ connect to db2 (ssid=dsn); create table new1 as select * from connection to db2 (select cntry_nm from nicua.cuv_country_nm); disconnect from db2; quit;
46
Notes SAS/ACCESS LIBNAME statement vs Pass-Through Facility Both recommended SAS/ACCESS LIBNAME statement easiest Pass-Through Facility can be more efficient, especially for complex queries Compared in SAS/ACCESS manual If convert from PROC DB2EXT or PROC ACCESS DB2 column names could be longer In subsequent code: RENAME or change code
47
For more information, please contact Bruce Gilsen Federal Reserve Board, mail stop 157 Washington, DC 20551 phone: 202-452-2494 e-mail: bruce.gilsen@frb.gov SAS is a registered trademark or trademark of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.