Chapter 21 Reading Hierarchical Files Reading Hierarchical Raw Data Files.


1 Chapter 21 Reading Hierarchical Files Reading Hierarchical Raw Data Files

2 Objectives
– Read data with mixed record types.
– Read a hierarchical file and create one observation per detail record.
– Read a hierarchical file and create one observation per header record.

3 Mixed Record Types
Not all records have the same format.
    101 USA 1-20-1999 3295.50
    3034 EUR 30JAN1999 1876,30
    101 USA 1-30-1999 2938.00
    128 USA 2-5-1999 2908.74
    1345 EUR 6FEB1999 3145,60
    109 USA 3-17-1999 2789.10
Multiple INPUT statements are needed, with a conditional statement controlling which one reads each record.

4 Desired Output
    Sales ID   Location   Sale Date    Amount
    101        USA            14264   3295.50
    3034       EUR            14274   1876.30
    101        USA            14274   2938.00
    128        USA            14280   2908.74
    1345       EUR            14281   3145.60
    109        USA            14320   2789.10
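The date column in the desired output holds unformatted SAS date values, which count days since 01JAN1960. A minimal sketch for checking one of them, using only base SAS (the variable name d is illustrative and not part of the slides):

```sas
data _null_;
   /* A date literal is stored as the number of days since 01JAN1960 */
   d = '20JAN1999'd;
   put d=;            /* writes the unformatted numeric value to the log */
   put d= date9.;     /* writes the same value using the DATE9. format */
run;
```

The first PUT statement should write d=14264 to the log, which matches the first row of the desired output.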

5 The INPUT Statement
Multiple INPUT statements are needed to read different formats of the same variable:
    input SalesID $ Location $;
    if Location='USA' then
       input SaleDate : mmddyy10. Amount;
    else if Location='EUR' then
       input SaleDate : date9. Amount : commax8.;

6 The INPUT Statement
    NOTE: 6 records were read from the infile 'sales.dat'.
          The minimum record length was 24.
          The maximum record length was 26.
    NOTE: The data set WORK.SALES has 3 observations and 4 variables.
This is NOT correct. The data set should have 6 observations, not 3.

7 Undesirable Output
    Sales ID   Location   Sale Date   Amount
    101        USA                .        .
    1345       EUR                .        .
This is NOT correct. There should be 6 observations, not 3. In addition, every Sale Date and Amount value is missing.

8 The program:
    input SalesID $ Location $;
    if Location='USA' then
       input SaleDate : mmddyy10. Amount;
    else if Location='EUR' then
       input SaleDate : date9. Amount : commax8.;
The raw data:
    101 USA 1-20-1999 3295.50
    3034 EUR 30JAN1999 1876,30
    101 USA 1-30-1999 2938.00
    128 USA 2-5-1999 2908.74
    1345 EUR 6FEB1999 3145,60
    109 USA 3-17-1999 2789.10
NOTE: Each INPUT statement loads a new record. The INPUT statement selected by the IF condition therefore reads the next record instead of the rest of the current one.
The output:
    Sales ID   Location   Sale Date   Amount
    101        USA                .        .
    1345       EUR                .        .

9 Use the Single Trailing @
Use the single trailing @ when the same record must be read by more than one INPUT statement.
The single trailing @ option holds a raw data record in the input buffer until SAS
– executes an INPUT statement with no trailing @, or
– reaches the bottom of the DATA step.
General form of an INPUT statement with the single trailing @:
    INPUT var1 var2 var3 … @;

10 Processing the Trailing @
    input SalesID $ Location $ @;      /* Hold record for next INPUT statement. */
    if Location='USA' then
       input SaleDate : mmddyy10. Amount;
    else if Location='EUR' then
       input SaleDate : date9. Amount : commax8.;
The second INPUT statement has no trailing @, so the hold is released and the next iteration loads the next record.

11-19 Processing the Trailing @: Compilation and Execution
    data sales;
       length SalesID $ 4 Location $ 3;
       infile 'raw-data-file';
       input SalesID $ Location $ @;
       if Location='USA' then
          input SaleDate : mmddyy10. Amount;
       else if Location='EUR' then
          input SaleDate : date9. Amount : commax8.;
    run;
Raw data file:
    101 USA 1-20-1999 3295.50
    3034 EUR 30JAN1999 1876,30
    101 USA 1-30-1999 2938.00
    128 USA 2-5-1999 2908.74
    1345 EUR 6FEB1999 3145,60
    109 USA 3-17-1999 2789.10
– Slide 11, Compile: SAS creates the input buffer and a PDV containing SalesID, Location, SaleDate, and Amount.
– Slide 12, Execute: the PDV is initialized. SaleDate and Amount are missing (.).
– Slides 13-14: the first record, 101 USA 1-20-1999 3295.50, is loaded into the input buffer.
– Slide 15: the first INPUT statement reads SalesID=101 and Location=USA, and the trailing @ holds the record. The condition Location='USA' is true.
– Slide 16: the second INPUT statement reads the held record: SaleDate=14264 and Amount=3295.50.
– Slide 17: implicit output. SAS writes the observation to sales.
– Slide 18: implicit return to the top of the DATA step.
– Slide 19: processing continues until the end of the raw data file.

20 Mixed Record Types
Partial Log
    NOTE: 6 records were read from the infile 'sales.dat'.
          The minimum record length was 24.
          The maximum record length was 26.
    NOTE: The data set WORK.SALES has 6 observations and 4 variables.

21 Mixed Record Types
    proc print data=sales noobs;
    run;
PROC PRINT Output
    Sales ID   Location   Sale Date    Amount
    101        USA            14264   3295.50
    3034       EUR            14274   1876.30
    101        USA            14274   2938.00
    128        USA            14280   2908.74
    1345       EUR            14281   3145.60
    109        USA            14320   2789.10
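The program can be tried without an external file. The sketch below is one way to do that, assuming the records are supplied in-stream with DATALINES instead of 'raw-data-file'. The FORMAT statement in PROC PRINT is also an addition that is not in the slides, included only so the dates print in readable form.

```sas
data sales;
   length SalesID $ 4 Location $ 3;
   infile datalines;                       /* assumption: in-stream data replaces the raw data file */
   input SalesID $ Location $ @;           /* trailing @ holds the record for the next INPUT */
   if Location='USA' then
      input SaleDate : mmddyy10. Amount;   /* US layout: m-d-yyyy date, decimal point */
   else if Location='EUR' then
      input SaleDate : date9. Amount : commax8.;   /* European layout: ddMONyyyy date, decimal comma */
   datalines;
101 USA 1-20-1999 3295.50
3034 EUR 30JAN1999 1876,30
101 USA 1-30-1999 2938.00
128 USA 2-5-1999 2908.74
1345 EUR 6FEB1999 3145,60
109 USA 3-17-1999 2789.10
;
run;

proc print data=sales noobs;
   format SaleDate date9.;   /* assumption: display format added for readability */
run;
```

Removing the trailing @ from the first INPUT statement in this sketch is a quick way to reproduce the incorrect three-observation result discussed earlier.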

22 Subsetting from a Raw Data File
This scenario uses the raw data file from the previous example.
    101 USA 1-20-1999 3295.50
    3034 EUR 30JAN1999 1876,30
    101 USA 1-30-1999 2938.00
    128 USA 2-5-1999 2908.74
    1345 EUR 6FEB1999 3145,60
    109 USA 3-17-1999 2789.10

23 Desired Output
The sales manager wants to see sales for the European branch only.
    Sales ID   Location   Sale Date    Amount
    3034       EUR            14274   1876.30
    1345       EUR            14281   3145.60

24 The Subsetting IF Statement
    data europe;
       length SalesID $ 4 Location $ 3;
       infile 'raw-data-file';
       input SalesID $ Location $ @;
       if Location='USA' then
          input SaleDate : mmddyy10. Amount;
       else if Location='EUR' then
          input SaleDate : date9. Amount : commax8.;
       if Location='EUR';
    run;
This works, but it is not efficient: it reads every field of every record first and only then selects the EUR records.

25 The Subsetting IF Statement
The subsetting IF should appear as early in the program as possible, but after the variables used in the condition have been read or calculated. In this case, we should read only the EUR records by placing the IF statement right after Location is read.

26 The Subsetting IF Statement
Because the program reads only European sales, the INPUT statement for USA sales is not needed.
    data europe;
       length SalesID $ 4 Location $ 3;
       infile 'raw-data-file';
       input SalesID $ Location $ @;
       if Location='EUR';
       input SaleDate : date9. Amount : commax8.;
    run;

27 The Subsetting IF Statement
    proc print data=europe noobs;
    run;
    Sales ID   Location   Sale Date    Amount
    3034       EUR            14274   1876.30
    1345       EUR            14281   3145.60
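A subsetting IF behaves like a conditional DELETE with the opposite condition. The sketch below shows that alternative form. It assumes in-stream data through DATALINES instead of the raw data file, and the DELETE variant is an illustration rather than something the slides use.

```sas
data europe;
   length SalesID $ 4 Location $ 3;
   infile datalines;                      /* assumption: in-stream data replaces the raw data file */
   input SalesID $ Location $ @;          /* read only the fields needed for the test */
   /* Same effect as the subsetting IF:  if Location='EUR';  */
   if Location ne 'EUR' then delete;      /* stop this iteration, write nothing */
   input SaleDate : date9. Amount : commax8.;
   datalines;
101 USA 1-20-1999 3295.50
3034 EUR 30JAN1999 1876,30
101 USA 1-30-1999 2938.00
128 USA 2-5-1999 2908.74
1345 EUR 6FEB1999 3145,60
109 USA 3-17-1999 2789.10
;
run;

proc print data=europe noobs;
run;
```

In both forms the remaining fields of a USA record are never parsed, which is the efficiency gain the slides describe.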

28 Processing Hierarchical Files
Many files are hierarchical in structure, consisting of
– a header record
– one or more related detail records.
Typically, each record contains a field that identifies whether it is a header record or a detail record.
    Header
       Detail
    Header
       Detail
    Header
       Detail

29 Processing Hierarchical Files
You can read a hierarchical file into a SAS data set by creating one observation per detail record and storing the header information as part of each observation.
Hierarchical File:
    Header 1
       Detail 1
       Detail 2
       Detail 3
    Header 2
       Detail 1
    Header 3
       Detail 1
       Detail 2
SAS Data Set:
    Header Variables   Detail Variables
    Header 1           Detail 1
    Header 1           Detail 2
    Header 1           Detail 3
    Header 2           Detail 1
    Header 3           Detail 1
    Header 3           Detail 2

30 Processing Hierarchical Files
You can also create one observation per header record and store the information from detail records in summary variables.
Hierarchical File:
    Header 1
       Detail 1
       Detail 2
       Detail 3
    Header 2
       Detail 1
    Header 3
       Detail 1
       Detail 2
SAS Data Set:
    Header Variables   Summary Variables
    Header 1           Summary 1
    Header 2           Summary 2
    Header 3           Summary 3

31 Creating One Observation Per Detail
The raw data file dependents has a header record containing the name of the employee and a detail record for each dependent on the employee’s health insurance.
    E:Adams:Susan
    D:Michael:C
    D:Lindsay:C
    E:Porter:David
    D:Susan:S
    E:Lewis:Dorian D.
    D:Richard:C
    E:Dansky:Ian
    E:Nicholls:James
    D:Roberta:C
    E:Slaydon:Marla
    D:John:S
Record type: E = Employee, D = Dependent. Relation: C = Child, S = Spouse. Data values are separated by a colon (:).

32 Desired Output
Personnel wants a list of all the dependents and the name of the associated employee.
    EmpLName   EmpFName    DepName   Relation
    Adams      Susan       Michael   C
    Adams      Susan       Lindsay   C
    Porter     David       Susan     S
    Lewis      Dorian D.   Richard   C
    Nicholls   James       Roberta   C
    Slaydon    Marla       John      S

33 A Hierarchical File
– Not all the records are the same.
– The fields are separated by colons.
– There is a field indicating whether the record is a header or a detail record.
    E:Adams:Susan
    D:Michael:C
    D:Lindsay:C
    E:Porter:David
    D:Susan:S
    E:Lewis:Dorian D.
    D:Richard:C
    E:Dansky:Ian
    E:Nicholls:James
    D:Roberta:C
    E:Slaydon:Marla
    D:John:S

34 How to Read the Hierarchical Data
    input Type $ @;
    if Type='E' then
       input EmpLName $ EmpFName $;
    else
       input DepName $ Relation $;

35 How to Output Only the Dependents
Try this program and observe what is wrong with the result.
    input Type $ @;
    if Type='E' then
       input EmpLName $ EmpFName $;
    else do;
       input DepName $ Relation $;
       output;
    end;

36-52 Compilation and Execution Without RETAIN
    data dependents(drop=Type);
       length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1;
       infile 'raw-data-file' dlm=':';
       input Type $ @;
       if Type='E' then
          input EmpLName $ EmpFName $;
       else do;
          input DepName $ Relation $;
          output;
       end;
    run;
Raw data file:
    E:Adams:Susan
    D:Michael:C
    D:Lindsay:C
    E:Porter:David
    D:Susan:S
    E:Lewis:Dorian D.
    D:Richard:C
    E:Dansky:Ian
    E:Nicholls:James
    D:Roberta:C
    E:Slaydon:Marla
    D:John:S
– Slide 36, Compile: SAS creates the input buffer and a PDV containing Type, EmpLName, EmpFName, DepName, and Relation. Type is flagged D (dropped).
– Slide 37, Execute: the PDV is initialized to missing.
– Slide 38: the first record, E:Adams:Susan, is loaded into the input buffer.
– Slide 39: INPUT reads Type=E, and the trailing @ holds the record.
– Slide 40: the condition Type='E' is true.
– Slides 41-42: the second INPUT statement reads EmpLName=Adams and EmpFName=Susan.
– Slide 43: no implicit output, because the DATA step contains an explicit OUTPUT statement.
– Slide 44: implicit return to the top of the DATA step.
– Slides 45-46: SAS reinitializes the PDV. EmpLName and EmpFName are set back to missing.
– Slide 47: the second record, D:Michael:C, is loaded into the input buffer.
– Slide 48: INPUT reads Type=D, and the trailing @ holds the record.
– Slide 49: the condition Type='E' is false.
– Slide 50: INPUT reads DepName=Michael and Relation=C.
– Slide 51: explicit output. SAS writes the observation to dependents with EmpLName and EmpFName missing.
– Slide 52: implicit return to the top of the DATA step.

53 Undesirable Output
    EmpLName   EmpFName   DepName   Relation
                          Michael   C
                          Lindsay   C
                          Susan     S
                          Richard   C
                          Roberta   C
                          John      S
EmpLName and EmpFName are missing in every observation, because SAS reinitializes them at the top of the DATA step before each detail record is read.

54 The RETAIN Statement (Review)
General form of the RETAIN statement:
    RETAIN variable-name;
The RETAIN statement prevents SAS from reinitializing the values of new variables at the top of the DATA step. This means that values from previous records are available for processing.
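The effect of RETAIN can be seen in a small stand-alone step. This is only an illustrative sketch: the data set name retain_demo and the variables x, Kept, and NotKept are hypothetical and do not appear in the slides.

```sas
data retain_demo;
   retain Kept;            /* Kept is not reset at the top of each iteration */
   input x;
   if x > 0 then do;       /* a missing x fails this test */
      Kept    = x;
      NotKept = x;         /* NotKept is reset to missing at the top of each iteration */
   end;
   datalines;
5
.
.
7
.
;
run;

proc print data=retain_demo noobs;
run;
```

In the second and third observations Kept should still be 5 while NotKept is missing, which is the same carry-forward behavior the employee name variables need.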

55 The RETAIN Statement
    data dependents(drop=Type);
       length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1;
       retain EmpLName EmpFName;      /* Hold EmpLName and EmpFName */
       infile 'raw-data-file' dlm=':';
       input Type $ @;
       if Type='E' then
          input EmpLName $ EmpFName $;
       else do;
          input DepName $ Relation $;
          output;
       end;
    run;

56-73 Compilation and Execution With RETAIN
    data dependents(drop=Type);
       length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1;
       retain EmpLName EmpFName;
       infile 'raw-data-file' dlm=':';
       input Type $ @;
       if Type='E' then
          input EmpLName $ EmpFName $;
       else do;
          input DepName $ Relation $;
          output;
       end;
    run;
Raw data file:
    E:Adams:Susan
    D:Michael:C
    D:Lindsay:C
    E:Porter:David
    D:Susan:S
    E:Lewis:Dorian D.
    D:Richard:C
    E:Dansky:Ian
    E:Nicholls:James
    D:Roberta:C
    E:Slaydon:Marla
    D:John:S
– Slide 56, Compile: SAS creates the input buffer and the PDV. EmpLName and EmpFName are flagged R (retained), and Type is flagged D (dropped).
– Slide 57, Execute: the PDV is initialized to missing.
– Slide 58: the first record, E:Adams:Susan, is loaded into the input buffer.
– Slide 59: INPUT reads Type=E, and the trailing @ holds the record.
– Slide 60: the condition Type='E' is true.
– Slides 61-62: the second INPUT statement reads EmpLName=Adams and EmpFName=Susan.
– Slide 63: no implicit output, because the DATA step contains an explicit OUTPUT statement.
– Slide 64: implicit return to the top of the DATA step.
– Slides 65-66: SAS reinitializes the PDV. Type, DepName, and Relation are set to missing, but the retained variables keep EmpLName=Adams and EmpFName=Susan.
– Slide 67: the second record, D:Michael:C, is loaded into the input buffer.
– Slide 68: INPUT reads Type=D, and the trailing @ holds the record.
– Slide 69: the condition Type='E' is false, so INPUT reads DepName=Michael and Relation=C.
– Slides 70-71: explicit output. SAS writes Adams, Susan, Michael, C to dependents.
– Slide 72: implicit return to the top of the DATA step.
– Slide 73: processing continues until the end of the raw data file.

74 Creating One Observation Per Detail
    proc print data=work.dependents noobs;
    run;
PROC PRINT Output (correct result)
    EmpLName   EmpFName    DepName   Relation
    Adams      Susan       Michael   C
    Adams      Susan       Lindsay   C
    Porter     David       Susan     S
    Lewis      Dorian D.   Richard   C
    Nicholls   James       Roberta   C
    Slaydon    Marla       John      S
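To experiment with this result, the program can be run against in-stream data. The sketch below assumes DATALINES in place of the raw data file, which is a substitution made here for convenience. The DATA step logic is otherwise the program with RETAIN shown in the slides.

```sas
data dependents(drop=Type);
   length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1;
   retain EmpLName EmpFName;            /* carry the employee name across detail records */
   infile datalines dlm=':';            /* assumption: in-stream data replaces the raw data file */
   input Type $ @;                      /* read the record type and hold the record */
   if Type='E' then
      input EmpLName $ EmpFName $;      /* header record: remember the employee */
   else do;
      input DepName $ Relation $;       /* detail record: read the dependent */
      output;                           /* one observation per dependent */
   end;
   datalines;
E:Adams:Susan
D:Michael:C
D:Lindsay:C
E:Porter:David
D:Susan:S
E:Lewis:Dorian D.
D:Richard:C
E:Dansky:Ian
E:Nicholls:James
D:Roberta:C
E:Slaydon:Marla
D:John:S
;
run;

proc print data=work.dependents noobs;
run;
```

Commenting out the RETAIN statement in this sketch reproduces the earlier listing in which the employee names are missing.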

75 Create One Observation Per Header Record
    E:E01442
    D:Michael:C
    D:Lindsay:C
    E:E00705
    D:Susan:S
    E:E01577
    D:Richard:C
    E:E00997
    E:E00955
    D:Roberta:C
    E:E00224
    D:John:S
– Insurance is free for the employees themselves.
– Each employee pays $50 per month for a spouse’s insurance.
– Each employee pays $25 per month for a child’s insurance.

76 Desired Output
Personnel wants a list of all employees and their monthly payroll deductions for insurance.
    ID       Deduct
    E01442       50
    E00705       50
    E01577       25
    E00997        0
    E00955       25
    E00224       50

77 Calculating the Value of Deduct
    E:E01442
    D:Michael:C
    D:Lindsay:C
    E:E00705
    D:Susan:S
    E:E01577
    D:Richard:C
    E:E00997
    E:E00955
    D:Roberta:C
    E:E00224
    D:John:S
The value of Deduct changes according to the
– type of record read
– value of Relation when Type='D'.

78 78 Retaining ID
The values of ID and Deduct must be held across iterations of the DATA step.
– ID must be retained with a RETAIN statement.
– Deduct is created with a sum statement, which automatically retains the variable and initializes it to 0.
    retain ID;
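To make the retain behavior concrete, here is a small sketch contrasting a sum statement with an ordinary assignment. The data set name work.sumdemo and the variables Amount, Total, and Plain are hypothetical names made up for this illustration; they do not appear in the course program.

```sas
/* Sketch only: all names here are illustrative assumptions. */
data work.sumdemo;
   input Amount;
   Total + Amount;           /* sum statement: Total is retained and starts at 0 */
   Plain = Plain + Amount;   /* assignment: Plain is reset to missing each time  */
   datalines;
25
50
25
;
run;

proc print data=work.sumdemo noobs;
run;
```

Total should be 25, 75, and 100 on the three observations. Plain stays missing on every observation, because it is not retained and adding a number to a missing value with the + operator yields missing.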

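79 79 When to Output?
An observation is complete when the last record for an employee has been read, that is, just before the next E record or at the end of the file.
    E:E01442
    D:Michael:C
    D:Lindsay:C     End Observation 1
    E:E00705
    D:Susan:S       End Observation 2
    E:E01577
    D:Richard:C     End Observation 3
    E:E00997        End Observation 4
    E:E00955
    D:Roberta:C     End Observation 5
    E:E00224
    D:John:S        End Observation 6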

80 80 When SAS Loads a Type E Record
1. Output what is currently in the PDV (unless this is the first time through the DATA step).
2. Read the next employee's identification number.
3. Reset Deduct to 0.
    if Type='E' then do;
       if _N_ > 1 then output;
       input ID $;
       Deduct=0;
    end;
NOTE: When _N_ = 1, SAS is reading the first E record. No employee has been processed yet, so there is nothing to output.

81 81 When SAS Loads a Type D Record
1. Read the dependent's name and relationship.
2. Check the relationship.
3. Increment Deduct appropriately.
    else do;
       input DepName $ Relation $;
       if Relation='C' then Deduct+25;
       else Deduct+50;
    end;

82 82 The program so far:
    data work.insurance(drop=Type DepName Relation);
       length Type $ 1 ID $ 6 DepName $ 20 Relation $ 1;
       retain ID;
       infile 'raw-data-file' dlm=':';
       input Type $ @;
       if Type='E' then do;
          if _N_ > 1 then output;
          input ID $;
          Deduct=0;
       end;
       else do;
          input DepName $ Relation $;
          if Relation='C' then Deduct+25;
          else Deduct+50;
       end;
    run;

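83 83 What About the Last Record?
The last employee, E:E00224 with dependent D:John:S, is never followed by another E record, so the OUTPUT statement in the Type E block does not write it. Because the DATA step contains an explicit OUTPUT statement, there is no implicit output at the end of the step either.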
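The missing last observation can be reproduced with a short sketch. It assumes in-stream DATALINES as a stand-in for 'raw-data-file'; everything else is the program from slide 82.

```sas
/* Sketch only: DATALINES replaces 'raw-data-file' (an assumption). */
data work.insurance(drop=Type DepName Relation);
   length Type $ 1 ID $ 6 DepName $ 20 Relation $ 1;
   retain ID;
   infile datalines dlm=':';
   input Type $ @;
   if Type='E' then do;
      if _N_ > 1 then output;   /* writes the PREVIOUS employee */
      input ID $;
      Deduct=0;
   end;
   else do;
      input DepName $ Relation $;
      if Relation='C' then Deduct+25;
      else Deduct+50;
   end;
   datalines;
E:E01442
D:Michael:C
D:Lindsay:C
E:E00705
D:Susan:S
E:E01577
D:Richard:C
E:E00997
E:E00955
D:Roberta:C
E:E00224
D:John:S
;
run;

proc print data=work.insurance noobs;
run;
```

The listing should contain only five employees (E01442 through E00955). E00224 is read and its deduction is accumulated, but no statement ever writes it to the data set.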

84 84 Recall: The END= Option in the INFILE Statement
General form of the END= option:
    INFILE 'file-name' END=variable-name;
where variable-name is any valid SAS variable name. The END= option creates a variable that has the value
– 1 if the current record is the last record of the input file
– 0 otherwise.
Variables created with END= are automatically dropped.
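A minimal sketch of how the END= variable behaves follows. The fileref demo, the data set work.lastflag, and the variables Record, RecNo, and IsLast are hypothetical names chosen for this illustration. The sketch writes a small temporary file first, since the SAS documentation advises against combining END= with in-stream DATALINES.

```sas
/* Sketch only: all names below are illustrative assumptions. */
filename demo temp;               /* temporary external file */

data _null_;                      /* write three records to the file */
   file demo;
   put 'E:E01442' / 'D:Michael:C' / 'D:Lindsay:C';
run;

data work.lastflag;
   length Record $ 20;
   infile demo end=LastRec;
   input Record $;                /* no blanks in the data, so one value per record */
   RecNo  = _N_;
   IsLast = LastRec;              /* copy the flag; LastRec itself is dropped */
run;

proc print data=work.lastflag noobs;
run;
```

IsLast should be 0 for the first two records and 1 for the third. LastRec does not appear in the output data set, which is why its value is copied to IsLast here.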

85 85 The complete program, with END=LastRec added to the INFILE statement and a final OUTPUT for the last record:
    data work.insurance(drop=Type DepName Relation);
       length Type $ 1 ID $ 6 DepName $ 20 Relation $ 1;
       retain ID;
       infile 'raw-data-file' dlm=':' end=LastRec;
       input Type $ @;
       if Type='E' then do;
          if _N_ > 1 then output;
          input ID $;
          Deduct=0;
       end;
       else do;
          input DepName $ Relation $;
          if Relation='C' then Deduct+25;
          else Deduct+50;
       end;
       if LastRec then output;
    run;
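For readers who want to run the program without the course data file, the following sketch first writes the twelve sample records to a temporary file. The fileref rawins is a hypothetical name standing in for 'raw-data-file'; the DATA step itself is unchanged from the slide.

```sas
/* Sketch only: the fileref rawins is an assumption. */
filename rawins temp;

data _null_;                      /* create the sample raw data file */
   file rawins;
   put 'E:E01442' / 'D:Michael:C' / 'D:Lindsay:C'
     / 'E:E00705' / 'D:Susan:S'
     / 'E:E01577' / 'D:Richard:C'
     / 'E:E00997'
     / 'E:E00955' / 'D:Roberta:C'
     / 'E:E00224' / 'D:John:S';
run;

data work.insurance(drop=Type DepName Relation);
   length Type $ 1 ID $ 6 DepName $ 20 Relation $ 1;
   retain ID;
   infile rawins dlm=':' end=LastRec;
   input Type $ @;
   if Type='E' then do;
      if _N_ > 1 then output;     /* write the previous employee   */
      input ID $;
      Deduct=0;
   end;
   else do;
      input DepName $ Relation $;
      if Relation='C' then Deduct+25;
      else Deduct+50;
   end;
   if LastRec then output;        /* write the final employee      */
run;

proc print data=work.insurance noobs;
run;
```

The listing should show six employees, with E00997 at a deduction of 0 and E00224 at 50, matching the desired output on slide 76.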

86 86 Compile. SAS builds the PDV with the variables Type, ID, DepName, Relation, Deduct, _N_, and LastRec. Type, DepName, and Relation are flagged to be dropped (D) by the DROP= option; _N_ and LastRec are dropped automatically. ID is flagged as retained (R) by the RETAIN statement, and Deduct is retained because it is created by a sum statement.

87 87 Execute. Iteration 1 begins: _N_=1, LastRec=0, and Deduct=0 (initialized by the sum statement). Type, ID, DepName, and Relation are missing. Next statement: input Type $ @;

88 88 input Type $ @; loads the first record, E:E01442, into the input buffer.

89 89 input Type $ @; assigns Type=E. The trailing @ holds the record in the input buffer.

90 90 if Type='E' then do; is true.

91 91 if _N_ > 1 then output; is false because _N_=1, so nothing is written.

92 92 input ID $; reads ID=E01442 from the held record.

93 93 Deduct=0; sets Deduct to 0.

94 94 end; closes the DO group.

95 95 if LastRec then output; is false because LastRec=0.

96 96 Implicit return to the top of the DATA step.

97 97 Reinitialize PDV. Iteration 2 begins: _N_=2 and LastRec=0. ID=E01442 and Deduct=0 are retained; Type, DepName, and Relation are reset to missing.

98 98 Next statement: input Type $ @;

99 99 input Type $ @; loads the next record, D:Michael:C, into the input buffer.

100 100 input Type $ @; assigns Type=D. The trailing @ holds the record in the input buffer.

101 101 if Type='E' then do; is false, so the ELSE DO group executes.

102 102 input DepName $ Relation $; reads DepName=Michael and Relation=C from the held record.

103 103 if Relation='C' then Deduct+25; is true. Deduct becomes 0 + 25 = 25.

104 104 if LastRec then output; is false because LastRec=0.

105 105 Implicit return to the top of the DATA step.

106 106 Reinitialize PDV. Iteration 3 begins: _N_=3 and LastRec=0. ID=E01442 and Deduct=25 are retained; Type, DepName, and Relation are reset to missing.

107 107 Next statement: input Type $ @;

108 108 input Type $ @; loads the next record, D:Lindsay:C, into the input buffer.

109 109 input Type $ @; assigns Type=D. The trailing @ holds the record in the input buffer.

110 110 if Type='E' then do; is false, so the ELSE DO group executes.

111 111 input DepName $ Relation $; reads DepName=Lindsay and Relation=C from the held record.

112 112 if Relation='C' then Deduct+25; is true. Deduct becomes 25 + 25 = 50.

113 113 if LastRec then output; is false because LastRec=0.

114 114 Implicit return to the top of the DATA step.

115 115 Reinitialize PDV. Iteration 4 begins: _N_=4 and LastRec=0. ID=E01442 and Deduct=50 are retained; Type, DepName, and Relation are reset to missing.

116 116 Next statement: input Type $ @;

117 117 input Type $ @; loads the next record, E:E00705, into the input buffer.

118 118 input Type $ @; assigns Type=E. The trailing @ holds the record in the input buffer.

119 119 if Type='E' then do; is true.

120 120 if _N_ > 1 then output; is true because _N_=4.

121 121 Explicit output. SAS writes the observation for the previous employee (ID=E01442, Deduct=50) to insurance.

122 122 Processing continues in the same way through the rest of the file. On the last record, D:John:S, the PDV holds _N_=12, ID=E00224, DepName=John, Relation=S, Deduct=50, and LastRec=1, so if LastRec then output; writes the final observation before the implicit return.

123 123 Creating One Observation Per Header
    proc print data=insurance noobs;
    run;
PROC PRINT Output
    ID        Deduct
    E01442      50
    E00705      50
    E01577      25
    E00997       0
    E00955      25
    E00224      50

124 Exercise 1 Open program c21_1. Carefully check the data structure, and go through each program statement to make sure you know why the statement is needed. Run the program, and learn how to read hierarchical data.

