Chapter 21 Reading Hierarchical Files Reading Hierarchical Raw Data Files.


2 Objectives
– Read data with mixed record types.
– Read a hierarchical file and create one observation per detail record.
– Read a hierarchical file and create one observation per header record.

3 Mixed Record Types Not all records have the same format. 101 USA EUR 30JAN , USA USA EUR 6FEB , USA Multiple INPUT statements are needed, with a conditional statement controlling which one reads the rest of each record.

4 Desired Output Sales Sale ID Location Date Amount 101 USA EUR USA USA EUR USA

5 The INPUT Statement
Multiple INPUT statements are needed to read different formats of the same variable:
   input SalesID $ Location $;
   if Location='USA' then
      input SaleDate : mmddyy10. Amount;
   else if Location='EUR' then
      input SaleDate : date9. Amount : commax8.;

6 The INPUT Statement
NOTE: 6 records were read from the infile 'sales.dat'.
      The minimum record length was 24.
      The maximum record length was 26.
NOTE: The data set WORK.SALES has 3 observations and 4 variables.
This is NOT correct. There should be 6 observations, not 3.

7 Undesirable Output Sales Sale ID Location Date Amount 101 USA EUR..... This is NOT correct. There should be 6 observations, not 3. In addition, every SaleDate and Amount value is missing.

8 The program:
   input SalesID $ Location $;
   if Location='USA' then
      input SaleDate : mmddyy10. Amount;
   else if Location='EUR' then
      input SaleDate : date9. Amount : commax8.;
The raw data: 101 USA EUR 30JAN , USA USA EUR 6FEB , USA
NOTE: Each INPUT statement loads a new record. The INPUT statement selected by the IF condition therefore reads the record that follows the one that supplied Location, so two records are consumed for each observation.
The output: Sales Sale ID Location Date Amount 101 USA EUR..

9 The Single Trailing @
Use the single trailing @ when reading one record requires more than one INPUT statement. The single trailing @ holds a raw data record in the input buffer until SAS
– executes an INPUT statement with no trailing @, or
– reaches the bottom of the DATA step.
General form of an INPUT statement with the single trailing @:
   INPUT var1 var2 var3 @;
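As a minimal sketch of the single trailing @ in a complete DATA step, the program below reads in-stream records rather than an external file. The three data lines are made up for illustration (they are not the contents of sales.dat), and the FORMAT statement is added only so that the dates print readably.

```sas
/* Hypothetical in-stream data; not the course's raw data file. */
data sales;
   length SalesID $ 4 Location $ 3;
   infile datalines;
   /* Read the first two fields and hold the record in the input buffer. */
   input SalesID $ Location $ @;
   /* Finish reading the SAME record with the informats it requires. */
   if Location='USA' then
      input SaleDate : mmddyy10. Amount;
   else if Location='EUR' then
      input SaleDate : date9. Amount : commax8.;
   format SaleDate date9.;
   datalines;
201 USA 01-15-2020 1500.25
202 EUR 16JAN2020 1200,50
203 USA 02-01-2020 980.00
;
run;

proc print data=sales noobs;
run;
```

Because each second INPUT statement has no trailing @, the held record is released after it is read, and the next iteration of the DATA step loads a new record. Three records should therefore yield three observations.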

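10 Processing the Trailing @
   input SalesID $ Location $ @;
   if Location='USA' then
      input SaleDate : mmddyy10. Amount;
   else if Location='EUR' then
      input SaleDate : date9. Amount : commax8.;
The trailing @ on the first INPUT statement holds the record for the next INPUT statement. The second INPUT statement has no trailing @, so the next record is loaded when the DATA step iterates.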

11-19 Processing the Trailing @, step by step
data sales;
   length SalesID $ 4 Location $ 3;
   infile 'raw-data-file';
   input SalesID $ Location $ @;
   if Location='USA' then
      input SaleDate : mmddyy10. Amount;
   else if Location='EUR' then
      input SaleDate : date9. Amount : commax8.;
run;
Raw Data File: 101 USA EUR 30JAN , USA USA EUR 6FEB , USA
– Compile (slide 11): SAS sets up the input buffer and a PDV containing SalesID, Location, SaleDate, and Amount.
– Execute (slide 12): the PDV is initialized; SaleDate and Amount are missing (.).
– Slides 13-14: the first INPUT statement loads the first record into the input buffer.
– Slide 15: SalesID (101) and Location (USA) are read into the PDV, and the trailing @ holds the record. The condition Location='USA' is true.
– Slide 16: the second INPUT statement reads SaleDate and Amount from the held record.
– Slide 17: implicit output writes the observation to sales.
– Slide 18: implicit return to the top of the DATA step.
– Slide 19: processing continues until the end of the raw data file.

20 Mixed Record Types: Partial Log
NOTE: 6 records were read from the infile 'sales.dat'.
      The minimum record length was 24.
      The maximum record length was 26.
NOTE: The data set WORK.SALES has 6 observations and 4 variables.

21 Mixed Record Types proc print data=sales noobs; run; PROC PRINT Output: Sales Sale ID Location Date Amount 101 USA EUR USA USA EUR USA

22 Subsetting from a Raw Data File This scenario uses the raw data file from the previous example. 101 USA EUR 30JAN , USA USA EUR 6FEB , USA

23 Desired Output The sales manager wants to see sales for the European branch only. Sales Sale ID Location Date Amount 3034 EUR EUR

24 The Subsetting IF Statement
data europe;
   length SalesID $ 4 Location $ 3;
   infile 'raw-data-file';
   input SalesID $ Location $ @;
   if Location='USA' then
      input SaleDate : mmddyy10. Amount;
   else if Location='EUR' then
      input SaleDate : date9. Amount : commax8.;
   if Location='EUR';
run;
This works but is not efficient: it reads every record in full and only then selects the EUR records.

25 The Subsetting IF Statement The subsetting IF should appear as early in the program as possible but after the variables used in the condition are calculated. In this case, we should read only the EUR cases by adding the IF statement right after reading Location.

26 The Subsetting IF Statement
Because the program reads only European sales, the INPUT statement for USA sales is not needed.
data europe;
   length SalesID $ 4 Location $ 3;
   infile 'raw-data-file';
   input SalesID $ Location $ @;
   if Location='EUR';
   input SaleDate : date9. Amount : commax8.;
run;

27 The Subsetting IF Statement Sales Sale ID Location Date Amount 3034 EUR EUR proc print data=europe noobs; run;

28 Processing Hierarchical Files
Many files are hierarchical in structure, consisting of
– a header record
– one or more related detail records.
Typically, each record contains a field that identifies whether it is a header record or a detail record.

29 Processing Hierarchical Files
You can read a hierarchical file into a SAS data set by creating one observation per detail record and storing the header information as part of each observation.
Hierarchical File: Header 1, Detail 1, Detail 2, Detail 3, Header 2, Detail 1, Header 3, Detail 1, Detail 2
SAS Data Set:
Header Variables   Detail Variables
Header 1           Detail 1
Header 1           Detail 2
Header 1           Detail 3
Header 2           Detail 1
Header 3           Detail 1
Header 3           Detail 2

30 Processing Hierarchical Files
You can also create one observation per header record and store the information from detail records in summary variables.
Hierarchical File: Header 1, Detail 1, Detail 2, Detail 3, Header 2, Detail 1, Header 3, Detail 1, Detail 2
SAS Data Set:
Header Variables   Summary Variables
Header 1           Summary 1
Header 2           Summary 2
Header 3           Summary 3
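One possible way to produce one observation per header record is sketched below. It is an illustration under stated assumptions, not the course's program: the fileref deps, the data set name depcount, the variable NumDeps, and the END= variable LastRec are names chosen here, and the summary is simply a count of detail records. The sample records use a colon-delimited layout in which E marks a header and D marks a detail record, and they are written to a temporary file so that the step can run on its own.

```sas
/* Write a few sample records to a temporary file (assumed layout). */
filename deps temp;
data _null_;
   file deps;
   put 'E:Adams:Susan';
   put 'D:Michael:C';
   put 'D:Lindsay:C';
   put 'E:Porter:David';
   put 'D:Susan:S';
   put 'E:Dansky:Ian';
run;

data depcount(keep=EmpLName EmpFName NumDeps);
   length Type $ 1 EmpLName EmpFName $ 20;
   /* Keep the header values across iterations. */
   retain EmpLName EmpFName;
   infile deps dlm=':' end=LastRec;
   /* Read the record type and hold the record. */
   input Type $ @;
   if Type='E' then do;
      /* A new header: write the previous employee first.          */
      /* There is no previous employee on the first iteration.     */
      if _n_ > 1 then output;
      input EmpLName $ EmpFName $;
      NumDeps = 0;
   end;
   else
      /* Detail record: the sum statement retains NumDeps automatically. */
      NumDeps + 1;
   /* Write the final employee after the last record is read. */
   if LastRec then output;
run;

proc print data=depcount noobs;
run;
```

With these six records the step should write three observations, with counts of 2, 1, and 0. The output for each header is delayed until the next header (or the end of the file) is reached, because only then are all of its detail records known.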

31 Creating One Observation Per Detail
The raw data file dependents has a header record containing the name of the employee and a detail record for each dependent on the employee’s health insurance.
E: Employee, D: Dependent
C: Child, S: Spouse
Each data value is separated by a colon (:).
E:Adams:Susan
D:Michael:C
D:Lindsay:C
E:Porter:David
D:Susan:S
E:Lewis:Dorian D.
D:Richard:C
E:Dansky:Ian
E:Nicholls:James
D:Roberta:C
E:Slaydon:Marla
D:John:S

32 Desired Output
Personnel wants a list of all the dependents and the name of the associated employee.
EmpLName   EmpFName    DepName   Relation
Adams      Susan       Michael   C
Adams      Susan       Lindsay   C
Porter     David       Susan     S
Lewis      Dorian D.   Richard   C
Nicholls   James       Roberta   C
Slaydon    Marla       John      S

33 A Hierarchical File
– Not all the records are the same.
– The fields are separated by colons.
– There is a field indicating whether the record is a header or a detail record.
E:Adams:Susan
D:Michael:C
D:Lindsay:C
E:Porter:David
D:Susan:S
E:Lewis:Dorian D.
D:Richard:C
E:Dansky:Ian
E:Nicholls:James
D:Roberta:C
E:Slaydon:Marla
D:John:S

34 How to Read the Hierarchical Data
   input Type $ @;
   if Type='E' then
      input EmpLName $ EmpFName $;
   else
      input DepName $ Relation $;

35 How to Output Only the Dependents
   input Type $ @;
   if Type='E' then
      input EmpLName $ EmpFName $;
   else do;
      input DepName $ Relation $;
      output;
   end;
Try the following program and observe what is wrong with the result.

36-52 Processing the program without RETAIN, step by step
data dependents(drop=Type);
   length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1;
   infile 'raw-data-file' dlm=':';
   input Type $ @;
   if Type='E' then
      input EmpLName $ EmpFName $;
   else do;
      input DepName $ Relation $;
      output;
   end;
run;
– Compile (slide 36): SAS sets up the input buffer and a PDV containing Type, EmpLName, EmpFName, DepName, and Relation. Type is flagged D (dropped).
– Execute (slide 37): the PDV is initialized to missing.
– Slide 38: the first record, E:Adams:Susan, is loaded into the input buffer.
– Slide 39: Type is read as E, and the trailing @ holds the record.
– Slide 40: the condition Type='E' is true.
– Slides 41-44: EmpLName (Adams) and EmpFName (Susan) are read from the held record. There is no implicit output, because the DATA step contains an explicit OUTPUT statement. Implicit return to the top of the DATA step.
– Slide 45: the PDV is reinitialized, so EmpLName and EmpFName are set back to missing.
– Slides 46-47: the next record, D:Michael:C, is loaded into the input buffer.
– Slide 48: Type is read as D, and the trailing @ holds the record.
– Slide 49: the condition Type='E' is false.
– Slide 50: DepName (Michael) and Relation (C) are read from the held record.
– Slide 51: explicit output writes the observation to dependents.
– Slide 52: implicit return to the top of the DATA step.

53 Undesirable Output
EmpLName   EmpFName   DepName   Relation
                      Michael   C
                      Lindsay   C
                      Susan     S
                      Richard   C
                      Roberta   C
                      John      S
EmpLName and EmpFName are not captured: they are read from the header record, but the PDV is reinitialized before each detail record is read.

54 The RETAIN Statement (Review)
General form of the RETAIN statement:
   RETAIN variable-name;
The RETAIN statement prevents SAS from reinitializing the values of new variables at the top of the DATA step. This means that values from previous records are available for processing.
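The effect of RETAIN can be seen in isolation with a small sketch. The data set name retain_demo, the variables Flag, Kept, and NotKept, and the three in-stream records are all hypothetical; both character variables are assigned only when a record flagged H is read.

```sas
/* Hypothetical demonstration data; H stands for a header-like record. */
data retain_demo;
   length Flag $ 1 Kept NotKept $ 5;
   /* Only Kept is retained across iterations of the DATA step. */
   retain Kept;
   input Flag $;
   if Flag='H' then do;
      Kept    = 'set';
      NotKept = 'set';
   end;
   datalines;
H
D
D
;
run;

proc print data=retain_demo noobs;
run;
```

In the second and third observations Kept should still hold the value assigned on the first iteration, while NotKept is blank because it was reset to missing at the top of each iteration.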

55 data dependents(drop=Type);
   length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1;
   retain EmpLName EmpFName;
   infile 'raw-data-file' dlm=':';
   input Type $ @;
   if Type='E' then
      input EmpLName $ EmpFName $;
   else do;
      input DepName $ Relation $;
      output;
   end;
run;
The RETAIN statement holds the values of EmpLName and EmpFName from one iteration to the next.

56-71 Processing the program with RETAIN, step by step
data dependents(drop=Type);
   length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1;
   retain EmpLName EmpFName;
   infile 'raw-data-file' dlm=':';
   input Type $ @;
   if Type='E' then
      input EmpLName $ EmpFName $;
   else do;
      input DepName $ Relation $;
      output;
   end;
run;
– Compile (slide 56): SAS sets up the input buffer and the PDV. EmpLName and EmpFName are flagged R (retained), and Type is flagged D (dropped).
– Execute (slide 57): the PDV is initialized to missing.
– Slide 58: the first record, E:Adams:Susan, is loaded into the input buffer.
– Slide 59: Type is read as E, and the trailing @ holds the record.
– Slide 60: the condition Type='E' is true.
– Slides 61-64: EmpLName (Adams) and EmpFName (Susan) are read from the held record. There is no implicit output. Implicit return to the top of the DATA step.
– Slide 65: the PDV is reinitialized, but the retained values Adams and Susan stay in place.
– Slides 66-67: the next record, D:Michael:C, is loaded into the input buffer.
– Slide 68: Type is read as D, and the trailing @ holds the record. Adams and Susan are still in the PDV.
– Slide 69: the condition Type='E' is false, so DepName (Michael) and Relation (C) are read from the held record.
– Slides 70-71: explicit output writes the observation (Adams, Susan, Michael, C) to dependents, followed by an implicit return to the top of the DATA step.

72 Input Buffer D : M i c h a e l : C EMPLNAMERELATIONEMPFNAMEDEPNAMETYPE R R D R data dependents(drop=Type); length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1; retain EmpLName EmpFName; infile 'raw-data-file' dlm=':'; input Type if Type='E' then input EmpLName $ EmpFName $; else do; input DepName $ Relation $; output; end; run; E:Adams:Susan D:Michael:C D:Lindsay:C E:Porter:David D:Susan:S E:Lewis:Dorian D. D:Richard:C E:Dansky:Ian E:Nicholls:James D:Roberta:C E:Slaydon:Marla D:John:S DMichaelC Implicit return AdamsSusan...

73 Input Buffer D : M i c h a e l : C EMPLNAMERELATIONEMPFNAMEDEPNAMETYPE R R D R data dependents(drop=Type); length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1; retain EmpLName EmpFName; infile 'raw-data-file' dlm=':'; input Type if Type='E' then input EmpLName $ EmpFName $; else do; input DepName $ Relation $; output; end; run; E:Adams:Susan D:Michael:C D:Lindsay:C E:Porter:David D:Susan:S E:Lewis:Dorian D. D:Richard:C E:Dansky:Ian E:Nicholls:James D:Roberta:C E:Slaydon:Marla D:John:S DMichaelC AdamsSusan Continue processing until end of the raw data file.
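The "Hold record" step depends on the single trailing @ at the end of the first INPUT statement. The following is a minimal sketch of that behavior, assuming in-stream data in place of the raw data file; the variable name Rest is hypothetical and used only for this illustration.

```sas
/* Sketch: a trailing @ holds the current record for the next INPUT. */
/* Rest is a hypothetical variable name, not part of the slides.     */
data _null_;
   length Type $ 1 Rest $ 20;
   infile datalines dlm=':';
   input Type $ @;                 /* read the first field, hold the record */
   put 'After first INPUT:  ' Type=;
   input Rest $;                   /* continues reading the SAME record     */
   put 'After second INPUT: ' Rest=;
   datalines;
E:Adams
D:Michael
;
run;
```

The PUT statements write to the log. Without the trailing @, the second INPUT statement would move on to the next record instead of finishing the current one.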

74 Creating One Observation Per Detail

proc print data=work.dependents noobs;
run;

PROC PRINT Output (correct result)

EmpLName   EmpFName    DepName    Relation
Adams      Susan       Michael    C
Adams      Susan       Lindsay    C
Porter     David       Susan      S
Lewis      Dorian D.   Richard    C
Nicholls   James       Roberta    C
Slaydon    Marla       John       S
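To reproduce this result without an external file, the program can be sketched with in-stream data. This is an illustration under one assumption: the INFILE DATALINES statement stands in for the 'raw-data-file' used on the slides.

```sas
/* Sketch: one observation per detail (dependent) record.            */
/* Assumption: in-stream DATALINES replaces the external raw file.   */
data work.dependents(drop=Type);
   length Type $ 1 EmpLName EmpFName DepName $ 20 Relation $ 1;
   retain EmpLName EmpFName;       /* keep header values across iterations */
   infile datalines dlm=':';
   input Type $ @;                 /* read record type, hold the record    */
   if Type='E' then input EmpLName $ EmpFName $;
   else do;
      input DepName $ Relation $;
      output;                      /* write only for dependent records     */
   end;
   datalines;
E:Adams:Susan
D:Michael:C
D:Lindsay:C
E:Porter:David
D:Susan:S
E:Lewis:Dorian D.
D:Richard:C
E:Dansky:Ian
E:Nicholls:James
D:Roberta:C
E:Slaydon:Marla
D:John:S
;
run;

proc print data=work.dependents noobs;
run;
```

Ian Dansky has no dependent records, so he does not appear in the output.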

75 Create One Observation Per Header Record
– Employee insurance is free for the employees.
– Each employee pays $50 per month for a spouse's insurance.
– Each employee pays $25 per month for a child's insurance.

Raw data file:
E:E01442
D:Michael:C
D:Lindsay:C
E:E00705
D:Susan:S
E:E01577
D:Richard:C
E:E00997
E:E00955
D:Roberta:C
E:E00224
D:John:S

76 Desired Output
Personnel wants a list of all employees and their monthly payroll deductions for insurance.

ID       Deduct
E01442       50
E00705       50
E01577       25
E00997        0
E00955       25
E00224       50

77 Calculating the Value of Deduct
The value of Deduct will change according to the
– type of record read
– value of Relation when Type='D'.

Raw data file: E:E01442 D:Michael:C D:Lindsay:C E:E00705 D:Susan:S E:E01577 D:Richard:C E:E00997 E:E00955 D:Roberta:C E:E00224 D:John:S

78 Retaining ID
Values of ID and Deduct must be held across iterations of the DATA step.
– ID must be retained with a RETAIN statement.
– Deduct is created with a sum statement, which automatically retains.

retain ID;
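The automatic retention provided by a sum statement can be seen in isolation. The following is a small sketch; the data set name work.sum_demo and the variables Amount and Total are hypothetical and chosen only for this illustration.

```sas
/* Sketch: a sum statement initializes its variable to 0 and retains it. */
/* work.sum_demo, Amount, and Total are hypothetical names.              */
data work.sum_demo;
   input Amount;
   Total + Amount;       /* sum statement: no RETAIN statement needed */
   datalines;
25
25
50
;
run;

proc print data=work.sum_demo noobs;
run;
```

Total holds the running sum 25, 50, 100 across the three observations. An assignment such as Total = Total + Amount; would instead produce missing values, because Total would be reset to missing at the start of each iteration.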

79 When to Output?
Each observation ends at the last record that belongs to an employee:

E:E01442
D:Michael:C
D:Lindsay:C     End Observation 1
E:E00705
D:Susan:S       End Observation 2
E:E01577
D:Richard:C     End Observation 3
E:E00997        End Observation 4
E:E00955
D:Roberta:C     End Observation 5
E:E00224
D:John:S        End Observation 6

80 When SAS Loads a Type E Record
1. Output what is currently in the PDV (unless this is the first time through the DATA step).
2. Read the next employee's identification number.
3. Reset Deduct to 0.

if Type='E' then do;
   if _N_ > 1 then output;
   input ID $;
   Deduct=0;
end;

NOTE: When _N_=1, the first Type E record is being read, so there is no earlier employee in the PDV to output yet.

81 When SAS Loads a Type D Record
1. Read the dependent's name and relationship.
2. Check the relationship.
3. Increment Deduct appropriately.

else do;
   input DepName $ Relation $;
   if Relation='C' then Deduct+25;
   else Deduct+50;
end;

82 The program so far:

data work.insurance(drop=Type DepName Relation);
   length Type $ 1 ID $ 6 DepName $ 20 Relation $ 1;
   retain ID;
   infile 'raw-data-file' dlm=':';
   input Type $ @;
   if Type='E' then do;
      if _N_ > 1 then output;
      input ID $;
      Deduct=0;
   end;
   else do;
      input DepName $ Relation $;
      if Relation='C' then Deduct+25;
      else Deduct+50;
   end;
run;

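83 What About the Last Record?
The last employee (E:E00224, D:John:S) is not followed by another Type E record, so the OUTPUT statement in the Type E block never writes that employee. No implicit output: because the DATA step contains an explicit OUTPUT statement, SAS does not write an observation automatically at the end of each iteration.

Raw data file: E:E01442 D:Michael:C D:Lindsay:C E:E00705 D:Susan:S E:E01577 D:Richard:C E:E00997 E:E00955 D:Roberta:C E:E00224 D:John:S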

84 Recall: The END= Option in the INFILE Statement
General form of the END= option:

INFILE 'file-name' END=variable-name;

where variable-name is any valid SAS variable name. The END= option creates a variable that has the value
– 1 if the current record is the last record of the input file
– 0 otherwise.
Variables created with END= are automatically dropped.
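The behavior of the END= variable can be checked on a tiny file. The following is a sketch under stated assumptions: the fileref demo, the temporary file it points to, and the variable Letter are hypothetical and exist only for this illustration. A temporary file is used instead of in-stream data because the SAS documentation lists DATALINES as a restriction for END=.

```sas
/* Sketch: watch the END= variable change on the last record.        */
/* The fileref demo and the variable Letter are hypothetical names.  */
filename demo temp;

data _null_;                  /* write a three-line temporary file */
   file demo;
   put 'a';
   put 'b';
   put 'c';
run;

data _null_;
   infile demo end=LastRec;
   input Letter $;
   put _N_= Letter= LastRec=; /* LastRec is 0 until the last record is read */
run;
```

The log shows LastRec=0 for the first two records and LastRec=1 for the third.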

85 The complete program, with END= added:

data work.insurance(drop=Type DepName Relation);
   length Type $ 1 ID $ 6 DepName $ 20 Relation $ 1;
   retain ID;
   infile 'raw-data-file' dlm=':' end=LastRec;
   input Type $ @;
   if Type='E' then do;
      if _N_ > 1 then output;
      input ID $;
      Deduct=0;
   end;
   else do;
      input DepName $ Relation $;
      if Relation='C' then Deduct+25;
      else Deduct+50;
   end;
   if LastRec then output;
run;

86 Compile

data work.insurance(drop=Type DepName Relation);
   length Type $ 1 ID $ 6 DepName $ 20 Relation $ 1;
   retain ID;
   infile 'raw-data-file' dlm=':' end=LastRec;
   input Type $ @;
   if Type='E' then do;
      if _N_>1 then output;
      input ID $;
      Deduct=0;
   end;
   else do;
      input DepName $ Relation $;
      if Relation='C' then Deduct+25;
      else Deduct+50;
   end;
   if LastRec then output;
run;

PDV after compilation (R = retained, D = dropped): Type (D), ID (R), DepName (D), Relation (D), Deduct (R), _N_ (D), LastRec (D). The input buffer is empty.

Raw data file: E:E01442 D:Michael:C D:Lindsay:C E:E00705 D:Susan:S E:E01577 D:Richard:C E:E00997 E:E00955 D:Roberta:C E:E00224 D:John:S

87 Execute. PDV: Deduct=0, _N_=1, LastRec=0; all other variables are missing. Next statement: input Type $ @;

88 input Type $ @; loads the first record. Input Buffer: E:E01442.

89 Type=E is read. Hold record: the trailing @ keeps E:E01442 in the input buffer. PDV: Type=E, Deduct=0, _N_=1, LastRec=0.

90 if Type='E' then do; True.

91 if _N_ > 1 then output; False, so nothing is written.

92 input ID $; reads ID=E01442 from the held record.

93 Deduct=0;

94 end;

95 if LastRec then output; False.

96 Implicit return. PDV: Type=E, ID=E01442, Deduct=0, _N_=1, LastRec=0.

97 Reinitialize PDV. ID=E01442 and Deduct=0 are retained; Type, DepName, and Relation are reset to missing.

98 input Type $ @;

99 The next record is loaded. Input Buffer: D:Michael:C.

100 Type=D is read. Hold record. PDV: Type=D, ID=E01442, Deduct=0, _N_=2, LastRec=0.

101 if Type='E' then do; False, so the ELSE DO group executes.

102 input DepName $ Relation $; reads DepName=Michael, Relation=C.

103 if Relation='C' then Deduct+25; True. Deduct becomes 25.

104 if LastRec then output; False.

105 Implicit return. PDV: Type=D, ID=E01442, DepName=Michael, Relation=C, Deduct=25, _N_=2, LastRec=0.

106 Reinitialize PDV. ID=E01442 and Deduct=25 are retained; Type, DepName, and Relation are reset to missing.

107 input Type $ @;

108 The next record is loaded. Input Buffer: D:Lindsay:C.

109 Type=D is read. Hold record. PDV: Type=D, ID=E01442, Deduct=25, _N_=3, LastRec=0.

110 if Type='E' then do; False, so the ELSE DO group executes.

111 input DepName $ Relation $; reads DepName=Lindsay, Relation=C.

112 if Relation='C' then Deduct+25; True. Deduct becomes 50.

113 if LastRec then output; False.

114 Implicit return. PDV: Type=D, ID=E01442, DepName=Lindsay, Relation=C, Deduct=50, _N_=3, LastRec=0.

115 Reinitialize PDV. ID=E01442 and Deduct=50 are retained; Type, DepName, and Relation are reset to missing. _N_=4, LastRec=0.

116 input Type $ @;

117 The next record is loaded. Input Buffer: E:E00705.

118 Type=E is read. Hold record. PDV: Type=E, ID=E01442, Deduct=50, _N_=4, LastRec=0.

119 if Type='E' then do; True.

120 if _N_ > 1 then output; True.

121 Explicit output. Write out observation to insurance: ID=E01442, Deduct=50.

122 After the last record has been processed. Input Buffer: D:John:S. PDV: Type=D, ID=E00224, DepName=John, Relation=S, Deduct=50, LastRec=1. Because LastRec=1, if LastRec then output; writes the final observation. Implicit return.

123 Creating One Observation Per Header

proc print data=insurance noobs;
run;

PROC PRINT Output

ID       Deduct
E01442       50
E00705       50
E01577       25
E00997        0
E00955       25
E00224       50
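The whole example can be run end to end with the sketch below. It makes one assumption that is not on the slides: the raw data are first written to a temporary file through the hypothetical fileref rawdata, which then takes the place of 'raw-data-file'.

```sas
/* Sketch: one observation per header (employee) record.             */
/* Assumption: the fileref rawdata is hypothetical and points to a   */
/* temporary file that stands in for 'raw-data-file'.                */
filename rawdata temp;

data _null_;                       /* create the hierarchical raw file */
   file rawdata;
   put 'E:E01442';
   put 'D:Michael:C';
   put 'D:Lindsay:C';
   put 'E:E00705';
   put 'D:Susan:S';
   put 'E:E01577';
   put 'D:Richard:C';
   put 'E:E00997';
   put 'E:E00955';
   put 'D:Roberta:C';
   put 'E:E00224';
   put 'D:John:S';
run;

data work.insurance(drop=Type DepName Relation);
   length Type $ 1 ID $ 6 DepName $ 20 Relation $ 1;
   retain ID;
   infile rawdata dlm=':' end=LastRec;
   input Type $ @;                 /* read record type, hold the record */
   if Type='E' then do;
      if _N_ > 1 then output;      /* write the previous employee       */
      input ID $;
      Deduct=0;                    /* start a new total                 */
   end;
   else do;
      input DepName $ Relation $;
      if Relation='C' then Deduct+25;   /* child  */
      else Deduct+50;                   /* spouse */
   end;
   if LastRec then output;         /* write the final employee          */
run;

proc print data=work.insurance noobs;
run;
```

Employee E00997 has no dependent records, so that observation keeps the Deduct value of 0 assigned in the Type E block.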

Exercise 1 Open program c21_1. Carefully check the data structure, and go through each program statement to make sure you know why the statement is needed. Run the program, and learn how to read hierarchical data.