Using Advanced INPUT Techniques Peter Cosette Dave Hall Amy Dunn-Ruiz Eric Lyon.

1 Using Advanced INPUT Techniques Peter Cosette Dave Hall Amy Dunn-Ruiz Eric Lyon

2 Advanced Input Outline
Handling Missing Values and Short Data Lines
Detecting the End of Files
Advanced Options for Reading Data Files
Reading Data Conditionally
The Single Trailing @ and the Double Trailing @@
Using Variable and Informat Lists
Creating Multiple Obs. from One Line of Input
Relative Column Pointers

3 LOG: "NOTE: SAS went to a new line when INPUT statement reached past the end of a line."
Solutions:
MISSOVER: sets the remaining variables to missing when the INPUT statement names more variables than there are data values on the line
PAD: for column input, pads each line with blanks up to the record length (256 bytes on this slide)
TRUNCOVER: good for reading variable-length records with missing data

4 Inputting Data with Missing Values
missing.txt: 1 1 2 3 5 8 13 21 34 55 89
Use MISSOVER
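A minimal sketch of MISSOVER with list input. The variable names (A, B, C) and the way the values are split across lines are assumptions for illustration; the transcript does not preserve the line breaks of missing.txt.

```sas
data fib;
  /* MISSOVER: when a line runs out of values, set the remaining
     variables to missing instead of reading from the next line */
  infile datalines missover;
  input A B C;          /* hypothetical variable names */
datalines;
1 1 2
3 5
8 13 21
34
;
run;
```

Without MISSOVER, SAS would fill C for the second line from the line that follows it and write the note shown on slide 3 to the log.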

5 Inputting Data with Missing Values
short.txt:
001Amy Dunn-Ruiz 100 90 93
002Peter Cossette 95 88
003David Hall 85 82 96
004Eric Lyon 98100 95
Use PAD
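A sketch of how PAD might be used with column input on a file like short.txt. The file location, the variable names, and the column ranges are all assumptions, since the transcript does not preserve the original column alignment.

```sas
data scores;
  /* PAD fills short records with blanks, so column input never
     reaches past the end of a line */
  infile 'short.txt' pad;        /* path assumed */
  input ID    $  1-3             /* column ranges below are assumed */
        Name  $  4-18
        Test1   19-21
        Test2   22-24
        Test3   25-27;
run;
```

With PAD, a record that stops before column 27 gives a missing Test3 rather than pulling data from the next line.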

6 End=Last Option
Last is a temporary variable that is false (0) until the last record of the external file is read in; then it is true (1)
END= is also a SET statement option, used to determine when you are reading the last observation in a SAS data set

7 Data Example for End=Last & Obs
days.txt:
Sun 0
Mon 0
Tue 8
Wed 2
Thu 10
Fri 3
Sat 1
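One way END= might be used with this file is to accumulate a total and write it out only once. This is a sketch: the variable names Day, Count, and Total are assumptions, as is the layout of one day per line.

```sas
data total;
  infile 'days.txt' end=last;    /* path assumed; last is 1 only on the final record */
  input Day $ Count;             /* hypothetical variable names */
  Total + Count;                 /* sum statement: Total is retained across records */
  if last then output;           /* write a single observation at end of file */
run;
```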

8 Obs and FirstObs
No matter how many records are in the file, obs=3 reads in only the first 3
Use FirstObs together with Obs to read selected records from the middle of a file: firstobs=4 obs=7 reads records 4 through 7
Useful for large data sets where you need to test things out
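A short sketch of the firstobs=4 obs=7 combination, applied to days.txt from the previous slide. The variable names are assumptions, and the file is assumed to hold one day per line.

```sas
data midweek;
  /* start at record 4 and stop after record 7 */
  infile 'days.txt' firstobs=4 obs=7;   /* path assumed */
  input Day $ Count;                    /* hypothetical variable names */
run;
```

Note that OBS= names the last record to read, not the number of records, so this reads four records.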

9 Reading in Multiple Files
The Wildcard
End=Finished
Filename
Dummy Variable

10 The ? Wildcard
If you have multiple files with similar names like Person1, Person2, Person3, ...:
data wild;
  infile 'e:\files\info\person?.txt';
  input Name Style Letter;
run;
The ? matches exactly one character; use a * wildcard to match any number of characters (for example, more than one digit)

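11 End=Finished
data lotsofiles;
  if finished = 0 then infile 'grover.txt' end=finished;
  else if finished2 = 0 then infile 'bigbird.txt' end=finished2;
  else infile 'snuffy.txt';
  input Name Style Place;
run;
Use this to read in one file; when that's done, it moves on to read in the next file, and so on
Each file except the last needs its own END= variable, so that a single IF / ELSE IF / ELSE chain can tell which file to read next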

12 Filename
filename super ('grover.txt' 'oscar.txt' 'bert.txt');
data lotsofiles;
  infile super;
  input Name Place Style;
run;
We called our fileref super; you can name it anything you want
This is a little less typing than the End=Finished method

13 Dummy with External File
data lotsofiles;
  infile 'e:\super.txt';
  input External $30.;
  infile dummy filevar=External end=Last;
  do until (last);
    input Name Place Style;
    output;
  end;
run;
E:\super.txt contains a list of all of the filenames we want to read in
Easier to use than End=Finished or Filename when you have LOTS of files to read in

14 Dummy with Datalines
data lotsofiles;
  input External $30.;
  infile dummy filevar=External end=Last;
  do until (last);
    input Name Place Style;
    output;
  end;
datalines;
e:\wilma.txt
e:\betty.txt
e:\fred.txt
e:\barney.txt
etc.....
;
run;
Same as the previous slide, but with datalines instead of an external file

15 Reading Multiple Lines of Data
Use the line pointers #1, #2, #3, etc. to read several lines of data together into one observation
Or use a slash (/) to indicate where the next line begins (though this is not always very clear; the author doesn't like this technique)

16 Data in Columns
12345678901234
HondaCivic1996
112654 3599
Ford Focus2002
83775 7999
BMW X32006
3026017999
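A sketch of reading two-line records like these with line pointers. The variable names and column ranges are assumptions chosen to fit the first two cars; the original column spacing is not preserved in the transcript.

```sas
data cars;
  input #1 Make $ 1-5 Model $ 6-10 Year 11-14   /* first line of each pair */
        #2 Mileage 1-6 Price 7-11;              /* second line of each pair */
datalines;
HondaCivic1996
112654 3599
Ford Focus2002
 83775 7999
;
run;
```

The same read could be written with a slash in place of #2, as in `input Make $ 1-5 Model $ 6-10 Year 11-14 / Mileage 1-6 Price 7-11;`.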

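17 Mixing Record Types with Conditional Input
Use this when some of the records you are reading in have more variables than others
@n is the absolute column pointer: use it to read the field that identifies the record type
Ending the INPUT statement with a trailing @ holds the line, which fixes the errors you get when the next INPUT moves on to a new line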

18 Mixing Record Types with Conditional Input
star89.txt:
012345678901234567
001875378 2008
002848791352009
003565943212009
004999194842009
005365412 2008
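A sketch of conditional input on records like these. The layout is an assumption made for illustration (ID in columns 1-3, a score of 6 or 8 digits starting in column 4, Year in columns 12-15), and the variable names are hypothetical.

```sas
data star;
  input @12 Year 4. @;                 /* read Year first and hold the line */
  if Year = 2008 then
    input @1 ID $3. @4 Score 6.;       /* 2008 records: 6-digit score (assumed) */
  else if Year = 2009 then
    input @1 ID $3. @4 Score 8.;       /* 2009 records: 8-digit score (assumed) */
datalines;
001875378  2008
002848791352009
;
run;
```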

19 More Trailing @ Sign: For Use as a Filter
The trailing @ tells SAS not to move on after reading in Year, so the IF statement can run against the same line before SAS goes on to the next line
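The filtering idea can be sketched as follows: read only the field needed for the test, hold the line, and read the rest only for records that pass. The column positions and variable names are assumptions.

```sas
data only2009;
  input @12 Year 4. @;            /* read Year and hold the line */
  if Year ne 2009 then delete;    /* drop the record without reading the rest */
  input @1 ID $3. @4 Score 8.;    /* read the remaining fields from the held line */
datalines;
001875378  2008
002848791352009
003565943212009
;
run;
```

A line held by a single trailing @ is released automatically when the DATA step starts its next iteration, so the deleted record does not block the one after it.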

20 The Trailing @@
Use it when, for example, there are only 2 variables but more than 2 values on each data line
Normally SAS won't read in all the data on that line; it moves to the next line after inputting the first 2 values
@@ makes everything nice again, UNLESS THERE'S A MISSING VALUE... THEN IT WON'T WORK
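A minimal sketch of the double trailing @@ with list input; the variable names X and Y and the data values are made up for illustration.

```sas
data pairs;
  input X Y @@;     /* hold the line across iterations of the DATA step */
datalines;
1 2 3 4 5 6
7 8
;
run;
```

This creates four observations from two lines. If a value is genuinely missing, the pairs fall out of step unless the gap is marked, for example with a period.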

21 Using Variable & Informat Lists
You can supply a single informat to a list of variables and save typing:
input (FirstName LastName MT1-MT3)(2*$10. 3*$2.);
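A sketch of the statement above in a complete DATA step. The data lines are hypothetical and laid out to match the informats: two 10-column names followed by three 2-column fields.

```sas
data grades;
  /* 2*$10. applies $10. to the first two variables,
     3*$2. applies $2. to MT1-MT3 */
  input (FirstName LastName MT1-MT3) (2*$10. 3*$2.);
datalines;
Amy       Dunn-Ruiz 909395
Eric      Lyon      889195
;
run;
```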

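22 Using Relative Column Pointers to Read a Complex Data Structure Effectively
The + sign is a relative column pointer: +n moves the pointer n columns to the right
With absolute column pointers alone, every position has to be typed out:
input @1 Age1 2. @3 Wt1 3. @6 Age2 2. @8 Wt2 3. @11 Age3 2. @13 Wt3 3. etc…….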
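The same layout can be sketched more compactly by combining a variable list with a relative pointer inside the informat list. The data line is hypothetical and covers only the three Age/Wt pairs named on the slide.

```sas
data ages;
  /* after each 2-column Age, skip 3 columns to reach the next Age;
     after each 3-column Wt, skip 2 columns to reach the next Wt */
  input @1 (Age1-Age3) (2. +3)
        @3 (Wt1-Wt3)   (3. +2);
datalines;
211302515030175
;
run;
```

Here Age1 is read from columns 1-2, Age2 from 6-7, and Age3 from 11-12, while Wt1 to Wt3 come from columns 3-5, 8-10, and 13-15, matching the absolute pointers in the long form.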

23 Summary of Key Terms
Missover: used to handle data with missing values
Pad: used to handle column data with missing values
End=Last: used to detect the end of a data file
Obs: used to read the first n obs. from a data file
Firstobs: sets the first obs. read from a data file
Filevar: used to specify the filename
Single @: used to "hold the line" for conditional input
Double @@: use with caution!
Informat Lists & Relative Column Pointers: useful!

