The INFILE Statement Reading files into SAS from an outside source: A Very Useful Tool!
Often the data you want to use are in an outside source, such as a file saved from an Excel spreadsheet or a text file. You can read these data into SAS with the INFILE statement, rather than typing in all of the data by hand. In addition to coming from different sources, the data may be delimited by spaces, tabs, or commas. Options on the INFILE statement tell SAS how the data it is reading are delimited.
Basic SAS Code The program you will use to read in a data file will follow this basic format, with slight variations:
DATA name of data set;
INFILE 'C:\location and name of file';
INPUT var1 var2 etc.;
RUN;
Let’s try a couple of examples.
Space delimited data with no missing values and no labels This will be the simplest type of data set you could encounter. Download the file called relief.txt, and save it to your hard drive where you will be able to easily locate it. Open SAS and type the following code into your Editor window, keeping in mind that the path to the file may be different on your computer:
Sample Code
DATA relieftime;
INFILE 'C:\Documents and Settings\My Documents\relief.txt';
INPUT group $ time;
TITLE 'Reading in Data from an External File';
PROC PRINT DATA = relieftime;
RUN;
The first line creates a data set called relieftime. The INFILE statement tells SAS where to find the data file. The INPUT statement names the two variables in the data set (just as you have seen before); the $ after group marks it as a character variable. The TITLE statement gives anything you print a title. PROC PRINT prints your data in the Output window, so that you can make sure you did not lose any data. Try running the program in SAS. Were you successful? If not, check your log for any ERROR messages, correct the code, and try running the program again.
Check your log for valuable information. It tells you how many observations were read into SAS, and you can see if you are missing any data.
Output of PROC PRINT
Comma delimited data with missing values and labels in the first row You will also encounter comma-separated value (csv) files, which are often created using Excel. To read in this type of data, you will need to use the DSD option in your SAS code. Save the file about blood pressure (bp.csv) to your hard drive. Before you read the file into SAS, open it and look at the format. Notice that the first row contains variable names and that there are a few missing values. We will handle both with options on the INFILE statement. Then type the following code into your Editor window, keeping in mind that the path to your file may be different from the example.
SAS Code dlm [delimiter] = ',' tells SAS that the data are delimited by commas. dsd (delimiter-sensitive data) allows SAS to read in a .csv file: it treats two consecutive commas as a missing value, which is how SAS recognizes the missing data. firstobs = 2 tells SAS to start reading the data from the second row (because the first row contains the variable names).
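A minimal sketch of a program that combines these three options might look like the following. The folder is assumed to be the same one used for relief.txt, and var1, var2, and var3 are hypothetical names: replace them with one name per column in bp.csv, in the order the columns appear, and put a $ after any column that holds text.

```sas
/* Sketch only: the path and the names var1-var3 are placeholders, */
/* not the actual layout of bp.csv.                                 */
DATA bp;
    /* DLM = ','    : values are separated by commas                */
    /* DSD          : two commas in a row mean a missing value      */
    /* FIRSTOBS = 2 : skip the header row of variable names         */
    INFILE 'C:\Documents and Settings\My Documents\bp.csv'
           DLM = ',' DSD FIRSTOBS = 2;
    INPUT var1 var2 var3;
RUN;

TITLE 'Reading in Comma Delimited Data';
PROC PRINT DATA = bp;
RUN;
```

In the PROC PRINT output, a missing numeric value appears as a period, which makes it easy to confirm that the empty cells were read as missing rather than skipped.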
PROC PRINT Output
Tab delimited Data Other types of data may be tab delimited, and this is specified in the dlm option. We use the option dlm = '09'x. ('09'x is the hexadecimal representation of the tab character, but you don’t need to know that; it’s basically the technical way of specifying TAB as the delimiter.) Download the file aplastic.txt, open it to see that the data begin on the first line, then read the file into SAS.
SAS Code for Tab Delimited Data After running this program, check your SAS log. Are there any Errors or Warnings? If not, look at your Output. Are any data missing?
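As a rough sketch, a program for the tab delimited file could take the following form. The path and the names var1, var2, and var3 are assumptions for illustration only; use one name per column in aplastic.txt, with a $ after any character column. No firstobs option is needed here because the data begin on the first line.

```sas
/* Sketch only: the path and the names var1-var3 are placeholders, */
/* not the actual layout of aplastic.txt.                           */
DATA aplastic;
    /* DLM = '09'x : values are separated by the tab character      */
    INFILE 'C:\Documents and Settings\My Documents\aplastic.txt'
           DLM = '09'x;
    INPUT var1 var2 var3;
RUN;

TITLE 'Reading in Tab Delimited Data';
PROC PRINT DATA = aplastic;
RUN;
```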
PROC PRINT Output
Conclusions These are just a few examples of the most common types of files you may encounter when reading an external data file into SAS using the INFILE statement. This skill will be valuable for future tutorials and homework assignments. Here are some helpful links for more info: UCLA Academic Technology Services; University of Michigan Software Help; University at Albany CSDA.