Iowa State University, Department of Animal Science. Getting Your Data Into SAS (Chapter 2 in The Little SAS Book). Animal Science 500, Lecture No. 3, September.


1 Iowa State University, Department of Animal Science. Getting Your Data Into SAS (Chapter 2 in The Little SAS Book). Animal Science 500, Lecture No. 3, September 7, 2010

2 Arithmetic Operators

Symbol  Operation       Example  Result
+       addition        5 + 3    add two numbers together
-       subtraction     5 - 3    subtract 3 from 5
*       multiplication  2*y      multiply 2 by the value of Y
/       division        var/5    divide the value of VAR by 5
**      exponentiation  a**2     raise A to the second power

- Subtraction and division also work with two variables, for example ending weight minus beginning weight, or weight gain divided by days on test.
- The * is always required for multiplication; 2(y) and 2y are not valid.
- In the DATA step, exponentiation is written **. The ^ symbol is a NOT operator there, so a^2 is not valid.
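The arithmetic operators can be tried in a short DATA step. This is only a sketch: the data set name arith and the variable names and values are made up for illustration and do not come from the lecture.

```sas
/* Hypothetical example: data set and variable names are illustrative */
data arith;
   onweight   = 50;                  /* beginning weight */
   offweight  = 260;                 /* ending weight */
   daysontest = 100;
   gain    = offweight - onweight;   /* subtraction with two variables */
   adg     = gain / daysontest;      /* division with two variables */
   doubled = 2 * gain;               /* multiplication: the * is required */
   squared = gain ** 2;              /* exponentiation */
run;

proc print data=arith;
run;
```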

3 Comparison Operators
- Comparison operators set up a comparison, operation, or calculation with two variables, constants, or expressions within the data set being used.
  - If the comparison is true, the result is 1.
  - If the comparison is false, the result is 0.
- Comparison operators can be expressed as symbols or with their mnemonic equivalents, which are shown in the following table:

4 Comparison Operators

Symbol  Mnemonic  Definition                Example
=       EQ        equal to                  a=3
^=      NE        not equal to              a ne 3
¬=      NE        not equal to
~=      NE        not equal to
>       GT        greater than              num>5
<       LT        less than                 num<8
>=      GE        greater than or equal to  sales>=300
<=      LE        less than or equal to     sales<=100
        IN        equal to one of a list    num in (3, 4, 5)
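Because a comparison evaluates to 1 or 0, its result can be stored in a variable. The following sketch assumes a made-up variable num with the value 4; the data set and variable names are illustrative only.

```sas
/* Hypothetical example: each comparison is stored as 1 (true) or 0 (false) */
data compare;
   num = 4;
   is_three  = (num = 3);             /* 0: 4 is not equal to 3 */
   not_three = (num ne 3);            /* 1 */
   at_least5 = (num ge 5);            /* 0 */
   in_list   = (num in (3, 4, 5));    /* 1: 4 is in the list */
run;

proc print data=compare;
run;
```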

5 Logical (Boolean) Operators and Expressions

Symbol  Mnemonic  Example
&       AND       (a>b & c>d)
|       OR        (a>b or c>d)
!       OR
¦       OR
¬       NOT       not(a>b)
^       NOT
~       NOT

Logical operators, also called Boolean operators, are usually used in expressions to link sequences of comparisons.
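A short sketch of how logical operators link comparisons, using the a, b, c, d names from the table with made-up values. The data set name logic and the result variable names are assumptions for illustration.

```sas
/* Hypothetical values chosen so that a>b is true and c>d is false */
data logic;
   a = 5; b = 3; c = 1; d = 2;
   both   = (a > b and c > d);   /* 0: the second comparison is false */
   either = (a > b or c > d);    /* 1: the first comparison is true */
   negate = not (a > b);         /* 0: a > b is true, so NOT gives 0 */
run;

proc print data=logic;
run;
```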

6 Finding your data
- Most of the time your “raw” data files will be saved as external files:
  1. Text files: Word, WordPerfect, Writer, etc.
  2. Spreadsheets: Excel, Lotus, Quattro Pro, etc.
  3. Other systems: Unix, OpenVMS, etc.

7 Reading external files into SAS
- The files containing your data will typically be stored:
  1. On the hard drive of the computer that you will ultimately use to analyze the data with SAS
  2. Externally
     - USB memory stick (flash memory)
     - External hard drive
- You must get your data from “storage” into SAS to conduct the analyses.

8 Reading external files into SAS
- Use the INFILE statement within a DATA step:
  Data mytrial;
  Infile 'c:\mydocument\trial.csv';
  Input (variable names);
- INFILE reads raw text files (such as .txt or .csv). An Excel workbook must first be saved as text or be read another way, such as with the import wizard.
- In the INPUT statement, list the variable names, and remember to put the $ after character variables.
- You may have to tell SAS in which columns individual variables are found and where to place the decimal point.

9 Reading external files into SAS
  Data mytrial;
  Infile 'c:\mydocument\trial.csv' DLM=',';
- Many options are available to assist you when using the INFILE statement.
- DLM= specifies the delimiter that separates the variables in your raw data file.
  - dlm=',' indicates that a comma is the delimiter (a comma separated file, .csv).
  - dlm='09'x indicates that tabs separate your variables (a tab separated file).
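To try DLM= without an external file, the same option can be applied to in-stream data with INFILE DATALINES. This is a sketch; the variable names and data values are invented for illustration.

```sas
/* Hypothetical data: comma-delimited lines read with list input */
data mytrial;
   infile datalines dlm=',';
   input id $ onweight offweight;   /* $ marks id as a character variable */
   datalines;
A101,50,260
A102,48,255
;
run;

proc print data=mytrial;
run;
```

For a real file, the word datalines would be replaced by the quoted path to the file, and the DATALINES block would be dropped.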

10 Reading external files into SAS
- Other options: DSD
  - The DSD option has 2 functions.
  - First, it recognizes two consecutive delimiters as a missing value.
  - For example, if your file contained the line 20,30,,50 SAS will treat this as 20 30 50, but with the DSD option SAS will treat it as 20 30 . 50, which is probably what you intended.

11 Reading external files into SAS
- Other options: DSD
  - Second, the DSD option allows you to include the delimiter within quoted strings.
  - For example, you would want to use the DSD option if you had a comma separated file and your data included values like "George Bush, Jr.".
  - With the DSD option, SAS will recognize that the comma in "George Bush, Jr." is part of the name, and not a separator indicating a new variable.
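Both functions of DSD can be seen in one small sketch. The data lines reuse the values from the slides; the data set name, variable names, and the LENGTH of 20 are assumptions made for the example.

```sas
/* Hypothetical example: DSD with a quoted comma and consecutive delimiters */
data dsd_demo;
   infile datalines dlm=',' dsd;
   length name $ 20;          /* room for the full quoted name */
   input name $ a b c;
   datalines;
"George Bush, Jr.",20,,50
Smith,20,30,50
;
run;

proc print data=dsd_demo;
run;
```

In the first observation the name keeps its comma and b is missing, because the two consecutive commas mark an empty field.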

12 Reading external files into SAS
- Other options: FIRSTOBS=
  - Tells SAS on what line you want it to start reading your raw data file. (Default = 1)
  - If the first record(s) contain header information such as variable names, then set firstobs=n, where n is the record number where the data actually begin.
  - Example: Assume you are reading a comma separated file or a tab separated file where the variable names are on the first line. Use firstobs=2 to tell SAS to begin reading at the second line. (This ignores the first line with the names of the variables.)
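A minimal sketch of FIRSTOBS=2 skipping a header line, using in-stream data so it can be run as is. The column names and values are invented for illustration.

```sas
/* Hypothetical example: line 1 holds variable names, data begin on line 2 */
data with_header;
   infile datalines dlm=',' dsd firstobs=2;
   input id $ weight;
   datalines;
id,weight
A101,260
A102,255
;
run;

proc print data=with_header;
run;
```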

13 Reading external files into SAS
- Other options: MISSOVER
  - This option prevents SAS from going to a new input line if it does not find values for all of the variables in the current line of data.
  - For example, you may be reading a space delimited file that is supposed to have 10 values per line, but one of the lines has only 9 values. Without the MISSOVER option, SAS will look for the 10th value on the next line of data.
  - MISSOVER sets all empty variables to missing when reading a short line.

14 Reading external files into SAS
- Other options: MISSOVER
  - If your data are supposed to have only one observation for each line of raw data, then a short line read without MISSOVER could cause errors throughout the rest of your data file.
  - If you have a raw data file that has one record per line, this option is a prudent method of trying to keep such errors from cascading through the rest of your data file.
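The effect of MISSOVER on a short line can be sketched with three in-stream records, the second of which has only two values. The data set and variable names are made up for the example.

```sas
/* Hypothetical example: the second line is short by one value */
data short_line;
   infile datalines missover;
   input a b c;
   datalines;
1 2 3
4 5
7 8 9
;
run;

proc print data=short_line;
run;
```

With MISSOVER there are three observations and c is missing in the second. Removing the option lets SAS take the value for c from the next line, so the third record is consumed and only two observations result.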

15 Reading external files into SAS
- Other options: OBS=
  - Indicates which line in your raw data file should be treated as the last record to be read by SAS.
  - This is a good option to use for testing your program. For example, you might use obs=100 to read in just the first 100 lines of data while you are testing your program.

16 Reading external files into SAS
- Other options
- A typical INFILE statement for reading a comma delimited file that contains the variable names in the first line of data would be:
  INFILE "test.txt" DLM=',' DSD MISSOVER FIRSTOBS=2;
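The combined options can be tested end to end without a file on disk by first writing a small comma delimited file to a temporary fileref. This is a sketch under assumptions: the fileref raw, the data set name trial, the variable names, and the values are all hypothetical, and OBS=3 is added only to show how it limits the read.

```sas
/* Hypothetical example: build a small CSV in a temporary file, then read it */
filename raw temp;                 /* temporary file managed by SAS */

data _null_;
   file raw;
   put 'id,onweight,offweight';    /* line 1: variable names */
   put 'A101,50,260';
   put 'A102,48,';                 /* offweight is empty */
   put 'A103,52,270';
run;

data trial;
   /* start at line 2, stop after line 3 */
   infile raw dlm=',' dsd missover firstobs=2 obs=3;
   length id $ 8;
   input id $ onweight offweight;
run;

proc print data=trial;
run;
```

Only A101 and A102 are read, and offweight is missing for A102.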

17 Reading external files into SAS
- Other options: LRECL= (logical record length)
  - LRECL= is really useful for Windows users.
  - By default, SAS under Windows assumes a logical record length of 256.
  - With longer lines, it may appear that SAS is not reading all of your data, or that beyond some point variables are not being read.

18 Reading external files into SAS
- Other options: LRECL= (logical record length)
  - LRECL= is really useful for Windows users.
  - You can tell SAS exactly how long to make the record length on the FILENAME statement. The option is lrecl= (logical record length) and it looks like this:
    filename myFile "c:\some directory\some file.txt" LRECL=400;
- This option is REQUIRED if the length of a data line is over 256.

19 Knowing what Options are Available
- You can look options up using:
  - SAS on-line help
  - SAS manuals and books
  - Other example programs
- You can also list the SAS system options and their current settings using PROC OPTIONS:
  Proc Options;
  Run;
- The list is written to the SAS log.
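PROC OPTIONS can also be limited to a single system option, which is easier to read than the full list. A brief sketch, using the LINESIZE option only as an example:

```sas
/* Full list of system options and their settings, written to the log */
proc options;
run;

/* Only one option, here the output line width */
proc options option=linesize;
run;
```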

20 Informats
- A host of selected informats is listed on pages 46-47 of The Little SAS Book, 4th Edition.
  - Different ways data can be formatted and read in SAS
  - Dates, times, and combined datetimes
  - Reading Julian dates
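As an illustration of date informats, the sketch below reads three differently written dates with the MMDDYY10., DATE9., and JULIAN7. informats. The variable names and data values are invented; the colon before each informat asks SAS to apply it while still using list input.

```sas
/* Hypothetical example: three ways a date might appear in raw data */
data dates;
   input animal $ birth : mmddyy10. weighed : date9. julian : julian7.;
   format birth weighed julian worddate18.;   /* print as, e.g., month day, year */
   datalines;
A101 03/15/2010 07SEP2010 2010250
;
run;

proc print data=dates;
run;
```

The Julian value 2010250 is day 250 of 2010, which is the same date as 07SEP2010.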

21 Titles and Footnotes
- SAS allows up to 10 lines of text at the top (titles) and the bottom (footnotes) of each page of output using the TITLE and FOOTNOTE statements.
  - TITLEn 'text';
  - FOOTNOTEn 'text';
  - Here n is the line number, which can range from 1 to 10.
  - If the text is omitted, the title or footnote is deleted.
  - Otherwise it remains in effect until it is redefined.

22 Titles and Footnotes
- SAS allows up to 10 lines of text at the top (titles) and the bottom (footnotes) of each page of output using the TITLE and FOOTNOTE statements.
  - To have no titles you can include title;
  - By default, SAS includes the date and page number at the top of each page of output.
  - To get rid of these, specify nodate and/or nonumber in an OPTIONS statement.
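A short sketch of titles, footnotes, and the NODATE and NONUMBER options together. The title and footnote text is made up, and SASHELP.CLASS, a sample data set shipped with SAS, is used only so there is something to print.

```sas
/* Hypothetical example: title and footnote text is illustrative */
options nodate nonumber;          /* drop the date and page number */

title1 'Animal Science 500';
title2 'Example listing';
footnote1 'Preliminary data';

proc print data=sashelp.class(obs=5);
run;

title;                            /* clears all titles */
footnote;                         /* clears all footnotes */
```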

23 Temporary versus Permanent SAS Data Sets
- Temporary SAS data set
  - Only exists during the current job or session
  - It is erased by SAS when you finish and close down SAS.
- Permanent SAS data set
  - Does not mean it is around forever
  - It remains stored even after you close your SAS session.
- If you use a data set more than once, it is more efficient to save it as a permanent SAS data set.

24 Temporary versus Permanent SAS Data Sets
- Using a permanent SAS data set allows you to skip the step of reading in the raw data, whether you use the import wizard or an INFILE statement.
- If you are going to modify your data set, it is likely easier to use a temporary SAS data set, for example when you:
  - Need to add more data to the “final” data set
  - Have not checked the “final” data set for errors
  - Have other reasons
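One common way to create a permanent data set is to point a libref at a folder with a LIBNAME statement and then use a two-level name. The sketch below assumes a folder c:\mysaslib that already exists; the libref ans500, the folder, and the data are all hypothetical.

```sas
/* Hypothetical example: libref and folder are assumptions */
libname ans500 'c:\mysaslib';     /* folder must already exist */

data trial;                       /* one-level name: temporary, stored in WORK */
   input id $ onweight offweight;
   datalines;
A101 50 260
A102 48 255
;
run;

data ans500.trial;                /* two-level name: stored permanently */
   set trial;
run;
```

In a later session, repeating the LIBNAME statement makes ans500.trial available again without rereading the raw data.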

25 Listing the Contents of a SAS Data Set
- Proc Contents
  - Place Proc Contents data=yourdatasetname; in your program.
  - If you leave off the data= then SAS will perform the Proc Contents on the last data set created.
  - It is a good way to check whether all of your data are being correctly read into SAS for further analyses.

26 Listing the Contents of a SAS Data Set
- Output from Proc Contents:
  1. Data Set Name: be sure you evaluated the correct data set
  2. Observations: did the correct number of observations get read in
  3. Variables: were the correct number of variables identified
  4. Created: date the data set was created
  5. Label: a label you might have provided

27 Listing the Contents of a SAS Data Set
- Output from Proc Contents: a listing of the variables in alphabetical order
- The following output is created for each variable:
  1. Type: numeric or character
  2. Length: storage size (in bytes)
  3. Format for printing, if any (for example, the date may have been converted with WORDDATE)
  4. Informat for input, if any (for example, MMDDYY10. for a date)
  5. Variable label (e.g. date of birth, height in inches, weight in pounds)
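To see these items in practice, the sketch below builds a small data set with a format, an informat, and labels, then runs PROC CONTENTS on it. The data set name pigs and its variables are hypothetical.

```sas
/* Hypothetical example: a small data set with a format, informat, and labels */
data pigs;
   informat birth mmddyy10.;      /* stored with the variable, shown by PROC CONTENTS */
   input id $ birth weight;
   format birth worddate18.;
   label birth  = 'Date of birth'
         weight = 'Weight in pounds';
   datalines;
A101 03/15/2010 260
A102 03/17/2010 255
;
run;

proc contents data=pigs;
run;
```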

28 Processing an Existing Data Set
- When you want to process an existing SAS data set
  - Use the SET statement rather than an INFILE statement.
- Each time SAS executes the SET statement, it reads one observation, containing all of the variables, from the existing data set.

29 Processing an Existing Data Set
  Data data1;
  set data2;
  averagedailygain = (offweight - onweight) / daysontest;
  Run;
  Quit;
Again, if the user does not specify a data set for a procedure to operate on, the most recently created data set is used.
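The SET example needs an existing data2 to run. A self-contained sketch, with invented values for the three input variables, might look like this:

```sas
/* Hypothetical data so the SET example can be run as is */
data data2;
   input id $ onweight offweight daysontest;
   datalines;
A101 50 260 100
A102 48 255 100
;
run;

data data1;
   set data2;                     /* reads one observation per iteration */
   averagedailygain = (offweight - onweight) / daysontest;
run;

proc print data=data1;            /* data= named explicitly, not left to the default */
run;
```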

30 Arithmetic Operators
- Arithmetic operators indicate that an arithmetic calculation is performed, as shown in the following table:

