How to start using SAS.

1 How to start using SAS

2 The topics
- An overview of the SAS system
- Reading raw data and creating SAS data sets
- Combining SAS data sets and match-merging SAS data sets
- Formatting data
- Introduction to some simple regression procedures
- Summary report procedures

3 Basic Screen Navigation
Main windows:
- Editor: contains the SAS program to be submitted.
- Log: contains information about the processing of the SAS program, including any warning and error messages.
- Output: contains reports generated by SAS procedures and DATA steps.
Side windows:
- Explorer: navigates to other objects, such as libraries.
- Results: navigates your Output window.

4 SAS programs
A SAS program is a sequence of steps that the user submits for execution.
- DATA steps are typically used to create SAS data sets.
- PROC steps are typically used to process SAS data sets (that is, to generate reports and graphs, edit data, sort data, and analyze data).

5 SAS Data Libraries
- A SAS data library is a collection of SAS files that SAS recognizes as a unit. SAS uses data libraries to store data sets.
- A SAS data set is one type of SAS file stored in a data library.
- You can think of a SAS data library as a drawer in a filing cabinet and a SAS data set as one of the file folders in the drawer.
- The Work library is temporary: when SAS is closed, all the data sets in the Work library are deleted. If you want to save a data set and continue working with it later, create a permanent SAS data set in a library of your own.

6 SAS Data Libraries
Identify SAS data libraries by assigning each a library reference name (libref) with the LIBNAME statement:
   LIBNAME libref 'file-folder-location';
E.g.:
   LIBNAME readData 'C:\temp\sas class\readData';
Rules for naming a libref:
- The name must be 8 characters or fewer.
- The name must begin with a letter or underscore.
- The remaining characters must be letters, numbers, or underscores.
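
The difference between the temporary Work library and a permanent library can be sketched as follows. This is a minimal illustration, not part of the original slides: the libref `mylib` and the data set names `scratch` and `saved` are hypothetical, and the folder is assumed to exist on your machine.

```sas
/* Hypothetical libref; point it at a folder that exists on your machine. */
LIBNAME mylib 'C:\temp\sas class\readData';

/* One-level name: stored in the temporary Work library (same as work.scratch). */
DATA scratch;
   x = 1;
RUN;

/* Two-level name: stored permanently in the folder behind the mylib libref. */
DATA mylib.saved;
   SET scratch;
RUN;
```

After SAS is closed and reopened, `work.scratch` is gone, while `mylib.saved` can be used again once the LIBNAME statement has been resubmitted.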

7 Reading raw data into the SAS system
In order to create a SAS data set from a raw data file, you must:
- Start a DATA step and name the SAS data set being created (DATA statement).
- Identify the location of the raw data file to read (INFILE statement).
- Describe how to read the data fields from the raw data file (INPUT statement).

8 Reading external raw data file into SAS system
   LIBNAME readData 'C:\temp\sas class\readData';
   DATA readData.wa80;
      INFILE 'k:\census\stf2_wa80.txt';
      INPUT SUMRYLVL COUNTY $3. @253 TABA1 9.0;
   RUN;
- The LIBNAME statement assigns the libref 'readData' to a data library.
- The DATA statement creates a permanent SAS data set named 'wa80'.
- The INFILE statement points to a raw data file.
- The INPUT statement
  - names the SAS variables
  - identifies the variables as character or numeric ($ indicates character data)
  - specifies the locations of the fields in the raw data
  - can be written as column, formatted, list, or named input
- The RUN statement marks the end of a step.

9 Example 1 Reading raw data separated by spaces
   /* Create a permanent SAS data set named HighLow1.
      Read the data file temperature1.dat using list input. */
   DATA readData.HighLow1;
      INFILE 'C:\sas class\readData\temperature1.dat';
      INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow;
   RUN;
   /* The PROC PRINT step creates a listing report of the readData.HighLow1 data set. */
   PROC PRINT DATA = readData.highlow1;
      TITLE 'High and Low Temperatures for July';
   RUN;
temperature1.dat:
   Nome AK 55 44 88 29
   Miami FL 90 75 97 65
   Raleigh NC 88 68 105 50
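
To try list input without an external file, the raw records can be placed directly in the program with a DATALINES statement. The sketch below is an assumption-laden variant of Example 1: it writes to a hypothetical temporary data set `work.highlow` and uses the July temperature values listed for the same cities in Example 2.

```sas
/* List input: fields are separated by blanks and read in order. */
DATA work.highlow;                 /* hypothetical temporary data set */
   INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow;
   DATALINES;
Nome AK 55 44 88 29
Miami FL 90 75 97 65
Raleigh NC 88 68 105 50
;
RUN;

PROC PRINT DATA=work.highlow;
   TITLE 'High and Low Temperatures for July';
RUN;
TITLE;                             /* clear the title afterwards */
```

Note that simple list input reads character values of at most eight characters with no embedded blanks, which is enough for these city names.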

10 Example 2 Reading multiple lines of raw data per observation
temperature2.dat:
   Nome AK
   55 44
   88 29
   Miami FL
   90 75
   97 65
   Raleigh NC
   88 68
   105 50
   /* Read the data file using the line pointers slash (/) and pound-n (#n).
      The slash (/) means go to the next line; #n means go to the nth line of that observation.
      The slash (/) could be replaced by #2 here. */
   DATA readData.highlow2;
      INFILE 'C:\sas class\readData\temperature2.dat';
      INPUT City $ State $
            / NormalHigh NormalLow
            #3 RecordHigh RecordLow;
   RUN;
   PROC PRINT DATA = readData.highlow2;
      TITLE 'High and Low Temperatures for July';
   RUN;

11 Example 3 Reading multiple observations per line of raw data
temperature3.dat:
   Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleigh NC 88 68 105 50
   /* To read multiple observations per line of raw data, use double trailing
      at signs (@@) at the end of the INPUT statement. */
   DATA readData.highlow3;
      INFILE 'C:\sas class\readData\temperature3.dat';
      INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow @@;
   RUN;
   PROC PRINT DATA = readData.highlow3;
      TITLE 'High and Low Temperatures for July';
   RUN;
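
The effect of the double trailing at sign (@@) can be checked with in-stream data. This is a minimal sketch: the data set `work.highlow3` is a hypothetical temporary name, and the values are the July temperatures given for the same cities in Example 2.

```sas
/* @@ holds the current input line across iterations of the DATA step,
   so SAS keeps reading observations from it until the line is exhausted. */
DATA work.highlow3;                /* hypothetical temporary data set */
   INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow @@;
   DATALINES;
Nome AK 55 44 88 29 Miami FL 90 75 97 65 Raleigh NC 88 68 105 50
;
RUN;

PROC PRINT DATA=work.highlow3;
RUN;
```

Without the @@, each iteration of the DATA step would move to a new input line, and only the first city on the line would be read.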

12 Reading external raw data file into SAS system
Reading raw data arranged in columns (column input):
   INPUT FILEID $ 1-5 RECTYP $ 6-9 SUMRYLVL $ URBARURL $ SMSACOM $ 14-15;
Reading raw data with mixed input styles:
   INPUT FILEID $ SUMRYLVL $ TABA1 9.0 @271 TABA1 9.0;
   /* @n is the column pointer, where n is the number of the column SAS should move to.
      The $w. informat reads standard character data, and w.d reads standard numeric data,
      where w is the total width and d is the number of decimal places. */
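
Column input, the @n column pointer, and the w.d informat can be combined in one INPUT statement. The sketch below uses made-up records and hypothetical variable names (`Name`, `Dept`, `Salary`) rather than the census file from the slide, so that it can be run on its own.

```sas
DATA work.cols;                    /* hypothetical temporary data set */
   /* Column input: Name is in columns 1-8 and Dept in columns 9-12.
      Formatted input: @14 moves the pointer to column 14, and 7.2 reads
      a 7-column numeric field with 2 implied decimal places. */
   INPUT Name $ 1-8 Dept $ 9-12 @14 Salary 7.2;
   DATALINES;
Strauss FLTA 3500000
Lee     PILO 5250050
;
RUN;

PROC PRINT DATA=work.cols;
RUN;
```

Because the salary fields contain no decimal point, the d in w.d tells SAS where to place one, so 3500000 is stored as 35000.00. If a field already contains a decimal point, the d value is ignored for that field.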

13 Reading Delimited or PC Database Files with the IMPORT Procedure
If your data file has the proper extension, use the simplest form of the IMPORT procedure:
   PROC IMPORT DATAFILE = 'filename' OUT = data-set;
   RUN;
Type of File                           Extension          DBMS Identifier
Comma-delimited                        .csv               CSV
Tab-delimited                          .txt               TAB
Excel                                  .xls               EXCEL
Lotus files                            .wk1, .wk3, .wk4   WK1, WK3, WK4
Delimiters other than commas or tabs                      DLM
Examples:
1. PROC IMPORT DATAFILE='c:\temp\sale.csv' OUT=readData.money; RUN;
2. PROC IMPORT DATAFILE='c:\temp\bands.xls' OUT=readData.music; RUN;

14 Reading Files with the IMPORT Procedure
If your file does not have the proper extension, or it uses a delimiter other than commas or tabs, you must use the DBMS= option and the DELIMITER= statement:
   PROC IMPORT DATAFILE = 'filename' OUT = data-set DBMS = identifier;
      DELIMITER = 'delimiter-character';
   RUN;
Example:
   PROC IMPORT DATAFILE = 'C:\sas class\readData\import2.txt' OUT = readData.sasfile DBMS = DLM;
      DELIMITER = '&';
   RUN;

15 Format in SAS data set
Standard formats (selected):
- Character: $w.
- Date, time, and datetime: DATEw., MMDDYYw., TIMEw.d, ...
- Numeric: COMMAw.d, DOLLARw.d, ...
Use the FORMAT statement:
   PROC PRINT DATA=sales;
      VAR Name DateReturned CandyType Profit;
      FORMAT DateReturned DATE9. Profit DOLLAR6.2;
   RUN;

16 Format in SAS data set
Create your own custom formats in two steps:
1. Create the format using PROC FORMAT and the VALUE statement.
2. Assign the format to the variable using the FORMAT statement.
General form of a simple PROC FORMAT step:
   PROC FORMAT;
      VALUE name range-1='formatted-text-1'
                 range-2='formatted-text-2'
                 ...;
   RUN;
The name in the VALUE statement is the name of the format you are creating. It cannot be longer than eight characters in SAS releases before version 9 (later releases accept names of up to 32 characters), and it must not start or end with a number. If the format is for character data, the name must start with a $.
Create your own custom formats when you use a lot of coded data: formats can remind you of the meaning behind each category. Note that formats do not change the actual value of the variable, only how it is displayed.
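
The point that a format changes only the display, not the stored value, can be seen by printing the same data set with and without the format. The sketch below is illustrative: the data set `work.people` and its records are made up, and the format is stored in the temporary Work library.

```sas
/* Step 1: create a numeric format that labels the codes 1 and 2. */
PROC FORMAT;
   VALUE genFmt 1 = 'Male'
                2 = 'Female';
RUN;

/* Hypothetical coded data. */
DATA work.people;
   INPUT Name $ Gender;
   DATALINES;
Ann 2
Bob 1
;
RUN;

/* Step 2: assign the format; Gender is displayed as Female / Male. */
PROC PRINT DATA=work.people;
   FORMAT Gender genFmt.;
RUN;

/* A FORMAT statement with no format name removes the association
   for this step, showing that the stored values are still 2 and 1. */
PROC PRINT DATA=work.people;
   FORMAT Gender;
RUN;
```

Assigning the format inside a PROC step applies it to that step only; assigning it in a DATA step stores the association with the data set.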

17 Format in SAS data set
Example:
   /* Step 1: Create the formats for certain variables */
   PROC FORMAT;
      VALUE genFmt 1 = 'Male'
                   2 = 'Female';
      VALUE money low-<25000 = 'Less than 25,000'
                  25000-50000 = '25,000 to 50,000'
                  50000<-high = 'More than 50,000';
      VALUE $codeFmt 'FLTA1'-'FLTA3' = 'Flight Attendant'
                     'PILOT1'-'PILOT3' = 'Pilot';
   RUN;
   /* Step 2: Assign the formats to the variables */
   DATA fmtData.crew1;
      SET fmtData.crew;
      FORMAT Gender genFmt. Salary money. JobCode $codeFmt.;
   RUN;
If the format is for character data, its name must start with a $.

18 Format in SAS data set
Permanently store formats in a SAS catalog by:
- Creating a format catalog file with the LIB= option in the PROC FORMAT statement
- Setting the format search option (FMTSEARCH=)
Example:
   LIBNAME class 'C:\sas class\Format';
   OPTIONS FMTSEARCH=(fmtData.fmtvalue);
   RUN;
   PROC FORMAT LIB=fmtData.fmtvalue;
      VALUE genFmt 1 = 'Male' 2 = 'Female';
   RUN;

19 Combining SAS Data Sets: Concatenating and Interleaving
Use the SET statement in a DATA step to concatenate SAS data sets. Use the SET and BY statements in a DATA step to interleave SAS data sets.

20 Combining SAS Data Sets: Concatenating and Interleaving
General form of a DATA step concatenation:
   DATA SAS-data-set;
      SET SAS-data-set1 SAS-data-set2 ...;
   RUN;
Example:
   DATA stack.allEmp;
      SET stack.emp1 stack.emp2 stack.emp3;
   RUN;

21 Combining SAS Data Sets: Concatenating and Interleaving
General form of a DATA step interleave:
   DATA SAS-data-set;
      SET SAS-data-set1 SAS-data-set2 ...;
      BY BY-variable;
   RUN;
Sort every input SAS data set by the BY variable first, using PROC SORT.
Example:
   PROC SORT DATA=stack.emp2 OUT=stack.emp2_sorted;
      BY Salary;
   RUN;
   /* stack.emp1 and stack.emp3 must likewise already be sorted by Salary */
   DATA stack.allEmp;
      SET stack.emp1 stack.emp2_sorted stack.emp3;
      BY Salary;
   RUN;
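
The difference between concatenating and interleaving is easiest to see on two tiny data sets. The following is a sketch with made-up employee records; the data set names in the Work library are hypothetical.

```sas
DATA work.emp1;
   INPUT Name $ Salary;
   DATALINES;
Ann 30000
Cal 50000
;
RUN;

DATA work.emp2;
   INPUT Name $ Salary;
   DATALINES;
Dee 60000
Bob 40000
;
RUN;

/* Interleaving requires every input data set to be sorted by the BY variable. */
PROC SORT DATA=work.emp1; BY Salary; RUN;
PROC SORT DATA=work.emp2; BY Salary; RUN;

/* Concatenate: all of emp1 followed by all of emp2 (Ann, Cal, Bob, Dee). */
DATA work.stacked;
   SET work.emp1 work.emp2;
RUN;

/* Interleave: observations from both data sets in Salary order (Ann, Bob, Cal, Dee). */
DATA work.interleaved;
   SET work.emp1 work.emp2;
   BY Salary;
RUN;
```

If an input data set is not sorted by the BY variable, the interleaving DATA step stops with an error in the log.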

22 Match-Merging SAS Data Sets
- One-to-one match merge
- One-to-many match merge
- Many-to-many match merge
The SAS statements for all three types of match merge have the same form:
   DATA new-data-set;
      MERGE data-set-1 data-set-2 data-set-3 ...;
      BY by-variable(s);   /* the variable(s) that control which observations to match */
   RUN;

23 Merging SAS Data Sets: A More Complex Example
Example: Merge two data sets to obtain the names of the group team members who are scheduled to fly next week.
combData.employee
   EmpID    LastName
   E00632   Strauss
   E01483   Lee
   E01996   Nick
   E04064   Waschk
combData.groupsched
   EmpID    FlightNum
   E04064   5105
   E00632   5250
   E01996   5501
   /* To match-merge the data sets by the common variable EmpID, the data sets must be ordered by EmpID */
   PROC SORT DATA=combData.Groupsched;
      BY EmpID;
   RUN;

24 Merging SAS Data Sets: A More Complex Example
   /* Simply merge the two data sets */
   DATA combData.nextweek;
      MERGE combData.employee combData.groupsched;
      BY EmpID;
   RUN;
combData.nextweek
   EmpID    LastName   FlightNum
   E00632   Strauss    5250
   E01483   Lee
   E01996   Nick       5501
   E04064   Waschk     5105

25 Merging SAS Data Sets: A More Complex Example
Eliminating nonmatches
Use the IN= data set option to determine which data set(s) contributed to the current observation.
General form of the IN= data set option:
   SAS-data-set (IN=variable)
The variable is a temporary numeric variable that has two possible values:
- 0 indicates that the data set did not contribute to the current observation.
- 1 indicates that the data set did contribute to the current observation.

26 Merging SAS Data Sets: A More Complex Example
   /* Keep only the employees who are scheduled to fly next week. */
   LIBNAME combData "K:\sas class\merge";
   DATA combData.nextweek;
      MERGE combData.employee combData.groupsched (IN=InSched);
      BY EmpID;
      IF InSched=1;   /* keep the observation only when this is true */
   RUN;
combData.nextweek
   EmpID    LastName   FlightNum
   E00632   Strauss    5250
   E01996   Nick       5501
   E04064   Waschk     5105

27 Merging SAS Data Sets: A More Complex Example
   /* Find employees who are not in the scheduled flight group. */
   LIBNAME combData "K:\sas class\merge";
   DATA combData.nextweek;
      MERGE combData.employee (IN=InEmp) combData.groupsched (IN=InSched);
      BY EmpID;
      IF InEmp=1;     /* true: the employee data set contributed */
      IF InSched=0;   /* false: the schedule data set did not contribute */
   RUN;
combData.nextweek
   EmpID    LastName   FlightNum
   E01483   Lee
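
Both selections can also be made in a single DATA step by routing matches and nonmatches to separate output data sets. This is a sketch, not the slides' own code: it recreates the employee and schedule tables in the Work library under hypothetical names, reads FlightNum as numeric (an assumption), and uses the ID E00632 that appears in the merged output.

```sas
DATA work.employee;
   INPUT EmpID $ LastName $;
   DATALINES;
E00632 Strauss
E01483 Lee
E01996 Nick
E04064 Waschk
;
RUN;

DATA work.groupsched;
   INPUT EmpID $ FlightNum;
   DATALINES;
E04064 5105
E00632 5250
E01996 5501
;
RUN;

/* The schedule is not in EmpID order, so sort it before the merge. */
PROC SORT DATA=work.groupsched;
   BY EmpID;
RUN;

/* IN= variables are 1 when the data set contributed to the current observation. */
DATA work.scheduled work.notscheduled;
   MERGE work.employee (IN=InEmp) work.groupsched (IN=InSched);
   BY EmpID;
   IF InEmp AND InSched THEN OUTPUT work.scheduled;      /* matches */
   ELSE IF InEmp THEN OUTPUT work.notscheduled;          /* employees with no flight */
RUN;
```

The IN= variables are temporary and are not written to either output data set.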

28 Different Types of Merges in SAS
One-to-Many Merging
Work.one
   X   Y
   1   A
   2   B
   3   C
Work.two
   X   Z
   1   A1
   1   A2
   2   B1
   3   C1
   3   C2
   DATA work.three;
      MERGE work.one work.two;
      BY X;
   RUN;
Work.three
   X   Y   Z
   1   A   A1
   1   A   A2
   2   B   B1
   3   C   C1
   3   C   C2

29 Different Types of Merges in SAS
Many-to-Many Merging
Work.one
   X   Y
   1   A1
   1   A2
   2   B1
   2   B2
Work.two
   X   Z
   1   AA1
   1   AA2
   1   AA3
   2   BB1
   2   BB2
   DATA work.three;
      MERGE work.one work.two;
      BY X;
   RUN;
Work.three
   X   Y    Z
   1   A1   AA1
   1   A2   AA2
   1   A2   AA3
   2   B1   BB1
   2   B2   BB2
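
The many-to-many table above can be reproduced with in-stream data. The sketch below builds the two input tables as temporary data sets and runs the same MERGE, so that the pairing of observations within each BY group can be inspected.

```sas
DATA work.one;
   INPUT X Y $;
   DATALINES;
1 A1
1 A2
2 B1
2 B2
;
RUN;

DATA work.two;
   INPUT X Z $;
   DATALINES;
1 AA1
1 AA2
1 AA3
2 BB1
2 BB2
;
RUN;

/* Within each BY group, observations are paired in order; when one data set
   runs out first, its last values are retained for the remaining rows. */
DATA work.three;
   MERGE work.one work.two;
   BY X;
RUN;

PROC PRINT DATA=work.three NOOBS;
RUN;
```

The DATA step MERGE does not form every combination of the matching rows. SAS also writes a note to the log that the MERGE statement has more than one data set with repeats of BY values, which is worth checking for whenever a one-to-one or one-to-many merge was intended.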

30 Some simple regression analysis procedure
The REG Procedure The LOGISTIC Procedure

31 The REG procedure
The REG procedure is one of many regression procedures in the SAS System. It allows several MODEL statements and gives additional regression diagnostics, especially for the detection of collinearity. It also creates plots of model summary statistics and regression diagnostics.
   PROC REG <options>;
      MODEL dependents=independents </options>;
      PLOT <yvariable*xvariable>;
   RUN;

32 An example
   PROC REG DATA=water;
      MODEL Water = Temperature Days Persons / VIF;
      MODEL Water = Temperature Production Days / VIF;
   RUN;
      MODEL Water = Temperature Production Days;
      PLOT STUDENT.*PREDICTED.;
      PLOT STUDENT.*NPP.;
      PLOT NPP.*R.;
      PLOT R.*NQQ.;
   RUN;
The keywords NPP. and NQQ. can be used with any of the other plot variables to construct normal P-P or Q-Q plots.
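
Since the water data set is not included with the slides, the same kind of model can be tried on SASHELP.CLASS, a sample table that ships with SAS. This is a sketch under that assumption; the choice of Weight as the response and Height and Age as predictors is purely illustrative.

```sas
/* SASHELP.CLASS stands in for the water data set (an assumption). */
PROC REG DATA=sashelp.class;
   /* VIF requests variance inflation factors, a collinearity diagnostic. */
   MODEL Weight = Height Age / VIF;
RUN;
QUIT;   /* PROC REG is interactive; QUIT ends the procedure */
```

Because PROC REG stays active after RUN, further MODEL statements can be submitted for the same data until QUIT (or another step) is submitted.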

33 The LOGISTIC procedure
PROC LOGISTIC models binary responses (for example, success and failure) and ordinal responses (for example, normal, mild, and severe).
Binary or ordinal responses with continuous independent variables:
   PROC LOGISTIC <options>;
      MODEL dependents=independents </options>;
   RUN;
Binary or ordinal responses with categorical independent variables: add a CLASS statement before the MODEL statement:
      CLASS categorical-variables </options>;

34 Example
   PROC LOGISTIC DATA=Neuralgia;
      CLASS Treatment Sex;
      MODEL Pain = Treatment Sex Treatment*Sex Age Duration;
   RUN;

35 Overview of Summary Report Procedures
- PROC FREQ: produces frequency counts
- PROC TABULATE: produces one- and two-dimensional tabular reports
- PROC REPORT: produces flexible detail and summary reports

36 The FREQ Procedure
The FREQ procedure displays frequency counts of the data values in a SAS data set.
General form of a simple PROC FREQ step:
   PROC FREQ DATA = SAS-data-set;
      TABLE SAS-variables </options>;
   RUN;

37 The FREQ Procedure
Example:
   PROC FREQ DATA = class.crew;
      FORMAT JobCode $codefmt. Salary money.;
      TABLE JobCode*Salary / NOCOL NOROW OUT = freqTable;
   RUN;
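
A runnable variant of the crosstabulation, using the SASHELP.CLASS sample table in place of class.crew (an assumption, since the crew data and its formats are not included with the slides):

```sas
/* Two-way table of Sex by Age; NOCOL and NOROW suppress the column and row
   percentages, and OUT= saves the cell counts to a data set. */
PROC FREQ DATA=sashelp.class;
   TABLES Sex*Age / NOCOL NOROW OUT=work.freqTable;
RUN;

/* The output data set holds one row per cell, with COUNT and PERCENT variables. */
PROC PRINT DATA=work.freqTable;
RUN;
```

The name `work.freqTable` is a hypothetical choice; any valid data set name can follow OUT=.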

38 The TABULATE Procedure
PROC TABULATE displays descriptive statistics in tabular format.
General form of a simple PROC TABULATE step:
   PROC TABULATE DATA=SAS-data-set;
      CLASS class-variables;
      VAR analysis-variables;
      TABLE row-expression, column-expression </options>;
   RUN;

39 The TABULATE Procedure
Example:
   TITLE 'Average Salary for Cary and Frankfurt';
   PROC TABULATE DATA = class.crew FORMAT=dollar12.;
      WHERE Location IN ('Cary','Frankfurt');
      CLASS Location JobCode;
      VAR Salary;
      TABLE JobCode, Location*Salary*mean;
   RUN;
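
The same table structure can be tried on the SASHELP.CLASS sample table. In this sketch the variables Age, Sex, and Height stand in for JobCode, Location, and Salary; that substitution, the title text, and the WHERE subset are illustrative assumptions.

```sas
TITLE 'Average Height by Age and Sex';
PROC TABULATE DATA=sashelp.class FORMAT=8.1;
   WHERE Age IN (12, 13, 14);          /* subset the rows, as in the slide */
   CLASS Age Sex;                      /* classification variables */
   VAR Height;                         /* analysis variable */
   TABLE Age, Sex*Height*MEAN;         /* rows: Age; columns: mean Height within Sex */
RUN;
TITLE;                                 /* clear the title afterwards */
```

In the TABLE statement the comma separates the row dimension from the column dimension, and the asterisk nests one element inside another.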

40 The REPORT procedure
The REPORT procedure combines features of the PRINT, MEANS, and TABULATE procedures. It enables you to:
- create listing reports
- create summary reports
- enhance reports
- request separate subtotals and grand totals

41 The REPORT procedure
Example:
   PROC REPORT DATA = class.crew NOWD HEADLINE HEADSKIP;
      COLUMN JobCode Location Salary;
      DEFINE JobCode / GROUP WIDTH = 8 'Job Code';
      DEFINE Location / GROUP 'Home Base';
      DEFINE Salary / FORMAT = dollar10. 'Average Salary' MEAN;
      RBREAK AFTER / SUMMARIZE DOL;
   RUN;
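
A comparable grouped report can be produced from the SASHELP.CLASS sample table, which is assumed here as a stand-in for class.crew. Sex and Age play the role of the grouping columns and Height the role of the averaged column.

```sas
PROC REPORT DATA=sashelp.class NOWD;
   COLUMN Sex Age Height;
   DEFINE Sex    / GROUP 'Sex';                          /* first grouping column */
   DEFINE Age    / GROUP 'Age';                          /* second grouping column */
   DEFINE Height / MEAN FORMAT=8.1 'Average Height';     /* statistic per group */
   RBREAK AFTER / SUMMARIZE;                             /* overall summary line at the end */
RUN;
```

The listing-only options from the slide (HEADLINE, HEADSKIP, DOL) are left out of this sketch because they affect only the traditional text listing output.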

