Published by Harry Fields. Modified over 9 years ago.
1
SAS Programming: Working With Variables
2
Data Step Manipulations
New variables should be created during a DATA step.
Existing variables should be manipulated during a DATA step.
3
Missing Values in SAS
SAS uses a period (.) to represent missing numeric values in a SAS data set.
Different SAS procedures and functions treat missing values differently, so always be careful when your SAS data set contains missing values.
4
Working With Numeric Variables
SAS uses the standard arithmetic operators +, -, *, /, and ** (exponentiation).
Note on missing values: arithmetic operators propagate missing values.
SAS has many built-in numeric functions:
  round(variable, value): rounds variable to the nearest multiple of value.
  sum(variable1, variable2, ...): adds any number of variables and ignores missing values.
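The difference between the + operator and the SUM function can be sketched with a small example. The data set name demo and the variables a and b are hypothetical and chosen only for illustration.

```sas
* Hypothetical data set with illustrative variables a and b;
data demo;
  input a b;
  total_op  = a + b;        * missing if either a or b is missing;
  total_fn  = sum(a, b);    * ignores missing arguments;
  a_rounded = round(a, 10); * rounds a to the nearest multiple of 10;
  datalines;
12 5
47 .
. 8
;
run;

proc print data=demo;
run;
```

In the second record b is missing, so total_op is missing while total_fn is 47.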
5
Acting on Selected Observations
Working with selected observations (subsets of a SAS data set) is easy in SAS.
First, you must decide on a selection process: what is the distinguishing characteristic of the observations you want to work with?
6
Selecting Observations: IF-THEN Statements
The IF-THEN statement is the most common way to select observations.
Format: IF condition THEN action;
condition is one or more comparisons. For any observation, condition is either true or false. If condition is true, SAS performs the action.
7
IF-THEN Statement: Example
Suppose INC is a variable representing annual household income and you want to create a dummy variable, DUM, that takes the value 1 when income is less than $10,000 and 0 otherwise.
  IF INC < 10000 THEN DUM=1;
  IF INC >= 10000 THEN DUM=0;
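One caveat with the example above: in comparisons SAS treats a missing numeric value as smaller than any number, so INC < 10000 is also true when INC is missing. A minimal sketch of one way to guard against that, using IF-THEN/ELSE, is shown here. The data set name households and the sample records are hypothetical.

```sas
* Hypothetical data set with three illustrative income records;
data households;
  input inc;
  if inc = . then dum = .;          * keep the dummy missing when income is missing;
  else if inc < 10000 then dum = 1; * low income;
  else dum = 0;                     * income of 10000 or more;
  datalines;
8500
.
25000
;
run;
```

Using ELSE also means SAS stops checking conditions once one of them is true for an observation.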
8
Using OBS in condition
In a SAS data set, each record has an observation number, which appears in the OBS column of printed output.
To use the observation number in a condition, refer to the automatic variable _n_ rather than to OBS.
Example: set the first 10 observations of INC equal to zero.
  IF _n_ <= 10 THEN INC=0;
9
Comparison Operators
There are 6 comparison operators. You can use either the symbol or the mnemonic.
  Symbol  Mnemonic  Meaning
  =       EQ        Equal to
  ^=      NE        Not equal to
  >       GT        Greater than
  <       LT        Less than
  >=      GE        Greater than or equal to
  <=      LE        Less than or equal to
10
Multiple Comparisons
You can make more than one comparison in condition by using AND/OR.
  AND / &: all parts must be true for condition to be true.
  OR / |: at least one part must be true for condition to be true.
Be careful when combining AND and OR. You can use parentheses in condition to make the grouping explicit.
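The reason for care is that SAS evaluates AND before OR, so the same comparisons can give different results with and without parentheses. The sketch below illustrates this. The data set name flags, the variable names, and the sample records are hypothetical.

```sas
* Hypothetical data: income, sex (1 = male), education (4 = college graduate);
data flags;
  input inc sex educ;
  * Without parentheses this reads as (low income AND male) OR college graduate;
  if inc < 10000 and sex = 1 or educ = 4 then flag_a = 1;
  else flag_a = 0;
  * Parentheses change the grouping to low income AND (male OR college graduate);
  if inc < 10000 and (sex = 1 or educ = 4) then flag_b = 1;
  else flag_b = 0;
  datalines;
50000 0 4
8000 0 2
8000 1 2
;
run;
```

For the first record (high income, female, college graduate) flag_a is 1 but flag_b is 0.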
11
Selecting Observations for New SAS Data Sets
You can use IF-THEN statements to create new SAS data sets.
Either delete or keep selected observations based on condition.
12
Deleting Observations
Format for IF-THEN: IF condition THEN DELETE;
Example: removing missing observations. Suppose the variable INC is missing for some households and you want to drop these observations.
  IF INC=. THEN DELETE;
13
Keeping Selected Observations
A more straightforward way to create new SAS data sets is to keep only those observations that meet some condition.
Format: IF condition;
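The two approaches can be compared side by side. In this sketch the data set names households, nomiss_a, and nomiss_b and the sample records are hypothetical. Both DATA steps should produce the same two observations.

```sas
* Hypothetical input data with one missing income;
data households;
  input id inc;
  datalines;
1 8500
2 .
3 25000
;
run;

* Approach 1: delete observations with missing income;
data nomiss_a;
  set households;
  if inc = . then delete;
run;

* Approach 2: keep only observations with nonmissing income;
data nomiss_b;
  set households;
  if inc ne .;
run;
```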
14
Example
The file salary.dat contains data for 93 employees of a Chicago bank. The file contains the following variables:
  Y: Salary
  X: Years of education
  E: Months of previous work experience
  T: Number of months after 1/1/69 that the individual was hired
The first 61 observations are females, the last 32 are males.
15
Example: Create Dummy for Males
  *Program to create dummy variables and;
  *new SAS data sets ;
  data salary;
    infile 's:\mysas\salary.dat';
    input y x e t;
    IF _n_ > 61 THEN G=1;
    IF _n_ <= 61 THEN G=0;
  run;
16
Example: Create Data Set for Males
  *Make a new SAS data set composed of only;
  *records for males ;
  data males; *New SAS data set;
    set salary; *Created from salary;
    IF G=1;
  run;
17
Example: Create Data Set for Females
  *Make a new SAS data set composed of only;
  *records for females ;
  data females; *New SAS data set;
    set salary; *Created from salary;
    IF G=0;
  run;
18
Describing Data: Sample Statistics
Format:
  PROC UNIVARIATE options;
    VAR variable-list;
    BY variable-list;
    FREQ variable;
    WEIGHT variable;
19
Selected Options
  Option              Description
  DATA=SAS-data-set   Specify data set. If omitted, uses most recent SAS data set.
  FREQ                Generate frequency table
  NOPRINT             Suppress printed output
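As an illustration of how these options go on the PROC statement, here is a minimal sketch. The data set name pay, the variable y, and the output names stats, mean_y, and med_y are hypothetical. The OUTPUT statement is not covered in these slides and is included only because NOPRINT on its own would produce nothing to look at.

```sas
* Hypothetical data set with a single numeric variable y;
data pay;
  input y;
  datalines;
5040
6300
6000
6000
;
run;

* DATA= names the data set and FREQ adds a frequency table;
proc univariate data=pay freq;
  var y;
run;

* NOPRINT suppresses printed output and OUTPUT saves statistics to a data set;
proc univariate data=pay noprint;
  var y;
  output out=stats mean=mean_y median=med_y;
run;
```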
20
VAR Statement
Lists the variables for which sample statistics are calculated.
If no variables are specified, sample statistics are generated for all numeric variables.
21
WEIGHT Statement
Specifies a numeric variable in the SAS data set whose values are used to weight each observation.
22
BY Statement
Can be used to obtain separate analyses on observations in groups defined by some value of a variable.
Example: Suppose SEX=1 if the individual is male, SEX=0 if the individual is female; EARN=annual earnings.
  PROC UNIVARIATE; *Generates statistics;
    VAR EARN;      *on earnings for men;
    BY SEX;        *and women;
  RUN;
23
BY Statements and Sorting
Before using a BY statement in a procedure, the SAS data set must be sorted on the variables specified in that BY statement.
Use PROC SORT, which puts the observations in order based on the values of the variables specified in its own BY statement.
24
PROC SORT
Format:
  PROC SORT options;
    BY variables;
Sort order: ascending by default. For descending order, put DESCENDING before the variable name in the BY statement.
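Sorting and BY-group processing fit together as in the sketch below. The data set name earnings and the sample records are hypothetical. SEX and EARN follow the definitions used in the BY statement example.

```sas
* Hypothetical data: sex (1 = male, 0 = female) and annual earnings;
data earnings;
  input sex earn;
  datalines;
1 42000
0 38000
1 51000
0 45000
;
run;

* Sort ascending by SEX so that BY-group processing works;
proc sort data=earnings;
  by sex;
run;

* Separate statistics on earnings for women and men;
proc univariate data=earnings;
  var earn;
  by sex;
run;

* DESCENDING goes before the variable it applies to;
proc sort data=earnings;
  by descending earn;
run;
```

Without a DATA= option, each procedure would use the most recently created SAS data set, so naming the data set explicitly is the safer habit.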
25
Describing Data: Frequencies
Format:
  PROC FREQ options;
    BY variables;
    TABLES requests;
    WEIGHT variable;
26
One-Way Frequency Table
Variables:
  SEX: 1 (male), 0 (female)
  EDUCATION: 1 (less than high school), 2 (high school), 3 (some college), 4 (college grad.)
  EARN: annual earnings
  PROC FREQ;
    TABLES EDUCATION;
  RUN;
  PROC FREQ;
    TABLES EDUCATION;
    BY SEX;
  RUN;