Summer SAS Workshop Lecture 3
Summer SAS Workshop Website
Part I of Lecture 3
  Thinking through a programming problem
  Programming logic
  Subsetting data
How to program?
  What are your goals?
  What does your data look like?
  How does your data need to look to accomplish your goals?
  What is the first thing you type and run when you open SAS and want to start coding?
Dropping or Keeping Variables
  You often get big data sets of which you only want to use a part.
  First, drop the variables that you don’t want (or keep the ones you do want):
    Data newdata;
      set annie.olddataset;
      Keep name ssnumber visdate dob;
    Run;
  Or, using the KEEP= data set option on the input data set:
    Data newdata;
      set annie.olddataset (Keep= name ssnumber visdate dob);
    Run;
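The slide shows only the KEEP form. A DROP version might look like the sketch below. The data set olddataset is a hypothetical stand-in for annie.olddataset, and the variables phone and city are made up for illustration.

```sas
/* Hypothetical stand-in for annie.olddataset so the sketch runs on its own */
data olddataset;
    input name $ ssnumber visdate :mmddyy10. dob :mmddyy10. phone $ city $;
    format visdate dob mmddyy10.;
    datalines;
Ann 1111 06/01/2007 02/14/1970 555-0101 Durham
Bob 2222 06/03/2007 11/30/1965 555-0102 Raleigh
;
run;

/* DROP statement: list the variables to leave out of the new data set */
data newdata;
    set olddataset;
    drop phone city;
run;

/* DROP= data set option on the input: the dropped variables are never read in */
data newdata2;
    set olddataset (drop=phone city);
run;
```

Both steps should leave name, ssnumber, visdate and dob in the new data set. Whether KEEP or DROP is shorter to write depends on which list of variables is smaller.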
Conditional Logic [If-Then-Else]
  Frequently, you want an assignment statement to apply to some observations but not all: under some conditions, but not others. This is also how you create new variables by recategorizing old variables into new groupings.
  1) IF condition THEN action;
  2) IF condition THEN action;
     ELSE IF condition THEN action;
  3) IF condition THEN action;
     ELSE IF condition THEN action;
     ELSE action;
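As a rough illustration of form 3, the sketch below recategorizes a numeric age into a new character variable. The data set patients, the variable names, and the cutoffs 18 and 65 are all hypothetical.

```sas
/* Hypothetical data; a period is read as a missing numeric value */
data patients;
    input name $ age;
    datalines;
Ann 34
Bob 71
Cat .
Dan 12
;
run;

data patients2;
    set patients;
    length agegroup $ 8;
    /* Handle missing first: SAS treats missing as smaller than any number */
    if age = . then agegroup = ' ';
    else if age < 18 then agegroup = 'child';
    else if age < 65 then agegroup = 'adult';
    else agegroup = 'senior';
run;

proc print data=patients2;
run;
```

Because each ELSE is only reached when the conditions before it were false, the order of the tests matters: the missing-value check has to come before the "less than 18" check.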
IF-THEN-ELSE Rules
  A single IF-THEN statement can only have one action.
  If you add the keywords DO and END, you can execute more than one action; the statements between DO and END form a group (this is a DO group, not a loop).
  You can also specify multiple conditions with the keywords AND and OR.
  *Remember: SAS considers missing values to be smaller than non-missing values.
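A minimal sketch of these rules, assuming a hypothetical data set visits with made-up variables and cutoffs: DO and END group two assignments under one condition, and AND/OR combine comparisons.

```sas
/* Hypothetical data: id, sex, age, systolic blood pressure */
data visits;
    input id sex $ age sbp;
    datalines;
1 f 70 150
2 m 45 118
3 f . 160
;
run;

data visits2;
    set visits;
    length risk $ 4;
    /* Conditions joined with AND; testing age > . keeps missing ages out */
    if age > . and age < 50 and sbp < 120 then do;
        /* DO ... END lets one IF-THEN carry out several actions */
        risk = 'low';
        followup = 0;
    end;
    /* OR: the group runs when at least one comparison is true */
    else if age >= 65 or sbp >= 140 then do;
        risk = 'high';
        followup = 1;
    end;
run;
```

Without the age > . test, an observation with a missing age would satisfy age < 50, since missing counts as smaller than any number.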
Comparison Operators
  These operators can be coded using symbols or mnemonics.
  Symbol  Mnemonic  Meaning
  =       EQ        Equals
  ~=      NE        Not equal
  >       GT        Greater than
  <       LT        Less than
  >=      GE        Greater than or equal
  <=      LE        Less than or equal
  &       AND       All comparisons must be true
  |       OR        At least one comparison must be true
Subsetting
  Often programmers want to use some of the observations in a data set and exclude the rest. The most common way to do this is with a subsetting IF statement in a DATA step.
  Syntax: IF expression;
  Ex: IF Sex = 'f';
Subsetting (cont.)
  If the expression is true, then SAS continues with the DATA step.
  If the expression is false, then no further statements are processed for that observation, the observation is not added to the data set being created, and SAS moves on to the next observation.
  While the subsetting IF statement tells SAS which observations to include, the DELETE statement tells SAS which observations to exclude:
    IF expression THEN DELETE;
    IF Sex = 'm' THEN DELETE;   (same result as IF Sex = 'f'; when Sex is always 'f' or 'm')
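The two approaches can be sketched side by side. The data set people and its values are hypothetical; Sex here only takes the values 'f' and 'm', which is the case where both steps give the same result.

```sas
/* Hypothetical data */
data people;
    input Name $ Sex $;
    datalines;
Ann f
Bob m
Cat f
;
run;

/* Subsetting IF: keep only the observations where the expression is true */
data females;
    set people;
    if Sex = 'f';
run;

/* DELETE: throw out the observations where the expression is true */
data females2;
    set people;
    if Sex = 'm' then delete;
run;
```

If Sex could also be missing or take other values, the two steps would differ: the subsetting IF would drop those observations, while the DELETE version would keep them.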
Open SAS code from website
  Go through the code.
  Run the program.
  Questions?
  How could we make a new data set with only males in it?
Part II of Lecture 3
  Merging data sets
  SAS functions
It’s all in the way that you look at things …
Combining Data Sets Using One-to-One Match Merge
  Use a match merge when you have two data sets with related data and you want to combine them.
  If you merge two data sets and they have variables with the same names (other than the BY variables), then values from the second data set will overwrite the values of the same-named variables from the first data set.
  All observations from both old data sets will be included in the new data set, whether they have a match or not.
Match Merge Example
    Proc Sort Data = Rat;
      BY RatID Date;
    Run;
    Proc Sort Data = Rat2;
      BY RatID Date;
    Run;
    DATA BigRat;
      MERGE Rat Rat2;
      BY RatID Date;
    Run;
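To see what the merge does with unmatched observations, here is a small self-contained sketch. The data values and the variables Weight and BodyLen are hypothetical; only the data set names and BY variables come from the slide.

```sas
/* Hypothetical data: weights in one data set, body lengths in the other */
data Rat;
    input RatID Date :mmddyy10. Weight;
    format Date mmddyy10.;
    datalines;
1 06/01/2007 250
2 06/01/2007 262
;
run;

data Rat2;
    input RatID Date :mmddyy10. BodyLen;
    format Date mmddyy10.;
    datalines;
2 06/01/2007 21.5
3 06/01/2007 20.8
;
run;

/* Both data sets must be sorted by the BY variables before merging */
proc sort data=Rat;
    by RatID Date;
run;

proc sort data=Rat2;
    by RatID Date;
run;

data BigRat;
    merge Rat Rat2;
    by RatID Date;
run;

proc print data=BigRat;
run;
```

BigRat should contain three observations: rat 1 with BodyLen missing, rat 2 with both values, and rat 3 with Weight missing.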
SAS Functions
  Built-in SAS functions simplify some complex programming problems.
  They usually perform arithmetic or mathematical calculations.
  Syntax of a function used in an expression:
    NewVar = FunctionName(VariableName);
Common Functions
  Log( );  Log10( );  Sin( );  Cos( );  Tan( );  Int( );  SQRT( );
  Weekday( );  MDY( , , );  Round(x, 1);  Mean( );  RANUNI( );
  Put( );  Input( );  Lag( );  Dif( );  N( );  NMISS( );
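A few of these functions in use, as a sketch only. The data set scores, its variables, and the date are hypothetical.

```sas
/* Hypothetical data with one missing value */
data scores;
    input id x1 x2 x3;
    datalines;
1 4 9 16
2 25 . 36
;
run;

data scores2;
    set scores;
    root1    = sqrt(x1);            /* square root */
    lnx1     = log(x1);             /* natural log */
    avg      = mean(x1, x2, x3);    /* mean of the non-missing arguments */
    avg1     = round(avg, 0.1);     /* round to the nearest tenth */
    whole    = int(avg);            /* integer part */
    nvals    = n(x1, x2, x3);       /* count of non-missing arguments */
    nmissing = nmiss(x1, x2, x3);   /* count of missing arguments */
    testday  = mdy(6, 15, 2007);    /* SAS date value from month, day, year */
    dow      = weekday(testday);    /* 1 = Sunday, ..., 7 = Saturday */
    format testday mmddyy10.;
run;

proc print data=scores2;
run;
```

Note that MEAN, N and NMISS accept several arguments, so the one-variable syntax shown on the previous slide is the simplest case rather than the only one.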
Part III of Lecture 3 Debugging
Debugging?
  “If debugging is the process of removing bugs, then programming must be the process of putting them in.”
  –From some strange, but insightful website
Syntactic Errors vs. Logic Errors
  We will focus mainly on syntax errors. However, SAS can also calculate a new variable using syntactically correct code that produces inaccurate values, i.e., a logic error.
  For this reason, it is always wise to check the values of a new variable against the values of the original variable(s) used in the calculation.
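One possible way to do that check is a PROC FREQ cross-tabulation of the original variable against the new one. The sketch below contains a deliberate logic error; the data set, variable names and cutoffs are hypothetical.

```sas
/* Hypothetical data with one missing age */
data patients;
    input id age;
    datalines;
1 34
2 71
3 .
4 12
;
run;

data patients2;
    set patients;
    length agegroup $ 6;
    /* Logic error on purpose: a missing age is smaller than 18, so it lands in 'child' */
    if age < 18 then agegroup = 'child';
    else if age < 65 then agegroup = 'adult';
    else agegroup = 'senior';
run;

/* Cross-check the new variable against the original, including missing values */
proc freq data=patients2;
    tables age*agegroup / list missing;
run;
```

The step runs without any ERROR in the log, but the listing pairs the missing age with the group 'child', which is the kind of mistake only a check against the original variable reveals.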
READ THE LOG WINDOW!!
  I know that I spout this all of the time; that is because too many people begin skipping this step and then can’t figure out why their program isn’t working.
  If you have an ERROR message, look at that line as well as a few of the lines above it.
  Don’t ignore Warnings and Notes in the log simply because your program seems to have run. They could indicate a serious error that just did not happen to be syntactically incorrect. In this case, check your logic or add some Proc Prints to understand what is going on inside your program.
Debugging: The Basics
  The better you can read and understand your program, the easier it is to find the problem(s).
  Put only one SAS statement on a line.
  Use indentation to show the different parts of the program within DATA and PROC steps.
  Use comment statements GENEROUSLY to document your code.
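A short program laid out along these lines might look like the sketch below. The data values and the variable AvgJump are hypothetical.

```sas
* Read three jump distances per toad (hypothetical data);
data toads;
    input ToadName $ Weight Jump1 Jump2 Jump3;
    /* Average of the non-missing jumps */
    AvgJump = mean(Jump1, Jump2, Jump3);
    datalines;
Lucky 2.3 1.9 . 3.0
Spot 4.6 2.5 3.1 0.5
;
run;

* Check the new variable against the originals;
proc print data=toads;
    var ToadName Jump1 Jump2 Jump3 AvgJump;
run;
```

Both comment styles are used here: a statement that begins with an asterisk and ends with a semicolon, and text enclosed between /* and */.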
Know your colors
  Make sure that you are using the Enhanced Editor and know which kind of code is generally which color (e.g., comments are green).
Scroll Up
  Remember that your output and log windows are scrolled to the very bottom; scroll ALL the way up and check the whole thing.
  Look for common mistakes first (semicolons and spelling errors!).
  Make sure you haven’t typed an ‘O’ where you want a ‘0’ or vice versa. This can cause SAS to think that your numeric or character variable should be changed to the other variable type, and SAS may do this conversion automatically when you don’t want it done!
What is wrong here?
    *Read the data file ToadJump.dat using a list input
    Data toads;
      Infile 'c:\MyRawData\ToadJump.dat';
      Input ToadName $ Weight Jump1 Jump2 Jump3;
    Run;
Here is the log window…
  ___________________________________________________________
    *Read the data file ToadJump.dat using the list input
    Data toads;
    Infile 'c:\MyRawData\ToadJump.dat';
    ERROR: Statement is not valid or it is used out of proper order.
    Input ToadName $ Weight Jump1 Jump2 Jump3;
    ERROR: Statement is not valid or it is used out of proper order.
    Run;
  ___________________________________________________________
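One reading of this log, offered as a sketch rather than the workshop's official answer: the comment that starts with an asterisk is never closed with a semicolon, so SAS treats everything up to the semicolon after "Data toads" as part of the comment. The INFILE and INPUT statements then appear outside any DATA step, which is why each is flagged as out of proper order. Closing the comment should clear both errors. The path is the one from the slide and the file is assumed to exist there.

```sas
* Read the data file ToadJump.dat using list input;
data toads;
    infile 'c:\MyRawData\ToadJump.dat';
    input ToadName $ Weight Jump1 Jump2 Jump3;
run;
```

In the Enhanced Editor the original mistake is also visible by color: the DATA statement shows up in the comment color instead of the usual keyword color.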
SAS is still running…
  Check the message above the menu on the Log window. If it says, as in this example, "DATA STEP running", then you must take steps to stop the program from running.
  Even though SAS will continue to process other programs, the results of those programs may be inaccurate, with no indication of syntax problems showing up in the log.
SAS is still running… (cont.)
  Several suggestions to stop the program:
    Submit the following line:   '; run;
    Submit the following line:   *))%*'''))*/;
    If all else fails, exit SAS entirely (making sure that the revised program has been saved) and restart it.
TPA Data Practice
  Go to the website and download a TPA sample data set. Save it in a place that you can successfully point a Libname statement to!
  Either find a SAS program that you can change to fit the current problem or begin writing the code on a blank Editor page.
  The Goal: See how much of Tables 1 and 2 from the published paper you can reproduce with your sample.
  We will practice Proc Boxplot together.
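As a warm-up for the Proc Boxplot practice, here is a minimal sketch on made-up data. The data set practice and the variables group and value are hypothetical, since the real TPA variables are not shown in these slides, and the libref and folder in the comment are placeholders.

```sas
/* For the real data you would first point a libref at your folder, e.g.
   libname tpa 'c:\MyData';   (hypothetical libref and path) */

/* Hypothetical data: one measurement per subject in two groups */
data practice;
    input group $ value;
    datalines;
A 12
A 15
A 11
A 19
B 22
B 18
B 25
B 21
;
run;

/* PROC BOXPLOT expects the data sorted by the group variable */
proc sort data=practice;
    by group;
run;

proc boxplot data=practice;
    plot value*group;   /* analysis variable * group variable */
run;
```

The PLOT statement puts the analysis variable first and the grouping variable second, so this should draw one box for group A and one for group B.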