Presentation is loading. Please wait.

Presentation is loading. Please wait.

Lab 3 and HRP259 Lab and Combining (with SQL)

Similar presentations


Presentation on theme: "Lab 3 and HRP259 Lab and Combining (with SQL)"— Presentation transcript:

1 Lab 3 and HRP259 Lab and Combining (with SQL)
October 26, 2009 Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to maximum extent possible under the law.

2 But it used to work… This project worked fine. Then I closed it and reopened it. The code stopped working. When the flowchart is run again, it fails to notice that the airport2 table is needed. So, airport2 is created after this code node is attempted. Draw in an arrow so that airport2 leads to this node.

3 With the arrow saved in the project, EG gets the logic correct and builds the airport2 table before it attempts to make the analysis table. The next time the process flow is run, it will run completely.

4 And now for something completely different….
Here are a few thoughts on simulations and stuff for the HRP 259 students. Use putlog instead of put Leverage the point and click stuff even for simulations.

5 SAS as a Calculator There is a put function and a put statement. To avoid confusion, use the putlog statement to display information in the log.

6 The _NULL_ dataset is not created in work but the instructions inside of the data step are done.

7 Describe the shape of the normal curve with these values.

8 Display the values formatted to show 6 characters, 4 after the decimal place.

9 Calculate the PDF values starting at -3 out to 3 by
Calculate the PDF values starting at -3 out to 3 by .01 and put them into a data set.

10 Select the Data windowpane.
Drag & Drop

11 The probability of getting the score if the data is a standard normal (Mean =0, Standard Deviation = 1)

12

13

14 Show Central Limit Theorem
Make data: samples of the same size. Calculate the means on the 1000 samples. Plot the means.

15

16

17

18 Sample size = 1 Use sample size of 5

19 Sample size = 5 Sample size = 5000 Sample size = 30

20 Prompts Instead of having to change the code with every run, it makes sense to ask the person running the code to type the sample size and number of samples they want to collect into a dialog box.

21 Setting up prompts is easy…
Right click on the program and choose properties. Choose the Prompts windowpane. Click Prompt Manager… Click Add…

22

23

24 Note the ? in the upper right corner.
Precede the prompt name with an & and follow it with a . with no spaces. This is called a macro variable. Note the ? in the upper right corner. If you can’t see the Prompt Manager, find it on the View menu.

25 PROC SQL - Set Operators NO GUI (“noh gooey”)
Outer Union Corresponding concatenates Unions unique rows from both queries Except rows that are part of first query Intersect rows common to both queries

26 outer union corresponding
You can concatenate data files. I rarely use it. proc sql; create table isOuter as select dude from baseline outer union corresponding select dude from followup; quit;

27 union You can also concatenate data files and keep unique records:
proc sql; create table isUnion as select dude from baseline union select dude from followup; quit;

28 except Say you needed everyone who did not come back. Start out with the baseline group and remove the people who came back. proc sql; select id from baseline except select id from followup; quit;

29 intersect Say you wanted to know who came back. In other words, what IDs are in both files? proc sql; select id from baseline intersect select id from followup; quit;

30 PROC SQL - Set Operators
When you have tables (with more than one column) with the same structure, you can combine them with these set operators. Be extremely careful because SAS/SQL is forgiving about the structure of the tables and you may not notice problems in the data. For this to work as intended, the two tables must have the same variables, in the same order, and the variables must be of the same type (variables with the same name must both be character or both be numeric). Use the key word corresponding to have it match like-named variables.

31 corresponding The columns do not need to have matching names or even the same length and it will still operate on them. Use corresponding to help spot this problem.

32 Working with Repeated Keys
A file tracking diagnoses or treatments will have multiple records for some people. If you want to count the number of records for a person, specify what variable(s) are used to group by. Count records in the group with count(*) or count not missing values with count(variableName)

33 Repeat Counting I want to know:
how many people I have how many diagnoses each person has how many distinct diagnoses each person has You can sort the data and count or use the SQL commands on grouped data.

34

35 How many records? Select ID to be included in the new data set.
Add an Advanced expression as a Computed Column and select the count() function.

36 It automatically groups the data by ID when you do the count(
It automatically groups the data by ID when you do the count(*) function.

37 Other Aggregates To get the counts of diagnoses and/or the distinct diagnoses, drag the diagnosis (DX) variable over to the select variable list and choose the appropriate summary statistic.

38


Download ppt "Lab 3 and HRP259 Lab and Combining (with SQL)"

Similar presentations


Ads by Google