Beginning Data Manipulation HRP 223 - Topic 4 Oct 14 th 2012 Copyright © 1999-2013 Leland Stanford Junior University. All rights reserved. Warning: This.


1 Beginning Data Manipulation HRP 223 - Topic 4 Oct 14th 2012 Copyright © 1999-2013 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to the maximum extent possible under the law.

2 What I am working on I had an idea for a graphic... It is a totally custom picture so I needed fake data.

3 Fake data... and real graphics code

4 Real Graphic with Fake Data

5 Some fake data. Procedures vs. functions: procedures summarize over a dataset; functions work within a single record of a dataset. Notice that SAS remembers the capitalization.

6 Print a Dataset
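The code on this slide did not survive in the transcript. A minimal sketch of printing a dataset, assuming a made-up table `work.fake` with hypothetical variables `ID`, `monthsTx`, and `lab1`-`lab3` (none of these names come from the slides):

```sas
/* Hypothetical fake data: one row per patient. */
data work.fake;
   input ID monthsTx lab1 lab2 lab3;
   datalines;
1 12 4.1 3.9 4.4
2 6 5.0 . 4.8
3 18 3.7 3.6 3.5
;
run;

/* Write every row and every variable to the results window. */
proc print data=work.fake;
run;
```

The column headers in the output use the capitalization from the first time each variable name appears in the code.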

7 What SAS writes

8 Labels fix these. Formats fix these. I changed the capitalization.
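As a rough illustration of what labels and formats do, the sketch below attaches both to a small made-up dataset. The variable names, label text, and dates are assumptions, not the values from the slide.

```sas
/* Hypothetical data: dateTx is read with the date9. informat. */
data work.fake;
   input ID monthsTx dateTx :date9.;
   /* Labels replace cryptic variable names in output. */
   label monthsTx = "Months of treatment"
         dateTx   = "Treatment start date";
   /* Formats control how stored values are displayed. */
   format dateTx date9.;
   datalines;
1 12 14OCT2012
2 6 01JAN2012
;
run;

/* The LABEL option tells PROC PRINT to show labels as column headers. */
proc print data=work.fake label;
run;
```

Without the `format` statement, `dateTx` would print as the number of days since January 1, 1960.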

9 Turning Labels On for Browsing Tables. I typically keep this off because I want to see the real variable names when writing code.

10

11

12 Average months of treatment: calculate a mean.
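One way to get this mean with a procedure is PROC MEANS, which summarizes a variable over all the records. This is a sketch on made-up data; `work.fake` and `monthsTx` are hypothetical names.

```sas
/* Hypothetical data: months of treatment per patient. */
data work.fake;
   input ID monthsTx;
   datalines;
1 12
2 6
3 18
;
run;

/* A procedure summarizes down the column, across records. */
proc means data=work.fake mean;
   var monthsTx;
run;
```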

13 Average 3 labs: search the function list in OnlineDoc for a function that calculates an average.
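The function that search turns up is `mean()`. A minimal sketch, assuming three hypothetical lab variables `lab1`-`lab3` on each record:

```sas
/* Hypothetical data: three lab values per patient. */
data work.labs;
   input ID lab1 lab2 lab3;
   /* A function works across variables within one record. */
   /* mean() skips missing arguments instead of returning missing. */
   labAvg = mean(lab1, lab2, lab3);
   datalines;
1 4.1 3.9 4.4
2 5.0 . 4.8
3 3.7 3.6 3.5
;
run;

proc print data=work.labs;
run;
```

Writing `(lab1 + lab2 + lab3) / 3` instead would give a missing result for any record with a missing lab, which is one reason to prefer the function.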

14

15

16 Modifying datasets with SQL
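The slide's own query is not in the transcript. As a sketch of the idea, PROC SQL can build a new table with an extra calculated column. The table and column names here are assumptions.

```sas
/* Hypothetical source data. */
data work.fake;
   input ID monthsTx;
   datalines;
1 12
2 6
3 18
;
run;

proc sql;
   /* Keep every original column and add a derived one. */
   create table work.fake2 as
   select *,
          monthsTx / 12 as yearsTx
   from work.fake;
quit;
```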

17

18 I like to split my .egp file into several process flows. One sets the libraries and formats. One does cleaning. One (or several) does the analyses. Right-click here and choose Properties. Label this process flow "Make data". Note the name.

19 Note the new name.

20 Automatically Make Libraries and/or Formats You can make a process flow that runs whenever you start up your project. Just name the process flow autoexec.

21 User Defined Formats. I typically create my formats with code, but you can use the GUI if you prefer.

22 Set this. A short name.

23

24 After pushing Run, fix the node name to match the format.
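For comparison with the GUI steps on slides 21-24, a format can also be created in code with PROC FORMAT. This is a sketch only: the format name `yesNo` and the variable `chemo` are hypothetical.

```sas
/* Define a user format that maps 0/1 codes to text. */
proc format;
   value yesNo
      0 = "No"
      1 = "Yes";
run;

/* Hypothetical data that uses the format. */
data work.fake;
   input ID chemo;
   format chemo yesNo.;
   datalines;
1 1
2 0
;
run;

proc print data=work.fake;
run;
```

The stored values stay 0 and 1. Only the display changes.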

25 Make At Least 1 Analysis Process Flow If you have an autoexec file you don’t need to include the library in the analysis sheet but I like to see it:

26 Moving Between Process Flows. Here. Or here.

27 Need a new variable? You can check a value using an if statement in a data step:

28 else: if the value is not greater than or equal to 175, then set the result to be good. New character variables are 8 characters wide if you create them with an input statement. Otherwise SAS uses the first reference to the variable to set its length. For existing variables, it gets the length from the first reference in the source dataset.

29 Change this to "Bad " or use a length statement.
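The truncation problem from slides 27-29 can be sketched as follows. The variable names `score` and `status` are assumptions. The 175 cutoff and the Bad/Good values follow the slides.

```sas
data work.flagged;
   input ID score;
   /* Without this length statement, status would get length 3 */
   /* from the first assignment ("Bad") and "Good" would be    */
   /* truncated to "Goo".                                      */
   length status $ 4;
   if score >= 175 then status = "Bad";
   else status = "Good";
   datalines;
1 180
2 150
3 175
;
run;

proc print data=work.flagged;
run;
```

Padding the first literal to `"Bad "` has the same effect, but the `length` statement makes the intent explicit.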

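30 Missing values are treated as negative infinity: a missing numeric value is smaller than any number, so it fails a "greater than or equal to 175" test.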
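Because a missing value compares lower than any number, an unguarded if/else would label it Good. A sketch of one way to guard against that, assuming the same hypothetical `score` and `status` variables and an "Unknown" category that is not in the slides:

```sas
data work.flagged;
   input ID score;
   length status $ 7;
   /* Check for missing first; otherwise . >= 175 is false */
   /* and the record would fall through to "Good".         */
   if missing(score) then status = "Unknown";
   else if score >= 175 then status = "Bad";
   else status = "Good";
   datalines;
1 180
2 .
3 150
;
run;

proc print data=work.flagged;
run;
```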

31

32

33

34 You can get the same result with SQL.
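The SQL counterpart of the if/else logic is a `case` expression. This is a sketch under the same assumptions as before: hypothetical `score` and `status` names, and an added "Unknown" category for missing values.

```sas
/* Hypothetical source data. */
data work.scores;
   input ID score;
   datalines;
1 180
2 .
3 150
;
run;

proc sql;
   create table work.flagged as
   select ID,
          score,
          case
             when score is missing then "Unknown"
             when score >= 175 then "Bad"
             else "Good"
          end as status length=7
   from work.scores;
quit;
```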

35 Showing Combinations Often I am asked to show sets of treatments or sets of drugs. This quickly gets too complex for contingency tables (for 5 treatments you need 2x2x2x2x2 tables). I use binary lists. For example, common cancer treatments include Chemo, Radiation, Surgery (but you can use this same system for fine distinctions). Somebody who got Chemo and Surgery but no radiation can be represented as CrS. Code everybody like that and count the combinations.
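A minimal sketch of the binary-list coding described above, assuming three hypothetical 0/1 indicator variables `chemo`, `rad`, and `surg`. An uppercase letter means the treatment was given and a lowercase letter means it was not.

```sas
data work.tx;
   input ID chemo rad surg;
   length combo $ 3;
   /* ifc() returns the 2nd argument when the 1st is true (nonzero), */
   /* otherwise the 3rd; cats() concatenates and strips blanks.      */
   combo = cats(ifc(chemo, "C", "c"),
                ifc(rad,   "R", "r"),
                ifc(surg,  "S", "s"));
   datalines;
1 1 0 1
2 1 1 1
3 0 0 1
4 1 0 1
;
run;

/* Count each combination, most common first. */
proc freq data=work.tx order=freq;
   tables combo;
run;
```

Patient 1 (chemo and surgery, no radiation) is coded `CrS`, matching the example in the text. The one-way frequency table replaces the 2x2x2 contingency table.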

36

37

38

