1 Working with Data in Windows HRP223 – 2010 October 4th, 2010 Copyright © 1999-2010 Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to the maximum extent possible under the law.

2 Sources of Data
Toy data – For statistics classes, you may be able to type the data directly into a SAS code file in EG, as in TLSB for EG.
Excel – For small amounts of HIPAA-safe data, you can use Excel with validation.
Text files with columns of numbers and text – Exports created by databases frequently provide a text file full of data and a program for loading it into SAS.
SAS – Native SAS datasets created by somebody else.
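As a minimal sketch of the "toy data" case, a small dataset can be typed straight into a code node with a DATA step and DATALINES. The dataset name work.toy, the variable names, and the values are all made up for illustration.

```sas
/* Hypothetical toy data: every name and value here is invented. */
data work.toy;
    /* The $ marks name as a character variable; id and score are numeric. */
    input id name $ score;
    datalines;
1 Ann 88
2 Bob 92
3 Cat 79
;
run;

/* Print the rows to confirm they were read as intended. */
proc print data=work.toy;
run;
```

Typing data this way is only practical for a handful of rows; anything larger is better kept in a file and imported.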

3 Recognize File Types Windows adds a period and a suffix that is a couple of letters long to the names of files to indicate what program uses the file. By default, the suffix is hidden.

4 Follow these steps to show file extensions (suffixes) in Vista. [Screenshot with numbered steps 1 to 5; one step is labeled Uncheck.]

5 Show File Extensions (Suffixes) in XP [Screenshot with numbered steps 1 to 5; one step is labeled Uncheck.]

6 Types of Files
.pdf – Adobe portable document format
.zip – archives full of compressed data
.xls – Excel prior to 2007
.xlsx – Excel 2007 and later
.csv – comma separated values (text which Excel likes)
.txt – text files
.sas – SAS code files
.egp – Enterprise Guide projects
.sas7bdat – SAS data files
.htm or .html – web pages

7 SAS and EG files .sas files are text files full of instructions that a programmer can write and/or edit. .egp files are not.

8 Searching Because the contents of .egp files are incomprehensible (without special tools), you will have trouble searching for things inside of projects. This affects me when I can’t remember the name of a project and want to find it by searching for key words in the code (like the principal investigator’s name or the name of the source data file). – I cannot find a tool to search the contents of all the .egp files on my hard drive.

9 Files in Enterprise Guide You can (and should) save SAS code files outside of the EG project to make them easy to search. Most people create EG projects that reference data files that live outside of EG.
– SAS datasets
– Excel files
– Text files full of data
[Screenshot labels: Converted to SAS format; Native Excel format]

10 Shortcuts Windows indicates a “shortcut” to a file that lives elsewhere with an arrow in the bottom left corner of an icon. EG uses the same symbol to denote a shortcut to a file outside of the project.

11 What is in an EGP file? An EG project file (.egp) contains information and instructions, but it will have links to a lot of external files. [Screenshot labels: Shortcut to a file NOT in the project; This is part of the project; Shortcut to a file NOT in the project]

12 EG and Code You can write and store your “code” instructions to SAS inside of the EG project, or you can create a shortcut to a code file that lives outside of EG. [Screenshot labels: Right click and choose New > Program; Look at the process flow; No shortcut icon]

13 External SAS files You can easily save a code file outside of the project by choosing Save Program As… from the File menu or by clicking Save or Save As… on the program tab (when the code is open). [Screenshot label: Shortcut]

14 Where are SAS Data Sets Stored? While SAS can refer to files using their Windows path, it is easier to type a short name instead of a long path. SAS calls the short names “libraries”. EG automatically knows about a couple of places where data can be stored.
– It creates a temporary work folder whenever EG starts.
– It creates a permanent sasuser folder when EG is installed.
The locations for data are called libraries.

15 Libraries By default the data goes into the sasuser library. This is a very bad idea. You will end up with every file in one folder. Anybody using SAS can access that folder, so there are significant HIPAA issues. Right click on a file and pick Properties to see where it is stored.

16 Libraries You can see the contents of libraries by going to the Server List window and opening the local libraries “file drawer.” If you previously closed the window use the View menu to select Server List. Double click the dataset to browse it.

17 Change the Default File Location On every machine you use, you should change the default file location to the work library. Do this once per machine.

18 [Screenshot labels: Click 1st; Click 2x]

19 Permanent Store I suggest that you save your data into the temporary work library by default. If you have a huge file which you only want to import once, or if you want to keep a permanent copy of a SAS data file, you will want to set up a permanent library. – This is just a fancy way of specifying what folder SAS should use to save the.sas7bdat data files.
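In code, a permanent library of this kind is usually set up with a LIBNAME statement. The sketch below assumes a folder that already exists on the machine running SAS; both the library name mylib and the path are hypothetical.

```sas
/* Hypothetical name and path: the folder must already exist. */
libname mylib "C:\projects\hrp223\data";

/* Write the library's attributes, including its physical path, to the log. */
libname mylib list;
```

The statement only associates the short name with the folder. No data is written until a later step names mylib explicitly.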

20 Loading Data The Easy Way First fix the problematic registry entries that are described in the instructions on installing SAS: www.stanford.edu/class/hrp223/2010/SAS92TS2M3.pptx If a column in Excel holds a mixture of character and numeric values, programs reading the data (including SAS) can drop the cells that have character data without warning.
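For readers who import with code instead of the wizard, the sketch below shows one way to ask SAS to keep mixed columns as character data. It assumes the SAS/ACCESS Interface to PC Files is licensed and that the file path and sheet name, both hypothetical here, match your workbook. It is not a substitute for the registry fix described above.

```sas
/* Hypothetical path and sheet name. DBMS=EXCEL requires the
   SAS/ACCESS Interface to PC Files. */
proc import datafile="C:\projects\hrp223\scores.xls"
            out=work.scores_raw
            dbms=excel
            replace;
    sheet="Sheet1";
    getnames=yes;   /* use the first row as variable names */
    mixed=yes;      /* read a column with mixed types as character */
run;
```

With MIXED=YES a column that holds both numbers and text arrives as character, so numeric values in it need converting afterwards.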

22 Importing the Easy Way The most bulletproof way to import with EG 4.2 is to use the import wizard.

23 Always check this on.

24 Double check that it guesses the right Type, especially for dates.
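The same check can be made after the import with PROC CONTENTS, which lists each variable's type and format. The sketch below builds a small hypothetical dataset (work.visits, with made-up values) so that it runs on its own; in practice you would point PROC CONTENTS at the imported table.

```sas
/* Hypothetical data with a date column, read with the MMDDYY10. informat. */
data work.visits;
    input id visit_date :mmddyy10.;
    format visit_date date9.;
    datalines;
1 10/04/2010
2 11/15/2010
;
run;

/* A correctly imported date shows Type = Num with a date format.
   Type = Char means it came in as text. */
proc contents data=work.visits;
run;
```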

25 Tell SAS that there is a folder which can hold data by creating a library. This only makes it aware of the folder. It does not automatically put stuff in the folder.

26 It’s just a folder! When the library is created it is just a pointer to a preexisting folder. That folder can contain anything. When you want to use the folder you need to explicitly tell EG to store data in the folder. First rename the node and draw an arrow to indicate where the library is used. These changes are only aesthetic.
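A minimal sketch of explicitly storing data in a library from code: a two-level name (library.dataset) in the DATA statement decides where the file goes. The library name mylib, the folder path, and the dataset names are hypothetical, and the folder must already exist.

```sas
/* Hypothetical library pointing at an existing folder. */
libname mylib "C:\projects\hrp223\data";

/* A small temporary dataset in work, with made-up values. */
data work.scores;
    input id score;
    datalines;
1 88
2 92
;
run;

/* Naming mylib explicitly writes scores.sas7bdat into that folder. */
data mylib.scores;
    set work.scores;
run;
```

Without the mylib prefix, the second step would write to the work library again and the folder would stay empty.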

27 Now it looks good, but the import still goes into work. 1st, rename the node to match the library name. 2nd, add a line to the flowchart connecting the library to the import. It just looks good.

28 Find your library here.

29 Notice it is in the library. A “design feature” is that you have to Refresh the library to see the freshly added file.

30 Playing with Data Once the data is imported you can add code “nodes” to the flowchart or use the graphical user interface to tweak the data and do analyses. [Screenshot labels: Complex changes; Quick and easy subset and sorting]

32 It gives you more options as you add in sort variables. SQL is built behind the scenes. Note the awful new name.
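The SQL that EG builds for a filter and sort looks roughly like the hand-written sketch below. The table work.scores, its columns, and the cutoff of 80 are invented for illustration. The t1 alias mirrors the style EG uses, and the output table is given a readable name in place of a generated one.

```sas
/* Hypothetical input data. */
data work.scores;
    input id name $ score;
    datalines;
1 Ann 88
2 Bob 92
3 Cat 79
;
run;

proc sql;
    /* Keep rows with score >= 80, highest score first, ties ordered by name. */
    create table work.scores_sorted as
    select t1.id, t1.name, t1.score
    from work.scores t1
    where t1.score >= 80
    order by t1.score desc, t1.name;
quit;
```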

33 Convert to a 4 digit number with the input function: input( t1.score, 4. )
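Written out in full, that expression might sit in a query like the sketch below. It assumes score was imported as a character column; the table names, the new column name score_num, and the values are hypothetical.

```sas
/* Hypothetical data where score was read as text (note the $). */
data work.scores_raw;
    input id score $;
    datalines;
1 88
2 92
3 1005
;
run;

proc sql;
    /* input() reads up to 4 characters of score with the 4. informat
       and returns a numeric value. */
    create table work.scores_num as
    select t1.id,
           input(t1.score, 4.) as score_num
    from work.scores_raw t1;
quit;
```

The width in the informat should be at least as long as the longest value; a narrower width would read only the leading characters.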

34 Context sensitive menus help you describe the data you are browsing. [Screenshot labels: Before; After]

35 Descriptive Statistics [Screenshot label: drag]

