The Information Delivery Process: Data In, Information Out (Manage, Organize, Exploit)


1 The Information Delivery Process: Data In, Information Out (Manage, Organize, Exploit)

2 Turning Data Into Information
[Diagram labels: Data, DATA step, SAS data sets, PROC steps, Information]

3 Turning Data Into Information
Process of delivering meaningful information:
- 80% data-related: access, scrub, transform, manage, store and retrieve
- 20% analysis

4 The Raw Data
Partial fixed-column raw data file: [listing not captured in the transcript]

5 Browsing the Data Values

6 Reading a Raw Data File
[Diagram: raw data file -> SAS data set]

7 Reading Raw Data Files
[Diagram: raw data file -> DATA step -> SAS data set]
    data ...;
       infile ...;
       input ...;
    run;
Partial raw data (the spacing between fields was lost in the transcript):
    0031GOLDENBERG DESIREE
    0040WILLIAMS ARLENE M.
    0071PERRY ROBERT A.
    0082MCGWIER-WATTSCHRISTINA

8 Reading Raw Data Files
To create a SAS data set from a raw data file, you must
- start a DATA step and name the SAS data set being created (DATA statement)
- identify the location of the raw data file to read (INFILE statement)
- describe how to read the data fields from the raw data file (INPUT statement).

9 Creating a SAS Data Set with the DATA Statement
General form of the DATA statement:
    DATA SAS-data-set(s);
This DATA statement creates a SAS data set called WORK.EMPDATA:
    data work.empdata;

10 Pointing to a Raw Data File with the INFILE Statement
General form of the INFILE statement:
    INFILE 'filename';
Examples:
    OS/390   infile 'edc.prog1.employee';
    UNIX     infile '/user/prog1/employee.dat';
    WIN      infile 'C:\workshop\winsas\prog1\employee.dat';

11 Reading Raw Data Using Column Input
General form of column input:
    INPUT variable $ startcol-endcol ...;
To read raw data values with column input,
1. name the SAS variable you want to create
2. use a dollar sign, $, if the SAS variable is character
3. specify the starting column, a dash, and the ending column of the raw data field.

12 Reading Raw Data Using Column Input
Raw data record, with a column ruler (field spacing restored from the column ranges in the INPUT statement):
    1---5----0----5----0----5----0----5----0----5
    0031GOLDENBERG   DESIREE      PILOT1 50221.62
The INPUT statement so far:
    input empid $ 1-4
          lastname $ 5-17

13 Reading Raw Data Using Column Input
    input empid $ 1-4
          lastname $ 5-17
          firstname $ 18-30

14 Reading Raw Data Using Column Input
    input empid $ 1-4
          lastname $ 5-17
          firstname $ 18-30
          jobcode $ 31-36

15 Reading Raw Data Using Column Input
    input empid $ 1-4
          lastname $ 5-17
          firstname $ 18-30
          jobcode $ 31-36
          salary 37-45;
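
The completed column-input statement can be tried without an external file. The following is a minimal sketch, assuming the record is supplied in-stream with a DATALINES statement (which the slides do not use) in place of INFILE, and assuming the field spacing implied by the column ranges above:

```sas
/* Sketch: column input against one in-stream record.            */
/* DATALINES replaces the INFILE statement so the step is        */
/* self-contained; the record spacing is reconstructed from the  */
/* column ranges and is an assumption.                           */
data work.empdata;
   input empid     $  1-4
         lastname  $  5-17
         firstname $ 18-30
         jobcode   $ 31-36
         salary      37-45;
   datalines;
0031GOLDENBERG   DESIREE      PILOT1 50221.62
;
run;

/* List the resulting observation. */
proc print data=work.empdata;
run;
```

Because SALARY has no dollar sign in the INPUT statement, it is created as a numeric variable; the other four variables are character.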


17 Business Scenario
International Airlines is preparing to review its flight crew. The immediate goal is to read the Excel spreadsheet and create a SAS data set.
[Diagram: Excel spreadsheet -> SAS data set]

18 What is the Import Wizard?
A point-and-click graphical interface that enables you to create a SAS data set from several types of external files, including
- dBASE file (*.DBF)
- Excel 97 spreadsheet (*.XLS)
- Microsoft Access table
- delimited file (*.*)
- comma-separated values (*.CSV)

19 The Raw Data
The aircraft data is stored in a fixed-column raw data file. The fields are aircraft model, aircraft ID, date in service, and last maintenance date.
Partial data: [listing not captured in the transcript]

20 Using Formatted Input
The raw data file will be read with formatted input.
[Diagram: raw data file -> DATA step -> SAS data set]
    data sas-data-set-name;
       infile raw-filename;
       input pointer-control variable informat-name;
    run;

21 What is a SAS Format?
A format is an instruction that the SAS System uses to write data values. SAS formats have the following form:
    <$>format<w>.<d>

22 SAS Formats
Selected SAS formats:
    w.d         standard numeric format
    $w.         standard character format
    COMMAw.d    commas in a number: 12,234.21
    DOLLARw.d   dollar signs and commas in a number: $12,234.41
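
To see what these formats write, a value can be sent to the SAS log with the PUT statement. This is a small sketch, not part of the original slides; the variable names and the widths chosen (8.2, $10., COMMA10.2, DOLLAR11.2) are illustrative assumptions:

```sas
/* Sketch: write one value with several formats.                 */
/* DATA _NULL_ runs the step without creating a data set;        */
/* PUT writes to the SAS log.                                     */
data _null_;
   amount = 12234.21;
   name   = 'GOLDENBERG';
   put amount 8.2;          /* w.d: standard numeric             */
   put name $10.;           /* $w.: standard character           */
   put amount comma10.2;    /* commas inserted                   */
   put amount dollar11.2;   /* dollar sign and commas inserted   */
run;
```

A format changes only how a value is displayed; the stored value of AMOUNT is the same in all four PUT statements.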

23 SAS Formats

24 Using Formatted Input
General form of the INPUT statement with formatted input:
    INPUT pointer-control variable informat ...;
Pointer controls:
    @n   moves the pointer to column n.
    +n   moves the pointer n positions.

25 Using Formatted Input
Formatted input can be used to read nonstandard data values by
- moving the input pointer to the starting position of the field
- specifying a variable name
- specifying an informat.
An informat specifies the width of the input field and how to read the data values that are stored in the field.

26 Using Formatted Input
General form of an informat:
    $informat-namew.d
    $               indicates a character informat.
    informat-name   names the informat.
    w               is an optional field width.
    .               is the required delimiter.
    d               optionally specifies a decimal for numeric informats.

27 Selected Informats
    7. or 7.0   reads seven columns of numeric data.
    7.2         reads seven columns of numeric data and inserts a decimal point in the data value.
    $5.         reads five columns of character data and removes leading blanks.
    $CHAR5.     reads five columns of character data and preserves leading blanks.

28 Selected Informats
    COMMA7.     reads seven columns of numeric data and removes selected nonnumeric characters, such as dollar signs and commas.
    PD4.        reads four columns of packed decimal data.
    MMDDYY10.   reads dates of the form 01/20/2000.
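
The following sketch shows pointer controls and two of these informats working together. It is an illustration under stated assumptions: the data set name, the variable names AMOUNT and HIRED, the column positions, and the in-stream record are all made up for the example.

```sas
/* Sketch: formatted input with @n pointer control.              */
/* COMMA10. strips the dollar sign and comma from columns 1-10;  */
/* MMDDYY10. converts the date in columns 12-21 to a SAS date.   */
data work.informat_demo;
   input @1  amount comma10.
         @12 hired  mmddyy10.;
   datalines;
$12,234.21 01/20/2000
;
run;

/* Formats are applied only for display. */
proc print data=work.informat_demo;
   format amount comma10.2 hired date9.;
run;
```

Without the FORMAT statement, HIRED would print as a plain number, because SAS stores dates as numeric values.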

29 Working with Date Values
The raw data file contains date values. These date values will be read with the MMDDYY10. informat.

30 Converting Dates to SAS Date Values
SAS uses date informats to read and convert dates to SAS date values. For example:
    Stored Value   Informat    Converted Value
    10/29/1999     MMDDYY10.   14546
    29OCT1999      DATE9.      14546
    29/10/1999     DDMMYY10.   14546
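
A SAS date value is the number of days since January 1, 1960, which is why all three representations convert to the same number. The conversion can be checked with the INPUT function, which the slides do not cover; this sketch assumes it only as a convenient way to apply an informat to a character string:

```sas
/* Sketch: three date informats applied to the same calendar day. */
data _null_;
   d1 = input('10/29/1999', mmddyy10.);
   d2 = input('29OCT1999',  date9.);
   d3 = input('29/10/1999', ddmmyy10.);
   put d1= d2= d3=;      /* each value is 14546 */
   put d1= date9.;       /* d1=29OCT1999        */
run;
```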

31 SAS Formats
Selected SAS date formats:
    MMDDYYw.   101692       (MMDDYY6.)
               10/16/92     (MMDDYY8.)
               10/16/1992   (MMDDYY10.)
    DATEw.     16OCT92      (DATE7.)
               16OCT1992    (DATE9.)
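
The table can be reproduced with a short step that writes one date value with each format. This is a sketch only; the date literal '16OCT1992'd and the variable name are not from the slides:

```sas
/* Sketch: one SAS date value written with five date formats. */
data _null_;
   d = '16OCT1992'd;    /* date literal */
   put d mmddyy6.;      /* 101692       */
   put d mmddyy8.;      /* 10/16/92     */
   put d mmddyy10.;     /* 10/16/1992   */
   put d date7.;        /* 16OCT92      */
   put d date9.;        /* 16OCT1992    */
run;
```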

32 Locating and Browsing the Raw Data File
Browse the raw data file and determine the column layout and type. The fields are aircraft model, aircraft ID, date in service, and last maintenance date.
Partial raw data file: [listing not captured in the transcript]

33 Starting the DATA Step
Use the DATA statement to begin the DATA step and name the SAS data set:
    data work.aircraft;
       other SAS statements
    run;
Use the INFILE statement to identify the input raw data file:
    data work.aircraft;
       infile 'aircraft.dat';
       other SAS statements
    run;

34 Writing the INPUT Statement
Use the INPUT statement and pointer control to read the record starting with the first column. Read the value with the $16. informat and assign it to the variable MODEL.
    data work.aircraft;
       infile 'aircraft.dat';
       input @1 model $16.
       other SAS statements
    run;

35 Writing the INPUT Statement
Use the INPUT statement and pointer control to read the record starting with column 18. Read the value with the $6. informat and assign the value to AIRCRAFTID.
    data work.aircraft;
       infile 'aircraft.dat';
       input @1 model $16.
             @18 aircraftid $6.
       other SAS statements
    run;

36 Writing the INPUT Statement
Use the INPUT statement and pointer control to read the record starting with column 25. Read the value with the MMDDYY10. informat and assign the value to INSERVICE.
    data work.aircraft;
       infile 'aircraft.dat';
       input @1 model $16.
             @18 aircraftid $6.
             @25 inservice mmddyy10.
       other SAS statements
    run;

37 Writing the INPUT Statement
Use the INPUT statement and pointer control to read the record starting with column 36. Read the value with the MMDDYY10. informat and assign the value to LASTMAINT.
    data work.aircraft;
       infile 'aircraft.dat';
       input @1 model $16.
             @18 aircraftid $6.
             @25 inservice mmddyy10.
             @36 lastmaint mmddyy10.;
    run;
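
Since the contents of aircraft.dat are not shown in the transcript, the finished step can be exercised with in-stream data instead. In this sketch the two records are made-up sample values laid out to match the column positions above, and DATALINES stands in for the INFILE statement:

```sas
/* Sketch: the completed formatted-input step, run against       */
/* hypothetical records (not the actual aircraft.dat contents).  */
/* Columns: model 1-16, aircraftid 18-23, inservice 25-34,       */
/* lastmaint 36-45.                                               */
data work.aircraft;
   input @1  model      $16.
         @18 aircraftid $6.
         @25 inservice  mmddyy10.
         @36 lastmaint  mmddyy10.;
   datalines;
JetCruise LF5200 030003 04/05/1994 03/11/2001
JetCruise MF2100 010003 05/10/1987 12/01/2000
;
run;

/* Display the stored date values in readable form. */
proc print data=work.aircraft;
   format inservice lastmaint date9.;
run;
```

The $16. informat reads all sixteen columns, including the embedded blank in the model name, which list-style reading would treat as a separator.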

38 SAS Syntax Rules
SAS statements are free-format.
- They can begin and end in any column.
- One or more blanks or special characters can be used to separate words.
- A single statement can span multiple lines.
- Several statements can be on the same line.
Example (shown on the slide with unconventional spacing, which the transcript does not preserve):
    data work.mech_pilot;
    infile 'c:\coursedata\emplist.dat';
    input lastname $ 1-20 firstname $ 21-30 jobtitle $ 36-43 salary 54-59;
    run;
    proc means data=work.mech_pilot n mean;
    class jobtitle; var salary;run;


40 SAS Syntax Rules
    data work.mech_pilot;
       infile 'c:\coursedata\emplist.dat';
       input lastname $ 1-20 firstname $ 21-30
             jobtitle $ 36-43 salary 54-59;
    run;
    proc print data=work.mech_pilot;
    run;
    proc means data=work.mech_pilot n mean;
       class jobtitle;
       var salary;
    run;
SAS statements
- usually begin with an identifying keyword
- always end with a semicolon.

41 Adding a New Variable
Create a new variable by extracting the four-digit year values from the SAS date values.

42 Using an Assignment Statement
An assignment statement evaluates an expression and assigns the resulting value to a variable. General syntax of an assignment statement:
    variable=expression;

43 Using Operators
Selected operators for basic arithmetic calculations in an assignment statement: [table not captured in the transcript]

44 Using SAS Functions
A SAS function is a routine that returns a value that is determined from specified arguments. General syntax of a SAS function:
    function-name(argument1,argument2,...)

45 Using SAS Functions
SAS functions
- perform arithmetic operations
- compute statistics (for example, mean)
- manipulate SAS dates and process character values
- perform many other tasks.
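
A few assignment statements illustrate these ideas. This is a sketch with made-up variable names and values; MEAN, YEAR, and UPCASE are standard SAS functions chosen here as examples of the statistical, date, and character categories:

```sas
/* Sketch: assignment statements using an arithmetic operator    */
/* and three SAS functions. Results are written to the log.      */
data _null_;
   salary = 50221.62;
   bonus  = 0.05 * salary;         /* arithmetic expression        */
   avg    = mean(10, 20, 60);      /* statistic: 30                */
   yr     = year('29OCT1999'd);    /* date function: 1999          */
   name   = upcase('goldenberg');  /* character function           */
   put bonus= avg= yr= name=;
run;
```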

46 Creating a Vertical Bar Chart
Use the GCHART procedure and the VBAR statement to create a vertical bar chart.
    proc gchart data=work.aircraft;
       vbar yrbeg_service;
       title 'Aircraft In Service, by Year';
    run;

47 Reading a Subset of Raw Data
Use the DATA step that was written earlier. Add a subsetting IF statement to process only the subset in which the value of AGE is at least 15.
    data work.aircraft;
       infile 'aircraft.dat';
       input @1 model $16.
             @18 aircraftid $6.
             @25 inservice mmddyy10.
             @36 lastmaint mmddyy10.;
       yrbeg_service=year(inservice);
       age=year(today())-yrbeg_service;
       if age>=15;
    run;
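
The effect of the subsetting IF is easier to see with a fixed reference date, because TODAY() gives a different result each year. The following sketch assumes a reference date of 01JAN2005, a shortened record layout, and two made-up records; none of these come from the slides:

```sas
/* Sketch: subsetting IF with a fixed reference year.            */
/* The reference date, layout, and records are assumptions.      */
data work.aircraft_age;
   input aircraftid $ 1-6
         @8 inservice mmddyy10.;
   yrbeg_service = year(inservice);
   age = year('01JAN2005'd) - yrbeg_service;
   if age >= 15;          /* keep only observations where this is true */
   format inservice date9.;
   datalines;
030003 04/05/1994
010003 05/10/1987
;
run;

proc print data=work.aircraft_age;
run;
```

With these values, the 1994 aircraft has an AGE of 11 and is dropped, while the 1987 aircraft has an AGE of 18 and is written to the output data set.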

48 What Is a SAS Data Library?

49 What Is a SAS Data Library?
Regardless of which host operating system you use, you identify SAS data libraries by assigning each one a libref.

50 What Is a SAS Data Library?
By default, SAS creates two SAS data libraries:
- a temporary library called WORK
- a permanent library called SASUSER.

51 SAS Data Libraries
You can think of a SAS data library as a drawer in a filing cabinet and a SAS data set as one of the file folders in the drawer.

52 SAS Data Libraries
When you invoke SAS, you automatically have access to a temporary and a permanent SAS data library:
- WORK: temporary library
- SASUSER: permanent library
You can create and access your own permanent libraries:
- IA: permanent library

53 Reading a SAS Data Set
[Diagram: the SET statement names the input data set and the DATA statement names the output data set; each can be a temporary or a permanent SAS data set.]

54 Two-level SAS Filenames
Every SAS file has a two-level name:
    libref.filename
The first name (libref) refers to the library. The second name (filename) refers to the file in the library. The data set MECH_PILOT is a SAS file in the WORK library.
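
The slides do not show how a libref such as IA is assigned; in SAS this is normally done with the LIBNAME statement. The sketch below assumes a directory path (borrowed from the Windows INFILE example, and assumed to exist) and a one-record sample data set, then copies a temporary data set into the permanent library with the SET statement:

```sas
/* Sketch: assign a libref and make a permanent copy.            */
/* The path is an assumption; substitute an existing directory.  */
libname ia 'C:\workshop\winsas\prog1';

/* Temporary data set in WORK (sample record for illustration). */
data work.mech_pilot;
   input lastname $ 1-10 salary 12-19;
   datalines;
GOLDENBERG 50221.62
;
run;

/* DATA names the output data set; SET names the input data set. */
data ia.mech_pilot;
   set work.mech_pilot;
run;

proc print data=ia.mech_pilot;
run;
```

WORK.MECH_PILOT is deleted when the SAS session ends, while IA.MECH_PILOT remains in the directory that the libref points to.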

55 Browsing the Data Portion
The PRINT procedure displays the data portion of a SAS data set. By default, PROC PRINT displays
- all observations
- all variables
- an OBS column on the left-hand side.

56 Browsing the Data Portion
General form of the PRINT procedure:
    PROC PRINT DATA=SAS-data-set;
    RUN;
Example:
    proc print data=work.empdata;
    run;

57 Objectives
- Generate list reports using the PRINT procedure.
- Display selected variables in a list report using the VAR statement.
- Display selected observations in a list report using the WHERE statement.
- Sort the observations in a SAS data set using the SORT procedure.

58 Creating a List Report
    proc print data=work.empdata;
       var empid salary jobcode;
    run;

59 Formatting Data Values
    proc print data=work.empsort;
       format salary dollar11.2;
    run;
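
The objectives above mention the WHERE statement and the SORT procedure, which the captured slides do not demonstrate. The following sketch combines them with VAR and FORMAT. Only the first record comes from the slides; the other two records, the shortened column layout, and the salary threshold are assumptions for illustration:

```sas
/* Sketch: sort, then print selected variables and observations. */
data work.empdata;
   input empid $ 1-4 jobcode $ 6-11 salary 13-20;
   datalines;
0031 PILOT1 50221.62
0040 FLTAT2 23666.12
0071 FLTAT3 21957.71
;
run;

/* OUT= writes the sorted copy and leaves WORK.EMPDATA unchanged. */
proc sort data=work.empdata out=work.empsort;
   by jobcode;
run;

proc print data=work.empsort;
   var empid salary jobcode;     /* selected variables, in this order */
   where salary > 22000;         /* selected observations             */
   format salary dollar11.2;     /* display only                      */
run;
```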

60 Creating a Frequency Report

61 Creating a Frequency Report
The FREQ procedure displays frequency counts of the data values in a SAS data set. General form of a simple PROC FREQ step:
    PROC FREQ DATA=SAS-data-set;
    RUN;
Example:
    proc freq data=work.empsort;
    run;

62 Creating a One-Way Frequency Report
Only variables listed on the TABLES statement are included in the frequency counts. These are typically variables that have a limited number of distinct values. General form of a PROC FREQ step:
    PROC FREQ DATA=SAS-data-set;
       TABLES SAS-variables;
    RUN;
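
A one-way frequency report on a single variable might look like the following sketch. The data set contents are made-up sample values (apart from the PILOT1 salary taken from the slides), chosen so that one job code repeats:

```sas
/* Sketch: one-way frequency counts for JOBCODE only.            */
/* The records are hypothetical sample data.                     */
data work.empsort;
   input jobcode $ 1-6 salary 8-15;
   datalines;
FLTAT2 23666.12
FLTAT2 24110.50
PILOT1 50221.62
;
run;

proc freq data=work.empsort;
   tables jobcode;     /* SALARY is left out of the report */
run;
```

Leaving SALARY off the TABLES statement avoids a table with one row per distinct salary value.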

63 Calculating Job Code Frequencies

64 Calculating Salary Frequencies

65 Calculating Job Code/Salary Frequencies

66 Creating a Frequency Report
By default, PROC FREQ
- analyzes every variable in the SAS data set
- displays each distinct data value
- calculates the number of observations in which each data value appears (and the corresponding percentage)
- indicates for each variable how many observations have missing values.

67 Calculating Summary Statistics
The MEANS procedure displays simple descriptive statistics for the numeric variables in a SAS data set. General form of a simple PROC MEANS step:
    PROC MEANS DATA=SAS-data-set;
    RUN;
Example:
    proc means data=ia.aircraftcap;
    run;

68 Calculating Summary Statistics
    proc means data=ia.aircraftcap;
    run;

69 Calculating Summary Statistics
By default, PROC MEANS
- analyzes every numeric variable in the SAS data set
- prints the statistics N, MEAN, STD, MIN, and MAX
- excludes missing values before calculating statistics.

70 Selecting Variables
    proc means data=ia.aircraftcap;
       var totpasscap;
    run;

71 Grouping Observations
    proc means data=ia.aircraftcap maxdec=2;
       var totpasscap;
       class model;
    run;
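
Because the IA.AIRCRAFTCAP data set is not included in the transcript, the grouped PROC MEANS step can be tried on a small stand-in. In this sketch the data set is created in WORK, and the model names and capacity values are made up:

```sas
/* Sketch: PROC MEANS with VAR, CLASS, and MAXDEC= on            */
/* hypothetical data standing in for IA.AIRCRAFTCAP.             */
data work.aircraftcap;
   input model $ 1-16 totpasscap 18-20;
   datalines;
JetCruise LF5200 207
JetCruise LF5200 207
JetCruise MF2100 150
JetCruise MF2100 172
;
run;

proc means data=work.aircraftcap maxdec=2;
   var totpasscap;     /* analyze this variable only              */
   class model;        /* one row of statistics per model         */
run;
```

MAXDEC=2 limits the printed statistics to two decimal places, and CLASS does not require the data to be sorted beforehand.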

72 Calculating Capacity Statistics for Each Type of Plane

