Statistics in Science: Introducing SAS® software. Acknowledgements to David Williams, Caroline Brophy.

Presentation transcript:

Need to know
– SAS environment
– SAS files (data sets, catalogs, etc.) and libraries
– SAS programs
How to:
– get data in
– manipulate data
– get results out

SAS software environment

SAS Windows (SAS 9)

Some (!) SAS windows
– Editor: where code is written or imported, and submitted
– Log: what happened, including what went wrong
– Output: results of procedures that produce output
– Explorer: shows libraries (SAS and Windows) and their files, and is where you can see data and graphs
– Results: shows how the output is made up of tables, graphs, data sets, etc.
– Notepad: a useful place to keep bits of code

SAS software programs

SAS Programs
The DATA step creates a SAS data set:
  data one;
    input x y;
    datalines;
  ;
  run;
PROC steps process the data in the data set:
  proc print data = one (obs = 5);
  run;
  proc means data = one;
  run;

Step Boundaries
SAS steps begin with
– a DATA statement
– a PROC statement.
SAS detects the end of a step when it encounters
– a RUN statement (for most steps)
– a QUIT statement (for some procedures)
– the beginning of another step (DATA statement or PROC statement).
Recommendation: use RUN; at the end of each step.

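Step Boundaries
  data seedwt;
    input oz $ rad wt;
    datalines;
  Low High Low
  ;
  run;                        /* RUN ends the DATA step */
  proc print data = seedwt;   /* no RUN: this step ends where the next PROC begins */
  proc means data = seedwt;
    class oz;
    var rad wt;
  run;                        /* RUN ends the PROC MEANS step */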

Submitting a SAS Program
When you execute a SAS program, the output generated by SAS is divided into two major parts:
– SAS log: contains information about the processing of the SAS program, including any warning and error messages.
– SAS output: contains reports generated by SAS procedures and DATA steps.

Recommended steps!
1) Submit all (or selected) code by
   – pressing F4
   – clicking on the runner in the toolbar
2) Read the log
3) Look in the output window if you expect the code to produce output
4) Problems
   – Bad syntax
   – Missing ; at end of line
   – Missing quote ' at end of title (nasty!)

Improved output: HTML
Tools > Options > Preferences > Results
– Do this and resubmit the code
– Check the HTML output in the Results window

SAS data sets

SAS data sets
SAS procedures (PROC …) process data from SAS data sets.
Need to know (briefly!)
– What a SAS data set looks like
– How to get data into a SAS data set

SAS data sets
– live in libraries
– have a descriptor part (with useful info)
– have a data part, which is a rectangular table of character and/or numeric data values (rows are called observations)
– have names with the syntax libname.datasetname; libname defaults to work if omitted

work library
SAS data sets with a single-part name like oz, wp or mybestdata99
1) are stored in the work library
2) can be referenced e.g. as mybestdata99 or work.mybestdata99
3) are deleted at the end of the SAS session!
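A minimal sketch of the one-part and two-part naming described above. The data set name mybestdata99 comes from the slide; the variable x and its values are made up for illustration.

```sas
/* Create a temporary data set using a one-part name. */
/* The values of x are hypothetical.                  */
data mybestdata99;
  input x;
  datalines;
1
2
3
;
run;

/* Refer to the same data set by its two-part name. */
proc print data = work.mybestdata99;
run;
```

Both names point at the same data set in the work library, so it is gone once the SAS session ends.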

Don't lose your data! Keep the SAS program that read the data from its original source... More later!

Viewing descriptor & data
  /* view descriptor part */
  proc contents data = wp;
  run;
  /* view data part */
  proc print data = work.wp;
  run;
Alternatively, use SAS Explorer:
– Open (for data)
– Properties (for descriptor)
Properties is not as clear as CONTENTS.

SAS variables
There are two types of variables:
– character: can contain any value (letters, numbers, special characters, and blanks). Character values are stored with a length of 1 to 32,767 bytes (default is 8). One byte equals one character.
– numeric: stored as floating point numbers in 8 bytes of storage by default. Eight bytes of floating point storage provide space for 16 or 17 significant digits. You are not restricted to 8 digits.
Don't change the 8-byte length!

SAS variables
Output:
  The CONTENTS Procedure
  Alphabetic List of Variables and Attributes
  #  Variable  Type  Len
  1  oz        Char  8
  2  rad       Num   8
  3  wt        Num   8

SAS names (for data sets & variables)
– can be up to 32 characters long
– can be uppercase, lowercase, or mixed-case, but are not case sensitive!
– must start with a letter or underscore. Subsequent characters can be letters, underscores, or numeric digits; no other characters or spaces.

Missing Data Values
  LastName   FirstName  JobTitle  Salary
  TORRES     JAN        Pilot
  LANGKAMM   SARAH      Mechanic
  SMITH      MICHAEL    Mechanic  .
  WAGSCHAL   NADJA      Pilot
  TOERMOEN   JOCHEN
A value must exist for every variable for each observation. Missing values are valid values.
– A numeric missing value is displayed as a period.
– A character missing value is displayed as a blank.
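The display rules above can be checked with a small sketch. The data set name staff, the second row and the salary figure are hypothetical. With list input, a single period in the data lines stands for a missing value for both numeric and character variables.

```sas
/* Hypothetical values, for illustration only. */
data staff;
  length LastName $ 10 JobTitle $ 10;
  input LastName $ JobTitle $ Salary;
  datalines;
SMITH Mechanic .
JONES . 100
;
run;

/* Numeric missing prints as a period, character missing as a blank. */
proc print data = staff;
run;

/* N counts non-missing values, NMISS counts missing ones. */
proc means data = staff n nmiss;
  var Salary;
run;
```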

SAS syntax
– Not case sensitive
– Each 'line' usually begins with a keyword and ends with ;
Common errors:
– Forgetting ;
– Misspelt or wrong keyword
– Missing final quote in title
  title 'Woodpecker Habitat;    /* quote mark missing */
  title 'Woodpecker Habitat';

Comments
1. Type /* to begin a comment.
2. Type your comment text.
3. Type */ to end the comment.
To comment out selected text, remember: Ctrl+/
Alternative: * comment ;
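A short sketch of both comment styles in a runnable program. It assumes the sample data set sashelp.class that ships with SAS; any other data set would do.

```sas
/* A block comment can span
   several lines.            */

* A statement comment ends at the next semicolon;

proc print data = sashelp.class (obs = 3);  /* a comment can follow code */
run;
```

The block style is the safer choice for commenting out code, since a statement comment ends at the first semicolon it meets.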

Creating a SAS data set

Getting data in!
Consider 2 methods:
1) Data in program (briefly!)
2) Data in Excel workbook

Getting data in!
Data in program file:
  data oz;
    input oz $ rad wt;
    datalines;
  Low High Low
  ;
  run;
Note:
1. oz is a text variable so requires $
2. No missing values
3. Values of oz don't contain spaces and are at most 8 characters long
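Note 3 matters because a character variable read this way gets the default length of 8, so longer values are cut short. One way round this, sketched below with made-up data, is to declare the length before the INPUT statement. The data set name oz_long and all data values are hypothetical.

```sas
/* Hypothetical data: "Intermediate" is longer than 8 characters. */
data oz_long;
  length oz $ 12;          /* declare the length before INPUT */
  input oz $ rad wt;
  datalines;
Low 1 10
Intermediate 2 20
;
run;

/* The descriptor part should now list oz as Char with Len 12. */
proc contents data = oz_long;
run;
```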

Getting data in from Excel! Use the Import Wizard, saving the program it generates to reduce future clicking!
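A saved import program is typically a PROC IMPORT step along the lines of the sketch below. The file path and sheet name are hypothetical, the XLSX engine is assumed to be available (it needs SAS/ACCESS Interface to PC Files), and the code your wizard writes may differ in detail.

```sas
/* Hypothetical path and sheet name; adjust to your own workbook. */
proc import datafile = "C:\mydata\woodpecker.xlsx"
            out = work.wp
            dbms = xlsx
            replace;
  sheet = "Sheet1";
  getnames = yes;   /* first row holds the variable names */
run;
```

Resubmitting this step re-reads the workbook without going through the wizard again.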

Creating new variables
Adding a new variable to an existing SAS data set (say work.old):
1. Use set
2. Give the definition of the new variable
  data new;
    /* read data from work.old */
    set old;
    y2 = y**2;
    ly = log(y);            /* natural logarithm */
    ly_base10 = log10(y);
    t1 = (treat = 1);       /* 1 if treat equals 1, otherwise 0 */
  run;
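To try the step above without an existing data set, here is a self-contained sketch. The contents of work.old are invented for illustration, with treat coded numerically as 1 or 2.

```sas
/* Hypothetical input data; treat is coded 1 or 2 here. */
data old;
  input treat y;
  datalines;
1 10
1 100
2 1000
;
run;

data new;
  set old;                   /* read each observation from work.old */
  y2        = y**2;
  ly        = log(y);        /* natural logarithm */
  ly_base10 = log10(y);
  t1        = (treat = 1);   /* 1 if the comparison is true, 0 if false */
run;

proc print data = new;
run;
```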

Data set: work.new
  Obs  treat  y  ysquared  logy  logy_base10  t1
  1    A
  2    A
  3    B
  4    B
  5    B

Data Screening

Data Screening: checking input data for gross errors
– Use the PRINT procedure to scan for obvious anomalies
– Use the MEANS procedure & examine the summary table
  – MAXIMUM, MINIMUM: reasonable?
  – MEAN: near middle of range?
  – MISSING VALUES: input or calculation error, e.g. log(0)?
  – CV (= 100*std.dev/mean): 50% implies skewness for any positive variable

SAS syntax
MEANS syntax: what else should go here?
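One possible screening run, as a sketch: the statistic keywords on the PROC MEANS statement request the summaries discussed under Data Screening. The data set and variable names follow the earlier seedwt example, but the data values here are made up.

```sas
/* Hypothetical values for rad and wt, for illustration only. */
data seedwt;
  input oz $ rad wt;
  datalines;
Low 1 10.5
Low 2 .
High 1 12.0
High 2 11.2
;
run;

/* N, NMISS, MIN, MAX, MEAN, STD and CV are statistic keywords. */
proc means data = seedwt n nmiss min max mean std cv;
  class oz;      /* separate summary for each level of oz */
  var rad wt;    /* variables to summarise */
run;
```

NMISS flags the missing wt value, while MIN and MAX make out-of-range entries easy to spot.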

Dealing with data errors
– Check original records
– Change mistakes in recording where the correct value is beyond question
– Regenerate observations where possible, e.g. reweigh sample, redo chemical analysis
– With a large body of data in an unbalanced design, err on the side of omitting questionable data
– Do not proceed until the data have been properly cleaned; if necessary, perform a number of screening runs