

1 How to start using SAS SARBAJIT MUKHERJEE

2 WHAT IS SAS? SAS stands for Statistical Analysis System. Useful for the following types of task: 1. Data entry, retrieval, and management 2. Report writing and graphics 3. Statistical and mathematical analysis

3 SAS programs A SAS program is a sequence of steps that the user submits for execution. DATA steps are typically used to create SAS data sets. PROC steps are typically used to process SAS data sets (that is, to generate reports and graphs, edit data, sort data, and analyze data).
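A minimal sketch of a program with one DATA step and two PROC steps. The data set name scores, its variables, and its values are made up for illustration and do not come from the slides.

```sas
/* DATA step: create a temporary data set named scores (hypothetical example data) */
DATA scores;
  INPUT Name $ Score;
  DATALINES;
Ann 82
Bob 75
Cara 91
;
RUN;

/* PROC step: sort the data set by Score, highest first */
PROC SORT DATA = scores;
  BY DESCENDING Score;
RUN;

/* PROC step: print the sorted data set as a listing report */
PROC PRINT DATA = scores;
  TITLE 'Scores, highest first';
RUN;
```

Each step ends with a RUN statement, and SAS executes the steps in the order they are submitted.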

4 SAS Data Libraries A SAS data library is a collection of SAS files that SAS recognizes as a unit. A SAS data set is one type of SAS file stored in a data library. The Work library is a temporary library: when SAS is closed, all the data sets in the Work library are deleted. To create a permanent SAS data set, store it in your own library.
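As a rough illustration of the temporary Work library, the sketch below assumes the default setup, in which a one-level data set name is stored in Work. The data set name temp and the variable x are hypothetical.

```sas
/* One-level name: by default SAS stores this data set in the temporary Work library */
DATA temp;
  x = 1;
RUN;

/* The same data set written with an explicit two-level name (libref.dataset) */
DATA work.temp;
  x = 1;
RUN;

/* List the data sets currently held in the Work library */
PROC CONTENTS DATA = work._ALL_ NODS;
RUN;
```

Once the SAS session ends, work.temp is gone; nothing written this way survives to the next session.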

5 SAS Data Libraries Identify each SAS data library by assigning it a library reference name (libref) with the LIBNAME statement: LIBNAME libref 'file-folder-location'; E.g.: LIBNAME readData 'C:\temp\sas class\readData'; Rules for naming a libref: The name must be 8 characters or fewer. The name must begin with a letter or underscore. The remaining characters must be letters, numbers, or underscores.
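A short sketch of how a libref is typically used to keep a data set permanently. It reuses the readData libref and folder from the slide; the data set name example and the variable x are hypothetical, and the folder is assumed to exist already.

```sas
/* Assign the libref from the slide; the folder must already exist on disk */
LIBNAME readData 'C:\temp\sas class\readData';

/* Two-level name: the data set is written to the readData library, not to Work */
DATA readData.example;   /* "example" is a hypothetical data set name */
  x = 1;
RUN;

/* Release the libref when finished; the stored data set file stays on disk */
LIBNAME readData CLEAR;
```

In a later session, the same LIBNAME statement makes readData.example available again.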

6 Reading a raw data set into the SAS system In order to create a SAS data set from a raw data file, you must: start a DATA step and name the SAS data set being created (DATA statement); identify the location of the raw data file to read (INFILE statement); and describe how to read the data fields from the raw data file (INPUT statement).

7 Example 1 Reading raw data separated by spaces

temperature1.dat:
Nome AK 55 44 88 29
Miami FL 90 75 97 65
Raleigh NC 88 68 105 50

/* Create a permanent SAS data set named HighLow1;
   read the data file temperature1.dat using list input */
DATA readData.HighLow1;
  INFILE 'C:\sas class\readData\temperature1.dat';
  INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow;
RUN;
/* The PROC PRINT step creates a listing report of the readData.HighLow1 data set */
PROC PRINT DATA = readData.HighLow1;
  TITLE 'High and Low Temperatures for July';
RUN;
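For trying the example without the external file, a variant can read the same three records in-line with DATALINES instead of INFILE. This is a sketch, not the slide's method: it writes to the temporary Work library rather than to readData.

```sas
/* Same INPUT statement as Example 1, but the records are embedded in the program */
DATA HighLow1;            /* one-level name: stored in the temporary Work library */
  INPUT City $ State $ NormalHigh NormalLow RecordHigh RecordLow;
  DATALINES;
Nome AK 55 44 88 29
Miami FL 90 75 97 65
Raleigh NC 88 68 105 50
;
RUN;

/* Print the data set as a listing report */
PROC PRINT DATA = HighLow1;
  TITLE 'High and Low Temperatures for July';
RUN;
```

With list input, values must be separated by at least one space, and a character variable marked with $ holds up to 8 characters by default, which is enough for the city names here.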

8 Reading Delimited or PC Database Files with the IMPORT Procedure If your data file has the proper extension, use the simplest form of the IMPORT procedure: PROC IMPORT DATAFILE = 'filename' OUT = data-set;

Type of File                            Extension          DBMS Identifier
Comma-delimited                         .csv               CSV
Tab-delimited                           .txt               TAB
Excel                                   .xls               EXCEL
Lotus files                             .wk1, .wk3, .wk4   WK1, WK3, WK4
Delimiters other than commas or tabs                       DLM

Examples:
1. PROC IMPORT DATAFILE='c:\temp\sale.csv' OUT=readData.money; RUN;
2. PROC IMPORT DATAFILE='c:\temp\bands.xls' OUT=readData.music; RUN;
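When the extension does not identify the file type, the DBMS identifier can be given explicitly. The sketch below covers the DLM case for a pipe-delimited file; the file path and the output data set name are hypothetical, and the readData libref is assumed to be assigned already.

```sas
/* Hypothetical pipe-delimited file; path and data set name are illustrative only */
PROC IMPORT DATAFILE = 'c:\temp\sales.txt'
            OUT = readData.sales
            DBMS = DLM        /* delimiter other than comma or tab */
            REPLACE;          /* overwrite readData.sales if it already exists */
  DELIMITER = '|';            /* the character that separates fields */
  GETNAMES = YES;             /* take variable names from the first row */
RUN;
```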

9 SAS or R? I think there are several issues (in ascending order of possible validity):
Tradition / habit: People are used to SAS and don't want to have to learn something new. (Making it more difficult, the way you think in SAS and in R is different.) This can apply to anyone who might have to send you code, or read or use your code, including managers and colleagues.
Distrust of freeware: Several people say they aren't willing to accept results from R because no for-profit company vets the code to ensure it gives correct results before it goes out to customers, with its own business at stake if it gets things wrong.
Big data: R performs operations with everything in memory, whereas SAS doesn't necessarily. Thus, if your data approaches the limits of your memory, there will be problems in R.
Better documentation: R is getting better at this, but its documentation, especially the official documentation, is often kind of terrible and opaque.

10 Usage of SAS and other Analytics S/W.

11 Why use SAS? SAS is very efficient at data manipulation if you know what you're doing. It was designed to work with sequential tapes, so it is built on the assumption that data access is expensive. It works wonders when you work with truly massive data sets. SAS is good at opening gigantic data sets even on computers that do not have a lot of computing power. Essentially, data sets that would crash most programs on a given computer in a heartbeat can load in SAS. SAS as a company is smart and designs its products for corporate cost centers. This includes things like company-wide installations and setting up its platform in a way that makes it easy for corporate IT departments to set up a company-wide SAS infrastructure.

12 Industry Usage

13 SAS is really pricey! Well, there is a solution to that too! SAS provides a free University Edition that runs on a virtual machine. All the details about the installation are in the documentation.

14 Why the University Edition?

15 DEMO

16 QUESTIONS? THANK YOU

