Chapter 17: Formatting Data 1 STAT 541 ©Spring 2012 Imelda Go, John Grego, Jennifer Lasecki and the University of South Carolina.


2 Creating a Format with Overlapping Values
    VALUE format-name (MULTILABEL);
The MULTILABEL option allows the assignment of multiple labels (external values) to internal values.
Example of a VALUE statement assigning multiple labels to a single internal value:
    value one (multilabel)
        1='ONE in English'
        1='UNO in Spanish';
(Multiple labels can also be assigned to a single range of internal values.)

3 Creating a Format with Overlapping Values (continued)
Example of assigning multiple labels to overlapping ranges of internal values:
    value age (multilabel)
        15-29='below 30'
        15-19='15 to 19'
        20-29='20 to 29';

4 Creating a Format with Overlapping Values (continued)
Multilabel formatting allows an observation to be included in multiple rows or categories.
To use multilabel formats, specify the MLF option in the CLASS statement of a procedure that supports it (e.g., PROC TABULATE, PROC MEANS, PROC SUMMARY).

5 Creating a Format with Overlapping Values (continued)
    proc format;
        value age (multilabel)
            15-29='below 30'
            15-19='15 to 19'
            20-29='20 to 29';
    data age;
        input age books;
        cards;
    ;
    proc means sum maxdec=0;
        class age / mlf;
        format age age.;
        var books;
    run;

The MEANS Procedure
Analysis Variable : counter
    age           N Obs    Sum
    '15 to 19'        1     13
    '20 to 29'        2     35
    'below 30'

6 Creating Custom Formats Using the Picture Statement
PICTURE statements can be used to create a template for printing numbers.
    PICTURE format-name value-range='picture';
Value-range is the individual value or range of values to be labeled.
Picture specifies a template for formatting values of numeric variables. The template is a sequence of at most 40 characters enclosed in quotation marks.

7 Creating Custom Formats Using the Picture Statement (continued)
There are three types of characters in pictures:
1. Digit selectors
2. Message characters
3. Directives

8 Digit Selectors in the Picture Statement (continued)
Digit selectors:
– are numeric characters (0 through 9).
– define positions for numeric values.
Nonzero digit selectors add leading zeros to the formatted value as needed.
Zero digit selectors do not add any leading zeros to the formatted value.

9 Digit Selectors in the Picture Statement (continued)
Example for Digit Selectors
    Picture Definition            Data Values    Formatted Values
    picture month 1-12='99';
    picture month 1-12='00';
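The difference between the two kinds of digit selectors can be sketched with a short step that applies both pictures to the same values. The format names monpad and monraw and the test values are made up for illustration (two names are used so that both definitions can exist at once); they are not from the slides.

```sas
proc format;
   /* nonzero digit selectors: pad with leading zeros */
   picture monpad 1-12='99';
   /* zero digit selectors: no leading zeros are added */
   picture monraw 1-12='00';
run;

data _null_;
   do m = 1, 9, 12;
      /* write the raw value and both formatted versions to the log */
      put m= m monpad. m monraw.;
   end;
run;
```

Under these definitions a value of 1 should print as 01 with monpad and as 1 with monraw, while 12 prints as 12 with both.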

10 Message Characters in the Picture Statement (continued)
Message characters:
– are nonnumeric characters that print as specified in the picture.
– are inserted into the picture after the numeric digits are formatted.
– must come after digit selectors in picture definitions.

11 Message Characters in the Picture Statement (continued)
Example for Message Characters
    Picture Definition                                                            Data Values    Formatted Values
    picture millA low-high = '009.9M' (mult=.00001);                                             M
    picture millB low-high = '009.9M' (prefix='$' mult=.00001);                                  $1.4M
    picture millC (round) low-high = '009.9M' (prefix='$' mult=.00001);                          $1.5M
M is the message character in the examples above.
The multiplier (MULT=) is a number that the value is multiplied by before formatting.
The PREFIX= option places text in front of the digits.
The ROUND option rounds the multiplied value to the nearest integer before formatting.
Without the ROUND option, the format multiplies the value by the multiplier, truncates the decimal portion (if any), and prints the result according to the picture definition.
With the ROUND option, the format multiplies the value by the multiplier, rounds that result to the nearest integer, and then formats the value according to the picture definition. A value of .5 rounds up to the next integer.
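A minimal sketch of the three pictures above applied to one number follows. The variable name sales and the value 1460000 are assumptions chosen for illustration; the slides' own data values are not available.

```sas
proc format;
   picture milla low-high='009.9M' (mult=.00001);
   picture millb low-high='009.9M' (prefix='$' mult=.00001);
   picture millc (round) low-high='009.9M' (prefix='$' mult=.00001);
run;

data _null_;
   sales = 1460000;   /* hypothetical value, not from the slides */
   /* 1460000 * .00001 = 14.6; without ROUND the .6 is truncated */
   put sales milla. / sales millb. / sales millc.;
run;
```

The multiplier is .00001 rather than .000001 because the decimal point in the picture is positional: the integer 14 fills the digit selectors and is displayed as 1.4. With this value the log should show 1.4M, $1.4M, and $1.5M, in line with the formatted values on the slide.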

12 Directives in the Picture Statement (continued)
Directives:
– are special characters that can be used in the picture to format date, time, or datetime values.
– require the DATATYPE= option in the PICTURE statement. The option specifies that the picture applies to a SAS date, SAS time, or SAS datetime value. The option value is DATE, TIME, or DATETIME.

13 Directives in the Picture Statement (continued)
Example for Directives
    proc format lib=form541;
        picture dt low-high = 'TIME STAMP: %A %B %d, %Y.' (datatype=date);
        picture tm low-high = '%I:%M.%S%p' (datatype=time);
    data _null_;
        file print;
        now = today();
        tm = time();
        put now dt40. tm tm.;
    run;
Output:
    TIME STAMP: Wednesday January 18, :7.55PM
Directives used:
    %A = full weekday name
    %B = full month name
    %d = day of the month with no leading zero
    %Y = year with century
    %I = 12-hr clock time with no leading zero
    %M = minute as a decimal number 0-59 with no leading zero
    %S = second as a number 0-59 with no leading zero
    %p = AM or PM
dt40. displays the value of the variable now using up to 40 characters.

14 Managing Custom Formats: Using FMTLIB with PROC FORMAT to Document Formats
Adding the keyword FMTLIB to the PROC FORMAT statement displays a list of all the formats in the specified catalog, along with descriptions of their values.
The SELECT and EXCLUDE statements allow you to process specific formats instead of an entire catalog.

15 Managing Custom Formats: Using FMTLIB with PROC FORMAT to Document Formats (continued)
Example:
    libname form541 'f:\STAT 541\sas formats';
    proc format lib=form541 fmtlib;
        select dt tm;
        *exclude dt;
    run;

16 Managing Custom Formats: Using FMTLIB with PROC FORMAT to Document Formats (continued) Example of format listings from a specified catalog

17 Managing Custom Formats: Using PROC CATALOG to Manage Formats
Formats are saved as catalog entries. Therefore, PROC CATALOG can be used to manage the formats.
PROC CATALOG can:
1. Create a listing of catalog contents
2. Copy a catalog or selected entries within a catalog
3. Delete or rename entries within a catalog

18 Managing Custom Formats: Using PROC CATALOG to Manage Formats (continued)
Example:
    proc catalog catalog=form541.formats;
        copy out=work.formats;
        select dt.format;
    run;
    proc catalog cat=work.formats;
        contents;
    run;
Use the full catalog entry name DT.FORMAT for the format DT in the SELECT statement. The format DT is copied from the form541.formats catalog to the work.formats catalog. The CONTENTS statement displays the contents of the work.formats catalog.

19 Using Custom Formats
SAS statements in a DATA step can permanently assign a format to a variable.
A format can be temporarily specified for a variable in a PROC step.
PROC DATASETS can be used to assign, change, or remove the format associated with a variable in a SAS data set.
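The contrast between a permanent and a temporary assignment can be sketched as follows. The airport codes, labels, and data lines are invented for illustration; only the variable names dest and baggage and the format name $dest echo the slides.

```sas
proc format;
   /* hypothetical codes and labels */
   value $dest 'LHR'='London' 'CDG'='Paris';
run;

data work.flights;
   input dest $ baggage;
   /* FORMAT in a DATA step is stored with the data set (permanent) */
   format dest $dest.;
   datalines;
LHR 12
CDG 30
;
run;

proc print data=work.flights;
   /* FORMAT in a PROC step applies to this step only (temporary) */
   format baggage 8.1;
run;
```

A later PROC PRINT without a FORMAT statement would still show dest with $dest. but would show baggage with its default format.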

20 Using Custom Formats (continued)
Example:
    proc datasets lib=Mylib;
        modify flights;
        format dest $dest.;
        format baggage;
    quit;
Mylib is the name of the SAS library that contains the data set to be modified. Flights is the name of the SAS data set to be modified. The format $dest. is associated with the variable dest. Because no format is named after baggage in the second FORMAT statement, any format associated with baggage is removed.

21 Using a Permanent Storage Location for Formats
When a format is permanently associated with a variable, it is important to know where the format is located and to reference it whenever the variable is being used.
The location of the format is determined when the format is created in PROC FORMAT.

22 Using a Permanent Storage Location for Formats (continued)
Formats can be stored anywhere. However, SAS must be told which format catalogs to search before the formats can be accessed.
When a format is referenced, SAS automatically searches the following catalogs in this order:
– Work.formats
– Library.formats
The libref Library is recommended for formats because it is automatically searched when a format is referenced. Use LIB=Library in the PROC FORMAT step that creates the format, and assign the libref Library with a LIBNAME statement in any program that needs to reference the format.

23 Using a Permanent Storage Location for Formats (continued)
When other libraries or catalogs need to be searched, use the FMTSEARCH= system option to indicate where to search for formats.
    OPTIONS FMTSEARCH = (catalog-1 catalog-2 … catalog-n);
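As a small sketch of the option in use, the statements below add the form541 library from the earlier slides to the search path. It assumes the dt picture format has already been stored in form541.formats, as in the directives example.

```sas
libname form541 'f:\STAT 541\sas formats';

/* A libref alone means its FORMATS catalog, here form541.formats. */
/* Work.formats and Library.formats are still searched first.      */
options fmtsearch=(form541);

data _null_;
   now = today();
   put now dt40.;   /* dt is resolved through the FMTSEARCH= list */
run;
```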

24 Substituting Formats to Avoid Errors
If SAS fails to locate the format you need, it issues an error message and stops processing the step. This is the default behavior, set by the FMTERR system option.
To prevent this, use the NOFMTERR option: SAS substitutes a default format (w. or $w.) for the missing format and continues processing.
    OPTIONS FMTERR | NOFMTERR;
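A typical situation is printing a data set whose variables carry a custom format that is not available in the current session. The sketch below assumes the Mylib.flights data set from the PROC DATASETS example, with the libref Mylib already assigned and the $dest. format not found in any searched catalog.

```sas
/* let steps continue when a referenced format cannot be found */
options nofmterr;

proc print data=mylib.flights;   /* dest is shown with a default $w. format */
run;

/* restore the default behavior afterwards */
options fmterr;
```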

25 Creating Formats from SAS Data Sets
PROC FORMAT's CNTLIN= option is used to read the input control data set and create the format.
The input control data set must have a specific structure, with all the information needed to create the format.

26 Input Control Data Sets (CNTLIN=)
Values of the TYPE variable:
    C for character format
    N for numeric format
    I for numeric informat
    J for character informat
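A minimal input control data set can be sketched as below. It rebuilds a gender format like the $gender format used later in these slides; the data set name work.ctrl is an assumption, and only the commonly required variables FMTNAME, START, LABEL, and TYPE are supplied.

```sas
data work.ctrl;
   length fmtname $8 start $1 label $6 type $1;
   fmtname = 'gender';
   type    = 'C';                     /* C = character format */
   start = 'M'; label = 'male';   output;
   start = 'F'; label = 'female'; output;
run;

/* read the control data set and create the format $gender in work.formats */
proc format cntlin=work.ctrl;
run;
```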

27 Creating SAS Data Sets from Custom Formats
Use PROC FORMAT's CNTLOUT= option to create a SAS data set (a.k.a. output control data set).
The output data set will contain variables that completely describe all aspects of each format, including optional settings.

28 Creating SAS Data Sets from Custom Formats (continued)
Output control data sets are useful when you need to modify a format but no longer have the specifications for the format in a SAS program or in the form of an input control data set.
1. Use the CNTLOUT= option to obtain the output control data set associated with a format.
2. Edit the data set so that it is suitable for use with the CNTLIN= option.
3. Re-create the format from the updated data set using the CNTLIN= option.
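The three steps above can be sketched as one round trip. The extra category ('X' labeled 'other') and the data set name work.newform are hypothetical; the label is kept within six characters so that it fits the LABEL variable written by CNTLOUT= for this format.

```sas
proc format;
   value $gender 'M'='male' 'F'='female';
run;

/* 1. write the definition of $gender to an output control data set */
proc format cntlout=work.outform;
   select $gender;
run;

/* 2. edit the data set: copy every row and append one new category */
data work.newform;
   set work.outform end=last;
   output;
   if last then do;
      start = 'X'; end = 'X'; label = 'other';   /* hypothetical addition */
      output;
   end;
run;

/* 3. re-create the format from the edited data set */
proc format cntlin=work.newform;
run;
```

The appended row reuses the remaining variables (TYPE, MIN, MAX, and so on) from the last row read, which is acceptable here because they describe the format as a whole rather than a single range.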

29 Creating SAS Data Sets from Custom Formats (continued)
    proc format cntlout=outform;
        value $gender 'M'='male'
                      'F'='female';
    run;

30 Creating SAS Data Sets from Custom Formats (continued)
This is data set outform. Its variables, in order, are FMTNAME, START, END, LABEL, MIN, MAX, DEFAULT, LENGTH, FUZZ, PREFIX, MULT, FILL, NOEDIT, TYPE, SEXCL, EEXCL, and HLO. The character values shown in the listing are:
    Obs  FMTNAME  START  END  LABEL   TYPE  SEXCL  EEXCL
      1  GENDER   F      F    female  C     N      N
      2  GENDER   M      M    male    C     N

31 Creating SAS Data Sets from Custom Formats (continued)
This is the PROC CONTENTS output for data set OUTFORM.
    Data Set Name: WORK.OUTFORM
    Member Type: DATA
    Engine: V612
    Created: 22:16 Tue, Jul 13, 2011
    Last Modified: 22:16 Tue, Jul 13, 2011
    Protection:
    Data Set Type:
    Label:
    Observations: 2
    Variables: 17
    Indexes: 0
    Observation Length: 63
    Deleted Observations: 0
    Compressed: NO
    Sorted: NO
-----Engine/Host Dependent Information-----
    Data Set Page Size: 8192
    Number of Data Set Pages: 1
    File Format: 607
    First Data Page: 1
    Max Obs per Page: 129
    Obs in First Data Page: 2

32 Creating SAS Data Sets from Custom Formats (continued)
This is the rest of the PROC CONTENTS output for data set OUTFORM.
-----Alphabetic List of Variables and Attributes-----
     #  Variable  Type  Len  Pos  Label
     7  DEFAULT   Num     3   22  Default length
    16  EEXCL     Char    1   52  End exclusion
     3  END       Char    1    9  Ending value for format
    12  FILL      Char    1   46  Fill character
     1  FMTNAME   Char    8    0  Format name
     9  FUZZ      Num     8   28  Fuzz value
    17  HLO       Char            Additional information
     4  LABEL     Char    6   10  Format value label
     8  LENGTH    Num     3   25  Format length
     6  MAX       Num     3   19  Maximum length
     5  MIN       Num     3   16  Minimum length
    11  MULT      Num     8   38  Multiplier
    13  NOEDIT    Num     3   47  Is picture string noedit?
    10  PREFIX    Char    2   36  Prefix characters
    15  SEXCL     Char    1   51  Start exclusion
     2  START     Char    1    8  Starting value for format
    14  TYPE      Char    1   50  Type of format