Graphics in EG and R HRP223 – 2009 November, 2010 Copyright © 1999-2010 Leland Stanford Junior University. All rights reserved.

Presentation transcript:

1 Graphics in EG and R HRP223 – 2009 November, 2010 Copyright © Leland Stanford Junior University. All rights reserved. Warning: This presentation is protected by copyright law and international treaties. Unauthorized reproduction of this presentation, or any portion of it, may result in severe civil and criminal penalties and will be prosecuted to the maximum extent possible under the law.

2 Robbins Creating More Effective Graphs by Naomi Robbins is a wonderful book showing the right and wrong ways to visualize scientific data. Read it when you have an afternoon off. It is an ideal read on a transcontinental flight.

3 Why Do Data Visualization? Well designed pictures will show you the details and the whole pattern in your data. Numeric descriptions can easily hide important patterns. Some patterns are hard to detect in tables. – Whenever data is reported over time or locations, you need art. YOU CAN LEARN A LOT BY JUST LOOKING. -Yogi Berra

4 Fisher’s Plot Data Reported in Cleveland. Based on code written by Robert Allison at SAS Institute. (Figure panels: Year 1, Year 2)

5 Scatter Plot for Correlations All have r² = .67. Anscombe 1973, Graphs in Statistical Analysis
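
If you want to check the r² = .67 claim yourself, here is a minimal sketch that types in Anscombe's published values and runs the correlations. The dataset name anscombe and the variable names quartet, xval and yval are my own choices for illustration, not anything from the original slide.

```sas
/* Anscombe's (1973) four data sets; sets I-III share the same x values. */
data anscombe;
  length quartet $ 3;
  input x y1 y2 y3 x4 y4;
  quartet = "I";   xval = x;  yval = y1; output;
  quartet = "II";  xval = x;  yval = y2; output;
  quartet = "III"; xval = x;  yval = y3; output;
  quartet = "IV";  xval = x4; yval = y4; output;
  keep quartet xval yval;
  datalines;
10 8.04 9.14 7.46 8 6.58
8 6.95 8.14 6.77 8 5.76
13 7.58 8.74 12.74 8 7.71
9 8.81 8.77 7.11 8 8.84
11 8.33 9.26 7.81 8 8.47
14 9.96 8.10 8.84 8 7.04
6 7.24 6.13 6.08 8 5.25
4 4.26 3.10 5.39 19 12.50
12 10.84 9.13 8.15 8 5.56
7 4.82 7.26 6.42 8 7.91
5 5.68 4.74 5.73 8 6.89
;
run;

/* BY-group processing needs the data sorted by the BY variable. */
proc sort data = anscombe;
  by quartet;
run;

/* Pearson r is about 0.816 in each set, so r-squared is about 0.67. */
proc corr data = anscombe;
  by quartet;
  var xval yval;
run;

/* One panel per set, each with its least squares line. */
proc sgpanel data = anscombe;
  panelby quartet / columns = 2;
  reg x = xval y = yval;
run;
```

The numbers agree across the four sets; only the pictures reveal how different they are.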

6 Bad Things First, I want to talk about bad graphics that I frequently see. – 3d – Pie – Donuts – Stacked graphics

7 General 3D graphics – Don’t, Don’t, Don’t While the SAS implementation of 3D graphics is relatively good, don’t use 3D effects, unless you are measuring something in 3D. Even then, don’t.

8 Tufte is a God to many. The empiricist in me is very nervous about the amount of pontificating in his books… – I want to have evidence-based advice. His best advice is to put no extra ink on the page. – Think about the ink-to-information ratio. – Remove all chart junk. Note: the irony of the chart junk on this slide….

9 Example Bar Chart Serum Samples in Each Trimester You can remove ink rather than adding.

10 Ink-to-Information Ratio How much ink for seven numbers? Based on Soukup & Davidson, 2002 Visual Data Mining

11 Cleveland If you want to know how to do scientific visualization, you must read William Cleveland’s work. – He attempted to quantify what makes a good graphic good. His early work on graphics is one of the reasons why R/S-plus is taking over the statistical world.

12 Pie is bad. Work by Cleveland (and experimental psychologists) suggests that: – people are bad at judging the relative magnitude of angles – if you twist the rotation of the pie you can cause people to systematically misjudge the size of the angles – a 3rd dimension makes judgment worse. If you get a glossy handout with a 3D pie, assume someone is lying to you. Don’t use them.

13 Don’t Explode! This exploded 3D pie (brought to you by Excel) is nearly useless for judging amounts.

14 Forbidden Donut…. Donut plots have the same problems as pies (if not worse) ….

15 Stacking is Bad Cleveland also quantified the fact that people are bad at judging the relative height of stacked data.

16 Wow, a cinnamon roll plot! Good luck making rapid judgments using this stacked 3D pie.

17 What is a good graphic? Don’t make your audience think unnecessarily! Minimize the amount of ink on the page. – This needs to be studied. Show the central tendency and the variability. Plot the quantity (inference) that you want people to notice. Be sure colorblind people can understand it. – Use a black and white photocopier and make sure you can distinguish all groups.

18 Avoid Thinking Put labels on the graphic directly instead of using a key. If you want people to compare the difference between two lines, plot the difference, not the two lines.

19 Bivariate Comparisons with Lines People are extremely bad at judging the distance between two curves. Never ask people to judge up and down (vertical) distances between curves. The distance between the two curves is the same at all points. Based on: Robbins Creating More Effective Graphs, 2005.

20 Plot Types Univariate (one variable) – Categorical variables: bar charts, dot plots, waffle plots – Continuous variables: histogram, box plot, violin plots

21 Bar Charts The ink-to-information ratio is lousy. A one dimensional quantity is being “expanded” into two dimensions. – Doubling of the amount corresponds to how much of an increase in area?

22 SAS Bar Charts SAS makes the reader do extra work by rotating the axis labels in ActiveX images. They pointlessly include variable labels by default.

23 How to do it? Notice you can Edit the data and apply filters. You can right click on variables and apply user-defined formats off the Properties dialog.

24 First create the format. In the Data windowpane of the Bar Chart GUI, right click on the variable and change the format to the User Defined format you had created.

25 The GUI is Solid My only complaints are that the rotate grouping values text does not work (position in this example) and the summary statistics do not show up when you request ActiveX images.

26 Saving the Graphic for Publication The easiest way to get publication quality graphics is to set the output type to be RTF.

27 .PNG format. ActiveX image format.

28 Default Output and Graphics The default graphic format in EG is ActiveX. These images can be edited (even on the web) but they only display with Internet Explorer. I have set my graphics to display as ActiveX images. Tweak this with Tools> Options… > Graph.

29 Types of Images The default formats of the images are determined by the ODS destinations you are using: – LISTING: png visible in the Windows Image Fax Viewer – HTML: png, gif, jpg contained in web pages and visible in Internet Explorer, Firefox or Opera – LATEX: PostScript, epsi, gif, jpeg, png are visible in GhostView – PCL or PS: contained in a PostScript file and visible in GhostView – PDF: contained in pdf, which is visible with Adobe Reader – RTF: visible in MS Word. RTF graphics are done at 300 dpi by default.

30 I Typically Use HTML This says the images should show tooltips with extra statistical details when you hover the mouse over parts of the graphic. (I can’t image these.) This is the appearance template. For optimal results use: – Analysis: color – Default: overdistinguishes symbols for color or B&W – Journal or journal2, etc.: black and white – Statistical or statistical2, etc.: color. Include image_dpi = 300 to set the resolution higher than the default 100 dots per inch. Try 300 for final images that you will paste into MS Office.
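
If you are writing code instead of using the EG options dialog, a rough equivalent looks like the sketch below. The file path is made up (change it to a folder that exists), and I am assuming you run this in plain SAS, where you open and close the HTML destination yourself.

```sas
/* Hypothetical output path: edit before running. */
ods html file = "c:\blah\report.html" style = journal image_dpi = 300;

/* imagemap = on asks for the hover tooltips in HTML output. */
ods graphics on / imagemap = on;

proc sgplot data = sashelp.class;
  scatter x = height y = weight;
run;

ods graphics off;
ods html close;
```

Swap journal for statistical or analysis to see how much the style template changes the same plot.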

31 What is ODS? The Output Delivery System (ODS) controls the type and appearance (aka the style) of SAS output. Different appearance templates Different output destinations/types.

32 You can browse the ODS appearance templates from the Style Manager on the Tools menu.

33 ods graphics on; This turns on the ODS statistical graphics. Behind the scenes this combines your data with a pre-specified description of what to plot and the aesthetics of the appearance. Your data + Graph template (What? Where?) + Style template (Colors, Fonts)

34 Useful ods graphics options – ods graphics on / followed by options: width = 8in height = 11in (if you set only width or height, it will use a 4:3 aspect ratio); imagefmt = jpg; imagename = thingy (make a series of graphics called thingy1, thingy2, etc.); imagefmt = staticmap (use pop-up tooltips with details) – ods graphics / reset; Reset the graphic counter back to 1. – ods graphics off; Use this if you want to disable ods graphics for a procedure.
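
Here is a small sketch that strings those options together. I picked png and a 6 inch width just for illustration; sashelp.class ships with SAS.

```sas
/* Start from the defaults, then set size, file type and base file name. */
ods graphics on / reset width = 6in imagefmt = png imagename = "thingy";

/* Each plot drawn from here on gets the next name in the thingy series. */
proc sgplot data = sashelp.class;
  histogram height;
run;

proc sgplot data = sashelp.class;
  histogram weight;
run;

/* Restart the image counter without touching the other options. */
ods graphics / reset = index;

/* Switch ODS graphics off for whatever runs next. */
ods graphics off;
```

Only the width is set here, so the height follows from the default aspect ratio.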

35

36

37 ODS SGraphics Compared to the competition, for the last 10 years SAS graphics have been between poor and pathetic. – Graphics procedures rendered with okay quality, at best. – No “what you see is what you get” editing. – Many plots were nearly impossible to render. – Custom graphics required extensive programming. SAS 9.x has attempted to solve this problem.

38 Old vs. New Procedures The old (commonly used) graphics procedures were gchart, gplot. Now most analysis procedures have built in high quality graphics that can be invoked with an ODS graphics on statement. – Early on in the class I told you to tweak the EG options to include “ODS graphics on” with every run. There are also new “easy to use” statistical graphics (sg) procedures.

39 New Graphics Statistical Graphics Procs proc sgPlot – general plotting procedure that replaces gplot proc sgScatter – lots of tools for scatterplots and scatter matrices proc sgPanel – quick and easy trellis/lattice/matrix/panel of plots Proc sgRender – used with proc template to make totally custom plots – It replaces proc greplay
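
To give a feel for the first three procedures, here is a minimal sketch using sashelp.class. The variable choices are mine; sgrender needs a template, so it is not shown here.

```sas
/* sgplot: one plot, here a grouped scatter plot. */
proc sgplot data = sashelp.class;
  scatter x = height y = weight / group = sex;
run;

/* sgscatter: a scatter plot matrix of several variables at once. */
proc sgscatter data = sashelp.class;
  matrix age height weight / group = sex;
run;

/* sgpanel: the same kind of plot repeated in one panel per subgroup. */
proc sgpanel data = sashelp.class;
  panelby sex;
  scatter x = height y = weight;
run;
```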

40 Plot Types Univariate (one variable) – Categorical variables: bar charts, dot plots, waffle plots – Continuous variables: histogram, box plot, violin plots, quantile and QQ plots

41 You can get an okay looking graphic using sgpanel. Categorical variables

42 I was able to get exactly the graphic I wanted using R. Categorical variables

43 If you want to use R Download R for Mac or PC: cran.cnr.berkeley.edu/bin/macosx/ cran.cnr.berkeley.edu/bin/windows/base

44 If you use a PC, also get PERL and Tinn-R PERL is a text manipulation language that is used by a couple of key R packages. It ships with Mac OS X. PC users can get ActivePerl (what I use) or Strawberry Perl for Windows. Tinn-R is a text editor that knows the R language. sourceforge.net/projects/tinn-r/

45 R Help R help files are user hostile. To learn about the options for dotchart type: ?dotchart Use: rseek.org

46 Browse To see why people use R for graphics look here: addictedtor.free.fr/graphiques/thumbs.php

47 Additional Libraries If you see sample code that includes require() or library(), you will need to do a one-time download of the additional package. If you are using Vista, run R as the administrator (by right clicking on the R icon instead of just double clicking) to install and update packages.

48 Waffle Plots (aka pixel plots) I have not found software to do them. Image from: Visual language for Designers by Connie Malamed Categorical variables

49 Continuous Outcomes The Distribution Analysis menu option can do basic plots. Continuous variables

50 The resolution of the histogram is okay but the others are unacceptable.

51 Use sgplot for high resolution plots. Continuous variables

52 Continuous variables

53 Violin A violin plot mirrors the shape of the histogram (density). They can be done in R. Continuous variables

54 Grouped Categorical Variables To graph categorical data in SAS you need to get Michael Friendly’s Visualizing Categorical Data. Unfortunately, his macros are copyrighted with the book… So I will show you the R versions. – Fourfold plots – Mosaic plots – Association plots Grouped categorical variables

55 Fourfold Plots They draw 4 slices of pie with the area corresponding to the number of people in each cell of a 2x2 table. They have confidence bands: if the confidence bounds overlap on adjacent pie pieces, the pieces are not statistically significantly different. Grouped categorical variables 45% male vs. 30% female admission

56 More males were admitted than females. There is clear evidence of sexist policies in admissions! Grouped categorical variables

57 Department A admitted more females than males and every other department had no bias! The joy of Simpson’s paradox. Grouped categorical variables
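
The 45% vs. 30% figures match the 1973 Berkeley graduate admissions counts that ship with R as UCBAdmissions. Assuming that is the data behind these fourfold plots (the slides do not say), this sketch reproduces the reversal with proc freq. The dataset name admissions is made up.

```sas
/* Counts by department, sex and admission decision (R's UCBAdmissions). */
data admissions;
  input dept $ sex $ admit $ count;
  datalines;
A Male Yes 512
A Male No 313
A Female Yes 89
A Female No 19
B Male Yes 353
B Male No 207
B Female Yes 17
B Female No 8
C Male Yes 120
C Male No 205
C Female Yes 202
C Female No 391
D Male Yes 138
D Male No 279
D Female Yes 131
D Female No 244
E Male Yes 53
E Male No 138
E Female Yes 94
E Female No 299
F Male Yes 22
F Male No 351
F Female Yes 24
F Female No 317
;
run;

proc freq data = admissions;
  weight count;
  /* Pooled over departments: the row percents show the overall gap. */
  tables sex * admit / nocol nopercent;
  /* One 2x2 table per department: the gap shrinks or reverses. */
  tables dept * sex * admit / nocol nopercent;
run;
```

Read the row percents: pooled, men look favored; within departments the picture changes.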

58 Mosaic Plots So you have a contingency table and you want to know if there is an association. You do a chi-square test and it says there are associations between the rows and columns. What next? Grouped categorical variables

59 Some basic voodoo in R shows which combinations are over (in blue) or under represented (in red). Grouped categorical variables

60 I prefer the simpler association plots. Grouped categorical variables

61 Grouped Continuous Variables You can use the Distribution Analysis to get basic grouped plots. For better looking plots you need to write sgplot and/or sgpanel code. Grouped continuous variables

62 Request distinct graphics by subgroups. Grouped continuous variables

63 Grouped continuous variables

64 Actually this took a bit of voodoo. Grouped continuous variables

65 1st 2nd Grouped continuous variables

66 Double click here. Put details on the histogram tweaks here. I use/tweak nrow ncol and endpoints often. endpoints = 2 to 10 by 0.5 midpoints = Grouped continuous variables
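
For reference, the same tweaks written as code might look like this sketch. I am assuming the histogram comes from proc univariate (which is where the nrows, ncols and endpoints options live) and I picked the cholesterol variable from sashelp.heart with endpoints wide enough to cover it.

```sas
/* One histogram per sex, stacked in 2 rows and 1 column, with fixed bin edges. */
proc univariate data = sashelp.heart noprint;
  class sex;
  var cholesterol;
  histogram cholesterol / nrows = 2 ncols = 1
                          endpoints = 50 to 600 by 50;
run;
```

Stacking the panels in one column makes it easy to compare the two distributions against a common axis.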

67 Grouped continuous variables

68 Grouped continuous variables

69 Side by Side Violin Plots Grouped continuous variables

70 Scatter Plot Grouped continuous variables

71 Jittered Plot

72 Jitter vs. Sunflowers In R you can also do sunflower plots. Grouped continuous variables

73 Ordinary Least Squares Regression People typically plot a regression line to show a relationship between two continuous variables. Grouped continuous variables

74 Bisquare Figure out what is an odd value and then put a weight on it to devalue it. There are many robust regression algorithms around. R and S-Plus software have them well implemented. Grouped continuous variables

75 Loess and Splines Loess is a technique that essentially creates a rolling window and gets a weighted average across the values visible inside the window. Splines are curved lines that allow different amounts of stiffness to the curves. Grouped continuous variables
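
Here is a minimal sketch that overlays a straight regression line, a loess curve and a penalized B-spline on one scatter plot. The data and the smooth = 0.5 value are arbitrary choices for illustration.

```sas
proc sgplot data = sashelp.class;
  /* Draw the points once, then suppress them on the fitted layers. */
  scatter x = height y = weight;
  reg x = height y = weight / nomarkers;
  loess x = height y = weight / nomarkers smooth = 0.5;
  pbspline x = height y = weight / nomarkers;
run;
```

Try a few smooth values to see how the loess curve goes from wiggly to nearly straight.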

76 Smooth = 25 Smooth = 50 Smooth = 99

77 Tweaking Specialized Plots Most analysis procedures now have customized high resolution graphics. Most are automatically produced if you type ods graphics on. Proc Freq – I wanted a deviation plot for a 2x2 (or really any sized table) showing which cell is driving a significant chi-square. They only give you a plot for a one-way table. – The ORPlot is very nice. Grouped continuous variables

78 Specifying the plot name is optional in proc freq. Turn on editable graphics with ods listing sge= on.
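
A minimal sketch of plot requests in proc freq, using sashelp.heart. The variable choices are mine.

```sas
ods graphics on;

proc freq data = sashelp.heart;
  /* One-way table: the deviation plot needs the chisq option. */
  tables smoking_status / chisq plots = deviationplot;
  /* Two-way table: ask for only the frequency plot. */
  tables sex * status / plots(only) = freqplot;
run;
```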

79 Deviation Plot

80 ODS Graphics Editor with EG If you want to do extensive tweaking to a graphic, you can use the WYSIWYG ODS Graphics editor. Unfortunately it only works with ODS graphics procedures and you need to rerun the code in SAS to invoke it.

81 Move code from EG to SAS 1. Use the query builder to put your data in a permanent SAS library (not the work library). 2. Right click on the graphic node which is run on data in a permanent library and choose Open… Open Last Submitted Code. 3. Copy the code beginning with the SQL that makes the data. 4. Start SAS and paste the code into the program editor.

82 Move all your code to SAS Because the ODS graphics editor is not in EG (yet), you can export the entire set of code for the project and then rerun it in SAS.

83 ODS Graphics Editor with EG (2) After exporting all your EG project, open the code in SAS and add these lines at the top of the program: ods rtf file = "c:\blah\somefile.rtf"; ods listing sge = on; Then open the graphic of interest.

84

85 WYSIWYG Editing Right click and/or double click to set properties for objects in the plot. The tool is optimized for some of the ODS style templates but you can use custom colors.

86 Right click on things to set properties. – Colors, text details, fonts – Point and click annotation – Symbols, arrows, text, circles

87 WYSIWYG Editing While the Statistical graphics editor is a much needed improvement, it is incomplete. You can only use a few style templates (for setting default colors and such) and you cannot use custom style templates. This means that you cannot do critical tasks like manually set the color for different values in scatter plots.

88 Too Many Graphics If the ods graphics on statement gives you too many graphics, you can specify which graphics you want by including code designed for the procedure. Typically it looks like this: plots(only) = (table names). This design is poorly implemented because you need to know where to put the plots option and what the table names are. Does it go on the proc line (like phreg), the tables line (like proc freq), or some other line? Also, the table names specified with the plots option do not always match the ODS table names.

89 Usually you can use an ODS exclude statement or an ODS select statement to pick the correct things to print. Using the plots(only) = syntax is more efficient.
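
The two approaches side by side, as a sketch with proc ttest. I believe the ODS name of the summary graphic is SummaryPanel, but treat that as an assumption and confirm it with ods trace on your install.

```sas
ods graphics on;

/* Option 1: ask the procedure to draw only the summary graphic. */
proc ttest data = sashelp.class plots(only) = summary;
  class sex;
  var height;
run;

/* Option 2: keep only one ODS object from the next step.       */
/* Note the ODS name (SummaryPanel, assumed) differs from the   */
/* plot request name (summary).                                 */
ods select SummaryPanel;
proc ttest data = sashelp.class;
  class sex;
  var height;
run;
```

The ods select statement applies to the next procedure step only, and it also suppresses the tables.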

90 Proc phreg has a lot of new features but nothing major in the graphics. With phreg, if you specify ods graphics on you do not automatically get any plots. Here I request survival and cumulative hazard plots including the global confidence limits option (cl). Once again the option names are not consistent with the table names.

91 Proc lifetest can show the number at risk but the implementation is weak. It labels the groups with numbers even if the strata are character strings. You have to manually edit them and this affords ample opportunity for mistakes. I don’t see a way to change the censoring symbol in the legend. This shows the number of people at risk after 20, 40 etc days.

92 Splitting a Grid Some procedures produce a grid of plots. You can get access to the individual plots by specifying plots(unpack). Then you can use plots(only)=tableName to get just the right parts. ODS select or exclude statements will not work.
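
A minimal sketch with proc reg, which normally draws its diagnostics as one grid. Here unpack is given as an option on the diagnostics request; the model is an arbitrary one on sashelp.class.

```sas
ods graphics on;

/* Draw each diagnostic plot as its own image instead of one panel. */
proc reg data = sashelp.class plots(only) = diagnostics(unpack);
  model weight = height;
run;
quit;
```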

93 plots(GlobalOptionsGoHere). The global options apply to all graphics in this procedure.

94 Beyond the Basic Univariate plots There are 4 SG procedures that allow you to build up complex univariate plots and do multivariate (trellis/lattice) plots.

95 New Graphics Statistical Graphics Procs proc sgPlot – general plotting procedure that replaces gplot proc sgScatter – lots of tools for scatterplots and scatter matrices proc sgPanel – quick and easy trellis/lattice/matrix/panel of plots Proc sgRender – used with proc template to make totally custom plots – It replaces proc greplay

96 proc sgPlot Basic plots – scatter, series, band, needle Fits curves and generates confidence bounds – loess, regression, penalized b-splines, ellipse Distributions – boxplots, histograms, normal curves, kernel density Categorization – dot plots, bar charts, line charts From Heath SAS/Graph procedures for creating statistical graphics

97 onLineDoc helps (some) onlineDoc for sgplot needs a LOT more hyperlinks and examples. Find these pages: The SGPLOT Procedure: Overview The SGPLOT Procedure: Examples The SGPLOT Procedure: Procedure Syntax

98 As you add more requests to the plot, it resizes and shifts things to make room. It draws them in the order you request them. It reads the requests from the first listed to the bottom. Change the order if you want to have an item appear layered on top of, or behind, another thing. Some colors are not set yet in the enhanced editor. Use the menu Tools>Options>Enhanced Editor… then click User Defined Keywords to add the coloring.

99 How is that made?
proc format library = work;
  value $smoked
    "Non-smoker" = "None "
    " " = "Missing"    /* a blank is the missing value for character data */
    other = "Not none";
run;
data fram;
  set sashelp.heart;
  smokin = put(smoking_Status, $smoked.);
run;

100 How is that made?
proc sgplot data = fram;
  histogram cholesterol;
  density cholesterol / type = kernel;
  density cholesterol / type = normal;
  keylegend / location = inside position = topright across = 1;
run;
Layers of features are added to the graphic in the order listed.

101 How is that made?
proc sgplot data = fram tmplout = "c:\blah\plate.sas";
  histogram cholesterol;
  density cholesterol / type = kernel;
  density cholesterol / type = normal;
  keylegend / location = inside position = topright across = 1;
run;
The statistical graphics language template can be saved and studied.

102 This was saved in plate.sas. Note the name of this template (sgplot).
proc template;
  define statgraph sgplot;
    begingraph;
      layout overlay;
        Histogram Cholesterol / primary=true binaxis=false LegendLabel="Cholesterol";
        DensityPlot Cholesterol / Lineattrs=GraphFit kernel() LegendLabel="Kernel" NAME="DENSITY";
        DensityPlot Cholesterol / Lineattrs=GraphFit2 normal() LegendLabel="Normal" NAME="DENSITY1";
        DiscreteLegend "DENSITY" "DENSITY1" / Location=Inside across=1 halign=right valign=top;
      endlayout;
    endgraph;
  end;
run;
Render a graphic with the template and dataset specified:
proc sgrender data = fram template = sgplot;
run;

103 I want to add in a reference line showing what is normal and put the categories in order.
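
The reference line part can be sketched as below. The cutoff of 200 is just a placeholder value I picked for illustration, not a clinical recommendation from these slides, and this sketch does not deal with reordering the categories.

```sas
proc sgplot data = sashelp.heart;
  /* One horizontal box per smoking category. */
  hbox cholesterol / category = smoking_status;
  /* Vertical line at the placeholder cutoff on the cholesterol axis. */
  refline 200 / axis = x label = "200";
run;
```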

104

105

106 Grids You can produce lattices full of graphics with proc sgpanel.
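
A minimal sketch of a two-way lattice, using two classification variables from sashelp.heart. The variable choices are mine.

```sas
/* Rows and columns of the grid come from the two panelby variables. */
proc sgpanel data = sashelp.heart;
  panelby sex status / layout = lattice;
  histogram cholesterol;
run;
```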

107

108 Spaghetti Plots Data from Singer and Willett:
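
The Singer and Willett data are not reproduced here, so this sketch uses sashelp.stocks as a stand-in: one line per stock instead of one line per person. The idea is the same, a series statement with a group variable.

```sas
/* One line per group: the basic recipe for a spaghetti plot. */
proc sgplot data = sashelp.stocks;
  series x = date y = close / group = stock;
run;
```

With real longitudinal data you would put the subject identifier in group.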

109 Customizing graphics You can tweak the graphics that ship with SAS by modifying their graph template or you can create truly custom graphics by making your own statistical graph template. Your data Graph template Style template

110 If you do not want to explain what Kernel density estimation is… remove the lines.

111 Finding the template Before the procedure that draws the graphic add ods trace on; and include ods trace off; afterwards. This prints the names of all the templates used by the procedure in the log. The names look like: product.procedure.Graphics.TemplateName
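
For example, a minimal sketch that traces a two-sample t-test on sashelp.class:

```sas
ods graphics on;

/* Write the name, label and template of every output object to the log. */
ods trace on;

proc ttest data = sashelp.class;
  class sex;
  var height;
run;

ods trace off;
```

Look in the log for the Template: lines that belong to the graphics.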

112 Looking at a Template You can ask proc template to display the template with the source statement: proc template; source stat.ttest.graphics.summary2; run; Remember to type this before you start editing: ods path(prepend) work.template (update);

113 Don’t Panic This is a complete template except for the proc template statement here and a run statement at the bottom. Copy this into an editor window and add proc template.

114 After adding proc template and commenting out the Kernel statements rerun the code.

115 Oops. Unknown key words… You can fix the color coding on the template code easily.

116 Fixed (permanently) All your subsequent plots will have no density line.

117 Details on that new template. You can ask SAS to list, into the log, all the locations where the graphics templates are stored by using the command ods path show; Your new template is stored here. The untouchable original is here but it is “masked” by the first one.

118 Want a temporary template? You can request that your templates go into work instead of SASUSER with the command: ods path (prepend) work.template (update); When you quit SAS the template will be deleted along with everything else in work.

119 Note the dynamic variables Dynamic variables allow the same template to work with lots of datasets.
proc template;
  define statgraph Stat.Ttest.Graphics.Summary2;
    notes "Comparative histograms with normal/kernel densities and boxplots, (two-sample)";
    dynamic _Y1 _Y2 _Y _VARNAME _XLAB _SHORTXLAB _CLASS1 _CLASS2 _CLASSNAME _LOGNORMAL _OBSVAR;
    BeginGraph;
      entrytitle "Distribution of " _VARNAME;
      layout lattice / rows=3 columns=1 columndatarange=unionall rowweights=(.4 .4 .2) shrinkfonts=true;
        columnaxes;
          columnaxis / display=(ticks tickvalues label) label=_XLAB shortlabel=_SHORTXLAB griddisplay=auto_on;
        endcolumnaxes;
        layout overlay / xaxisopts=(display=none);
          histogram _Y1 / binaxis=false primary=true;
          if ((NOT EXISTS(_LOGNORMAL)) AND (NOT(EXISTS(_PAIRED) AND EXISTS(_RATIO))))
            densityplot _Y1 / normal () name="Normal" legendlabel="Normal" lineattrs= GRAPHFIT;
          endif;
          *densityplot _Y1 / kernel () name="Kernel" legendlabel="Kernel" lineattrs=GRAPHFIT2;

120 dynamic You can see what things/variables are being passed to a template by a procedure by printing it in a title. The eval(exists()) part resolves to 1 or 0 depending on if the variable is used.
proc template;
  define statgraph Stat.Ttest.Graphics.Summary2;
    notes "Comparative histograms with normal/kernel densities and boxplots, (two-sample)";
    dynamic _Y1 _Y2 _Y _VARNAME _XLAB _SHORTXLAB _CLASS1 _CLASS2 _CLASSNAME _LOGNORMAL _OBSVAR;
    BeginGraph;
      entrytitle "Does _Y1 exist? " eval(exists(_Y1)) " It is the value: " _Y1;
      entrytitle2 "Does _VARNAME exist? " eval(exists(_VARNAME)) " It is the value: " _VARNAME;
      *entrytitle "Distribution of " _VARNAME;

121 entrytitle "Does _Y1 exist? " eval(exists(_Y1)) " It is the value: " _Y1;

122 Setting dynamic Variables You can set the values of dynamic variables when you call them: proc sgrender data = blah template= thing; dynamic _var1Label= 'Dude'; run;

123 SGPlot vs Template You can replicate everything done with proc sgplot using the template language but don’t reinvent the wheel if you don’t need to. You will want to use proc template to build custom graphics that use many panels. Proc sgplot uses statements that start like reg but template uses names like regressionplot. – Similar but not identical names… boo.
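
To see the naming difference, here is the same scatter plot with a regression line written both ways. This is a sketch: the template name regdemo is made up, and the template lands wherever your ods path currently points.

```sas
/* proc sgplot version: statements are scatter and reg. */
proc sgplot data = sashelp.class;
  scatter x = height y = weight;
  reg x = height y = weight / nomarkers;
run;

/* Template version: the statements are scatterplot and regressionplot. */
proc template;
  define statgraph regdemo;   /* hypothetical template name */
    begingraph;
      layout overlay;
        scatterplot x = height y = weight;
        regressionplot x = height y = weight;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data = sashelp.class template = regdemo;
run;
```

For a single-cell plot like this one, sgplot is clearly less typing; the template pays off once you need several panels.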

124

125

126
layout gridded = ticks do not have to align
layout lattice = ticks must align
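A minimal sketch of a lattice layout, where the two panels share one column axis so their ticks line up. The template name latticeDemo and the choice of sashelp.class are assumptions for illustration; the options mirror those used in the Summary2 template shown earlier.

```sas
/* Hypothetical two-row lattice: histogram on top, boxplot below,
   both drawn against the same shared column (X) axis. */
proc template;
  define statgraph latticeDemo;
    begingraph;
      entrytitle "Distribution of Height";
      layout lattice / rows = 2 columns = 1
                       columndatarange = unionall
                       rowweights = (.7 .3);
        columnaxes;
          columnaxis / label = "Height";
        endcolumnaxes;
        layout overlay;
          histogram height / binaxis = false;
        endlayout;
        layout overlay;
          boxplot y = height / orient = horizontal;
        endlayout;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data = sashelp.class template = latticeDemo;
run;
```

Swapping layout lattice for layout gridded (and dropping the columnaxes block and lattice-only options) would give each panel its own independent axis.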

127

128

129 Styles You can also tweak the style (aesthetics/appearance) of your graphics. A graph combines three pieces: your data, a style template, and a graph template.

130 What styles? You can use the GUI to look at the details of the styles or you can explore them with code: proc template; source styles.statistical; run; This template includes sections for: fonts, GraphFonts, Table, Header, Data, Color, IndexTitle, IndexProcName, SystemFooter, GraphColors, Graph, GraphBackground, GraphGridlines

131 Fonts, GraphFonts, color_list, Color, GraphColors, Html, Text, Container, Index, Document, Body, Frame, Contents, Pages, Date, BodyDate, IndexItem, ContentFolder, ByContentFolder, IndexProcName, ContentProcLabel, PagesProcLabel, IndexTitle, ContentsTitle, PagesTitle, SysTitleAndFooterContainer, TitleAndNoteContainer, TitlesAndFooters, BylineContainer, SystemTitle, SystemFooter, PageNo, ExtendedPage, Byline, Parskip, Continued, ProcTitle, ProcTitleFixed, Output, Table, Batch, Note, NoteBanner, UserText, PrePage, NoteContentFixed, WarnBanner, WarnContentFixed, ErrorBanner, ErrorContentFixed, FatalBanner, ListItem, Paragraph, List, List2, List3, Graph, GraphWalls, GraphAxisLines, GraphGridLines, GraphOutlines, GraphBorderLines, GraphReference, GraphTitleText, GraphFootnoteText, GraphDataText, GraphLabelText, GraphValueText, GraphUnicodeText, GraphBackground, GraphFloor, GraphLegendBackground, GraphHeaderBackground, DropShadowStyle, GraphDataDefault, GraphData1 through GraphData12, TwoColorRamp, TwoColorAltRamp, ThreeColorRamp, ThreeColorAltRamp, GraphOutlier, GraphFit, GraphFit2, GraphConfidence, GraphConfidence2, GraphPrediction, GraphPredictionLimits, GraphError, GraphBox, GraphBoxMedian, GraphBoxMean, GraphBoxWhisker, GraphHistogram, GraphEllipse, GraphBand, GraphContour, GraphBlock, GraphAltBlock, GraphAnnoLine, GraphAnnoText, GraphAnnoShape, GraphSelection, GraphConnectLine, GraphMissing, GraphControlLimits, GraphRunText, GraphStars, GraphClipping, LayoutContainer. There are a LOT of different parts of a template that can be tweaked.

132 Your Own Style Template You can customize a style template based on another: proc template; define style myStyle; parent = styles.Statistical; style graphdata1 from graphdata1 / color = colors('docbg'); style graphdata2 from graphdata1 / color = violet; style graphdata3 from graphdata1 / color = turquoise; style GraphFonts from GraphFonts / 'GraphDataFont' = ("<sans-serif>, <MTsans-serif>", 9pt); end; run; Notes: parent = styles.Statistical means use everything in the statistical template except the tweaks listed below it. color = colors('docbg') makes the graphic element match the background of the graphic (invisible camouflage). 'GraphDataFont' changes the appearance of the font used for labeling data elements.

133 To get a list of known colors proc registry list startat="COLORNAMES"; run;

134 About the colors You can pick colors by name or by specifying details. Callouts: color of the 12th item in grouped data; contrast color around the 12th item in grouped data (typically confidence bounds).

135 About those colors The weird color names are colors in RGB hexadecimal format prefixed with "cx" Go play at kuler.adobe.com/#create/fromacolor
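A short sketch of the cx notation in use. The value cxFF7F50 is the RGB hex for coral (red FF, green 7F, blue 50) and was picked here only for illustration; sashelp.class stands in for the course data.

```sas
/* A named color and its cx (RGB hexadecimal) equivalent.
   cxFF7F50 = red FF, green 7F, blue 50, which is coral. */
proc sgplot data = sashelp.class;
  histogram weight / fillattrs = (color = cxFF7F50);
run;
```

Any color picked in a tool that reports hex RGB values can be used this way by prefixing the six hex digits with cx.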

136 Using the style template Once the style is created you can apply it to an ODS destination (pipeline) with code like: ods listing style= myStyle; * stuff goes here; ods listing close; or something like this: ods html style= myStyle ; ods graphics on / width = 11in height = 11in; proc sgrender data=whatWhen template=blockplot1; run; ods html close;

137 How to set the color for a histogram

138 proc sgplot data = fram; histogram weight / fillattrs = (color = coral); run;

139 You can also tweak the style template

140 Tweaking the Style Template proc template; define style myStyle; parent = styles.Statistical; style GraphDataDefault / color=coral; end; run; ods html style = myStyle; proc sgplot data = fram; histogram weight ; run; ods html close;

141 vbar Version proc sgplot data = fram; vbar weight / group = sex; run;

142 proc sgplot data = fram; vbar weight / group = sex; xaxis fitpolicy = thin ; run;

143 proc template; define style myStyle; parent = styles.Statistical; style graphdata1 from graphdata1 / contrastColor=pink color = pink; style graphdata2 from graphdata1 / contrastColor=blue color = blue; end; run; ods html style = myStyle; proc sgplot data = fram; vbar weight / group = sex; xaxis fitpolicy = thin ; run; ods html close;

144 What is the Current color? proc template; source styles.default; run; kuler.adobe.com/#

145 Setting Colors … The Hard Way proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay ; histogram v / fillattrs = (color = black) outlineattrs = (color=orange) ; endlayout; endgraph; end; run; proc sgrender data = blah template = TABLENAME; run;

146 Footnotes In the template use: entryfootnote halign=left textattrs=graphvaluetext "TEXT"; or use the %modtmplt macro: title; footnote "halign=left textattrs=graphvaluetext 'blah' "; %modtmplt(template=NAME, steps=t, options=titles noquotes) Use the template, then delete the temporary version: %modtmplt(template=NAME, steps=d) Search online doc for modtmplt and look at this:

147 proc sgplot data = fram; scatter x = height y = weight; run; proc sgplot data = fram; reg x = height y = weight; run;

148 ods listing sge = on style = statistical; proc sgplot data = fram; reg x = height y = weight / markerattrs = (color = green) lineattrs = graphdata1 (color = lime); run;

149 ods listing style = statistical; proc sgplot data = fram; reg x = height y = weight / group = sex ; run;

150 proc template; define style sexE; parent = styles.Statistical; style graphdata1 / contrastColor=pink markersymbol = "star"; style graphdata2 / contrastColor=blue markersymbol = "plus"; end; run; ods listing sge = on style = sexE; proc sgplot data = fram; scatter x = height y = weight / group = sex ; reg x = height y = weight / group = sex ; run;

151

152 The syntax for proc template vs. proc sgplot The following slides marked with "keyboard macro" show the syntax that I have written into enhanced editor keyboard macros for sgplot and template. After downloading and installing the keyboard macros, type the title shown on each of the following slides and it will auto-complete with useful syntax.

153 proc template scatter (for a single panel) proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay / xaxisopts = (offsetmin=.05 offsetmax=.05 label=' ') yaxisopts = (offsetmin=.05 offsetmax=.05 label=' ' linearopts = (tickvaluesequence = (start = end = increment = ) viewmin = ) ); scatterplot y = x = / datalabel = LABELVARIABLE markerattrs = (symbol = circlefilled color = black size = 3px); endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run; Callouts on the slide: "Required"; "Instead of title statement" (entrytitle). Based on code in Statistical Graphics in SAS by Warren F. Kuhfeld. keyboard macros

154 proc template; define statgraph classscatter; begingraph; entrytitle 'Weight by Height'; layout overlay / xaxisopts = (offsetmin=.05 offsetmax=.05 label='Class Height') yaxisopts = (offsetmin=.05 offsetmax=.05 label='Class weight' linearopts = (tickvaluesequence = (start = 50 end = 150 increment = 25) viewmin = 50) ); scatterplot y = weight x = height / datalabel = name markerattrs = (symbol = circlefilled color = black size = 3px ); endlayout; endgraph; end; run; proc sgrender data = sashelp.class template = classscatter; run; Callouts: offsetmin/offsetmax = edge of plot to first tick; viewmin = force to include the lower tick; tickvaluesequence = tick range to consider.

155 proc sgplot scatter proc sgplot data = ; title ""; scatter y = x = / datalabel = markerattrs = (symbol = circlefilled color = black size = 3px); xaxis offsetmin =.05 offsetmax =.05 label = ""; yaxis offsetmin =.05 offsetmax =.05 label = "" values = ( to by ); run; keyboard macros

156 Using proc sgplot scatter proc sgplot data = sashelp.class; title "Weight by Height"; scatter y = weight x = height / datalabel = name markerattrs = (symbol=circlefilled color=black size=3px); reg y = weight x = height; xaxis offsetmin =.05 offsetmax =.05 label = "Height"; yaxis offsetmin =.05 offsetmax =.05 label = "Weight" values = (50 to 150 by 25); run;

157 Proc template reg proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay; scatterplot y = x = ; regressionplot y = x = / degree = 3; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run; keyboard macros

158 Proc sgplot reg proc sgplot data = ; title ""; reg y = x = / datalabel = markerattrs = (symbol = circlefilled color = black size = 3px); xaxis offsetmin =.05 offsetmax =.05 label = ""; yaxis offsetmin =.05 offsetmax =.05 label = "" values = ( to by ); run; keyboard macros

159 Proc template loess proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay; scatterplot y = x = ; loessplot y = x =; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run; keyboard macros

160 proc sgplot loess proc sgplot data = ; title ""; loess y = x = / datalabel = markerattrs = (symbol = circlefilled color = black size = 3px); xaxis offsetmin =.05 offsetmax =.05 label = ""; yaxis offsetmin =.05 offsetmax =.05 label = "" values = ( to by ); run; keyboard macros

161 proc loess proc loess global ods graphics on; * Locally optimal; proc loess data =; model = ; run; * Globally optimal fit; proc loess data= ; model = / select = AICC(global); run; keyboard macros

162 Proc template bspline proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay; scatterplot y = x = ; pbsplineplot y = x =; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run; keyboard macros

163 proc sgplot bspline proc sgplot data = ; title ""; pbspline y = x = / datalabel = markerattrs = (symbol = circlefilled color = black size = 3px); xaxis offsetmin =.05 offsetmax =.05 label = ""; yaxis offsetmin =.05 offsetmax =.05 label = "" values = ( to by ); run; keyboard macros

164 proc transreg For more information on bsplines * Global optimum; proc transreg data =; model identity(OUTCOME) = pbspline(PREDICTOR); run; * Local optimum; proc transreg data = ; model identity(OUTCOME) = pbspline(PREDICTOR / sbc lambda = range); run; keyboard macros

165 Proc template reg group proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay; scatterplot y = x = / group =; regressionplot y = x = / group = degree = 3 name = "thingy"; discretelegend "thingy" / title = ""; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run; keyboard macros

166 Proc sgplot reg group proc sgplot data = ; title ""; reg y = x = / group = datalabel = markerattrs = (symbol = circlefilled color = black size = 3px); xaxis offsetmin =.05 offsetmax =.05 label = ""; yaxis offsetmin =.05 offsetmax =.05 label = "" values = ( to by ); run; keyboard macros

167 Proc template barchart proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay ; barchart y = x = / stat = mean /*freq pct sum */ orient= horizontal; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;

168 proc sgplot hbar proc sgplot data = ; title ""; hbar GROUP / response = RESPONSE stat = mean /*freq mean sum */ numstd = 2 limitstat = /* clm stddev stderr */; run;

169 proc template histogram proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay ; histogram VARIABLE / endlabels = true; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;

170 Proc sgplot histogram proc sgplot data = ; title ""; histogram VARIABLE; run;

171 proc template density proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay ; histogram VARIABLE / endlabels = true; densityplot VARIABLE / kernel(); /* normal() */ endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;

172 proc template fringe proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay ; histogram / endlabels = true; densityplot / kernel(); /* normal() */ fringeplot ; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;

173 proc template boxplot proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay ; boxplot y = x = / orient = horizontal; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;

174 proc sgplot boxplot proc sgplot data = noautolegend; title ""; vbox OUTCOME / category = GROUP; run;

175 proc template series proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay ; seriesplot y = OUTCOME x = DATEVAR / group = GROUPVAR name = 'thingy'; discretelegend 'thingy' / title = "SOMETHING"; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;

176 proc template dot proc means data = noprint nway; var OUTCOME; class THEGROUP; output out = tmp mean = OUTCOME lclm = lower uclm = upper; run; proc template; define statgraph dotplot; begingraph; entrytitle ''; layout overlay / yaxisopts = (type = discrete griddisplay = on reverse = true); scatterplot y = THEGROUP x = OUTCOME / xerrorlower = lower xerrorupper = upper markerattrs = (symbol = circlefilled) name = 'thingy' legendlabel = "mean and 95% Confidence Limits"; discretelegend 'thingy' / title = "whatever"; endlayout; endgraph; end; run; proc sgrender data = tmp template = dotplot; run;

177 proc template needle proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay; needleplot y = x = ; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;

178 proc template step proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay; stepplot y = x = / display = (markers) markerattrs = (size = 3px); endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;

179 proc template block proc template; define statgraph TABLENAME; begingraph; entrytitle ''; layout overlay; blockplot x = DATE block = THEBLOCK / filltype=multicolor datatransparency=.3 valuevalign=top labelposition=top display=(fill values label) blockindex = IDNUMBER; endlayout; endgraph; end; run; proc sgrender data = template = TABLENAME; run;