Stata 1, Graphics
Hein Stigum
Presentation, data and programs at: http://folk.uio.no/heins/
Why use graphs?
Problem example
  Lunch meals per week
  Table of means (around 5 per week)
  Linear regression
Jan-19 H.S.
Problem example 2
  Iron level
  Both linear and logistic regression
  Opposite results
Structure of talk
  Order:
    Plot types (focus: work/presentation plots)
    Outcome type (focus: the right plot)
  The commands
Plot types
Plot types
  Plot at start of analysis: scatter, density, box (raw data)
  Plot results: bar, dot (derived data)
  Pie: poor, parts of a whole
  Main difference: twoway plots individual data; the others plot summaries of data (means, medians, …)
Continuous outcome
Univariate
  Density:  kdensity weight
  Boxplot:  graph hbox weight
Density with "box" information
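One possible way to put box-plot information on a density curve is to draw the median and quartiles as vertical lines. This is a sketch only, not necessarily how the slide's figure was made. It assumes the auto data shipped with Stata as a stand-in for the birth data, and the local names q1, med and q3 are made up for the example.

```stata
* Stand-in data: auto.dta ships with Stata (the talk uses birth data instead)
sysuse auto, clear

* summarize, detail leaves the quartiles in r(p25), r(p50), r(p75)
quietly summarize weight, detail
local q1  = r(p25)
local med = r(p50)
local q3  = r(p75)

* Density curve with the box-plot landmarks as vertical lines:
* solid line at the median, dashed lines at the quartiles
kdensity weight, xline(`med', lpattern(solid)) xline(`q1' `q3', lpattern(dash))
```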
Bivariate
  Scatter
    scatter weight gest
  Use scatter and density for:
    Continuous Y and categorical x
    Continuous Y and continuous x
    Binary Y and continuous x
Scatter and density plots for many types of data
Twoway density
  Syntax
    graph twoway (plot1, opts) (plot2, opts), opts
  One plot
    kdensity x
  Two plots overlaid
    twoway ( kdensity weight if sex==1, lcolor(blue) ) ///
           ( kdensity weight if sex==2, lcolor(red) )
  Side by side
    twoway ( kdensity weight ), by(sex)
  To show distributions, density is best.
  Area scales to 1, so groups of different size can be compared.
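The point that each density has area 1 regardless of group size can be checked numerically. The sketch below assumes the auto data shipped with Stata as a stand-in, with foreign playing the role of sex. The variable names x0, d0, x1 and d1 are made up for the example. Both integrals should come out close to 1, even though the two groups differ in size.

```stata
* Stand-in data: auto.dta ships with Stata
sysuse auto, clear

* Save the estimated density curves instead of drawing them
kdensity weight if foreign==0, generate(x0 d0) nograph
kdensity weight if foreign==1, generate(x1 d1) nograph

* Numerical integration of each curve; the result is left in r(integral)
integ d0 x0
integ d1 x1
```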
Twoway scatter + fit
  Syntax
    graph twoway (plot1, opts) (plot2, opts), opts
  Examples
    scatter y x
    twoway (scatter y x) (fpfitci y x) (lfit y x)
  Fit lines
    lfit, lfitci       linear
    qfit, qfitci       quadratic
    mband, mspline     median band, median spline
    fpfitci            fractional polynomial
    lowess             local regression
  Scatter shows association and outliers
  No kernel smoother with ci
  Run 1 ex
Continuous by 3 categories
  Is birth weight the same over parity?
  Scatterplot: Linear effect? Outliers?
  Density plot: Equal means? Equal variances?
  Scatter to see linear/non-linear effect and to look for outliers
  Density to see equal variance
Continuous by 3 categories
  Scatterplot
    twoway (scatter weight parity3) ///
           (fpfitci weight parity3) ///
           (lfit weight parity3)    ///
           , legend(off)
  Look for:
    Outliers (all analyses)
    Non-linear effects (regression)
Continuous by 3 categories
  Density plot
    twoway (kdensity weight if parity3==0, lcol(black)) ///
           (kdensity weight if parity3==1, lcol(blue))  ///
           (kdensity weight if parity3==2, lcol(red))   ///
           , yscale(off)
  Look for:
    Different locations
    Different shapes (ANOVA, regression)
Twoway density options
    kdensity x, normal       add normal curve
    kdensity x, area(400)    frequency, N=400
    display r(width)         previous width
    kdensity x, width(80)    less smoothing
Twoway options
  Syntax
    graph twoway (plot1, opts) (plot2, opts), opts
  Options
    lcolor(red)            line color
    lpattern(".-")         line pattern
    lwidth(*2)             line width *2
    legend(
      ring(0)              legend inside plot
      pos(2)               legend at 2 o'clock position
      col(1)               legends in 1 column
      label(1 "First")     legend label plot 1
      label(2 "Second")    legend label plot 2
    )
Continuous by continuous
    twoway (scatter weight gest) (fpfitci weight gest) (lfit weight gest)
  Look for:
    Main effect (line)
    Non-linearity (smooth)
    Outliers
  fpfitci plus lfit gives a graphic test of linearity
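A runnable sketch of the graphic linearity check, assuming the auto data shipped with Stata as a stand-in (mpg and weight replace the slide's weight and gest). The confidence band is drawn first here so that the shaded area does not cover the markers; that ordering is a choice made for this example, not taken from the slide.

```stata
* Stand-in data: auto.dta ships with Stata
sysuse auto, clear

* Band and smooth first, then the points, then the straight line on top
twoway (fpfitci mpg weight)               ///
       (scatter mpg weight, msize(*0.5))  ///
       (lfit mpg weight, lcolor(red)),    ///
       legend(off)
```

If the straight line stays inside the fractional polynomial band over the whole x range, a linear effect looks reasonable. Where the line leaves the band, suspect non-linearity. This is an informal visual check, not a formal test.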
More twoway options
  Syntax
    graph twoway (plot1, opts) (plot2, opts), opts
  Options
    msize(*0.5)     marker size
    mlabel(id)      marker label = variable id
    xline(24)       line at x=24
    scale(1.5)      all elements 1.5 times larger
Mark outliers
    twoway (scatter weight gest) ///
           (scatter weight gest if gest>400, mlabel(id))
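The same idea on data that ships with Stata, as a sketch: the second scatter redraws only the extreme points and attaches a label to them. The cutoff mpg > 35 and the label variable make are choices for this example only.

```stata
* Stand-in data: auto.dta ships with Stata
sysuse auto, clear

* First plot: all points. Second plot: only the extreme ones, with labels.
twoway (scatter mpg weight) ///
       (scatter mpg weight if mpg > 35, mlabel(make) mlabposition(9)), ///
       legend(off)
```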
Titles, legend, labels and scale
Titles
    scatter weight gest, title("title") subtitle("subtitle") ///
        xtitle("xtitle") ytitle("ytitle") note("note")
Legend
    …, legend( ring(0) pos(11) col(1) label(1 "Boys, N=283") label(2 "Girls, N=270") )
    …, legend(off)
Axis scale and label
    scatter weight gest, xscale(range(250 310)) ///
        xlabel( 250(20)310 281)
  range() will not exclude data; use "if" for that.
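A short sketch of the difference between the axis range and "if", assuming the auto data shipped with Stata as a stand-in. In these data weight runs beyond both ends of the requested range, so the first plot still shows every car and the axis simply stretches to fit them.

```stata
* Stand-in data: auto.dta ships with Stata
sysuse auto, clear

* xscale(range()) can only widen the axis: all observations are still drawn
scatter mpg weight, xscale(range(2000 4000)) xlabel(2000(500)4000)

* To leave points out, restrict the sample with if
scatter mpg weight if inrange(weight, 2000, 4000), xlabel(2000(500)4000)
```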
Categorical outcome
Comparing means or proportions
  v1, v2, v3 may be continuous or binary
  x is categorical
Comparing means/prop., better
    preserve                                  "save" data
    collapse (mean) v1 v2 v3, by(parity)      aggregate
    list                                      list the new data
    twoway (scatter v1 parity) (line v1 parity) ///
           (scatter v2 parity) (line v2 parity) ///
           (scatter v3 parity) (line v3 parity)
    restore                                   restore original data
  Could also use scatter y x, connect(direct)
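A runnable version of the preserve / collapse / plot / restore pattern, sketched on the auto data shipped with Stata. The variable heavy is made up for the example to show that the mean of a 0/1 variable is a proportion. twoway connected is used here as shorthand for a scatter plus a line.

```stata
* Stand-in data: auto.dta ships with Stata
sysuse auto, clear

* A binary variable: its mean within a group is a proportion
generate heavy = weight > 3000

preserve                                  // keep a copy of the full data
collapse (mean) foreign heavy, by(rep78)  // one row per category of rep78
list                                      // inspect the aggregated data
twoway (connected foreign rep78) (connected heavy rep78), ///
       ytitle("Proportion") legend(label(1 "Foreign") label(2 "Heavy"))
restore                                   // back to the original data
```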
Binary outcome
Scatter: binary by continuous
Binary with rug and smooth
    gen yy=.
    replace yy= 0.02*(lowbw==0)+ 0.98*(lowbw==1)
    twoway (rspike yy lowbw gest) (fpfit lowbw gest)
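The rug works because rspike draws a short vertical spike from yy to the outcome at each x: from 0.02 down to 0 for the zeros and from 0.98 up to 1 for the ones. A sketch on the auto data shipped with Stata, with foreign standing in for lowbw and weight for gest:

```stata
* Stand-in data: auto.dta ships with Stata
sysuse auto, clear

* Inner end of each rug spike: 0.02 for outcome 0, 0.98 for outcome 1
generate yy = 0.02*(foreign==0) + 0.98*(foreign==1)

* Rug at the bottom and top, fractional polynomial smooth in between
twoway (rspike yy foreign weight) (fpfit foreign weight), legend(off)
```

The smooth can be read as the proportion with outcome 1 as a function of x.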
Regression results
  Program not automatic, not shown
  May use log axis