Presentation is loading. Please wait.

Presentation is loading. Please wait.

Apr-15H.S.1 Stata: Linear Regression Stata 3, linear regression Hein Stigum Presentation, data and programs at: courses.

Similar presentations

Presentation on theme: "Apr-15H.S.1 Stata: Linear Regression Stata 3, linear regression Hein Stigum Presentation, data and programs at: courses."— Presentation transcript:

1 Apr-15H.S.1 Stata: Linear Regression Stata 3, linear regression Hein Stigum Presentation, data and programs at: courses

2 SYNTHETIC DATA EXAMPLE Birth weight by gestational age Apr-15H.S.2

3 Apr-15H.S.3 Linear regression Birth weight by gestational age

4 Apr-15H.S.4 Regression idea

5 Apr-15H.S.5 Model, measure and assumptions Model Association measure  1 = change in y for one unit increase in x 1 Assumptions –Independent errors –Linear effects –Constant error variance Robustness –influence

6 Apr-15H.S.6 Association measure Model: Start with: Hence:

7 Apr-15H.S.7 Purpose of regression Estimation –Estimate association between outcome and exposure adjusted for other covariates Prediction –Use an estimated model to predict the outcome given covariates in a new dataset

8 Outcome distributions by exposure Apr-15H.S.8 Linear regression Quantile regression or cutoff, logistic regression Linear regression or transform, linear regression

9 Apr-15H.S.9 Workflow DAG Scatter- and densityplots Bivariate analysis Regression –Model estimation –Test of assumptions Independent errors Linear effects Constant error variance –Robustness Influence E gest age D birth weight C2 education C1 sex

10 Scatter and density plots Scatter of birth weight by gestational age Distribution of birth weight for low/high gestational age Apr-15H.S.10 Look for deviations from linearity and outliers Look for shift in shape

11 Apr-15H.S.11 Syntax Estimation –regress y x1 x2 linear regression –regress y c.age continuous age, categorical sex –regress y main+interaction Compare models –estimates store m1 save model –estimates table m1 m2 compare coefficients –estimates stats m1 m2 compare model fit Post estimation –predict res, residuals predict residuals

12 Apr-15H.S.12 Model 1: outcome+exposure regress bw gestcrude model estimates store m1store model results

13 Apr-15H.S.13 Model 2 and 3: Add covariates Estimate association: m1 is biased, m2=m3 Prediction: m3 is best regress bw gest i.educsexadd covariates estimates table m1 m2 m3compare coefs estimates stats m1 m2 m3compare fit m3 more precise?

14 Factor (categorical) variables Variable –educ = 1, 2, 3 for low, medium and high education Built in –i.educuse educ=1 as base (reference) –ib3.educuse educ=3 as base (reference) Manual “dummies” –educ=1 as base, make dummies for 2 and 3 –generate Medium =(educ==2) if educ<. –generate High =(educ==3) if educ<. Apr-15H.S.14

15 Create meaningful constant Expected birth weight at: gest= 0, educ=1, sex=0, not meaningful gest=280, educ=1, sex=0 Expected birth weight: Margins: margins, at(gest= 0 educ=1 sex=0) = not meaningful margins, at(gest= 280 educ=1 sex=0) = 3426 Apr-15H.S.15

16 Results so far Apr-15H.S.16 Would normally check for interaction now!


18 Apr-15H.S.18 Test of assumptions Assumptions –Independent residuals: –Linear effects: –Constant variance: estat hettest p=0.9 no heteroskedasticity discuss plot residuals versus predicted y predict res, residuals predict pred, xb scatter res pred

19 Apr-15H.S.19 Violations of assumptions Dependent residuals Use mixed models or GEE Non linear effects Add square term or spline Non-constant variance Use robust variance estimation

20 ROBUSTNESS Measures of influence Apr-15H.S.20

21 Apr-15H.S.21 Influence idea

22 Apr-15H.S.22 Measures of influence Measure change in: –Predicted outcome –Deviance –Coefficients (beta) Delta beta Remove obs 1, see change remove obs 2, see change

23 Apr-15H.S.23 Leverage versus residuals 2 lvr2plot, mlabel(id) high influence

24 Delta-beta for gestational age Apr-15H.S.24 dfbeta(gest) scatter _dfbeta_1 id OBS, variable specific If obs nr 370 is removed, beta will change from 17.9 to 18.6

25 Apr-15H.S.25 Removing outlier regress bw gest i.educ sex if id!=370 est store m4 est table m3 m4, b(%8.1f)

26 Removing outlier Apr-15H.S.26 Full model N=5000 Outlier removed N=4999 One outlier affected several estimates Final model

27 Help Linear regression –help regress syntax and options –help regress postestimation dfbeta estat hettest lvr2plot predict margins Apr-15H.S.27

28 NON-LINEAR EFFECTS bw2 Apr-15H.S.28

29 bw2: Non-linear effects Apr-15H.S.29 Handle: add polynomial or spline

30 Non-linear effects: polynomial Apr-15H.S.30 regress bw2 c.gest##c.gest i.educ sex2. order polynomial in gest margins, at(gest=(250(10)310))predicted bw2 by gest marginsplotplot

31 Non-linear effects: spline Qubic spline Plot Linear spline Apr-15H.S.31 mkspline g=gest, cubic nknots(4)make spline with 4 knots regress bw2 g1 g2 g3 i.educ sexregression with spline gen igest=5*round(gest/5)5-year integer values of gest margins, over(igest)predicted bw by gest marginsplot mkspline g1 280 g2=gestmake linear spline with knot at 280 regress bw2 g1 g2 i.educ sexregression with spline

32 INTERACTION bw3 Apr-15H.S.32

33 Interaction definitions Interaction: combined effect of two variables Scale –Linear modelsadditive y=b 0 +b 1 x 1 +b 2 x 2 both x 1 and x 2 = b 1 +b 2 –Logistic, Poisson, Coxmultiplicative Interaction –deviation from additivity (multiplicativity)  –effect of x 1 depends on x 2 Apr-15H.S.33

34 bw3: Interaction (only linear effects) Add interaction terms Show results Apr-15H.S.34 regress bw3 i.educgest-sex interaction margins, dydx(gest) at(sex=0)effect of gest for boys margins, dydx(gest) at(sex=1)effect of gest for girls

35 Summing up 1 Build model –regress bw gestcrude model –est store m1store –regress bw gest i.educ sexfull model –est store m2 –est table m1 m2compare coefficients Interaction –regress bw3 i.eductest interaction –margins, dydx(gest) at(sex=0)gest for boys Assumptions –predict res, residualsresiduals –predict pred, xbpredicted –scatter res predplot Apr-15H.S.35

36 Summing up 2 Non-linearity (linear spline) –mkspline g1 280 g2=gestspline with knot at 280 –regress bw2 g1 g2 i.educ sexregression with spline Robustness –dfbeta(gest)delta-beta –scatter _dfbeta_1 idplot versus id Apr-15H.S.36

Download ppt "Apr-15H.S.1 Stata: Linear Regression Stata 3, linear regression Hein Stigum Presentation, data and programs at: courses."

Similar presentations

Ads by Google