Unit 2a: Dealing “Empirically” with Nonlinear Relationships
© Andrew Ho, Harvard Graduate School of Education


1 Unit 2a: Dealing “Empirically” with Nonlinear Relationships. http://xkcd.com/1162/

2 Course Roadmap: Unit 2a
Today’s topics:
• Revisiting the importance of the linearity assumption
• Understanding Tukey’s Ladder and the Rule of the Bulge
• Transformations and the Box-Cox Procedure
Multiple Regression Analysis (MRA):
• Do your residuals meet the required assumptions? Test for residual normality. Use influence statistics to detect atypical datapoints.
• If your residuals are not independent, replace OLS by GLS regression analysis. Use individual growth modeling. Specify a multi-level model.
• If time is a predictor, you need discrete-time survival analysis…
• If your outcome is categorical, you need to use… binomial logistic regression analysis (dichotomous outcome) or multinomial logistic regression analysis (polytomous outcome).
• If you have more predictors than you can deal with, create taxonomies of fitted models and compare them. Form composites of the indicators of any common construct: conduct a Principal Components Analysis, use Cluster Analysis, or use Factor Analysis (EFA or CFA?).
• If your outcome vs. predictor relationship is non-linear, use non-linear regression analysis or transform the outcome or predictor. (Today’s Topic Area)

3 High-priority conditions must be met for accurate statistical inference with linear OLS regression. (Most of this falls under the heading of “independent and identically normally distributed errors.”)
Assumption: Linear Outcome/Predictor Relationships
“In the population, … the bivariate relationship between the outcome and each predictor must be linear.”
How Does Failure of the Assumption Affect OLS Regression Analysis?
If the modeled relationship is not linear, then it will be misrepresented by the linear regression analysis, and the fundamental underpinnings of the entire analysis are at risk:
• OLS-estimated regression slope will not represent the population relationship.
• Assumptions about the population residuals (sometimes called, simply, “errors”) will be violated.
• Estimated residuals will be incorrect.
• Statistical inference will be incorrect.

4 Two General Approaches to Fitting Nonlinear Relationships
Theory-Driven, “Rational” Approach (Next Class). Harder to apply, easier to interpret.
• Use theory, or knowledge of the field, to postulate a non-linear model for the hypothesized relationship between outcome and predictor.
• Use nonlinear regression analysis to fit the postulated trend, and conduct all of your statistical inference there.
• Interpret the parameter estimates directly, and produce plots of findings.
Data-Driven, “Empirical” Approach (Today’s Class). Easier to apply, harder to interpret.
• Find an ad hoc transformation of either the outcome or the predictor, or both, that renders their relationship linear.
• Use regular linear regression analysis to fit a linear trend in the transformed world, and conduct all statistical inference there.
• De-transform the fitted model to produce plots of findings, and tell the substantive story in the untransformed world.

5 Nancy Bayley’s Infant #8: The Development of Intelligence
Dataset: BAYLEY.txt
Overview: IQ as a function of age for a female infant, from birth to age 60 months.
Source: Target child is a female infant (infant #8) from the Berkeley Growth and Guidance Study.
More Info: To learn more about the data, consult:
• The overview of the Oakland and Berkeley Growth and Guidance Studies at the Carolina Population Center.
• Glen Elder’s presentation on “Longitudinal Studies and the Life Course, the 1960s and 1970s,” prepared for the anniversary of the Institute of Human Development, UC Berkeley (2003).
Sample size: One infant, over 21 occasions of measurement.
Last updated: October 6, 2007
Structure of Dataset:
Col. #   Variable Name   Variable Description                                         Variable Metric/Labels
1        IQ              Infant’s score on the Bayley Scales of Infant Development    Continuous raw score
2        AGE             Age of infant                                                Months
Data:
IQ    AGE
4     1
10    2
17    3
37    5
65    7
85    9
88    10
95    11
101   12
103   13
107   14
113   15
121   18
148   21
161   24
165   27
187   36
205   42
218   48
218   54
228   60
This analysis comes with some caveats: 1) we’re interested in the nature of individual growth over time (addressed later in this class and in S-077); 2) we aren’t fully accounting for differences between individuals (we have only one); and 3) we aren’t accounting for the problem of “autocorrelation” that can arise in time series data. “Adjacent” errors may not be independent!

6 Well, a simple linear fit doesn’t look like it’s going to suffice. RQ: What is the functional form of the growth trajectory?

7 Use residual plots for better diagnosis of regression assumptions. Residual plots (including those introduced in Unit 1d with standardized residuals) are far better at detecting nonlinearity than straight scatterplots. The summary statistics from the linear fit seem quite compelling but are deeply misleading. R-squared understates the strength of the nonlinear relationship, and interpreting the slope of a line, as well as its significance, is an exercise in describing a poorly specified model.
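
A minimal Python sketch of the residual check described on this slide, using the infant #8 values listed on Slide 5. The helper name `ols` and the sign-string summary are illustrative choices that do not appear in the slides, which show plots rather than code.

```python
# Bayley infant #8 data as listed on Slide 5 (age in months, IQ raw score).
AGE = [1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 18, 21, 24, 27, 36, 42, 48, 54, 60]
IQ = [4, 10, 17, 37, 65, 85, 88, 95, 101, 103, 107, 113, 121, 148, 161, 165,
      187, 205, 218, 218, 228]


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Fit the (misspecified) straight line IQ = b0 + b1 * AGE.
b0, b1 = ols(AGE, IQ)
residuals = [y - (b0 + b1 * x) for x, y in zip(AGE, IQ)]

# R-squared of the straight-line fit.
mean_iq = sum(IQ) / len(IQ)
rss = sum(r ** 2 for r in residuals)
tss = sum((y - mean_iq) ** 2 for y in IQ)
r_squared = 1 - rss / tss

# One character per occasion, in age order: the sign of each residual.
signs = "".join("+" if r > 0 else "-" for r in residuals)
print(f"R-squared = {r_squared:.3f}")
print(signs)
```

Long runs of same-signed residuals at both ends of the age range, with the opposite sign in between, are the systematic curvature that a residual plot exposes even when R-squared looks respectable.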

8 [Figure: ladder of power transformations]
UP (increasing power)
• Upper rungs: Higher powers (power > 1)
• Middle rung: No transformation (power = 1)
• Lower rungs: Roots (0 < power < 1)
• Really low rungs: Inverses (power < 0)
DOWN (decreasing power)
http://onlinestatbook.com/stat_sim/transformations/index.html
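
To make the rungs concrete, here is a small Python sketch of how moving up or down the ladder stretches or compresses the upper end of a scale. The function name `ladder`, the three sample values, and the choice to treat power 0 as the natural log (a common convention that this slide does not list) are all assumptions made for illustration.

```python
import math


def ladder(value, power):
    """Transform a positive value to the given rung; power 0 is taken to mean log."""
    if value <= 0:
        raise ValueError("this sketch assumes strictly positive values")
    return math.log(value) if power == 0 else value ** power


def gap_ratio(power, low=1, mid=10, high=100):
    """Size of the upper gap (mid..high) relative to the lower gap (low..mid)."""
    t_low, t_mid, t_high = (ladder(v, power) for v in (low, mid, high))
    return abs(t_high - t_mid) / abs(t_mid - t_low)


powers = (2, 1, 0.5, 0, -1)
ratios = {p: gap_ratio(p) for p in powers}
for p in powers:
    print(f"power {p:>4}: upper gap / lower gap = {ratios[p]:.3f}")
```

Going up the ladder stretches large values apart relative to small ones; going down pulls them together. Inverses also reverse the order of the values, which is why the sketch compares absolute gaps.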

9 Which transformation? For which variable? [Figure: ladder running UP (increasing power) and DOWN (decreasing power)]

12 A Rough, Shallow, Data-Driven Approach. [Figure labels: Original correlation; Best of these correlations]
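
One way to read “best of these correlations” is as a brute-force search: transform a variable at several rungs and keep the rung with the strongest linear correlation. The Python sketch below does this for AGE using the Slide 5 data. The set of candidate powers, the decision to transform the predictor rather than the outcome, and the helper names are assumptions; the slide does not say which candidates were tried.

```python
import math

# Bayley infant #8 data as listed on Slide 5.
AGE = [1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 18, 21, 24, 27, 36, 42, 48, 54, 60]
IQ = [4, 10, 17, 37, 65, 85, 88, 95, 101, 103, 107, 113, 121, 148, 161, 165,
      187, 205, 218, 218, 228]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ladder(value, power):
    """Power transformation; power 0 is taken to mean the natural log."""
    return math.log(value) if power == 0 else value ** power


# Candidate rungs (an illustrative set, not taken from the slides).
powers = (2, 1, 0.5, 0, -0.5, -1)
results = {p: pearson_r([ladder(a, p) for a in AGE], IQ) for p in powers}

# Inverse rungs flip the sign of r, so compare absolute values.
best = max(results, key=lambda p: abs(results[p]))
for p in powers:
    print(f"power {p:>4}: r = {results[p]:+.3f}")
print("largest |r| at power", best)
```

Picking the winner this way uses the same data twice, once to choose the transformation and once to evaluate it, which is part of why the slide calls the approach rough and shallow.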

13 “Starting” or “Tuning” your Transformation by Adding a Constant
• Power and log transformations become problematic with negative and zero values.
• Even with all positive values, we often add 1 by convention to “start” or “tune” the transformation.
• This ends up making a small difference and offers another indication of how arbitrary and shallow this data-driven process can be.
Add the starting/tuning constant BEFORE you transform.
[Figure panels: Untransformed; Transformed, no starting constant; Starting constant of 1 added prior to transformation]
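
The order of operations matters here, so a short Python sketch may help. The function name `started_log` and the sample values are hypothetical; the default constant of 1 follows the convention mentioned on the slide.

```python
import math


def started_log(value, start=1.0):
    """Add the starting constant first, then take the natural log."""
    shifted = value + start
    if shifted <= 0:
        raise ValueError("starting constant is too small for this value")
    return math.log(shifted)


values = [0, 1, 9]

# math.log(0) would raise ValueError; starting at 1 makes zero usable.
started = [started_log(v) for v in values]
print(started)

# Adding the constant AFTER transforming is a different (and here unintended)
# operation: log(9) + 1 is not log(9 + 1).
after = math.log(9) + 1
before = started_log(9)
print(before, after)
```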

14 The Box-Cox Procedure: A Formal, Still Shallow, Data-Driven Approach. The Box-Cox procedure will overstate R-squared and inflate your Type I error rate (false alarms) if used uncritically. It capitalizes on chance variation in the sample that leads to a fit that does not generalize to the population (overfitting).
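
For readers who want to see the mechanics, the Python sketch below runs a grid search over the Box-Cox parameter for the outcome, using the standard textbook form of the transformation, (y^λ − 1)/λ with log y at λ = 0, and the usual profile log-likelihood. Applying it to IQ with AGE left untransformed, the grid of λ values, and all function names are assumptions; the slides do not show which variable or software settings were used.

```python
import math

# Bayley infant #8 data as listed on Slide 5.
AGE = [1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 18, 21, 24, 27, 36, 42, 48, 54, 60]
IQ = [4, 10, 17, 37, 65, 85, 88, 95, 101, 103, 107, 113, 121, 148, 161, 165,
      187, 205, 218, 218, 228]


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def box_cox(y, lam):
    """Box-Cox transform of a positive value y."""
    return math.log(y) if lam == 0 else (y ** lam - 1) / lam


def profile_loglik(x, y, lam):
    """Profile log-likelihood (up to a constant) of lam for the model
    box_cox(y, lam) = b0 + b1 * x + error."""
    n = len(y)
    z = [box_cox(v, lam) for v in y]
    b0, b1 = ols(x, z)
    rss = sum((zi - (b0 + b1 * xi)) ** 2 for xi, zi in zip(x, z))
    jacobian = (lam - 1) * sum(math.log(v) for v in y)
    return -n / 2 * math.log(rss / n) + jacobian


# Illustrative grid: lambda from -2.0 to 3.0 in steps of 0.1.
grid = [k / 10 for k in range(-20, 31)]
best_lam = max(grid, key=lambda lam: profile_loglik(AGE, IQ, lam))
print("lambda with the highest profile log-likelihood:", best_lam)
```

The chosen λ is tuned to this one sample, which is the overfitting risk the slide warns about; treating the resulting fit as if λ had been fixed in advance overstates the evidence.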

15 All we’re doing is plotting this equation over a scatterplot of our data.

16 All we’re doing is plotting this equation over a scatterplot of our data.

17 The Transformed World and the Untransformed World
• Regression line in the transformed space (bending the points)
• Implied function in the untransformed space (bending the line)
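
The two worlds can be sketched in Python as follows. This example assumes, purely for illustration, that the predictor is transformed with the natural log. The transformation actually used in the slides is not recoverable from this transcript, and the names `ols` and `fitted_iq` are hypothetical.

```python
import math

# Bayley infant #8 data as listed on Slide 5.
AGE = [1, 2, 3, 5, 7, 9, 10, 11, 12, 13, 14, 15, 18, 21, 24, 27, 36, 42, 48, 54, 60]
IQ = [4, 10, 17, 37, 65, 85, 88, 95, 101, 103, 107, 113, 121, 148, 161, 165,
      187, 205, 218, 218, 228]


def ols(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Transformed world: a straight line of IQ on log(AGE).
log_age = [math.log(a) for a in AGE]
b0, b1 = ols(log_age, IQ)


def fitted_iq(age_months):
    """Untransformed world: the same fitted equation, read as a curve in raw age."""
    return b0 + b1 * math.log(age_months)


for age in (1, 6, 12, 24, 48):
    print(age, round(fitted_iq(age), 1))
```

Under this assumed model, every doubling of age adds the same amount (b1 times ln 2) to the fitted score, so a straight line in log-age becomes a curve that flattens in raw age.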

18 [Figure panels: Untransformed; Transformed] Horizontal vs. vertical acceleration. Sometimes accompanied by a telltale decrease in the density of observations (positive skew). http://www.ats.ucla.edu/stat/mult_pkg/faq/general/log_transformed_regression.htm

