# Getting More out of Multiple Regression Darren Campbell, PhD.


Overview

- View on teaching statistics
- When to apply
- How to use and how to interpret

Multiple Regression Techniques

1. Centring: removing group-difference confounds
2. Centring: interpreting continuous interactions
3. Spline functions (piecemeal polynomials): estimate separate slopes for each angle of the regression polynomial

Perks of Multiple Regression

1. Realistic: many influences on behaviour
2. Control over confounds
3. Test for relative importance
4. Identify interactions

Why Not Use ANOVAs?

- Not realistic: many behaviours and constructs are continuous, e.g., intelligence, personality
- Loss of statistical power: within a category, scores are assumed to be the same plus error, which mixes systematic patterns into the error term

What is Centring? A simple re-scaling of raw scores: raw score minus some constant value.

- x1 – 5.1: 1 – 5.1 = -4.1; 4 – 5.1 = -1.1
- x2 – 29.4: 30 – 29.4 = 0.6; 35 – 29.4 = 5.6

A Simple Case for Centring

- Babies cry and fuss (parent-report diary measures)
- Babies flail about (limb movement)
- Are these 2 infant behaviours related? Emotional responses and emotion regulation

A Simple Case for Centring

| Age | Moves / Hr | Crying Hrs/Day |
|---|---|---|
| 6-week-olds | 5.1 | 4.7 |
| 6-month-olds | 29.4 | 3.5 |
| Full sample | 17.2 | 4.1 |

Are these 2 infant behaviours related?

6-Week-Olds: r = +.47. Some infants cry more and move more; others cry less and move less.

6-Month-Olds: r = +.38. Some infants cry more and move more; others cry less and move less. What if we combine the two groups?

Full sample: r = -0.22. Do we get a significant correlation? If so, what kind?

What happened with the Correlations?

- 6-week-olds: r = +.47
- 6-month-olds: r = +.38
- 6-week-olds and 6-month-olds combined: r = -0.22

Correlations = Grand Mean Centring

1. Mean deviations for each variable: X and Y
2. Rank-order the mean deviations
3. Correlate the 2 rank orders of X and Y
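As a minimal sketch of the idea on this slide, the function below builds a Pearson correlation directly from deviations around each variable's grand mean. The scores are hypothetical. Strictly speaking, Pearson's r works with the deviations themselves rather than their ranks (correlating ranks gives Spearman's coefficient), but the grand mean is the reference point in both cases.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson r computed from deviations around each variable's grand mean."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    dx = [xi - mx for xi in x]          # grand-mean deviations of X
    dy = [yi - my for yi in y]          # grand-mean deviations of Y
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores, chosen only for illustration
moves = [1, 2, 3, 4, 5]
crying = [2, 1, 4, 3, 5]
r = pearson_r(moves, crying)
```

Because every score is compared with one overall mean, a large difference between group means can dominate the deviations.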

The Disappearing Correlation Explained

- Grand mean centring led to all the older infants being classified as high movers and the young infants as low movers
- Young infants who were high criers and high movers became high criers and low movers
- Large group differences in movement altered the detection of the within-group r's

What should we do?

Solution: Create Group Mean Deviations. Re-scale the raw scores: raw score minus group mean.

- 6-week-olds: x – 5.1
- 6-month-olds: x – 29.4

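Solution: Create Group Mean Deviations

| Crying | Raw AL | Group Mean | Group-Centred AL |
|---|---|---|---|
| 5.7 | 1 | 5.11 | -4.11 |
| 6 | 4 | 5.11 | -1.11 |
| 2 | 5 | 5.11 | -0.11 |
| 0.5 | 30 | 29.4 | 0.63 |
| 2.5 | 35 | 29.4 | 5.63 |
| 2 | 34 | 29.4 | 4.63 |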
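The effect of group-mean centring can be sketched with a small made-up data set. The numbers below are hypothetical, not the study's data; they are constructed so that movement and crying rise together within each age group while the older group moves much more and cries less overall.

```python
import numpy as np

# Hypothetical data (not the study's): within each age group,
# infants who move more also cry more.
moves_young = np.array([3.0, 4.0, 5.0, 6.0, 7.0])
cry_young = np.array([4.3, 4.5, 4.7, 4.9, 5.1])
moves_old = np.array([27.0, 28.0, 29.0, 30.0, 31.0])
cry_old = np.array([3.1, 3.3, 3.5, 3.7, 3.9])

def r(x, y):
    """Pearson correlation between two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]

moves = np.concatenate([moves_young, moves_old])
crying = np.concatenate([cry_young, cry_old])

r_young = r(moves_young, cry_young)   # positive within the young group
r_old = r(moves_old, cry_old)         # positive within the older group
r_pooled = r(moves, crying)           # negative once the groups are pooled

# Group-mean centring: subtract each group's own mean movement score
moves_centred = np.concatenate([
    moves_young - moves_young.mean(),
    moves_old - moves_old.mean(),
])
r_centred = r(moves_centred, crying)  # positive again in the full sample
```

In this sketch only the movement scores are centred, so the remaining group difference in crying keeps the full-sample correlation below the within-group values.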

Raw Scores

Group Centred Scores

- Group mean data: r = .41 for the full sample
- Multiple regression could also work on uncentred variables: Crying = Group + Uncentred AL
- Not a Group x AL interaction: the relation is the same for both groups
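The regression alternative mentioned above, Crying = Group + Uncentred AL, might look like the following sketch. The data are hypothetical and error-free, the 0/1 coding of the group dummy is an assumption, and `numpy.linalg.lstsq` stands in for whatever regression routine is actually used.

```python
import numpy as np

# Hypothetical data: same within-group slope (0.2), different group levels
moves = np.array([3, 4, 5, 6, 7, 27, 28, 29, 30, 31], dtype=float)
group = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)  # 0 = 6 weeks, 1 = 6 months
crying = np.array([4.3, 4.5, 4.7, 4.9, 5.1, 3.1, 3.3, 3.5, 3.7, 3.9])

# Design matrix: intercept, group dummy, uncentred movement score
X = np.column_stack([np.ones_like(moves), group, moves])
coef, *_ = np.linalg.lstsq(X, crying, rcond=None)
intercept, b_group, b_moves = coef
```

With the group dummy in the model, the movement coefficient reflects the within-group relation (0.2 here) even though the pooled correlation between movement and crying is negative.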

Centring so far

1. Centring is magic
2. Different types of centring, depending on the number used to re-scale the data
   - Grand mean: Pearson correlations
   - Group means: infant limb movements

Regression Interactions: centring is great for interpreting interactions, which are

- trickier than for ANOVAs
- without pre-defined levels or groups
- based on 2+ continuous variables

Multiple Regression - the Basics

The basic equation: Y = a + b1*X1 + b2*X2 + b3*X3 + e

Outcome = Intercept + b1 * predictor1 + b2 * predictor2 + b3 * predictor3 + Error

- a = expected mean response of Y when every predictor equals 0
- betas (b1, b2, b3): for every 1-unit change in X you get a beta-sized change in Y, holding the other predictors constant
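A minimal sketch of fitting the basic equation by ordinary least squares follows. The predictor values are hypothetical and the outcome is built without an error term, so the fit simply recovers the coefficients that were put in.

```python
import numpy as np

# Hypothetical predictors; the outcome is built without error so the
# coefficients are recovered exactly: y = 1 + 2*x1 - 1*x2 + 0.5*x3
x1 = np.array([1, 2, 3, 4, 5, 6], dtype=float)
x2 = np.array([2, 1, 4, 3, 6, 5], dtype=float)
x3 = np.array([1, 0, 0, 1, 1, 0], dtype=float)
y = 1 + 2 * x1 - 1 * x2 + 0.5 * x3

# Column of ones carries the intercept a
X = np.column_stack([np.ones_like(x1), x1, x2, x3])
(a, b1, b2, b3), *_ = np.linalg.lstsq(X, y, rcond=None)
```

Here `a` is the predicted outcome when all three predictors are 0, and `b1` is the change in the predicted outcome for a 1-unit change in `x1` with `x2` and `x3` held fixed.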

Regression Interactions: Centring Reduces Multicollinearity

- Interaction predictor = x1 * x2
- x1 and x2 numbers near 0 stay near 0, and high x1 and x2 numbers get really high
- So the interaction term is highly correlated with the original x1 and x2 variables
- Centring gives each predictor, x1 and x2, more moderate numbers above and below zero (positive and negative numbers)
- This reduces the multiplicative exaggeration between x1 and x2 and the interaction product x1*x2
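The points above can be illustrated with a small hypothetical design in which every x1 value is paired with every x2 value. This crossed grid is an assumption made for the sketch; it keeps x1 and x2 uncorrelated so that any correlation with the product term comes from the scaling alone.

```python
import numpy as np

# Hypothetical, fully crossed design: every x1 value (1..10) paired with
# every x2 value (1..10), so x1 and x2 are uncorrelated and all positive.
g1, g2 = np.meshgrid(np.arange(1.0, 11.0), np.arange(1.0, 11.0))
x1, x2 = g1.ravel(), g2.ravel()

# Uncentred: the product term tracks x1 closely
r_raw = np.corrcoef(x1, x1 * x2)[0, 1]

# Centred: subtract each predictor's mean before forming the product
x1c, x2c = x1 - x1.mean(), x2 - x2.mean()
r_centred = np.corrcoef(x1c, x1c * x2c)[0, 1]
```

In this symmetric grid the centred correlation drops to essentially zero. With real, less symmetric data the drop is usually smaller, so some correlation with the product term can remain after centring.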

Centring to reduce Multicollinearity

Regression: Y = a + b1*X1 + b2*X2 + b3*X1*X2 + e

How does X2 relate to Y at different levels of X1? For example, how does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?

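Uncentred data: X1 = 26.2 (14.5), X2 = 24.8 (27.6)

| | x1 | x2 | x12 | y |
|---|---|---|---|---|
| x1 | -- | 0.58** | 0.65** | 0.14** |
| x2 | | -- | 0.96** | 0.28** |
| x12 | | | -- | 0.34** |

Centred data: X1 = 0.0 (14.5), X2 = 0.0 (27.6)

| | x1c | x2c | x12c | y |
|---|---|---|---|---|
| x1c | -- | 0.58** | 0.11 | 0.14* |
| x2c | | -- | 0.66** | 0.28** |
| x12c | | | -- | 0.34** |

Correlation matrix: ** p = .01, * p = .05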

Regression Equation Results, No Interaction: Y = b0 + b1 * X1 + b2 * X2

- Uncentred: Y = 1164.8 – 4 X1 + 20 X2 **
- Centred: Y = 1550.8 – 4 X1 + 20 X2 **

Regression Equation Results, Interaction Term Included: Y = b0 + b1 * X1 + b2 * X2 + b3 * X1*X2

- Uncentred: Y = 1733 – 19.1 X1 – 31.7 X2 ** + 1.26 X1*X2
- Centred: Y = 1260 + 12.0 X1 + 1.1 X2 + 1.26 X1*X2
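Why the interaction coefficient is identical in both equations while the other terms change can be seen by substituting the centred scores into the uncentred model. The symbols $X_{1c}$, $X_{2c}$ (centred predictors) and $M_1$, $M_2$ (predictor means) are notation introduced here for the derivation, with $X_1 = X_{1c} + M_1$ and $X_2 = X_{2c} + M_2$:

$$
\begin{aligned}
Y &= b_0 + b_1 (X_{1c} + M_1) + b_2 (X_{2c} + M_2) + b_3 (X_{1c} + M_1)(X_{2c} + M_2) \\
  &= (b_0 + b_1 M_1 + b_2 M_2 + b_3 M_1 M_2) + (b_1 + b_3 M_2)\, X_{1c} + (b_2 + b_3 M_1)\, X_{2c} + b_3\, X_{1c} X_{2c}
\end{aligned}
$$

Taking the reported means $M_1 = 26.2$ and $M_2 = 24.8$, the uncentred coefficients give $b_1 + b_3 M_2 = -19.1 + 1.26 \times 24.8 \approx 12.1$, $b_2 + b_3 M_1 = -31.7 + 1.26 \times 26.2 \approx 1.3$, and an intercept of about 1265. These are close to the reported centred values of 12.0, 1.1, and 1260; the small gaps are presumably due to rounding of the reported coefficients.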

But what does it mean… How does X2 relate to Y at different levels of X1? How does predictor 2 (shyness) relate to the outcome (social interactions) at different stress levels (X1)?

Post Hocs

- Y = b0 + b1 * X1 + b2 * X2 + b3 * X1*X2
- Rearranged: Y = ( b0 + b1 * X1 ) + ( b2 + b3 * X1 ) * X2
- 1 SD below the X1 mean: X – (–14.547663) = X + 14.547663
- 1 SD above the X1 mean: X – 14.547663
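The re-centring step can be sketched as follows: shift the centred X1 so that its zero point sits 1 SD below or above the mean, refit, and read off the X2 coefficient. The data are hypothetical and error-free, and `fit_interaction` is a helper name made up for this example.

```python
import numpy as np

def fit_interaction(x1, x2, y):
    """OLS fit of y = b0 + b1*x1 + b2*x2 + b3*x1*x2; returns (b0, b1, b2, b3)."""
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Hypothetical error-free data on a crossed grid of centred scores
g1, g2 = np.meshgrid(np.arange(-3.0, 4.0), np.arange(-3.0, 4.0))
x1c, x2c = g1.ravel(), g2.ravel()
y = 2 + 0.5 * x1c - 1.0 * x2c + 0.3 * x1c * x2c

sd1 = x1c.std(ddof=1)

at_mean = fit_interaction(x1c, x2c, y)
low = fit_interaction(x1c - (-sd1), x2c, y)   # zero point moved to 1 SD below the mean
high = fit_interaction(x1c - sd1, x2c, y)     # zero point moved to 1 SD above the mean

# The X2 coefficient (index 2) is the slope of X2 where the shifted X1 equals 0
slope_low, slope_mean, slope_high = low[2], at_mean[2], high[2]
```

The interaction coefficient (index 3) is the same in all three fits; only the point at which the X2 slope is evaluated moves.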

Scatterplots: Moving the Y Axis

- -1 SD below X1 mean: Y = 1085 – 19.1 X1 – 17.1 X2 + 1.26 X1*X2; t (1,196) = -1.40, p = .16
- Centred: Y = 1260 + 12.0 X1 + 1.1 X2 + 1.26 X1*X2; t (1,196) = 0.12, p = .88
- +1 SD above X1 mean: Y = 1435 – 19.1 X1 + 19.4 X2 ** + 1.26 X1*X2; t (1,196) = 3.66, p = .001
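The X2 slopes above can be roughly reproduced from the uncentred equation, since the slope of X2 at a given X1 value is b2 + b3 * X1. The sketch below uses the rounded coefficients and the X1 mean and SD reported on the earlier slides, so small discrepancies from the printed slopes (-17.1, 1.1, 19.4) are to be expected.

```python
# Coefficients and descriptives as reported on the slides (rounded values)
b2, b3 = -31.7, 1.26          # uncentred X2 slope and interaction coefficient
mean_x1, sd_x1 = 26.2, 14.5   # X1 mean and SD

def simple_slope(x1_value):
    """Slope of X2 on Y when X1 is held at x1_value."""
    return b2 + b3 * x1_value

slope_low = simple_slope(mean_x1 - sd_x1)    # about -16.96
slope_mean = simple_slope(mean_x1)           # about 1.31
slope_high = simple_slope(mean_x1 + sd_x1)   # about 19.58
```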

Regression Interaction Example: predicting inhibitory ability with motor activity and age

- Simon Says-like games
- 4- to 6-yr-olds and physical movement
- Move by Age interaction: F (1, 81) = 5.9, p < .02
- Young (-1.5 SD): move beta sig., + Inhibition
- Middle (Mean): move beta p = .10, ~ Inhibition
- Older (+1.5 SD): move beta n.s., Inhibition

Polynomials, Centring, & Spline Functions

Polynomial relations: quadratic, cubic, etc.

Y = a + b1*X1 - b2*X1*X1 + e
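A quadratic model of this form can be sketched as below, again with hypothetical error-free data. The first part shows that centring matters for polynomial terms too: a raw predictor and its square are almost perfectly correlated, while the centred versions are not. Note that the code fits `+ b_quad * xc**2`, so the inverted-U shape appears as a negative `b_quad` rather than as an explicit minus sign.

```python
import numpy as np

x = np.arange(1.0, 11.0)                     # hypothetical predictor scores 1..10
r_raw = np.corrcoef(x, x ** 2)[0, 1]         # raw X vs X squared: close to 1

xc = x - x.mean()                            # centred predictor
r_centred = np.corrcoef(xc, xc ** 2)[0, 1]   # essentially 0 for these symmetric scores

# Inverted-U outcome built without error: y = 10 + 0.5*xc - 0.4*xc^2
y = 10 + 0.5 * xc - 0.4 * xc ** 2
X = np.column_stack([np.ones_like(xc), xc, xc ** 2])
(a, b_lin, b_quad), *_ = np.linalg.lstsq(X, y, rcond=None)
```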

Curvilinear Pattern: the quadratic term (X²) assumes a symmetric pattern. But it may not be symmetric: Perceived Control (Y) slowly increases and then declines rapidly in old age.

This Brings us to Spline Functions

- Split up predictor X into 2+ variables: XLow and XHigh
- XLow = X – (-5), and set values at the next change point to zero
- Ditto for XHigh
- Re-run: Y = a + b1*XLow - b2*XHigh + e
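One common way to code the two spline variables is sketched below. The exact coding used in the talk is not fully recoverable from the slide, so this version is an assumption: each variable measures distance from a single change point (knot) on its own side and is zero on the other side. The data and the knot location are hypothetical.

```python
import numpy as np

# Hypothetical centred predictor with an assumed change point (knot) at 0
x = np.arange(-10.0, 11.0)
knot = 0.0

x_low = np.minimum(x - knot, 0.0)    # distance below the knot; 0 above it
x_high = np.maximum(x - knot, 0.0)   # distance above the knot; 0 below it

# Error-free outcome: rises slowly (slope 0.2) up to the knot,
# then declines quickly (slope -1.0) after it
y = 50 + 0.2 * x_low - 1.0 * x_high

X = np.column_stack([np.ones_like(x), x_low, x_high])
(a, b_low, b_high), *_ = np.linalg.lstsq(X, y, rcond=None)
```

Each coefficient is a separate slope for its own segment, so the two sides of the curve no longer have to mirror each other as they do with a single squared term.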

Perks of Spline Functions

- Estimate the slope anywhere along the range
- A slope can be significant on one part and n.s. on another
- Slopes can be steeper or shallower

Multiple Regression Techniques

1. Centring: removing group-difference confounds
2. Centring: interpreting continuous interactions
3. Spline functions: more precise understanding of polynomial patterns

Questions: alpha control procedures for spline functions

- Could it be argued that you are describing a pattern already identified?
- Conservatively, you could apply an alpha control procedure. I like the False Discovery Rate procedures.
- Replication is preferred, but not always possible.
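The slide does not say which False Discovery Rate procedure is meant. As one widely used example, the Benjamini-Hochberg step-up procedure could be sketched like this; the function name and the p-values are made up for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test is declared significant."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Every test ranked at or below k_max is declared significant
    significant = [False] * m
    for idx in order[:k_max]:
        significant[idx] = True
    return significant

# Hypothetical p-values from ten slope tests
p_vals = [0.041, 0.001, 0.212, 0.008, 0.039, 0.060, 0.074, 0.205, 0.042, 0.216]
flags = benjamini_hochberg(p_vals)
```

With these ten values only the tests with p = .001 and p = .008 survive, even though five of the raw p-values are below .05.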

Alpha Control Aside

The source of Type 1 errors is typically poorly described.

- Typical: if enough probability tests are run, the probability will increase to the point where something becomes significant just by chance.
- But probability is linked to the representativeness of your data, and Type 1 error is a proxy for the likelihood of the representativeness of your data.
- My view: the real source of Type 1 errors is that if you divide up the data into enough subgroupings, eventually one of those subgroupings will differ because it is misrepresentative of reality.

Standardized vs Centred

- Centred: x – xM
- Standardized: (x – xM) / SDx
  - Makes variability for each predictor = 1
  - Standardized beta = raw b * SDx / SDy
  - Similar to centring, but the different metric needs to be adjusted for interaction terms
- To get comparable results with an interaction term, standardization should be applied to X1 and X2 prior to the X1*X2 estimate; then use the “raw” coefficients
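The conversion between raw and standardized coefficients can be checked numerically. The sketch below uses hypothetical scores and two made-up helpers, `ols` and `z`, and covers a model without an interaction term.

```python
import numpy as np

def ols(x_cols, y):
    """OLS with an intercept; returns the slope coefficients only."""
    X = np.column_stack([np.ones(len(y))] + list(x_cols))
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

def z(v):
    """Standardize: (x - mean) / SD."""
    return (v - v.mean()) / v.std(ddof=1)

# Hypothetical scores
x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
x2 = np.array([2, 1, 4, 3, 6, 5, 8, 7], dtype=float)
y = np.array([1.0, 2.1, 2.9, 4.2, 5.1, 5.8, 7.2, 7.9])

raw_b = ols([x1, x2], y)                 # raw coefficients
std_beta = ols([z(x1), z(x2)], z(y))     # coefficients from standardized variables

# Standardized beta = raw b * SDx / SDy
converted = raw_b * np.array([x1.std(ddof=1), x2.std(ddof=1)]) / y.std(ddof=1)
```

For a model with an interaction, the product of the standardized predictors is generally not the same variable as the standardized product, which is why the slide recommends standardizing X1 and X2 first and then forming X1*X2.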

Centring and Spline Functions: relatively simple procedures. Old dogs in the statistics world, but new tricks for many. That’s All Folks!