1. Refresher on the general linear model, interactions, and contrasts
UCL Linguistics workshop on mixed-effects modelling in R, 18-20 May 2016



Code at:

The general linear model: Ŷ = b₀ + b₁x₁ + b₂x₂ + b₃x₃ + …
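In R, this model is fit with lm(). A minimal sketch, assuming a hypothetical data frame my_data with outcome y and predictors x1, x2, x3:

```r
# General linear model: y predicted from a weighted sum of predictors
m <- lm(y ~ x1 + x2 + x3, data = my_data)
summary(m)  # (Intercept) is b0; the remaining rows are b1, b2, b3
```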


An example with one continuous predictor
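The worked examples in this deck appear to use R's built-in ChickWeight data (later slides refer to Time and Diets 1-4); assuming so, the one-continuous-predictor model might look like:

```r
# One continuous predictor: body weight as a function of Time
m_cont <- lm(weight ~ Time, data = ChickWeight)
coef(m_cont)  # (Intercept) = predicted weight at Time 0; Time = common slope
```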

An example with one categorical predictor
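A sketch with a two-level categorical predictor, restricting the (assumed) ChickWeight data to two diets:

```r
# One two-level categorical predictor (default dummy coding)
d12 <- droplevels(subset(ChickWeight, Diet %in% c("1", "2")))
m_cat <- lm(weight ~ Diet, data = d12)
coef(m_cat)  # (Intercept) = Diet 1 mean; Diet2 = Diet 2 minus Diet 1
```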


An example with one polytomous categorical predictor
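With all four diet levels, R creates one dummy coefficient per non-baseline level:

```r
# One polytomous (4-level) categorical predictor
m_poly <- lm(weight ~ Diet, data = ChickWeight)
coef(m_poly)  # Diet2, Diet3, Diet4: each level's difference from baseline Diet 1
```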

An example with their interaction
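The interaction model crosses the continuous and the categorical predictor:

```r
# Continuous-by-categorical interaction
m_int <- lm(weight ~ Time * Diet, data = ChickWeight)
summary(m_int)  # Time:Diet2 etc. test whether each diet's slope differs from Diet 1's
```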

That interaction again, with the continuous IV centered
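Centering the continuous IV just means subtracting its mean before fitting; one way, again assuming the ChickWeight variables:

```r
# Mean-center Time, then refit the interaction model
ChickWeight$cTime <- as.numeric(scale(ChickWeight$Time, scale = FALSE))
m_ctr <- lm(weight ~ cTime * Diet, data = ChickWeight)
# Diet coefficients now reflect differences at the mean Time, not at Time = 0
```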

Interpreting interactions: crossed interaction

Time is the slope of the regression line of Time for Diet1. Time:Diet2 is how much the slope of the regression line of Time for Diet2 differs from the slope for Diet1 (adding the two coefficients gives, e.g., a slope of 8.61), etc. Interpretation: for any given condition's regression line, test whether it's different from the baseline condition's regression line.

Interpreting interactions: nested interaction

Interactions in a nested model: now each condition has its own regression line in the model summary. Interpretation: for any given condition's regression line, test whether it's significantly different from zero.
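One common way to specify such a nested model in R (a sketch, again assuming the ChickWeight variables):

```r
# Nested specification: each Diet gets its own Time slope, each tested
# against zero rather than against Diet 1's slope.
# weight ~ Diet/Time expands to the same model.
m_nest <- lm(weight ~ Diet + Diet:Time, data = ChickWeight)
summary(m_nest)
```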


Now let's recreate these interaction tests with two categorical variables (rather than a categorical and a continuous one), using real psycholinguistic data.

Main effects and dummy coding?

Deviation coding

Simple example of dummy coding. Intercept: 5; b_condition: -2. Ŷ = 5 - 2*x_condition. x_condition can be either 0 or 1, so Ŷ will be either 5 - 2*0 = 5 or 5 - 2*1 = 3.

Simple example of deviation coding. Intercept: 4; b_condition: -2. Ŷ = 4 - 2*x_condition. x_condition can be either -0.5 or 0.5, so Ŷ will be either 4 - 2*0.5 = 3 or 4 - 2*(-0.5) = 5.
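In R, the coding scheme lives in the factor's contrasts attribute; a minimal sketch with a hypothetical two-level factor:

```r
# Hypothetical two-level factor
f <- factor(c("A", "A", "B", "B"))
contrasts(f)                  # default dummy (treatment) coding: A = 0, B = 1
contrasts(f) <- c(-0.5, 0.5)  # switch to deviation coding
contrasts(f)
```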

Dummy coding in an interaction. Intercept: 5; b_AvsB: -2; b_1vs2: -2. Since x_AvsB can have values of 0 or 1, and so can x_1vs2, the effect of b_AvsB is the effect when x_1vs2 == 0, i.e., only the dark bars.

Sum coding in an interaction. Intercept: 4; b_AvsB: 0; b_1vs2: 0. Since x_AvsB and x_1vs2 have values of -0.5 or 0.5, the effect of b_AvsB is the effect when x_1vs2 == 0, i.e., at the 'mean' level of '1vs.2'.

Take-home message on interaction coding: if you care about simple effects at each level of some factor, use dummy coding. If you want to be able to look at your results like an ANOVA table (i.e., if you care about main effects and "main interactions"), use deviation coding.

Deviation coding vs. sum coding. Although these terms are often used interchangeably, technically deviation coding usually means (in R, at least) coding contrasts as -.5 and .5, whereas sum coding means coding them as -1 and 1. What's the difference? (-.5, .5) deviation coding is useful for binomial (2-level) factors; (-1, 1) sum coding can be useful for polytomous (>2-level) factors. Same p-values, just different interpretation of the coefficient estimates.
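A quick way to see the two schemes side by side is to inspect R's built-in contrast matrices:

```r
contr.sum(2)      # sum coding for a 2-level factor: levels coded +1 / -1
contr.sum(2) / 2  # deviation coding: the same contrasts scaled to +0.5 / -0.5
contr.sum(4)      # sum coding for a polytomous (4-level) factor
```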

A binomial factor, deviation coding: deviation coding tells us how much the two levels differ from each other.

A binomial factor, sum coding: sum coding tells us how much the non-reference level differs from the grand mean.

A polytomous factor, deviation coding: deviation coding tells us 2× how much a given level differs from the grand mean.

A polytomous factor, sum coding: sum coding tells us how much a given level differs from the grand mean.

Custom contrast coding. What if we want to test hypotheses on custom combinations of conditions, e.g., does Diet 3 have higher weight than the average of Diets 2 and 4?

Custom contrast coding. Set up a series of codes which:
- sum to 0
- are all 0 for the conditions you don't care about
- sum, respectively, to 1 and -1 over the two sets of conditions you want to compare

                  Diet 1   Diet 2   Diet 3   Diet 4
3 vs avg(2,4)        0      -1/2       1      -1/2
1 vs avg(2,3,4)      1      -1/3     -1/3     -1/3
4 vs …

If non-orthogonal (i.e., if the sum of the products of these columns is not 0), run with glht {multcomp}. If orthogonal, you can use them directly in the model, but the interpretation of the estimates is complicated (although the p-values are reliable).
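A sketch of the glht route for the contrasts above, assuming the ChickWeight model; mcp() takes a matrix whose columns follow the order of the factor's levels:

```r
library(multcomp)
m <- lm(weight ~ Diet, data = ChickWeight)
# Each row is a custom contrast over the four Diet levels
K <- rbind("3 vs avg(2,4)"   = c(0, -1/2, 1, -1/2),
           "1 vs avg(2,3,4)" = c(1, -1/3, -1/3, -1/3))
summary(glht(m, linfct = mcp(Diet = K)))
```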

Shortcomings of custom contrast coding. Putting custom contrasts directly in the original model requires orthogonal contrasts (I think?), and even then the coefficients are confusing. Testing with glht requires you to put in multiple contrasts even if you only care about one (this seems to be new); this also means the p-values are multiple-comparison-corrected even if you don't want that. Fortunately, the b, SE, and t values are reliable, so you can still use them to re-calculate raw p-values.

Directly comparing two coefficients. Recall our nested interaction example: the intercept for Diet2 is non-significant but the intercept for Diet3 is significant. But "the difference between significant and non-significant is not itself significant" (Gelman). How do we directly compare these?

Directly comparing two coefficients:
1. Re-code the model, with Diet2 or Diet3 (rather than Diet1) as the baseline
2. glht
3. Hack a t-test from the variance-covariance matrix
4. Model comparison (this afternoon!)
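Option 1 is just a matter of changing the factor's reference level and refitting; a sketch, assuming the ChickWeight variables:

```r
# Make Diet 2 the baseline, then refit: Diet3's coefficient is now
# tested directly against Diet 2 rather than against Diet 1
ChickWeight$Diet <- relevel(ChickWeight$Diet, ref = "2")
m_re <- lm(weight ~ Time * Diet, data = ChickWeight)
```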

Directly comparing coefficients (by manually hacking a t-test):

```r
varcov <- vcov(model)  # variance-covariance matrix of the fixed effects
# Wald t for the difference between the 9th and 10th fixed effects:
t <- (fixef(model)[9] - fixef(model)[10]) /
  sqrt(varcov[9, 9] + varcov[10, 10] - 2 * varcov[9, 10])
```

Coding more complex interactions. Technically, an interaction is coded as, e.g., A:B. A*B is shorthand for A + B + A:B (i.e., both main effects and their interaction); R expands this "under the hood". Sometimes you don't care about, or don't even want, all the effects; then there are some tricks you can use to customize your interaction.

Interaction coding tricks:
- A*B - B, or A/B (expands to A + A:B): tests the main effect of A and the A:B interaction, but no main effect of B.
- (A+B+C)^2 (expands to A + B + C + A:B + A:C + B:C): tests all the main effects and two-way interactions, but not the three-way. In general, (A+B+C+D+…)^n tests all the possible interactions of nth order and below, but no higher-order interactions; therefore, (A+B+C)^3 is the same thing as A*B*C. Useful when we do model comparison (this afternoon).
- (A+B+C+D)*E (expands to A + B + C + D + E + A:E + B:E + C:E + D:E): tests all main effects and all interactions with E, but doesn't allow A, B, C, or D to interact with one another. Useful when you care about a few crucial interactions but have no predictions about the others and want to avoid very complex interactions.
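You can check how R expands any of these without fitting anything, by inspecting the formula's terms:

```r
# How R expands each shorthand formula (no data needed)
attr(terms(y ~ A*B - B), "term.labels")           # "A" "A:B"
attr(terms(y ~ (A + B + C)^2), "term.labels")     # mains + all two-way interactions
attr(terms(y ~ (A + B + C + D)*E), "term.labels") # mains + interactions with E only
```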