CO3 Modeling Workshop Michael Young.


Today’s workshop
A brief intro to general linear modeling: key concepts to understand; limitations.
Multilevel linear modeling: terminology and process.
Generalized linear modeling for choice data.
Bringing it all together: multilevel generalized linear modeling (repeated measures analysis of choice data).

Why Multilevel Modeling for Repeated Measures?
Repeated measures ANOVA approaches are restricted to categorical predictors. MLM elegantly handles unbalanced data and even missing (at random) data through maximum likelihood estimation. MLM allows a more refined specification of the dependencies in your data (variance/covariance assumptions). MLM is more intuitive!

Murray and Bevins (2009). Note: 6 sessions of 10 trials each. There was a main effect of Drug, F(1, 15) = , p < .001, a main effect of Trial, F(59, 885) = 8.34, p < .001, and a Drug × Trial interaction, F(59, 885) = 6.19, p < .001, MSE = . Elevation scores were higher on nicotine than saline for Trials 7, 13–20, and 22–60, LSDmmd = 2.11.

Now the discussion is about intercept differences, slope differences, and (for the alternative analysis on top) quadratic curvature differences. Plus, trial (within session) and session can be considered separately, and the form of the relationship between trial and the DV (linear vs. quadratic) can be tested. And the model has 7 df (linear model) or 11 df (quadratic model) instead of the original 119 df.

Some prerequisites…

General linear modeling
ANOVA and regression are both part of the GLM approach. ANOVA is a special case of regression in which categorical values are turned into numeric variables. Too often, experimental psychologists overemphasize ANOVA approaches, which leads them to turn continuous predictors into categorical ones just to fit the ANOVA model (Young, 2016).

GLM Mindset
It’s just a bunch of predictors, transformations of those predictors, and interactions among those predictors, all used to predict the outcome variable. The traditional mindset is to think of each analysis as a different tool (regression, ANOVA, and ANCOVA), and you learn to pick the right tool on the stats package menu. In GLM you just fit a model!

Example from JMP’s Fit Model
Predicting Log(RT): model comparison. Because there were 105 trials, an ANOVA based on trial isn’t tractable. Furthermore, ANOVA is ignorant of the order of the levels, and identifying trends requires a separate trend analysis. So people will often block the trials to make the data fit an ANOVA and then visually examine trends.

Behind the scenes in a GLM
Categorical variables must be coded to create numeric predictors. This can be done manually (e.g., dummy or effect coding) or automatically by your software. BTW, an ANOVA uses “effect coding”.

Dummy coding
Technically, dummy coding is 0/1 coding. I recommend effect coding of categorical variables for most purposes.

Simplest nominal variable: Two levels
Dummy coding is 1/0, e.g., 0 for male, 1 for female. This is also called “treatment coding”: one of the levels is specified as the baseline (0) and the other is compared to it, so the beta for the category is the size of the difference. Effect coding is +1/-1. This is also called “sum to zero coding” (for obvious reasons). This is what JMP uses to do regressions/GLM with nominal/categorical predictors, and ANOVA in all packages is based on this form of coding. Note that this is like centering each of the code columns.

Dummy coding: PredY = b0 + b1 * PseudoCat, where PseudoCat = 1 for Pseudocategorization and = 0 for Categorization. The Categorization predicted mean is b0 and the Pseudocategorization predicted mean is b0 + b1. Effect coding: PredY = b0 + b1 * Cond[Cat], where Cond[Cat] = 1 for Categorization and = -1 for Pseudocategorization. The Categorization predicted mean is b0 + b1 and the Pseudocategorization predicted mean is b0 - b1. (b0 and b1 stand for the fitted intercept and slope under each coding; they take different values in the two cases.) Here, the coding makes no difference to the conclusions because there are no interactions and thus no possibility of non-essential collinearity between main effects and interactions.

With more than two levels
Need g – 1 predictors to code a variable with g levels

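Dummy or Treatment Coding
One level serves as the baseline and is coded 0 on every predictor. In Sum to Zero or Effect Coding, one level is instead coded -1 on every predictor. What is the baseline? The unweighted mean across all conditions.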
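The two coding schemes for a variable with more than two levels can be inspected directly in R. The following is a minimal sketch using base R's contrast functions; the three level names are hypothetical and chosen only for illustration.

```r
# Coding schemes for a categorical predictor with g = 3 levels.
# The level names are hypothetical, chosen only for illustration.
lv <- c("A", "B", "C")

dummy  <- contr.treatment(lv)  # baseline = first level, coded 0 on every column
effect <- contr.sum(lv)        # last level coded -1 on every column

print(dummy)
print(effect)

# Both schemes need g - 1 columns; only the effect-coded columns sum to zero.
ncol(dummy)
colSums(effect)
```

Which level plays the baseline (or the -1) role is a software default; in R it can be changed, for example through the `base` argument of `contr.treatment()`.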

Dummy or Treatment Coding
Copyright © 2012 Pearson Education, Inc. All rights reserved.

Zero Sum Coding aka Effect Coding
0 - 1- 2- 3

Dummy vs. Effect Coding
The coding drastically changes the interpretation of the intercept and of any main effects or interactions (other than the highest level interaction). The coding DOES NOT affect the overall model fit (R2), model predictions, RMSE, etc. The interpretation of a predictor’s effect assumes that every other variable is set to a value of ‘0’. In ANOVA, this means we’re examining marginal means.
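A quick way to convince yourself of this is to fit the same model under both codings. The sketch below uses the `sleep` data that ships with base R (extra hours of sleep under two drugs) purely as a convenient two-group example; it deliberately ignores the repeated-measures structure of those data because only the coding is being illustrated.

```r
# 'sleep' ships with base R: extra hours of sleep under two drug treatments.
d <- sleep

d_dummy <- d
contrasts(d_dummy$group) <- contr.treatment(2)  # 0/1 coding
d_effect <- d
contrasts(d_effect$group) <- contr.sum(2)       # +1/-1 coding

m_dummy  <- lm(extra ~ group, data = d_dummy)
m_effect <- lm(extra ~ group, data = d_effect)

coef(m_dummy)   # intercept = mean of group 1; slope = group 2 minus group 1
coef(m_effect)  # intercept = unweighted mean of the two group means

# The fit itself is identical: same predictions and same R-squared
all.equal(fitted(m_dummy), fitted(m_effect))
c(summary(m_dummy)$r.squared, summary(m_effect)$r.squared)
```

Only the meaning of the coefficients changes, which is why the choice matters most once interactions enter the model.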

Behind the scenes of the earlier example
Asides: why are there fractional dfs? What are VIFs? Why does the interaction show “(trial – 53)”?

GLM: it’s all one model
This means that the same principles apply across models. Assumptions: normality of residuals, homogeneity of variance, and independence of measurements (don’t believe claims of robustness!). Be careful with collinearity (yes, categorical predictors can be collinear). Linearity is really only relevant for continuous predictors, because dummy/effect-coded variables are only two points along the function. Calling something a “moderator” or a “covariate” is a convenience; it’s a label. In a moderation, the analysis treats both variables in an interaction the same. A covariate is just a continuous predictor added to a bunch of categorical ones.

A complete example (with a multilevel model)
Comment on nesting. Demonstrate Tukey’s HSD, planned contrasts,

A complete example, cont.
JMP: open S-D Mixed. I’ll use the equivalent of a repeated measures ANOVA, but through multilevel modeling. Data are aggregated across choices to create a percentage outcome variable. R (using RStudio as the interface): library(lme4) and library(lsmeans). Libraries have to be installed once; after that they can be loaded freely. FYI: the lstrends command can be used for simple slopes.

An aside: Model comparison
The field is slowly moving toward model comparison rather than “pro forma” complete models. Examples: a model with an interaction vs. one without; not testing all possible interactions in complex designs (higher order ones often don’t replicate); a model with “trials” as predictor vs. one with “log(trials)”. R2 is the classic fit measure, but it doesn’t work well for generalized linear models and multilevel models. Thus AIC, BIC, and DIC are used; lower values are better. They are based on the negative log-likelihood plus a complexity penalty.

AIC and BIC Complications
Both are sensitive to the number of observations, so the two models being compared must be based on the same data. Both change if the DV scale changes, so they can’t be used to compare models with and without a transformed DV (including when the change isn’t obvious, as with link functions). The scale is meaningless in isolation; a value is only useful when compared to another value.
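To make the "negative log-likelihood plus complexity penalty" idea concrete, here is a small sketch in base R. The data are simulated for illustration only (they are not from the workshop), and `aic_by_hand` is a hypothetical helper name.

```r
set.seed(1)                       # simulated data, for illustration only
trial <- 1:60
y <- 2 + 1.5 * log(trial) + rnorm(60, sd = 0.5)

m_lin <- lm(y ~ trial)            # "trials" as predictor
m_log <- lm(y ~ log(trial))       # "log(trials)" as predictor

# AIC = -2 * log-likelihood + 2 * (number of estimated parameters)
aic_by_hand <- function(m) {
  ll <- logLik(m)
  -2 * as.numeric(ll) + 2 * attr(ll, "df")
}

c(linear = AIC(m_lin), log = AIC(m_log))   # lower is better
aic_by_hand(m_log) - AIC(m_log)            # zero: same quantity
BIC(m_lin) - BIC(m_log)                    # only the difference is interpreted
```

Both models use the same rows and the same untransformed DV, which is what makes the two AIC values comparable.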

The difference matters: Raftery (1995)
Another useful paper for interpretation of AIC/BIC values: Wagenmakers & Farrell (2004) PB&R

Modeling data dependencies using multilevel modeling
Alternative names: mixed effects modeling, random coefficients modeling. Some classic methods are special cases of multilevel modeling: HLM (hierarchical linear modeling) and growth curve analysis. MLM can be integrated with generalized linear modeling to produce generalized linear mixed effects models (GLMM). Nonlinear MLM is also possible (R library nlme).

What is it?
Uses maximum likelihood estimation. Includes fixed and random effects (hence “mixed”). Can estimate parameters at different levels (hence “multilevel modeling”). The group parameter estimate is a function of the individual parameter estimates and vice versa.

Fixed effects and random effects for Within-subject variables
Fixed effects: variables selected for study (IVs or predictors). Random effects: random deviations from the baseline performance in a group or condition. Within-subject variable effects also can be estimated at the participant level. Random variations across subjects are due to individual differences or random sampling. What types of effects can randomly vary across participants? Within-subject slope effects and intercepts (performance when all predictors are set to zero).

Allowing the intercept/slope of predictor to vary when it is categorical
Panels: only the intercept allowed to vary across subjects; intercept and slope allowed to vary across subjects.

Allowing the intercept/slope of predictor to vary when it is continuous (4 subjects)
Panels: only the intercept allowed to vary across subjects (circularity assumption); intercept and month index slope allowed to vary across subjects; intercept, month index slope, and quadratic slope allowed to vary across subjects. Note the BIG difference in fit between the first and second models (AIC = 4794 vs. 4748) and the small difference between the second and third (AIC = 4746). Titi monkeys!

One-predictor equation for multilevel model
Population intercept, g00: a fixed effect, aka the overall intercept estimate. Random variation around that fixed effect is u0j for the jth subject. Population slope, g10: a fixed effect, aka the overall slope estimate. Random variation around that fixed effect is u1j. The index i represents the ith data point or case.

To obtain the estimate of the intercept for subject 1, you add the fixed effect estimate, g00, and the random component, u01. To obtain the estimate of the slope for subject 1, you add the fixed effect estimate, g10, and the random component, u11. The multilevel model is estimating both the fixed effects parts that we’re interested in and the random effects that are of less interest.
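Written out, the one-predictor model described above has the following standard two-level form. This is a reconstruction in conventional notation rather than the slide's own equation: γ00 and γ10 correspond to g00 and g10 in the text, while X_ij (the predictor value) and e_ij (the residual) for the ith data point of subject j are symbols assumed here.

```latex
\begin{aligned}
Y_{ij}     &= \beta_{0j} + \beta_{1j} X_{ij} + e_{ij} \\
\beta_{0j} &= \gamma_{00} + u_{0j} \\
\beta_{1j} &= \gamma_{10} + u_{1j}
\end{aligned}
```

Substituting the second and third lines into the first gives Y_ij = (γ00 + u0j) + (γ10 + u1j) X_ij + e_ij, so the intercept for subject 1 is γ00 + u01 and the slope for subject 1 is γ10 + u11.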

From JMP
Fixed Effects intercept estimate, g00. Fixed Effects slope estimate, g10. Note: if you add up the random intercept parts (u0j), the sum is .000. Ditto for the random slope parts (u1j): their sum is .000. Note: this model did not allow the quadratic slope to vary.

Fixed vs. random predictor variable
Variable having a fixed effect only: possible for a within-subject variable whose effect may not vary significantly across subjects (see the Titi model), and true of all between-subject variables, because random variation across subjects can’t be estimated for them. Variable having fixed and random effects: the fixed part is the group or average effect, and the random part is the random variation in that effect across subjects. Variable having a random effect but no fixed component: without a fixed effect, you get estimates of the slope for each individual but without them being influenced by any estimated group slope (see the later discussion on shrinkage), and there is no attempt to estimate any group or average effect. Always remember that the intercept is a variable for the purposes of understanding these distinctions.

Fixed vs. Random, cont. (Dummy-coded)
Panels: intercept as random effect; slope as random effect; both as random effects. Data: extra sleep under two drug treatments.

Fixed vs. Random, cont.

Random intercept model
Analyze letting intercepts vary but not slopes. AIC: . NOTE: Subject ids must be unique in the experiment, or you’ll have to nest your subject column within any between-group variables (Group/Subject).

Random Intercept and Slope model
Analyze letting intercepts and slopes vary. AIC: . Getting the RE structure right is critical to obtaining correct parameter estimates AND correct SEs.
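The same two models can be sketched in R with lme4. This example assumes the lme4 package is installed and uses its bundled `sleepstudy` data (reaction time across days of sleep deprivation) as a stand-in; it is not the data set shown on the slides.

```r
library(lme4)   # provides lmer() and the sleepstudy example data

# Random intercept only: subjects differ in their baseline level
m_int <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy, REML = FALSE)

# Random intercept and slope: subjects also differ in the effect of Days
m_slope <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy, REML = FALSE)

AIC(m_int, m_slope)    # lower is better
anova(m_int, m_slope)  # likelihood ratio test of the two RE structures

# The SE of the fixed effect of Days depends on the RE structure
coef(summary(m_int))["Days", ]
coef(summary(m_slope))["Days", ]
```

In this example the slope estimate barely moves between the two models, but its standard error changes noticeably, which is the practical reason the RE structure matters.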

Multilevel modeling in JMP
For all repeated measures data: first, the data need to be in long format, not wide. Second, one of the columns must specify the subject identifier (unique!), and it should be categorical.

Random intercept model in JMP
Specify the model per usual in Fit Model. Add to the model the column in your data set indicating the subject id. Select the subject term in the lower right window and use the Degree Attributes pull-down to select Random Effect. Run it!

Random intercept and slopes model
To allow for a random slope, specify the model per usual in Fit Model, add the subject column (for the intercept), and add the predictor*Subject interaction. Select both the subject term and the subject*predictor term and designate both as random effects. Run it! Repeat for any additional random slopes to test various models.

Transformations in JMP

A theoretical interlude
Variance/covariance assumptions

Capturing dependencies across repeated measures
May allow estimation of the variance-covariance structure: how much does one measure from a subject correlate with the next? Repeated measures ANOVA assumes compound symmetry, the equivalent of only allowing intercepts to vary. MANOVA assumes an unstructured matrix. Mixed effects models may allow you to fit various types of structures, from unstructured to identity (homogeneity of variance). SAS and SPSS allow this specification, as does lme in R, but lmer and JMP do not (they use the random effects model).

What is the variance/covariance structure?
Recall: a subject is measured multiple times, giving YA1, YA2, YA3, YA4 for measurements of reaction time across four levels of an IV, A. What do we assume about the variance of each of these levels? That is the variance structure. What do we assume about the correlations between pairs of these measures? That is the covariance structure.

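Variance/Covariance matrix across Trials 1 to 5: the variances s1², s2², s3², s4², s5² sit on the diagonal, and the covariances between pairs of trials fill the off-diagonal cells.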

Variance/Covariance matrix assumptions
Compound symmetry (a special case of sphericity): each s² is identical and each r is identical. This is the assumption of univariate repeated measures ANOVA. Multivariate ANOVA assumes nothing: each si² and rij is estimated, giving an unstructured variance/covariance matrix. Other matrix assumptions include autoregressive, where measures differing by one time step correlate r, those by 2 correlate r^2, and those by 3 correlate r^3; in other words, the correlation between measures decreases the farther apart they are in time. Unity: variances are equal and there is no correlation. This is the assumption of standard general linear modeling.
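These structures are easy to write down explicitly. The following base R sketch builds each matrix for four repeated measurements; the variance and correlation values are arbitrary assumptions chosen for illustration, not estimates from any data set.

```r
# Illustrative values (assumptions, not estimates from any data set)
k   <- 4      # number of repeated measurements
s2  <- 2.0    # common variance
rho <- 0.6    # correlation parameter

# Compound symmetry: equal variances, one shared correlation
cs <- s2 * (diag(1 - rho, k) + matrix(rho, k, k))

# First-order autoregressive: correlation rho^|i - j|
lag <- abs(outer(1:k, 1:k, "-"))
ar1 <- s2 * rho^lag

# Unity (identity) structure: equal variances, no correlation
id <- s2 * diag(k)

cov2cor(cs)    # every off-diagonal correlation equals rho
cov2cor(ar1)   # correlations fall off as rho, rho^2, rho^3
```

An unstructured matrix would instead have k variances and k(k - 1)/2 separate correlations to estimate, which is why it demands more data.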

Approach used in JMP
The random effects modeling approach does not directly estimate the variance/covariance matrix structure. The multivariate approach allows the specification of a specific variance/covariance matrix type; its drawback is that it requires complete data (nothing missing). The random effects approach allows us to test a series of models with specific assumptions about individual correlations: random intercept, random slope for the first within-subject variable, random slope for the second within-subject variable, and so on. Note: random intercept model = compound symmetry = univariate repeated measures ANOVA.

Reminder: Advantages of Multilevel Modeling
Uses all available data: when data are missing, estimates are simply less certain (because it uses MLE, not error minimization). It allows inclusion of continuous predictors, which is not so for repeated measures ANOVA/MANOVA. It allows testing of various variance-covariance relationships and thus is neither too conservative (MANOVA) nor too liberal (repeated measures ANOVA).

Adding a between-subject variable
Between-subject variables allow you to predict the observed variations across subjects. For example, perhaps sex, species, estrogen level, or lesion size causes observed differences in intercepts or slopes.

The “mathematics” behind the scenes
In essence, what the software is doing is something like this: compute the slope t or categorical effect F for each participant for the within-subjects variable. This produces a column of subject ts and Fs, which you can then analyze to determine whether the column of ts is greater than zero or the column of Fs is greater than one. Thus, the logic is analogous to that used in the paired t test. However, it’s a bit more complicated because (1) each subject has differential influence on the group estimate based on the uncertainty of its estimate (due to missing or variable data), and (2) the group estimate will influence the individual estimates (shrinkage).
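The two-stage logic can be sketched by hand in base R. This is a simplified illustration using per-subject slopes rather than per-subject t values, and it omits the weighting and shrinkage that the real estimation adds. The `ChickWeight` data from base R stand in for any data set with a continuous within-subject predictor (`Time`) and a subject identifier (`Chick`).

```r
# ChickWeight ships with base R; it stands in here for any repeated-measures
# data set with a continuous within-subject predictor and a subject id.
cw <- as.data.frame(ChickWeight)
by_chick <- split(cw, cw$Chick)

# Stage 1: one slope per subject
slopes <- sapply(by_chick, function(d) coef(lm(weight ~ Time, data = d))[["Time"]])

# Stage 2: is the average slope different from zero?
t.test(slopes, mu = 0)
```

Every subject counts equally in this version, even one with only two observations, which is precisely the shortcoming the multilevel model addresses.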

Factorial analyses

Full factorial within-subjects
Just do a full factorial while including your subject identifier column. Specify all terms involving the subject variable as random effects; this includes interactions. But in general I don’t recommend including all interactions, either in the fixed effects structure or in the random effects structure.

Full factorial between-within
Those terms (main effects or interactions) that include a between-subject variable cannot be random effects. Example: A(between) x B(within) x C(within).
Fixed effects (full model): A, B, C, AB, AC, BC, ABC.
Random effects (full model): Intercept, B, C, BC.

Gap and Delay are continuous within-subject variables.
Causal order is a categorical between-subject variable. The DV was causal rating.

Creating a model: Strategy
A between-subject factor can be used to explain any random variations in intercepts or slopes between participants. Remember: you cannot designate a term containing a between-subject factor as a random effect. This assumes that there are random effects of intercept or slope. Traditionally, you fit the model with only the within-subject IVs first. Then you find the best fitting model by allowing intercepts, or slopes and intercepts, to vary (based on AIC or a chi-squared test), i.e., you find the best random effect structure. Once you have the best model, you can add between-subject IVs to “explain” the random effects.

Aside: Shrinkage due to uncertainty - Missing values and/or variability
Here, shrinkage means movement of the individual estimates toward the fixed effect estimate.
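Shrinkage can be seen by comparing per-subject regression slopes with the per-subject slopes from a multilevel model. The sketch below assumes the lme4 package is installed and uses its bundled `sleepstudy` data as a stand-in example.

```r
library(lme4)

# No pooling: a separate regression per subject
ols <- sapply(split(sleepstudy, sleepstudy$Subject),
              function(d) coef(lm(Reaction ~ Days, data = d))[["Days"]])

# Partial pooling: per-subject slopes from the multilevel model
m   <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
mlm <- coef(m)$Subject[names(ols), "Days"]

fixed <- fixef(m)[["Days"]]
round(cbind(ols, mlm, fixed), 2)

# The multilevel estimates are less spread out than the per-subject OLS ones
c(sd_ols = sd(ols), sd_mlm = sd(mlm))
```

These data are balanced, so the amount of pull here mostly reflects how noisy each subject's data are; with missing values, subjects contributing fewer observations would be pulled harder.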

Complicating factors
Highly unbalanced data and visualization. Generating plots with the right error bars.

Highly unbalanced data
The general linear modeling approaches that use MLE handle unbalanced data very nicely in the analysis. BUT, figures based on raw averages can be seriously biased by unbalanced data. It’s unlikely to happen to you, but be very wary if your data are highly unbalanced; see me for tips.

Generating plots with error bars
Standard error bars: the standard errors generated without consideration of the data dependencies will include both (within-subject) replication noise and (between-subject) individual differences noise. And the SEs may reflect both intercept and slope SE. See Loftus and Masson (1994), Cousineau (2005), Morey (2008), Franz and Loftus (2012)…. AHHH! In R, there is code I will share to generate good general-use SE bars using ggplot.

For example, if you want to estimate the effect of number (its slope)

How to do basic MLM in R? Preparing the data
    library(lme4)
    myd$c.power <- myd$Power - 0.75           # center Power at 0.75
    myd$c.trial <- myd$Trial.in.Phase - 5.5   # center trial at 5.5
    myd$c.t <- myd$c.trial / 10               # rescale centered trial
    contrasts(myd$Condition) <- contr.sum(2)  # effect coding; Condition must be a two-level factor
    myrReduced <- lmer(Latency ~ (c.power + c.t | participant), data = myd)  # where’s the intercept? (implicit)
    myrFull <- lmer(Latency ~ c.power * c.t + c.power * Condition + (c.power + c.t | participant), data = myd)  # where are the MEs? (implied by *)
    anova(myrReduced, myrFull)  # LR test
    summary(myrFull)
    # If necessary, use lsmeans, lstrends for post hocs, or multcomp library for planned comparisons

Another aside Random effect structure
There is variation in how people determine the best random effect structure to capture dependencies. Recent work by Doug Bates suggests that higher-order interactions in the RE structure are unlikely to replicate and produce instabilities in the fit, especially for small N studies. So I recommend mostly sticking with intercepts and simple slopes in the RE structure.

Wait: what are the assumptions?
Like GLM, residuals are assumed to be normally distributed and variance is assumed to be homogeneous across the predictor values. BUT, what if they’re not???

Lower variability as you approach the ceiling
Predicts impossible values

Options
Attempt to transform the DV to achieve normality/homogeneity (e.g., logit for percentage data), or specify a different set of assumptions…

New assumptions: Generalized linear modeling
There’s no time to go through all the variations, so I’m going to focus on ONE, binomial outcomes, because: choice data are common in comparative psych; people have some familiarity with logistic regression (although you’ll have to brush up!); and it disaggregates percentages into individual choices, which allows us to examine P(More) as a function of continuous predictors like trial.

Generalized multilevel modeling in R
The ability to conduct the analysis depends on the software. The main challenges for people are a weak understanding of GLM (issues discussed earlier) and thinking in a log-odds-transformed space.

A Generalized Linear Model has three components
1. An error distribution from the exponential family: Gaussian, binomial, exponential, Poisson, Gamma, etc.
2. A linear predictor model, e.g., bk*xk + … + b2*x2 + b1*x1 + b0.
3. A nonlinear link function, g, so that E(Y) = g^-1(bk*xk + … + b2*x2 + b1*x1 + b0). Alternatively: g(E(Y)) = bk*xk + … + b2*x2 + b1*x1 + b0.

Link functions
Each error distribution has a set of associated link functions and one canonical link function.

The logistic function
Figure: P(response) plotted against the value of the predictor function, Σ bi*xi + b0.

Logistic (g-1) vs. Logit (g)
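The pair of functions can be written in a few lines of R. In this sketch `logit` and `logistic` are helper names defined here for illustration; `qlogis()`, `plogis()`, and the `binomial()` family object are the base R equivalents.

```r
logit    <- function(p) log(p / (1 - p))     # g: probability -> log odds
logistic <- function(x) 1 / (1 + exp(-x))    # g^-1: log odds -> probability

p <- c(0.1, 0.5, 0.9)
logit(p)             # symmetric around 0, which corresponds to p = .5
logistic(logit(p))   # recovers p

# Base R equivalents
all.equal(logit(p), qlogis(p))
all.equal(logistic(2), plogis(2))
binomial()$linkfun(p)   # the canonical link of the binomial family is the logit
```

Equal steps on the log-odds scale correspond to unequal steps in probability, which is why effects near the floor or ceiling look compressed when plotted as percentages.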

Brief intro to how logistic regression works
It is an iterative method to find the optimal b weights to maximize the likelihood for the following: P(Y = 1) = 1 / (1 + exp(-(Σ bi*xi + b0))). The b weights are interpreted in terms of their effect on the log odds of “1”: log(p/(1-p)) = logit(p).

Why use logistic regression rather than linear regression?
With linear regression, the predicted values can become greater than one and less than zero. An assumption of linear regression is that the variance of Y is constant across values of X. But, with a binomial variable, the variance is p × (1-p). When 50% of the people are 1s (p = .5), the variance is .25, its maximum value. As we move to more extreme values, the variance decreases, so as p approaches 1 or 0, the variance approaches 0.
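Both problems show up even in a tiny example. The sketch below uses a made-up vector of 20 binary outcomes (an assumption for illustration) and compares a linear fit with a logistic fit in base R.

```r
x <- 1:20
y <- c(rep(0, 8), 1, 0, rep(1, 10))   # made-up binary outcomes

m_lin <- lm(y ~ x)                      # linear regression on a 0/1 outcome
m_log <- glm(y ~ x, family = binomial)  # logistic regression

range(fitted(m_lin))   # extends below 0 and above 1
range(fitted(m_log))   # stays strictly inside (0, 1)

# Binomial variance p * (1 - p) peaks at p = .5 and shrinks toward 0 and 1
p <- c(0.05, 0.25, 0.5, 0.75, 0.95)
rbind(p, variance = p * (1 - p))
```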

How to do multilevel logistic regression in R
Same lme4 library, but we’ll use glmer (not lmer) and specify the distribution family:
    glmer(DV ~ withinA * withinB * betwC + (withinA + withinB | Subject), family = binomial, data = myd)
That’s it!!! Of course, you still have to consider dummy/effect coding, centering of continuous predictors, the RE structure, post hocs/planned comparisons, etc.

Example 1
In the first example, I didn’t have access to the trial-by-trial data (1997 data set!). But I did have a record of the number of trials of each type.

Repeated measures logistic regression
Currently this can’t be done in JMP, so I exported the data as csv and analyzed it in R. I will show it in R, but here’s the key syntax:
    myr <- glmer(ChoseDifferent ~ entropy + (entropy | Bird), family = binomial, data = myd, weights = Freq)
    myr <- glmer(ChoseDifferent ~ entropy + (entropy + 1 | Bird), family = binomial, data = myd, weights = Freq)  # equivalent to previous line
    myr <- glmer(ChoseDifferent ~ entropy + (1 | Bird), family = binomial, data = myd, weights = Freq)  # random intercept only
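The role of `weights = Freq` is easiest to see without random effects. The sketch below uses plain `glm()` and a small table of hypothetical counts (the numbers are invented for illustration, not the 1997 data) to show that a frequency-weighted fit matches the fit to the same data expanded to one row per trial. The assumption is that the `weights` argument plays the analogous role in `glmer()`.

```r
# Hypothetical aggregated counts: Freq trials of each outcome at each entropy level
agg <- data.frame(
  entropy        = rep(c(0, 1, 2, 3, 4), each = 2),
  ChoseDifferent = rep(c(0, 1), times = 5),
  Freq           = c(18, 2, 14, 6, 10, 10, 5, 15, 3, 17)
)

# One row per outcome type, weighted by its frequency
m_agg <- glm(ChoseDifferent ~ entropy, family = binomial,
             data = agg, weights = Freq)

# The same data expanded to one row per trial
long <- agg[rep(seq_len(nrow(agg)), times = agg$Freq), c("entropy", "ChoseDifferent")]
m_long <- glm(ChoseDifferent ~ entropy, family = binomial, data = long)

rbind(aggregated = coef(m_agg), trial_level = coef(m_long))
```

The coefficients agree, so a record of how many trials of each type occurred is enough when the trial-by-trial order is unavailable. What is lost is the ability to use trial-level predictors such as trial number.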

Model summary output (the numeric values did not survive in this transcript). Fit statistics reported: AIC, BIC, logLik, deviance, df.resid. Random effects: Bird (Intercept) and entropy, each with Variance and Std.Dev., plus their correlation. Number of obs: 136, groups: Bird, 4. Fixed effects (Estimate, Std. Error, z value, Pr(>|z|)): (Intercept), p on the order of e-10 (***); entropy, p < 2e-16 (***).

In R, you can get the percentage back by…
From the fixed effects table, take the (Intercept) estimate, b0, and the entropy estimate, b1. In logistic regression, you are predicting the log(odds) of an outcome: log(P(diff)/(1-P(diff))) = b0 + b1*entropy. Or, P(different) = logistic(b0 + b1*entropy). In R, you can get the percentage back by using library(boot): inv.logit(b0 + b1*0) = .11 and inv.logit(b0 + b1*4) = .87. Base R’s plogis() computes the same function.

If you want to see the individual differences…
The ranef(modelname) command outputs the random variations around the fixed effect. The coef(modelname) command adds the random effects to the fixed effect. Here, fixef(myr) returns the (Intercept) and entropy estimates, and ranef(myr) and coef(myr) each return a $Bird table with one (Intercept) value and one entropy value for each bird (25y, 33r, 3y, 51r).

Decisions about random effects
Was it necessary to allow slopes to vary across birds? I ran two models, one with (entropy|Bird) and one with (1|Bird) as REs. Recall that entropy|Bird automatically includes intercept variation. I compared the models using a likelihood ratio test:
    > anova(myr1, myr2)
    Data: myd
    Models:
    myr2: ChoseDifferent ~ entropy + (1 | Bird)
    myr1: ChoseDifferent ~ entropy + (entropy | Bird)
    Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
    myr2
    myr1 e-15 ***

Model comparison aside
Note: the LR test via the anova command is only valid when comparing nested models. You can compare AIC/BIC values when models are not nested, but no p-value is generated. But be careful that the DVs and rows for the two analyses are the same!

Visualizing the results
“xyplot” in the lattice library is the workhorse for multilevel models, but it’s limited. The ggplot2 library is worth the effort. Error bars for a repeated measures design are complicated (see earlier references).

Smoothed Plot (note data are binary)
Model Fit – with Random Effects Model Fit – Fixed Effect

Example 2
Unpublished study by Josh Beckmann and myself in which people judged whether a stimulus was long or short as a function of its duration. The stimulus either only changed visually (visual) or was accompanied by an auditory change (audio).

Raw data
Note the presence of learning: greater sensitivity in later trials.

Go to R…

Results – Pretty picture

Gamma distributions for RT?
Panels: raw means; predicted means.
    glmer(RT ~ c.trial * c.duration * Audio.Visual + c.trial * I(c.duration^2) * Audio.Visual + (c.trial + c.duration + Audio.Visual | Subject), data = joshd, family = Gamma(link = "log"))

Extension Nonlinear multilevel modeling
See Young (in press) JEAB on researchgate.net and the associated code accessible through my home lab page. This uses an example of fitting hyperbolic and hyperboloid discounting functions to discounting data.

That’s all folks!
You will need to put time into learning these techniques. You will run into things that you don’t understand and that frustrate you. You will run into collaborators, advisors, reviewers, and editors who either discourage this approach or ask for clearer explanations. Model your presentation to parallel those used for traditional approaches in your literature. You should learn and use these approaches: the ability to smoothly incorporate categorical and continuous predictors, handle imbalance (incl. missing values), use non-normal outcome variables, and model nonlinear relations is too important. Plus, you want results that replicate!
