Non-orthogonal regressors: concepts and consequences

Overview
- Problem of non-orthogonal regressors
- Concepts: orthogonality and uncorrelatedness
- SPM (1st level): covariance matrix, detrending
- How to deal with correlated regressors
- Example

Design matrix
[Figure: design matrix image; columns = regressors, rows = scan number]
Each column in your design matrix represents either 1) events of interest or 2) a measure that may confound your results. Column = regressor. The optimal linear combination of all these columns attempts to explain as much variance in your dependent variable (the BOLD signal) as possible.

= + + error 1 2 x1 x2 e Time BOLD signal Source: spm course 2010, Stephan http://www.fil.ion.ucl.ac.uk/spm/course/slides10-zurich/

The betas are estimated on a voxel-by-voxel basis. A high beta means the regressor explains much of the BOLD signal's variance (i.e. it strongly covaries with the signal).
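SPM does this estimation in MATLAB; purely as an illustration, here is a minimal least-squares sketch in NumPy with made-up regressors, betas and noise (nothing here comes from the original slides):

```python
import numpy as np

# Toy example: 100 scans, two regressors of interest plus a constant term.
n_scans = 100
rng = np.random.default_rng(0)
x1 = rng.standard_normal(n_scans)                # e.g. task regressor 1
x2 = rng.standard_normal(n_scans)                # e.g. task regressor 2
X = np.column_stack([x1, x2, np.ones(n_scans)])  # design matrix (constant last)

# Simulate one "voxel" time course: true betas [2.0, 0.5, 10.0] plus noise.
y = X @ np.array([2.0, 0.5, 10.0]) + rng.standard_normal(n_scans)

# Ordinary least squares: beta_hat minimises ||y - X*beta||^2
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)   # close to [2.0, 0.5, 10.0]
```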

Problem of non-orthogonal regressors
[Figure: Venn diagram; Y = total variance in the BOLD signal]

Orthogonal regressors
[Figure: Venn diagram; non-overlapping circles X1 and X2 within Y, the total variance in the BOLD signal]
Every regressor explains a unique part of the variance in the BOLD signal.

Orthogonal regressors
[Figure: same Venn diagram as above]
There is only one optimal linear combination of both regressors that explains as much variance as possible. The assigned betas will be as large as possible, and statistics using these betas will have optimal power.

Non-orthogonal regressors
[Figure: Venn diagram; circles X1 and X2 overlap within Y]
Regressors 1 and 2 are not orthogonal. Part of the explained variance can be accounted for by both regressors and is assigned to neither. Therefore, the betas for both regressors will be suboptimal.

Entirely non-orthogonal
[Figure: Venn diagram; regressor 1 and regressor 2 overlap completely within the total variance in the BOLD signal]
Betas cannot be estimated: the variance cannot be assigned to one regressor or the other.
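A quick numerical illustration of this degenerate case (made-up data, not SPM code): when one regressor is an exact scaled copy of another, the design matrix is rank deficient and the betas have no unique solution.

```python
import numpy as np

rng = np.random.default_rng(1)
n_scans = 100
x1 = rng.standard_normal(n_scans)
x2 = 2.0 * x1                            # perfectly collinear with x1
X = np.column_stack([x1, x2, np.ones(n_scans)])
y = 3.0 * x1 + 10.0 + rng.standard_normal(n_scans)

print(np.linalg.matrix_rank(X))          # 2, not 3: the design is rank deficient
# X'X is singular, so the betas are not uniquely defined; lstsq silently
# returns the minimum-norm solution, one of infinitely many.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```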

"It is always simpler to have orthogonal regressors and therefore designs." (SPM course 2010)

Orthogonality
Regressors can be seen as vectors in n-dimensional space, where n = number of scans. Suppose now n = 2:
r1 = [1, 2]
r2 = [2, 1]
[Figure: r1 and r2 plotted as vectors in 2D]

Orthogonality
Two vectors are orthogonal if the raw vectors have:
- inner product == 0
- angle between the vectors == 90°
- cosine of the angle == 0
For our example: inner product r1 • r2 = (1 × 2) + (2 × 1) = 4, and θ = acos(4 / (|r1| × |r2|)) = acos(4/5) ≈ 37°, so r1 and r2 are not orthogonal.
[Figure: r1 and r2 with the ≈37° angle between them]

Orthogonality
Orthogonalizing one vector with respect to another: it matters which vector you choose! (Gram-Schmidt orthogonalization)
Orthogonalize r1 wrt r2:
u1 = r1 − proj_r2(r1) = r1 − ((r1 • r2)/(r2 • r2)) · r2 = [1 2] − (4/5)·[2 1] = [−0.6 1.2]
Check: u1 • r2 = (−0.6 × 2) + (1.2 × 1) = 0
[Figure: r1, r2 and the orthogonalized vector u1]
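A small NumPy sketch of this projection step, reusing the toy vectors from the slide (the function name is just for illustration):

```python
import numpy as np

def orthogonalize(a, b):
    """Return a orthogonalized with respect to b (one Gram-Schmidt step):
    a minus its projection onto b."""
    return a - (a @ b) / (b @ b) * b

r1 = np.array([1.0, 2.0])
r2 = np.array([2.0, 1.0])

u1 = orthogonalize(r1, r2)
print(u1)          # [-0.6  1.2]
print(u1 @ r2)     # 0.0 (up to floating point): u1 is orthogonal to r2

# Order matters: orthogonalizing r2 wrt r1 gives a different vector.
print(orthogonalize(r2, r1))   # [ 1.2 -0.6]
```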

Orthogonality & uncorrelatedness
An aside on these two concepts:
Orthogonal is defined as X'Y = 0 (inner product of the two raw vectors = 0).
Uncorrelated is defined as (X − mean(X))'(Y − mean(Y)) = 0 (inner product of the two detrended vectors = 0).
Vectors can be orthogonal while being correlated, and vice versa!

Orthogonal because:
X = [1, −5, 3, −1], Y = [5, 1, 1, 3]
Inner product: 1·5 + (−5)·1 + 3·1 + (−1)·3 = 0
Please read Rodgers et al. (1984) Linearly independent, orthogonal and uncorrelated variables. The American Statistician, 38:133-134 (will be in the FAM folder as well).
(Speaker note, translated from Dutch: give the inner product of the orthogonal example, show that it is uncorrelated. Show a figure where the vectors are orthogonal, and after detrending are correlated. Show switching the vectors.)

Orthogonal, but correlated!
Detrend: mean(X) = −0.5, mean(Y) = 2.5
X_det = [1.5, −4.5, 3.5, −0.5], mean(X_det) = 0
Y_det = [2.5, −1.5, −1.5, 0.5], mean(Y_det) = 0
Inner product of the detrended vectors: 3.75 + 6.75 − 5.25 − 0.25 = 5 ≠ 0
So X and Y are orthogonal, but correlated!
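A quick NumPy check of this example, just reproducing the numbers on the slide:

```python
import numpy as np

X = np.array([1.0, -5.0, 3.0, -1.0])
Y = np.array([5.0, 1.0, 1.0, 3.0])

print(X @ Y)                       # 0.0  -> the raw vectors are orthogonal

X_det = X - X.mean()               # detrend (demean) both vectors
Y_det = Y - Y.mean()
print(X_det @ Y_det)               # 5.0  -> not orthogonal after detrending
print(np.corrcoef(X, Y)[0, 1])     # ~0.25 -> X and Y are correlated
```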

r1 = [−0.6, 1.2] and r2 = [2, 1] are orthogonal (inner product = 0), but after detrending:
r1_det = [−0.9, 0.9], r2_det = [0.5, −0.5]
Inner product: (−0.9)·0.5 + 0.9·(−0.5) = −0.9 ≠ 0, so the detrended vectors are correlated.
[Figure: r1 and r2 before and after detrending]

Orthogonality & uncorrelatedness
Q: So should my regressors be uncorrelated or orthogonal?
A: When building your SPM.mat (i.e. running your jobfile), all regressors are detrended (except the grand mean scaling regressor). This is why "orthogonal" and "uncorrelated" are both used when talking about regressors; from here on "orthogonal" is used in the sense of "uncorrelated".
Update: it is unclear whether all regressors really are detrended when building an SPM.mat. This seemed to be the case, but recent SPM mailing list activity suggests detrending no longer takes place in versions newer than SPM99 (what about the Donders batch?). From Guillaume on the mailing list: "effectively there has been a change between SPM99 and SPM2 such that regressors were mean-centered in SPM99 but they are not any more (this is regressed out by the constant term anyway)."

Your regressors correlate
Despite scrupulous design, your regressors likely still correlate to some extent. This causes beta estimates to be lower than they could be.
You can see the correlations using Review → SPM.mat → Design → Design orthogonality.

For detrended data, the cosine of the angle between two regressors (shown in the design orthogonality display: black = 1, white = 0) is the same as the correlation r!
Orthogonal vectors: cos(90°) = 0, r = 0, r² = 0
Correlated vectors: cos(81°) ≈ 0.16, r = 0.16, r² ≈ 0.0256
r² indicates how much variance is shared between the two vectors (2.56% in this example). Note: −1 ≤ r ≤ 1 and 0 ≤ r² ≤ 1.
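A short NumPy check of this equivalence, using two arbitrary made-up regressors (not from the slides):

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.standard_normal(200)
b = 0.2 * a + rng.standard_normal(200)   # two correlated toy regressors

a_det = a - a.mean()                     # detrend (demean) both regressors
b_det = b - b.mean()

cos_angle = (a_det @ b_det) / (np.linalg.norm(a_det) * np.linalg.norm(b_det))
r = np.corrcoef(a, b)[0, 1]
print(cos_angle, r)                      # identical: cosine of the angle == correlation r
print(r**2)                              # shared variance between the two regressors
```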

Correlated regressors: variance moves from single regressors to shared
[Figure: Venn diagram; the overlap between regressor 1 and regressor 2 grows as they become more correlated]
The t-test uses the beta, which is determined by the amount of variance explained by that single regressor alone. Large shared variance therefore means low statistical power.
This is not necessarily a problem if you do not intend to test these two regressors (e.g. movement regressor 1 and movement regressor 2).
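A sketch of why shared variance hurts power, using made-up regressors and data: as the correlation r between two regressors grows, the standard error of each beta grows (by the variance inflation factor 1/(1 − r²)), so t-values shrink even though the signal itself is unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200

def beta_se(r_target):
    """Simulate two regressors with roughly the given correlation and return
    the standard error of the first beta from an OLS fit."""
    x1 = rng.standard_normal(n)
    x2 = r_target * x1 + np.sqrt(1 - r_target**2) * rng.standard_normal(n)
    X = np.column_stack([x1, x2, np.ones(n)])
    y = 1.0 * x1 + 0.5 * x2 + rng.standard_normal(n)
    sol, res, *_ = np.linalg.lstsq(X, y, rcond=None)
    sigma2 = res[0] / (n - X.shape[1])           # residual variance
    cov_beta = sigma2 * np.linalg.inv(X.T @ X)   # covariance of the beta estimates
    return np.sqrt(cov_beta[0, 0])

for r in (0.0, 0.5, 0.9, 0.99):
    print(r, beta_se(r))   # the standard error of beta1 grows with the correlation
```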

How to deal with correlated regressors?
Strong correlations between regressors are not necessarily a problem. What matters is the correlation between your contrast of interest and the rest of the design matrix.
Example: lights on vs lights off. If movement regressors correlate with these conditions (i.e. the contrast of interest is not orthogonal to the rest of the design matrix), there is a problem. If the nuisance regressors only correlate with each other, there is no problem!
Note: the grand mean scaling regressor is not centred around 0 (i.e. not detrended), so correlations with it are not informative.

Q: But doesn't the SPM book say that what matters is the correlation between a contrast and the rest of the design matrix, rather than correlations among all regressors of interest and the movement regressors?
A: If you intend to test all of your regressors of interest, it is still a problem: then none of those contrasts will be orthogonal to the rest of the design matrix.

How to deal with correlations between contrast and rest of design matrix? Orthogonalize regressor A wrt regressor B: all shared variance will now be assigned to B.

Orthogonality
[Figure: vectors r1 and r2 from the earlier example]

Orthogonality
[Figure: vectors r1 and r2, alongside a Venn diagram of regressor 1, regressor 2 and the total variance in the BOLD signal]

How to deal with correlations between contrast and rest of design matrix?
Orthogonalize regressor A wrt regressor B: all shared variance will then be assigned to B. This is only permissible given an a priori reason to do so, which is hardly ever the case.

How to deal with correlations between contrast and rest of design matrix?
Do an F-test to test the overall significance of your model, for example to see whether adding a regressor significantly improves the model; shared variance is then included when determining significance.
When a number of regressors represent the same manipulation (e.g. switch activity convolved with different HRFs), you can serially orthogonalize the regressors before estimating the betas.
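A minimal sketch of serial orthogonalization, similar in spirit to what SPM's spm_orth does in MATLAB (the NumPy function and the toy regressors below are illustrative assumptions, not SPM code):

```python
import numpy as np

def serial_orth(X):
    """Serially orthogonalize the columns of X: column k is made orthogonal to
    columns 0..k-1. Order matters, so put the 'canonical' regressor first."""
    X = X.astype(float).copy()
    for k in range(1, X.shape[1]):
        for j in range(k):
            X[:, k] -= (X[:, k] @ X[:, j]) / (X[:, j] @ X[:, j]) * X[:, j]
    return X

# Toy regressors standing in for the same event convolved with different HRFs.
rng = np.random.default_rng(4)
canonical = rng.standard_normal(100)
derivative = 0.7 * canonical + rng.standard_normal(100)
X = np.column_stack([canonical, derivative])

X_orth = serial_orth(X)
print(np.round(X_orth[:, 0] @ X_orth[:, 1], 10))   # 0.0: second column now orthogonal to the first
```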

Example of how not to do it
Two types of trials: gain and loss.
Voon et al. (2010) Mechanisms underlying dopamine-mediated reward bias in compulsive behaviors. Neuron.

Example of how not to do it
Four regressors:
- Gain predicted outcome
- Positive prediction error (gain trials)
- Loss predicted outcome
- Negative prediction error (loss trials)
These regressors are highly correlated, simply because of what they encode, ESPECIALLY when no jitter is introduced.
Voon et al. (2010) Mechanisms underlying dopamine-mediated reward bias in compulsive behaviors. Neuron.

Example of how not to do it
They performed six separate analyses (GLMs), so the shared variance is attributed to a single regressor in each GLM. Amazing! Similar patterns of activation!
Voon et al. (2010) Mechanisms underlying dopamine-mediated reward bias in compulsive behaviors. Neuron.

Take-home messages
- If regressors correlate, the shared explained variance in your BOLD signal will be assigned to neither, which reduces power on t-tests.
- If you orthogonalize regressor A with respect to regressor B, the values of A will be changed but A will keep the same uniquely explained variance; B, the unchanged variable, will come to explain all variance shared by A and B. However, don't do this unless you have a valid reason.
- Orthogonality and uncorrelatedness are only the same thing if your data are centred around 0 (detrended, spm_detrend).
- It is unclear whether SPM still detrends your regressors the moment you go from job.mat to SPM.mat (see the update above: apparently not since SPM99).

Interesting reads
- http://imaging.mrc-cbu.cam.ac.uk/imaging/DesignEfficiency#head-525685650466f8a27531975efb2196bdc90fc419 : combines the SPM book and Rik Henson's own attempt at explaining design efficiency and the issue of correlated regressors.
- Rodgers et al. (1984) Linearly independent, orthogonal and uncorrelated variables. The American Statistician, 38:133-134. A 15-minute read that describes three basic concepts in statistics/algebra.


Another example: non-orthogonal, but uncorrelated
Raw vectors:
x = [3, 6, 9], y = [6, −3, 6]
Inner product: 3·6 + 6·(−3) + 9·6 = 54 ≠ 0, so x and y are not orthogonal.
Same vectors, but detrended:
x_det = [−3, 0, 3], y_det = [3, −6, 3]
Inner product: (−3)·3 + 0·(−6) + 3·3 = 0, so x and y are uncorrelated.
Non-orthogonal but uncorrelated!
(Speaker note: include an example here where the raw vectors are orthogonal, but after detrending, which is what SPM does, the vectors are correlated.)