The general linear model and Statistical Parametric Mapping I: Introduction to the GLM. Alexa Morcom and Stefan Kiebel, Rik Henson, Andrew Holmes & J-B Poline

Overview Introduction Essential concepts –Modelling –Design matrix –Parameter estimates –Simple contrasts Summary

Some terminology SPM is based on a mass univariate approach that fits a model at each voxel –Is there an effect at location X? Investigate localisation of function or functional specialisation –How does region X interact with Y and Z? Investigate behaviour of networks or functional integration A General(ised) Linear Model –Effects are linear and additive –If errors are normal (Gaussian), General (SPM99) –If errors are not normal, Generalised (SPM2)

Classical statistics... Parametric –one sample t-test –two sample t-test –paired t-test –ANOVA –ANCOVA –correlation –linear regression –multiple regression –F-tests –etc… all cases of the (univariate) General Linear Model. Or, with non-normal errors, the Generalised Linear Model.

Classical statistics... Parametric –one sample t-test –two sample t-test –paired t-test –ANOVA –ANCOVA –correlation –linear regression –multiple regression –F-tests –etc… all cases of the (univariate) General Linear Model. Or, with non-normal errors, the Generalised Linear Model. Multivariate? → PCA/SVD, MLM. Non-parametric? → SnPM.

Why modelling? Why? To make inferences about effects of interest. How? 1. Decompose data into effects and error. 2. Form a statistic using estimates of effects and error. Model? Use any available knowledge. [diagram: data → model → effects estimate and error estimate → statistic]

Choose your model: a trade-off between bias and variance. A very general model (no smoothing, no normalisation, a massive model) captures the signal, but... not much sensitivity. A sparse model (lots of smoothing, lots of normalisation) has high sensitivity, but... may miss signal. The default SPM model sits between these extremes. “All models are wrong, but some are useful” (George E. P. Box)

Modelling with SPM [pipeline diagram] Functional data are preprocessed (using templates) into smoothed normalised data; the generalised linear model (design matrix, variance components) yields parameter estimates; contrasts of these give SPMs, which are then thresholded.

Modelling with SPM [diagram: single-voxel view] Preprocessed data from a single voxel enter the generalised linear model (design matrix, variance components), yielding parameter estimates; contrasts of these give SPMs.

One-session fMRI example Passive word listening versus rest: 7 cycles of rest and listening, each epoch 6 scans with 7 s TR. [figure: time series of BOLD responses in one voxel, with the stimulus function] Question: is there a change in the BOLD response between listening and rest?
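The block design described above can be sketched as a stimulus function in code. This is an illustrative reconstruction from the numbers on the slide (7 cycles, 6-scan epochs, TR = 7 s), not part of the original lecture:

```python
import numpy as np

# Boxcar stimulus function for the listening experiment: 7 cycles, each a
# 6-scan rest epoch followed by a 6-scan listening epoch, TR = 7 s.
scans_per_epoch = 6
n_cycles = 7
cycle = np.concatenate([np.zeros(scans_per_epoch),   # rest
                        np.ones(scans_per_epoch)])   # listening
boxcar = np.tile(cycle, n_cycles)    # stimulus function over the session

n_scans = boxcar.size                # 7 * 12 = 84 scans
session_duration_s = n_scans * 7     # 84 scans * 7 s TR = 588 s
```

This boxcar becomes the regressor of interest in the design matrix built in later slides.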

GLM essentials The model –Design matrix: Effects of interest –Design matrix: Confounds (aka effects of no interest) –Residuals (error measures of the whole model) Estimate effects and error for data –Parameter estimates (aka betas) –Quantify specific effects using contrasts of parameter estimates Statistic –Compare estimated effects – the contrasts – with appropriate error measures –Are the effects surprisingly large?

Regression model: general case [figure: voxel intensity over time] Y = x1·β1 + x2·β2 + error, where the error is normal and independently and identically distributed. Question: is there a change in the BOLD response between listening and rest? Hypothesis test: β1 = 0? (using a t-statistic)

Regression model: general case Y = Xβ + ε. The model is specified by 1. the design matrix X, and 2. assumptions about ε.
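As a concrete sketch of "the model is specified by the design matrix", here is the two-column X for the listening experiment: the boxcar regressor plus a constant. The numbers follow the slide's design; the code itself is illustrative:

```python
import numpy as np

# Design matrix for the listening experiment: one column for the boxcar
# stimulus, one constant column for the session mean.
boxcar = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)   # 84-scan block design
X = np.column_stack([boxcar, np.ones_like(boxcar)])   # design matrix, 84 x 2

print(X.shape)                    # (84, 2)
print(np.linalg.matrix_rank(X))   # 2: both effects are estimable
```

Everything the GLM needs about the experiment is encoded in X; the assumptions about ε (normal, iid) are stated separately.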

Matrix formulation Yi = β1·xi + β2·1 + εi, for i = 1, 2, 3 (the column of 1s is a “dummy” variable coding the constant). Written out:
Y1 = β1·x1 + β2·1 + ε1
Y2 = β1·x2 + β2·1 + ε2
Y3 = β1·x3 + β2·1 + ε3
or, in matrix form,
[Y1]   [x1 1]        [ε1]
[Y2] = [x2 1] [β1] + [ε2]
[Y3]   [x3 1] [β2]   [ε3]
i.e. Y = Xβ + ε.

Matrix formulation [figure: linear regression of y on x] The parameter estimates for β1 (slope) and β2 (intercept) give fitted values Ŷi and residuals (hats denote estimates).

Geometrical perspective [figure] The data Y = (Y1, Y2, Y3) is a point in 3-dimensional space. The columns of X – the regressor (x1, x2, x3) scaled by β1 and the constant (1, 1, 1) scaled by β2 – span a plane through the origin O: the design space. The model says Y lies near that plane, offset by the error ε.

Parameter estimation: ordinary least squares Estimate the parameters so that the residuals are minimal in the least-squares sense. The least squares parameter estimate is β̂ = (X′X)⁻¹X′Y.

Parameter estimation: ordinary least squares [figure: data, fitted values, residuals r] Error variance σ̂² = (sum of) squared residuals standardised for the degrees of freedom: σ̂² = r′r / df, where (assuming iid errors) df = N − rank(X) (= N − P if X is of full rank P).
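The two estimation formulas above can be checked on simulated data. The effect sizes and noise level below are invented for illustration; only the design follows the slides:

```python
import numpy as np

# Ordinary least squares at one simulated voxel.
rng = np.random.default_rng(0)
boxcar = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)
X = np.column_stack([boxcar, np.ones_like(boxcar)])   # N x P design, N=84, P=2
beta_true = np.array([2.0, 100.0])                    # activation, baseline
Y = X @ beta_true + rng.normal(0, 1.0, size=84)

# Least-squares estimate: beta_hat = (X'X)^-1 X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
r = Y - X @ beta_hat                                  # residuals
df = len(Y) - np.linalg.matrix_rank(X)                # N - rank(X) = 82
sigma2_hat = (r @ r) / df                             # error variance estimate
```

With 84 scans and a full-rank two-column design, df = 82, and sigma2_hat estimates the true noise variance (1.0 here).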

Estimation (geometrical) [figure] The fitted values Ŷ = Xβ̂, with β̂ = (β̂1, β̂2)′, are the orthogonal projection of the data Y = (Y1, Y2, Y3) onto the design space; the residual vector runs from Ŷ to Y, perpendicular to the design space.
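The orthogonality in this picture can be verified numerically with a tiny 3-observation example in the spirit of the slide's figure (the data values are made up):

```python
import numpy as np

# Least squares as projection: the residual vector is perpendicular to
# every column of X, i.e. to the whole design space.
X = np.array([[1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])          # columns: x = (1,2,3), constant (1,1,1)
Y = np.array([1.0, 3.0, 2.0])

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)
Y_hat = X @ beta_hat                # projection of Y onto span(X)
r = Y - Y_hat                       # residual, orthogonal to design space

print(X.T @ r)                      # ~ [0, 0]
```

X′r = 0 is exactly the normal-equations condition that defines the least-squares fit.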

Inference: contrasts A contrast is a linear combination of parameters, c′β (written to con_000*.img). t-test: a one-dimensional contrast, testing a difference in means or a difference from zero – e.g. boxcar parameter > 0? Null hypothesis: c′β = 0. The statistic image is the SPM{t} map (spmT_000*.img). F-test: tests multiple linear hypotheses – does a subset of the model account for significant variance? Null hypothesis: the variance of the tested effects equals the error variance. The extra sum of squares is written to ess_000*.img; the statistic image is the SPM{F} map.

t-statistic: example [design matrix figure] t = (contrast of parameter estimates) / (its estimated standard error), i.e. t = c′β̂ / √(σ̂² c′(X′X)⁻¹c). This tests for a directional difference in means. The standard error of the contrast depends on the design, and is larger with greater residual error and ‘greater’ covariance/autocorrelation. Degrees of freedom: df = n − p, for n observations and p parameters.
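Continuing the simulated voxel (invented effect sizes, design from the slides), the t-statistic for the boxcar contrast can be computed as:

```python
import numpy as np

# t-statistic for the contrast c = [1, 0]: "boxcar parameter > 0?"
rng = np.random.default_rng(1)
boxcar = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)
X = np.column_stack([boxcar, np.ones_like(boxcar)])
Y = X @ np.array([2.0, 100.0]) + rng.normal(0, 1.0, size=84)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ Y
r = Y - X @ beta_hat
df = len(Y) - np.linalg.matrix_rank(X)           # n - p = 82
sigma2_hat = (r @ r) / df

c = np.array([1.0, 0.0])                         # contrast of interest
se = np.sqrt(sigma2_hat * c @ XtX_inv @ c)       # std error of c'beta_hat
t = (c @ beta_hat) / se                          # large -> reject beta1 = 0
```

With a true activation of 2.0 and unit noise, t comes out far above conventional thresholds.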

F-statistic: example Do movement parameters (or other confounds) account for anything? Compare the full model X = [X1 X0] with the reduced (null) model X0 alone: this model, or that one? Null hypothesis H0: all the confound betas β3–β9 are zero, i.e. no linear combination of those effects accounts for significant variance; equivalently, H0: c′β = 0 for a contrast c selecting β3–β9, or H0: the true model is X0. This is a non-directional test; the resulting statistic image is the SPM{F} map.
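The nested-model comparison behind the F-test can be sketched numerically. The single "movement" confound, the effect sizes, and the noise below are all invented for illustration:

```python
import numpy as np

# F-test as a comparison of nested models: does adding extra regressors
# (here one stand-in movement confound) explain significant extra variance?
rng = np.random.default_rng(2)
n = 84
boxcar = np.tile(np.r_[np.zeros(6), np.ones(6)], 7)
confound = rng.normal(size=n)                      # stand-in movement parameter
X_full = np.column_stack([boxcar, confound, np.ones(n)])
X0 = np.column_stack([boxcar, np.ones(n)])         # reduced (null) model
Y = X_full @ np.array([2.0, 1.5, 100.0]) + rng.normal(0, 1.0, n)

def rss(X, Y):
    """Residual sum of squares after a least-squares fit."""
    b = np.linalg.lstsq(X, Y, rcond=None)[0]
    return float(((Y - X @ b) ** 2).sum())

df_full = n - np.linalg.matrix_rank(X_full)        # 84 - 3 = 81
k = X_full.shape[1] - X0.shape[1]                  # number of extra regressors
F = ((rss(X0, Y) - rss(X_full, Y)) / k) / (rss(X_full, Y) / df_full)
# The confound genuinely carries signal here, so F is large.
```

The numerator is the extra sum of squares explained by the tested regressors per degree of freedom; the denominator is the error variance of the full model.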

Summary so far The essential model contains –Effects of interest A better model? –A better model (within reason) means smaller residual variance and more significant statistics –Capturing the signal – later –Add confounds/effects of no interest –Example: movement parameters in fMRI –A further example (mainly relevant to PET)…

Example: PET experiment [design matrix figure; rank(X) = 3] 12 scans, 3 conditions (one-way ANOVA): y_j = x_1j·b1 + x_2j·b2 + x_3j·b3 + x_4j·b4 + e_j, where the (dummy) variables are: x_1j = [0,1] = condition A (first 4 scans); x_2j = [0,1] = condition B (second 4 scans); x_3j = [0,1] = condition C (third 4 scans); x_4j = [1] = grand mean (session constant).
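The slide's 12 x 4 design matrix of dummy variables can be built explicitly; this sketch follows the slide's layout (three blocks of 4 scans plus a constant):

```python
import numpy as np

# One-way ANOVA design for the PET example: 12 scans, 3 conditions of
# 4 scans each, plus a grand-mean (session constant) column.
conditions = np.repeat([0, 1, 2], 4)      # A A A A  B B B B  C C C C
X = np.zeros((12, 4))
X[np.arange(12), conditions] = 1          # x1..x3: indicator per condition
X[:, 3] = 1                               # x4: grand mean

# The three condition columns sum to the constant column, so the design
# is rank deficient: rank(X) = 3, exactly as noted on the slide.
print(np.linalg.matrix_rank(X))           # 3
```

Rank deficiency is why individual betas here are not uniquely estimable, while contrasts such as condition differences still are.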

Global effects [scatter plots: rCBF vs global gCBF, under AnCova and under scaling] There may be variation in PET tracer dose from scan to scan. Such “global” changes in image intensity (gCBF) confound local/regional (rCBF) changes of the experiment. Adjust for global effects by: –AnCova (additive model) – PET? –Proportional scaling – fMRI? This can improve statistics when the global signal is orthogonal to effects of interest… …but can also worsen them when effects of interest are correlated with the global signal.

Global effects (AnCova) [design matrix figure; rank(X) = 4] 12 scans, 3 conditions, 1 confounding covariate: y_j = x_1j·b1 + x_2j·b2 + x_3j·b3 + x_4j·b4 + x_5j·b5 + e_j, where the (dummy) variables x_1j…x_4j are as before, and x_5j = global signal (mean over all voxels).

Global effects (AnCova) [three design matrices] No global (rank(X) = 3): global effects not accounted for; maximum degrees of freedom (modelling the global uses one). Orthogonal global (rank(X) = 4): global effects independent of effects of interest; smaller residual variance; larger T statistic; more significant. Correlated global (rank(X) = 4): global effects correlated with effects of interest; smaller effect and/or larger residuals; smaller T statistic; less significant.
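The three cases can be illustrated with a small simulation. All numbers (effect size, global coefficients, noise) are invented; only the qualitative contrast between orthogonal and correlated globals follows the slide:

```python
import numpy as np

# Orthogonal vs correlated global covariate: an orthogonal global leaves
# the effect estimate unchanged while soaking up residual variance; a
# correlated global shifts the estimate.
rng = np.random.default_rng(3)
n = 12
effect = np.r_[np.ones(6), np.zeros(6)]      # effect of interest
const = np.ones(n)
g_orth = np.tile([1.0, -1.0], 6)             # orthogonal to effect and const
g_corr = effect + rng.normal(0, 0.3, n)      # correlated with effect

Y = 2.0 * effect + 1.0 * g_orth + rng.normal(0, 1.0, n)

def fit(X, Y):
    """Least-squares coefficients and residual sum of squares."""
    b = np.linalg.lstsq(X, Y, rcond=None)[0]
    return b, float(((Y - X @ b) ** 2).sum())

b_no, rss_no = fit(np.column_stack([effect, const]), Y)            # no global
b_orth, rss_orth = fit(np.column_stack([effect, g_orth, const]), Y)
b_corr, rss_corr = fit(np.column_stack([effect, g_corr, const]), Y)
# b_orth[0] equals b_no[0] (orthogonality); b_corr[0] generally differs.
```

Adding an orthogonal regressor never changes the other coefficients but does reduce the residual sum of squares, which is why it can only help the T statistic; a correlated one also changes the effect estimate itself.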

Global effects (scaling) Two types of scaling: grand mean scaling and global scaling. –Grand mean scaling is automatic; global scaling is optional. –Grand mean scaling scales by 100/mean over all voxels and ALL scans (i.e. a single number per session). –Global scaling scales by 100/mean over all voxels for EACH scan (i.e. a different scaling factor every scan). The problem with global scaling is that the TRUE global is not (normally) known… …it is only estimated by the mean over voxels. –So if there is a large signal change over many voxels, the global estimate will be confounded by local changes. –This can produce artifactual deactivations in other regions after global scaling. Since most sources of global variability in fMRI are low frequency (drift), high-pass filtering may be sufficient, and many people do not use global scaling.
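The two scalings differ only in whether the normalising mean is computed once per session or once per scan. A minimal sketch on made-up data (the scan and voxel counts are arbitrary):

```python
import numpy as np

# Grand mean scaling vs global scaling on a (scans x voxels) data matrix.
rng = np.random.default_rng(4)
data = rng.uniform(90, 110, size=(84, 1000))     # 84 scans, 1000 voxels

# Grand mean scaling: ONE factor for the whole session
grand_mean = data.mean()                         # mean over all voxels & scans
gm_scaled = data * (100.0 / grand_mean)

# Global scaling: a SEPARATE factor for every scan
scan_means = data.mean(axis=1, keepdims=True)    # one global estimate per scan
g_scaled = data * (100.0 / scan_means)

print(np.isclose(gm_scaled.mean(), 100.0))       # True: session mean is 100
print(np.allclose(g_scaled.mean(axis=1), 100.0)) # True: EVERY scan mean is 100
```

Global scaling forces every scan's mean to exactly 100, which is precisely how a genuine widespread activation leaks into the scaling factor and produces apparent deactivations elsewhere.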

Summary General(ised) linear model partitions data into –Effects of interest & confounds/ effects of no interest –Error Least squares estimation –Minimises difference between model & data –To do this, assumptions made about errors – more later Inference at every voxel –Test hypothesis using contrast – more later –Inference can be Bayesian as well as classical Next: Applying the GLM to fMRI