The General Linear Model …a talk for dummies Holly Rossiter & Louise Croft, Methods for Dummies 2010

Topics covered: the General Linear Model; the design matrix; parameters; error; an fMRI example; problems with the GLM + solutions; a cake recipe.

Overview of SPM. (Pipeline diagram: image time-series → realignment → smoothing (kernel) → general linear model (design matrix → parameter estimates) → statistical inference (Gaussian field theory, p < 0.05), with spatial normalisation to a template.)

The General Linear Model describes a response (y), such as the BOLD response in a voxel, in terms of all its contributing factors (Xβ) in a linear combination, whilst also accounting for the contribution of error (e): y = Xβ + e.

The General Linear Model: the dependent variable (y) is the response being described, such as the BOLD response in a single voxel, taken from an fMRI scan.

The General Linear Model: the independent variable (X), aka the predictor, e.g. the experimental conditions. It embodies all available knowledge about experimentally controlled factors and potential confounds.

The General Linear Model: the parameters (β, aka regression coefficients or beta weights) quantify how much each predictor (X) independently influences the dependent variable (Y); graphically, β is the slope of the regression line.

The General Linear Model: the error (e) is the variance in the data (y) which is not explained by the linear combination of predictors (X).

Therefore… DV = (IV × parameters) + error, i.e. y = Xβ + e. As we take samples of a response (y) many times, this equation actually represents a matrix equation…

…the GLM matrix equation. (Figure: one response, the BOLD signal from a single voxel, sampled over time and stacked into the column vector y, with y = Xβ + e shown in matrix form.)

The Design Matrix models all known predictors of y over time. SPM always adds a constant term to the design matrix, to model the overall mean of the signal, which is arbitrary for BOLD data.

Each predictor (x) has an expected signal time course, which contributes to y: Y (the observed data, an fMRI signal over time) = β1·X1 + β2·X2 + β3·X3 + residuals.
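To make the pieces concrete, here is a minimal Python/NumPy sketch of the same idea (a toy design, not SPM code; the block lengths, β values and noise level are all made up for illustration):

```python
# Minimal GLM data sketch -- illustrative toy values, not SPM code.
import numpy as np

rng = np.random.default_rng(0)
n_scans = 120

# Two boxcar predictors with different block lengths, plus the constant
# column that SPM always adds for the overall mean of the signal.
x1 = np.tile(np.repeat([1.0, 0.0], 10), 6)   # condition 1: 10-scan on/off blocks
x2 = np.tile(np.repeat([0.0, 1.0], 15), 4)   # condition 2: 15-scan off/on blocks
x3 = np.ones(n_scans)                        # constant (overall signal mean)
X = np.column_stack([x1, x2, x3])            # design matrix, shape (120, 3)

true_beta = np.array([0.8, 0.2, 3.0])        # hypothetical "true" parameters
e = rng.normal(scale=0.5, size=n_scans)      # unexplained noise (the error)
y = X @ true_beta + e                        # observed time series: y = Xb + e
```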

The design matrix does not account for all of y. If we plot our observations (n) on a graph, they will not fall in a straight line; this is a result of uncontrolled influences (other than X) on y, and this contribution to y is called the error (or residual). Minimising the difference between the response predicted by the model (ŷ) and the actual response (y) minimises the error of the model.

So, we need to construct a design matrix which aims to explain as much of the variance in y as possible, in order to minimise the error of the model.

How?

Parameters (β). Beta is the slope of the regression line; it quantifies a specific predictor's (x) contribution to y. The parameters (β) chosen for a model should minimise the error, reducing the amount of variance in y which is left unexplained.

So, we have our set of hypothetical ("known") time series x1, x2, x3… (here the Generation, Shadowing and Baseline conditions) …and our measured data. We want to estimate (or want SPM to estimate) a model of this observed time series as a linear combination of the hypothetical time series.

We find the best parameter values by modelling the data as y ≈ β1·Generation + β2·Shadowing + β3·Baseline, where the βs are the "unknown" parameters; the best parameters will minimise the error in the model.

Here, there is a lot of residual variance in y which is unexplained by the model (error): not brilliant.

…and the same goes here: still not great.

…but here we have a good fit, with minimal error (the fitted parameters shown in the figure are roughly 0.83, 0.16 and 2.98).

…and the same model can fit different data, just with different parameters (roughly 0.68, 0.82 and 2.17 in the figure). In other words:

…as you can see: different data (y), the same predictors (X), and different parameters (β) (roughly 0.03, 0.06 and 2.04 in the figure). The model doesn't care.

So the same model can be used across all voxels of the brain, just using different parameters: the fitted parameters for each voxel's time series are written out as one image per predictor (beta_0001.img, beta_0002.img, beta_0003.img, …).
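This is also why estimation scales to the whole brain: with the data stored as a (time × voxels) matrix, one solve fits every voxel at once. A minimal illustrative sketch (toy two-column design and random data, purely hypothetical):

```python
# Fitting the same design matrix to many voxels at once -- toy example.
import numpy as np

rng = np.random.default_rng(3)
n_scans, n_voxels = 120, 1000
X = np.column_stack([np.tile(np.repeat([1.0, 0.0], 10), 6),  # one boxcar predictor
                     np.ones(n_scans)])                      # constant term
Y = rng.normal(size=(n_scans, n_voxels))     # one column of data per voxel

betas = np.linalg.pinv(X) @ Y                # shape (2, n_voxels): each row is
                                             # one beta map (cf. beta_000k.img)
```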

Finding the optimal parameters can also be visualised in geometric space. The y and x in our model are all vectors of the same dimensionality, and so lie within the same large space. The design matrix (x1, x2, x3…) defines a subspace: the design space.

The parameters determine the co-ordinates of the predicted response (ŷ) within this space. The actual data (y), however, lie outside this space, so there is always a difference between the predicted ŷ and the actual y.

We need to minimise the difference between the predicted ŷ and the actual y. So the GLM aims to find the projection of y onto the design space which minimises the error of the model (i.e. minimises the difference between predicted ŷ and actual y).

The smallest error vector is orthogonal to the design space… so the best parameters will position ŷ such that the error vector e between y and ŷ is orthogonal to the design space (minimising the error).

How do we find the parameter which produces minimal error?

The optimal parameters can be calculated using Ordinary Least Squares (OLS): a statistical method for estimating unknown parameters from sampled data by minimising the difference between the predicted and observed data, i.e. choosing β so that the sum of squared errors, Σ (t = 1…N) e_t², is a minimum.
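As a sketch of what OLS computes (rebuilding the same toy design and data as the earlier snippet), a standard least-squares solver minimises the residual sum of squares, and the resulting residuals come out orthogonal to the design space, matching the geometric picture above:

```python
# OLS sketch: beta_hat minimises the sum of squared errors.
# Rebuilds the same toy design and data as the earlier sketch.
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.tile(np.repeat([1.0, 0.0], 10), 6),
                     np.tile(np.repeat([0.0, 1.0], 15), 4),
                     np.ones(120)])
y = X @ np.array([0.8, 0.2, 3.0]) + rng.normal(scale=0.5, size=120)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)   # solves min ||y - X b||^2
residuals = y - X @ beta_hat
sse = np.sum(residuals ** 2)                       # sum over t of e_t^2

# The smallest error vector is orthogonal to the design space: X'e ~= 0
print(beta_hat, sse, np.abs(X.T @ residuals).max())
```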

Some assumptions the GLM makes about the error:
1. Errors are normally distributed.
2. The error variance is the same at each and every measurement point.
3. There is no correlation between errors at different time points/data points.

To re-cap…

(Recap figure: BOLD signal time course = multiple predictors × parameters + E; the predictors account for part of the variance in the signal, and E is the unaccounted variance, i.e. the error.)

An fMRI example

An example, using fMRI… A single voxel is analysed across many different time points, and Y is the BOLD signal at each time point from that voxel. To explain how the GLM is applied to brain imaging data, I will use a simple fMRI experiment. With fMRI we ultimately want to look at all the voxels in the brain, but each GLM focuses on just one voxel across a series of time points (left-hand picture), and the value of Y is the strength of the BOLD signal at that location and time point (right-hand picture).

A simple fMRI experiment: one session, passive word listening versus rest, 7 cycles of rest and listening. This experiment is borrowed from some SPM course slides. The main question is whether the BOLD response changes between listening and rest, and in which areas that change occurs. The graph shows the strength of the BOLD response at a particular voxel (y axis) against time (x axis), together with the stimulus function; the BOLD response visibly changes between the two conditions. Question: is there a change in the BOLD response between listening and rest?

General Linear Model: fMRI example. Y is the BOLD response at each time point at the chosen voxel. X contains the predictors that explain the data, such as listening vs rest in this example; it is also important to include predictors that are not experimentally interesting, such as gender or fatigue. The β value (coefficient) for each x indicates how much that particular predictor explains the BOLD response (the figure shows β = 0.44). The error e covers all the variance that cannot be explained by the predictors, i.e. any noise left over in the data.

Statistical Parametric Mapping. The null hypothesis is that your effect of interest explains none of your data. For the simple listening task: how much does listening vs rest explain the BOLD signal, and is it statistically significant? From the GLM equation, β can be calculated; you can then determine whether the effect of interest (here, listening as compared to rest) is statistically significant by performing statistical tests such as t-tests or ANOVAs, with statistical inference at e.g. p < 0.05.
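A hedged sketch of the classical t-test on a parameter estimate, using the textbook GLM formula rather than SPM's exact machinery, and the same toy design and data as before; the contrast vector c picks out the predictor of interest:

```python
# t-test on a contrast of parameter estimates: t = c'b / sqrt(var(c'b)).
# Same toy design and data as in the earlier sketches.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
X = np.column_stack([np.tile(np.repeat([1.0, 0.0], 10), 6),
                     np.tile(np.repeat([0.0, 1.0], 15), 4),
                     np.ones(120)])
y = X @ np.array([0.8, 0.2, 3.0]) + rng.normal(scale=0.5, size=120)

c = np.array([1.0, 0.0, 0.0])                    # contrast: first predictor vs 0
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
res = y - X @ beta_hat
dof = X.shape[0] - np.linalg.matrix_rank(X)      # residual degrees of freedom
sigma2 = res @ res / dof                         # estimated error variance
var_cb = sigma2 * c @ np.linalg.inv(X.T @ X) @ c
t_stat = (c @ beta_hat) / np.sqrt(var_cb)
p_value = stats.t.sf(t_stat, dof)                # one-sided test of beta_1 > 0
```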

Problems with this model / issues to consider: the BOLD signal is not a simple on/off; low-frequency noise; the assumptions about the error may be violated; physiological confounds. Although this is an extremely useful and important model, these issues need to be considered: the BOLD response is not simply an on/off signal, so the model needs adjusting to match its shape; fMRI data contain a lot of low-frequency noise which may contaminate them; some of the GLM's many assumptions about the error may be violated; and physiological confounds matter when recording and analysing fMRI data. I'll go through each of these in turn and explain how we can attempt to solve these problems.

Problem 1: the shape of the BOLD response. Solution: the convolution model. Expected BOLD response = input function ⊗ hemodynamic response function (HRF). The BOLD signal does not respond immediately and has a characteristic shape, whereas the impulses or conditions change in an on/off (0 to 1) way. The model therefore needs to be altered to accommodate the haemodynamic response function; this is termed the convolution model.

Convolution model of the BOLD response. Convolve the stimulus function with a canonical hemodynamic response function (HRF): Original ⊗ HRF = Convolved. So you alter the model from an on/off boxcar to a smoother, convolved version; plotting the BOLD response against scan number shows that the adjusted model matches the actual haemodynamic response much more closely.
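A minimal sketch of this convolution in Python. The double-gamma HRF below is a common textbook approximation, so treat its parameters (peak near 5 s, undershoot near 15 s, 1/6 undershoot ratio) and the 2 s TR as assumptions rather than SPM's exact canonical values:

```python
# Convolution model sketch: boxcar stimulus convolved with a double-gamma HRF.
import numpy as np
from scipy.stats import gamma

tr = 2.0                                   # repetition time in seconds (assumed)
t = np.arange(0.0, 32.0, tr)               # HRF sampled over 0-32 s
hrf = gamma.pdf(t, a=6) - gamma.pdf(t, a=16) / 6.0   # peak minus undershoot
hrf /= hrf.sum()                           # normalise to unit sum

# 7 cycles of rest (0) and listening (1), as in the example experiment
boxcar = np.tile(np.repeat([0.0, 1.0], 10), 7)
expected_bold = np.convolve(boxcar, hrf)[: len(boxcar)]  # convolved regressor
```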

Problem 2: low-frequency noise. Solution: high-pass filtering with a discrete cosine transform (DCT) set. (Figure legend: blue = data; black = mean + low-frequency drift; green = predicted response, taking the low-frequency drift into account; red = predicted response, NOT taking the drift into account.) fMRI data contain a lot of low-frequency noise or baseline drift, for example due to breathing. To remove it, we apply a high-pass filter: very low-frequency columns are added to the design matrix so that this noise is modelled and accounted for.
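A minimal sketch of high-pass filtering with a DCT basis; the rule used here for the number of cosines, and the 128 s cutoff, follow a common convention and should be treated as assumptions rather than SPM's literal implementation:

```python
# High-pass filtering sketch: regress a low-frequency DCT basis out of y.
import numpy as np

n, tr, cutoff = 140, 2.0, 128.0            # scans, TR (s), cutoff period (s)
k = int(np.floor(2.0 * n * tr / cutoff))   # number of low-frequency cosines
t = np.arange(n)
dct = np.column_stack([np.cos(np.pi * j * (2 * t + 1) / (2.0 * n))
                       for j in range(1, k + 1)])   # DCT-II basis, no constant

rng = np.random.default_rng(1)
y = rng.normal(size=n) + 0.02 * t          # toy data with a slow linear drift
y0 = y - y.mean()                          # remove the mean (constant term)
drift = dct @ np.linalg.lstsq(dct, y0, rcond=None)[0]
y_filtered = y0 - drift                    # low-frequency drift removed
```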

Problem 3: the error is correlated, although the model assumes it is not. Solution: an autoregressive model. One of the main violations of the GLM's error assumptions in fMRI data is that the model assumes the error is uncorrelated, whereas in fact the error at each time point is correlated with the error at the previous time point: e at time t is correlated with e at time t−1 (it should be uncorrelated; it isn't).

Autoregressive model. Temporal autocorrelation in y = Xβ + e over time: e_t = a·e_(t−1) + ε, i.e. the error at one time point is correlated with the error at the previous time point (and, to a lesser degree, with earlier points, dropping off quickly), with autocovariance a. This autocovariance function can be estimated, the data analysed taking it into account, and the GLM performed again without the correlated error; this is called 'whitening'. In short: the GLM assumes the error is uncorrelated, but it is correlated; SPM fits an autoregressive model to calculate this correlation and then removes it from the data, so it is no longer a problem.
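The idea can be sketched in a few lines: simulate AR(1) errors e_t = a·e_(t−1) + ε_t, then whiten them with a matrix that subtracts a times the previous sample. Here the coefficient a is assumed known; in practice it is estimated from the data, and SPM's actual implementation differs:

```python
# AR(1) error and prewhitening sketch -- illustrative, not SPM's algorithm.
import numpy as np

rng = np.random.default_rng(2)
n, a = 140, 0.4                        # length and AR(1) coefficient (assumed)
eps = rng.normal(size=n)               # white innovations
e = np.zeros(n)
for i in range(1, n):
    e[i] = a * e[i - 1] + eps[i]       # e_t = a * e_{t-1} + eps_t

# Whitening matrix: (W e)_t = e_t - a * e_{t-1}, which recovers eps_t
W = np.eye(n) - a * np.eye(n, k=-1)
white_e = W @ e                        # approximately uncorrelated again
```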

Problem 4: physiological confounds, such as head movements, arterial pulsations, breathing, eye blinks (visual cortex), adaptation effects, fatigue, and changes in attention to the task. These may all affect the data; some can be minimised by giving clear instructions to the participant and ensuring your scan doesn't last too long!

To recap… Response = (Predictors × Parameters) + Error. Y is your data, i.e. the BOLD response; X contains the predictors that can explain your data; β is how much each predictor explains the data; and the error is whatever is left over.

Analogy: reverse cookery. Start with the finished product and try to explain how it was made. You start with the finished cake (Y, the data) and work out which ingredients went into it: you specify the ingredients to add (X, the predictors), and for each ingredient the GLM finds the quantity (β) that best reproduces the original. If you then tried to make the cake with what you know about X and β, the error would be the difference between the original cake (the data) and yours. (This analogy comes from a previous year's slides; I thought it was quite a good way of explaining the GLM.)

Thanks to… previous years' MfD slides (2003-2009); Guillaume Flandin (our expert); slides from previous SPM courses (http://www.fil.ion.ucl.ac.uk/spm/course/); and Suz and Christian. Thanks for listening.