The General Linear Model, or: What the Hell's Going on During Estimation?


What we hope to cover:

- Extension of linear to multiple regression
- Matrix formulation of multiple regression; residuals and parameter estimates
- General and Generalised Linear Models
- Overdetermined models and the pseudoinverse solution
- Specific application to fMRI and basis sets

Multiple Regression

Last time, David talked about linear regression: the determination of a linear relationship between a single dependent variable and a single independent variable, of the form:

Y = βX + c

For example, we might think that the number of papers a researcher publishes per year (Y) is related to how hard-working he or she is (X), and we can attempt to determine the regression coefficient (β), which reflects how much of an effect X has on Y. This approach can be extended to account for multiple variables, such as how friendly you were to potential reviewers at a recent conference, combined in a linear fashion:

Y = β_1 x_1 + β_2 x_2 + … + β_L x_L + ε    (1)
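As a sketch of equation (1), the toy example below fits a two-regressor model by least squares with NumPy; the regressor meanings and true coefficients are invented purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Two explanatory variables, e.g. "hard work" (x1) and "reviewer
# friendliness" (x2); values and coefficients are made up.
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
true_beta = np.array([2.0, 0.5])
y = true_beta[0] * x1 + true_beta[1] * x2 + rng.normal(scale=0.1, size=n)

# Stack the regressors column-wise and solve for the betas by least squares.
X = np.column_stack([x1, x2])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to [2.0, 0.5]
```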

Multiple Regression

The β parameters reflect the independent contribution of each explanatory variable to Y, that is, the amount of variance accounted for by that variable after all the other variables have been accounted for. For example, one might see a negative correlation between height and hair length. However, if we add an explanatory variable reflecting gender (a categorical or dummy variable), we see that the apparent correlation actually reflects that, on average, men are taller than women while women tend to have longer hair, and that height has no independent predictive value for hair length. The regression surface (the equivalent of the slope line in simple regression) expresses the best prediction of the dependent variable, Y, given the explanatory variables (the Xs). Observed data will deviate from this regression surface; the deviation of each observation from the corresponding point on the surface is termed its residual.
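The height and hair-length example can be simulated directly; every number below is invented purely to reproduce the described effect:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Illustrative simulation: men are taller on average, women tend to have
# longer hair, and height itself has no effect on hair length.
male = rng.integers(0, 2, size=n)                      # dummy variable, 1 = male
height = 165 + 12 * male + rng.normal(scale=6, size=n)
hair = 30 - 20 * male + rng.normal(scale=5, size=n)

# Simple regression of hair length on height: a spurious negative slope.
X1 = np.column_stack([np.ones(n), height])
b1, *_ = np.linalg.lstsq(X1, hair, rcond=None)

# Adding the gender dummy: height's coefficient collapses towards zero.
X2 = np.column_stack([np.ones(n), height, male])
b2, *_ = np.linalg.lstsq(X2, hair, rcond=None)

print(b1[1], b2[1])
```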

Matrix Formulation of Multiple Regression

Writing out equation (1) for each observation of Y gives a series of simultaneous equations:

Y_1 = x_11 β_1 + … + x_1l β_l + … + x_1L β_L + ε_1
  :
Y_j = x_j1 β_1 + … + x_jl β_l + … + x_jL β_L + ε_j
  :
Y_J = x_J1 β_1 + … + x_Jl β_l + … + x_JL β_L + ε_J

In matrix form:

Y = X β + ε

where Y is the J×1 vector of observed data, X is the J×L design matrix (element x_jl holds the value of the l-th explanatory variable at the j-th observation), β is the L×1 vector of parameters, and ε is the J×1 vector of residuals.
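A minimal check that the matrix form is just shorthand for the J simultaneous equations; the sizes here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
J, L = 6, 3  # J observations, L regressors (small, for illustration)

X = rng.normal(size=(J, L))           # design matrix
beta = rng.normal(size=L)             # parameters
eps = rng.normal(scale=0.1, size=J)   # residuals

# Matrix form: Y = X @ beta + eps ...
Y = X @ beta + eps

# ... is exactly the J equations Y_j = sum_l x_jl * beta_l + eps_j
Y_rowwise = np.array([sum(X[j, l] * beta[l] for l in range(L)) + eps[j]
                      for j in range(J)])
print(np.allclose(Y, Y_rowwise))  # True
```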

Parameter Estimation

Typically the simultaneous equations above cannot be solved exactly (i.e. with ε = 0), so we instead aim for the best fit between model and data by minimising the sum of squares of the residuals; this is the least squares estimate. The residual sum of squares is

S = Σ_j ε_j² = (Y − Xβ)ᵀ(Y − Xβ)

S is minimised when ∂S/∂β_l = 0 for each l; each such condition is the l-th row of the normal equations

XᵀX β̂ = XᵀY

which the least squares estimates must satisfy, giving

β̂ = (XᵀX)⁻¹ XᵀY    (2)
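Equation (2) can be sketched numerically; note the defining property of the least squares fit, that the residuals are orthogonal to every column of X (the sizes and true betas below are invented):

```python
import numpy as np

rng = np.random.default_rng(3)
J, L = 50, 3
X = rng.normal(size=(J, L))
beta_true = np.array([1.0, -2.0, 0.5])
Y = X @ beta_true + rng.normal(scale=0.2, size=J)

# Least squares estimate via the normal equations: (X'X) beta_hat = X'Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# At the minimum of S, the residuals are orthogonal to the columns of X.
resid = Y - X @ beta_hat
print(np.abs(X.T @ resid).max())  # ~0, up to floating point error
```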

Extension to General and Generalised Linear Models

Multiple regression (like many parametric tests, including t- and F-tests, ANOVAs, ANCOVAs, etc.) is a restricted form of the general and generalised linear models, subject to certain constraints, in particular:
- Only one dependent variable can be analysed
- The errors are assumed to be independently, identically and normally distributed, with mean 0 and variance σ² (written ε ~ iid N(0, σ²))

Extension to General and Generalised Linear Models

The General Linear Model allows linear combinations of multiple dependent variables (multivariate statistics), replacing the Y vector of J observations of a single variable with a J×N matrix of J observations of N different variables; correspondingly, the β vector is replaced by an L×N matrix, with one column of parameters per dependent variable. While an fMRI experiment could therefore be modelled with a Y matrix reflecting the BOLD signal at N voxels over J scans, SPM takes a mass univariate approach: each voxel is represented by its own column vector of observations over scans and passed through the same model. Generalised Linear Models do not assume spherical error distributions, and hence can be used to correct for temporal correlations (this will be covered in a later talk).
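The mass univariate approach amounts to solving the same design matrix against every voxel's column at once; a sketch with invented dimensions:

```python
import numpy as np

rng = np.random.default_rng(4)
J, L, N = 100, 2, 5000   # scans, regressors, voxels (illustrative sizes)

X = rng.normal(size=(J, L))          # one design matrix shared by every voxel
B_true = rng.normal(size=(L, N))     # one column of betas per voxel
Y = X @ B_true + rng.normal(scale=0.5, size=(J, N))

# Mass univariate fit: lstsq solves the same X against every column of Y,
# returning an L x N matrix of estimates, one column per voxel.
B_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(B_hat.shape)  # (2, 5000)
```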

Overdetermined Models and Pseudoinversion

If the design matrix X has columns which are not linearly independent then it is rank deficient and XᵀX has no inverse. In this case there are infinitely many parameter estimates that describe the model equally well, i.e. infinitely many least squares estimates satisfying (2); such a model is said to be overdetermined. Since we need a single set of parameters in order to construct our significance tests, a constraint must be applied to the estimates. The key point is that inference is only meaningful for functions of the parameters that are not influenced by the chosen constraint (estimable functions, such as differences between condition effects). SPM uses a pseudoinverse solution: the pseudoinverse (XᵀX)⁻ is substituted for (XᵀX)⁻¹ in eq. (2).
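A sketch of the pseudoinverse solution on a deliberately rank-deficient design: two condition indicator columns plus a constant column equal to their sum. The individual estimates depend on the constraint, but the difference between the condition effects is estimable (all numbers below are invented):

```python
import numpy as np

rng = np.random.default_rng(5)
J = 40

# Rank-deficient design: the constant column is the sum of the two
# condition indicators, so X'X has no ordinary inverse.
cond = rng.integers(0, 2, size=J)
X = np.column_stack([cond, 1 - cond, np.ones(J)])
Y = 3.0 * cond + 1.0 * (1 - cond) + rng.normal(scale=0.1, size=J)

# The pseudoinverse picks one particular solution out of infinitely many.
beta_pinv = np.linalg.pinv(X) @ Y

# beta_1 - beta_2 is unaffected by the choice of constraint, and recovers
# the true condition difference of 2.0 (up to noise).
print(beta_pinv[0] - beta_pinv[1])
```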

GLM and fMRI Models

We have looked so far at multiple regression and the general linear model in a fairly abstract context. We shall now think about how the model Y = Xβ + ε applies to fMRI experiments:
- Observed data (Y): SPM uses a mass univariate approach, so each voxel is treated as a separate column vector of data.
- Design matrix (X): formed of several components which explain the observed data: timing information, consisting of onset vectors O^m and duration vectors D^m for each trial type m; an impulse response function h^m describing the shape of the expected BOLD response; and other regressors, e.g. movement parameters.
- Parameters (β): define the contribution of each component of the design matrix to the model. They are estimated so as to minimise the error, and are used to generate the contrasts between conditions (next week).
- Error (ε): the difference between the observed data and the model defined by Xβ. In fMRI the errors are not assumed to be spherical (there are temporal correlations).

GLM and fMRI Models

The design of the experiment is principally defined by:
- The stimulus function S^m, representing the occurrence of stimulus type m in each of a series of contiguous time bins. This is generated by SPM2 from the onset vector O^m and the duration vector D^m.
- The impulse response function h^m for trial type m.

The observed data, Y, are then expressed as a sum over trial types of the stimulus functions convolved with their impulse responses:

Y = Σ_m (h^m ∗ S^m) + ε

The impulse response functions are not known, but SPM assumes that they can be modelled as linear combinations of basis functions b_i, such that:

h^m = Σ_i β^m_i b_i

A typical basis function set might comprise the haemodynamic response function (HRF) and its partial derivatives with respect to time and dispersion.
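A rough sketch of building one fMRI regressor by convolving a boxcar stimulus function with an assumed HRF shape; the gamma parameters, TR, and onsets below are illustrative assumptions, not SPM's exact defaults:

```python
import numpy as np
from math import gamma as gamma_fn

TR = 2.0                   # repetition time in seconds (assumed)
t = np.arange(0, 30, TR)   # time axis for the impulse response

def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density, used to shape the assumed HRF."""
    return (t ** (shape - 1) * np.exp(-t / scale)) / (gamma_fn(shape) * scale ** shape)

# Rough canonical-HRF shape: an early peak minus a late undershoot.
hrf = gamma_pdf(t, 6.0) - 0.35 * gamma_pdf(t, 16.0)
hrf /= hrf.max()

# Stimulus function S: a boxcar with onsets at scans 5 and 25, duration 3 scans.
S = np.zeros(50)
for onset in (5, 25):
    S[onset:onset + 3] = 1.0

# The regressor for this trial type is the convolution h * S,
# truncated to the length of the scan series.
regressor = np.convolve(S, hrf)[:len(S)]
print(regressor.shape)  # (50,)
```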

GLM and fMRI Models

How does this look with data? [Figure: a voxel time series showing the observed data, the model (green and red) and the true signal (blue), plus the error and noise, which the parameter estimates are chosen to minimise.]

Summary

- The general linear model is a powerful statistical tool allowing determination of multiple parameters predicting multiple dependent variables.
- Many other parametric tests are special cases of the general linear model (t-tests, F-tests, ANOVAs, regression).
- The design matrix contains the information about the designed aspects of the experiment which may explain the observed data.
- Minimising the sum of squared differences between the modelled and observed data yields the optimal parameters for the model.
- The parameters can then be used to construct t- and F-tests of the significance of contrasts between experimental factors (more next week).
- In fMRI we convolve the impulse response functions with the timing information for the different trial types to give the design matrix.
- We must also use a Generalised Linear Model to allow correction for temporal correlations over scans (more in a few weeks).