Bayesian models for fMRI data: Methods & models for fMRI data analysis, 06 May 2009. Klaas Enno Stephan, Laboratory for Social and Neural Systems Research.


Bayesian models for fMRI data
Methods & models for fMRI data analysis, 06 May 2009
Klaas Enno Stephan
Laboratory for Social and Neural Systems Research, Institute for Empirical Research in Economics, University of Zurich
Functional Imaging Laboratory (FIL), Wellcome Trust Centre for Neuroimaging, University College London
With many thanks for slides & images to: FIL Methods group, particularly Guillaume Flandin
The Reverend Thomas Bayes

Why do I need to learn about Bayesian stats? Because SPM is getting more and more Bayesian:
– Segmentation & spatial normalisation
– Posterior probability maps (PPMs): 1st level with specific spatial priors, 2nd level with global spatial priors
– Dynamic Causal Modelling (DCM)
– Bayesian Model Selection (BMS)
– EEG: source reconstruction

[Figure: the standard SPM analysis pipeline (image time-series → realignment → smoothing → spatial normalisation with a template → general linear model with design matrix → parameter estimates → statistical parametric map, with statistical inference via Gaussian field theory at p < 0.05), annotated with where Bayesian methods enter: Bayesian segmentation and normalisation, spatial priors on activation extent, posterior probability maps (PPMs), and Dynamic Causal Modelling.]

Problems of classical (frequentist) statistics
p-value: the probability of getting the observed data in the effect's absence, i.e. the probability of observing the data y given no effect (θ = 0). If it is small, reject the null hypothesis that there is no effect.
Limitations:
– One can never accept the null hypothesis
– Given enough data, one can always demonstrate a significant effect
– Correction for multiple comparisons is necessary
Solution: infer the posterior probability of the effect, i.e. the probability of the effect given the observed data.

Overview of topics
– Bayes' rule
– Bayesian update rules for Gaussian densities
– Bayesian analyses in SPM: segmentation & spatial normalisation; posterior probability maps (PPMs) with 1st-level (specific) and 2nd-level (global) spatial priors; Bayesian Model Selection (BMS)

Bayesian statistics
posterior ∝ likelihood ∙ prior
Bayes' theorem allows one to formally incorporate prior knowledge into computing statistical probabilities. Priors can be of different sorts: empirical, principled, or shrinkage priors. The "posterior" probability of the parameters given the data is an optimal combination of prior knowledge and new data (the likelihood), weighted by their relative precision.

Bayes in motion - an animation

Bayes' rule
Given data y and parameters θ, the conditional probabilities are p(y | θ) = p(y, θ) / p(θ) and p(θ | y) = p(y, θ) / p(y). Eliminating p(y, θ) gives Bayes' rule:
posterior = likelihood ∙ prior / evidence
p(θ | y) = p(y | θ) p(θ) / p(y)
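As a minimal illustration of the rule above (not part of the original slides), the following Python sketch evaluates Bayes' rule numerically on a grid for a hypothetical one-parameter example; all numbers and variable names are made up for this sketch.

```python
# Minimal numeric sketch of Bayes' rule on a discrete grid (toy example, not SPM code).
import numpy as np
from scipy.stats import norm

theta = np.linspace(-5, 5, 1001)               # grid of candidate effect sizes
prior = norm.pdf(theta, loc=0, scale=2)        # prior p(theta)
y = 1.5                                        # observed data point
likelihood = norm.pdf(y, loc=theta, scale=1)   # p(y | theta) evaluated on the grid

evidence = np.trapz(likelihood * prior, theta)     # p(y) = integral of p(y|theta) p(theta)
posterior = likelihood * prior / evidence          # Bayes' rule
print(theta[np.argmax(posterior)])                 # posterior mode, pulled toward the prior
```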

Principles of Bayesian inference
1. Formulation of a generative model: likelihood p(y | θ) and prior distribution p(θ)
2. Observation of data y
3. Update of beliefs based upon observations, given a prior state of knowledge

Posterior mean & variance of univariate Gaussians
Likelihood and prior: p(y | μ) = N(y; μ, σ²_e), p(μ) = N(μ; μ_p, σ²_p).
Posterior: p(μ | y) = N(μ; μ_post, σ²_post), with
1/σ²_post = 1/σ²_e + 1/σ²_p
μ_post = σ²_post ∙ (μ_p/σ²_p + y/σ²_e)
The posterior mean is a variance-weighted combination of the prior mean and the data mean.
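A minimal numeric sketch of this univariate Gaussian update, assuming known noise and prior variances; the toy values below are illustrative only.

```python
# Univariate Gaussian update (precision weighting); variable names are illustrative.
mu_p, var_p = 0.0, 4.0      # prior mean and variance
y,    var_e = 2.0, 1.0      # data (likelihood mean) and noise variance

prec_post = 1.0 / var_p + 1.0 / var_e              # posterior precision = sum of precisions
var_post  = 1.0 / prec_post
mu_post   = var_post * (mu_p / var_p + y / var_e)  # variance-weighted (precision-weighted) mean
print(mu_post, var_post)    # posterior is pulled toward the more precise source
```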

Relative precision weighting
The same result expressed with precisions λ = 1/σ²:
λ_post = λ_e + λ_p
μ_post = (λ_e ∙ y + λ_p ∙ μ_p) / λ_post

Relative precision weighting: hierarchical perspective
The same update written as an explicit two-level hierarchy: y = μ + ε^(1) at the first level and μ = μ_p + ε^(2) at the second level, with precisions λ_e and λ_p. The posterior on μ again weights the two levels by their relative precisions.

Bayesian GLM: univariate case
Univariate linear model: y = x∙θ + e, with normal densities for the noise, e ~ N(0, 1/λ_e), and the prior, θ ~ N(θ_p, 1/λ_p). Relative precision weighting again gives the posterior:
λ_post = λ_p + λ_e ∙ x²
θ_post = (λ_e ∙ x ∙ y + λ_p ∙ θ_p) / λ_post

Bayesian GLM: multivariate case
General Linear Model: y = Xθ + ε, with normal densities for the noise, ε ~ N(0, C_ε), and the prior, θ ~ N(η_p, C_p). The posterior covariance and mean are
C_post = (X^T C_ε^{-1} X + C_p^{-1})^{-1}
η_post = C_post (X^T C_ε^{-1} y + C_p^{-1} η_p)
One step if C_ε is known; otherwise iterative estimation with EM.
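A small numeric sketch of the one-step multivariate posterior described above, assuming C_ε and the prior are known; the design matrix and covariances are toy values, not SPM output.

```python
# Multivariate Bayesian GLM posterior for known covariances (one-step case).
import numpy as np

X   = np.array([[1., 0.], [1., 1.], [1., 2.], [1., 3.]])  # toy design matrix
y   = np.array([0.9, 2.1, 2.9, 4.2])                      # toy data
C_e = 0.5 * np.eye(4)        # observation noise covariance C_epsilon
C_p = 10.0 * np.eye(2)       # prior covariance of the parameters
m_p = np.zeros(2)            # prior mean

# posterior precision = X' C_e^-1 X + C_p^-1; posterior mean combines data and prior
prec_post = X.T @ np.linalg.inv(C_e) @ X + np.linalg.inv(C_p)
C_post    = np.linalg.inv(prec_post)
m_post    = C_post @ (X.T @ np.linalg.inv(C_e) @ y + np.linalg.inv(C_p) @ m_p)
print(m_post)
```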

An intuitive example

Less intuitive

Even less intuitive

"Today's posterior is tomorrow's prior": Bayesian (fixed effects) group analysis
Likelihood distributions from different subjects are independent, so one can use the posterior from one subject as the prior for the next. Under Gaussian assumptions this is easy to compute: the group posterior precision is the sum of the individual posterior precisions, and the group posterior mean is the corresponding precision-weighted combination of the individual posterior means:
C_group^{-1} = Σ_i C_i^{-1}
m_group = C_group ∙ Σ_i C_i^{-1} m_i
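A hedged sketch of this fixed-effects fusion under Gaussian assumptions: individual subject posterior precisions are summed and their means combined by precision weighting. The subject means and covariances below are invented for illustration.

```python
# Fixed-effects Bayesian group fusion under Gaussian assumptions (toy numbers).
import numpy as np

subj_means = [np.array([1.2]), np.array([0.8]), np.array([1.0])]
subj_covs  = [np.array([[0.4]]), np.array([[0.3]]), np.array([[0.5]])]

prec_group = sum(np.linalg.inv(C) for C in subj_covs)          # sum of posterior precisions
C_group    = np.linalg.inv(prec_group)
m_group    = C_group @ sum(np.linalg.inv(C) @ m                # precision-weighted means
                           for C, m in zip(subj_covs, subj_means))
print(m_group, C_group)
```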

Bayesian analyses in SPM5 Segmentation & spatial normalisation Posterior probability maps (PPMs) –1 st level: specific spatial priors –2 nd level: global spatial priors Dynamic Causal Modelling (DCM) Bayesian Model Selection (BMS) EEG: source reconstruction

Spatial normalisation: Bayesian regularisation
Deformations consist of a linear combination of smooth basis functions (the lowest frequencies of a 3D discrete cosine transform).
Find maximum a posteriori (MAP) estimates of the deformation parameters by simultaneously minimising
– the squared difference between template and source image, and
– the squared distance between the parameters and their expected values (regularisation).
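As a hedged sketch only (the notation is illustrative and not SPM's exact parameterisation): the MAP objective described above can be written as a regularised least-squares problem, where b are the deformation parameters, g the template, f the source image warped by the deformation φ_b, σ_e² the residual variance, and C_b the prior covariance of the parameters.

```latex
% Illustrative MAP objective for Bayesian regularisation of spatial normalisation
% (all symbols are assumptions of this sketch, not SPM's notation)
\hat{b}_{\mathrm{MAP}}
  = \arg\min_{b} \;
    \underbrace{\frac{1}{2\sigma_e^{2}} \sum_{i}
      \bigl( g(x_i) - f(\phi_b(x_i)) \bigr)^{2}}_{\text{difference between template and source}}
  \; + \;
    \underbrace{\tfrac{1}{2}\, b^{\top} C_b^{-1}\, b}_{\text{regularisation (prior)}}
```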

Bayesian segmentation with empirical priors
Goal: for each voxel, compute the probability that it belongs to a particular tissue type, given its intensity:
p(tissue | intensity) ∝ p(intensity | tissue) ∙ p(tissue)
Likelihood model: intensities are modelled by a mixture of Gaussian distributions representing different tissue classes (e.g. GM, WM, CSF). Priors are obtained from tissue probability maps (segmented images of 151 subjects).
Ashburner & Friston 2005, NeuroImage
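As a toy illustration of the per-voxel posterior p(tissue | intensity) (not Ashburner & Friston's implementation), the sketch below combines a Gaussian likelihood per tissue class with a tissue-probability-map prior; all class means, standard deviations and priors are made up.

```python
# Toy per-voxel tissue posterior: mixture-of-Gaussians likelihood times tissue prior.
import numpy as np
from scipy.stats import norm

tissues   = ["GM", "WM", "CSF"]
means     = np.array([0.55, 0.80, 0.20])    # assumed expected intensities per class
sds       = np.array([0.08, 0.06, 0.10])    # assumed intensity standard deviations
prior     = np.array([0.50, 0.30, 0.20])    # prior from the tissue probability map at this voxel

intensity = 0.6                              # observed voxel intensity
like      = norm.pdf(intensity, means, sds)  # p(intensity | tissue)
post      = like * prior / np.sum(like * prior)
print(dict(zip(tissues, post.round(3))))     # p(tissue | intensity)
```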

Unified segmentation & normalisation
Circular relationship between segmentation & normalisation:
– Knowing which tissue type a voxel belongs to helps normalisation.
– Knowing where a voxel is (in standard space) helps segmentation.
Build a joint generative model:
– model how voxel intensities result from a mixture of tissue type distributions
– model how the tissue types of one brain have to be spatially deformed to match those of another brain
Using a priori knowledge about the parameters: adopt a Bayesian approach and maximise the posterior probability.
Ashburner & Friston 2005, NeuroImage

Bayesian fMRI analyses
General Linear Model: y = Xθ + ε with ε ~ N(0, C_ε). What are the priors on θ?
– In "classical" SPM: no priors (= "flat" priors)
– Full Bayes: priors are predefined on a principled or empirical basis
– Empirical Bayes: priors are estimated from the data, assuming a hierarchical generative model → PPMs in SPM
The parameters of one level act as priors for the distribution of the parameters at the level below; parameters and hyperparameters at each level can be estimated using EM.

Hierarchical models and Empirical Bayes
[Figure: a hierarchical model and its single-level reformulation; the corresponding estimation schemes coincide: EM = Parametric Empirical Bayes (PEB) = Restricted Maximum Likelihood (ReML).]

Posterior Probability Maps (PPMs)
Posterior distribution: probability of the effect given the data. Its mean reflects the size of the effect; its precision reflects the variability.
Posterior probability map: an image of the probability (confidence) that an activation exceeds some specified threshold γ, given the data y.
Two thresholds:
– activation threshold γ: a percentage of the whole-brain mean signal (physiologically relevant size of effect)
– a probability threshold that voxels must exceed to be displayed (e.g. 95%)
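A minimal sketch of how the displayed quantity at one voxel could be computed, assuming a Gaussian posterior with known mean and standard deviation (toy numbers, not SPM's code): the PPM value is the posterior probability that the effect exceeds the activation threshold γ.

```python
# PPM value at one voxel: p(theta > gamma | y) under a Gaussian posterior.
from scipy.stats import norm

m_post, sd_post = 0.9, 0.35   # posterior mean and standard deviation at this voxel (toy)
gamma  = 0.5                  # activation threshold (e.g. % of whole-brain mean signal)
ppm    = 1 - norm.cdf(gamma, loc=m_post, scale=sd_post)  # posterior probability of exceeding gamma
print(ppm, ppm > 0.95)        # display the voxel only if confidence exceeds 95%
```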

PPMs vs. SPMs
Classical t-test (SPMs): inference based on the probability of the data under the null, p(y | θ = 0).
Bayesian test (PPMs): inference based on the posterior probability that the effect exceeds a threshold, p(θ > γ | y), obtained by combining likelihood and prior.

2nd level PPMs with global priors
1st level (GLM): y = Xθ + ε^(1)
2nd level (shrinkage prior): θ = 0 + ε^(2)
In the absence of evidence to the contrary, parameters will shrink to zero.
Basic idea: use the variance of θ over voxels as the prior variance of θ at any particular voxel. At the 2nd level, the prior mean is the average effect over voxels and ε^(2) captures the voxel-to-voxel variation. θ reflects regionally specific effects, which are assumed to sum to zero over all voxels → a shrinkage prior (mean zero) at the second level, whose variance is implicitly estimated by estimating the covariance of ε^(2).

Shrinkage priors
[Figure: four cases: a small & variable effect, a large & variable effect, a small but clear effect, and a large & clear effect.]

2nd level PPMs with global priors
1st level (GLM): y = Xθ + ε^(1)
2nd level (shrinkage prior): θ = 0 + ε^(2)
We are looking for the same effect over multiple voxels → pooled estimation of the prior covariance C_θ over voxels (voxel-specific data, global pooled estimate of the prior).
Once C_ε and C_θ are known, we can apply the usual rule for computing the posterior mean & covariance:
C_{θ|y} = (X^T C_ε^{-1} X + C_θ^{-1})^{-1}
η_{θ|y} = C_{θ|y} X^T C_ε^{-1} y
Friston & Penny 2003, NeuroImage
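A small numeric sketch of the resulting posterior at one voxel under the zero-mean shrinkage prior, assuming C_ε and the pooled C_θ are known; all matrices below are toy values.

```python
# Posterior at one voxel under a global (pooled) zero-mean shrinkage prior.
import numpy as np

X       = np.array([[1.], [1.], [1.], [1.]])   # toy single-regressor design
y       = np.array([0.4, 0.7, 0.5, 0.6])       # toy data at one voxel
C_eps   = 0.2 * np.eye(4)                      # 1st-level error covariance
C_theta = np.array([[0.05]])                   # pooled 2nd-level (prior) covariance

prec_post = X.T @ np.linalg.inv(C_eps) @ X + np.linalg.inv(C_theta)
C_post    = np.linalg.inv(prec_post)
m_post    = C_post @ (X.T @ np.linalg.inv(C_eps) @ y)   # prior mean is zero -> shrinkage toward 0
print(m_post, C_post)
```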

PPMs and multiple comparisons
No need to correct for multiple comparisons: thresholding a PPM at 95% confidence means that, in every displayed voxel, the posterior probability of an activation ≥ γ is ≥ 95%. At most 5% of the voxels identified could have activations less than γ. Independent of the search volume, thresholding a PPM thus puts an upper bound on the false discovery rate.

PPMs vs. SPMs
PPMs: show activations greater than a given size.
SPMs: show voxels with non-zero activations.

PPMs: pros and cons
Advantages:
– one can infer that a cause did not elicit a response
– inference is independent of search volume
– SPMs conflate effect size and effect variability, PPMs do not
Disadvantages:
– estimating priors over voxels is computationally demanding
– practical benefits are yet to be established
– thresholds other than zero require justification

1st level PPMs with local spatial priors
Neighbouring voxels are often not independent, and spatial dependencies vary across the brain, yet spatial smoothing in SPM is uniform. Matched filter theorem: SNR is maximal when smoothing the data with a kernel that matches the smoothness of the true signal.
Basic idea: estimate regional spatial dependencies from the data and use them as a prior in a PPM → regionally specific smoothing → markedly increased sensitivity.
[Figure: contrast map and AR(1) map.]
Penny et al. 2005, NeuroImage

The generative spatio-temporal model
[Figure: graphical model of the spatio-temporal GLM Y = XW + E, with α = spatial precision of the parameters W, λ = observation noise precision, and β = precision of the AR coefficients.]
Penny et al. 2005, NeuroImage

The spatial prior
Prior for the k-th parameter image: a shrinkage prior whose covariance is defined through a spatial kernel matrix S, with a spatial precision parameter that determines the amount of smoothness. Different choices are possible for the spatial kernel matrix S. Currently used in SPM: a Laplacian prior (the same as in LORETA).
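For intuition only, here is a sketch of a 1-D Laplacian-style spatial kernel matrix of the kind referred to above (SPM works with the 3-D analogue); taking the prior precision to be the spatial precision times SᵀS is an assumption of this sketch, not a statement of SPM's exact parameterisation.

```python
# 1-D Laplacian spatial kernel: penalises differences between neighbouring voxels.
import numpy as np

n = 5                                                   # toy chain of 5 voxels
S = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)    # discrete Laplacian (tridiagonal)
S[0, 0] = S[-1, -1] = 1                                 # chain endpoints have one neighbour
alpha = 4.0                                             # spatial precision (controls smoothness)
prior_prec = alpha * (S.T @ S)                          # assumed precision of the spatial prior
print(np.round(prior_prec, 1))
```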

Example: application to event-related fMRI data
[Figure: contrast maps for familiar vs. non-familiar faces, obtained with (i) smoothing, (ii) a global spatial prior, and (iii) a Laplacian prior.]

SPM5 graphical user interface

Bayesian model selection (BMS)
Given competing hypotheses about the structure & functional mechanisms of a system, which model is the best? For which model m does p(y | m) become maximal? Which model represents the best balance between model fit and model complexity?
Pitt & Myung (2002), TICS

Bayesian model selection (BMS)
Model evidence: p(y | m) = ∫ p(y | θ, m) p(θ | m) dθ. It accounts for both accuracy and complexity of the model and allows inference about the structure (generalisability) of the model. Various approximations exist, e.g. the negative free energy, AIC, and BIC.
Bayes' rule at the parameter level: p(θ | y, m) = p(y | θ, m) p(θ | m) / p(y | m).
Model comparison via the Bayes factor: BF_12 = p(y | m_1) / p(y | m_2).
Penny et al. (2004) NeuroImage
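As an illustration of model comparison via approximate log evidences (here using BIC, one of the approximations named above), with entirely made-up log-likelihoods and model sizes:

```python
# Comparing two models via BIC-approximated log evidences and the Bayes factor.
import numpy as np

def bic_log_evidence(log_likelihood, n_params, n_data):
    # BIC approximation to the log model evidence (up to a constant)
    return log_likelihood - 0.5 * n_params * np.log(n_data)

logev_m1 = bic_log_evidence(log_likelihood=-520.0, n_params=6, n_data=200)
logev_m2 = bic_log_evidence(log_likelihood=-512.0, n_params=8, n_data=200)

bayes_factor_21 = np.exp(logev_m2 - logev_m1)   # BF well above ~20 is usually read as strong evidence
print(bayes_factor_21)
```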

Example: BMS of dynamic causal models
Questions: modulation of the backward or the forward connection? additional driving effect of attention on PPC? bilinear or nonlinear modulation of the forward connection?
[Figure: four DCMs (M1–M4) with regions V1, V5, and PPC, driving input 'stim', and modulatory input 'attention'.]
M2 better than M1: BF = 2966. M3 better than M2: BF = 12. M4 better than M3: BF = 23.
Stephan et al. (2008) NeuroImage

Thank you