An introduction to Bayesian inference and model comparison
J. Daunizeau
ICM, Paris, France & TNU, Zurich, Switzerland

Overview of the talk
- An introduction to probabilistic modelling
- Bayesian model comparison
- SPM applications


Probability theory: basics
Desiderata for degrees of plausibility:
- they should be represented using real numbers (D1)
- they should conform with intuition (D2): e.g., a higher degree of plausibility should correspond to a higher number
- they should be consistent (D3): i.e., they should contain no contradictions; when a result can be derived in different ways, all derivations should yield the same result, and equal states of belief should correspond to equal degrees of plausibility
The Bayesian theory of probabilities furnishes:
- a language for representing information and uncertainty;
- calculus tools for manipulating these notions in a coherent manner.
It is an extension of formal logic, which works with propositions that are either true or false. Probability theory instead works with propositions that are more or less true or more or less false; such propositions are characterized by their degree of plausibility, which can be measured on a probability scale (summing to 1):
- normalization: $\sum_a p(a) = 1$
- marginalization: $p(a) = \sum_b p(a,b)$
- conditioning (Bayes rule): $p(a|b) = \frac{p(a,b)}{p(b)} = \frac{p(b|a)\,p(a)}{p(b)}$
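As a minimal numerical illustration of these three rules (my own example, not from the slides), the sketch below applies them to a small discrete joint distribution over two binary variables:

```python
import numpy as np

# Hypothetical joint distribution p(a, b) over two binary variables.
p_ab = np.array([[0.30, 0.10],   # a = 0
                 [0.20, 0.40]])  # a = 1

assert np.isclose(p_ab.sum(), 1.0)   # normalization: sum over (a, b) of p(a,b) = 1

p_a = p_ab.sum(axis=1)               # marginalization: p(a) = sum_b p(a,b)
p_b = p_ab.sum(axis=0)               # p(b) = sum_a p(a,b)

p_a_given_b = p_ab / p_b             # conditioning (Bayes rule): p(a|b) = p(a,b)/p(b)

print(p_a, p_b, p_a_given_b, sep="\n")
```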

Deriving the likelihood function
- Model of data with unknown parameters, e.g. the GLM: $y = X\theta + \varepsilon$
- But the data are noisy: the residuals $\varepsilon$ capture whatever the model does not explain.
- Assume the noise/residuals are 'small', e.g. Gaussian: $\varepsilon \sim N(0, \sigma^2 I)$
→ Distribution of the data, given fixed parameters: $p(y|\theta) = N(X\theta, \sigma^2 I)$
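A minimal sketch of the resulting likelihood function (assuming, as is standard for the GLM, i.i.d. Gaussian noise; the numbers below are made up):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

n, p = 100, 3
X = rng.normal(size=(n, p))                        # design matrix
theta_true = np.array([1.0, -0.5, 0.2])
sigma = 0.3
y = X @ theta_true + sigma * rng.normal(size=n)    # noisy data

def log_likelihood(theta, sigma, y, X):
    """log p(y | theta) under y = X theta + eps, eps ~ N(0, sigma^2 I)."""
    return norm.logpdf(y, loc=X @ theta, scale=sigma).sum()

print(log_likelihood(theta_true, sigma, y, X))
```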

Forward and inverse problems
- forward problem: the likelihood $p(y|\theta,m)$ maps from model parameters to data
- inverse problem: the posterior distribution $p(\theta|y,m)$ maps from data back to model parameters

Likelihood, priors and the model evidence
Bayes rule for a generative model m:
$p(\theta|y,m) = \frac{p(y|\theta,m)\,p(\theta|m)}{p(y|m)}$, where the model evidence $p(y|m) = \int p(y|\theta,m)\,p(\theta|m)\,d\theta$ normalizes the posterior.
EXAMPLE:
- y: EEG electric potential measured on the scalp
- θ: connection strength between two nodes of a network
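A hedged sketch of this update in a fully conjugate case (a one-parameter Gaussian example chosen for illustration; it is not from the slides):

```python
import numpy as np

# Prior p(theta|m) = N(mu0, s0^2); likelihood p(y_i|theta, m) = N(theta, s^2).
mu0, s0 = 0.0, 1.0
s = 0.5
y = np.array([0.9, 1.1, 0.8, 1.2])   # observed data (made up)

# Conjugate Gaussian update: posterior precision = prior + data precisions.
post_prec = 1 / s0**2 + len(y) / s**2
post_var = 1 / post_prec
post_mean = post_var * (mu0 / s0**2 + y.sum() / s**2)

print(post_mean, np.sqrt(post_var))  # posterior p(theta|y,m) = N(post_mean, post_var)
```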

Bayesian model comparison
Principle of parsimony: « plurality should not be assumed without necessity » ("Occam's razor").
The model evidence p(y|m) embodies this principle automatically: since p(y|m) must sum to 1 over the space of all data sets, a complex model that can fit many different data sets y = f(x) spreads its evidence thinly, whereas a simple model concentrates it on the data sets it actually predicts.
[Figure: model evidence p(y|m) plotted over the space of all data sets for a simple and a more flexible model y = f(x).]
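A toy numerical illustration of this razor (my own example, not from the deck): the evidence of a constant model versus a line model, obtained by integrating each likelihood over a standard normal prior on its parameters. The extra flexibility of the line costs it evidence when the data are truly flat.

```python
import numpy as np
from scipy.stats import norm
from scipy.integrate import trapezoid

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 20)
y = 0.5 + 0.1 * rng.normal(size=x.size)    # data truly generated by the constant model

sigma = 0.1                                 # known noise level
grid = np.linspace(-3, 3, 200)              # integration grid per parameter
prior = norm.pdf(grid)                      # N(0,1) prior on each parameter

def evidence_constant():
    # p(y|m0) = integral over c of p(y|c) p(c)
    lik = np.array([norm.pdf(y, loc=c, scale=sigma).prod() for c in grid])
    return trapezoid(lik * prior, grid)

def evidence_line():
    # p(y|m1) = double integral over (a, b) of p(y|a,b) p(a) p(b)
    lik = np.array([[norm.pdf(y, loc=a + b * x, scale=sigma).prod() for b in grid]
                    for a in grid])
    integrand = lik * prior[:, None] * prior[None, :]
    return trapezoid(trapezoid(integrand, grid, axis=1), grid)

print(evidence_constant(), evidence_line())  # the simpler model typically wins here
```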

Hierarchical models
[Figure: a stack of model levels linked by conditional densities; causality runs down the hierarchy, from causes to data, while inference runs back up, from data to causes.]

Directed acyclic graphs (DAGs)

Variational approximations (VB, EM, ReML)
→ VB: maximize the free energy
$F(q) = \langle \ln p(y,\theta|m) \rangle_q + H[q] = \ln p(y|m) - KL\left[q(\theta)\,\|\,p(\theta|y,m)\right]$
with respect to the approximate posterior q(θ), under some simplifying constraint (e.g., mean field, Laplace). Because the KL divergence is non-negative, F(q) is a lower bound on the log evidence; maximizing it both tightens the bound and pulls q(θ) toward the true posterior.
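A minimal sketch of this objective (my illustration, assuming a 1-D Gaussian q fitted via a Monte Carlo estimate of the free energy; this is not SPM's VB scheme):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(2)
y = rng.normal(loc=1.0, scale=0.5, size=30)          # toy data
eps = rng.normal(size=2000)                          # fixed base samples (reparameterization)

def log_joint(theta):
    """ln p(y, theta): N(0,1) prior, N(theta, 0.5) likelihood per data point."""
    return norm.logpdf(theta) + norm.logpdf(y, loc=theta, scale=0.5).sum()

def neg_free_energy(params):
    m, log_s = params
    s = np.exp(log_s)
    theta = m + s * eps                              # samples from q = N(m, s^2)
    entropy = 0.5 * np.log(2 * np.pi * np.e * s**2)  # H[q], closed form for a Gaussian
    return -(np.mean([log_joint(t) for t in theta]) + entropy)

res = minimize(neg_free_energy, x0=[0.0, 0.0], method="Nelder-Mead")
print(res.x)   # approximate posterior mean and log standard deviation
```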

Type, role and impact of priors
Types of priors:
- explicit priors on model parameters (e.g., Gaussian)
- implicit priors on the model's functional form (e.g., evolution and observation functions)
- choice of "interesting" data features (e.g., response magnitude vs. response profile)
Role of explicit priors (on model parameters):
- resolving the ill-posedness of the inverse problem
- avoiding overfitting (cf. generalization error)
Impact of priors:
- on parameter posterior distributions (cf. the "shrinkage to the mean" effect)
- on the model evidence (cf. "Occam's razor")

Overview of the talk
- An introduction to probabilistic modelling
- Bayesian model comparison
- SPM applications

Frequentist versus Bayesian inference
Classical (null) hypothesis testing:
- define the null, e.g. $H_0: \theta = 0$
- estimate the parameters and obtain a test statistic
- apply the decision rule: if the test statistic exceeds its critical value, then reject H0
Bayesian model comparison:
- define two alternative models over the space of all datasets, e.g. $m_0: \theta = 0$ vs. $m_1: \theta \neq 0$
- compute the model evidences and the posterior model probabilities
- apply the decision rule: if $p(m_0|y)$ exceeds a preset threshold, then accept m0
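A toy contrast of the two decision rules (illustration only; the 0.05 and 0.95 thresholds are conventional choices, and the BIC approximation to the model evidence is my simplification):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
y = rng.normal(loc=0.3, scale=1.0, size=40)

# Classical: one-sample t-test of H0: mean = 0.
t, p = stats.ttest_1samp(y, popmean=0.0)
print("reject H0" if p < 0.05 else "retain H0", p)

# Bayesian: BIC approximation to the evidences of m0 (mean = 0) vs m1 (mean free).
n = len(y)
ll0 = stats.norm.logpdf(y, loc=0.0, scale=np.sqrt(np.mean(y**2))).sum()
ll1 = stats.norm.logpdf(y, loc=y.mean(), scale=y.std(ddof=0)).sum()
bic0 = -2 * ll0 + 1 * np.log(n)                # m0 fits the variance only
bic1 = -2 * ll1 + 2 * np.log(n)                # m1 fits mean and variance
post_m0 = 1 / (1 + np.exp((bic0 - bic1) / 2))  # equal prior odds on m0, m1
print("accept m0" if post_m0 >= 0.95 else "no decision for m0", post_m0)
```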

Family-level inference
[Figure: four candidate network models of two nodes A and B driven by an input u, with posterior probabilities P(m1|y) = 0.04, P(m2|y) = 0.25, P(m3|y) = 0.01 and P(m4|y) = 0.70 (the transcript mislabels the last two as m2); with evidence spread over many similar models, the model selection error risk is high.]

Family-level inference
[Figure: the same four models grouped into two families; family inference pools the statistical evidence by summing posterior model probabilities within each family, P(f1|y) = 0.04 + 0.01 = 0.05 and P(f2|y) = 0.25 + 0.70 = 0.95, which lowers the selection error risk.]
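The pooling itself is just a sum over each family, as the slide's numbers confirm (the grouping of models into families is inferred from those numbers):

```python
p_m = {"m1": 0.04, "m2": 0.25, "m3": 0.01, "m4": 0.70}   # posterior model probabilities
families = {"f1": ["m1", "m3"], "f2": ["m2", "m4"]}      # assumed grouping

p_f = {f: sum(p_m[m] for m in members) for f, members in families.items()}
print(p_f)   # {'f1': 0.05, 'f2': 0.95}
```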

Sampling subjects as marbles in an urn
- $m_i = 1$: the ith marble is blue; $m_i = 0$: the ith marble is purple
- $r$: the frequency of blue marbles in the urn
- (binomial) probability of drawing a set of n marbles: $p(m_1,\dots,m_n|r) = \prod_{i=1}^n r^{m_i}(1-r)^{1-m_i}$
→ Thus, our belief about the frequency of blue marbles is the posterior $p(r|m_1,\dots,m_n) \propto p(m_1,\dots,m_n|r)\,p(r)$.
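A hedged sketch of that belief update, assuming a flat Beta(1,1) prior on the frequency r (the slide does not specify the prior):

```python
import numpy as np
from scipy.stats import beta

draws = np.array([1, 1, 0, 1, 0, 1, 1, 1])   # 1 = blue marble, 0 = purple (made up)
a0, b0 = 1, 1                                 # Beta(1,1) prior on r (assumption)

a = a0 + draws.sum()                          # conjugate Beta-binomial update
b = b0 + len(draws) - draws.sum()

print(beta(a, b).mean(), beta(a, b).interval(0.95))   # posterior belief about r
```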

RFX group-level model comparison
At the very least, we can measure how likely the ith subject's data are under each model: $p(y_i|m)$. Treating the model 'used' by each subject as a marble drawn from the urn, our belief about the frequency of models in the population is the posterior over $r = (r_1,\dots,r_K)$ given all subjects' data. The exceedance probability of model m is the posterior probability that its frequency exceeds that of every other model: $\varphi_m = P(r_m > r_{m'},\ \forall m' \neq m \mid y)$.
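A minimal sketch (illustrative; not SPM's actual RFX-BMS code) of exceedance probabilities computed by sampling a posterior Dirichlet over model frequencies:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical posterior Dirichlet counts over 3 models after group inversion.
alpha = np.array([3.5, 8.2, 2.3])

samples = rng.dirichlet(alpha, size=100_000)   # draws of model frequencies r
winners = samples.argmax(axis=1)               # which model is most frequent per draw
exceedance = np.bincount(winners, minlength=3) / len(winners)

print(exceedance)   # phi_m = P(r_m > r_m' for all m' != m | y), per model
```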

SPM: frequentist vs Bayesian RFX analysis
[Figure: parameter estimates plotted across subjects, contrasting the frequentist and the Bayesian treatment of random effects.]

Overview of the talk
- An introduction to probabilistic modelling
- Bayesian model comparison
- SPM applications

[Figure: the SPM pipeline — realignment → smoothing → normalisation (to a template) → general linear model → statistical inference via Gaussian field theory (p < 0.05) — annotated with its Bayesian components: segmentation and normalisation (posterior probability), posterior probability maps (PPMs), dynamic causal modelling, and multivariate decoding.]

aMRI segmentation: mixture of Gaussians (MoG) model
The ith voxel value $y_i$ is modelled as drawn from one of K tissue classes (grey matter, white matter, CSF), each with its own class mean $\mu_k$ and class variance $\sigma_k^2$; the label frequencies $\lambda_k$ are the mixing proportions:
$p(y_i) = \sum_k \lambda_k \, N(y_i;\, \mu_k, \sigma_k^2)$
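A minimal MoG sketch for voxel intensities (illustrative numbers; this is not SPM's segmentation routine):

```python
import numpy as np
from scipy.stats import norm

y = np.array([0.2, 0.45, 0.8, 0.75, 0.3])   # toy voxel intensities
mu = np.array([0.25, 0.50, 0.80])           # class means: CSF, GM, WM (assumed)
sd = np.array([0.05, 0.07, 0.05])           # class standard deviations
lam = np.array([0.2, 0.5, 0.3])             # label frequencies (mixing proportions)

# Posterior probability that each voxel belongs to each class (responsibilities).
lik = norm.pdf(y[:, None], loc=mu, scale=sd) * lam
resp = lik / lik.sum(axis=1, keepdims=True)

print(resp.round(3))   # rows: voxels; columns: CSF, GM, WM
```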

Decoding of brain images: recognizing brain states from fMRI
[Figure: trial structure (fixation cross → paced response) and two Bayesian model comparisons: the log-evidence of X-Y sparse mappings probes the effect of lateralization, and the log-evidence of X-Y bilateral mappings probes the effect of spatial deployment.]

fMRI time series analysis: spatial priors and model comparison
[Figure: a hierarchical GLM for the fMRI time series — design matrix X, GLM coefficients with a prior variance, AR coefficients for correlated data noise — and two PPMs: regions best explained by the short-term memory model (its design matrix X) versus regions best explained by the long-term memory model (its design matrix X).]
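A hedged sketch of how a PPM thresholds voxels, assuming a Gaussian posterior over each voxel's GLM coefficient (illustrative numbers, not the SPM implementation):

```python
import numpy as np
from scipy.stats import norm

post_mean = np.array([0.1, 0.8, 1.5, -0.2])   # posterior means of GLM coeffs, per voxel
post_sd = np.array([0.3, 0.3, 0.4, 0.2])      # posterior standard deviations

gamma = 0.5                                    # effect-size threshold
ppm = 1 - norm.cdf(gamma, loc=post_mean, scale=post_sd)   # P(beta > gamma | y)

print(ppm.round(3), ppm > 0.95)   # voxels shown in the PPM at 95% posterior confidence
```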

Dynamic Causal Modelling: network structure identification
[Figure: four candidate DCMs (m1–m4) over regions V1, V5 and PPC, driven by a visual stimulus, with attention modulating a different connection in each model; a bar plot of the models' marginal likelihoods (scale 5–15) favours m4, and the estimated effective synaptic strengths for the best model m4 are displayed on its graph: 1.25, 0.13, 0.46, 0.39, 0.26, 0.10.]

Thank you for your attention.

A note on statistical significance: lessons from the Neyman-Pearson lemma
Neyman-Pearson lemma: the likelihood ratio (or Bayes factor) test is the most powerful test of a given size $\alpha$ for testing the null. So: what is the threshold u above which the Bayes factor test yields a type I error rate of 5%?
[Figure: ROC analysis (1 − type II error rate against type I error rate) comparing MVB (Bayes factor, u = 1.09, power = 56%) with CCA (F-statistic, F = 2.20, power = 20%).]
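A minimal simulation sketch (my illustration, with an assumed N(0, τ²) prior on the effect under H1) of how such a threshold u can be calibrated: simulate data under the null, compute the Bayes factor each time, and take the 95th percentile.

```python
import numpy as np

rng = np.random.default_rng(5)
n, sims, tau = 20, 5000, 1.0     # tau: prior sd of the effect under H1 (assumption)

def log_bf(y):
    """log p(y|H1)/p(y|H0) for y_i ~ N(theta, 1), with theta ~ N(0, tau^2) under H1.

    The marginal of y under H1 is Gaussian, so the log Bayes factor has a closed form.
    """
    s = y.sum()
    return -0.5 * np.log(1 + n * tau**2) + 0.5 * s**2 * tau**2 / (1 + n * tau**2)

bf_null = np.array([log_bf(rng.normal(size=n)) for _ in range(sims)])
u = np.exp(np.quantile(bf_null, 0.95))   # Bayes factor threshold with 5% type I error

print(u)
```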