# Hierarchical Models and Variance Components

Hierarchical Models and Variance Components
Will Penny, Wellcome Department of Imaging Neuroscience, University College London, UK
SPM Course, London, May 2003

Outline
Random Effects Analysis: summary statistic approach (2nd level)
General Framework: multiple variance components and hierarchical models
Multiple variance components: F-tests at the 2nd level; modelling fMRI serial correlation at the 1st level
Hierarchical models for Bayesian Inference: SPMs versus PPMs

Random Effects Analysis: Summary-Statistic Approach
[Figure: 1st level: data, design matrix and contrast images, with estimates shown for subjects 1, 2, ..., 11, 12. 2nd level: one-sample t-test on the contrast images, giving SPM(t).]

Validity of approach
The gold standard approach is EM (see later), which estimates the population mean effect as $\mathrm{MEAN}_{EM}$ and the variance of this estimate as $\mathrm{VAR}_{EM}$. For N subjects, n scans per subject and equal within-subject variance, we have
$$\mathrm{VAR}_{EM} = \frac{\mathrm{Var}_{between}}{N} + \frac{\mathrm{Var}_{within}}{Nn}$$
In this case the summary-statistic (SS) approach gives the same results, on average:
$$\mathrm{Avg}[\hat{\alpha}] = \mathrm{MEAN}_{EM}, \qquad \mathrm{Avg}[\mathrm{Var}(\hat{\alpha})] = \mathrm{VAR}_{EM}$$
In other cases, with N ~ 12 and typical ratios of between-subject to within-subject variance found in fMRI, the SS approach will give very similar results to EM.
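The variance identity for the balanced case can be checked numerically. The following is a minimal simulation sketch, not part of the original slides: it assumes that each subject's first-level estimate is simply the mean of that subject's n scans, and the function name `simulate_summary_statistic` and all parameter values are illustrative choices.

```python
import numpy as np

def simulate_summary_statistic(N=12, n=50, var_between=1.0, var_within=4.0,
                               n_sims=4000, seed=0):
    """Monte Carlo check of the summary-statistic (SS) approach.

    Assumption (not from the slides): the first-level estimate for each
    subject is the plain average of that subject's n scans.
    """
    rng = np.random.default_rng(seed)
    # True subject effects vary around a population mean of zero.
    subject_effects = rng.normal(0.0, np.sqrt(var_between), size=(n_sims, N))
    # Scan-level data: subject effect plus within-subject noise.
    scans = subject_effects[:, :, None] + rng.normal(
        0.0, np.sqrt(var_within), size=(n_sims, N, n))
    first_level = scans.mean(axis=2)                    # one summary value per subject
    pop_mean = first_level.mean(axis=1)                 # 2nd-level estimate of the mean
    var_of_mean = first_level.var(axis=1, ddof=1) / N   # its estimated variance
    return pop_mean, var_of_mean

pop_mean, var_of_mean = simulate_summary_statistic()
theory = 1.0 / 12 + 4.0 / (12 * 50)   # Var-between/N + Var-within/(N*n)
print(f"theory               : {theory:.4f}")
print(f"Avg[Var(estimate)]   : {var_of_mean.mean():.4f}")
print(f"Var of the estimates : {pop_mean.var():.4f}")
```

Both simulated quantities should land close to the theoretical value, which is the sense in which the SS approach is valid "on average" in the balanced case.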

Example: Multi-session study of auditory processing
[Figure: SS results and EM results.] Friston et al. (2003), Mixed effects and fMRI studies, submitted.

Two populations
[Figure: contrast images and estimated population means for the two populations; two-sample t-test at the 2nd level.]

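The General Linear Model
$$y = X\theta + e$$
where y is N × 1, X is N × L, θ is L × 1 and e is N × 1.
Error covariance (N × N): $C_e = \sigma^2 I_N$
Basic assumptions: the errors are identically distributed and independent. We assume ‘sphericity’.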

Multiple variance components
$$y = X\theta + e$$
where y is N × 1, X is N × L, θ is L × 1 and e is N × 1.
Error covariance (N × N): errors can now have different variances and there can be correlations. We allow for ‘nonsphericity’.

Non-Sphericity
Error covariance, two cases:
Errors are independent but not identical.
Errors are not independent and not identical.
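To make the cases concrete, the sketch below builds small example covariance matrices for spherical errors, for independent but non-identical errors, and for errors that are neither independent nor identical. The matrices and the helper names `is_spherical` and `is_independent` are hypothetical illustrations, not taken from the slides.

```python
import numpy as np

# Hypothetical example with N = 4 observations.
N = 4

# Sphericity: identical and independent errors, C = sigma^2 * I.
C_spherical = 2.0 * np.eye(N)

# Independent but not identical: diagonal covariance with unequal variances.
C_unequal = np.diag([1.0, 1.0, 3.0, 3.0])

# Not independent and not identical: unequal variances plus covariances.
C_full = np.array([[1.0, 0.5, 0.0, 0.0],
                   [0.5, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 3.0, 1.2],
                   [0.0, 0.0, 1.2, 3.0]])

def is_spherical(C, tol=1e-12):
    """True if C is a scalar multiple of the identity matrix."""
    return np.allclose(C, C[0, 0] * np.eye(len(C)), atol=tol)

def is_independent(C, tol=1e-12):
    """True if all off-diagonal covariances are zero."""
    return np.allclose(C, np.diag(np.diag(C)), atol=tol)

for name, C in [("spherical", C_spherical), ("unequal", C_unequal), ("full", C_full)]:
    print(name, "spherical:", is_spherical(C), "independent:", is_independent(C))
```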

General Framework
Hierarchical models with multiple variance components at each level.
With hierarchical models we can define priors and make Bayesian inferences. If we know the variance components, we can compute the distributions over the parameters at each level.

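Estimation: EM algorithm
E-step:
$$C_{\theta|y} = (X^T C_e^{-1} X)^{-1}, \qquad \eta_{\theta|y} = C_{\theta|y} X^T C_e^{-1} y$$
M-step:
$$r = y - X\eta_{\theta|y}, \qquad P = C_e^{-1} - C_e^{-1} X C_{\theta|y} X^T C_e^{-1}$$
for i and j {
$$g_i = -\tfrac{1}{2}\,\mathrm{tr}\{P Q_i\} + \tfrac{1}{2}\, r^T C_e^{-1} Q_i C_e^{-1} r$$
$$J_{ij} = \tfrac{1}{2}\,\mathrm{tr}\{P Q_i P Q_j\}$$
}
$$\lambda \leftarrow \lambda + J^{-1} g, \qquad C_e = \sum_k \lambda_k Q_k$$
Friston, K. et al. (2002), Neuroimage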
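One common way to realise an E-step/M-step scheme of this kind is Fisher scoring on the ReML objective, with the error covariance written as a weighted sum of known components Q[k]. The sketch below is an illustration under that assumption, not the SPM implementation: the function name `reml`, the starting values and the two-group example data are hypothetical, and there are no safeguards against negative or singular covariance estimates.

```python
import numpy as np

def reml(y, X, Q, n_iter=32, tol=1e-10):
    """Fisher-scoring ReML for y = X @ theta + e, cov(e) = sum_k lam[k] * Q[k]."""
    lam = np.full(len(Q), np.var(y))          # crude starting values (assumption)
    for _ in range(n_iter):
        C_e = sum(l * Qk for l, Qk in zip(lam, Q))
        iC = np.linalg.inv(C_e)
        # E-step: conditional covariance and mean of the parameters.
        C_post = np.linalg.inv(X.T @ iC @ X)
        eta = C_post @ X.T @ iC @ y
        # M-step: gradient g and expected curvature J for the hyperparameters.
        r = y - X @ eta
        P = iC - iC @ X @ C_post @ X.T @ iC
        g = np.array([-0.5 * np.trace(P @ Qi) + 0.5 * r @ iC @ Qi @ iC @ r
                      for Qi in Q])
        J = np.array([[0.5 * np.trace(P @ Qi @ P @ Qj) for Qj in Q] for Qi in Q])
        step = np.linalg.solve(J, g)
        lam = lam + step
        if np.max(np.abs(step)) < tol:
            break
    return eta, C_post, lam

# Hypothetical data: two groups of 12 and 11 observations with unequal variances.
rng = np.random.default_rng(1)
n1, n2 = 12, 11
y = np.concatenate([rng.normal(1.0, 1.0, n1), rng.normal(2.0, 3.0, n2)])
X = np.zeros((n1 + n2, 2))
X[:n1, 0] = 1.0                                # group 1 mean
X[n1:, 1] = 1.0                                # group 2 mean
Q = [np.diag(np.r_[np.ones(n1), np.zeros(n2)]),   # variance component, group 1
     np.diag(np.r_[np.zeros(n1), np.ones(n2)])]   # variance component, group 2
eta, C_post, lam = reml(y, X, Q)
print("group means      :", eta)
print("variance per group:", lam)
```

In this simple design the hyperparameter estimates coincide with the unbiased sample variance of each group, which gives a convenient sanity check on the scheme.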

Algorithm Equivalence
EM = PEB = ReML
Parametric Empirical Bayes (PEB): hierarchical model.
Restricted Maximum Likelihood (ReML): single-level model.

Non-Sphericity
Error can be independent but non-identical when…
1) There is one parameter, but from different groups, e.g. patient and control groups.
2) There is one parameter, but design matrices differ across subjects, e.g. subsequent memory effect.

Non-Sphericity
Error can be non-independent and non-identical when…
1) There are several parameters per subject, e.g. repeated measurement design.
2) There is a conjunction over several parameters, e.g. common brain activity for different cognitive processes.
3) The hemodynamic response is characterized completely, e.g. F-test combining HRF, temporal derivative and dispersion regressors.

Example I (U. Noppeney et al.)
Stimuli: auditory presentation (SOA = 4 s) of (i) words and (ii) words spoken backwards. Examples shown on the slide: jump, touch, koob, “click”.
Subjects: (i) 12 control subjects, (ii) 11 blind subjects.
Scanning: fMRI, 250 scans per subject, block design.
Q. Which regions activate for real words relative to reversed words in both the blind and control groups?

Independent but Non-Identical Error
[Figure: 1st level: controls; blinds. 2nd level: controls and blinds; conjunction between the 2 groups.]

Example II (U. Noppeney et al.)
Stimuli: auditory presentation (SOA = 4 s) of words from four semantic categories: motion (“jump”), sound (“click”), visual (“pink”) and action (“turn”).
Subjects: 12 control subjects.
Scanning: fMRI, 250 scans per subject, block design.
Q. Which regions are affected by the semantic content of the words?

Non-Independent and Non-Identical Error
[Figure: 1st level: contrast images for the visual, sound, hand and motion word categories. 2nd level: F-test of whether the category effects are equal.]

Example III (U. Noppeney et al.)
Stimuli: (i) sentences presented visually, (ii) false fonts (symbols). Some of the sentences are syntactically primed.
Scanning: fMRI, 250 scans per subject, block design.
Q. Which brain regions of the “sentence reading system” are affected by priming?

Non-Independent and Non-Identical Error
[Figure: 1st level: two orthogonal contrasts, Sentence > Symbols and No-Priming > Priming. 2nd level: conjunction of the 2 contrasts, showing left anterior temporal cortex.]

Example IV: Modelling serial correlation in fMRI time series
Model errors for each subject as AR(1) + white noise.
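An AR(1) plus white noise error model can be expressed as a weighted sum of two covariance components. The sketch below assumes a stationary AR(1) process whose correlation between scans i and j is rho**|i-j|; the function name `ar1_plus_white_components`, the value rho = 0.3 and the component weights are hypothetical, and only the 250 scans per subject is taken from the examples above.

```python
import numpy as np

def ar1_plus_white_components(n_scans, rho):
    """Covariance components for white noise and a stationary AR(1) process."""
    idx = np.arange(n_scans)
    lags = np.abs(np.subtract.outer(idx, idx))   # |i - j| for every pair of scans
    Q_white = np.eye(n_scans)                    # independent, equal-variance part
    Q_ar1 = rho ** lags                          # correlation rho**|i-j|
    return Q_white, Q_ar1

Q_white, Q_ar1 = ar1_plus_white_components(n_scans=250, rho=0.3)

# Hypothetical hyperparameters: in practice these weights would be estimated.
C_e = 1.0 * Q_white + 0.5 * Q_ar1
print(C_e[:3, :3])
```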

Bayes Rule

Example 2: Univariate model
Likelihood and prior. Posterior. Relative precision weighting.
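For a single Gaussian observation with a Gaussian prior, relative precision weighting means that the posterior precision is the sum of the two precisions and the posterior mean is the precision-weighted average of the observation and the prior mean. The sketch below illustrates this standard result; the function name `gaussian_posterior` and the numbers are illustrative assumptions.

```python
def gaussian_posterior(y, noise_precision, prior_mean, prior_precision):
    """Posterior for y ~ N(theta, 1/noise_precision), theta ~ N(prior_mean, 1/prior_precision)."""
    post_precision = noise_precision + prior_precision
    # Each source of information is weighted by its relative precision.
    post_mean = (noise_precision * y + prior_precision * prior_mean) / post_precision
    return post_mean, post_precision

post_mean, post_precision = gaussian_posterior(
    y=2.0, noise_precision=1.0, prior_mean=0.0, prior_precision=3.0)
print(post_mean, post_precision)   # 0.5 4.0
```

Here the prior is three times as precise as the data, so the posterior mean sits a quarter of the way from the prior mean to the observation.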

Example 2: Univariate model
Likelihood and prior. Posterior.
Aim: make inferences based on the posterior distribution. Similar expressions exist for posterior distributions in multivariate models.
But how do we compute the variance components or ‘hyperparameters’?

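Estimation: EM algorithm
The hyperparameters are estimated with the EM algorithm introduced under the General Framework: the E-step computes the conditional mean and covariance of the parameters, and the M-step updates the variance-component hyperparameters λ.
Friston, K. et al. (2002), Neuroimage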

Estimating mean and variance
Maximum Likelihood (ML) maximises p(Y|μ,β).
Expectation-Maximisation (EM) maximises the marginal likelihood p(Y|β), with μ integrated out under a ‘vague’ prior on μ.

Estimating mean and variance
For a prior on μ with prior mean 0 and prior precision α, Expectation-Maximisation (EM) gives an estimate of μ that is shrunk towards 0. The larger α is, the more shrinkage.
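The shrinkage effect can be illustrated with a small EM loop. The sketch below assumes the model y_n ~ N(μ, 1/β) with prior μ ~ N(0, 1/α) and a fixed, known α; the function name `em_mean_precision`, the starting value and the simulated data are hypothetical, and the update equations are the textbook EM updates for this model rather than formulas copied from the slide.

```python
import numpy as np

def em_mean_precision(y, alpha, n_iter=200):
    """EM estimates of the mean and noise precision under a zero-mean prior."""
    y = np.asarray(y, dtype=float)
    N = y.size
    beta = 1.0 / y.var()                      # starting value (assumption)
    for _ in range(n_iter):
        # E-step: Gaussian posterior over the mean for the current beta.
        post_precision = N * beta + alpha
        m = N * beta * y.mean() / post_precision
        # M-step: update the noise precision using the expected squared error.
        expected_ss = np.sum((y - m) ** 2) + N / post_precision
        beta = N / expected_ss
    m = N * beta * y.mean() / (N * beta + alpha)   # posterior mean at the final beta
    return m, beta

rng = np.random.default_rng(0)
y = rng.normal(3.0, 2.0, size=20)

m_vague, beta_vague = em_mean_precision(y, alpha=1e-9)   # 'vague' prior
m_5, _ = em_mean_precision(y, alpha=5.0)
m_50, _ = em_mean_precision(y, alpha=50.0)
print("sample mean :", y.mean())
print("vague prior :", m_vague, " variance:", 1.0 / beta_vague)
print("alpha = 5   :", m_5)
print("alpha = 50  :", m_50)
```

With a vague prior the estimated variance matches the unbiased (N - 1) estimator rather than the ML (N) estimator, and increasing α pulls the estimated mean further towards zero.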

Estimating mean and variance at multiple voxels
For a prior on μ over voxels with prior mean 0 and prior precision α, Expectation-Maximisation (EM) gives the corresponding shrunken estimates at voxel i = 1..V, using scans n = 1..N.
The prior precision can be estimated from the data. If the mean activation over all voxels is 0, then these EM estimates are more accurate than ML.
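The claim about accuracy can be checked on simulated data. The sketch below assumes the model y_in ~ N(μ_i, 1/β) with a shared prior μ_i ~ N(0, 1/α) over voxels, and estimates both α and β by EM; the function name `em_shrinkage_over_voxels` and the simulation settings (2000 voxels, 10 scans, the chosen variances) are hypothetical.

```python
import numpy as np

def em_shrinkage_over_voxels(Y, n_iter=500):
    """EM for voxel means with a shared zero-mean prior. Y has shape (V, N)."""
    V, N = Y.shape
    ybar = Y.mean(axis=1)
    alpha, beta = 1.0, 1.0 / Y.var()          # starting values (assumption)
    for _ in range(n_iter):
        # E-step: posterior mean m_i and variance s for every voxel.
        post_precision = N * beta + alpha
        m = N * beta * ybar / post_precision
        s = 1.0 / post_precision
        # M-step: prior precision from the spread of voxel means,
        # noise precision from the expected residual sum of squares.
        alpha = V / np.sum(m ** 2 + s)
        beta = V * N / (np.sum((Y - m[:, None]) ** 2) + V * N * s)
    m = N * beta * ybar / (N * beta + alpha)
    return m, alpha, beta

rng = np.random.default_rng(0)
V, N = 2000, 10
true_means = rng.normal(0.0, 0.5, size=V)                        # prior precision 4
Y = true_means[:, None] + rng.normal(0.0, 2.0, size=(V, N))      # noise precision 0.25

m_em, alpha_hat, beta_hat = em_shrinkage_over_voxels(Y)
mse_ml = np.mean((Y.mean(axis=1) - true_means) ** 2)
mse_em = np.mean((m_em - true_means) ** 2)
print("estimated alpha, beta:", alpha_hat, beta_hat)
print("mean squared error, ML:", mse_ml)
print("mean squared error, EM:", mse_em)
```

In this setting the shrunken EM estimates have a lower mean squared error than the per-voxel sample means, because the prior precision learned from all voxels regularises each individual estimate.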

The Interface
[Diagram: WLS, ReML and PEB, labelled by what they estimate (parameters; parameters and hyperparameters) and by the priors used (no priors; shrinkage priors).]

Bayesian Inference
1st level = within-voxel: likelihood.
2nd level = between-voxels: shrinkage prior. In the absence of evidence to the contrary, parameters will shrink to zero.

Bayesian Inference: Posterior Probability Maps
[Diagram relating the posterior, likelihood and prior to PPMs and SPMs.]

SPMs and PPMs
PPMs: show activations of a given size.
SPMs: show voxels with non-zero activations.
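The idea of showing "activations of a given size" can be sketched as thresholding the posterior probability that an effect exceeds a chosen size. The code below assumes a Gaussian posterior at each voxel; the function names, the size threshold and the 0.95 probability threshold are illustrative assumptions rather than values given in the slides.

```python
import math

def posterior_prob_exceeds(post_mean, post_sd, size_threshold):
    """P(effect > size_threshold | data) for a Gaussian posterior."""
    z = (size_threshold - post_mean) / post_sd
    return 0.5 * math.erfc(z / math.sqrt(2.0))   # upper tail of the normal

def ppm(post_means, post_sds, size_threshold, prob_threshold=0.95):
    """Mark voxels whose effect exceeds the size threshold with high probability."""
    return [posterior_prob_exceeds(m, s, size_threshold) >= prob_threshold
            for m, s in zip(post_means, post_sds)]

# Two hypothetical voxels with posterior means 2.0 and 0.5 (posterior sd 0.3).
print(ppm([2.0, 0.5], [0.3, 0.3], size_threshold=1.0))   # [True, False]
```

Unlike a classical test against zero, the second voxel is not marked even though its effect may well be non-zero, because it is unlikely to exceed the chosen size.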