1 Hidden Process Models with Applications to fMRI Data
Rebecca A. Hutchinson
March 24, 2010
Biostatistics and Biomathematics Seminar, Fred Hutchinson Cancer Research Center

2 Introduction
Hidden Process Models (HPMs):
– A probabilistic model for time series data.
– Designed for data generated by a collection of latent processes.
Motivating problem:
– Modeling mental processes (e.g., making a decision) in functional Magnetic Resonance Imaging (fMRI) time series.
Characteristics of potential domains:
– Processes with spatio-temporal signatures.
– Uncertainty about the temporal location of processes.
– High-dimensional, sparse, noisy data.

3 fMRI Data
[Figure: hemodynamic response to neural activity; signal amplitude vs. time in seconds.]
Features: 10,000 voxels, imaged every second.
Training examples: 10-40 trials (task repetitions).

4 [image-only slide]

5 Study: Pictures and Sentences (Keller01)
Task: decide whether the sentence describes the picture correctly; indicate with a button press.
13 normal participants, 40 trials per participant.
Sentences and pictures describe 3 symbols (*, +, and $) using 'above', 'below', 'not above', 'not below'.
Images are acquired every 0.5 seconds.
[Trial timeline: Read Sentence and View Picture presented in either order (tick marks at t=0, 4 sec., 8 sec.), Press Button, then Rest/Fixation.]

6 Motivation
To track mental processes over time:
– Estimate process hemodynamic responses.
– Estimate process timings.
– Allowing processes that do not directly correspond to the stimulus timing is a key contribution of HPMs!
To compare hypotheses of cognitive behavior.

7 Related Work
fMRI:
– General Linear Model (Dale99): must assume the timing of process onsets to estimate hemodynamic responses.
– Computer models of human cognition (Just99, Anderson04): predict fMRI data rather than learning the parameters of processes from the data.
Machine Learning:
– Classification of windows of fMRI data (overview in Haynes06): does not typically model overlapping hemodynamic responses.
– Dynamic Bayes Networks (Murphy02, Ghahramani97): HPM assumptions/constraints can be encoded by extending factorial HMMs with links between the Markov chains.

8 Outline
Overview of HPMs: generative model, formalism, graphical model, algorithms.
Synthetic data experiments: accurately estimate parameters; choose the correct model from alternatives with different numbers of processes.
Real data experiments: evaluation methodology; extensions to standard HPMs; model comparison via classification accuracy and data log-likelihood.
Visualizing HPMs.
Conclusions: summary of contributions; future work.

9 HPMs: Generative Model
Processes of the HPM, e.g., Process 1: ReadSentence and Process 2: ViewPicture, each with a response signature W, duration d (11 sec.), allowable offsets Ω = {0, 1}, and offset distribution P(Ω) = {θ0, θ1}.
Input stimuli Δ (sentence, picture) define timing landmarks λ1, λ2.
A configuration c is a set of process instances i1, i2, …, ik. For example, instance i2 has process π = 2, timing landmark λ2, and offset O = 1, so its start time is λ2 + O.
The predicted mean for each voxel v is the sum of the (possibly overlapping) process responses, plus Gaussian noise N(0, σv²).
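The generative story on this slide can be sketched in code. This is a minimal illustration, not the talk's implementation: the process IDs, array shapes, toy signatures, and single noise level are invented for the example.

```python
import numpy as np

def hpm_predicted_mean(instances, signatures, T, V):
    """Predicted mean for one trial: each process instance contributes its
    response signature W, shifted to its start time (timing landmark + offset),
    and overlapping contributions add. `instances` holds (process ID, landmark,
    offset) triples; `signatures[p]` is a (duration, V) array for process p."""
    mean = np.zeros((T, V))
    for proc, landmark, offset in instances:
        W = signatures[proc]
        start = landmark + offset
        end = min(start + W.shape[0], T)
        mean[start:end] += W[: end - start]
    return mean

# Toy trial: two processes whose responses overlap in time (1 voxel).
signatures = {1: np.ones((4, 1)), 2: 2 * np.ones((4, 1))}
mean = hpm_predicted_mean([(1, 0, 0), (2, 2, 1)], signatures, T=10, V=1)
# Observed data = predicted mean + per-voxel Gaussian noise N(0, sigma_v^2).
y = mean + np.random.default_rng(0).normal(0.0, 0.5, size=mean.shape)
```

The additive overlap in `mean[start:end] += ...` is the key modeling assumption: simultaneous processes superimpose linearly in the signal.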

10 HPM Formalism
An HPM is a tuple ⟨Π, C, Σ⟩:
Π = ⟨π1, …, πP⟩, a set of processes (e.g., ReadSentence); each process π = ⟨W, d, Ω, Θ⟩, where
– W = response signature
– d = process duration
– Ω = allowable offsets
– Θ = multinomial parameters over values in Ω
C = ⟨c1, …, cM⟩, a set of possible configurations; each configuration c = ⟨i1, …, iI⟩ is a set of process instances, and each instance i = ⟨π, λ, O⟩ (e.g., ReadSentence(S1)), where
– π = process ID
– λ = timing landmark (e.g., stimulus presentation of S1)
– O = offset (takes values in Ωπ)
C is a latent variable indicating the correct configuration.
Σ = ⟨σ1, …, σV⟩, the noise standard deviation for each voxel.

11 HPMs: The Graphical Model
[Graphical model: for each process instance i1, …, iI, the process type π and offset o (unobserved) combine with the observed timing landmark λ to give the start time s; these generate the observed data Yt,v for t = 1…T, v = 1…V.]
The set C of configurations constrains the joint distribution on {π(k), o(k)} over all k.

12 Encoding Experiment Design
Processes: ReadSentence = 1, ViewPicture = 2, Decide = 3.
Input stimuli Δ define timing landmarks λ1, λ2.
Constraints encoded:
– π(i1) = {1,2}, π(i2) = {1,2}, π(i1) != π(i2)
– o(i1) = 0, o(i2) = 0
– π(i3) = 3, o(i3) = {1,2}
These constraints yield four configurations: ReadSentence then ViewPicture (or the reverse), each combined with Decide starting 1 or 2 images after the second stimulus.
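Under those constraints the configuration set can be enumerated mechanically. A small sketch follows; the tuple encoding of an instance as (process ID, offset) is made up for this example.

```python
from itertools import permutations, product

# i1 and i2 take process IDs 1 (ReadSentence) and 2 (ViewPicture) in either
# order, both with offset 0; i3 is always Decide (ID 3) with offset 1 or 2.
configurations = [
    [(p1, 0), (p2, 0), (3, o3)]
    for (p1, p2), o3 in product(permutations([1, 2]), [1, 2])
]
```

Enumerating the cross product of the allowed choices reproduces the four configurations drawn on the slide.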

13 Inference
Inference is over C, the latent indicator of the correct configuration.
Choose the most likely configuration Cn for each trial (n = 1, …, N):
Cn = argmax_c P(C = c | Yn) ∝ P(Yn | C = c) P(C = c).
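As a sketch, the per-trial choice can be computed by scoring each configuration's predicted mean against the data under the Gaussian noise model. A single shared `sigma` stands in here for the per-voxel standard deviations of the real model.

```python
import numpy as np

def most_likely_configuration(Y, config_means, log_priors, sigma):
    """argmax over configurations of log P(C=c) + log P(Y | C=c), assuming
    independent Gaussian noise with std `sigma` at each (timepoint, voxel).
    Constant terms of the Gaussian log-density cancel across configurations."""
    scores = [lp - 0.5 * np.sum((Y - m) ** 2) / sigma ** 2
              for m, lp in zip(config_means, log_priors)]
    return int(np.argmax(scores))

# The configuration whose predicted mean matches the data should win.
Y = np.ones((5, 2))
means = [np.zeros((5, 2)), np.ones((5, 2))]
best = most_likely_configuration(Y, means, [np.log(0.5), np.log(0.5)], sigma=1.0)
```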

14 Learning
Parameters to learn:
– Response signature W for each process
– Timing distribution Θ for each process
– Standard deviation σ for each voxel
Expectation-Maximization (EM) algorithm:
– E step: estimate the probability distribution over C.
– M step: update the estimates of W (using reweighted least squares), Θ, and σ (using standard MLEs) based on the E step.
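A minimal sketch of the two steps, with a single shared noise level and the W update omitted (the real M step refits W by reweighted least squares):

```python
import numpy as np

def e_step(Y, config_means, log_priors, sigma):
    """Posterior P(C = c | Y) over configurations for one trial."""
    scores = np.array([lp - 0.5 * np.sum((Y - m) ** 2) / sigma ** 2
                       for m, lp in zip(config_means, log_priors)])
    scores -= scores.max()                 # for numerical stability
    post = np.exp(scores)
    return post / post.sum()

def m_step_sigma(Y, config_means, posterior):
    """MLE of the noise std, weighting residuals by the E-step posterior."""
    weighted_sq = sum(p * np.sum((Y - m) ** 2)
                      for m, p in zip(config_means, posterior))
    return np.sqrt(weighted_sq / Y.size)

Y = np.ones((4, 3))
means = [np.zeros((4, 3)), np.ones((4, 3))]
post = e_step(Y, means, [np.log(0.5)] * 2, sigma=1.0)
sigma_hat = m_step_sigma(Y, means, post)
```

Iterating these two steps (plus the W and Θ updates) until convergence is the EM loop described on the slide.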

15 Synthetic Data
[Figure: approximate timings for ReadSentence, ViewPicture, and Decide.]

16 Synthetic Experiments
MSE of responses (averaged over all voxels, processes, and timepoints) = 0.4427.
MSE of timing parameters (averaged over all processes and offsets) ~0.01.
Estimated standard deviations are 2.4316 and 2.4226 (true value: 2.5).

17 Model Selection
[Table: model-selection results; all numbers ×10^5.]

18 Synthetic Data Results
Tracking processes: HPMs can recover the parameters of the model used to generate the data.
Comparing models: HPMs can use held-out data log-likelihood to identify the model with the correct number of latent processes.

19 Evaluating HPMs on Real Data
There is no ground truth for the problems HPMs were designed for.
We can use data log-likelihood to compare models; the baseline predicts each trial as the average of all training trials.
We can also classify known entities (like the stimuli), but HPMs are not optimized for this.

20 Models
HPM-GNB: ReadSentence and ViewPicture, duration = 8 sec. (no overlap); an approximation of a Gaussian Naive Bayes classifier, with HPM assumptions and noise model.
HPM-2: ReadSentence and ViewPicture, duration = 12 sec. (temporal overlap).
HPM-3: HPM-2 + Decide (offsets = [0,7] images following the second stimulus).
HPM-4: HPM-3 + PressButton (offsets = {-1,0} images following the button press).

21 Configurations for HPM-3
[Figure: the set of configurations for HPM-3.]

22 Held-Out Log-Likelihood (improvement over baseline)
1000 most active voxels per participant; 5-fold cross-validation per participant; mean over 13 participants.
Standard HPMs:
HPM-GNB: -293
HPM-2: -1150
HPM-3: -2000
HPM-4: -4490

23 Extension 1: Regularization
Subtract a term from the objective function penalizing deviations from:
– Temporal smoothness
– Spatial smoothness (based on the adjacency matrix A)
Other possibilities: spatial sparsity and spatial priors.
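One common way to realize the two smoothness terms is squared first differences, sketched below under that assumption. The weights `lam_t` and `lam_s` are hypothetical tuning parameters, not values from the talk.

```python
import numpy as np

def smoothness_penalty(W, A, lam_t=1.0, lam_s=1.0):
    """Penalty on a response signature W (timepoints x voxels): squared first
    differences over time, plus squared differences between the responses of
    adjacent voxels (0/1 adjacency matrix A, shape V x V)."""
    temporal = np.sum(np.diff(W, axis=0) ** 2)
    i, j = np.nonzero(np.triu(A, k=1))        # each adjacent pair counted once
    spatial = np.sum((W[:, i] - W[:, j]) ** 2)
    return lam_t * temporal + lam_s * spatial

A = np.array([[0, 1], [1, 0]])     # two mutually adjacent voxels
flat = np.ones((6, 2))             # perfectly smooth signature: zero penalty
bumpy = flat.copy()
bumpy[3, 0] = 2.0                  # a one-timepoint spike is penalized
```

Subtracting such a penalty from the log-likelihood biases the learned signatures toward smooth shapes without fixing their form.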

24 Extension 2: Basis Functions
Re-parameterize the process response signatures in terms of a basis set.

25 The Basis Set
Generated as in Hossein-Zadeh03:
– Create Q (10,000 × 24): 10,000 realizations of 24 timepoints of h(t), varying a in [0.05, 0.21] and b in [3, 7].
– Basis set = first 3 principal components of Q'Q.
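The recipe can be sketched as follows. The gamma-variate h(t) = t^b·exp(-a·t) used below is a stand-in, since the slide does not reproduce the exact parametric family from Hossein-Zadeh03, and the sample count is reduced to keep the sketch fast.

```python
import numpy as np

def make_basis(n_samples=10_000, T=24, n_basis=3, seed=0):
    """Sample many HRF realizations by varying shape parameters a and b over
    the ranges on the slide, then keep the leading principal components."""
    rng = np.random.default_rng(seed)
    t = np.arange(T, dtype=float)
    a = rng.uniform(0.05, 0.21, n_samples)
    b = rng.uniform(3.0, 7.0, n_samples)
    # One candidate HRF per row of Q, normalized to unit peak.
    Q = (t[None, :] ** b[:, None]) * np.exp(-a[:, None] * t[None, :])
    Q /= Q.max(axis=1, keepdims=True)
    # Principal components = leading eigenvectors of the T x T matrix Q'Q.
    _, eigvecs = np.linalg.eigh(Q.T @ Q)      # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :n_basis]

B = make_basis(n_samples=2000)   # columns are the 3 basis functions
```

Each process response signature is then expressed as a weighted sum of the columns of B, so only the weights (per voxel) need to be learned.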

26 Held-Out Log-Likelihood (improvement over baseline)
1000 most active voxels per participant; 5-fold cross-validation per participant; mean over 13 participants.
          Standard   Regularized   Basis functions
HPM-GNB   -293       2590          2010
HPM-2     -1150      3910          3740
HPM-3     -2000      4960          4710
HPM-4     -4490      4810          4770

27 Classification Accuracies
ReadSentence vs. ViewPicture for the first 2 processes.
1000 most active voxels per participant; 5-fold cross-validation per participant; mean over 13 participants.
          Standard   Regularized   Basis functions
HPM-GNB   85.8       86.5          90.4
HPM-2     87.1       90.0          90.6
HPM-3     86.7       88.7          90.4
HPM-4     83.8       84.4          86.2
(GNB = 93.1)

28 Interpretation and Visualization
Focus on HPM-3 for a single participant, trained on all trials, all voxels.
Timing for the third (Decide) process in HPM-3 (values have been rounded):
Offset:   0     1     2     3     4     5     6     7
Stand.    0.3   0.08  0.1   0.05  0.05  0.2   0.08  0.15
Reg.      0.3   0.08  0.1   0.05  0.05  0.2   0.08  0.15
Basis     0.5   0.1   0.1   0.08  0.05  0.03  0.05  0.08

29 Standard

30 Regularized

31 Basis functions

32 Time Courses
[Panels: Standard, Regularized, Basis functions.]

33 Standard HPM: Full Brain, Trained on All Trials
[Figure: Trial 1 observed vs. predicted, RDLPFC, Z-slice 5.]

34 Standard HPM
[Figure: ViewPicture, ReadSentence, Decide.]

35 Regularized HPM: Full Brain, Trained on All Trials
[Figure: Trial 1 observed vs. predicted, RDLPFC, Z-slice 5.]

36 Regularized HPM
[Figure: ViewPicture, ReadSentence, Decide.]

37 Basis Function HPM: Full Brain, Trained on All Trials
[Figure: Trial 1 observed vs. predicted, RDLPFC, Z-slice 5.]

38 Basis Function HPM
[Figure: ViewPicture, ReadSentence, Decide.]

39 Caveats
While visualizing these parameters can help us understand the model, it is important to remember that they are specific to the design choices of the particular HPM.
These are parameters, not the results of statistical significance tests.

40 Summary of Results
Synthetic data:
– HPMs can recover the parameters of the model used to generate the data in an ideal situation.
– Held-out data log-likelihood can identify the model with the correct number of latent processes.
Real data:
– Standard HPMs can overfit on real fMRI data.
– Regularized HPMs and HPMs parameterized with basis functions consistently outperform the baseline in held-out data log-likelihood.
– Example comparison of 4 models.

41 Contributions
Estimates for Decide!
To our knowledge, HPMs are the first probabilistic model for fMRI data that can estimate the hemodynamic response for overlapping mental processes with unknown onsets while simultaneously estimating a distribution over the timing of those processes.

42 Future Directions
– Combine regularization and basis functions.
– Develop a better noise model.
– Relax the linearity assumption.
– Automatically discover the number of latent processes.
– Learn process durations.
– Allow continuous offsets.
– Leverage DBN algorithms.

43 References
John R. Anderson, Daniel Bothell, Michael D. Byrne, Scott Douglass, Christian Lebiere, and Yulin Qin. An integrated theory of the mind. Psychological Review, 111(4):1036-1060, 2004. http://act-r.psy.cmu.edu/about/.
Anders M. Dale. Optimal experimental design for event-related fMRI. Human Brain Mapping, 8:109-114, 1999.
Zoubin Ghahramani and Michael I. Jordan. Factorial hidden Markov models. Machine Learning, 29:245-275, 1997.
John-Dylan Haynes and Geraint Rees. Decoding mental states from brain activity in humans. Nature Reviews Neuroscience, 7:523-534, July 2006.
Gholam-Ali Hossein-Zadeh, Babak A. Ardekani, and Hamid Soltanian-Zadeh. A signal subspace approach for modeling the hemodynamic response function in fMRI. Magnetic Resonance Imaging, 21:835-843, 2003.
Marcel Adam Just, Patricia A. Carpenter, and Sashank Varma. Computational modeling of high-level cognition and brain function. Human Brain Mapping, 8:128-136, 1999. http://www.ccbi.cmu.edu/project 10modeling4CAPS.htm.
Tim A. Keller, Marcel Adam Just, and V. Andrew Stenger. Reading span and the time course of cortical activation in sentence-picture verification. In Annual Convention of the Psychonomic Society, 2001.
Kevin P. Murphy. Dynamic Bayesian networks. To appear in Probabilistic Graphical Models, M. Jordan, editor, November 2002.

44 Thank you!

45 Questions?

46 (end of talk)

47 [Backup figure: timing landmarks and process instances for ReadSentence (π=1, Ω={0}), ViewPicture (π=2, Ω={0}), and Decide (π=3, Ω={0,1,2}), with the generated fMRI signal.]

48 [Backup figure: timing landmarks and process instances for Decide (π=3, Ω={0,1,2}) and PressButton (π=4, Ω={0,1,2}), with the generated fMRI signal.]

