Modeling fMRI data generated by overlapping cognitive processes with unknown onsets using Hidden Process Models. Rebecca A. Hutchinson, Tom M. Mitchell.

Presentation transcript:

Modeling fMRI data generated by overlapping cognitive processes with unknown onsets using Hidden Process Models

Rebecca A. Hutchinson (1), Tom M. Mitchell (1,2)
(1) Computer Science Department, Carnegie Mellon University
(2) Machine Learning Department, Carnegie Mellon University
Statistical Analyses of Neuronal Data (SAND4), May 30, 2008

Hidden Process Models

HPMs are a new probabilistic model for time series data. HPMs are designed for data that are:
- generated by a collection of latent processes that have overlapping spatiotemporal signatures;
- high-dimensional, sparse, and noisy;
- accompanied by limited prior knowledge about when the processes occur.

HPMs can simultaneously recover the start times and spatiotemporal signatures of the latent processes.

Example

[Figure: a window of data over voxels d1 … dN, with the spatiotemporal signatures of Process 1 through Process P shown over time t.] Prior knowledge: an instance of Process 1 begins in one window; an instance of Process P begins in another; an instance of either Process 1 OR Process P begins in a third. There are a total of 6 process instances in this window of data.

Simple Case: Known Timing

[Figure: the T x D data matrix Y written as a convolution matrix times the stacked response signatures W(1), W(2), W(3) of processes p1, p2, p3.] With known timing, apply the General Linear Model: Y = XW, where Y is the data, X is the convolution matrix encoding the known onsets, and W holds the unknown parameters (response signatures). [Dale 1999]
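For concreteness, a minimal sketch of this known-timing GLM on toy data (the dimensions, onsets, and the helper build_design_matrix are hypothetical illustrations, not code from the talk):

```python
import numpy as np

def build_design_matrix(T, onsets, durations):
    """Convolution matrix X: one column per (process, lag) pair.
    onsets[p] lists the known start times of process p;
    durations[p] is the length of its response signature."""
    cols = []
    for p, starts in enumerate(onsets):
        for lag in range(durations[p]):
            col = np.zeros(T)
            for t0 in starts:
                if t0 + lag < T:
                    col[t0 + lag] = 1.0
            cols.append(col)
    return np.column_stack(cols)

# Toy data: 2 processes with known onsets, T=40 images, D=5 voxels.
rng = np.random.default_rng(0)
T, D = 40, 5
onsets = [[0, 20], [8, 28]]
durations = [11, 11]          # 11-image signatures, as in the talk's example
X = build_design_matrix(T, onsets, durations)
W_true = rng.normal(size=(X.shape[1], D))
Y = X @ W_true + 0.1 * rng.normal(size=(T, D))

# GLM estimate: ordinary least squares, W_hat = argmin_W ||Y - X W||^2.
W_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```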

Challenge: Unknown Timing

[Figure: the same Y = XW setup as above.] When process onsets are not known, uncertainty about the processes essentially makes the convolution matrix a random variable.

fMRI Data

[Figure: signal amplitude vs. time (seconds); following neural activity, the hemodynamic response rises and decays over several seconds.] Features: 10,000 voxels, imaged every second. Training examples: trials (task repetitions).
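For intuition about the response shape in the figure, here is a sketch using the common double-gamma approximation to the hemodynamic response (an illustrative assumption; the talk does not commit to a parametric form):

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(t):
    """Double-gamma hemodynamic response: a peak near 5 s,
    followed by a smaller undershoot around 15 s."""
    return gamma.pdf(t, a=6) - gamma.pdf(t, a=16) / 6.0

t = np.arange(0.0, 30.0, 0.5)   # sampled every 0.5 s, as in the study below
hrf = canonical_hrf(t)
```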

Goals for fMRI

To track cognitive processes over time:
- estimate process hemodynamic responses;
- estimate process timings. Allowing processes that do not directly correspond to the stimulus timing is a key contribution of HPMs!

To compare hypotheses of cognitive behavior.

Our Approach

The model of each process contains a probability distribution over when the process occurs relative to a known event (called a timing landmark). When predicting the underlying processes, use prior knowledge about timing to limit the hypothesis space.

Study: Pictures and Sentences

Task: decide whether a sentence describes a picture correctly; indicate with a button press. 13 normal subjects, 40 trials per subject. Sentences and pictures describe 3 symbols: *, +, and $, using 'above', 'below', 'not above', 'not below'. Images are acquired every 0.5 seconds. [Figure: trial timeline. One stimulus (read sentence or view picture) is presented at t=0 and the other at 4 sec; the button press occurs by 8 sec, followed by rest/fixation.] [Keller et al., 2001]

Processes of the HPM

Process 1: ReadSentence. Response signature W: one curve per voxel (v1, v2). Duration d: 11 sec. Offsets Ω: {0, 1}. P(Ω): {θ0, θ1}.

Process 2: ViewPicture. Response signature W: one curve per voxel (v1, v2). Duration d: 11 sec. Offsets Ω: {0, 1}. P(Ω): {θ0, θ1}.

The input stimulus Δ (sentence, then picture) defines timing landmarks λ1, λ2. A process instance, e.g. π2, specifies: process h = 2 (ViewPicture), timing landmark λ2, offset O = 1, giving start time λ2 + O.

One configuration c of process instances π1, π2, … πk (with prior γc) determines the predicted mean signal; the observed data in each voxel is this mean plus Gaussian noise, N(0, σ1) for v1 and N(0, σ2) for v2.

HPM Formalism

HPM = ⟨H, C, Γ, Σ⟩
- H = {h}, a set of processes (e.g. ReadSentence)
  - h = ⟨W, d, Ω, Θ⟩, a process: W = response signature; d = process duration; Ω = allowable offsets; Θ = multinomial parameters over values in Ω
- C = {c}, a set of configurations
  - c = {π}, a set of process instances
  - π = ⟨h, λ, O⟩, a process instance (e.g. ReadSentence(S1)): h = process ID; λ = timing landmark (e.g. stimulus presentation of S1); O = offset (takes values in Ω_h)
- Γ = {γc}, priors over C
- Σ = {σv}, standard deviation for each voxel

[Hutchinson et al., 2006]
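A sketch of the formalism as data structures (field names are mine, mirroring the definitions above; this is not the authors' code):

```python
from dataclasses import dataclass
from typing import Dict, List
import numpy as np

@dataclass
class Process:
    """h = <W, d, Omega, Theta>."""
    W: np.ndarray            # response signature, shape (d, n_voxels)
    d: int                   # process duration, in images
    offsets: List[int]       # Omega: allowable offsets
    theta: Dict[int, float]  # Theta: multinomial P(offset) over Omega

@dataclass
class ProcessInstance:
    """pi = <h, lambda, O>; start time = landmark + offset."""
    h: int         # process ID
    landmark: int  # timing landmark (image index)
    offset: int    # O, taking values in Omega_h

@dataclass
class HPM:
    """HPM = <H, C, Gamma, Sigma>."""
    processes: List[Process]                     # H
    configurations: List[List[ProcessInstance]]  # C
    gamma: List[float]                           # Gamma: priors over C
    sigma: np.ndarray                            # Sigma: per-voxel std dev
```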

Encoding Experiment Design

Processes: ReadSentence = 1, ViewPicture = 2, Decide = 3. The input stimulus Δ defines timing landmarks λ1, λ2. Constraints encoded: h(π1) ∈ {1, 2}; h(π2) ∈ {1, 2}; h(π1) ≠ h(π2); o(π1) = 0; o(π2) = 0; h(π3) = 3; o(π3) ∈ {1, 2}. These constraints admit exactly four candidate configurations (Configurations 1-4); a sketch of the enumeration follows.
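A minimal sketch of enumerating those four configurations (the choice of λ2 as Decide's landmark is my assumption for illustration):

```python
def enumerate_configurations():
    """Enumerate configurations satisfying the slide's constraints."""
    configs = []
    # pi1 and pi2 are the two stimulus processes (sentence/picture),
    # in either order, each locked to its landmark (offset 0).
    for h1, h2 in [(1, 2), (2, 1)]:
        # pi3 is Decide (process 3); its offset is uncertain: 1 or 2.
        for o3 in (1, 2):
            configs.append([
                dict(h=h1, landmark=1, offset=0),   # pi1 at lambda1
                dict(h=h2, landmark=2, offset=0),   # pi2 at lambda2
                dict(h=3,  landmark=2, offset=o3),  # pi3 after lambda2
            ])
    return configs

assert len(enumerate_configurations()) == 4   # Configurations 1-4
```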

Inference

Inference is over configurations: choose the most likely configuration,

c* = argmax_c P(C = c | Y, Δ, HPM),

where C = configuration, Y = observed data, Δ = input stimuli, and HPM = model.
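A sketch of scoring configurations under the Gaussian noise model (a direct reading of the model above, reusing the HPM structure from the earlier sketch; not the authors' implementation):

```python
import numpy as np

def log_posterior_per_config(Y, hpm, predicted_means):
    """Unnormalized log P(C=c | Y, Delta, HPM) for each configuration c.

    predicted_means[c] is the T x V mean signal implied by configuration c
    (each instance's response signature added in at its start time).
    """
    scores = []
    for c, mean in enumerate(predicted_means):
        resid = Y - mean
        # Independent Gaussian noise per voxel with std hpm.sigma[v].
        ll = -0.5 * np.sum((resid / hpm.sigma) ** 2)
        ll -= Y.shape[0] * np.sum(np.log(hpm.sigma))
        scores.append(np.log(hpm.gamma[c]) + ll)
    return np.array(scores)

# The most likely configuration is the argmax of these scores.
```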

Learning

Parameters to learn:
- response signature W for each process;
- timing distribution Θ for each process;
- standard deviation σ for each voxel.

Expectation-Maximization (EM) algorithm to estimate W and Θ:
- E step: estimate a probability distribution over configurations.
- M step: update estimates of W (using reweighted least squares) and Θ (using standard MLEs) based on the E step.

After convergence, use standard MLEs for σ. (A toy sketch of the loop follows.)
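A runnable toy version of that EM loop, with one process, one voxel, and two candidate onsets (the dimensions, initializations, and Hanning-window "true" signature are illustrative assumptions, not values from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy problem: one 11-sample signature, one voxel, one trial whose
# onset is either t=2 or t=3 (two candidate configurations).
T, d = 30, 11
W_true = np.hanning(d)
y = np.zeros(T)
y[3:3 + d] += W_true                     # true onset is t=3
y += 0.05 * rng.normal(size=T)

def design(onset):
    """Convolution matrix for a single instance starting at `onset`."""
    X = np.zeros((T, d))
    for lag in range(d):
        X[onset + lag, lag] = 1.0
    return X

Xs = [design(2), design(3)]              # one design matrix per configuration
gamma = np.array([0.5, 0.5])             # uniform prior over configurations
W = rng.normal(size=d)                   # initial signature estimate
sigma = 1.0

for _ in range(30):
    # E step: posterior over the two configurations.
    ll = np.array([-0.5 * np.sum((y - X @ W) ** 2) / sigma**2 for X in Xs])
    post = gamma * np.exp(ll - ll.max())
    post /= post.sum()

    # M step: reweighted least squares for W; each configuration's
    # design matrix contributes in proportion to its posterior weight.
    A = sum(p * X.T @ X for p, X in zip(post, Xs))
    b = sum(p * X.T @ y for p, X in zip(post, Xs))
    W = np.linalg.solve(A, b)

# Standard MLEs after convergence: Theta from the posterior weights,
# sigma from the posterior-weighted residuals.
theta = post
sigma = np.sqrt(sum(p * np.sum((y - X @ W) ** 2) for p, X in zip(post, Xs)) / T)
```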

Uncertain Timings

[Figure: the convolution matrix is expanded to model several onset choices for each time point, giving S candidate rows per image and T' > T rows overall; each row is tagged with the configurations that would make it active (e.g. {3, 4} or {1, 2}).]

Uncertain Timings (continued)

[Figure: the expanded system Y = XW with per-row weights e1, e2, e3, e4.] Weight each row with the corresponding configuration probabilities from the E step, e.g. e1 = P(C=3 | Y, W_old, Θ_old, σ_old) + P(C=4 | Y, W_old, Θ_old, σ_old). A sketch of this row weighting follows.
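One way to realize that weighting (a sketch: each configuration contributes a sqrt-weighted copy of its design matrix, which is equivalent to weighting shared rows by summed configuration probabilities):

```python
import numpy as np

def stacked_weighted_system(Xs, y, post):
    """Stack one copy of the design matrix (and data) per configuration,
    scaled by sqrt of its posterior weight, so that ordinary least
    squares on the stack performs the reweighted M-step update."""
    Xw = np.vstack([np.sqrt(p) * X for p, X in zip(post, Xs)])
    yw = np.concatenate([np.sqrt(p) * y for p in post])
    return Xw, yw

# np.linalg.lstsq(Xw, yw, rcond=None)[0] then matches the M-step
# solution from the toy EM sketch above.
```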

Potential Processes

[Figure: a set of candidate processes for the task.] These can be grouped in many ways to form different HPMs.

Comparing HPMs

Cross-validated data log-likelihood, per participant. [Table omitted.] All values are ×10^6.

Are we learning the right number of processes?

For each training set, the table shows the average (over 30 runs) test-set log-likelihood of each of 3 HPMs (with 2, 3, and 4 processes) on each of 3 synthetic data sets (generated with 2, 3, and 4 processes). Each cell is reported as mean ± standard deviation. [Table omitted.] Note: all values in this table are ×10^5.

Ongoing Research

- Regularization for process response signatures (adding bias toward temporal and/or spatial smoothness, spatial priors, spatial sparsity).
- Modeling process response signatures with basis functions.
- Allowing continuous start times (decoupling process starts from the data acquisition rate).
- A Dynamic Bayes Net formulation of HPMs.

References

Dale, A.M. (1999). Optimal experimental design for event-related fMRI. Human Brain Mapping, 8.
Hutchinson, R.A., Mitchell, T.M., & Rustandi, I. (2006). Hidden Process Models. Proceedings of the 23rd International Conference on Machine Learning.
Keller, T.A., Just, M.A., & Stenger, V.A. (2001). Reading span and the time-course of cortical activation in sentence-picture verification. Annual Convention of the Psychonomic Society.