Learning to Identify Overlapping and Hidden Cognitive Processes from fMRI Data Rebecca Hutchinson, Tom Mitchell, Indra Rustandi Carnegie Mellon University.

Presentation transcript:

Learning to Identify Overlapping and Hidden Cognitive Processes from fMRI Data Rebecca Hutchinson, Tom Mitchell, Indra Rustandi Carnegie Mellon University

How can we track hidden cognitive processes?
[Figure: hypothesized cognitive processes (View picture, Read sentence, Decide whether consistent) over time, with the timing of the Decide process unknown; below, observed fMRI traces for cortical regions 1 and 2, and an observed button press.]

Typical BOLD response
[Figure: signal amplitude vs. time in seconds.]
At left is a typical averaged BOLD response. Here, the subject reads a word, decides whether it is a noun or verb, and pushes a button, all in less than 1 second.

Related Work
General linear model (GLM) applied to fMRI (e.g., [Dale 1999]; SPM)
– Accommodates multiple, overlapping processes
– But not unknown process timing
Dynamic Bayesian Networks: a family of probabilistic models for time series (e.g., Factorial HMMs [Ghahramani & Jordan 1998])
– Accommodate hidden timings/states
– But do not capture the convolution of overlapping states
– Require learning a detailed next-state function

General Linear Model
A common fMRI data analysis approach: define a 'design matrix' X that describes the timing of the input stimuli, and model
    y = X h + ε
where y is the observed fMRI time series, X is the design matrix (stimulus timing), h holds the responses to the individual stimuli, and ε is Gaussian noise.
HPMs correspond to assuming that X describes both stimuli and hidden mental processes, and is partially unknown.
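As a minimal sketch of GLM estimation (toy design matrix and response values, not the study's data): once X is known, h can be recovered by least squares.

```python
import numpy as np

# GLM sketch: y = X h + noise; estimate h by least squares.
rng = np.random.default_rng(0)
T, K = 100, 2                                 # time points, number of stimuli
X = rng.integers(0, 2, size=(T, K)).astype(float)  # toy 0/1 design matrix
h_true = np.array([1.5, -0.5])                # toy per-stimulus responses
y = X @ h_true + 0.01 * rng.standard_normal(T)

# Ordinary least squares recovers h when X (stimulus timing) is fully known.
h_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

The HPM setting differs precisely in that part of X (the hidden processes' timing) is unknown, so this one-shot solve is no longer available.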

Approach: Hidden Process Models
Probabilistic model
– Can evaluate P(model | data) and P(data | model)
Describes hidden processes by their
– Type, duration, start time, and fMRI signature
Algorithms for learning models and interpreting data
– Learn maximum-likelihood models and data interpretations

Hidden Process Models
[Figure: three processes (ID 1: View picture; ID 2: Read sentence; ID 3: Decide whether consistent), each defined by a timing distribution P(start = λ + O) and a response signature; process instances with IDs 1, 2, and 3 anchored at time landmarks λ1, λ2, λ3; observed fMRI shown below.]

Hidden Process Models
[Figure: input stimulus sequence (sentence, picture, sentence) with timing landmarks λ1, λ2, λ3.]
Example process: ViewPicture, with duration d = 11 sec, a distribution over offset times, and a response signature W.
A configuration C of process instances ⟨π1, π2, …⟩ explains the observed data Y. For example, process instance π2 has process h = ViewPicture, timing landmark λ2, offset time O = 1 sec, and start time λ2 + O.

HPMs More Formally…
Process h = ⟨d, θ, W⟩: duration, offset-time distribution, response signature
Process instance π = ⟨h, λ, O⟩: process, timing landmark, offset time
Configuration C = set of process instances
Hidden Process Model HPM = ⟨H, Φ, C, σ⟩
– H: set of processes
– Φ: prior probabilities over H
– C: set of candidate configurations
– σ = ⟨σ1 … σV⟩: voxel noise model

HPM Generative Model
Probabilistically generate data using a configuration of N process instances with known landmarks:
1. Generate a configuration C of process instances: for i = 1 to N, generate process instance πi:
   – choose a process hi according to P(h | λi, Φ)
   – choose an offset Oi according to P(O | θ(hi))
2. Generate all observed fMRI data ytv given C: each ytv is drawn from a Gaussian whose mean is the summed contribution of the active process signatures at time t and voxel v, with per-voxel standard deviation σv.
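The two generative steps can be sketched as follows (toy sizes, signatures, and landmarks are invented for illustration; the additive combination of overlapping signatures plus per-voxel Gaussian noise follows the model's description):

```python
import numpy as np

# Toy sketch of the HPM generative model (illustrative, not the authors' code).
rng = np.random.default_rng(1)
T, V, D = 40, 3, 11                     # time points, voxels, process duration
W = {1: rng.standard_normal((D, V)),    # response signature W per process (toy)
     2: rng.standard_normal((D, V))}
sigma = 0.1 * np.ones(V)                # per-voxel noise standard deviations
landmarks = [0, 15]                     # known timing landmarks

# Step 1: generate a configuration C: choose a process and an offset per instance.
C = [(int(rng.choice([1, 2])), lam + int(rng.integers(0, 3)))
     for lam in landmarks]

# Step 2: overlapping signatures add; each y_tv is Gaussian around the summed mean.
mean = np.zeros((T, V))
for h, start in C:
    mean[start:start + D, :] += W[h]
Y = mean + sigma * rng.standard_normal((T, V))
```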

HPM Inference
Given:
– an HPM, including a set of candidate configurations (we typically assume the processes are known, but not their timing)
– observed data Y
Determine:
– the most probable process instance configuration c
– P(C=c | Y, HPM) ∝ P(Y | C=c, HPM) P(C=c | HPM)
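With a Gaussian noise model, this reduces to scoring each candidate configuration by log P(Y | C=c) + log P(C=c) and keeping the argmax. A minimal single-voxel sketch (signatures and candidate sets are hypothetical):

```python
import numpy as np

# Sketch of HPM inference: score each candidate configuration, keep the best.
rng = np.random.default_rng(3)
T, D = 30, 8
W = {1: np.ones(D), 2: -np.ones(D)}     # toy single-voxel response signatures
sigma = 0.1

def predict(config):
    """Predicted mean signal: overlapping signatures add linearly."""
    mean = np.zeros(T)
    for h, start in config:
        mean[start:start + D] += W[h]
    return mean

true_config = [(1, 0), (2, 12)]
Y = predict(true_config) + sigma * rng.standard_normal(T)

candidates = [[(1, 0), (2, 12)], [(2, 0), (1, 12)], [(1, 2), (2, 14)]]
log_prior = np.log(np.full(len(candidates), 1 / len(candidates)))  # uniform P(C)
log_lik = np.array([-0.5 * np.sum((Y - predict(c)) ** 2) / sigma**2
                    for c in candidates])
best = candidates[int(np.argmax(log_lik + log_prior))]
```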

Inference: Example
[Figure: observed data compared against the predictions of two candidate configurations.
Configuration 1: ProcessID=1 at S=1; ProcessID=2 at S=17; ProcessID=3 at S=21 → Prediction 1.
Configuration 2: ProcessID=2 at S=1; ProcessID=1 at S=17; ProcessID=3 at S=23 → Prediction 2.]

Learning HPMs with unknown timings O(π), known processes h(π)
EM (Expectation-Maximization) algorithm:
– E-step: estimate the conditional distribution over the start times of the process instances given the observed data, P(O(π1) … O(πN) | Y, h(π1) … h(πN), HPM).
– M-step: use the distribution from the E-step to obtain maximum-likelihood estimates of the HPM parameters.
* In real problems, some timings are often known.
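A toy EM sketch under strong simplifying assumptions (a single process of known identity, a small discrete set of candidate start times, known noise level; all names and sizes are invented for illustration, and the real algorithm handles multiple overlapping processes):

```python
import numpy as np

# Toy EM sketch: one process, known identity, unknown start times.
rng = np.random.default_rng(2)
T, D, N = 20, 8, 50                     # series length, signature length, trials
W_true = np.sin(np.linspace(0, np.pi, D))
starts = rng.integers(0, 3, N)          # hidden offsets in {0, 1, 2}
Y = np.zeros((N, T))
for n, s in enumerate(starts):
    Y[n, s:s + D] += W_true
Y += 0.05 * rng.standard_normal((N, T))

cands, sig2 = [0, 1, 2], 0.05 ** 2
W = Y[:, 0:D].mean(axis=0)              # crude initial signature estimate
for _ in range(20):
    preds = np.zeros((len(cands), T))
    for j, s in enumerate(cands):
        preds[j, s:s + D] = W
    # E-step: posterior over start times per trial (uniform prior, Gaussian noise)
    logp = -0.5 * ((Y[:, None, :] - preds[None]) ** 2).sum(-1) / sig2
    logp -= logp.max(axis=1, keepdims=True)
    post = np.exp(logp)
    post /= post.sum(axis=1, keepdims=True)
    # M-step: responsibility-weighted average of aligned windows re-estimates W
    W = sum(post[n, j] * Y[n, s:s + D] for n in range(N)
            for j, s in enumerate(cands)) / N
```

Note that, as in many shift-estimation problems, EM here can settle on a signature and start times that are jointly offset by a constant; the slide's point that some timings are often known in practice is what anchors the solution.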

HPMs are learnable from realistic amounts of data

Figure 1. The learner was given 80 training examples with known start times for only the first two processes. It chooses the correct start time (26) for the third process, in addition to learning the HDRs for all three processes. [Figure: true signal vs. observed noisy signal, and true response W vs. learned W, for Processes 1, 2, and 3.]

fMRI Study: Pictures and Sentences
Each trial: determine whether the sentence correctly describes the picture.
40 trials per subject: picture first in 20 trials, sentence first in the other 20.
Images acquired every 0.5 seconds.
[Figure: trial timeline from t=0: View Picture / Read Sentence (4 sec), Read Sentence / View Picture (8 sec), Press Button, Fixation, Rest.]

HPM model for Picture-Sentence Comparison
[Figure: cognitive processes (View picture, Read sentence, Decide whether consistent) over time, with the timing of the Decide process unknown; observed fMRI traces for cortical regions 1 and 2, and an observed button press.]

Learned HPM with 3 processes (S, P, D), and R=13 sec (TR=500 msec).
[Figure: observed signal annotated with process instances P, P, S, S, D(?); learned models for S, P, and D, with the D start time chosen by the program as t+18; reconstructed signal with instances P, P, S, S, D, D.]

HPMs provide more accurate classification of unknown processes than earlier methods (e.g., Gaussian Naïve Bayes (GNB) classifier)

Standard classifier formulation
[Figure: trial timeline from t=0: View Picture or Read Sentence (4 sec), then Read Sentence or View Picture (8 sec), Press Button, Fixation, Rest; GNB asks "picture or sentence?" over a 16-sec window.]
Standard formulation of the classification problem (e.g., Gaussian Naïve Bayes (GNB)):
– Train on labeled data: known processes, known start times
– Test on unlabeled data: unknown processes, known start times
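The GNB baseline fits one Gaussian per class per feature and classifies by maximum posterior. A minimal sketch with synthetic features standing in for the fMRI window (class means and sizes are invented):

```python
import numpy as np

# GNB sketch: per-class Gaussian per feature, predict by max log-posterior.
rng = np.random.default_rng(4)
n, d = 100, 5
X0 = rng.normal(0.0, 1.0, (n, d))       # class 0: "picture" (synthetic)
X1 = rng.normal(2.0, 1.0, (n, d))       # class 1: "sentence" (synthetic)
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

# Training: estimate per-class, per-feature mean and variance.
mu = np.array([X[y == c].mean(axis=0) for c in (0, 1)])
var = np.array([X[y == c].var(axis=0) for c in (0, 1)])

def gnb_predict(x):
    """Gaussian log-likelihood per class, uniform class prior."""
    logp = -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var).sum(axis=1)
    return int(np.argmax(logp))
```

The key limitation the slide points at: this formulation classifies each window in isolation and assumes the start times are known.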

HPM classifier accounts for overlap
[Figure: same trial timeline; GNB classifies each 16-sec window in isolation ("picture or sentence?"), while the HPM classifies the overlapping windows jointly.]

Results
[Figure: same trial timeline with GNB and HPM classification windows.]
The HPM with overlapping processes improves accuracy by 15% on average.

HPMs allow detecting and examining hidden processes with unknown timing

Two cognitive processes, or three?
[Figure: cognitive processes (View picture, Read sentence, and possibly Decide whether consistent) over time, with observed fMRI traces for cortical regions 1 and 2, and an observed button press.]

Choosing Between Alternative HPM Models
Train a 2-process HPM2 on training data
Train a 3-process HPM3 on training data
Test HPM2 and HPM3 on separate test data
– Which predicts process identities better?
– Which has higher probability given the test data?
– (Use n-fold cross-validation for testing.)
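The held-out comparison amounts to choosing the model with higher cross-validated test log-likelihood. A generic sketch (synthetic 1-D data and two stand-in models, not the study's HPMs):

```python
import numpy as np

# Sketch: n-fold cross-validated held-out log-likelihood for model selection.
rng = np.random.default_rng(5)
data = rng.normal(1.0, 1.0, 60)         # synthetic stand-in for held-out trials

def loglik(x, mu, var):
    """Gaussian log-likelihood of held-out points under fitted parameters."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mu) ** 2 / var).sum()

def cv_score(fit, n_folds=5):
    """Sum of test-fold log-likelihoods over n folds."""
    folds = np.array_split(data, n_folds)
    total = 0.0
    for i in range(n_folds):
        train = np.concatenate([f for j, f in enumerate(folds) if j != i])
        mu, var = fit(train)
        total += loglik(folds[i], mu, var)
    return total

# Two stand-in models: a fixed-parameter model vs. one fitted to training data.
score_fixed = cv_score(lambda tr: (0.0, 1.0))
score_fit = cv_score(lambda tr: (tr.mean(), tr.var() + 1e-9))
best = "fitted" if score_fit > score_fixed else "fixed"
```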

2-process HPM, 3-process HPM, GNB

Summary
– Hidden Process Model formalism
– Superiority over earlier classification methods
– Basis for studying hidden cognitive processes

Future Directions
– Add temporal and/or spatial smoothness constraints to process fMRI signatures
– Allow variable-duration processes
– Give processes input arguments and output results
– Feature selection for HPMs
– Process libraries and hierarchies