Hedging Your Bets by Learning Reward Correlations in the Human Brain

Slides:



Advertisements
Similar presentations
Risk-Responsive Orbitofrontal Neurons Track Acquired Salience
Advertisements

Volume 60, Issue 5, Pages (December 2008)
Elizabeth V. Goldfarb, Marvin M. Chun, Elizabeth A. Phelps  Neuron 
Volume 69, Issue 3, Pages (February 2011)
Luke Clark, Andrew J. Lawrence, Frances Astley-Jones, Nicola Gray 
Medial Prefrontal Cortex Predicts Internally Driven Strategy Shifts
Rachel Ludmer, Yadin Dudai, Nava Rubin  Neuron 
Volume 64, Issue 3, Pages (November 2009)
Volume 72, Issue 4, Pages (November 2011)
Volume 82, Issue 1, Pages (April 2014)
Frontal Cortex and the Discovery of Abstract Action Rules
Alan N. Hampton, Ralph Adolphs, J. Michael Tyszka, John P. O'Doherty 
Sang Wan Lee, Shinsuke Shimojo, John P. O’Doherty  Neuron 
Volume 92, Issue 5, Pages (December 2016)
Neural Mechanisms of Hierarchical Planning in a Virtual Subway Network
Choice Certainty Is Informed by Both Evidence and Decision Time
Martin O'Neill, Wolfram Schultz  Neuron 
Volume 84, Issue 1, Pages (October 2014)
Volume 94, Issue 2, Pages e6 (April 2017)
Volume 66, Issue 6, Pages (June 2010)
Learning to Simulate Others' Decisions
Volume 94, Issue 4, Pages e7 (May 2017)
Sheng Li, Stephen D. Mayhew, Zoe Kourtzi  Neuron 
Perceptual Learning and Decision-Making in Human Medial Frontal Cortex
Volume 93, Issue 2, Pages (January 2017)
An Optimal Decision Population Code that Accounts for Correlated Variability Unambiguously Predicts a Subject’s Choice  Federico Carnevale, Victor de Lafuente,
Volume 62, Issue 5, Pages (June 2009)
Volume 79, Issue 1, Pages (July 2013)
Vincent B. McGinty, Antonio Rangel, William T. Newsome  Neuron 
Dustin E. Stansbury, Thomas Naselaris, Jack L. Gallant  Neuron 
Volume 82, Issue 5, Pages (June 2014)
Cortical Mechanisms of Smooth Eye Movements Revealed by Dynamic Covariations of Neural and Behavioral Responses  David Schoppik, Katherine I. Nagel, Stephen.
Volume 94, Issue 2, Pages e6 (April 2017)
Jack Grinband, Joy Hirsch, Vincent P. Ferrera  Neuron 
Volume 74, Issue 3, Pages (May 2012)
A Neurocomputational Model of Altruistic Choice and Its Implications
Dharshan Kumaran, Hans Ludwig Melo, Emrah Duzel  Neuron 
Volume 45, Issue 4, Pages (February 2005)
Human Orbitofrontal Cortex Represents a Cognitive Map of State Space
Franco Pestilli, Marisa Carrasco, David J. Heeger, Justin L. Gardner 
Independent Category and Spatial Encoding in Parietal Cortex
Volume 22, Issue 18, Pages (September 2012)
Caleb E. Strait, Tommy C. Blanchard, Benjamin Y. Hayden  Neuron 
Joseph T. McGuire, Matthew R. Nassar, Joshua I. Gold, Joseph W. Kable 
John T. Arsenault, Koen Nelissen, Bechir Jarraya, Wim Vanduffel  Neuron 
Kerstin Preuschoff, Peter Bossaerts, Steven R. Quartz  Neuron 
Subliminal Instrumental Conditioning Demonstrated in the Human Brain
Erie D. Boorman, John P. O’Doherty, Ralph Adolphs, Antonio Rangel 
Neural Mechanisms Underlying Human Consensus Decision-Making
James M. Jeanne, Tatyana O. Sharpee, Timothy Q. Gentner  Neuron 
Volume 92, Issue 5, Pages (December 2016)
Value-Based Modulations in Human Visual Cortex
Franco Pestilli, Marisa Carrasco, David J. Heeger, Justin L. Gardner 
A Neural Signature of Hierarchical Reinforcement Learning
Visual Feature-Tolerance in the Reading Network
Orbitofrontal Cortex Uses Distinct Codes for Different Choice Attributes in Decisions Motivated by Curiosity  Tommy C. Blanchard, Benjamin Y. Hayden,
Nicholas E. Bowman, Konrad P. Kording, Jay A. Gottfried  Neuron 
Learning to Simulate Others' Decisions
Predictive Neural Coding of Reward Preference Involves Dissociable Responses in Human Ventral Midbrain and Ventral Striatum  John P. O'Doherty, Tony W.
Volume 87, Issue 6, Pages (September 2015)
Encoding of Stimulus Probability in Macaque Inferior Temporal Cortex
Medial Prefrontal Cortex Predicts Internally Driven Strategy Shifts
Perceptual Classification in a Rapidly Changing Environment
Tuned Normalization Explains the Size of Attention Modulations
Supratim Ray, John H.R. Maunsell  Neuron 
Volume 90, Issue 5, Pages (June 2016)
Orbitofrontal Cortex Uses Distinct Codes for Different Choice Attributes in Decisions Motivated by Curiosity  Tommy C. Blanchard, Benjamin Y. Hayden,
Visual Feature-Tolerance in the Reading Network
Volume 28, Issue 18, Pages e3 (September 2018)
Volume 23, Issue 11, Pages (June 2013)
Presentation transcript:

Hedging Your Bets by Learning Reward Correlations in the Human Brain Klaus Wunderlich, Mkael Symmonds, Peter Bossaerts, Raymond J. Dolan  Neuron  Volume 71, Issue 6, Pages 1141-1152 (September 2011) DOI: 10.1016/j.neuron.2011.07.025 Copyright © 2011 Elsevier Inc. Terms and Conditions

Figure 1 Experimental Design (A) Subjects were presented with a slider to set portfolio weights that determine the fraction of each resource (wind or solar power) in the energy mix (screen 1). The weights could be set within the range from −1 to 2, with a fixed relationship that both weights always add up to 1, i.e., wwind = 1 − wsun. The trial outcome (screen 2) displayed the individual resource values for sun and wind, and the portfolio value of the combined mix (calculated by the weights from screen 1). (B) Optimal portfolio weight wsun (wwind = 1 – wsun) increases as a function of the correlation coefficient between sun and wind outcomes. The background color indicates portfolio standard deviation (blue = small SD, red = large SD). Optimal portfolio weights (for variance minimization) are displayed as white line, the gray lines indicate the 10% interval around the optimal choice (a deviation of that amount from the optimal weights would result in a 10% higher SD). (C) The correlation estimate ρ (red line) is updated from trial to trial (x axis) via a correlation prediction error ζ (green stems) and then in a second step used to allocate weights in every trial. Zeta is calculated as the cross-product between the two resource outcome prediction errors (gray bars). The correlation coefficient that was used to generate the data in this illustration is −0.60 during the first ten trials and afterward changes to +0.80 (dashed line). Learning of ρ from ζ is depicted here for a learning rate of 0.2. Neuron 2011 71, 1141-1152DOI: (10.1016/j.neuron.2011.07.025) Copyright © 2011 Elsevier Inc. Terms and Conditions

Figure 2 Model Fit and Behavior (A) The correlation learning model explained subjects' behavior best. Plotted are the Bayesian information criterions, which are corrected for the different levels of complexity in the models (smaller values are better). The r2 value represents the proportion of behavioral variance explained by each model. (B) Regression of actual weights on model predicted weights. Data is pooled over all subjects; for single subject results see Table 1. Note that the deviations at the extremes are a result from bounding the possible weight range at −1 and 2; any behavioral errors at the boundary could therefore happen only in one direction. Error bars = SEM. (C) Both the response of a representative subject (blue) and the model predicted weights (red) approach the normative best response under full knowledge of the generative correlation (black line) with some lag, which results from the time necessary to observe changes in correlation. Subjects responded after a 20-trial long observation-only phase (not shown). Neuron 2011 71, 1141-1152DOI: (10.1016/j.neuron.2011.07.025) Copyright © 2011 Elsevier Inc. Terms and Conditions

Figure 3 Neural Representation of Correlation Strength (A) Neural activity in midinsular cortex correlated with the trial-by-trial model predicted correlation strength between the two resource values at the presentation of the outcome screen. (B) Effect size plots (average percent signal change across subjects). Data plotted separately for trials in which the model predicted correlation strength was low and high in four bins (25/50/75/100 percentile of correlation range, errors bars = SEM). Activity in insula increased linearly with the correlation coefficient (that is, in contrast to the covariance, normalized by the standard deviations of the resources). Data were extracted using a cross-validation (leave-one-out) procedure to ensure independence of data used for localization and effect measure. (C) Time course plot of effect size for the correlation coefficient regressor. The correlation coefficient is represented at the time of the outcome screen, when new evidence becomes available, but not during the choice period. Thin lines = SEM. (D) Comparison of explained variance in the behavioral model with the explained variance in the fMRI analysis. Fluctuations in BOLD activity in midinsula can be particularly well explained within those subjects whose behavior is also well explained by the model (r = 0.50, p = 0.03). Each dot represents one subject and the line is the regression slope. Neuron 2011 71, 1141-1152DOI: (10.1016/j.neuron.2011.07.025) Copyright © 2011 Elsevier Inc. Terms and Conditions

Figure 4 Neural Representation of Correlation Prediction Errors (A) Activity in rostral cingulate cortex correlated with the correlation prediction error. (B) Effect size plots (similar to Figure 3B) for the cluster confirm a linear effect. Neuron 2011 71, 1141-1152DOI: (10.1016/j.neuron.2011.07.025) Copyright © 2011 Elsevier Inc. Terms and Conditions

Figure 5 Absolute Weight Updates Activity in ACC/DMPFC and anterior insula correlated, at the time of the outcome screen, with the absolute amount that subjects update the resource allocation weights during the following choice. Neuron 2011 71, 1141-1152DOI: (10.1016/j.neuron.2011.07.025) Copyright © 2011 Elsevier Inc. Terms and Conditions