Learning, Volatility and the ACC Tim Behrens FMRIB + Psychology, University of Oxford FIL - UCL.


[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8); CON.] Kennerley et al., Nature Neuroscience, 2006

[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8); CON and ACCs.] Kennerley et al., Nature Neuroscience, 2006

Monkeys will sacrifice food opportunities to look at other monkeys. [Figure label: ACC gyrus.] Rudebeck et al., Science, 2006

Interest in other individuals is reduced after ACC gyrus lesion. [Figure label: ACC gyrus.] Rudebeck et al., Science, 2006

Anatomy: differences in connections between ACCs and ACCg. Connections unique to the sulcus are mainly with motor regions: primary motor cortex, premotor cortex, parietal motor areas and spinal cord. ACCs has information about our own actions.

Anatomy: differences in connections between ACCs and ACCg. Connections unique to the gyrus are mainly with regions that process emotional and biological stimuli: periaqueductal grey, hypothalamus and STS/STG. Insula and temporal pole connections are stronger to the gyrus. ACCg has access to information about other agents.

Anatomy: shared connections between ACCs and ACCg. Some connections are shared: orbitofrontal cortex, amygdala and ventral striatum. ACCg and ACCs are strongly interconnected. Both regions have access to and influence over reward and value processing.

ACC Sulcus and learning about your actions.

[Figure, panel B: reward history weight (β) against trials into the past (i-1 to i-8); CON and ACCs.] Kennerley et al., Nature Neuroscience, 2006

What determines the integration length? [Figure: reward history weight (β) against trials into the past (i-1 to i-8); CON.] Kennerley et al., Nat Neurosci 2006; Sugrue et al., Science 2005

VOLATILE: reward probabilities change approximately every 25 trials. STABLE: reward probabilities change only after hundreds of trials. [Figure: reward history weight (β) against trials into the past (i-1 to i-8); CON.] Kennerley et al., Nat Neurosci 2006; Sugrue et al., Science 2005

Reinforcement learning: we need to continually re-appraise the value of an action based on each new experience. The prediction ($V_t$) is compared with the outcome to give $\delta$; $\alpha \times \delta$ is then added to give the new prediction ($V_{t+1}$).

Updating beliefs on the basis of new information: $V_{t+1} = V_t + (\alpha \times \delta)$. The learning rate ($\alpha$) is the weight given to the current information. The prediction error ($\delta$) is the information available from this event.
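The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code behind the talk; the function name `update_value` and the example outcomes are assumptions chosen for the demonstration.

```python
def update_value(v_t, outcome, alpha):
    """Return the new prediction after one delta-rule update.

    v_t     -- current prediction V_t
    outcome -- observed outcome on this trial (e.g. 1.0 = reward, 0.0 = none)
    alpha   -- learning rate, the weight given to the current information
    """
    delta = outcome - v_t        # prediction error
    return v_t + alpha * delta   # V_{t+1} = V_t + alpha * delta


# Hypothetical example: start from an uninformed prediction of 0.5
# and update it after each of four made-up outcomes.
v = 0.5
for outcome in [1.0, 1.0, 0.0, 1.0]:
    v = update_value(v, outcome, alpha=0.1)
    print(round(v, 4))
```

With `alpha = 0` the prediction never moves, and with `alpha = 1` it simply copies the last outcome; values in between blend old belief and new evidence.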

The learning rate and the value of information: $V_{t+1} = V_t + (\alpha \times \delta)$. The learning rate should represent the value of the current information for guiding future beliefs.
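Why the best learning rate depends on how informative recent outcomes are can be seen in a toy simulation. The sketch below is an illustration only, not the task or model from the talk: the reward probabilities (0.9 and 0.1), the block lengths (25 and 500 trials), the two learning rates and the helper names `reward_probabilities` and `tracking_error` are all assumptions made for the example.

```python
import random


def reward_probabilities(n_trials, block_length, p_high=0.9, p_low=0.1):
    """True reward probability per trial, switching every `block_length` trials."""
    return [p_high if (t // block_length) % 2 == 0 else p_low
            for t in range(n_trials)]


def tracking_error(probabilities, alpha, seed=0):
    """Mean squared gap between a delta-rule prediction and the true probability."""
    rng = random.Random(seed)    # fixed seed so every alpha sees the same outcomes
    v = 0.5                      # uninformed starting prediction
    total = 0.0
    for p in probabilities:
        total += (v - p) ** 2    # error of the prediction made before the outcome
        outcome = 1.0 if rng.random() < p else 0.0
        v += alpha * (outcome - v)
    return total / len(probabilities)


n_trials = 4000
volatile = reward_probabilities(n_trials, block_length=25)   # frequent switches
stable = reward_probabilities(n_trials, block_length=500)    # rare switches

for name, env in (("volatile", volatile), ("stable", stable)):
    for alpha in (0.1, 0.4):
        print(name, alpha, round(tracking_error(env, alpha), 4))
```

In this setting the higher learning rate should track the volatile schedule more closely, while the lower learning rate should do better on the stable schedule, where each single outcome carries less new information.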

 =0.01  =0.4  =0.1 Relationship with integration length

[Figure: stable condition.] Behrens et al., Nature Neuroscience, 2007

$V_{t+1} = V_t + \alpha \times \delta$. Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007

Changes in reward estimates occur throughout the task, as do changes in volatility estimates. Behrens, Woolrich, Walton, Rushworth, Nature Neuroscience, 2007

[Figure panels: Decide; Monitor × Volatility.] Behrens et al., Nature Neuroscience, 2007

ACC effect size predicts learning rate across subjects. Behrens, Woolrich, Walton & Rushworth, Nat Neurosci, 2007

ACC Gyrus and learning about your social partners.

Interest in other individuals is reduced after ACC gyrus lesion. [Figure label: ACC gyrus.] Rudebeck et al., Science, 2006

Rudebeck et al., Science, 2006

Learning about other agents. Behrens, Hunt, Woolrich, Rushworth, Nature, 2008

Sources of information: probability that confederate advice is good; probability that correct colour is blue. Value of action information; value of social information. Behrens, Hunt, Woolrich, Rushworth, Nature, 2008

Social information is integrated over time - behaviour

Reward prediction error: reward − expectation. $V_{t+1} = V_t + (\alpha \times \delta)$. [Figure: effect size against time, with the outcome marked.] Behrens, Hunt, Woolrich, Rushworth, Nature, 2008

Prediction error on a social partner: lie event − lie prediction. $V_{t+1} = V_t + (\alpha \times \delta)$. [Figure: effect size against time, with the outcome marked.] Behrens, Hunt, Woolrich, Rushworth, Nature, 2008

The value of information and the ACC: value of reward information; value of social information. $V_{t+1} = V_t + (\alpha \times \delta)$

Combining information to drive behaviour: $V_{t+1} = V_t + (\alpha \times \delta)$
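One simple way to combine the two learned quantities from the earlier slides (the probability that the correct colour is blue and the probability that the confederate's advice is good) is Bayes' rule. The sketch below is an illustration under stated assumptions, not necessarily the model used in the talk: it treats the advice as correct with a given probability, independently of the learner's own reward-based estimate, and the function name `combine` is hypothetical.

```python
def combine(p_blue, p_advice_good, advice_is_blue):
    """Posterior probability that blue is correct, given the confederate's advice.

    Assumes (for illustration only) that the advice is correct with probability
    `p_advice_good`, independently of the learner's own estimate `p_blue`.
    Both probabilities are expected to lie strictly between 0 and 1.
    """
    # Likelihood of hearing this advice if blue is / is not the correct colour
    like_if_blue = p_advice_good if advice_is_blue else 1.0 - p_advice_good
    like_if_not = 1.0 - like_if_blue
    numerator = p_blue * like_if_blue
    return numerator / (numerator + (1.0 - p_blue) * like_if_not)


# Hypothetical example: reward history mildly favours blue,
# and a fairly trustworthy confederate also says blue.
print(combine(p_blue=0.6, p_advice_good=0.8, advice_is_blue=True))
```

When the advice is uninformative (`p_advice_good = 0.5`) the result equals the learner's own estimate, and when the learner has no preference (`p_blue = 0.5`) the result follows the advice in proportion to how trustworthy it is.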

Conclusions: ACC codes a learning signal when information is observed. This signal predicts the speed of learning. Learning from our own actions and learning from others' actions are processed in parallel in ACCs and ACCg. The outputs of these parallel learning processes are combined in the reward system.

Acknowledgments: Matthew Rushworth, Mark Woolrich, Laurence Hunt, Mark Walton