Neural correlates of risk sensitivity: An fMRI study of instrumental choice behavior. Yael Niv, Jeffrey A. Edlund, Peter Dayan, and John O’Doherty.


1 Neural correlates of risk sensitivity: An fMRI study of instrumental choice behavior
Yael Niv, Jeffrey A. Edlund, Peter Dayan, and John O’Doherty
Cohen lab meeting, Feb 2007

2 Background and aims
Question: Are learned values adjusted to reflect risk?
Do you prefer to get $10, or should we toss a coin for $20 or nothing?
Human decision making is sensitive not only to the expected reward, but also to the variance, or risk, of the outcome.
Personal preferences vary: risk avoidance vs. risk seeking.
Models of choice between different outcomes:
- Expected utility models: risk affects subjective utility, for instance through the shape of the “utility curve”
- Reinforcement learning models: online learning of value = expected future reward; learned values are indifferent to risk
Neuroscientific support:
1. Dopamine firing
2. Imaging correlates of TD prediction error learning
[Figure: utility curve, subjective utility as a function of $ (axis ticks at 10, 20, 30)]
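The $10-versus-coin-toss question can be worked through numerically. The sketch below is only an illustration: the square-root and squared utility functions are assumptions chosen to stand in for a concave and a convex "utility curve", and `expected_utility` is a hypothetical helper, not something taken from the slides.

```python
import math

def expected_utility(outcomes, probs, utility):
    """Probability-weighted average of the utility of each outcome."""
    return sum(p * utility(x) for x, p in zip(outcomes, probs))

def identity(x):
    """Linear utility: subjective value equals the dollar amount."""
    return x

def convex(x):
    """Hypothetical convex utility curve (an assumption for illustration)."""
    return x ** 2

# Hypothetical concave utility curve (an assumption for illustration).
concave = math.sqrt

sure_option = ([10], [1.0])            # $10 for certain
coin_toss = ([20, 0], [0.5, 0.5])      # $20 or nothing, p = 0.5 each

for name, u in [("linear", identity), ("concave", concave), ("convex", convex)]:
    eu_sure = expected_utility(*sure_option, u)
    eu_toss = expected_utility(*coin_toss, u)
    print(name, eu_sure, eu_toss)
```

Under the linear utility the two options are equal in expected value; the concave curve favors the sure $10 (risk avoidance) and the convex curve favors the coin toss (risk seeking).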

3 Experimental design
5 stimuli:
- CS20 → 20 cents
- CS0/40 → 0 or 40 cents (p=0.5)
- CS40 → 40 cents
- CS0 → 0 cents
Randomly ordered trials; counterbalanced; 234 trials: 130 choice, 104 single stimulus
19 subjects, 3T scanner, TR = 2 sec
Choice trials: behavioral risk sensitivity
Single trials: neural values of stimuli
[Figure: trial timeline with outcome screen “You won 40 cents”; timing labels: < 1 sec, 0.5 sec, 2-5 sec ITI, 5 sec ISI]

4 Behavioral results
1. Subjects learned the task
2. Subjects showed risk sensitivity
[Figure labels: reaction time (first half, second half); score (points); subject #; blocks of trials; proportion correct on different-mean choices; blocks of 10 choices; proportion choice of certain option in 20 vs 0/40 choice]

5 Why are subjects risk sensitive?
At least two possible reasons:
1. Choices can be risk sensitive due to online learning
- Learning according to risk-neutral TD learning:
  δ(t) = r(t) + V(t+1) - V(t)
  V(t) ← V(t) + ηδ(t)
- Without choice: no bias in means
- Interaction between learning and choices in stochastic task
2. Different utilities for risky (CS0/40) and non-risky (CS20) options, despite similar mean value
- Can be implemented in “risk sensitive TD learning” of values:
  δ(t) = r(t) + V(t+1) - V(t)
  V(t) ← V(t) + ηδ(t)(1±κ)
  where the factor is (1 - κ) for positive δ(t) and (1 + κ) for negative δ(t)
- Positive κ → risk averse (learned mean < real mean)
- Negative κ → risk seeking (learned mean > real mean)
[Figure: options A and B; choice, reward, value A, value B]
[Figure: outcomes “You won 40 points” and “You won 0 points” around a mean of 20; learned value for κ=0, κ>0, κ<0]
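The risk-sensitive update on this slide can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the authors' code: each stimulus is followed directly by its reward (so V(t+1) is taken as 0), positive prediction errors are scaled by (1 - κ) and negative ones by (1 + κ) so that positive κ yields a learned mean below the real mean, and the learning rate, trial counts, and function names are all hypothetical.

```python
import random

def td_update(value, reward, eta, kappa=0.0):
    """One risk-sensitive TD update for a stimulus followed directly by reward."""
    delta = reward - value  # prediction error, with V(t+1) assumed to be 0
    # Assumed sign convention: (1 - kappa) for positive errors,
    # (1 + kappa) for negative errors; kappa = 0 is risk-neutral TD.
    scale = (1.0 - kappa) if delta > 0 else (1.0 + kappa)
    return value + eta * scale * delta

def average_learned_value(sample_reward, kappa, eta=0.1,
                          n_trials=5000, burn_in=1000, seed=0):
    """Average the learned value over trials after an initial burn-in."""
    rng = random.Random(seed)
    value, total = 0.0, 0.0
    for trial in range(n_trials):
        value = td_update(value, sample_reward(rng), eta, kappa)
        if trial >= burn_in:
            total += value
    return total / (n_trials - burn_in)

def cs20(rng):
    """CS20: always 20 cents."""
    return 20.0

def cs0_40(rng):
    """CS0/40: 0 or 40 cents with p = 0.5."""
    return 40.0 if rng.random() < 0.5 else 0.0

for kappa in (-0.3, 0.0, 0.3):
    sure_value = average_learned_value(cs20, kappa)
    risky_value = average_learned_value(cs0_40, kappa)
    print(f"kappa={kappa:+.1f}  CS20={sure_value:.1f}  CS0/40={risky_value:.1f}")
```

With every stimulus sampled on each trial (no choice), the learned CS20 value settles at 20 for any κ, while the learned CS0/40 value sits below 20 for positive κ and above 20 for negative κ.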

6 Comparing models: Behavioral fit
- Both models provide similarly good explanations of the behavioral choices
- Risk neutral TD can explain risk-sensitive behavior
- Risk adjustment of temporal difference learning can explain risk aversion (fitted κ related to actual preference)
- The value of κ predicts a difference between the learned values of CS20 and CS0/40
[Figure labels: prediction probability per choice trial; subjects (ordered by performance); κ (risk adjustment); proportion choice of certain option; CS20-CS0/40 value; r² = 0.83]
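One way to score how well a value-learning model explains choices is the probability it assigns to each observed choice. The slides do not state the choice rule or the exact fit measure, so the sketch below assumes a two-option softmax with inverse temperature `beta`, and reads "prediction probability per choice trial" as the geometric mean of the per-trial choice probabilities; both are assumptions, and the trial values are made up for illustration.

```python
import math

def softmax_prob(v_chosen, v_other, beta):
    """Probability of the chosen option under an assumed two-option softmax."""
    return 1.0 / (1.0 + math.exp(-beta * (v_chosen - v_other)))

def log_likelihood(trials, beta):
    """Sum of log choice probabilities over (v_chosen, v_other) pairs."""
    return sum(math.log(softmax_prob(vc, vo, beta)) for vc, vo in trials)

def mean_choice_probability(trials, beta):
    """Geometric mean probability per choice trial: exp(mean log likelihood)."""
    return math.exp(log_likelihood(trials, beta) / len(trials))

# Hypothetical model values (in cents) at the time of each choice:
# (value of the option the subject chose, value of the option not chosen).
example_trials = [(40.0, 20.0), (20.0, 0.0), (20.0, 18.5), (17.0, 20.0)]

print(mean_choice_probability(example_trials, beta=0.2))
```

A model that predicts no better than chance scores 0.5 on this measure, so two models (for example risk-neutral and risk-sensitive TD) can be compared by the values they reach on the same choice data.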

7 Neural correlates of stimulus value: NAC
- Bilateral nucleus accumbens (ROI) correlated with TD error regressor of both models
- Time courses extracted from peak voxel in 8mm sphere around group peak, L: (-12,3,-15); R: (9,3,-15)
- Activations consistent with TD error signal
[Figure labels: R; Y=+6; p<0.001, p<0.0001; seconds from stimulus onset; seconds from CS0/40 onset]

8 The critical question: CS20 vs CS0/40 value
Qualitatively different prediction of the models:
- Risk sensitive TD model: CS20 value ≠ CS0/40 value even when sampled without bias
- Risk neutral TD model: CS20 value = CS0/40 value when sampled without bias
-> Compare single CS averaged values of CS20 and CS0/40
No evidence for correlation between value differences and risk preference (or for risk adjusted values)
[Figure labels: seconds from stimulus onset; risk averse (> 0.7) subjects; risk prone (< 0.7) subjects; CS20 - CS0/40 value; proportion choice of certain option]
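The risk-neutral model's prediction depends on how the risky stimulus is sampled, which a short simulation can illustrate. This is a sketch under assumptions that do not come from the slides: a softmax choice rule, the particular `eta` and `beta` values, both values initialized at 20, and the hypothetical function names. It contrasts forced sampling of CS0/40 (as on single-stimulus trials) with sampling driven by the learner's own choices between CS20 and CS0/40.

```python
import math
import random

def simulate_agent(n_trials, eta, beta, forced, rng):
    """Risk-neutral TD learner facing CS20 (sure) vs CS0/40 (risky).

    forced=True samples the risky option on every trial (no choice);
    forced=False lets an assumed softmax over learned values pick the option.
    Returns (fraction of sure choices, trial-averaged risky value).
    """
    v_sure, v_risky = 20.0, 20.0  # both start at the true mean (assumption)
    n_sure, risky_sum = 0, 0.0
    for _ in range(n_trials):
        p_risky = 1.0 / (1.0 + math.exp(-beta * (v_risky - v_sure)))
        pick_risky = forced or rng.random() < p_risky
        if pick_risky:
            reward = 40.0 if rng.random() < 0.5 else 0.0
            v_risky += eta * (reward - v_risky)  # risk-neutral update
        else:
            n_sure += 1
            v_sure += eta * (20.0 - v_sure)
        risky_sum += v_risky
    return n_sure / n_trials, risky_sum / n_trials

def population_average(forced, n_agents=500, n_trials=200,
                       eta=0.2, beta=0.3, seed=1):
    """Average both summary measures over many simulated learners."""
    rng = random.Random(seed)
    runs = [simulate_agent(n_trials, eta, beta, forced, rng)
            for _ in range(n_agents)]
    frac_sure = sum(r[0] for r in runs) / n_agents
    risky_value = sum(r[1] for r in runs) / n_agents
    return frac_sure, risky_value

forced_frac_sure, forced_risky_value = population_average(forced=True)
choice_frac_sure, choice_risky_value = population_average(forced=False)
print("forced sampling:", forced_frac_sure, forced_risky_value)
print("free choice:   ", choice_frac_sure, choice_risky_value)
```

In this sketch the forced-sampling value of CS0/40 stays near 20, matching the risk-neutral prediction for unbiased sampling. Under free choice, a run of zeros lowers the risky value, the option is then avoided, and the low estimate persists, so the simulated learner favors the sure option even though κ = 0.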

9 Neural correlates: other TD error areas
Temporal difference error predictor also correlated with (at p < 0.001, uncorrected):
- L mPFC and extending to mOFC
- L caudate (especially on choice trials)
- Bilateral hippocampus
- R anterior cingulate
- L temporal cortex
TD error in single CS trials minus choice trials: strong bilateral activation in anterior insula and mPFC (also same insula region in comparison between risky and non-risky trials)
[Figure labels: R, Y=+20; R, Z=-6; L, X=-12; each shown at p<0.001 and p<0.0001]

10 Neural correlates: Risk vs. No Risk
Comparison of trials which included CS0/40 (trials involving risk) to trials which only involved constant rewarding options: bilateral activation in anterior insula
[Figure labels: R; Z=-4; p<0.005]

11 Discussion and future directions
- Simple instrumental task confirms TD value learning in the brain
- Reaction time data illustrate Pavlovian-instrumental interactions, and provide a window to difficulty of decision making
- No evidence (yet?) for risk adjustment in TD based value learning
  - Noise noise noise noise noise…
  - But: TD learning alone can explain much of data
- Other interesting comparisons:
  - Trials involving risk and those that don’t (anterior insula)
  - Choice versus single CS trials
  - TD error in choice versus single CS trials: another way to look at the dorsal and ventral divide in the striatum
- Relationship to neuroeconomic issues: what is the basis of risk seeking?
- Are there other non-TD mechanisms that represent value?
- What is the role of the risk related signal in the insula?
- What is the relationship to a conflict signal in the anterior cingulate cortex?

12 References
- Preuschoff, Bossaerts & Quartz (2006) – Neural differentiation of expected reward and risk in human subcortical structures, Neuron
- Morris, Nevet, et al. (2006) – Midbrain dopamine neurons encode decisions for future action, Nature Neuroscience
- Kuhnen & Knutson (2005) – The neural basis of financial risk taking, Neuron
- Niv, Duff & Dayan (2005) – Dopamine, uncertainty and TD learning, Behavioral and Brain Functions
- O’Doherty, Dayan et al. (2004) – Dissociable roles of ventral and dorsal striatum in instrumental conditioning, Science
- Seymour, O’Doherty, et al. (2004) – Temporal difference models describe higher order learning in humans, Nature
- Fiorillo, Tobler & Schultz (2003) – Discrete coding of reward probability and uncertainty by dopamine neurons, Science
- Mihatsch & Neuneier (2002) – Risk sensitive reinforcement learning, Machine Learning
- Niv, Joel et al. (2002) – Evolution of reinforcement learning in uncertain environments, Adaptive Behavior

