# Time and Causality: A theory of learning

- What is associative learning for?
- How does Rescorla Wagner do? How does it fail?
- Wagner’s time-based theory of learning
- Applications

What is associative learning for?
- Learning about causality: tone --> food
- Learning about features of stimuli, i.e. what goes with what: warm, fruit, nice, pastry, juicy

If you want to design a model to learn about causality, what should it be like?
- Directionality: cause --> effect, or effect --> cause
- Sensitivity to the delay between cause and effect
- Sensitivity to correlation
- Appropriate treatment of predictable outcomes (learning should be selective for surprising events)

What possible rules are there for forming associations?
Would a pure contiguity model have the properties we want? (e.g. Hebb) $\Delta V = \alpha\beta(\lambda)$

Pure contiguity: Direction X; Delay yes; Correlation X; Predictable outcomes X
$\Delta V = \alpha\beta(\lambda)$

Rescorla Wagner avoids some of these problems:
$\Delta V = \alpha\beta(\lambda - \Sigma V)$
The bracketed term measures how surprising the US is.
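To make the contrast with pure contiguity concrete, here is a minimal Python sketch of a blocking design (tone --> food, then tone+light --> food) run under both rules. It assumes the standard reading of the Rescorla Wagner rule, in which all cues present on a trial share one error term; the function names, the learning rate and the trial counts are illustrative choices, not values from the presentation.

```python
def rescorla_wagner_step(V, present, lam, alpha=0.3, beta=1.0):
    """One trial: all cues present share the same surprise term (lam - sum of V)."""
    surprise = lam - sum(V[cue] for cue in present)
    for cue in present:
        V[cue] += alpha * beta * surprise


def contiguity_step(V, present, lam, alpha=0.3, beta=1.0):
    """Pure contiguity: every pairing adds the same amount, surprising or not."""
    for cue in present:
        V[cue] += alpha * beta * lam


def blocking(step, stage1_trials=30, stage2_trials=30):
    """Stage 1: tone --> food. Stage 2: tone+light --> food."""
    V = {"tone": 0.0, "light": 0.0}
    for _ in range(stage1_trials):
        step(V, ["tone"], lam=1.0)
    for _ in range(stage2_trials):
        step(V, ["tone", "light"], lam=1.0)
    return V


rw = blocking(rescorla_wagner_step)
hebb = blocking(contiguity_step)
print("Rescorla Wagner:", rw)
print("Pure contiguity:", hebb)
```

Under the error-correcting rule the light gains almost nothing, because the tone already predicts the food; under pure contiguity the light keeps gaining strength on every pairing.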

Rescorla Wagner thus allows selective learning about surprising outcomes.
It can also explain sensitivity to correlation: the context is treated as a cue in its own right, present on every trial, so any food that arrives without the tone strengthens the context, which then competes with the tone.
context --> food
context+tone --> food
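The role of the context can be illustrated with a short Python sketch. It assumes the usual Rescorla Wagner treatment in which the context is simply another cue that is present on every trial; the strict alternation of trial types, the learning rate and the names used here are hypothetical simplifications for illustration.

```python
def rw_update(V, present, lam, rate=0.1):
    """Rescorla Wagner update with alpha*beta folded into a single rate."""
    error = lam - sum(V[cue] for cue in present)
    for cue in present:
        V[cue] += rate * error


def train(food_without_tone, blocks=500):
    """Alternate context+tone trials (always fed) with context-alone trials."""
    V = {"context": 0.0, "tone": 0.0}
    for _ in range(blocks):
        rw_update(V, ["context", "tone"], lam=1.0)  # context+tone --> food
        rw_update(V, ["context"], lam=1.0 if food_without_tone else 0.0)
    return V


correlated = train(food_without_tone=False)   # food only when the tone is on
uncorrelated = train(food_without_tone=True)  # food just as likely without it
print("correlated:  ", correlated)
print("uncorrelated:", uncorrelated)
```

When food also arrives in the tone's absence, the context ends up carrying the prediction and the tone is left with little strength, even though the number of tone and food pairings is the same in both conditions.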

Rescorla Wagner: Direction X; Delay X; Correlation yes; Predictable outcomes yes

Rescorla Wagner cannot explain why backward conditioning should not work, and cannot easily explain the effect of trace intervals. This is because nothing in the Rescorla Wagner equation refers to time, and time is the essence of causality.

Wagner’s SOP (1981): Sometimes Opponent Process Theory
SOP incorporates time by basing itself on the idea that processing of a stimulus can vary:
- as a function of time (cf. trace decay in STM)
- as a function of recent events

Stimulus processing is reduced if:
- the same stimulus has just been presented: self-generated priming
- a predictor (CS) for the stimulus has just been presented: retrieval-generated priming

General Assumptions
A stimulus is represented as a set of elements, some of which may be activated by stimulus presentation.
Elements may be inactive, or in one of two states of activation:
- A1 is a primary state of limited capacity (corresponding to rehearsal/STM)
- A2 is a secondary state of activation

General Assumptions
Differences between A1 and A2: the response elicited by A2 is often less intense than that elicited by A1, and in some cases it is in the opposite direction.

General Assumptions
When a stimulus is presented, some of its inactive elements enter A1, then decay into A2, and then become inactive again.
inactive --> A1 --(fast)--> A2 --(slow)--> inactive

[Figure: state diagram with nodes I (inactive), A1 and A2; the A1 --> A2 transition is fast, the A2 --> I transition is slow]
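The state cycle can be sketched in Python by tracking the proportion of a stimulus's elements in each state over discrete time steps. This is a minimal sketch under stated assumptions: the parameter names `p1` (share of inactive elements recruited into A1 while the stimulus is on), `pd1` (A1 --> A2 decay) and `pd2` (A2 --> inactive decay) and their values are hypothetical, chosen only so that the first decay is fast and the second is slow.

```python
def sop_step(state, presented, p1=0.8, pd1=0.5, pd2=0.05):
    """Advance (inactive, A1, A2) proportions by one time step."""
    inactive, a1, a2 = state
    to_a1 = p1 * inactive if presented else 0.0  # I --> A1, only while stimulus is on
    to_a2 = pd1 * a1                             # A1 --> A2 (fast)
    to_inactive = pd2 * a2                       # A2 --> I (slow)
    return (inactive - to_a1 + to_inactive,
            a1 + to_a1 - to_a2,
            a2 + to_a2 - to_inactive)


# Present the stimulus for one step, then let the elements decay.
history = [(1.0, 0.0, 0.0)]
for t in range(40):
    history.append(sop_step(history[-1], presented=(t == 0)))

a1_peak = max(range(len(history)), key=lambda t: history[t][1])
a2_peak = max(range(len(history)), key=lambda t: history[t][2])
print("A1 peaks at step", a1_peak, "and A2 peaks at step", a2_peak)
```

A1 activity peaks at presentation and falls quickly, while A2 activity builds up afterwards and lingers, which is the asymmetry the later slides rely on.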

How does this model produce self-generated priming?
When the stimulus is first presented its elements go into A1, and then quickly decay into A2.
Elements cannot go from A2 directly to A1; they must decay to I first.
The more elements have accumulated in the A2 state, the fewer are left for the next presentation of the stimulus to put into A1.
So if the stimulus is presented again after a short while, the second presentation produces less A1 activity, and the stimulus is less effective.
[Figure: state diagram (I, A1, A2) at the time of the next presentation, with most elements still in A2]
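Self-generated priming can be sketched in Python by presenting the same stimulus twice and counting how many elements the second presentation can still recruit into A1. The constants `P1`, `PD1` and `PD2` and the gap lengths are hypothetical values chosen for illustration (fast A1 --> A2 decay, slow A2 --> I decay), not parameters from the presentation.

```python
P1, PD1, PD2 = 0.8, 0.5, 0.05  # assumed recruitment and decay rates


def sop_step(state, presented):
    """Advance (inactive, A1, A2) proportions by one time step."""
    inactive, a1, a2 = state
    to_a1 = P1 * inactive if presented else 0.0
    to_a2 = PD1 * a1
    to_inactive = PD2 * a2
    return (inactive - to_a1 + to_inactive,
            a1 + to_a1 - to_a2,
            a2 + to_a2 - to_inactive)


def recruited_on_second_presentation(gap):
    """Share of elements the second presentation moves from I into A1."""
    state = sop_step((1.0, 0.0, 0.0), presented=True)  # first presentation
    for _ in range(gap):                               # interval with no stimulus
        state = sop_step(state, presented=False)
    inactive = state[0]
    return P1 * inactive


first = P1 * 1.0  # all elements are inactive before the first presentation
short_gap = recruited_on_second_presentation(gap=5)
long_gap = recruited_on_second_presentation(gap=200)
print(first, short_gap, long_gap)
```

With a short gap most elements are still in A2 and unavailable, so the second presentation recruits far fewer of them; with a long gap the elements have returned to the inactive state and the stimulus is nearly as effective as the first time.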

Retrieval-generated priming:
If an associate of the stimulus is presented, then the stimulus's elements are activated directly to the A2 state.
inactive --> A2 --> inactive

Condition Tone --> Food, and then present Tone; what happens to Food elements?
The tone sends inactive food elements directly into A2, so when food is presented it is less effective: “conditioned diminution of the UR”.

Differences between A1 and A2
Learning about A1 and A2 obeys different rules. To form an association the CS must be in A1:
- if the US is also in A1, an excitatory association forms
- if the US is in A2, an inhibitory association forms
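These rules can be written as a small Python function. This is a sketch that assumes learning is proportional to the overlap (product) of the relevant activation levels; the function name and the rate parameters `l_plus` and `l_minus` are hypothetical.

```python
def sop_delta_v(cs_a1, us_a1, us_a2, l_plus=0.1, l_minus=0.02):
    """Net change in associative strength at one moment in time.

    cs_a1, us_a1, us_a2 are proportions of elements in each state (0 to 1).
    """
    excitatory = l_plus * cs_a1 * us_a1   # CS in A1 while US in A1
    inhibitory = l_minus * cs_a1 * us_a2  # CS in A1 while US in A2
    return excitatory - inhibitory


print(sop_delta_v(cs_a1=0.8, us_a1=0.8, us_a2=0.0))  # excitatory learning
print(sop_delta_v(cs_a1=0.8, us_a1=0.0, us_a2=0.6))  # inhibitory learning
print(sop_delta_v(cs_a1=0.0, us_a1=0.8, us_a2=0.6))  # CS not in A1: no learning
```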

How does conditioning happen?
After one trial: [Figure: state diagrams (I, A1, A2) for tone and food]

How does conditioning stop?
After many trials: CS mainly in A1, US mainly in A2 --> mix of excitatory and inhibitory learning

How does extinction happen?
CS mainly in A1, US all in A2 --> inhibitory learning
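The way acquisition levels off and extinction reverses it can be sketched at the level of whole trials. This Python example is a deliberate simplification and not Wagner's real-time model: it assumes the CS is fully in A1, that the CS primes a share of US elements equal to the current associative strength into A2, and that any US actually presented puts the remaining elements into A1. The function name, the equal learning rates and the resulting asymptote are artefacts of these assumptions.

```python
def trial(v, us_presented, l_plus=0.2, l_minus=0.2):
    """Return the associative strength after one CS trial (trial-level sketch)."""
    primed = min(max(v, 0.0), 1.0)  # share of US elements the CS sends to A2
    us_a2 = primed
    us_a1 = (1.0 - primed) if us_presented else 0.0
    cs_a1 = 1.0                     # assume the CS is fully in A1
    return v + cs_a1 * (l_plus * us_a1 - l_minus * us_a2)


v = 0.0
acquisition = []
for _ in range(40):                 # tone --> food
    v = trial(v, us_presented=True)
    acquisition.append(v)

extinction = []
for _ in range(40):                 # tone --> nothing
    v = trial(v, us_presented=False)
    extinction.append(v)

print("end of acquisition:", acquisition[-1])
print("end of extinction: ", extinction[-1])
```

During acquisition the gains shrink as more of the US is primed into A2, until excitatory and inhibitory learning balance; in extinction the US is only ever in A2, so the change is purely inhibitory and the strength falls back towards zero.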

Inhibitory conditioning:
First establish a tone --> food association,
then introduce tone+light --> nothing trials.
On these trials the CSs are mainly in A1 and the US is all in A2 --> inhibitory learning

An inhibitor prevents inactive elements of the US from entering A2.
It will thus interfere with action of a conditioned excitor, which is trying to put inactive US elements into A2.

So how does this model do all the things that learning about causality would require?
- Selective learning about signals for surprising events
- Correlation
- Delay
- Directionality

Blocking
Stage 1: tone --> food. Stage 2: tone+light --> food.
Early Stage 1: [Figure: state diagrams (I, A1, A2) for food and tone]
Late Stage 1: [Figure: state diagrams (I, A1, A2) for food and tone]
Stage 2: [Figure: state diagrams (I, A1, A2) for food and light] CS mainly in A1, US mainly in A2 --> mix of excitatory and inhibitory learning

Excitatory Conditioning Short ISI
Mainly A1/A1 ---> strong excitatory association

Excitatory Conditioning Medium ISI
Less CS in A1 ---> weaker excitatory association

Excitatory Conditioning Very Long ISI
No CS in A1 ---> no excitatory association

Backward conditioning
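[Figure: state diagrams (I, A1, A2) for food and tone, with food presented before tone]
With the food presented first, its elements have largely decayed into A2 by the time the tone is in A1, so little or no excitatory association forms.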
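The effects of the CS-US interval and of stimulus order can be sketched together in Python by letting CS and US elements cycle through their states in real time and summing the learning at each step. This is a minimal sketch under stated assumptions: a single first trial with no retrieval-generated priming, one-step stimulus presentations, and hypothetical parameter values and names (`sop_step`, `trial_delta_v`, `l_plus`, `l_minus`).

```python
def sop_step(state, presented, p1=0.8, pd1=0.5, pd2=0.05):
    """Advance (inactive, A1, A2) proportions by one time step."""
    inactive, a1, a2 = state
    to_a1 = p1 * inactive if presented else 0.0
    to_a2 = pd1 * a1
    to_inactive = pd2 * a2
    return (inactive - to_a1 + to_inactive,
            a1 + to_a1 - to_a2,
            a2 + to_a2 - to_inactive)


def trial_delta_v(cs_onset, us_onset, l_plus=0.1, l_minus=0.02, extra_steps=60):
    """Net learning over one trial with brief CS and US presentations."""
    cs = us = (1.0, 0.0, 0.0)
    delta_v = 0.0
    for t in range(max(cs_onset, us_onset) + extra_steps):
        cs = sop_step(cs, presented=(t == cs_onset))
        us = sop_step(us, presented=(t == us_onset))
        cs_a1, us_a1, us_a2 = cs[1], us[1], us[2]
        # Excitatory when both are in A1, inhibitory when the US is in A2.
        delta_v += l_plus * cs_a1 * us_a1 - l_minus * cs_a1 * us_a2
    return delta_v


forward = {isi: trial_delta_v(cs_onset=0, us_onset=isi) for isi in (1, 4, 16)}
backward = trial_delta_v(cs_onset=4, us_onset=0)  # food first, then tone
print("forward (tone then food):", forward)
print("backward (food then tone):", backward)
```

With these assumed values the net excitatory learning shrinks as the interval grows, and the backward arrangement yields net inhibitory learning because the tone is in A1 mostly while the food elements are in A2.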

Further Predictions and Applications

The theory predicts that a US will be processed less effectively when it is predicted. This was tested by Terry and Wagner (1975).
Train: US: CS --> US; no US: CS --> - (or the opposite)

Then train tone --> US, light --> no US.
After this training (US: CS --> US; no US: CS --> -), test by comparing:
- tone --> US: CS ??
- light --> US: CS ??
A predicted shock should be less effective than an unsignalled shock, so tone trials should be less accurate than light trials.

Another prediction of the account is that a predicted CS is less effective at evoking its CR than a surprising one (priming).
Train: A --> X --> food; B --> Y --> food
Test the CR to X and Y with the same combinations and with different combinations:
- same: A --> X, B --> Y
- different: A --> Y, B --> X

Applications 1: Andresen et al (1990), the scapegoat effect
Suggested that a novel-tasting food eaten after the “normal” food which precedes CT will acquire a strong association and overshadow the association to the normal food (act as a scapegoat).
This idea appeals to two principles:
(i) conditioning two stimuli together results in less learning than if you condition just one: overshadowing
(ii) novel stimuli condition better than familiar ones: latent inhibition
[Figure: repeated CS presentations in a context, followed by CS +]

Applications 2: Drug addiction and tolerance, e.g. Paletta & Wagner 1986
The response elicited by A2 may be opposite to that elicited by A1.
If the UR has two phases, one opposite to the other, it suggests that the A1 and A2 responses are opposite to each other,
e.g. the UR to morphine: sedation/hypoactivity (A1 response) followed by hyperactivity (compensatory A2 response).
This means that CSs associated with the drug, which put the drug's elements directly into A2, may produce tolerance to the drug’s effect.

Paletta & Wagner 1986
Three groups of animals:
- Morphine (distinctive context)
- Morphine (home cage)
- No drug
All groups were then tested in the distinctive context, measuring activity and sensitivity to pain (tail flick test).
Across several experiments they found evidence of hyperactivity and hyperalgesia in the group that had experienced morphine in the distinctive context: the opposite of the drug’s normal effects.

Suggested Reading
- Dickinson, A. (1980). Contemporary animal learning theory. Cambridge University Press. (Short, sophisticated but compelling introduction to learning theory written from a causal perspective)
- Honey, R.C., Hall, G., & Bonardi, C. (1993). Negative priming in associative learning: Evidence from serial conditioning procedures. Journal of Experimental Psychology: Animal Behavior Processes, 19. (A test of Wagner's theory)
- Marlin, N.A., & Miller, R.R. (1981). Associations to contextual stimuli as a determinant of long term habituation. Journal of Experimental Psychology: Animal Behavior Processes, 7.
- Paletta, M.S., & Wagner, A.R. (1986). Development of context-specific tolerance to morphine: support for a dual process interpretation. Behavioral Neuroscience, 100. (Application of Wagner's theory)
- Terry, W.S., & Wagner, A.R. (1975). Short term memory for "surprising" versus "expected" conditioned stimuli in Pavlovian conditioning. Journal of Experimental Psychology: Animal Behavior Processes, 104.
- Wagner, A.R. (1981). SOP: A model of automatic memory processing in animals. In N.E. Miller & R.R. Spear (Eds.), Information processes in animals: Memory Mechanisms (pp ). Hillsdale, N.J.: Erlbaum. (Wagner’s theory!)