Conditioning Bear with me. Bare with me. Beer with me. Stay focused.
Learning
A. Two-process learning (Rescorla-Solomon 67)
– fast: fear and arousal
– slow: adaptive behavioral responses
Typically the fast response (fear and arousal) subsides as the slow adaptive response is learned.
B. Three-process learning
– A, plus declarative memory (as opposed to procedural)
C. More-than-three-process learning
– A, plus declarative memory: episodic memory, semantic memory
– more stuff
Conditional and Unconditional Training
[Diagram: CS and US timing in the delay procedure and the trace procedure, labeled easier and harder respectively]
[Diagram: S, US, UR (innate link); CS, US, UR/CR (innate and learned links)]
US = “Reinforcer”
Classical and Operant
Classical: delivery of the reinforcer is contingent on the occurrence of a stimulus (the CS).
[Diagram: CS, US, UR/CR; links labeled innate and learned]
Operant: delivery of the reinforcer is contingent on the occurrence of a designated response.
[Diagram: S1, US, Action; links labeled innate and learned]
Classical conditioning (CC) predicts that the animal will produce the UR/CR while performing the desired action, but does not explain why the animal learns to select the action.
Selectionist View
Selectionist principles
– Behaviors are varied, selected and retained in a process similar to the natural selection of species
– Only overt behaviors can be reinforced by the environment
– The principle of selection is based on the behavioral discrepancy
Behavioral Discrepancy
Behavioral discrepancy is the change in an ongoing behavior produced by the eliciting stimulus.
Example: Presentation of food produces salivation, which would not otherwise occur.
Unified Selection Principle
Whenever a behavioral discrepancy occurs, an environment-behavior relation is selected that consists (other things being equal) of all those stimuli occurring immediately before the discrepancy and all those responses occurring immediately before and at the same time as the elicited response.
Under this principle there is no difference between Classical and Operant conditioning as far as learning goes.
Conditioning Phenomena
[Table: Name | Set I | Set II | Test; rows: Pavlovian, Overshadowing, Inhibitory, Blocking, Upwards unblocking, Downwards unblocking; cell contents not recoverable]
It goes on...
Conditioning/Selection Models
Trial-by-trial
– Probabilistic (Dayan-Long, Cheng-Novick) … and not (Rescorla-Wagner)
– NN (Donahoe)
Moment-by-moment
– Sutton-Barto
– Mignault
– Schmajuk (NN)
~ A bazillion others...
S1 and S2 processing should happen at roughly the same time, so almost all models suggest a multiplicative relationship between the levels of S1 and S2.
Rescorla-Wagner model
– Trial based
– Based on the net prediction of the reward
– Learning only happens when a prediction discrepancy is detected
– Falls out straight from ML (maximum-likelihood) estimation of association strength
– Is essentially the delta rule
Problems:
– Does not deal well with overshadowing and downwards unblocking...
– Does not depend on the temporal relations between stimuli
– Does not explain re-acquisition rate
[Equation; labeled terms: reward, association strength update, net prediction, stimulus eligibility]
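The four quantities labeled on this slide (reward, net prediction, stimulus eligibility, association strength update) fit the usual delta-rule form of the model. The sketch below is only an illustration under assumptions: the function name, the 0/1 stimulus coding, the single learning rate of 0.1 and the trial counts are all hypothetical choices, not taken from the slides. It shows blocking, one of the phenomena listed earlier: pretraining on S1 alone leaves almost no discrepancy for S2 to absorb during compound training.

```python
def rescorla_wagner(trials, n_stimuli, learning_rate=0.1):
    """Run trial-by-trial Rescorla-Wagner updates and return the final weights.

    trials: list of (x, r) pairs, where x is a tuple of 0/1 stimulus
    indicators (used here as the eligibility) and r is the reward on that trial.
    """
    w = [0.0] * n_stimuli  # association strengths, one per stimulus
    for x, r in trials:
        # Net prediction: summed strength of all stimuli present on the trial.
        net_prediction = sum(wi * xi for wi, xi in zip(w, x))
        # Learning is driven only by the prediction discrepancy.
        discrepancy = r - net_prediction
        # Only stimuli that were present (xi = 1) are updated.
        w = [wi + learning_rate * discrepancy * xi for wi, xi in zip(w, x)]
    return w


# Hypothetical blocking design: S1 alone is rewarded first, then S1+S2 together.
pretraining = [((1, 0), 1.0)] * 100
compound = [((1, 1), 1.0)] * 100

blocked = rescorla_wagner(pretraining + compound, n_stimuli=2)
control = rescorla_wagner(compound, n_stimuli=2)  # no pretraining
```

In the blocked group S1 ends up carrying nearly all of the association and S2 almost none, while in the control group the two stimuli share it equally. Because the update sees only which stimuli were present on a trial, it also illustrates the slide's point that the model ignores temporal relations within a trial.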
Sutton-Barto model
– Real-time model
– Combines Ẏ (Y-dot) theory with the RW model
– Time-derivative model: presumes that all stimuli produce +V at the onset and -V at the offset
– Deals with secondary conditioning
Problems:
– Does not model Inter-Stimulus Intervals (ISI), where the efficiency of the training should decrease with increased ISI
– Does not deal with reacquisition
[Equation; labeled term: sum of all the associative strengths at a given time]
Temporal Difference model
– Is related to the SB model (and the RW model)
– Models reward in small discrete intervals
– Models second-order conditioning
– Based on the assumption that the goal of learning is to accurately predict the future US levels
Problems:
– No model of attention, salience, configuration etc...
– No indirect associations modeled (sensory preconditioning)
– Problems with downwards unblocking
[Equation; labeled term: discounted prediction of the future reward (V for predicted values of S)]
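A minimal sketch of the idea of a discounted prediction of future reward, using a TD(0) update. Everything specific here is an assumption made for illustration, not something stated on the slide: the function name, the trial layout (CS onset at step 2, US at step 7), the discount factor 0.9, and the representation of the CS as one learnable weight per time step since its onset.

```python
def td_conditioning(n_trials=500, n_steps=10, cs_onset=2, us_time=7,
                    alpha=0.1, gamma=0.9):
    """Learn a discounted US prediction for each time step after CS onset.

    Returns w, where w[k] is the predicted value k steps after CS onset.
    Before CS onset no stimulus is present, so the prediction is fixed at 0.
    """
    w = [0.0] * (n_steps - cs_onset)

    def value(t):
        return w[t - cs_onset] if t >= cs_onset else 0.0

    for _ in range(n_trials):
        for t in range(n_steps - 1):
            # Reward arrives on the transition into the US time step.
            reward = 1.0 if t + 1 == us_time else 0.0
            # TD error: reward plus discounted next prediction minus current one.
            td_error = reward + gamma * value(t + 1) - value(t)
            if t >= cs_onset:
                w[t - cs_onset] += alpha * td_error
    return w


values = td_conditioning()
```

After training, the prediction is close to 1 on the step just before the US and falls off by a factor of gamma for each earlier step back to CS onset, so the CS onset itself predicts roughly gamma to the fourth power with these settings.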
Statistical models
– This results in exactly the RW model with ML (maximum likelihood).
– This is EM (expectation-maximization).
– Similar to comparator models of conditioning (whatever they are).
– Has problems with inhibitory conditioning.
Dayan & Long’s model
– Models the conditioning phenomena.
– Does not consider associability (eligibility in SB) and attention.
– No distinction between preparatory and consummatory conditioning.
NN models
– Everything is a neural net, so things happen naturally
– The weights propagate and this forms the dynamics of the Stimulus-Stimulus interactions
– Whatever…. (Warning: a personal opinion!)
[Diagram: S1 and S2 feed into a box labeled "Stuff happens here", which produces the Response]
Bruce’s favorite model
– Models time and rate of CS and reinforcement
– Time-scale invariant
– Non-associative framework
[Equation; labeled terms: cumulative duration of Sn, cumulative number of reinforcements in presence of Sn, rates of reinforcement, cumulative duration of the conjunction of S1 and Sn]
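The labeled terms on this slide match the rate-estimation idea in Gallistel and Gibbon (2000), which the reference list cites; that this is the model meant is an assumption. Under the further assumption that reinforcement rates add when stimuli overlap, the rate credited to each stimulus can be recovered from the cumulative durations and reinforcement counts by solving a small linear system. The sketch below uses hypothetical function names and made-up session numbers.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def estimate_rates(durations, counts):
    """Estimate the reinforcement rate credited to each stimulus.

    durations[i][j]: cumulative time stimuli i and j were present together
                     (durations[i][i] is the cumulative duration of i).
    counts[i]:       cumulative number of reinforcements in the presence of i.
    Assumes additive rates: counts[i] = sum_j durations[i][j] * rate[j].
    """
    return solve_linear(durations, counts)


# Made-up session: the background is on for 100 s, the CS for 20 s of that,
# and all 10 reinforcements arrive while the CS is on.
durations = [[100.0, 20.0],   # background, background-and-CS
             [20.0, 20.0]]    # CS-and-background, CS
counts = [10.0, 10.0]
background_rate, cs_rate = estimate_rates(durations, counts)
```

With these numbers the whole rate is credited to the CS (0.5 reinforcements per second) and none to the background. Multiplying every duration by the same factor divides every rate by that factor and leaves their ratios unchanged, which is one way to read the slide's claim of time-scale invariance.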
References
Dayan, P., and Abbott, L. F. (2000?). Theoretical Neuroscience. In print??? (http://www.gatsby.ucl.ac.uk/~dayan/book/)
Dayan, P., and Long, T. (1998?). Statistical Models of Conditioning. NIPS 10.
Gallistel, C. R., and Gibbon, J. (2000). Time, Rate and Conditioning. Psychological Review, in print.
Mignault, A., and Marley, A. A. J. (1997). A Real-Time Neuronal Model of Classical Conditioning. Adaptive Behavior 6(1): 3-61.
Pavlov, I. P. (1927). Conditioned Reflexes. Oxford: Oxford University Press.
Rescorla, R. A. (1988). Behavioral studies of Pavlovian conditioning. Annual Review of Neuroscience 11: 329-352.
Rescorla, R. A., and Solomon, R. L. (1967). Two-process learning theory: Relationships between Pavlovian conditioning and instrumental learning. Psychological Review 74: 151-182.
Rescorla, R. A., and Wagner, A. R. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In A. H. Black and W. F. Prokasy, Eds., Classical Conditioning, vol. 2, Current Research and Theory. New York: Appleton-Century-Crofts, pp. 64-99.
Roitblat, H. L., and Meyer, J.-A. Comparative Approaches to Cognitive Science. MIT Press.
Schmajuk, N. A. (1997). Animal Learning and Cognition: A Neural Network Approach.
Skinner, B. F. (1938). The Behavior of Organisms. New York: Appleton-Century-Crofts.
Sutton, R. S., and Barto, A. G. (1990). Learning and Computational Neuroscience: Foundations of Adaptive Networks. MIT Press.
Thorndike, E. L. (1911). Animal Intelligence: Experimental Studies. New York: Macmillan.
Wilson, R. A., and Keil, F. (1999). The MIT Encyclopedia of the Cognitive Sciences. MIT Press. MITECS (http://cognet.mit.edu/MITECS)