Instrumental Conditioning II
Delay of Reinforcement: Grice (1948)
[Apparatus diagram labels: Start, Choice (Correct / Incorrect), Delay, Goal (Reward or No Reward)]
Grice (1948) Results
Overcoming the effects of delay
–Secondary reinforcers
–“Marking” procedure
Lieberman, McIntosh & Thomas (1979)
Delayed reinforcement
Reinforcement and Punishment (effect on rate of behavior)
Positive contingency: Reinforcement = Chocolate Bar; Punishment = Electric Shock
Negative contingency: Reinforcement = Excused from Chores; Punishment = No TV privileges
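The two-by-two layout on this slide can be sketched as a lookup. This is only an illustration: the function name `classify` and the string labels are hypothetical, and the code assumes that reinforcement means the rate of behavior increases and punishment means it decreases.

```python
def classify(contingency, effect_on_rate):
    """Name the procedure from its contingency and its effect on behavior.

    contingency: "positive" (response produces the stimulus) or
                 "negative" (response removes or prevents the stimulus).
    effect_on_rate: "increase" or "decrease" in the rate of the behavior.
    """
    kind = {"increase": "reinforcement", "decrease": "punishment"}[effect_on_rate]
    return f"{contingency} {kind}"


# The four examples from the slide.
print(classify("positive", "increase"))  # chocolate bar
print(classify("positive", "decrease"))  # electric shock
print(classify("negative", "increase"))  # excused from chores
print(classify("negative", "decrease"))  # no TV privileges
```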
Professor Drew
Anticipatory Contrast - Crespi (1942) Rats run down maze to find food pellets in goal arm.
What is a reinforcer?
–Operational definition (behaviorists): that which increases the probability of the response that preceded it
–Thorndike: a stimulus that produces a “satisfying state of affairs”
Drive Reduction Theory
[Diagram: amount of H2O in body is compared with a set point; the result drives the animal to seek water or not seek water]
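The set-point comparison in this diagram works like a negative feedback loop. A minimal sketch, assuming arbitrary integer water units and a fixed sip size (both hypothetical, chosen only for illustration):

```python
def drive_state(water_level, set_point):
    """Compare the current water level with the set point."""
    deficit = set_point - water_level
    return "seek water" if deficit > 0 else "don't seek water"


# Hypothetical values in arbitrary units.
set_point = 1000
level = 700
sips = 0

# The drive persists until drinking removes the deficit.
while drive_state(level, set_point) == "seek water":
    level += 100  # one sip
    sips += 1

print(sips, drive_state(level, set_point))
```

On this view, water is reinforcing only while a deficit exists; once the set point is reached, the drive and the seeking behavior stop.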
Drive Reduction Considered: Are reinforcers necessary for survival?
–Eating to excess
–Drugs of abuse
–“Pleasure centers” of the brain
Behavioral Regulation View: The Premack Principle
–Behaviors are reinforcing, not stimuli
–To predict what will be reinforcing, observe the baseline frequency of different behaviors
–Highly probable behaviors will reinforce less probable behaviors
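The prediction rule can be sketched as a comparison of baseline probabilities. The behaviors and numbers below are hypothetical, and `can_reinforce` is an illustrative name rather than anything from the literature.

```python
def can_reinforce(baseline, reinforcer, target):
    """Premack principle: a behavior reinforces another only if it is
    more probable at baseline than the behavior to be strengthened."""
    return baseline[reinforcer] > baseline[target]


# Hypothetical baseline proportions of free time spent on each behavior.
baseline = {"eating candy": 0.6, "playing pinball": 0.2}

print(can_reinforce(baseline, "eating candy", "playing pinball"))  # True
print(can_reinforce(baseline, "playing pinball", "eating candy"))  # False
```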
Premack Revised: The Response Deprivation Hypothesis (Timberlake & Allison, 1974)
–Low frequency behaviors can reinforce high frequency behaviors (and vice versa)
–All behaviors have a preferred frequency = the behavioral bliss point
–Deprivation below that frequency is aversive, and organisms will work to remedy this
Response deprivation hypothesis: the ice cream scale (in pints)
–Bliss point: 1.0 pints/night
–Above the bliss point: will work to avoid ice cream
–Below the bliss point: will work to obtain ice cream
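The deprivation condition is commonly written as I/C > Oi/Oc: the schedule demands more instrumental responding (I) per unit of contingent responding (C) than the animal showed at baseline (Oi, Oc). A minimal sketch of that inequality follows; the homework example and all numbers except the 1.0 pints/night bliss point are hypothetical.

```python
def is_response_deprived(i_required, c_allowed, i_baseline, c_baseline):
    """Return True if the schedule holds the contingent behavior below baseline.

    Tests I/C > Oi/Oc in cross-multiplied form (I * Oc > C * Oi),
    which avoids dividing by zero. All values are assumed non-negative.
    """
    return i_required * c_baseline > c_allowed * i_baseline


# Baseline: 0.5 h homework and 1.0 pint of ice cream per night.
# Schedule A: 2 h homework per pint -> ice cream is restricted.
print(is_response_deprived(2.0, 1.0, 0.5, 1.0))   # True
# Schedule B: 0.25 h homework per pint -> no restriction.
print(is_response_deprived(0.25, 1.0, 0.5, 1.0))  # False
```

Under Schedule A the hypothesis predicts that ice cream will reinforce homework; under Schedule B it should not.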
The behavioral bliss point and motivation
Contiguity versus Contingency in operant conditioning
Degraded Contingency Effect
[Figure legend: one symbol = bar press, the other = food]
–Perfect contingency: strong responding
–Degraded contingency: weak responding
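One common way to quantify contingency, not named on the slide and offered here only as an illustration, is the difference between the probability of food after a press and the probability of food without one. The probabilities below are hypothetical.

```python
def delta_p(p_food_given_press, p_food_given_no_press):
    """Contingency index: P(food | press) - P(food | no press)."""
    return p_food_given_press - p_food_given_no_press


# Perfect contingency: food follows every press and never comes for free.
print(delta_p(1.0, 0.0))  # 1.0
# Degraded contingency: free food is as likely as earned food.
print(delta_p(0.5, 0.5))  # 0.0
```

Adding free food lowers the index even though each press is still followed by food just as closely in time, which is why the effect counts against a pure contiguity account.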
G.V. Thomas (1983): Contiguity pitted against contingency
–“Free” reinforcers given every 20 s
–Lever press advances delivery of the pellet, but cancels the pellet for the next 20-s interval
–So if you press at second 2, you get a pellet immediately, but you get no pellet during seconds 3-20, and the pellet for the next interval is lost
[Timeline figure: marks at 20 s, 40 s, 60 s; annotations “Lever press here” and “Lose this pellet”]
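The cost of pressing on this schedule can be made concrete with a small count. This is a minimal sketch under simplifying assumptions that are not on the slide: time is treated as whole 20-s intervals, at most one press counts per interval, and a press during an already-cancelled interval has no effect. The function name `pellets_delivered` is hypothetical.

```python
def pellets_delivered(n_intervals, press_intervals):
    """Count pellets over n_intervals consecutive 20-s intervals.

    press_intervals: indices of intervals in which the animal presses.
    A press delivers that interval's pellet immediately but cancels
    the pellet scheduled for the following interval.
    """
    press_intervals = set(press_intervals)
    pellets = 0
    cancelled = False
    for i in range(n_intervals):
        if cancelled:
            cancelled = False  # this interval's pellet was forfeited
            continue
        pellets += 1  # delivered early (press) or at interval end (free)
        if i in press_intervals:
            cancelled = True
    return pellets


print(pellets_delivered(6, []))        # never pressing: 6
print(pellets_delivered(6, [0]))       # one press: 5
print(pellets_delivered(6, range(6)))  # pressing whenever possible: 3
```

Pressing buys immediacy (contiguity) at the price of a lower overall pellet rate (negative contingency), which is what lets the procedure separate the two factors.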
“Superstitious Behavior” (Skinner, 1948)
–Suggested that temporal contiguity is more important than contingency
–15-s FT schedule, no response requirement
–“Adventitious reinforcement”
“In 6 out of 8 cases the resulting responses were so clearly defined that two observers could agree perfectly in counting instances. One bird was conditioned to turn counter-clockwise about the cage, making 2 or 3 turns between reinforcements. Another repeatedly thrust its head into one of the upper corners of the cage….”
[Figure labels: orienting toward feeder; pecking near feeder; moving along wall; ¼ turn]
“Misbehavior” and the limits of operant conditioning
Limits of Operant Conditioning
Some behaviors can’t be conditioned
–Yawning
–Scratching
Belongingness
–Presentation of a female won’t reinforce biting
“Misbehavior”
Marian Breland Bailey – How to train a chicken
The famous dancing chicken
What is learned in operant conditioning?
What is learned? S-R. Edwin Guthrie: mere contiguity of a stimulus and a behavior stamps in that S-R connection; reinforcement is not necessary
What is learned? S-R. Thorndike: reinforcement “stamps in” this connection
What is learned? S-R-O: ?
2-Process Theory [diagram: S, R, O; links labeled operant and Pavlovian]
2-Process Theory [diagram: S, R, CR; links labeled operant and Pavlovian]
Evidence for 2-process theory: Pavlovian-Instrumental Transfer
Phase 1: Lever → Food
Phase 2: Light → Food
Test: Light: # presses? No Light: # presses?
[Figure: # presses, Light vs. No CS]
The presence of the CS intensifies operant responding
What is learned? S-R-O: Does the Pavlovian S-O association activate a vague emotional state or a specific mental representation of the outcome?
Specific Outcome Representations: Trapold
Phase 1 (operant): R Lever → Pellet; L Lever → Sucrose
Phase 2 (classical): Tone → Pellet; Light → Sucrose
Test: Tone: Left? Right? Light: Left? Right?
[Figure: # presses on Left and Right levers, Light vs. Noise]
Colwill & Rescorla (1986): R-O
Phase 1: Push Left → Pellet; Push Right → Sucrose
Devaluation: Pellet + LiCl, or Sucrose + LiCl
Test: Pellet devalued: Right? Sucrose devalued: Left?
[Figure: # pushes, Right vs. Left, for Pellet Devalued and Sucrose Devalued groups]
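The logic of the devaluation test can be sketched as a prediction from a response-outcome (R-O) account. This is an illustration only: `predicted_choice` is a hypothetical name, and the code assumes the animal simply avoids the response whose outcome was paired with LiCl.

```python
def predicted_choice(response_outcome, devalued_outcome):
    """R-O account: responses whose outcome is still valued are preferred."""
    return [response for response, outcome in response_outcome.items()
            if outcome != devalued_outcome]


# Phase 1 training from the slide.
response_outcome = {"left": "pellet", "right": "sucrose"}

print(predicted_choice(response_outcome, "pellet"))   # ['right']
print(predicted_choice(response_outcome, "sucrose"))  # ['left']
```

A pure S-R account stores no outcome with the response, so it would predict no shift in preference after devaluation. A selective shift of the kind the test asks about would therefore support R-O learning.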