Foundations
Edward Thorndike (1874-1949)
–Puzzle Box
–Cats became more efficient with each trial
–Law of effect: rewarded behavior is likely to recur
B.F. Skinner (1904-1990)
–Started with Thorndike’s Law of Effect
–Operant Chamber (Skinner Box)
–Used food as a reinforcer for a variety of behaviors
–Resulted in animal learning
Operant Conditioning
Association of behaviors and their consequences
Behavior is strengthened (repeated) if followed by reinforcement or diminished if followed by punishment
Operant because the behavior operates on the environment to produce a consequence.
Shaping
Procedure in which rewards such as food gradually guide an animal’s behavior toward a desired behavior
Reinforcement given for successive approximations of desired behavior
–Baby steps toward desired behavior
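The idea of reinforcing successive approximations can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: responses are scored as numbers on a hypothetical scale, `target` stands for the desired behavior, and the names `shape`, `start_tolerance`, and `shrink` are invented for this sketch. It models the trainer's tightening criterion only, not how the animal learns.

```python
from typing import Iterable, List, Tuple


def shape(
    responses: Iterable[float],
    target: float,
    start_tolerance: float,
    shrink: float = 0.5,
) -> Tuple[List[bool], float]:
    """Reinforce responses that are close enough to the target.

    Each reinforced response tightens the criterion (assumed here to
    shrink by a constant factor), so the next reward requires a closer
    approximation of the desired behavior.
    """
    tolerance = start_tolerance
    reinforced = []
    for response in responses:
        if abs(response - target) <= tolerance:
            reinforced.append(True)
            tolerance *= shrink  # demand a "baby step" closer next time
        else:
            reinforced.append(False)
    return reinforced, tolerance


flags, final_tolerance = shape([1, 3, 3, 7, 9, 9, 10], target=10, start_tolerance=8)
print(flags)            # [False, True, False, True, True, True, True]
print(final_tolerance)  # 0.25
```

Note that the second response of 3 goes unrewarded: once an approximation has been reinforced, repeating it is no longer enough.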
Principles of Reinforcement
Reinforcer: any event that strengthens the behavior that it follows.
–Positive (pleasant stimulus given: food)
–Negative (aversive stimulus taken away: turning off alarm)
–Primary (innate: satisfies biological need)
–Secondary (learned: money, good grades, pleasant tone of voice, all linked with primary reinforcers)
Reinforcement Schedules
Continuous
–Reinforcing the desired response (behavior) whenever it occurs
–Learning occurs rapidly
–Extinction also rapid once reinforcement stops
Reinforcement Schedules
Partial
–Reinforcing response only part of the time
–Slower acquisition of learning
–Greater resistance to extinction
Fixed Ratio
Reinforce behavior after a set number of responses
–High rate of responding
Example: Paid $10 for every 100 envelopes stuffed
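A fixed-ratio schedule is essentially a counter, which a short Python sketch can make concrete. The class name `FixedRatioSchedule` and the helper `earnings` are hypothetical names chosen for illustration; the default ratio and pay come from the envelope example.

```python
class FixedRatioSchedule:
    """Deliver a reinforcer after every `ratio` responses."""

    def __init__(self, ratio: int) -> None:
        if ratio < 1:
            raise ValueError("ratio must be at least 1")
        self.ratio = ratio
        self.count = 0  # responses since the last reinforcer

    def respond(self) -> bool:
        """Record one response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count == self.ratio:
            self.count = 0
            return True
        return False


def earnings(envelopes: int, ratio: int = 100, pay: float = 10.0) -> float:
    """Total pay for stuffing `envelopes` envelopes under the schedule."""
    schedule = FixedRatioSchedule(ratio)
    payouts = sum(schedule.respond() for _ in range(envelopes))
    return payouts * pay


print(earnings(250))  # 20.0 (two full batches of 100; the last 50 earn nothing yet)
```

With `ratio=1` the same class describes continuous reinforcement, where every response is reinforced.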
Variable Ratio
Reinforcement occurs after an unpredictable number of responses
–High rates of responding
–Resists extinction
Examples: gambling, giving in to a child’s whining in the grocery store every so often, sales commissions
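A variable-ratio schedule can be sketched by redrawing the required number of responses after each reinforcer. The sketch below is an illustration, not a standard implementation: the class name `VariableRatioSchedule` is hypothetical, and drawing the requirement uniformly from 1 to `2 * mean_ratio - 1` is an assumption made so that the average requirement equals `mean_ratio`.

```python
import random


class VariableRatioSchedule:
    """Deliver a reinforcer after an unpredictable number of responses."""

    def __init__(self, mean_ratio: int, rng: random.Random) -> None:
        if mean_ratio < 1:
            raise ValueError("mean_ratio must be at least 1")
        self.mean_ratio = mean_ratio
        self.rng = rng
        self.count = 0  # responses since the last reinforcer
        self.required = self._draw()

    def _draw(self) -> int:
        # Assumed distribution: uniform on 1 .. 2 * mean_ratio - 1,
        # which averages out to mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self) -> bool:
        """Record one response; return True if it earns a reinforcer."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._draw()  # next requirement is unpredictable
            return True
        return False


schedule = VariableRatioSchedule(mean_ratio=5, rng=random.Random(0))
outcomes = [schedule.respond() for _ in range(20)]
print(outcomes)  # reinforced responses are irregularly spaced
```

Because the responder cannot tell which response will pay off, no single unrewarded response signals that reinforcement has stopped, which fits the slide's point about resistance to extinction.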
Fixed Interval
Reinforce the first response after a fixed time period has elapsed.
–Rapid rate of responding as anticipated time of reward approaches
–Choppy pattern of responding
Examples: checking the cookies as the baking time nears its end, checking for the mail as the delivery time approaches
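A fixed-interval schedule depends on elapsed time rather than on a response count. The following Python sketch passes timestamps in explicitly so the behavior is easy to trace. The class name `FixedIntervalSchedule` is hypothetical, and restarting the interval from the moment of the reinforced response is an assumption of this sketch.

```python
class FixedIntervalSchedule:
    """Reinforce the first response made after `interval` time units."""

    def __init__(self, interval: float) -> None:
        if interval <= 0:
            raise ValueError("interval must be positive")
        self.interval = interval
        self.available_at = interval  # earliest time a response can be reinforced

    def respond(self, now: float) -> bool:
        """Record a response at time `now`; return True if it is reinforced."""
        if now >= self.available_at:
            # Assumption: the next interval starts at the reinforced response.
            self.available_at = now + self.interval
            return True
        return False


# Checking cookies with a 12-minute baking time (times in minutes).
oven = FixedIntervalSchedule(interval=12)
checks = [5, 10, 11, 12.5, 13]
print([oven.respond(t) for t in checks])  # [False, False, False, True, False]
```

Responses made before the interval has elapsed earn nothing, so only the timing of the first response after it matters.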
Variable Interval
Reinforce the first response after varying time intervals
–Slow, steady responding
Example: Boss walks around on a varying schedule to check on employees’ progress.
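A variable-interval schedule can be sketched the same way as a fixed-interval one, except that the waiting time is redrawn after each reinforcer. The class name `VariableIntervalSchedule` is hypothetical, and drawing each interval uniformly between 0.5 and 1.5 times the mean is an assumption of this sketch, not a standard definition.

```python
import random


class VariableIntervalSchedule:
    """Reinforce the first response after an unpredictable time interval."""

    def __init__(self, mean_interval: float, rng: random.Random) -> None:
        if mean_interval <= 0:
            raise ValueError("mean_interval must be positive")
        self.mean_interval = mean_interval
        self.rng = rng
        self.available_at = self._draw()  # first interval is measured from time 0

    def _draw(self) -> float:
        # Assumed distribution: uniform between 0.5x and 1.5x the mean interval.
        return self.rng.uniform(0.5 * self.mean_interval, 1.5 * self.mean_interval)

    def respond(self, now: float) -> bool:
        """Record a response at time `now`; return True if it is reinforced."""
        if now >= self.available_at:
            self.available_at = now + self._draw()  # next wait is unpredictable
            return True
        return False


# An employee glances up once per minute; the boss appears about every 10 minutes.
schedule = VariableIntervalSchedule(mean_interval=10, rng=random.Random(0))
reinforced_times = [t for t in range(1, 61) if schedule.respond(t)]
print(reinforced_times)  # irregularly spaced times within the hour
```

Since responding faster does not bring the reinforcer sooner and its timing cannot be predicted, a slow, steady rate of responding is enough to collect nearly every reinforcer.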
Problems with Punishment
Human studies re: physical…
–Temporary suppression negatively reinforces parental punishing behavior
–May learn discrimination (do it when you won’t get caught)
–Increased aggressiveness
–Develop fear
–Doesn’t guide toward the desired behavior
What to do?
Reinforcement of desired behaviors works best
Reframe contingencies from threats to positive incentives
Cognition
Cognitive Map – mental representation of one’s environment
Latent Learning – learning that occurs without reinforcement or punishment and is not apparent until there is an incentive to demonstrate it