Instrumental Learning & Operant Reinforcement

Presentation on theme: "Instrumental Learning & Operant Reinforcement"— Presentation transcript:

1 Instrumental Learning & Operant Reinforcement
Chapter 5

2 Operant Learning
Stimulus --> Response --> Outcome

3 Classical vs. Operant
Classical: Requires reflex action; Neutral stimulus associated with US; Outside of subject’s control
Operant: Strengthening/weakening of “voluntary” action; Subject responds or doesn’t
Can operate together

4 What’s in a Name? Operant learning: subject operates on environment
Instrumental conditioning: subject is instrumental in obtaining outcome

5 Trial and Error Learning
E.L. Thorndike Animal intelligence Maze studies

6 Puzzle Box Cats Cage with mechanism to open door Escape latency
Discrete trial procedure

7 Law of Effect Any behaviour followed by an appetitive stimulus will increase in frequency

8 Terms
Operant (response): any behaviour that operates on the environment to produce an effect
Reinforcer: any event that increases the frequency of a behaviour
Punisher: any event that decreases the frequency of a behaviour

9 Operant Learning B.F. Skinner Operant chamber Free operant procedure

10 Discrete Trial & Free Operant
Discrete Trial: One trial at a time; “Apparatus” must be re-set; Measure some behaviour; e.g., mazes
Free Operant: Operant can occur at any time; Operant can occur repeatedly; Measure response rate; e.g., operant chamber

11 Four Contingencies Positive reinforcement Negative reinforcement
Positive punishment Negative punishment

12 Positive and Negative Positive: presents some stimulus
Negative: removes some stimulus

13 Reinforcers and Punishers
Reinforcer: increases a behaviour Punisher: decreases a behaviour

14 Contingencies
Response causes stimulus to be presented, response rate increases: Positive Reinforcement (Lever press --> Food)
Response causes stimulus to be presented, response rate decreases: Positive Punishment (Lever press --> Shock)
Response causes stimulus to be removed, response rate increases: Negative Reinforcement (Lever press --> Shock off)
Response causes stimulus to be removed, response rate decreases: Negative Punishment (Lever press --> Food removed)
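The two-by-two layout above can be sketched as a small lookup. This is only an illustration: the function name and the string labels ("presented", "removed", "increases", "decreases") are hypothetical choices, not terms of art beyond what the slide uses.

```python
def classify_contingency(stimulus_change: str, rate_change: str) -> str:
    """Name the contingency from two observations.

    stimulus_change: "presented" or "removed" (what the response does to the stimulus)
    rate_change: "increases" or "decreases" (what happens to the response rate)
    """
    # Positive/negative describes the stimulus; it says nothing about "good" or "bad".
    sign = {"presented": "Positive", "removed": "Negative"}[stimulus_change]
    # Reinforcement/punishment is defined by the effect on behaviour.
    kind = {"increases": "Reinforcement", "decreases": "Punishment"}[rate_change]
    return f"{sign} {kind}"


# The four lever-press examples from the slide.
examples = {
    "Lever press --> Food": classify_contingency("presented", "increases"),
    "Lever press --> Shock": classify_contingency("presented", "decreases"),
    "Lever press --> Shock off": classify_contingency("removed", "increases"),
    "Lever press --> Food removed": classify_contingency("removed", "decreases"),
}
```

Note that the classification depends on the observed change in response rate, not on whether the stimulus seems pleasant.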

15 Types of Reinforcers
Primary: Not dependent on an association with other reinforcers
Secondary: Initially neutral stimulus; Paired with primary reinforcer; “Conditioned Reinforcer”

16 Secondary Reinforcers
“Bridging”, “clicker”
Secondary reinforcers extinguish without periodic pairings with a primary reinforcer
Generally weaker than primary
Generalized reinforcer: Paired with many other kinds of reinforcers; e.g., money

17 Strength of Operant Learning
Can condition practically any behaviour Shaping (successive approximations)

18 Shaping a Lever Press Gradual process
Reinforce more appropriate/precise responses Feedback

19 Response Chains Sequences of behaviours in specific order
Objective: primary reinforcer Conditioned reinforcers Discriminative stimuli

20 Forward Chaining Start with first response in sequence, then work through to last response in additive steps

21 Backwards Chaining Often used with “complex” training
Start with last response in chain Next, second last response Third last, etc.
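The difference between the two chaining orders can be sketched as the sequence of training steps each one produces. The function names and the three-response chain are hypothetical, chosen only for illustration.

```python
def forward_chaining(chain):
    """Training steps that start with the first response and add one at a time."""
    return [chain[:i] for i in range(1, len(chain) + 1)]


def backward_chaining(chain):
    """Training steps that start with the last response and prepend earlier ones."""
    return [chain[-i:] for i in range(1, len(chain) + 1)]


# Hypothetical chain; the last response is the one followed by the primary reinforcer.
chain = ["climb ladder", "cross bridge", "press lever"]

forward_steps = forward_chaining(chain)
# [['climb ladder'], ['climb ladder', 'cross bridge'],
#  ['climb ladder', 'cross bridge', 'press lever']]

backward_steps = backward_chaining(chain)
# [['press lever'], ['cross bridge', 'press lever'],
#  ['climb ladder', 'cross bridge', 'press lever']]
```

In the backward order every training step ends with the final response, so each step finishes at the point where the primary reinforcer is delivered.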

22 Factors in Operant Learning

23 Contingency
Correlation between behaviour & outcome
Strong contingency --> better learning
Random contingency --> no learning
Applies to both reinforcement and punishment
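The slide only says "correlation". One common way to put a number on contingency, assumed here rather than taken from the slides, is the difference between the probability of the outcome given a response and the probability of the outcome given no response. The function name and the trial data are hypothetical.

```python
def delta_p(trials):
    """Contingency as P(outcome | response) - P(outcome | no response).

    trials: list of (responded, outcome) boolean pairs.
    """
    with_response = [outcome for responded, outcome in trials if responded]
    without_response = [outcome for responded, outcome in trials if not responded]
    if not with_response or not without_response:
        raise ValueError("need trials both with and without a response")
    p_given_response = sum(with_response) / len(with_response)
    p_given_no_response = sum(without_response) / len(without_response)
    return p_given_response - p_given_no_response


# Strong contingency: the outcome follows every response and never occurs otherwise.
strong = [(True, True)] * 8 + [(False, False)] * 8

# Random contingency: the outcome is equally likely with or without a response.
random_like = ([(True, True)] * 4 + [(True, False)] * 4
               + [(False, True)] * 4 + [(False, False)] * 4)

strong_value = delta_p(strong)       # 1.0
random_value = delta_p(random_like)  # 0.0
```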

24 Contiguity
Time between behaviour & outcome
Shorter = better learning
Delays allow other behaviours to occur, forgetting, and extinction (behaviour w/o reinforcement)
Learning is possible with a delay if a stimulus “placeholder” is provided (conditioned reinforcer?)
More important for punishment

25 Reinforcer Characteristics
Larger reinforcers --> stronger learning
Not a linear effect
Qualitative differences in reinforcers and punishers
Species & individual differences
Intensity of punisher

26 Task Characteristics Some tasks easier to learn than others
Species & individual differences Innate and/or prior conditioning

27 Deprivation Levels
Generally, the greater the deprivation, the more effective the reinforcer
Reinforcers can satiate
Deprivation can provide motivation to engage in punishable behaviours

28 Extinction
Response no longer produces same outcome
Extinction burst (with reinforcement)
Variability of behaviour
Aggression and frustration
Spontaneous recovery
Resurgence

29 Theories of Reinforcement

30 Hull’s Drive Reduction Theory
Animals have motivational states (drives)
Necessary for survival
Reinforcers are things that reduce drives
Physiological value
Reduce physiological state

31 Drive Reduction Reinforcers
Works well with primary reinforcers
Many secondary reinforcers have no physiological value
Hull: association links secondary to drive
Some reinforcers hard to classify as primary or secondary
Some increase a physiological state
Some necessities undetectable
Examples: Roller coasters; Vitamins; Saccharin

32 Relative Value Theory & Premack Principle
Treat reinforcers as behaviours
Is it the food, or the behaviour of eating that is the reinforcer?
Behavioural probability scale
Greater or lesser value of behaviours relative to one another
No distinction between primary and secondary

33 Premack Principle
One behaviour will reinforce a second behaviour
High probability behaviour reinforces low probability behaviour
Baseline probability scale: Time; Rank order
Reinforcement relativity: No absolutes
Probability of response = Time spent on response / Total time

34 Example
Behaviours: Eat ice cream (I), play video game (V), read book (B)
Baseline (30 minutes)
Student 1: I (2 min), V (8 min), B (20 min); Scale: I -- V -- B
Student 2: I (8 min), V (20 min), B (2 min); Scale: B -- I -- V
Student 1: V reinforces I, B reinforces V & I
Student 2: I reinforces B, V reinforces I & B
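The example above can be worked through in a few lines, using probability of response = time spent on response / total time from the previous slide. This is a minimal sketch; the function names are hypothetical, and the data are the baseline minutes given for the two students.

```python
def response_probabilities(minutes, total_minutes):
    """Probability of each behaviour = time spent on it / total baseline time."""
    return {behaviour: t / total_minutes for behaviour, t in minutes.items()}


def premack_scale(minutes):
    """Behaviours ranked from lowest to highest baseline probability."""
    return sorted(minutes, key=minutes.get)


def reinforces(minutes):
    """Map each behaviour to the lower-probability behaviours it can reinforce."""
    scale = premack_scale(minutes)
    return {behaviour: scale[:rank] for rank, behaviour in enumerate(scale)}


BASELINE_MINUTES = 30
student_1 = {"I": 2, "V": 8, "B": 20}
student_2 = {"I": 8, "V": 20, "B": 2}

probs_1 = response_probabilities(student_1, BASELINE_MINUTES)
scale_1 = premack_scale(student_1)   # ['I', 'V', 'B']
scale_2 = premack_scale(student_2)   # ['B', 'I', 'V']
can_reinforce_1 = reinforces(student_1)  # {'I': [], 'V': ['I'], 'B': ['I', 'V']}
can_reinforce_2 = reinforces(student_2)  # {'B': [], 'I': ['B'], 'V': ['B', 'I']}
```

The same three behaviours end up in a different order for each student, which is the point of "reinforcement relativity": reading reinforces the other two for Student 1 but reinforces nothing for Student 2.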

35 Problems
Baseline phase: Fair rating?; How to compare very different behaviours
Time problems: What if time not important to behaviour?; Behaviour duration?; Length of baseline period?

36 Response Deprivation Theory
Deprived behaviours = reinforcing behaviours
Drop below baseline level of performance
Not relative frequency of one behaviour compared to another (i.e., Premack)
Level of deprivation for a behaviour
Praise? “Yes”?

37 Escape and Avoidance

38 Definitions
Escape: Get away from aversive stimulus that is in progress
Avoidance: Get away from aversive stimulus before it begins

39 Shuttle Box Solomon & Wynne (1953) Dogs Chamber with barrier; Shock
Light off as signal

40 Two-Process Theory
Classical and operant conditioning
Classical: Shock = US; Fear/pain/jump/twitch/squeal = UR; Darkness = CS; Fear of dark = CR
Fear: heart rate, breathing, stomach cramps, etc.
Operant: Negative reinforcement; Removal of fear (CR); Escape from CS, not avoidance of shock

41 Support for Two-Process Theory
Rescorla & LoLordo (1965)
Dog in shuttlebox
No signal
Response gives “safe time”
Pair tone with shock
Tone increases rate of response
CS can amplify avoidance
Conditioned inhibition can reduce avoidance

42 Problems with Two-Process Theory
Avoidance without observable fear
Heart rate: Not consistent
Fear diminishes with avoidance learning

43 Measuring Fear
Kamin, Brimer, and Black (1963)
Lever press --> food
Auditory CS --> avoidance in shuttle box until: 1, 3, 9, 27 avoidances in a row
CS in Skinner box; check for suppression of lever press
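The slide does not say how suppression of lever pressing was scored. One standard index in conditioned-suppression work, assumed here for illustration only, is the suppression ratio: responses during the CS divided by responses during the CS plus responses in an equal period just before it. The function name and the counts below are hypothetical.

```python
def suppression_ratio(cs_responses, pre_cs_responses):
    """Suppression ratio = CS responses / (CS responses + pre-CS responses).

    0.0 means lever pressing stopped completely during the CS (strong fear);
    0.5 means the CS had no effect on lever pressing (no measurable fear).
    """
    total = cs_responses + pre_cs_responses
    if total == 0:
        raise ValueError("no responses recorded in either period")
    return cs_responses / total


# Hypothetical counts, not data from the study.
complete_suppression = suppression_ratio(cs_responses=0, pre_cs_responses=20)  # 0.0
no_suppression = suppression_ratio(cs_responses=20, pre_cs_responses=20)       # 0.5
```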

44 Results
Fear decreases during extended avoidance training
But, avoidance still strong
Even low fear is enough?
[Figure: Responding plotted against number of avoidance responses (1, 3, 9, 27)]

45 Extinction in Avoidance Behaviour
Odd prediction from two-process theory
“Yo-yo” effect: Avoidance should toggle
But! Avoidance is extremely persistent
[Figure: successful avoidance trials and # of US received]

46 One-Process Theory Classical conditioning component unnecessary
Avoidance, not fear reduction, is reinforcer “Safety”

47 Sidman Avoidance Task
Free-operant avoidance
Can avoidance be learned if no warning CS?
Shock at random intervals
Response gives safe time
Extensive training --> learn avoidance
But, usually never perfect
High variability across subjects
Two-process theory suggests: Time becomes a CS (time elicits fear)
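The "response gives safe time" rule can be sketched as a filter over scheduled shocks. This is a simplified illustration, not the exact procedure: the function name, the fixed safe-time window, and the example times are all assumptions.

```python
def delivered_shocks(scheduled_shocks, response_times, safe_time):
    """Return the scheduled shocks that are actually delivered.

    A shock scheduled at time t is cancelled if some response at time r
    satisfies r <= t < r + safe_time (the shock falls inside a safe period).
    There is no warning signal; only the subject's own responses matter.
    """
    return [
        t for t in scheduled_shocks
        if not any(r <= t < r + safe_time for r in response_times)
    ]


# Hypothetical session: times in seconds, each response buys 5 s of safe time.
scheduled = [3, 10, 14, 25]
responses = [9, 13]

received = delivered_shocks(scheduled, responses, safe_time=5)  # [3, 25]
no_responding = delivered_shocks(scheduled, [], safe_time=5)    # [3, 10, 14, 25]
```

Because the shocks are unsignalled, the only way to avoid all of them in this sketch is to respond at least once every `safe_time` seconds, which fits the observation that avoidance is usually never perfect.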

48 Herrnstein & Hineline (1966)
Rapid and slow shock rate schedules
Lever press switches schedules
Shocks presented randomly, no signal
Responses give shock reduction
Reduction in shock is reinforcer

49 Learned Helplessness
Behaviour has no effect on situation
Generalizes
Laboratory: Give inescapable shocks; Shuttle box; Will not switch sides
Expectation that behaviour has no effect

50 Learned Helplessness in Humans
Depression
Situations beyond your control
Three dimensions
Situation: specific or global
Attribute: internal or external
Time: short-term or long-term

51 Maier & Seligman (1976) Motivational impairment Cognitive impairment
Emotional impairment

52 Therapeutic Application
Confidence building (“can not fail”)
Implementation issues
Tasks that can be successfully completed
Produces immunization
Escapable condition … inescapable condition
Learned helplessness less likely to develop

