Instrumental Learning & Operant Reinforcement. Operant Learning Stimulus Response Outcome.


1 Instrumental Learning & Operant Reinforcement

2 Operant Learning Stimulus Response Outcome

3 Classical vs. Operant Classical: Requires reflex action; Neutral stimulus associated with US; Outside of subject’s control. Operant: Strengthening/weakening of “voluntary” action; Subject responds or doesn’t. Can operate together

4 What’s in a Name? Operant learning: subject operates on environment Instrumental conditioning: subject is instrumental in obtaining outcome

5 Trial and Error Learning E.L. Thorndike Animal intelligence Maze studies

6 Puzzle Box Cats Cage with mechanism to open door Escape latency Discrete trial procedure

7 Law of Effect Any behaviour followed by an appetitive stimulus will increase in frequency

8 Terms Operant (response): any behaviour that operates on the environment to produce an effect Reinforcer: any event that increases the frequency of a behaviour Punisher: any event that decreases the frequency of a behaviour

9 Operant Learning B.F. Skinner Operant chamber Free operant procedure

10 Discrete Trial & Free Operant Discrete: One trial at a time; “Apparatus” must be reset; Measure some behaviour; e.g., mazes. Free Operant: Can occur at any time; Operant can occur repeatedly; Response rate; e.g., operant chamber

11 Four Contingencies Positive reinforcement Negative reinforcement Positive punishment Negative punishment

12 Positive and Negative Positive: presents some stimulus Negative: removes some stimulus

13 Reinforcers and Punishers Reinforcer: increases a behaviour Punisher: decreases a behaviour

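14 Contingencies
Response causes stimulus to be presented, response rate increases: Positive Reinforcement (Lever press --> Food)
Response causes stimulus to be removed, response rate increases: Negative Reinforcement (Lever press --> Shock off)
Response causes stimulus to be presented, response rate decreases: Positive Punishment (Lever press --> Shock)
Response causes stimulus to be removed, response rate decreases: Negative Punishment (Lever press --> Food removed)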
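
The four contingencies follow from two yes/no questions: is the stimulus presented or removed, and does the response rate go up or down? The sketch below encodes that lookup. The function name and the string labels are illustrative choices, not part of the original slides.

```python
def classify_contingency(stimulus_change: str, rate_change: str) -> str:
    """Name the contingency from two facts about a response.

    stimulus_change: "presented" or "removed" (what the response does to the stimulus)
    rate_change:     "increases" or "decreases" (what happens to response rate)
    """
    # Positive/negative describes the stimulus, not whether the outcome is pleasant.
    sign = {"presented": "Positive", "removed": "Negative"}[stimulus_change]
    # Reinforcement/punishment describes the effect on behaviour.
    kind = {"increases": "Reinforcement", "decreases": "Punishment"}[rate_change]
    return f"{sign} {kind}"


# The four lever-press examples from the slide.
examples = [
    ("Lever press --> Food", "presented", "increases"),
    ("Lever press --> Shock off", "removed", "increases"),
    ("Lever press --> Shock", "presented", "decreases"),
    ("Lever press --> Food removed", "removed", "decreases"),
]

for label, stimulus_change, rate_change in examples:
    print(f"{label}: {classify_contingency(stimulus_change, rate_change)}")
```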

15 Types of Reinforcers Primary Not dependent on an association with other reinforcers Secondary Initially neutral stimulus Paired with primary reinforcer “Conditioned Reinforcer”

16 Secondary Reinforcers “Bridging”, “clicker” Secondary extinction without periodic pairings with primary Generally weaker than primary Generalized reinforcer Paired with many other kinds of reinforcers e.g., money

17 Strength of Operant Learning Can condition practically any behaviour Shaping (successive approximations)

18 Shaping a Lever Press Gradual process Reinforce more appropriate/precise responses Feedback

19 Response Chains Sequences of behaviours in specific order Objective: primary reinforcer Conditioned reinforcers Discriminative stimuli

20 Forward Chaining Start with first response in sequence, then work through to last response in additive steps

21 Backwards Chaining Often used with “complex” training Start with last response in chain Next, second last response Third last, etc.

23 Contingency Correlation between behaviour & outcome Strong contingency --> better learning Random contingency --> no learning Both reinforcement and punishment
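
One common way to put a number on the behaviour-outcome contingency is the difference between the probability of the outcome given a response and the probability of the outcome given no response. The slide does not name a specific measure, so this choice, the function name, and the trial counts below are assumptions made for illustration.

```python
def delta_p(outcome_with_response: int, no_outcome_with_response: int,
            outcome_without_response: int, no_outcome_without_response: int) -> float:
    """Return P(outcome | response) - P(outcome | no response) from trial counts.

    A value near +1 or -1 indicates a strong contingency;
    a value near 0 indicates a random (absent) contingency.
    """
    response_trials = outcome_with_response + no_outcome_with_response
    no_response_trials = outcome_without_response + no_outcome_without_response
    if response_trials == 0 or no_response_trials == 0:
        raise ValueError("need at least one trial with and one without a response")
    p_outcome_given_response = outcome_with_response / response_trials
    p_outcome_given_no_response = outcome_without_response / no_response_trials
    return p_outcome_given_response - p_outcome_given_no_response


# Hypothetical counts: food follows most lever presses and rarely appears otherwise.
strong = delta_p(18, 2, 2, 18)
# Hypothetical counts: food is equally likely with or without a lever press.
random_schedule = delta_p(10, 10, 10, 10)
print(round(strong, 2), round(random_schedule, 2))
```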

24 Contiguity Time between behaviour & outcome Shorter = better learning Delays let other behaviours occur, forgetting, extinction (behaviour w/o reinforcement) Learning with delay if stimulus “placeholder” provided (conditioned reinforcer?) More important for punishment

25 Reinforcer Characteristics Larger reinforcers --> stronger learning Not a linear effect Qualitative differences in reinforcers and punishers Species & individual differences Intensity of punisher

26 Task Characteristics Some tasks easier to learn than others Species & individual differences Innate and/or prior conditioning

27 Deprivation Levels Generally, the greater the deprivation, the more effective the reinforcer Reinforcers can satiate Deprivation can provide motivation to engage in punishable behaviours

28 Extinction Behaviour does not lead to same outcome Response no longer produces same outcome Extinction burst (with reinforcement) Variability of behaviour Aggression and frustration Spontaneous recovery Resurgence

30 Hull’s Drive Reduction Theory Animals have motivational states (drives) Necessary for survival Reinforcers are things that reduce drives Physiological value Reduce physiological state

31 Drive Reduction Reinforcers Works well with primary reinforcers Many secondary reinforcers have no physiological value Hull: association links secondary to drive Some reinforcers hard to classify as primary or secondary Some increase a physiological state Some necessities undetectable Roller coasters Vitamins Saccharin

32 Relative Value Theory & Premack Principle Treat reinforcers as behaviours Is it the food, or the behaviour of eating that is the reinforcer? Behavioural probability scale Greater or lesser value of behaviours relative to one another No distinction between primary and secondary

33 Premack Principle One behaviour will reinforce a second behaviour High probability behaviour reinforces low probability behaviour Baseline probability scale Time Rank order Reinforcement relativity No absolutes Probability of response = (Time spent on response) / (Total time)

34 Example Behaviours Eat ice cream (I), play video game (V), read book (B) Baseline (30 minutes) Student 1: I (2min), V (8min), B (20min) Scale: I -- V -- B Student 2: I (8min), V (20min), B (2min) Scale: B -- I -- V Student 1: V reinforces I, B reinforces V & I Student 2: I reinforces B, V reinforces I & B
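
The example above can be reproduced with a short sketch: convert each student's baseline minutes into probabilities (time spent on a response divided by total time), then let every behaviour reinforce the behaviours with lower baseline probability. The function names are illustrative and not from the slides.

```python
def probabilities(baseline_minutes: dict) -> dict:
    """Probability of each response = time spent on it / total baseline time."""
    total = sum(baseline_minutes.values())
    return {behaviour: minutes / total for behaviour, minutes in baseline_minutes.items()}


def reinforces(baseline_minutes: dict) -> dict:
    """Map each behaviour to the lower-probability behaviours it should reinforce."""
    p = probabilities(baseline_minutes)
    return {
        high: sorted(low for low in p if p[low] < p[high])
        for high in p
    }


# Baseline data from the slide (30-minute baseline).
# I = eat ice cream, V = play video game, B = read book.
student_1 = {"I": 2, "V": 8, "B": 20}
student_2 = {"I": 8, "V": 20, "B": 2}

for name, baseline in [("Student 1", student_1), ("Student 2", student_2)]:
    # Rank behaviours from lowest to highest baseline probability.
    scale = sorted(baseline, key=baseline.get)
    print(name, "scale:", " -- ".join(scale), "| reinforces:", reinforces(baseline))
```

Because the ranking depends only on each individual's own baseline, the same activity (reading, here) is the strongest reinforcer for one student and the weakest for the other.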

35 Problems Baseline phase Fair rating? How to compare very different behaviours Time problems What if time not important to behaviour? Behaviour duration? Length of baseline period?

36 Response Deprivation Theory Deprived behaviours = reinforcing behaviours Drop below baseline level of performance Not relative frequency of one behaviour compared to another (i.e., Premack) Level of deprivation for a behaviour Praise? “Yes”?

38 Definitions Escape: Get away from aversive stimulus that is in progress. Avoidance: Get away from aversive stimulus before it begins

39 Shuttle Box Solomon & Wynne (1953) Dogs Chamber with barrier; Shock Light off as signal

40 Two-Process Theory Classical and operant conditioning Shock = US Fear/pain/jump/twitch/squeal = UR Darkness = CS Fear of dark = CR Fear: heart rate, breathing, stomach cramps, etc. Negative reinforcement Removal of fear (CR) Escape from CS, not avoidance of shock

41 Support for Two-Process Theory Rescorla & LoLordo (1965) Dog in shuttlebox No signal Response gives “safe time” Pair tone with shock Tone increases rate of response CS can amplify avoidance Conditioned inhibition can reduce avoidance

42 Problems with Two-Process Theory Avoidance without observable fear Heart rate Not consistent Fear diminishes with avoidance learning

43 Measuring Fear Kamin, Brimer, and Black (1963) Lever press ---> food Auditory CS ---> avoidance in shuttle box until: 1, 3, 9, 27 avoidances in a row CS in Skinner box; check for suppression of lever press
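
Suppression of lever pressing is often summarized as a suppression ratio: presses during the CS divided by presses during the CS plus presses in an equal period just before it. The slide does not say which measure was used, so treating it as this ratio is an assumption, and the press counts below are hypothetical.

```python
def suppression_ratio(cs_presses: int, pre_cs_presses: int) -> float:
    """Return cs_presses / (cs_presses + pre_cs_presses).

    0.0 means lever pressing stopped completely during the CS (strong fear);
    0.5 means the CS did not change the pressing rate (no measurable fear).
    """
    total = cs_presses + pre_cs_presses
    if total == 0:
        raise ValueError("no lever presses recorded in either period")
    return cs_presses / total


# Hypothetical counts for illustration only, not data from the study.
print(suppression_ratio(0, 20))   # pressing fully suppressed during the CS
print(suppression_ratio(20, 20))  # pressing unaffected by the CS
```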

44 Results Fear decreases during extended avoidance training But, avoidance still strong Even low fear is enough? [Figure: responding plotted against number of avoidance responses (1, 3, 9, 27)]

45 Extinction in Avoidance Behaviour Odd prediction from two-process theory “Yo-yo” effect Avoidance should toggle But! Avoidance is extremely persistent [Figure axes: successful avoidance trials; # of US received]

46 One-Process Theory Classical conditioning component unnecessary Avoidance, not fear reduction, is reinforcer “Safety”

47 Sidman Avoidance Task Free-operant avoidance Can avoidance be learned if no warning CS? Shock at random intervals Response gives safe time Extensive training --> learn avoidance But, usually never perfect High variability across subjects Two-process theory suggests: Time becomes a CS (time elicits fear)

48 Herrnstein & Hineline (1966) Rapid and slow shock rate schedules Lever press switches schedules Shocks presented randomly, no signal Responses give shock reduction Reduction in shock is reinforcer

49 Learned Helplessness Behaviour has no effect on situation Generalizes Laboratory Give inescapable shocks Shuttle box Will not switch sides Expectation that behaviour has no effect

50 Learned Helplessness in Humans Depression Situations beyond your control Three dimensions Situation: specific or global Attribute: internal or external Time: short-term or long-term

51 Maier & Seligman (1976) Motivational impairment Cognitive impairment Emotional impairment

52 Therapeutic Application Confidence building (“can not fail”) Implementation issues Tasks that can be successfully completed Produces immunization Escapable condition … inescapable condition Learned helplessness less likely to develop

