Instrumental Learning & Operant Reinforcement

Name: Instrumental Learning & Operant Reinforcement
Uploaded: 2017-08-11T20:22:36+00:00
Duration: PT13M55S
Channel: Connor Rust
Description: Instrumental Learning & Operant Reinforcement

Chapter 5: Instrumental Learning & Operant Reinforcement

Operant Learning
Stimulus --> Response --> Outcome

Classical vs. Operant
Classical: requires reflex action; neutral stimulus associated with US; outside of subject’s control
Operant: strengthening/weakening of “voluntary” action; subject responds or doesn’t
Can operate together

What’s in a Name? Operant learning: subject operates on environment
Instrumental conditioning: subject is instrumental in obtaining outcome

Trial and Error Learning
E.L. Thorndike Animal intelligence Maze studies

Puzzle Box Cats Cage with mechanism to open door Escape latency
Discrete trial procedure

Law of Effect Any behaviour followed by an appetitive stimulus will increase in frequency

Terms
Operant (response): any behaviour that operates on the environment to produce an effect
Reinforcer: any event that increases the frequency of a behaviour
Punisher: any event that decreases the frequency of a behaviour

Operant Learning B.F. Skinner Operant chamber Free operant procedure

Discrete Trial & Free Operant
Discrete trial: one trial at a time; “apparatus” must be re-set; measure some behaviour; e.g., mazes
Free operant: can occur at any time; operant can occur repeatedly; response rate; e.g., operant chamber

Four Contingencies Positive reinforcement Negative reinforcement
Positive punishment Negative punishment

Positive and Negative Positive: presents some stimulus
Negative: removes some stimulus

Reinforcers and Punishers
Reinforcer: increases a behaviour Punisher: decreases a behaviour

Contingencies
Response causes stimulus to be | Response rate increases | Response rate decreases
Presented | Positive reinforcement (lever press --> food) | Positive punishment (lever press --> shock)
Removed | Negative reinforcement (lever press --> shock off) | Negative punishment (lever press --> food removed)
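
The four contingencies follow from two yes/no questions, so they can be looked up mechanically. The sketch below is only an illustration; the function name and the string labels are assumptions, not part of the slides.

```python
def classify_contingency(stimulus_change: str, rate_change: str) -> str:
    """Name the contingency.

    stimulus_change: "presented" or "removed" (what the response does to the stimulus)
    rate_change: "increases" or "decreases" (what happens to response rate)
    """
    # Positive/negative describes the stimulus; it says nothing about "good" or "bad".
    sign = {"presented": "positive", "removed": "negative"}[stimulus_change]
    # Reinforcement/punishment describes the effect on the behaviour.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[rate_change]
    return f"{sign} {kind}"


# The four lever-press examples from the slide.
examples = {
    "lever press --> food": ("presented", "increases"),
    "lever press --> shock": ("presented", "decreases"),
    "lever press --> shock off": ("removed", "increases"),
    "lever press --> food removed": ("removed", "decreases"),
}
for description, (stimulus_change, rate_change) in examples.items():
    print(description, "=", classify_contingency(stimulus_change, rate_change))
```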

Types of Reinforcers
Primary: not dependent on an association with other reinforcers
Secondary: initially neutral stimulus; paired with primary reinforcer; “conditioned reinforcer”

Secondary Reinforcers
“Bridging”, “clicker”
Secondary reinforcer extinguishes without periodic pairings with primary
Generally weaker than primary
Generalized reinforcer: paired with many other kinds of reinforcers (e.g., money)

Strength of Operant Learning
Can condition practically any behaviour Shaping (successive approximations)

Shaping a Lever Press Gradual process
Reinforce more appropriate/precise responses Feedback

Response Chains Sequences of behaviours in specific order
Objective: primary reinforcer Conditioned reinforcers Discriminative stimuli

Forward Chaining Start with first response in sequence, then work through to last response in additive steps

Backwards Chaining Often used with “complex” training
Start with last response in chain Next, second last response Third last, etc.
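
The difference between the two chaining procedures is only the order in which links are added to the trained sequence. A minimal sketch follows, assuming a hypothetical three-link chain; the response names are invented for illustration.

```python
def forward_chaining(chain):
    """Training stages: first response alone, then the first two, and so on."""
    return [chain[:i] for i in range(1, len(chain) + 1)]


def backward_chaining(chain):
    """Training stages: last response alone, then the last two, and so on.

    Every stage ends with the final response, i.e. the one followed by the
    primary reinforcer.
    """
    return [chain[-i:] for i in range(1, len(chain) + 1)]


# Hypothetical response chain (not from the slides).
chain = ["pull cord", "climb ladder", "press lever"]

for stage in forward_chaining(chain):
    print("forward: ", " --> ".join(stage))
for stage in backward_chaining(chain):
    print("backward:", " --> ".join(stage))
```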

Factors in Operant Learning

Contingency Correlation between behaviour & outcome
Strong contingency --> better learning Random contingency --> no learning Both reinforcement and punishment
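
One way to put a number on contingency is the difference between the probability of the outcome after a response and the probability of the outcome without one. The slide only says "correlation", so this particular measure (often written ΔP) is an assumption used for illustration, as are the function name and the trial data.

```python
def delta_p(trials):
    """Contingency as P(outcome | response) - P(outcome | no response).

    trials: list of (responded, outcome) pairs of booleans.
    """
    after_response = [outcome for responded, outcome in trials if responded]
    after_no_response = [outcome for responded, outcome in trials if not responded]
    if not after_response or not after_no_response:
        raise ValueError("need trials both with and without a response")
    p_with = sum(after_response) / len(after_response)
    p_without = sum(after_no_response) / len(after_no_response)
    return p_with - p_without


# Strong contingency: the outcome almost always follows the response and never occurs otherwise.
strong_trials = [(True, True)] * 9 + [(True, False)] + [(False, False)] * 10
# Random contingency: the outcome is equally likely with or without the response.
random_trials = ([(True, True)] * 5 + [(True, False)] * 5
                 + [(False, True)] * 5 + [(False, False)] * 5)

print(delta_p(strong_trials))
print(delta_p(random_trials))
```

A value near 1 corresponds to the strong contingency on the slide, and a value of 0 to the random contingency under which no learning is expected.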

Contiguity
Time between behaviour & outcome
Shorter = better learning
Delays let other behaviours occur, forgetting, extinction (behaviour w/o reinforcement)
Learning with delay if stimulus “placeholder” provided (conditioned reinforcer?)
More important for punishment

Reinforcer Characteristics
Larger reinforcers --> stronger learning Not a linear effect Qualitative differences in reinforcers and punishers Species & individual differences Intensity of punisher

Task Characteristics Some tasks easier to learn than others
Species & individual differences Innate and/or prior conditioning

Deprivation Levels Generally, the greater the deprivation, the more effective the reinforcer Reinforcers can satiate Deprivation can provide motivation to engage in punishable behaviours

Extinction
Behaviour does not lead to same outcome
Response no longer produces same outcome
Extinction burst (with reinforcement)
Variability of behaviour
Aggression and frustration
Spontaneous recovery
Resurgence

Theories of Reinforcement

Hull’s Drive Reduction Theory
Animals have motivational states (drives) Necessary for survival Reinforcers are things that reduce drives Physiological value Reduce physiological state

Drive Reduction Reinforcers
Works well with primary reinforcers
Many secondary reinforcers have no physiological value
Hull: association links secondary to drive
Some reinforcers hard to classify as primary or secondary
Some increase a physiological state
Some necessities undetectable
Examples: roller coasters, vitamins, saccharin

Relative Value Theory & Premack Principle
Treat reinforcers as behaviours Is it the food, or the behaviour of eating that is the reinforcer? Behavioural probability scale Greater or lesser value of behaviours relative to one another No distinction between primary and secondary

Premack Principle
One behaviour will reinforce a second behaviour
High probability behaviour reinforces low probability behaviour
Baseline probability scale: time, rank order
Reinforcement relativity: no absolutes
Probability of response = time spent on response / total time

Example
Behaviours: eat ice cream (I), play video game (V), read book (B)
Baseline (30 minutes)
Student 1: I (2 min), V (8 min), B (20 min); scale: I -- V -- B
Student 2: I (8 min), V (20 min), B (2 min); scale: B -- I -- V
Student 1: V reinforces I; B reinforces V & I
Student 2: I reinforces B; V reinforces I & B
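
The baseline example can be worked through in a few lines: divide the time spent on each behaviour by the total baseline time, rank the behaviours, and read off which behaviour can reinforce which. This is a sketch of that arithmetic only; the function names are assumptions.

```python
def probability_scale(minutes, total=30):
    """Return (behaviours ranked from lowest to highest probability, probabilities)."""
    probabilities = {behaviour: spent / total for behaviour, spent in minutes.items()}
    ranked = sorted(probabilities, key=probabilities.get)
    return ranked, probabilities


def reinforces(minutes, reinforcer, target):
    """Premack principle: a higher-probability behaviour reinforces a lower-probability one."""
    return minutes[reinforcer] > minutes[target]


# Baseline data from the slide (minutes out of a 30-minute baseline).
student_1 = {"I": 2, "V": 8, "B": 20}
student_2 = {"I": 8, "V": 20, "B": 2}

for name, minutes in [("Student 1", student_1), ("Student 2", student_2)]:
    ranked, probabilities = probability_scale(minutes)
    print(name, "scale:", " -- ".join(ranked), probabilities)

# Reading reinforces ice cream eating for Student 1 but not for Student 2.
print(reinforces(student_1, "B", "I"), reinforces(student_2, "B", "I"))
```

The same behaviour sits at opposite ends of the two scales, which is the point of reinforcement relativity: whether reading works as a reinforcer depends on the individual's baseline.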

Problems
Baseline phase: fair rating? How to compare very different behaviours
Time problems: what if time not important to behaviour? Behaviour duration? Length of baseline period?

Response Deprivation Theory
Deprived behaviours = reinforcing behaviours Drop below baseline level of performance Not relative frequency of one behaviour compared to another (i.e., Premack) Level of deprivation for a behaviour Praise? “Yes”?
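
The contrast with the Premack principle can be made concrete: under response deprivation, what matters is whether access to a behaviour has been pushed below its own baseline, not how it ranks against other behaviours. The sketch below assumes a simple threshold comparison and reuses reading time for Student 2 from the Premack example; the function name and the restricted value are illustrative.

```python
def is_deprived(baseline_minutes, allowed_minutes):
    """True if access to the behaviour is restricted below its baseline level."""
    return allowed_minutes < baseline_minutes


# Student 2 read for only 2 of 30 baseline minutes, the lowest-probability behaviour.
reading_baseline = 2

# Restricted to no reading at all: below baseline, so access to reading
# could now serve as a reinforcer despite its low rank.
print(is_deprived(reading_baseline, allowed_minutes=0))

# Allowed more than baseline: not deprived, so not expected to reinforce.
print(is_deprived(reading_baseline, allowed_minutes=5))
```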

Escape and Avoidance

Definitions
Escape: get away from aversive stimulus that is in progress
Avoidance: get away from aversive stimulus before it begins

Shuttle Box Solomon & Wynne (1953) Dogs Chamber with barrier; Shock
Light off as signal

Two-Process Theory
Classical and operant conditioning
Classical: shock = US; fear/pain/jump/twitch/squeal = UR; darkness = CS; fear of dark = CR
Fear: heart rate, breathing, stomach cramps, etc.
Operant: negative reinforcement; removal of fear (CR)
Escape from CS, not avoidance of shock

Support for Two-Process Theory
Rescorla & LoLordo (1965)
Dog in shuttlebox; no signal; response gives “safe time”
Pair tone with shock
Tone increases rate of response
CS can amplify avoidance
Conditioned inhibition can reduce avoidance

Problems with Two-Process Theory
Avoidance without observable fear Heart rate Not consistent Fear diminishes with avoidance learning

Measuring Fear
Kamin, Brimer, and Black (1963)
Lever press --> food
Auditory CS --> avoidance in shuttle box until 1, 3, 9, or 27 avoidances in a row
CS in Skinner box; check for suppression of lever press

Results
Fear decreases during extended avoidance training
But, avoidance still strong
Even low fear is enough?
[Figure: responding as a function of avoidance responses in a row (1, 3, 9, 27)]

Extinction in Avoidance Behaviour
Odd prediction from two-process theory: “yo-yo” effect; avoidance should toggle
But! Avoidance is extremely persistent
[Figure: successful avoidance trials vs. # of US received]

One-Process Theory Classical conditioning component unnecessary
Avoidance, not fear reduction, is reinforcer “Safety”

Sidman Avoidance Task
Free-operant avoidance
Can avoidance be learned if no warning CS?
Shock at random intervals
Response gives safe time
Extensive training --> learn avoidance
But, usually never perfect
High variability across subjects
Two-process theory suggests: time becomes a CS (time elicits fear)
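
The logic of the task (unsignalled shocks, with each response buying a period of safe time) can be illustrated with a toy simulation. Everything numeric here is an assumption made for the sketch: time is divided into discrete steps, shocks occur with a fixed probability per unsafe step, and the subject responds with a fixed probability per step. It is not a model of the original procedure or of real learning.

```python
import random


def run_session(respond_prob, steps=1000, shock_prob=0.2, safe_time=5, seed=0):
    """Count shocks received in one session of a simplified free-operant avoidance task."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    safe_until = 0             # first time step at which shocks are possible again
    shocks = 0
    for t in range(steps):
        # A response can occur at any time step (free operant) and gives safe time.
        if rng.random() < respond_prob:
            safe_until = t + safe_time
        # Outside of safe time, shocks arrive at random with no warning signal.
        if t >= safe_until and rng.random() < shock_prob:
            shocks += 1
    return shocks


# More frequent responding should leave fewer unsafe steps, and so fewer shocks.
for respond_prob in (0.0, 0.05, 0.2, 1.0):
    print(respond_prob, run_session(respond_prob))
```

In this toy version, intermediate response rates still let some shocks through; only responding on every step removes them entirely.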

Herrnstein & Hineline (1966)
Rapid and slow shock rate schedules Lever press switches schedules Shocks presented randomly, no signal Responses give shock reduction Reduction in shock is reinforcer

Learned Helplessness Behaviour has no effect on situation Generalizes
Laboratory Give inescapable shocks Shuttle box Will not switch sides Expectation that behaviour has no effect

Learned Helplessness in Humans
Depression
Situations beyond your control
Three dimensions
Situation: specific or global
Attribute: internal or external
Time: short-term or long-term

Maier & Seligman (1976) Motivational impairment Cognitive impairment
Emotional impairment

Therapeutic Application
Confidence building (“can not fail”) Implementation issues Tasks that can be successfully completed Produces immunization Escapable condition … inescapable condition Learned helplessness less likely to develop
