# Chapter 6 – Schedules of Reinforcement and Choice Behavior

Outline
- Simple Schedules of Intermittent Reinforcement
  - Ratio Schedules
  - Interval Schedules
  - Comparison of Ratio and Interval Schedules
- Choice Behavior: Concurrent Schedules
  - Measures of Choice Behavior
  - The Matching Law
- Complex Choice
  - Concurrent-Chain Schedules
  - Studies of "Self Control"

Simple Schedules of Intermittent Reinforcement Ratio Schedules
Reinforcement (RF) depends only on the number of responses performed.
- Continuous reinforcement (CRF): each response is reinforced.
  - bar press = food; key peck = food
- CRF is rare outside the lab; most reinforcement is partial or intermittent.

Partial or intermittent Schedules of Reinforcement FR (Fixed Ratio)
- Reinforcement follows a fixed number of operants (responses).
- CRF is FR 1; FR 10 = every 10th response is reinforced.
- RF was originally recorded using a cumulative recorder; computers can now produce similar graphs.
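A minimal sketch (names are my own, not from the text) of the fixed-ratio rule described above: on FR *n*, only every *n*th response is reinforced, and FR 1 reduces to CRF.

```python
# Hypothetical illustration of a fixed-ratio (FR) schedule.
def is_reinforced(response_number, ratio):
    """True when this response completes the fixed ratio."""
    return response_number % ratio == 0

# On FR 10, only every 10th response produces food.
reinforced = [n for n in range(1, 31) if is_reinforced(n, 10)]
print(reinforced)  # [10, 20, 30]

# On FR 1 (CRF), every response is reinforced.
print(all(is_reinforced(n, 1) for n in range(1, 31)))  # True
```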

Figure 6.1 – The construction of a cumulative record by a cumulative recorder for the continuous recording of behavior.

The cumulative record represents responding as a function of time
The slope of the line represents the rate of responding: steeper = faster.

Responding on FR scheds.
- Faster responding = sooner RF, so responding tends to be pretty rapid.
- Postreinforcement pause: its length is directly related to the size of the FR.
  - Small FR (e.g., FR 5) = shorter pauses.
  - Large FR (e.g., FR 100) = longer pauses; animals wait a while before they start working.
- Domjan points out this may have more to do with the upcoming work than with the recent RF: a "pre-ratio pause"?

How would you respond if you received \$1 on an FR 5 schedule? On FR 500? Would you show post-RF pauses?
- RF-history explanation of the post-RF pause: contiguity of the 1st response and RF.
  - FR 5: the 1st response is close to RF (only 4 more to go).
  - FR 100: the 1st response is a long way from RF (99 more to go).

VR (Variable ratio schedules)
- The number of responses is still critical, but it varies from trial to trial.
- VR 10: reinforced on average for every 10th response.
  - Sometimes only 1 or 2 responses are required; other times 15 or 19.

Example (# = response requirement):

| VR 10 | FR 10 |
|-------|-------|
| 19 → RF | 10 → RF |
| 2 → RF | 10 → RF |
| 8 → RF | 10 → RF |
| 18 → RF | 10 → RF |
| 5 → RF | 10 → RF |
| 15 → RF | 10 → RF |
| 12 → RF | 10 → RF |
| 1 → RF | 10 → RF |

VR 10: (19 + 2 + 8 + 18 + 5 + 15 + 12 + 1) / 8 = 10
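The averaging in the VR 10 example can be checked directly; this is a sketch of the arithmetic above, with variable names of my own choosing.

```python
# The eight VR 10 response requirements from the example table above.
vr10_requirements = [19, 2, 8, 18, 5, 15, 12, 1]
# On FR 10, every requirement is exactly 10.
fr10_requirements = [10] * 8

# The VR requirements vary trial to trial but average out to 10.
vr_mean = sum(vr10_requirements) / len(vr10_requirements)
print(vr_mean)  # 10.0
```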

VR = very little postreinforcement pause
Why would this be? Slot machines pay off on a very lean schedule of RF, but the very next lever pull could result in a payoff.

FI (Fixed Interval Schedule)
- The 1st response after a given time period has elapsed is reinforced.
- FI 10 s: the 1st response after 10 s → RF. The RF waits for the animal to respond; responses before the 10 s are not reinforced.
- Produces scalloped responding patterns: the "FI scallop."

Similarity of FI scallop and post RF pause?
The FI scallop has been used to assess animals’ ability to time.

VI (variable interval schedule)
- Time is still the important variable, but the elapsed-time requirement varies around a set average.
- VI 120 s: the time to RF can vary from a few seconds to a few minutes.
- \$1 on a VI 10-minute schedule for button presses? RF could come in seconds, or in 20 minutes. Would you show a postreinforcement pause?

Produces stable responding at a constant rate
- peck..peck..peck..peck..peck: the animal samples whether enough time has passed.
- The rate on a VI schedule is not as fast as on FR and VR schedules. Why?
  - Ratio schedules are based on responses: faster responding gets you to the response requirement quicker, regardless of what it is.
  - On a VI schedule the number of responses doesn't matter, so a steady, even pace makes sense.

Interval Schedules and Limited Hold
- Limited-hold restriction: the animal must respond within a certain amount of time after the RF sets up.
- Like lunch at school: too late and you miss it.

Comparison of Ratio and Interval Schedules
What if you hold RF constant?
- Rat 1 = VR schedule; Rat 2 = yoked control rat on VI.
- Rat 2's RF is set up whenever Rat 1 earns his RF.
- If Rat 1 responds faster, RF will set up sooner for Rat 2; if Rat 1 is slower, RF will be delayed.

Why is responding faster on ratio scheds?
Molecular view: based on moment-by-moment RF of inter-response times (IRTs).
- R1……………R2 → RF: reinforces a long IRT.
- R1..R2 → RF: reinforces a short IRT.
- Short IRTs are more likely to be reinforced on VR than on VI.

Molar view Feedback functions
- The average RF rate during the session is the result of the average response rate.
- How can the animal increase reinforcement in the long run (across the whole session)?
- Ratio: respond faster = more RF for that day.
  - FR 30, responding 1 per second → RF every 30 s.
  - Responding 2 per second → RF every 15 s.

Molar view continued Interval - No real benefit to responding faster
- Responding 1 per second: RF at 30 or 31 s (average 30.5 s).
- Responding 2 per second: RF at 30 or 30.5 s (average 30.25 s).
- Human analogs: paid a salary? Paid per client?
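The molar feedback functions above can be sketched numerically. This is an illustration with function names and the averaging assumption of my own (the next response is assumed to arrive, on average, half an inter-response time after the interval elapses, matching the 30.5 s and 30.25 s figures above).

```python
# Hypothetical molar feedback functions for ratio vs. interval schedules.
def ratio_rf_per_minute(responses_per_second, ratio=30):
    """FR 30: every 30th response is reinforced, so RF rate scales
    directly with response rate."""
    return responses_per_second * 60 / ratio

def interval_rf_per_minute(responses_per_second, interval_s=30):
    """Interval schedule: RF sets up after interval_s; on average the
    next response comes half an inter-response time later."""
    time_to_rf = interval_s + 0.5 / responses_per_second
    return 60 / time_to_rf

print(ratio_rf_per_minute(1))               # 2.0  (RF every 30 s)
print(ratio_rf_per_minute(2))               # 4.0  (RF every 15 s)
print(round(interval_rf_per_minute(1), 2))  # 1.97 (RF every ~30.5 s)
print(round(interval_rf_per_minute(2), 2))  # 1.98 (RF every ~30.25 s)
```

Doubling the response rate doubles the payoff on the ratio schedule but barely changes it on the interval schedule, which is the molar account of why ratio responding is faster.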

Choice Behavior: Concurrent schedules
The responding we have discussed so far has involved schedules where there is only one thing to do. In real life we tend to have choices among various activities. Concurrent schedules examine how an animal allocates its responding between two schedules of reinforcement; the animals are free to switch back and forth.

Figure 6.4 – Diagram of a concurrent schedule for pigeons.

Measures of choice behavior
Relative rate of responding for the left key = BL / (BL + BR)
- BL = behavior (responses) on the left key; BR = behavior on the right key.
- We are just dividing left-key responding by total responding.

This computation is very similar to the computation for the suppression ratio.
- If the animals are responding equally to each key, what should our ratio be? 20 / (20 + 20) = .50
- If they respond more to the left key? 40 / (40 + 20) = .67
- If they respond more to the right key? 20 / (20 + 40) = .33
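The three cases above follow from the same formula; here is a small sketch of the computation (function name is my own).

```python
# Relative rate of responding for the left key: BL / (BL + BR).
def relative_rate(b_left, b_right):
    return b_left / (b_left + b_right)

print(relative_rate(20, 20))           # 0.5  (equal responding)
print(round(relative_rate(40, 20), 2)) # 0.67 (more on the left key)
print(round(relative_rate(20, 40), 2)) # 0.33 (more on the right key)
```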

Relative rate of responding for right key
- Will be the complement of left-key responding (the two sum to 1.0), but it can also be calculated with the same formula: BR / (BR + BL)
- Concurrent VI 60 VI 60? The relative rate of responding for either key will be .5: responding splits equally between the two keys.

What about the relative rate of reinforcement?
- Left key? Simply divide the rate of reinforcement on the left key by total reinforcement: rL / (rL + rR)
- VI 60 VI 60, with animals dividing responding equally? .50 again.

The Matching Law
- The relative rate of responding matches the relative rate of RF: when the same VI schedule is used on both keys, .50 and .50.
- What if different schedules of RF are used on each key?

Left key = VI 6 min (10 per hour) Right key = VI 2 min (30 per hour)
Left-key relative rate of responding:
BL / (BL + BR) = rL / (rL + rR) = 10 / 40 = .25
Right key? Simply the complement, .75, but it can also be calculated:
BR / (BR + BL) = rR / (rR + rL) = 30 / 40 = .75
Thus: three times as much responding on the right key (.25 × 3 = .75).

Matching Law continued: a simpler computation
BL / BR = rL / rR = 10 / 30
Again: three times as much responding on the right key.
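The VI 6-min vs. VI 2-min example above can be worked through in both forms of the matching law; this is a sketch of that arithmetic with variable names of my own.

```python
# Reinforcers per hour on each key: VI 6 min = 10/hr, VI 2 min = 30/hr.
r_left, r_right = 10, 30

# Relative rate of reinforcement on each key.
rel_rf_left = r_left / (r_left + r_right)    # 10/40
rel_rf_right = r_right / (r_left + r_right)  # 30/40

# Matching predicts responding in the same proportions: BL/BR = rL/rR.
print(rel_rf_left, rel_rf_right)  # 0.25 0.75
print(r_right / r_left)           # 3.0 -> three times as much on the right
```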

Herrnstein (1961) compared various VI schedules
The results followed the Matching Law (see Figure 6.5 in your book).

Application of the matching law
- The matching law indicates that we match our behaviors to the available RF in the environment.
- Bulow and Meller (1998):
  - Predicted that adolescent girls living in RF-barren environments would be more likely to engage in sexual behaviors.
  - Girls with a greater array of RF opportunities should allocate their behaviors toward those other activities.
  - Surveyed girls about the activities they found rewarding and about their sexual activity.
  - The matching law did a pretty good job of predicting sexual activity.
- Many kids today have a lot of RF opportunities (Xbox, texting friends, TV), which may make it more difficult to motivate behaviors you want them to do, like homework.

Complex Choice
- Many of the choices we make require us to live with those choices; we can't always just switch back and forth.
- Go to college? Get a full-time job?
- Sometimes the short-term and long-term consequences (RF) of those choices are very different:
  - Go to college: poor now; make more later.
  - Get a full-time job: money now; less earning in the long run.

Concurrent-Chain Schedules
- Allows us to examine these complex choice behaviors in the lab.
- Example: do animals prefer a VR or an FR schedule? Is variety the spice of life?

Figure 6.6 – Diagram of a concurrent-chain schedule.

Subjects prefer the VR 10 over the FR 10
- Choice of A → 10 minutes on VR 10; choice of B → 10 minutes on FR 10.
- How do we know? Subjects will even prefer VR schedules that require somewhat more responding than the FR.
- Why do you think that happens?

Studies of Self control
Often a matter of delaying immediate gratification (RF) in order to obtain a greater reward (RF) later. Study or go to the party? Work in the summer to pay for school, or enjoy the time off?

Self control in pigeons?
Rachlin and Green (1972)
- Choice A = immediate small reward; Choice B = 4-s delay → large reward.
- Direct-choice procedure: pigeons choose the immediate, small reward.
- Concurrent-chain procedure: pigeons could learn to choose the larger reward, but only if there was a long enough delay between the initial choice and the next link.

Figure 6.7 – Diagram of the experiment by Rachlin and Green (1972) on self control.

The Value-Discounting Function
The idea that imposing a delay between a choice and the eventual outcomes helps organisms obtain "better" (higher-RF) outcomes works for people too.

Value-discounting function: V = M / (1 + KD)
- V = value of the RF
- M = magnitude of the RF
- D = delay of the reward
- K = a correction factor for how much the animal is influenced by the delay

All this equation says is that the value of a reward is inversely affected by how long you have to wait to receive it. If there is no delay (D = 0), then it is simply the magnitude over 1.

If I offer you \$50 now or \$100 now (K = 1, D in months)?
V = 50 / (1 + 1×0) = 50　　　V = 100 / (1 + 1×0) = 100

\$50 now or \$100 next year?
V = 50 / (1 + 1×0) = 50　　　V = 100 / (1 + 1×12) = 7.7
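The worked example above can be sketched in a few lines (the function name is my own; K = 1 and D in months follow the example).

```python
# Value-discounting function: V = M / (1 + K*D).
def discounted_value(magnitude, delay, k=1.0):
    return magnitude / (1 + k * delay)

print(discounted_value(50, 0))             # 50.0  ($50 now, full value)
print(discounted_value(100, 0))            # 100.0 ($100 now, full value)
print(round(discounted_value(100, 12), 1)) # 7.7   ($100 in a year)
```

With the delay imposed, the \$100 is worth less than the immediate \$50, which is why immediate small rewards so often win.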

Figure 6.8 – Hypothetical relations between reward value and waiting time to reward delivery for a small reward and a large reward presented some time later.