
The Matching Law Richard J. Herrnstein

Reinforcement schedule Fixed-Ratio (FR): the first response made after a given number of responses is reinforced (e.g., a piecework pay system). Variable-Ratio (VR): like FR, except that the number of responses required varies between reinforcements (e.g., gambling).

Reinforcement schedule Fixed-Interval (FI): the first response made after a given time interval is reinforced (e.g., a stipend). Variable-Interval (VI): like FI, except that the interval requirement varies between reinforcers around some specified average value (e.g., fishing).
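As a concrete (and entirely illustrative) sketch, the four basic schedules can be written as small stateful rules. The function names and parameters below are my own, not from the slides:

```python
import random

def fr_schedule(n):
    """Fixed-Ratio: reinforce the n-th response after the last reinforcer."""
    count = 0
    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True   # reinforcer delivered
        return False
    return respond

def vr_schedule(mean_n, rng=random.Random(0)):
    """Variable-Ratio: like FR, but the required count varies around mean_n."""
    count, required = 0, rng.randint(1, 2 * mean_n - 1)
    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count, required = 0, rng.randint(1, 2 * mean_n - 1)
            return True
        return False
    return respond

def fi_schedule(interval):
    """Fixed-Interval: the first response at least `interval` time units
    after the last reinforcer is reinforced."""
    last = 0.0
    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False
    return respond

def vi_schedule(mean_interval, rng=random.Random(1)):
    """Variable-Interval: like FI, but the required interval varies
    (here, exponentially) around the mean."""
    last, required = 0.0, rng.expovariate(1.0 / mean_interval)
    def respond(t):
        nonlocal last, required
        if t - last >= required:
            last, required = t, rng.expovariate(1.0 / mean_interval)
            return True
        return False
    return respond

# An FR(3) key pays off on every third peck:
fr3 = fr_schedule(3)
print([fr3() for _ in range(6)])   # [False, False, True, False, False, True]
```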

Ch1 – Experimental design In this experiment, reinforcement was given to pigeons for pecking either of two keys. Reinforcement for each key was delivered on a variable-interval schedule that was independent of the schedule for the other key.

Ch1 – Experimental design The mean time interval between reinforcements on each key was the primary independent variable. The intervals were chosen so that the overall mean interval of reinforcement, pooled across the two keys, was held constant at 1.5 minutes. For example: VI(3) VI(3); VI(2.25) VI(4.5); VI(1.5) VI(∞).
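Because the two VI schedules run independently and concurrently, their programmed reinforcement rates add, so the pooled mean interval for each listed pairing can be checked with a few lines (a sketch using the pairings above):

```python
def combined_interval(*intervals):
    """Mean interval of the pooled reinforcement stream from concurrent,
    independent VI schedules. Rates (1/interval) add; an interval of
    infinity contributes zero rate."""
    total_rate = sum(0.0 if i == float("inf") else 1.0 / i for i in intervals)
    return 1.0 / total_rate

for pair in [(3.0, 3.0), (2.25, 4.5), (1.5, float("inf"))]:
    print(pair, combined_interval(*pair))   # each pair pools to 1.5 minutes
```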

Ch1 – Result Relative frequency of responding on Key A: it is almost exactly equal to the relative frequency of reinforcement on that key.

Ch1 – Result The absolute rate of responding on each of the keys: approximately a linear function of the reinforcement rate, passing through the origin.

Ch1 – Discussion Rate of responding is a linear measure of response strength. The relative frequency of responding on a given key closely approximates the relative frequency of reinforcement on that key.

Ch4 – Maximizing vs. Matching From the maximizer's viewpoint, equilibrium is reached when no redistribution of choices can improve the obtained outcomes. Matching requires the ratio of the frequencies of any two behaviors, B1 and B2, to match that of their obtained reinforcements, R1 and R2: B1/B2 = R1/R2.
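The matching relation B1/B2 = R1/R2 is equivalent to B1/(B1+B2) = R1/(R1+R2), so the predicted choice proportion is a one-liner (illustrative function name, not from the text):

```python
def matching_choice_proportion(r1, r2):
    """Predicted proportion of behavior allocated to alternative 1,
    given obtained reinforcement rates r1 and r2 (matching law)."""
    return r1 / (r1 + r2)

print(matching_choice_proportion(40, 20))  # two thirds of responses go to the richer key
```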

Ch4 – Melioration How does an organism come to match its distribution of choices to the obtained reinforcements? By shifting behavior toward the alternative with the higher local rate of reinforcement. Let RD denote the difference between the two local rates. If RD is zero, equilibrium is achieved; when RD > 0, time allocation shifts toward t1; when RD < 0, it shifts toward t2.
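This dynamic can be sketched as a toy simulation on concurrent VI VI (a caricature, not the authors' quantitative model): the local rate on a key is the reinforcement earned there per unit of local time, and the allocation t1 drifts in the direction of the sign of RD. Its fixed point reproduces matching:

```python
def melioration(rate1, rate2, steps=2000, lr=0.001):
    """Toy melioration dynamic on concurrent VI VI. rate1, rate2 are the
    programmed reinforcement rates; t1 is the proportion of time on key 1.
    Local rate on key i = rate_i / time share; t1 drifts toward the
    alternative with the higher local rate until RD = 0."""
    t1 = 0.5
    for _ in range(steps):
        local1 = rate1 / t1          # reinforcements per unit of local time, key 1
        local2 = rate2 / (1 - t1)
        rd = local1 - local2         # RD > 0 pulls allocation toward t1
        step = lr if rd > 0 else -lr if rd < 0 else 0.0
        t1 = min(0.99, max(0.01, t1 + step))
    return t1

# At equilibrium (RD = 0), t1/t2 = rate1/rate2, i.e., matching:
print(melioration(2.0, 1.0))   # ≈ 0.667
```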

Ch4 – Comparison between matching and maximization Melioration implies that behavior maximizes total reinforcement, R T, under two and only two conditions, as follows:

Ch4 – Concurrent VR VR

Ch4 – Concurrent VR VR When alternative 1 reinforces with a higher probability than alternative 2, melioration predicts exclusive preference for alternative 1, since RD > 0 at all allocations. Maximization predicts likewise, because RT is at its maximum at t1 = 1.
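A small numeric check of the concurrent VR VR case (probabilities here are illustrative, not the experiment's values): with payoff probabilities p1 > p2, total reinforcement RT is linear in t1 and peaks at exclusive preference, while the local rates (p_i times the response rate) do not depend on the allocation, so RD keeps the same sign everywhere:

```python
def vr_vr_total(t1, p1, p2, resp_rate=1.0):
    """Total reinforcement rate on concurrent VR VR: each response on key i
    pays off with probability p_i, so RT is linear in the time allocation t1."""
    return resp_rate * (t1 * p1 + (1 - t1) * p2)

p1, p2 = 0.2, 0.1  # alternative 1 reinforces with the higher probability
# Local rates per unit time are p_i * resp_rate, independent of t1, so RD > 0 everywhere:
rd = (p1 - p2) * 1.0
# RT is maximal at exclusive preference, t1 = 1:
best_t1 = max(range(11), key=lambda t: vr_vr_total(t / 10, p1, p2)) / 10
print(rd > 0, best_t1)   # True 1.0
```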

Ch4 – Concurrent VI VR Local rate of reinforcement for the VI, where V: scheduled average interreinforcement time; d1: average interresponse time during responding on the VI; I: a measure of interchangeover time between the two schedules; t1: proportion of time spent on the VI; (1 − t1): proportion of time spent on the VR. Local rate of reinforcement for the VR.

Ch4 – Concurrent VI VR

Ch4 – Concurrent VI VR The optimal strategy for conc VI VR would seem to call for spending most of the time on the VR, with occasional forays to the VI to collect any reinforcement that has come due. Nevertheless, no subject displayed any such bias toward the VR. (Figure: solid, best-fitting line; dashed, prediction of matching; dot-dashed, prediction of maximization.)

Ch4 – Concurrent VI VR Divergence between the two theories: the value of RT when RD = 0 is about 15 percent lower than when RT is maximized; this difference is the reinforcement cost of matching.

Ch4 – Mazur’s Experiment VI 45-second schedule, which randomly and equally often assigned each dark period to one key or the other. During the 3-second dark periods, a small ration of food was delivered with the following probability :

Ch4 – Mazur’s Experiment For a maximizer, the pigeons should always sample each alternative frequently and equally. For a matcher, the pigeons should shift preference along with the proportions of yielding food.

Ch4 – Mazur’s Experiment

Ch4 – Vaughan’s Experiment Modified Conc VI VI schedule Schedule values were updated every 4 minutes of responding. In condition a, the left schedule reinforces at a higher rate than the right schedule; in condition b, vice versa.

Ch4 – Vaughan’s Experiment Maximization picture In either condition, a subject earned the maximum, 180 reinforcements per hour, by spending of its time responding to the right altenative.

Ch4 – Vaughan’s Experiment Melioration picture Melioration should have held choice within the interval from during condition a and within the interval during condition b.

Ch4 – Vaughan’s Experiment The results for the three pigeons

Limits of Melioration Comparative: most of the data showing the differing predictions of maximizing and melioration came from pigeons. Psychophysical: there is ambiguity in the meaning of "local" rates of reinforcement. Motivational: food was used as the only reinforcer. Procedural: melioration can be generalized from concurrent to single-response procedures and multiple schedules, but there is not yet a fully satisfactory formula for multiple-schedule responding. Inherent limits: is the class of equally reinforced movements also the class of maximally reinforced movements?