1 Operant Conditioning: Skinner, positive & negative reinforcement, response cost, punishment and schedules of reinforcement
2 Three-phase model of operant conditioning
Skinner used the term “operant conditioning”; Thorndike called it “instrumental learning”.
An operant is a response that “operates” (acts) on the environment to produce some kind of effect.
Eg. in Thorndike’s experiment the operants were the cat biting the bar and clawing at the box.
Based on Thorndike’s law of effect: an organism will tend to repeat a behaviour (operant) that has a desirable consequence (cat gets fish) or that enables it to avoid an undesirable consequence (detention).
An organism will also tend not to repeat a behaviour that has an undesirable consequence (speeding fine = speed less).
3 Components of operant conditioning: S.R.C
S = Stimulus that comes before the operant response
R = Operant response to the stimulus
C = Consequence of the operant response
Example: Thorndike’s cat puzzle box experiment
S = box
R = sequence of movements to open the door (operating on the environment)
C = escaping the box and getting fish
6 Reinforcement and Punishment
Skinner’s and Thorndike’s studies provide evidence for the concept of reinforcement, because learning through operant conditioning occurs as a result of the consequences of behaviour.
Reinforcement and punishment are the main aspects of operant conditioning.
7 Reinforcement
Reinforcement occurs when a stimulus (object or event) strengthens or increases the likelihood or frequency of the response that it follows.
A reinforcer is any stimulus (object or event) that increases the likelihood of the response that it follows. The reinforcer is the stimulus that allows reinforcement to occur.
8 Positive and Negative reinforcement
Positive reinforcement (adds something) +
Presenting a stimulus (positive reinforcer) that strengthens or increases the likelihood of a desired response by providing a satisfying consequence. Eg. being well behaved in class to get a gold star next to your name; cleaning your room to get pocket money.
Negative reinforcement (takes something away) –
Removing an unpleasant stimulus, which strengthens or increases the likelihood of a desired response. Eg. leaving home early one day and finding no traffic on the road may encourage you to leave home early again (response) in the future to avoid heavy traffic (removal of unpleasant stimulus).
9 Schedules of reinforcement
Schedules of reinforcement are the programs that determine how often reinforcement is given in relation to the correct response.
Continuous reinforcement = reinforcement is provided immediately after every correct / desired response.
Partial reinforcement = reinforcement is provided for some correct / desired responses but not all of them.
10 Four types of partial reinforcement
Fixed-ratio schedule
A reinforcer is given after a set (fixed) number of correct responses (ratio) have been made. Eg. a ratio of 1:5 means one reinforcer for every five correct responses. Eg. factory workers may be paid a certain amount for every 5 garments that they make.
Variable-ratio schedule
A reinforcer is given after an unpredictable (variable) number of correct responses (ratio) have been made. Eg. one reinforcer for every 5 correct responses on average, but delivered after 1, 7, 11 etc. responses.
11 Partial Reinforcement
Fixed-interval schedule
A reinforcer is given after a specific fixed period of time has elapsed (interval) since the previous reinforcer, provided the correct response has been made. Eg. if workers are given monthly reviews, they may work harder in the weeks leading up to their review than in the days after it.
Variable-interval schedule
A reinforcer is given after irregular (variable) periods of time have passed (interval), provided the correct response has been made. The intervals average out to a mean period of time, but each individual interval is unpredictable.
On both interval schedules, responses made before the interval has passed will not be reinforced, even if they are correct.
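
The schedules on slides 9 to 11 can be sketched as small decision rules that answer one question per correct response: is it reinforced? The Python below is an illustrative sketch only. The class names, the `run` helper and the use of uniform random draws for the "variable" schedules are assumptions made for this example; the slides say only that the number of responses or the time interval is unpredictable around a mean.

```python
import random


class FixedRatio:
    """Reinforce every n-th correct response (n = 1 is continuous reinforcement)."""

    def __init__(self, n):
        self.n = n
        self.count = 0

    def respond(self, now):
        # `now` is ignored: ratio schedules depend only on the response count.
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses that averages `mean`."""

    def __init__(self, mean, rng):
        self.mean, self.rng = mean, rng
        self.count = 0
        self.target = self._draw()

    def _draw(self):
        # Assumption: uniform on 1 .. 2*mean - 1, so the long-run average is `mean`.
        return self.rng.randint(1, 2 * self.mean - 1)

    def respond(self, now):
        self.count += 1
        if self.count >= self.target:
            self.count = 0
            self.target = self._draw()
            return True
        return False


class FixedInterval:
    """Reinforce the first correct response made once `interval` time units
    have elapsed since the previous reinforcer."""

    def __init__(self, interval):
        self.interval = interval
        self.last = 0.0

    def respond(self, now):
        if now - self.last >= self.interval:
            self.last = now
            return True
        return False  # too early: not reinforced, even though correct


class VariableInterval:
    """Like FixedInterval, but each waiting period is unpredictable around `mean`."""

    def __init__(self, mean, rng):
        self.mean, self.rng = mean, rng
        self.last = 0.0
        self.wait = self._draw()

    def _draw(self):
        # Assumption: uniform on 0 .. 2*mean, so the average wait is `mean`.
        return self.rng.uniform(0, 2 * self.mean)

    def respond(self, now):
        if now - self.last >= self.wait:
            self.last = now
            self.wait = self._draw()
            return True
        return False


def run(schedule, response_times):
    """Return one True/False per correct response: was it reinforced?"""
    return [schedule.respond(t) for t in response_times]


# Fixed ratio 1:5, as in the garment example: every fifth response is reinforced.
print(run(FixedRatio(5), range(1, 11)))
# [False, False, False, False, True, False, False, False, False, True]
```

Passing a seeded `random.Random` to the variable schedules keeps a run repeatable, which is convenient when comparing how often each schedule pays out for the same stream of responses.
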
12 Punishment
Punishment is the delivery of an unpleasant consequence following a response, or the removal of a pleasant consequence following a response.
Eg. delivery of an unpleasant consequence following a response (smacking a child after they misbehave).
Eg. removal of a pleasant consequence following a response (losing money through a fine).
Punishment is different to negative reinforcement. Negative reinforcement (NR) is the removal of an unpleasant stimulus to increase the likelihood of a response recurring. Punishment imposes an unpleasant consequence (or removes a pleasant one) and so weakens the response or decreases the likelihood of it occurring again. Also, punishment is ‘given’ or ‘applied’, whereas in negative reinforcement the unpleasant stimulus is avoided or prevented.
13 Positive and Negative punishment
Positive punishment +
The presentation of an unpleasant stimulus, which decreases or weakens the likelihood of the response occurring again. Eg. having arrived late to sport training, you are made to run 5 laps to decrease the likelihood that you will be late again.
Negative punishment (response cost) –
The removal of a pleasant stimulus, which decreases or weakens the likelihood of the response occurring again. Eg. having your mobile phone taken away for using it in class.
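
The four consequence types from slides 8, 12 and 13 differ on just two questions: is a stimulus added or removed, and does the response become more or less likely? The lookup below is a minimal sketch of that two-by-two grid. The function name `classify_consequence` and the string labels are hypothetical, chosen only for this illustration.

```python
def classify_consequence(stimulus_change, effect_on_response):
    """Name the type of consequence.

    stimulus_change:    "added" or "removed"
    effect_on_response: "increases" or "decreases" (likelihood of the response)
    """
    grid = {
        ("added", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("added", "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
    }
    key = (stimulus_change, effect_on_response)
    if key not in grid:
        raise ValueError(f"unknown combination: {key}")
    return grid[key]


# Examples taken from the slides:
print(classify_consequence("added", "increases"))    # gold star for good behaviour
print(classify_consequence("removed", "increases"))  # leaving early avoids heavy traffic
print(classify_consequence("added", "decreases"))    # running 5 laps for being late
print(classify_consequence("removed", "decreases"))  # phone taken away in class
```
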
14 Factors that influence the effectiveness of reinforcement and punishment: OAT
O = Order of presentation. To be effective, the reinforcement or punishment must be presented after the response, never before.
A = Appropriateness. The reinforcement or punishment must be appropriate for the behaviour or response that has occurred. It must also suit the characteristics of the individual.
T = Timing. Reinforcement and punishment should be given immediately after the response has occurred.
15 Key processes in Operant Conditioning
Acquisition
The establishment of a response through reinforcement. The types of behaviour that are learned in operant conditioning are more complex than the simple responses of classical conditioning.
Extinction
Gradual decrease in the strength of a conditioned (learned) response following consistent non-reinforcement of that response. Eg. when Skinner’s rats stopped receiving food pellets, their conditioned response (pressing the lever) was extinguished. Extinction is less likely to occur after partial reinforcement. (Eg. gamblers are less likely to stop as the reward is unpredictable.)
16 Spontaneous recovery
The organism again exhibits the response after apparent extinction, in the absence of reinforcement. The response is weaker and doesn’t last long.
Stimulus generalisation
Occurs when the correct response is made to another stimulus that is similar to the original one. Eg. the sound of a car backfiring may cause athletes to generalise this sound to a starter’s pistol and begin running.
17 Stimulus discrimination
The organism makes the correct response to a stimulus and is reinforced, but does not make that response to other, similar stimuli. Eg. a sniffer dog will only bark at certain smells (drugs and specific plant matter), not at every smell.
18 Applications of Operant Conditioning
Applications of behaviour modification include shaping and token economies.
Shaping
Also known as the ‘method of successive approximations’. It means giving reinforcement for any response that successively approximates or moves towards the desired response or behaviour. Eg. shaping may be used when teaching and encouraging young children to swim.
19 Token Economies
Settings in which an individual who exhibits the desired behaviour receives tokens (reinforcers). The tokens are collected and can later be exchanged for other reinforcers in the form of actual tangible rewards. Eg. in prison, an inmate’s good behaviour may earn him a token which can be cashed in for special rewards such as cigarettes and privileges.
Token economies can easily fail, especially if people feel they are being manipulated.
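
The earn-then-exchange cycle of a token economy can be sketched as a simple ledger. Everything named here is hypothetical and for illustration only: the `TokenEconomy` class, its method names, and the token prices attached to each reward are not taken from the slides.

```python
class TokenEconomy:
    """Track tokens earned for desired behaviour and spent on tangible rewards."""

    def __init__(self, prices):
        self.prices = dict(prices)  # reward name -> cost in tokens (assumed values)
        self.balances = {}          # person -> tokens currently held

    def reward_behaviour(self, person, tokens=1):
        """Desired behaviour was shown: hand over tokens (the reinforcers)."""
        self.balances[person] = self.balances.get(person, 0) + tokens

    def exchange(self, person, reward):
        """Swap tokens for a tangible reward; return False if too few tokens."""
        cost = self.prices[reward]
        if self.balances.get(person, 0) < cost:
            return False
        self.balances[person] -= cost
        return True


# Usage: three instances of good behaviour, then an attempt to cash in.
economy = TokenEconomy({"extra privilege": 3})
for _ in range(3):
    economy.reward_behaviour("inmate A")
print(economy.exchange("inmate A", "extra privilege"))  # True
print(economy.exchange("inmate A", "extra privilege"))  # False, no tokens left
```
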
20 Comparison of Classical and Operant conditioning
See handout for similarities and differences.