Positive Reinforcement
(+R)
The Four Quadrants of Operant Conditioning
Types of Conditioning
Classical (Respondent) Conditioning : Learning through the repeated pairing of two stimuli. The response originally produced by the second stimulus comes to be produced by the first stimulus alone.
​
Example : A horse sniffs a wooden board and a rat jumps out. This spooks the horse. Prior to conditioning, the horse did not spook at boards, but now, the horse may become wary of boards as he is expecting the surprise of a rat! So, the response to the rat is transferred to the board.
​
Example 2 : A horse is caught, and put to work. If the horse finds the work aversive, he may begin to try to avoid it. Prior to conditioning, the horse did not run at the presence of the head collar, but now, when he sees his human carrying it, he runs to avoid work. The association becomes head collar = work.
​
Operant (Instrumental) Conditioning : Learning in which the strength of a behaviour is increased/decreased by reinforcement/punishment.
​
Example : A horse is afraid of a wooden board due to a bad experience with a rat. A trainer rewards the horse for each step he takes toward the board until the horse is no longer afraid. The horse learns through positive reinforcement that investigating the board is a behaviour that earns rewards. The horse may be more likely to investigate boards now because he was reinforced for that behaviour.
​
Example 2 : A horse runs from his owner when she is carrying a head collar. The owner rewards the horse when he is caught, and then does work with him that he finds pleasurable. The horse learns that the head collar, and subsequently being caught, earns him a reward. The horse will be more likely to approach the owner instead of running away.
​
Behaviourist Learning Theory states that individuals learn through four quadrants, and learn to display certain behaviours in order to either achieve reinforcement (+R or -R) or avoid punishment (+P or -P).
                                Add Something                 Remove Something
Increase Behaviour Frequency    +R Positive Reinforcement     -R Negative Reinforcement
Decrease Behaviour Frequency    +P Positive Punishment        -P Negative Punishment
(+R) Positive Reinforcement : The addition of a desirable (pleasurable or gratifying) stimulus to increase the chance that a particular behaviour will occur again.
​
Example : A horse receives a carrot for not biting when his girth is touched. The horse will now be more likely not to bite when having his girth adjusted.
Behaviour : not biting = Consequence : earns a carrot
​
(-R) Negative Reinforcement : The removal of an aversive (unpleasant or painful) stimulus to increase the chance that a particular behaviour will occur again. With horses, pressure (an aversive stimulus) is often increased until the desired behaviour is achieved. The aversive is then relieved or removed.
​
Example : While mounted, a rider applies leg pressure to a horse’s sides. If the horse does not move, the rider applies more pressure. When the horse steps forward, the rider removes the leg pressure. The horse will now be more likely to move forward when less leg pressure is applied, in an effort to avoid the increase of the aversive (pressure).
Behaviour : moving forward = Consequence : release of leg pressure
​
(+P) Positive Punishment : The addition of an aversive (unpleasant or painful) stimulus to decrease the chance that a particular behaviour will occur again.
​
Example : A horse and his rider approach a jump. The horse slams on the brakes, refusing to jump. The rider hits the horse with a whip. The horse may be less likely to refuse the jump next time, in order to avoid being hit with the whip.
Behaviour : stopping at the jump = Consequence : hit with the whip
​
(-P) Negative Punishment : The removal of a desirable (pleasurable or gratifying) stimulus to decrease the chance that a particular behaviour will occur again.
​
Example : A horse does not touch a target when it’s presented to him. The trainer does not give him a treat.
Behaviour : not touching the target = Consequence : no reward
Remember!
- Reinforcement = Increase Behaviour
- Punishment = Decrease Behaviour
- Positive = Add Something
- Negative = Remove Something
Example : A trainer is teaching a horse not to mug them for treats. As the horse moves his head away from the trainer, the trainer clicks and treats the horse. Within only a few repetitions, the horse learns which behaviours earn him the click.
Definitions
Primary Reinforcer
​
A biological desire such as food, water, or release from pain or discomfort.
For humans, this is also food, water, and comfort.
These do not need to be conditioned as they are instinctive. A horse will almost always want a treat.
Secondary Reinforcer
​
Something that has been strongly associated with a primary reinforcer. The “click” of a clicker in clicker training tells the animal that a primary reinforcer (food) is coming. For humans, an example would be money. Money is strongly associated with food, shelter, clothing, and other desirable things. Secondary reinforcers signal that something good might happen. (Secondary reinforcers also occur in -R.)
Marker or Bridge Signal
​
+R trainers often use clickers or a specific sound to “mark” a desired behaviour. The marker or bridge signal bridges the gap between the behaviour and the reinforcer. The click becomes a secondary reinforcer because the horse knows that his reward is coming.
Trainers use bridge signals to make the training more precise and communicate information to the horse that the behaviour he was performing at the time of the click earned him the reward.
The use of a clicker tool or a tongue click is important for more complex behaviours. (A tongue click is not to be confused with a “cluck”, which most horses have been conditioned to understand as “go faster”!)
By using a bridge signal, you don’t have to try to feed at the exact moment the behaviour is offered. Doing so would be ineffective and unclear to the horse, especially with behaviours like piaffe or trotting.
Cue : Information given to the horse to indicate which behaviour will earn him a reward.
Cues can be a word, a hand movement, body language, etc. It is entirely up to the trainer to choose! (Cues are also used in other types of training.)
​
Example : A horse is mugging his trainer for treats. The trainer turns their body away slightly, a body language cue. The horse moves his head away. The trainer then clicks and gives the horse his reward. The trainer’s body position is the horse’s cue to move his head away.
​
Example 2 : A horse is walking. His trainer says, “Trot”, a verbal cue. The horse trots. The trainer clicks, and gives the horse his reward. The word “Trot” is the horse’s cue to trot.
Positive reinforcement changes behaviour for the better, while criticism stabilises negative behaviour and blocks change.
~ Virginia H. Pearce