Animal Behavior for Shelter Veterinarians and Staff. Various authors
The effects of intermittent reinforcement are commonly seen in the shelter. Food‐dispensing toys are often provided as enrichment, and caregivers might vary the type of food or alternate between food and scent. Suppose a dog prefers food enrichment over scent enrichment. After some experience getting the toy with food on some occasions and with scent on others, whether or not the toy contains food is a mystery to the dog! The effect of the intermittent presence of food is evident in the dog’s behavior: the dog is likely to check the toy every time it is placed into its enclosure. Checking the toy is on an intermittent schedule of reinforcement, so the behavior occurs reliably whenever the toy is present, even though the food reinforcer occurs only sometimes.
Box 3.2 Variable Schedule Reinforcement in the Shelter
Training Dogs to Sit Using Variable Ratio Reinforcement
An animal trainer is training dogs in the shelter to sit when someone walks by their kennel. The trainer decides to deliver food on a variable ratio 5 schedule (written as VR 5). This means that, on average, every fifth sit when someone walks by will earn a food reward. The number of responses required changes from one reward to the next, with each count starting over after the previous reward: the dog might receive a piece of food on the first response (sitting when the first person walks by), then on the sixth response, second response, eighth response, fifth response, eighth response, and so on. If that same dog were trained on the same behavior with a fixed ratio 5 schedule (FR 5), it would receive a piece of food after every fifth response, since the number of required responses is fixed at five.
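The difference between the two ratio schedules can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the six requirements are the ones listed in the example, and the uniform sampler is one of many ways a trainer might vary the requirement, not a method described in the text.

```python
import random
from itertools import accumulate
from statistics import mean

def fixed_ratio_rewards(total_responses, ratio):
    """Response numbers that earn food when every `ratio`-th response pays off."""
    return list(range(ratio, total_responses + 1, ratio))

def variable_ratio_rewards(requirements):
    """Response numbers that earn food, given the count required for each reward."""
    return list(accumulate(requirements))

def draw_requirements(mean_ratio, n_rewards, seed=0):
    """Hypothetical sampler: uniform on 1 .. 2*mean_ratio - 1, which averages mean_ratio."""
    rng = random.Random(seed)
    return [rng.randint(1, 2 * mean_ratio - 1) for _ in range(n_rewards)]

# Requirements from the example: 1st, 6th, 2nd, 8th, 5th, 8th response,
# each counted from the previous reward.
box_requirements = [1, 6, 2, 8, 5, 8]
vr_rewards = variable_ratio_rewards(box_requirements)
fr_rewards = fixed_ratio_rewards(total_responses=30, ratio=5)

print("VR 5 mean requirement:", mean(box_requirements))
print("VR 5 rewarded sits:", vr_rewards)
print("FR 5 rewarded sits:", fr_rewards)
```

Over 30 sits, both schedules deliver six pieces of food. What differs is predictability: under FR 5 the rewarded sit is always the fifth, while under VR 5 the dog cannot tell which sit will pay off.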
Training Dogs Not to Bark Using Variable Interval Reinforcement
An example of an interval schedule would be an animal trainer training dogs in a shelter not to bark when someone enters the kennels. During this training, the trainer walks by at varying times and rewards the dog if it is quiet. The wait before a given reward might be 30 seconds or 10 seconds, but on average the dog is rewarded for being quiet every 30 seconds, which makes this a variable interval schedule of reinforcement (VI 30 sec).
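A variable interval schedule can be sketched the same way, with time rather than response count setting each reward opportunity. The sketch below is an assumption-laden illustration: the function names are hypothetical, the five example intervals are invented so that they average 30 seconds, and the exponential sampler is just one plausible way to vary the wait.

```python
import random
from itertools import accumulate
from statistics import mean

def reward_times(intervals):
    """Clock times (s) at which the trainer passes and can reward a quiet dog."""
    return list(accumulate(intervals))

def draw_intervals(mean_seconds, n, seed=0):
    """Hypothetical sampler: exponential waits averaging `mean_seconds`."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_seconds) for _ in range(n)]

def rewards_earned(pass_times, quiet_at):
    """Count the passes at which the dog is quiet; `quiet_at(t)` returns True or False."""
    return sum(1 for t in pass_times if quiet_at(t))

example_intervals = [30, 10, 50, 20, 40]  # illustrative values, mean 30 s
passes = reward_times(example_intervals)

print("mean interval:", mean(example_intervals))
print("trainer passes at:", passes)
# An imagined dog that barks for the first minute and is quiet afterwards.
print("rewards earned:", rewards_earned(passes, lambda t: t > 60))
```

Note that on an interval schedule, responding more often does not produce more rewards. The dog only needs to be quiet at the moments the trainer happens to pass.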
Trainers often take advantage of intermittent schedules of reinforcement to facilitate the persistence of behavior in dog training programs (Hall 2017). For example, detection dogs are trained to search for a target item, like drugs or explosives, for long periods of time. As soon as the dogs detect the item, they are trained to notify the handler. In other words, detecting the item is a cue to engage in a different behavior (notifying the handler). The dogs are then given reinforcement for correctly notifying their handler about the found item. Because reinforcers are delivered only after the dog finds an item, it can be tricky to train the dog to persist in searching, since no reinforcement is delivered during the search itself. To examine this further, Thrailkill et al. (2016) demonstrated how an intermittent schedule of reinforcement can be used to increase the persistence of behavior in a rat model of detection dog training. In their experiment, rats were trained to pull a chain that served as an analog for search behavior. Successfully pulling the chain resulted in the presentation of a lever, which was analogous to finding a target item. The lever presentation cued a lever press, which was then reinforced with food. Pressing the lever was analogous to notifying the handler about the found item. All rats were first trained to pull the chain on a continuous reinforcement schedule, meaning that each chain pull gave the rats the opportunity to press the lever. Later, for some rats, the schedule of reinforcement was slowly faded to an intermittent schedule so that pulling the chain produced the lever only one‐third of the time. For other rats, the schedule of reinforcement remained continuous such that every chain pull produced the lever. To test how the two groups of rats would behave when reinforcement was no longer available, the researchers stopped providing the lever altogether.
The rats that underwent intermittent reinforcement persisted in the chain‐pulling behavior for a much longer period of time than rats that received continuous reinforcement. This is good news for dogs working in the field: as long as they find the target item every now and then and get reinforcement, their searching behavior should persist for long durations.
3.4.1 Conditioned Reinforcement and Conditioned Punishment
When using reinforcement in animal training, we often think of using food, like meat‐flavored treats. Food is a biologically based reinforcer, along with others such as water, shelter, and mating, and these are all called primary or unconditioned reinforcers. The same goes for punishers. Some stimuli are unconditioned and function as punishers because of their inherent aversiveness, such as a painful electric shock.
It is obvious, especially when analyzing human behavior, that most of what influences behavior is not a piece of food or access to a mate. Instead, human behavior is often influenced by stimuli that are more complex. For example, students study to get good grades, employees work for money, and children draw silly cartoons for their parents’ approval. These stimuli (i.e., grades, money, and approval) get their reinforcing efficacy through the individual’s prior learning experience. Without an associative learning history, a good grade or a dollar bill is unlikely to produce any behavior change. In this respect, they begin as neutral stimuli. Neutral stimuli acquire a reinforcing function by being paired with an already established reinforcer. After repeated pairings of the neutral stimulus and a reinforcer, the neutral stimulus becomes a conditioned reinforcer. This should sound familiar! The classical conditioning process of stimulus‐stimulus pairings results in the capacity for neutral stimuli to become conditioned reinforcers or conditioned punishers (Williams 1994).
Conditioned reinforcement has been thoroughly investigated in the behavioral laboratory. In the laboratory, when pigeons earn food reinforcers by pecking a key, a grain dispenser, also called a food hopper, is made accessible for a certain period of time so that the pigeon can consume the primary reinforcer (grain). When the food hopper activates, it produces a distinct sound. After repeated pairings of the sound and food, the sound itself becomes a conditioned reinforcer and thus can strengthen behavior (Kelleher 1961). This means that the pigeon will peck at the key just to produce the hopper sound!
Conditioned reinforcers have been shown not only to strengthen or maintain existing behavior but also to establish new behavior (Alferink et al. 1973). A dog is not born wanting to play with toys, but when a toy is paired with primary reinforcers such as social interaction, the toy itself can reinforce a response. The toy can be used to reinforce behaviors the dog already knows as well as behaviors that the dog is learning.
Although conditioned reinforcers can maintain learned responses and establish new ones, they are at risk of losing their reinforcing value if they aren’t periodically paired with the unconditioned reinforcer. If the pigeon’s key pecks produced only the sound of the hopper but no food, after a while the pigeon would stop pecking. The sound will only function as a reinforcer if it is occasionally paired with food. Similarly, money maintains its reinforcing value because it can be exchanged for goods and services. If someone tried to use Canadian dollars in the United States, the Canadian dollars would lose their reinforcing value quickly because they would no longer be paired with other reinforcers.
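One common way to picture how a conditioned reinforcer gains and then loses value is a simple error‐correction rule in the style of the Rescorla‐Wagner model. That model is not part of the text above; it is swapped in here purely as an illustration, and the learning rate, trial counts, and function names are all assumptions. On each trial, the value of the hopper sound moves a fraction of the way toward 1 when food follows and toward 0 when it does not.

```python
def update_value(value, paired_with_food, learning_rate=0.2):
    """One trial of a Rescorla-Wagner-style update (illustrative assumption)."""
    target = 1.0 if paired_with_food else 0.0
    return value + learning_rate * (target - value)

def run_trials(value, outcomes, learning_rate=0.2):
    """Apply the update once per trial and return the value after each trial."""
    history = []
    for paired in outcomes:
        value = update_value(value, paired, learning_rate)
        history.append(value)
    return history

# Phase 1: the hopper sound is followed by food on every trial.
pairing = run_trials(0.0, [True] * 20)
# Phase 2: the sound is presented alone, with no food.
sound_only = run_trials(pairing[-1], [False] * 20)

print("value after pairing:", round(pairing[-1], 3))
print("value after sound alone:", round(sound_only[-1], 3))
```

In this sketch the sound's value climbs toward its maximum during pairing and decays toward zero once food stops, which mirrors the pigeon that eventually stops pecking for the sound alone.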
The same concepts that apply to conditioned reinforcers also apply to conditioned punishers. For example, some dog owners use invisible fencing systems to keep their dogs within the boundaries of their yard. When the dog approaches a boundary, the dog’s collar emits a tone and then shortly thereafter a shock. After some experience hearing the tone and then experiencing the shock, the tone alone is aversive to the dog, and the dog refrains from approaching the boundaries. Once a stimulus becomes a conditioned punisher, it can successfully diminish behaviors beyond the context in which they were first paired. That