Midsession reversal learning by pigeons: Effect on accuracy of increasing the number of stimuli associated with one of the alternatives

Abstract

The midsession reversal task involves a simultaneous discrimination in which choice of one stimulus (S1) is correct for the first 40 trials and choice of the other stimulus (S2) is correct for the last 40 trials of each 80-trial session. When pigeons are trained on the midsession reversal task, they appear to use the passage of time from the start of the session as a cue to reverse. As the reversal approaches, they begin to make anticipatory errors, choosing S2 early, and following the reversal they make perseverative errors, continuing to choose S1. Recent research suggests that anticipatory errors can be reduced (while not increasing perseverative errors) by reducing the probability of reinforcement for correct S2 choices from 100% to 20%. A similar effect can be found by increasing the response requirement for choice of S2 from one peck to ten pecks. In the present experiments, we asked if a similar effect could be attained by increasing the number of stimuli that, over trials, could serve as S2. Instead, in both experiments, we found that increasing the number of S2 stimuli actually increased the number of anticipatory errors. Several interpretations of this result are provided, including the possibility that attention to the variable S2 stimuli may have interfered with attention to the S1 stimulus.

Introduction

A measure of the intelligence of an organism is its ability to modify its behavior as the contingencies of reinforcement change. For this reason, comparative psychologists have been interested in the ability of animals that have acquired a discrimination to reverse the discrimination. If one uses a measure of reversal learning that compares it with original learning, one can argue that it avoids the problem of difference in the rate of original learning due to factors unrelated to behavioral flexibility (e.g., sensory ability or motivational state).

A serial reversal task is one in which the reversal occurs repeatedly. It can be used to determine the extent to which there is an improvement in the rate of learning with successive reversals (e.g., Mackintosh, McGonigle, Holgate, & Vanderver, 1968). The degree of improvement with successive reversals (relative to original learning) has been taken as a measure of the animal’s cognitive flexibility (Bitterman, 1975).

The optimal performance with a task that involves multiple reversals is to base one’s choice on the consequences of the preceding trial. If that response was rewarded, one should stay with it; if it was not rewarded, then one should shift to the alternative response. This strategy has been referred to as win-stay/lose-shift.

A variation of the serial reversal task is the midsession reversal task, in which there is an additional cue to signal the occurrence of the reversal. With this task, each session begins with a simple simultaneous discrimination between two stimuli in which S1 is correct for the first half of the session (S2 is incorrect) and S2 is correct for the remainder of the session (S1 is incorrect). If an animal was able to adopt a win-stay/lose-shift strategy, it would result in only a single error that would occur on the first reversal trial. It is a strategy that most human subjects adopt (Rayburn-Reeves, Molet, & Zentall, 2011), and using a spatial version of this task, the performance of rats comes very close to this level of accuracy (Rayburn-Reeves, Stagner, Kirk, & Zentall, 2013).
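The single-error prediction can be illustrated with a minimal simulation. The sketch below is not part of any procedure described here. It assumes an idealized chooser that begins each session on S1 and applies win-stay/lose-shift without error, and the function name `simulate_wsls` is hypothetical.

```python
def simulate_wsls(n_trials=80, reversal_after=40):
    """Return the 1-based trial numbers on which an ideal
    win-stay/lose-shift chooser makes an error."""
    choice = "S1"  # assumption: the chooser starts every session on S1
    errors = []
    for trial in range(1, n_trials + 1):
        correct = "S1" if trial <= reversal_after else "S2"
        if choice != correct:
            # lose: record the error and shift to the other stimulus
            errors.append(trial)
            choice = "S2" if choice == "S1" else "S1"
        # win: stay with the same stimulus (nothing to change)
    return errors


print(simulate_wsls())  # [41]
```

Under these assumptions the only error falls on the first reversal trial, because that is the first trial on which feedback about the reversal is available.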

Pigeons, however, show a very different pattern of behavior (Rayburn-Reeves et al., 2011). Although pigeons choose S1 almost exclusively early in the session and choose S2 almost exclusively late in the session, they tend to make many errors in the middle of the session. On the one hand, they make many anticipatory errors, choosing S2 well before the reversal, and on the other hand, they make many perseverative errors, continuing to choose S1 following the reversal. The function describing choice of S1 is in the form of a psychophysical ogive.

This finding has suggested to researchers that the pigeons are attempting to use the passage of time from the start of the session as the basis for stimulus choice (McMillan & Roberts, 2012; Smith, Beckmann, & Zentall, 2017). The research indicates that if pigeons are trained with a 5-s intertrial interval, the switch from choice of S1 to choice of S2 occurs at approximately the midpoint of the session. In support of the timing hypothesis, however, if the intertrial interval is then shortened to 2.5 s, the pigeons switch from choice of S1 to choice of S2 later than the midsession (more trials fit into the timed interval), and if the intertrial interval is lengthened to 10 s, the pigeons switch earlier than the midsession.

What is surprising about this behavior is that timing the midpoint of the session is not an optimal strategy, as it results in many more errors than win-stay/lose-shift. Not only is the time from the start of the session more difficult to estimate, but to some extent it depends on the latency with which the pigeons respond to the discriminative stimuli on each trial. The question is why the pigeons use time from the start of the session as a cue when the feedback from the preceding response (reinforcement or its absence) would be a much better cue.

Rayburn-Reeves et al. (2011) suggested that the predictability of the reversal (always after 40 trials) might encourage the pigeons to use time from the start of the session as a cue. To test this hypothesis, they trained pigeons on a version of the midsession reversal task in which the point of the reversal was not predictable. What they found was surprising. Even though time from the start of the session was no longer a predictive cue for the reversal, the pigeons continued to use time as a cue. When the reversal occurred early in the session, the pigeons made few anticipatory errors but they made a large number of perseverative errors. When the reversal occurred late in the session, the pigeons made a large number of anticipatory errors but few perseverative errors. Remarkably, the pigeons made the fewest errors when the reversal occurred at the midpoint in the session, although with this procedure, the reversal occurred at the midpoint only a small percentage of the time.

Why pigeons use time into the session as a cue to reverse rather than local cues associated with the feedback from the last trial(s) is not clear. Smith et al. (2017) suggested that the pigeons might not always be able to remember which stimulus they had chosen and the results of that choice, especially in the region of the session around the reversal where reference memory was not as reliable. Although the typical intertrial interval is only 5 s, memory for a stimulus location pecked and reinforcement (or its absence) following that response can suffer considerable forgetting following a 5-s delay (Randall & Zentall, 1997). To test this memory-loss hypothesis, Smith et al. “reminded” the pigeons of their choice and its consequence during the intertrial interval following each trial by illuminating a houselight in the ceiling following choice of S1 or a houselight at the top of the response panel following choice of S2. Furthermore, if the choice had been correct, the feeder light remained on during the intertrial interval.

Smith et al. (2017) found that the group that was reminded of their choice and its consequence during the intertrial interval made significantly fewer anticipatory and perseverative errors. However, they did continue to make both kinds of error. Furthermore, they found evidence that the pigeons continued to use time from the start of the session as a cue to reverse because when the intertrial interval was shortened to 2.5 s, the pigeons reversed later and when it was lengthened to 10 s, the pigeons reversed earlier.
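The direction of this intertrial-interval effect is what a simple timing account would predict. As a rough sketch, assume the pigeon learns the elapsed time at which the reversal occurs during training with a 5-s intertrial interval and switches once that much time has passed. The within-trial duration of 2.0 s (response latency plus feeder access) and the function name `predicted_switch_trial` are hypothetical choices for illustration, not values taken from the studies cited.

```python
def predicted_switch_trial(test_iti_s, train_iti_s=5.0,
                           within_trial_s=2.0, reversal_after=40):
    """Trial number reached when the learned time-to-reversal has elapsed.

    within_trial_s is an assumed, illustrative per-trial duration.
    """
    # Elapsed session time at the reversal under training conditions
    learned_time_s = reversal_after * (train_iti_s + within_trial_s)
    # Number of trials that fit into that time under the test interval
    return learned_time_s / (test_iti_s + within_trial_s)


for iti in (2.5, 5.0, 10.0):
    print(f"ITI {iti:>4} s -> switch near trial {predicted_switch_trial(iti):.1f}")
```

With any positive within-trial duration, a shorter intertrial interval pushes the predicted switch to a later trial and a longer one pulls it earlier. The exact trial numbers depend entirely on the assumed durations.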

Using a clever design, Rayburn-Reeves, Qadri, Brooks, Keller, and Cook (2017) sought to identify the source of cues that pigeons use to decide which stimulus to choose over the course of the session. On some sessions, they trained pigeons with a distinctive cue inserted during the intertrial interval that signaled each half of the session. On other sessions, there was no intertrial-interval cue. Then on probe sessions without a regular cue, they occasionally miscued the pigeons (by presenting the second-half cue during the first half of the session or the reverse).

Rayburn-Reeves et al. (2017) found that miscuing had very little effect on choice accuracy early and late in the session but it had a large effect on choice accuracy in the middle of the session, especially in anticipation of the reversal. Thus, early and late in the session, timing cues overshadowed the intertrial interval miscues, whereas in the middle of the session, relative to intertrial intervals with no cue, the miscues interfered with the timing cues and when presented close to the point of the reversal, the miscues produced errors at a high rate.

A different approach to identifying the cues used by pigeons in performing the midsession reversal task was used by Santos, Soares, Vasconcelos, and Machado (2019). They manipulated the percentage of reinforcement associated with correct S1 and correct S2 choices. As expected, the reduction in percentage of reinforcement for correct S1 responses (from 100% to 20%) resulted in an increase in anticipatory errors and a decrease in perseverative errors; however, a similar reduction in percentage of reinforcement for correct S2 responses resulted in a decrease in anticipatory errors but no increase in perseverative errors. Thus, surprisingly, reducing the percentage of reinforcement for correct S2 responses resulted in more accurate performance than when 100% reinforcement was provided for both correct S1 and S2 choices.

Santos et al. (2019) concluded that with a low proportion of reinforced trials in the first half of the session, the pigeons seem to rely on the passage of time to guide their switching behavior. With a higher proportion of reinforced trials in the first half but a lower proportion of reinforced trials in the second half of the session, the pigeons appeared to rely primarily on local cues: they continued to choose S1 until one or more trials ended without food, and then they switched to S2. That is, in this high-low condition “the pigeons finally resort to the win-stay/lose-shift strategy that is conspicuously absent from the condition in which it was most expected,” the standard condition in which all correct responses are reinforced. Why this should occur is not clear, and the authors provide little explanation for this interesting effect.

In a recent study, Zentall, Andrews, Case, and Peng (in press) replicated the effect of reduced reinforcement (20%) for correct S2 responses. They proposed that the reason that pigeons used the time from the start of the session to the reversal was due to response competition between S1 and S2 during those trials approaching and immediately after the reversal. At the start of the session, the associative response strength of S1 is high and at the end of the session, the associative response strength of S2 is high, but in the middle of the session, competition between the two responses results in errors. The reason the pigeons that experienced reduced reinforcement for correct choice of S2 were better able to use the local feedback from the preceding trial was because it reduced the response competition between S1 and S2. This allowed the pigeons to attend better to the feedback from the response to S1. This had two effects on performance. First, it reduced the number of anticipatory errors, and second, it made the pigeons more sensitive to the reversal so there was no increase in the number of perseverative errors. However, if the total number of errors was reduced by the reduction in response competition due to the reduction in percentage of reinforcement for correct choice of S2, why did the reduction in percentage of reinforcement for correct choice of S1 not result in a similar effect?

The asymmetry that resulted when the percentage of reinforcement for correct choice of S1 was reduced resulted from the fact that correct responding during the first half of the session provided poor feedback (only 20% of correct responses were reinforced). Although choice of S2 during the first half of the session produced good feedback, to get that feedback required the pigeons to make an error, and that resulted in an increase in anticipatory errors.

In their second experiment, Zentall et al. (in press) tested the hypothesis that better accuracy with the reduction in percentage reinforcement associated with correct S2 responses resulted from increased attention to the consequences of choice of S1. They reasoned that a similar bias to choose S1 as the reversal approached might be obtained if the response requirement to S2 was increased. Rather than reduce the percentage of reinforcement associated with correct choice of S2, they increased the response requirement to choose S2 from one peck to ten pecks (thereby also increasing the delay to reinforcement for correct choice of S2). The results of Experiment 2 were quite similar to those of Experiment 1, thus supporting the hypothesis that the increased bias to choose S1 during the first half of the session not only reduced the number of anticipatory errors, but also made the pigeons more sensitive to the feedback from the reversal.

The attentional hypothesis suggests that any manipulation that causes the pigeons to attend more to the consequences of choice of S1 during the first half of the session should improve midsession reversal accuracy by reducing the number of anticipatory errors while not increasing the number of perseverative errors.

Experiment 1

The purpose of Experiment 1 was to test the attentional hypothesis in a third way. To bias the pigeons to attend better to the consequences of choice of S1, we allowed the S2 stimulus to change color from trial to trial. That is, throughout training there was a single S1 stimulus color but the S2 stimulus varied among four different colors. With this procedure, would the pigeons choose to attend to the feedback from choice of S1 rather than attend to the feedback from choice of S2 (that changed color from trial to trial)? One might also think of this procedure as involving a kind of feature-positive manipulation, as S1 would consist of a constant feature, whereas S2 would be variable.

Method

Subjects

Twelve unsexed White Carneau pigeons (age 5–12 years) purchased from the Palmetto Pigeon Plant, Sumter, SC, USA, served as subjects. All pigeons had had experience with successive color discriminations but not with simultaneous discriminations or with a reversal-learning task.

Apparatus

The experiment took place in a BRS/LVE (Laurel, MD, USA) sound-attenuating operant test chamber measuring 34 cm high, 30 cm wide, and 35 cm across the response panel. Three horizontally aligned circular response keys (2.54 cm diameter) on the response panel (spaced 6.0 cm apart edge to edge) were located 25 cm from the floor. A 12-stimulus in-line projector (Industrial Electronics Engineering, Van Nuys, CA, USA) with 28-V, 0.1-A lamps (GE 1820) was mounted behind the left and right response keys that projected green, red, blue, yellow, and white hues. Reinforcement consisted of 1.5-s access to mixed-grain (Purina Pro Grains – a mixture of corn, wheat, peas, kafir, and vetch) provided from a feeder. A 28-V, 0.04-A lamp illuminated the feeder when reinforcement was delivered. A microcomputer controlled experimental events by an interface located in an adjacent room.

Procedure

Subjects were randomly assigned to one of two groups: Experimental (4S2) and Control (1S2). For pigeons in the Control group, a trial started with illumination of the right and left keys, one red the other green. In each 80-trial session, one color was correct for the first 40 trials (S1), the other color was correct for the last 40 trials (S2). The red and green colors occurred randomly on the right and left response keys. For three of the pigeons in each group, red was S1 and green was S2. For the remaining pigeons, green was S1 and red was S2. For pigeons in the Experimental group, a trial started with illumination of the right and left keys. For three of the pigeons, the S1 color was red and the S2 color, over trials, was randomly green, yellow, blue, or white. For the remaining pigeons, the S1 color was green and the S2 color was randomly red, yellow, blue, or white. For the first 40 trials, the S1 color was correct. For the remaining trials, the S2 color was correct. A single peck turned off both keys and if the response was correct resulted in 1.5-s access to reinforcement. Reinforcement followed all correct responses. The 5-s intertrial interval followed the choice response. Each session was 80 trials long. Sessions were conducted 6 days a week and there were 40 sessions of training.
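The trial structure for the two groups can be sketched as follows. This is an illustrative reconstruction, not the authors' control program: the text says only that colors and positions occurred "randomly", so independent uniform draws on each trial are an assumption, and the function name `build_session` and the dictionary keys are hypothetical.

```python
import random


def build_session(s1, s2_options, n_trials=80, reversal_after=40, rng=None):
    """Build one session as a list of trial dictionaries.

    s2_options holds one color for the Control group (1S2) and
    four colors for the Experimental group (4S2).
    """
    rng = rng or random.Random()
    trials = []
    for trial in range(1, n_trials + 1):
        s2 = rng.choice(s2_options)              # assumed: uniform draw per trial
        s1_side = rng.choice(["left", "right"])  # assumed: uniform draw per trial
        trials.append({
            "trial": trial,
            "s1": s1,
            "s2": s2,
            "s1_side": s1_side,
            # S1 is correct for the first 40 trials, S2 thereafter
            "correct": s1 if trial <= reversal_after else s2,
        })
    return trials


control = build_session("red", ["green"])
experimental = build_session("red", ["green", "yellow", "blue", "white"])
```

The only difference between the groups in this sketch is the length of `s2_options`; the reversal point and the reinforcement rule are identical.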

Analyses

Pigeons typically reach stable accuracy after about 30 sessions of training. We combined the data from sessions 31–40 and subjected them to analysis. An initial analysis was conducted on the overall accuracy of the two groups. Because the feedback from the pigeons’ choice on Trial 41 was provided only after the pigeons’ choice and we were interested in the pigeons’ approximation to a win-stay/lose-shift strategy, accuracy on Trial 41 was reverse coded (coded as a trial from the first half of the session). Separate analyses were conducted on the first 41 trials combined (to assess anticipatory errors) and the last 39 trials combined (to assess perseverative errors).
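The scoring rule described above can be written out as a short sketch. It assumes each session is available as an ordered list of choices labeled "S1" or "S2"; that representation and the function name `score_session` are hypothetical. Trial 41 is scored with the first block, so a choice of S1 on that trial counts as correct.

```python
def score_session(choices, reversal_after=40):
    """Return (first_block_pct, second_block_pct) for one session.

    The first block is Trials 1-41, scored as percentage choice of S1
    (Trial 41 is reverse coded). The second block is Trials 42-80,
    scored as percentage choice of S2.
    """
    boundary = reversal_after + 1          # Trial 41 joins the first block
    first_block = choices[:boundary]
    second_block = choices[boundary:]
    first_pct = 100.0 * sum(c == "S1" for c in first_block) / len(first_block)
    second_pct = 100.0 * sum(c == "S2" for c in second_block) / len(second_block)
    return first_pct, second_pct


# An ideal win-stay/lose-shift chooser picks S1 on Trials 1-41 and S2 afterwards
ideal = ["S1"] * 41 + ["S2"] * 39
print(score_session(ideal))  # (100.0, 100.0)
```

Errors in the first block are anticipatory (S2 chosen before feedback about the reversal is available), and errors in the second block are perseverative (S1 chosen after that feedback).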

Results

The data from Sessions 31–40 plotted as a function of choice of the first correct stimulus (S1) are presented in Fig. 1. A preliminary analysis performed on the overall accuracy of the Experimental group (M = 78.8% correct) and the Control group (M = 89.1% correct) indicated that the two groups were significantly different, t(10) = 2.30, p = .044, Cohen’s d = 1.45. Surprisingly, accuracy for the Control group was significantly better than for the Experimental group. A separate analysis performed on the first 41 trials, pooled over Sessions 31–40, indicated that the Control group (M = 88.2% correct) was significantly more accurate than the Experimental group (M = 68.7% correct), t(10) = 2.40, p = .037, Cohen’s d = 1.52. An analysis performed on the last 39 trials indicated that the Control (M = 90.0% correct) and Experimental (M = 89.4% correct) groups were not significantly different, t < 1.

Fig. 1

Experiment 1: Percentage choice of S1 color (correct for first 40 trials) as a function of trial in the session for the Experimental group (four S2 colors over trials) and the Control group (one S2 color). Dashed vertical line appears after Trial 41 (the first trial providing feedback that the reversal has occurred). Error bars are ±1 standard error of the mean.

Discussion

The purpose of this experiment was to test the hypothesis that the pigeons would attend better to the S1 stimulus if the S2 stimulus changed from trial to trial and, as a result, the Experimental group would perform better than the Control group. We had hypothesized that the manipulation might produce a feature-positive effect, as S1 involved a constant feature, whereas S2 was variable. The surprising finding was that the Experimental group performed significantly worse than the Control group.

One interpretation of these results is that similarity between one or more of the colors associated with the S2 stimulus and the color associated with the S1 stimulus may have made the S1 stimulus more difficult to discriminate from the S2 stimuli for the experimental group than for the control group due to stimulus generalization.

Experiment 2

The purpose of Experiment 2 was to test the reliability and generality of the variable S2 effect on midsession reversal accuracy. In Experiment 2, we replaced the colors used in Experiment 1 with national flags and we conducted the experiment in an operant box fitted with a touch screen. The national flags were selected to be maximally distinguishable in color, pattern, and other characteristics (see Zentall, Peng, & Miles, in press).

Subjects

The subjects were 12 pigeons similar to those used in Experiment 1 with similar prior experience.

Apparatus

The apparatus was a modified operant chamber with an Open-Frame Touch monitor on the front wall and a center-mounted grain feeder located 10 cm above the floor on the back wall. When projected onto the screen the flags were 3.3 cm wide by 2.5 cm high. The bottom edges of the flags were located 12.7 cm above the floor of the test chamber and were presented in pairs horizontally centered on the screen, with 1.5 cm between the flags. For the six pigeons in the Control group, three pigeons were trained with the flag of Japan as the S1 and the flag of Brazil as the S2. The remaining three pigeons were trained with the flag of Brazil as the S1 and the flag of Japan as the S2. For the six pigeons in the Experimental group, the S1 and S2 stimuli were the same as for the Control group, but, in addition, the flags of Sweden, Switzerland, and Argentina were used as additional S2 stimuli for both Experimental subgroups (see Fig. 2).

Fig. 2

Flags used in Experiment 2 for the Experimental group. The S2 stimuli changed randomly from trial to trial. The Control group received S1 and S2a. The flags used for S1 and S2a were counterbalanced over subgroups (top line and bottom line)

Procedure

Over a period of 2 weeks, the pigeons were trained to peck a single flag on the touch screen for 2.0 s of reinforcement. Reinforcement in Experiment 2 was extended to allow for the fact that the grain magazine was located on the back wall of the chamber. At the end of 2 weeks of training, all pigeons had been trained to peck the touch screen with sufficient force and for a sufficient peck duration (the touch screen required a peck duration of 16 ms for the peck to be recorded).

During midsession reversal training, for pigeons in the Control group, a trial started with illumination of two flags – one on the left, the other on the right. The S1 flag was correct for the first 40 trials. The S2 flag was correct for the last 40 trials. The two flags occurred randomly on the right and left over trials. For pigeons in the Experimental group, the S1 flag was correct for the first 40 trials with one of the four S2 flags incorrect on each trial. For the remaining trials, on each trial, one of the S2 flags was correct. For purposes of counterbalancing, for three of the pigeons in each group, the flags that were used as S1 and S2 (or in the case of the Experimental group one of the S2 stimuli) were exchanged. A single peck turned off both flags, and if the response was correct, resulted in 2.0-s access to reinforcement. Reinforcement followed all correct responses. A 5-s intertrial interval followed the choice response. Each session was 80 trials long. Sessions were conducted 6 days a week and there were 40 sessions of training.

Analyses

We combined the data from Sessions 31–40 and subjected them to analysis. An initial analysis was conducted on the overall accuracy of the two groups. Because the feedback from the pigeons’ choice on Trial 41 was provided only after the pigeons’ choice, and we were interested in the pigeons’ approximation to a win-stay/lose-shift strategy, accuracy on Trial 41 was reverse coded (coded as a trial from the first half of the session). Separate analyses were conducted on the first 41 trials combined (to assess anticipatory errors) and the last 39 trials combined (to assess perseverative errors).

Results

The data from Sessions 31–40 plotted as a function of choice of the first correct stimulus (S1) are presented in Fig. 3. A preliminary analysis performed on the overall accuracy of the Experimental group (M = 83.7% correct) and the Control group (M = 90.5% correct) indicated that the two groups were significantly different, t(10) = 2.59, p = .027, Cohen’s d = 1.64.

Fig. 3

Experiment 2: Percentage choice of S1 flag (correct for first 40 trials) as a function of trial in the session for the Experimental group (four S2 flags over trials) and the Control group (one S2 flag). Dashed vertical line appears after Trial 41 (the first trial providing feedback that the reversal has occurred). Error bars are ±1 standard error of the mean.

Once again, a separate analysis performed on the first 41 trials, pooled over Sessions 31–40, indicated that the Control group (M = 88.2% correct) was significantly more accurate than the Experimental group (68.7% correct), t(10) = 3.98, p = .003, Cohen’s d = 2.52. Again, an analysis performed on the last 39 trials indicated that the Control (M = 88.8% correct) and Experimental (M = 87.0% correct) groups were not significantly different, t < 1.
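The effect sizes reported with the t(10) tests are numerically consistent with the common conversion d = 2t/√df for two independent groups. The text does not state which formula was used, so the sketch below is an inference from the reported values rather than a description of the authors' computation, and the function name `cohens_d_from_t` is hypothetical.

```python
import math


def cohens_d_from_t(t, df):
    """Convert an independent-groups t statistic to Cohen's d
    using d = 2t / sqrt(df) (assumed conversion)."""
    return 2.0 * t / math.sqrt(df)


# t values reported with df = 10 (six pigeons per group)
print(round(cohens_d_from_t(2.30, 10), 2))  # 1.45 (Experiment 1, overall accuracy)
print(round(cohens_d_from_t(2.59, 10), 2))  # 1.64 (Experiment 2, overall accuracy)
print(round(cohens_d_from_t(3.98, 10), 2))  # 2.52 (Experiment 2, first 41 trials)
```

By conventional benchmarks all three values are large effects, although with six pigeons per group the estimates are necessarily imprecise.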

To test the hypothesis that similarity of the S1 stimulus to one of the S2 stimuli may have made the discrimination more difficult, we compared the overall accuracy between the two counterbalancing Experimental subgroups. Whatever the similarity between the S1 stimulus and any of the S2 stimuli was for one subgroup, it should have been reduced for the other Experimental subgroup as that S1 stimulus was now one of the S2 stimuli. The difference in accuracy between the two Experimental subgroups was not statistically significant, t(4) = 1.14, p = .31.

Discussion

In spite of the fact that the stimuli in Experiment 2 (flags) were quite different from those in Experiment 1 (colors), the results of the second experiment confirmed the results of the first experiment. As in Experiment 1, the pigeons for which the S2 stimuli changed from trial to trial (four different flags) made significantly more anticipatory errors than the pigeons for which there was a constant S2 stimulus. Furthermore, perseverative errors did not differ for the two groups.

General discussion

In past research using the midsession reversal task, we had confirmed that reducing the probability of reinforcement for correct choice of S2 from 100% to 20%, while maintaining the probability of reinforcement for correct choice of S1, had an unexpected result (Zentall et al., in press; see also Santos et al., 2019). It was anticipated that this manipulation would result in both a decrease in anticipatory errors and an increase in perseverative errors. One might think of this prediction as resulting from an acquired bias to choose S1 early in the session and to choose S2 late in the session, but in the middle of the session, where there is some uncertainty, S1 should be chosen. In other words, when in doubt, choose S1. Although this reduction in reinforcement for correct S2 responses did result in a significant reduction in anticipatory errors, it did not result in an increase in perseverative errors (Zentall et al., Experiment 1, in press).

We proposed that the reduction in the probability of reinforcement for correct S2 responses might have caused the pigeons to shift their attention from reinforcement associated with both S1 and S2 primarily to the consequences of choice of S1. This would mean continued choice of S1 until it no longer provided food, followed by a switch to S2, the only way to obtain further reinforcement. The resulting choice behavior would approximate a win-stay/lose-shift response strategy.

We argued further that if this hypothesis was correct, any manipulation that biased the pigeons to choose S1 over S2, as the middle of the session approached, should have the same effect. In Zentall et al., Experiment 2 (in press), we tested this hypothesis by increasing the number of pecks required to the S2 stimulus from one to ten. Consistent with this hypothesis, the manipulation resulted in a decrease in the number of anticipatory errors with no concomitant increase in perseverative errors.

In the present research, we attempted to test the theory further by increasing the number of S2 stimuli that the pigeons would encounter over trials from one to four. The hypothesis was that with S2 spread over four stimuli, the pigeons would be inclined to attend better to the consequences of choice of the S1 stimulus and show an effect similar to that found when the percentage reinforcement for correct S2 responses was reduced or the number of pecks required to S2 was increased. Instead, the increase in the number of S2 stimuli over trials had the opposite effect. It actually increased the number of anticipatory errors.

One explanation of this effect is that discriminations in which the S+ consists of a fixed cue, while the S- consists of a variable cue (compared to the reverse), actually produce a “feature-negative effect” (Beckmann & Young, 2007). Interestingly, Beckmann and Young interpreted the result as a form of feature-positive effect with variability as an abstract feature. Thus, subjects learned faster when the S+ consisted of variable stimuli and the S- consisted of a constant stimulus, compared to the reverse. They described this phenomenon as “novel pop-out.”

Consistent with this result, Macphail and Reilly (1989) found that pigeons readily learned to withhold responses to familiar pictures and respond to pictures not previously presented. Furthermore, Honey (1990) suggested that the familiarity or variability of a stimulus can be thought of as a stimulus property much like color or shape.

An alternative explanation for the poorer accuracy of the Experimental group compared with the Control group during the first 40 trials of each session is that with more stimuli representing S2 (over trials), there may have been more generalization between one of those stimuli and the S1 stimulus. Although some generalization is always possible, the stimuli were selected to be spectrally far apart. Also, the S1 stimulus and one of the S2 stimuli were counterbalanced over subjects in each group. Furthermore, Guttman and Kalish (1956) found that pigeons trained on a nominal green stimulus (550 nm) showed no generalization to yellow (610 nm) or to a greenish blue (510 nm), and pigeons trained on a nominal yellow stimulus (610 nm) showed virtually no generalization to green (550 nm) or orange (630 nm). Nevertheless, the purpose of Experiment 2 was to use national flag stimuli for which there was likely to be less generalization between the S1 stimulus and any of the S2 stimuli, and similar results were found.

A second alternative explanation of the difference in accuracy between the Experimental and Control groups was suggested by McMillan, Sturdy, and Spetch (2015), who tested pigeons on the midsession reversal task using a go/no-go procedure. The procedure involved single stimulus trials with random occurrences of S1 and S2 presented from trial to trial. They found that the latency to respond to the S2 stimuli during the first 40 trials decreased as the reversal approached. They concluded that the problem that pigeons had with this task prior to the reversal was learning to inhibit responding to S2 as the reversal approached, rather than learning to respond to S1 as the reversal approached. They suggested that the pigeons learned independent rules about S1 and S2. Applying this theory to the procedure used in the present experiment may explain the problem that the pigeons had with multiple S2 stimuli. If the pigeons had to learn to inhibit responding to S2, having multiple S2 stimuli might make learning to inhibit responding to the S2 stimuli more difficult.

It is not clear, however, that what is learned with their successive discrimination is the same as what is learned with a typical simultaneous discrimination. In a successive discrimination, in the presence of the S-, the subjects must learn to inhibit responding because there is no alternative stimulus to which to respond, whereas in a simultaneous discrimination there is an S+. Furthermore, there is evidence that little inhibition accrues to the S- stimulus in a simultaneous discrimination. In fact, there is evidence that in a simultaneous discrimination, the S- comes to function as a higher order conditioned stimulus signaling the presence of the S+ (see Zentall & Sherburne, 1994). Once the pigeon has learned not to peck the S-, whenever the pigeon sees the S- it serves as a cue that the S+ is present.

It is possible, however, that because the S2 that serves as an S- during the first 40 trials becomes an S+ during the last 40 trials of each session, inhibition may accrue to S2 during the first 40 trials. The inhibition hypothesis makes an interesting prediction. It suggests that in the midsession reversal task, if multiple stimuli are used to represent S1 but only one stimulus is used to represent S2, performance by those pigeons during the first 40 trials of each session should be quite similar to that of control pigeons. This should be true because there should be no difference in inhibition to the single S2 stimulus for the two groups. With multiple S1 stimuli during the last 40 trials of each session, however, when the experimental pigeons need to learn to inhibit choice of S1, there should be an increase in perseverative errors relative to control pigeons. On the other hand, if, as Beckmann and Young (2007) suggest, stimulus variability can be used as a cue, then the Experimental group should make fewer anticipatory errors than the Control group, but they should not make more perseverative errors.

The proposed experiment, to increase the number of S1 stimuli rather than S2 stimuli, also may address the possibility that the effects found in the present research occurred because of greater generalization between the S1 and S2 stimuli (with more stimuli in either set, the possibility of generalization would increase). If stimulus generalization was responsible for the effects found in the present experiments, increasing the number of S1 stimuli should have an effect similar to the effects found in the present experiments (i.e., an increase in anticipatory errors for the experimental group). Thus, further research will help clarify the mechanism responsible for the effects found in the present experiments.
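The contrasting predictions for the proposed multiple-S1 experiment can be laid out as a small lookup, sketched here in Python. This is only an illustrative summary of the reasoning above, not an analysis used in the present research. The names `Change`, `PREDICTIONS`, and `consistent_accounts` are hypothetical. Because the text states no prediction about perseverative errors under the generalization account, that cell is left unconstrained as an assumption.

```python
from enum import Enum


class Change(Enum):
    """Direction of the Experimental group's errors relative to Control."""
    FEWER = -1
    SAME = 0
    MORE = 1


# Assumption: no perseverative-error prediction is stated for the
# generalization account, so any outcome is treated as allowed there.
ANY = frozenset(Change)

# Qualitative predictions for the proposed experiment (multiple S1 stimuli,
# a single S2 stimulus), as described in the text.
PREDICTIONS = {
    "inhibition": {
        "anticipatory": frozenset({Change.SAME}),
        "perseverative": frozenset({Change.MORE}),
    },
    "variability_cue": {
        "anticipatory": frozenset({Change.FEWER}),
        # "should not make more" allows either fewer or the same.
        "perseverative": frozenset({Change.FEWER, Change.SAME}),
    },
    "generalization": {
        "anticipatory": frozenset({Change.MORE}),
        "perseverative": ANY,
    },
}


def consistent_accounts(anticipatory, perseverative):
    """Return the accounts whose predictions match an observed pattern."""
    observed = {"anticipatory": anticipatory, "perseverative": perseverative}
    return sorted(
        name
        for name, predicted in PREDICTIONS.items()
        if all(observed[kind] in predicted[kind] for kind in observed)
    )
```

For example, similar anticipatory errors combined with more perseverative errors would be consistent only with the inhibition account, whereas more anticipatory errors would point to generalization regardless of the perseverative outcome.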

References

  1. Beckmann, J. S., & Young, M. E. (2007). The feature positive effect in the face of variability: Novelty as a feature. Journal of Experimental Psychology: Animal Behavior Processes, 33, 72-77.

  2. Bitterman, M. E. (1975). The comparative analysis of learning. Science, 188, 699-709.

  3. Guttman, N., & Kalish, H. I. (1956). Discriminability and stimulus generalization. Journal of Experimental Psychology, 51, 79-88.

  4. Honey, R. C. (1990). Stimulus generalization as a function of stimulus novelty and familiarity in rats. Journal of Experimental Psychology: Animal Behavior Processes, 16, 178-184.

  5. Mackintosh, N. J., McGonigle, B., Holgate, V., & Vanderver, V. (1968). Factors underlying improvement in serial reversal learning. Canadian Journal of Psychology, 22, 85-95.

  6. Macphail, E. M., & Reilly, S. (1989). Rapid acquisition of a novelty versus familiarity concept by pigeons (Columba livia). Journal of Experimental Psychology: Animal Behavior Processes, 15, 242-252.

  7. McMillan, N., & Roberts, W. A. (2012). Pigeons make errors as a result of interval timing in a visual, but not a visual-spatial, midsession reversal task. Journal of Experimental Psychology: Animal Behavior Processes, 38, 440-445.

  8. McMillan, N., Sturdy, C. B., & Spetch, M. L. (2015). When is a choice not a choice? Pigeons fail to inhibit incorrect responses on a go/no-go midsession reversal task. Journal of Experimental Psychology: Animal Learning and Cognition, 41, 255-265.

  9. Randall, C. K., & Zentall, T. R. (1997). Win-stay/lose-shift and win-shift/lose-stay learning by pigeons in the absence of overt response mediation. Behavioural Processes, 41, 227-236.

  10. Rayburn-Reeves, R. M., Molet, M., & Zentall, T. R. (2011). Simultaneous discrimination reversal learning in pigeons and humans: Anticipatory and perseverative errors. Learning & Behavior, 39, 125-137.

  11. Rayburn-Reeves, R. M., Qadri, M. A., Brooks, A. M., & Cook, R. G. (2017). Dynamic cue use in pigeon mid-session reversal. Behavioural Processes, 137, 53-63.

  12. Rayburn-Reeves, R. M., Stagner, J. P., Kirk, C. R., & Zentall, T. R. (2013). Reversal learning in rats (Rattus norvegicus) and pigeons (Columba livia): Qualitative differences in behavioral flexibility. Journal of Comparative Psychology, 127, 202-211.

  13. Santos, C., Soares, C., Vasconcelos, M., & Machado, A. (2019). The effect of reinforcement probability on time discrimination in the midsession reversal task. Journal of the Experimental Analysis of Behavior, 111, 371-386.

  14. Smith, A. P., Beckmann, J. S., & Zentall, T. R. (2017). Gambling-like behavior in pigeons: ‘Jackpot’ signals promote maladaptive risky choice. Nature: Scientific Reports, 7, 6625 doi:https://doi.org/10.1038/s41598-017-06641-

  15. Zentall, T. R., Andrews, D. M., Case, J. P., & Peng, D. N. (in press). Less information results in better midsession reversal accuracy by pigeons. Journal of Experimental Psychology: Animal Learning and Cognition.

  16. Zentall, T. R., Peng, D., & Miles, L. (in press). Transitive inference in pigeons may result from differential tendencies to reject the test stimuli acquired during training. Animal Cognition.

  17. Zentall, T. R., & Sherburne, L. M. (1994). Transfer of value from S+ to S- in a simultaneous discrimination. Journal of Experimental Psychology: Animal Behavior Processes, 20, 176-183.

Open Practices Statement

The data and materials for both experiments are available from the first author.

Author information

Corresponding author

Correspondence to Thomas R. Zentall.

Cite this article

Zentall, T.R., Peng, D.N., House, D.C. et al. Midsession reversal learning by pigeons: Effect on accuracy of increasing the number of stimuli associated with one of the alternatives. Learn Behav 47, 326–333 (2019). https://doi.org/10.3758/s13420-019-00390-9

Keywords

  • Midsession reversal
  • Attention
  • Stimulus variability
  • Inhibition
  • Pigeons