A Causal Link Between Prediction Errors PDF

a r t ic l e s
A causal link between prediction errors, dopamine

neurons and learning
Elizabeth E Steinberg1,2,11, Ronald Keiflin1,11, Josiah R Boivin1,2, Ilana B Witten3,4, Karl Deisseroth5–8 &
Patricia H Janak1,2,9,10
Situations in which rewards are unexpectedly obtained or withheld represent opportunities for new learning. Often, this learning
includes identifying cues that predict reward availability. Unexpected rewards strongly activate midbrain dopamine neurons.
This phasic signal is proposed to support learning about antecedent cues by signaling discrepancies between actual and expected
outcomes, termed a reward prediction error. However, it is unknown whether dopamine neuron prediction error signaling and
© 2013 Nature America, Inc. All rights reserved.
cue-reward learning are causally linked. To test this hypothesis, we manipulated dopamine neuron activity in rats in two
behavioral procedures, associative blocking and extinction, that illustrate the essential function of prediction errors in learning.
We observed that optogenetic activation of dopamine neurons concurrent with reward delivery, mimicking a prediction error,
was sufficient to cause long-lasting increases in cue-elicited reward-seeking behavior. Our findings establish a causal role for
temporally precise dopamine neuron signaling in cue-reward learning, bridging a critical gap between experimental evidence and
influential theoretical frameworks.
Much of the behavior of humans and other animals is directed toward rewards elicit strong increases in firing rate, whereas anticipated
seeking out rewards. Learning to identify environmental cues that rewards produce little or no change8,10,11. Conversely, when an
provide information about where and when natural rewards can be expected reward fails to materialize, neural activity is depressed below
obtained is an adaptive process that allows this behavior to be distrib- baseline8–10. Reward-evoked dopamine release at terminal regions
uted efficiently. Theories of associative learning have long recognized in vivo is also more pronounced when rewards are unexpected12.
that simply pairing a cue with reward is not sufficient for learning to On the basis of this parallel between RPE and dopamine responses,
occur. In addition to contiguity between two events, learning also a current hypothesis suggests that dopamine neuron activity at the
requires the subject to detect a discrepancy between an expected time of reward delivery acts as a teaching signal and causes learn-
reward and the reward that is actually obtained1. ing about antecedent cues2–4. This conception is further supported
This discrepancy, or reward prediction error (RPE), acts as a teach- by the observation that dopamine neurons are strongly activated by
ing signal that is used to correct inaccurate predictions. Presentation primary rewards before cue-reward associations are well learned.
npg
of unpredicted reward or reward that is better than expected gener- As learning progresses and behavioral performance nears asymptote,
ates a positive prediction error and strengthens cue-reward associa- the magnitude of dopamine neuron activation elicited by reward
tions. Presentation of a perfectly predicted reward does not generate a delivery progressively wanes7,10.
prediction error and fails to support new learning. Conversely, omis- Although the correlative evidence linking reward-evoked dopamine
sion of a predicted outcome generates a negative prediction error neuron activity with learning is compelling, little causal evidence
and leads to extinction of conditioned behavior. The error correction exists to support this hypothesis. Previous studies that attempted to
principle figures prominently in psychological and computational address the role of prediction errors and phasic dopamine neuron
models of associative learning1–6, but the neural bases of this influ- activity in learning employed pharmacological tools, such as targeted
ential concept have not yet been definitively demonstrated. inactivation of the VTA13 or administration of dopamine receptor
In vivo electrophysiological recordings in non-human primates and antagonists14 or indirect agonists15. Such studies suffer from the
rodents have shown that putative dopamine neurons in the ventral limitation that pharmacological agents alter the activity of neurons
tegmental area (VTA) and the substantia nigra pars compacta respond over long timescales and therefore cannot determine the contribution
to natural rewards such as palatable food7–9. Notably, the sign and of specific patterns of dopamine neuron activity to behavior. Genetic
magnitude of the dopamine neuron response is modulated by the manipulations that chronically alter the actions of dopamine neu-
degree to which the reward is expected. Surprising or unexpected rons by reducing or eliminating the ability of dopamine neurons to
1Ernest Gallo Clinic and Research Center, University of California, San Francisco, California, USA. 2Graduate program in Neuroscience, University of California, San Francisco,
California, USA. 3Princeton Neuroscience Institute, Princeton University, Princeton, New Jersey, USA. 4Department of Psychology, Princeton University, Princeton,
New Jersey, USA. 5Department of Bioengineering, Stanford University, Stanford, California, USA. 6Department of Psychiatry and Behavioral Sciences, Stanford University,
Stanford, California, USA. 7Howard Hughes Medical Institute, Stanford University, Stanford, California, USA. 8Crack the Neural Code Program, Stanford University, Stanford,
California, USA. 9Department of Neurology, University of California, San Francisco, California, USA. 10Wheeler Center for the Neurobiology of Addiction, University of
California, San Francisco, California, USA. 11These authors contributed equally to this work. Correspondence should be addressed to P.H.J. (pjanak@gallo.ucsf.edu).
Received 15 January; accepted 2 May; published online 26 May 2013; doi:10.1038/nn.3413
966 VOLUME 16 | NUMBER 7 | JULY 2013 nature NEUROSCIENCE

a r t ic l e s
fire in bursts16,17 do alter learning, but suffer from similar problems, neurons at the time of expected reward delivery affected learning
as the effect of dopamine neuron activity during specific behavioral in a manner that was consistent with the hypothesis that dopamine
events (such as reward delivery) cannot be evaluated. Other studies neuron prediction error signaling drives associative learning.
circumvented these issues by using optogenetic tools that permit tem-
porally precise control of dopamine neuron activity; however, these RESULTS
studies failed to utilize behavioral tasks that explicitly manipulate Demonstration of associative blocking
reward expectation18–21, involve natural rewards20,21 or are suitable The blocking procedure provides an illustration of the essential role of
for assessing cue-reward learning19. Thus, despite the prevalence and RPEs in associative learning. Consider two cues (for example, a tone
influence of the hypothesis that RPE signaling by dopamine neurons and a light) presented simultaneously (in compound) and followed
drives associative cue-reward learning, a direct link between the two by reward delivery. It has been shown that conditioning to one ele-
has yet to be established. ment of the compound is reduced (or blocked) if the other element
To address this unresolved issue, we capitalized on the ability has already been established as a reliable predictor of the reward24–27.
to selectively control the activity of dopamine neurons in the In other words, despite consistent pairing between a cue and reward,
awake, behaving rat with temporally precise and neuron-specific the absence of a prediction error prevents learning about the redun-
optogenetic tools21–23 to simulate naturally occurring dopamine dant cue. Consistent with the idea that dopamine neurons encode
signals. We sought to determine whether activation of dopamine prediction errors, putative dopamine neurons recorded in vivo exhibit
neurons in the VTA timed with the delivery of an expected reward little to no reward-evoked responses in a blocking procedure 28. The
would mimic a RPE and drive cue-reward learning using two distinct lack of dopamine neuron activity, combined with a failure to learn in
behavioral procedures. the blocking procedure, is considered to be a key piece of evidence
First, we employed blocking, the associative phenomenon that (albeit correlative) linking dopamine RPE signals to learning. On the
best demonstrates the role of prediction errors in learning 24–26. basis of this evidence, we determined that the blocking procedure
In a blocking procedure, the association between a cue and a reward would provide an ideal environment in which to test the hypoth-
is prevented (or blocked) if another cue present in the environment esis that RPE signaling by dopamine neurons can drive learning.
at the same time already reliably signals reward delivery27. It is gener- According to this hypothesis, artificially activating dopamine neurons
ally argued that the absence of an RPE, supposedly encoded by the during reward delivery in the blocking condition, when dopamine
reduced or absent phasic dopamine response to the reward, prevents neurons normally do not fire, should mimic a naturally occurring
further learning about the redundant cue4,28. We reasoned that arti- prediction error signal and allow subjects to learn about the otherwise
ficial VTA dopamine neuron activation paired with reward delivery blocked cue.
would mimic a positive prediction error and facilitate learning about We first examined associative blocking of reward-seeking (Fig. 1)
the redundant cue. Next, we tested the role of dopamine neuron acti- using parameters suitable for subsequent optogenetic neural manip-
vation during extinction learning. Extinction refers to the observed ulation. Two groups of rats were initially trained to respond for a
decrease in conditioned responding that results from the reduction or liquid sucrose reward (unconditioned stimulus) during an auditory
omission of an expected reward. The negative prediction error, which cue in a single cue training phase. Subsequently, a combined audi-
is supposedly encoded by a pause in dopamine neuron firing, is pro- tory and visual cue was presented in a compound training phase and
posed to induce extinction of behavioral responding4,29. We reasoned the identical sucrose unconditioned stimulus was delivered. For sub-
that artificial VTA dopamine neuron activation timed to coincide jects assigned to the blocking group, the same auditory cue was pre-
with the reduced or omitted reward would interfere with extinction sented during single and compound phases, whereas distinct auditory
learning. In both procedures, optogenetic activation of dopamine cues were used for control group subjects (Fig. 1a); in both phases,
npg
Figure 1 Behavioral demonstration of the

blocking effect. (a) Experimental design of the
a Single cue Compound cue Test b Single reinforced trial
14–15 d 4d 1d
blocking task. A, cue A; X, cue X; AX, compound 30 s
presentation of cues A and X; US, unconditioned

Cue
stimulus. (b) During reinforced trials, sucrose Blocking
X?
delivery was contingent on reward port entry during A US AX US
the 30-s cue. After entry, sucrose was delivered for Time in port
3 s, followed by a 2-s timeout. Up to six sucrose
rewards could be earned per trial, depending on Control US (sucrose)
X?
the rats’ behavior. (c) Performance across all single B US AX US 3 s (up to 6 per trial)
cue and compound training sessions. Inset, mean
performance among groups over the last 4 d of
single-cue training did not differ; controls showed c Single cue Compound d e
Blocking
reduced behavior during compound training 100 *** 25 Test
25 (n = 12)
(***P < 0.001). (d) Performance during visual
Percent time in port during cue
Control
cue test. The blocking group exhibited reduced 75
20 20 (n = 11)
responding to the cue at test, relative to controls P = 0.095
15 15
(main effect of group, P = 0.003; group × trial **
50
interaction, P = 0.286). (e) Visual cue test P < 0.001
Percent time
100 10 10
performance for the first trial and the average of
50
all three trials. The blocking group showed reduced 25
5 5
cue responding for the three-trial measure 0
Single cue Compound
(**P = 0.003), but were not different on the first 0 0 0
trial (P = 0.095). Data are presented as means and 1 3 5 7 9 11 13 15 17 19 1 2 3 1 trial 3 trials
Trial
error bars represent s.e.m. Training day
nature NEUROSCIENCE VOLUME 16 | NUMBER 7 | JULY 2013 967

a r t ic l e s
Figure 2 Dopamine neuron stimulation drives a b Paired stimulation

c
new learning. (a) Example histology from a Single Compound Test Cue AX
Th-cre+ rat injected with a Cre-dependent cue cue Time in port
ChR2-containing virus. Vertical track indicates 14–15 d
alv
PPT
4d 1d US (sucrose)
optical fiber placement above VTA. Scale bar +Stim
represents 1 mm. (b) Experimental design for
mp
blocking task with optogenetics. All groups AP – 5.8 Unpaired stimulation

received identical behavioral training according A US AX US X? Cue AX
Time in port
to the blocking group design shown in
With paired or unpaired US (sucrose)
Figure 1a. (c) Optical stimulation (1-s train, optical stimulation
Stim
5-ms pulse, 20 Hz, 473 nm) was synchronized ChR2-eYFP TH DAPI
with sucrose delivery in Paired (Cre+ and

Cre−), but not Unpaired (Cre+), groups during
d Test
* e PairedCre+ f
Single cue Compound (n = 9)
compound training. (d) Performance across all 100 15 15 **
** UnpairedCre+

single cue and compound training sessions. (n = 9)
PairedCre−
Inset, no group differences were observed over 75
(n = 10)
10 10
the last 4 d of single cue training or during n.s. n.s.
100 P = 0.055
compound training. (e) Performance during
Percent time
50
visual cue test. The PairedCre+ group exhibited 50 5 5
increased responding to the cue relative to 25
0
both control groups at test on the first trial Single cue Compound
0 0 0
(**P < 0.005). (f) Visual cue test performance
1 3 5 7 9 11 13 15 17 19 1 2 3 1 trial 3 trials
for the first trial and all three trials averaged. Training day Trial
The PairedCre+ group exhibited increased cue
responding relative to controls for the one-trial measure (PairedCre + versus UnpairedCre+, **P = 0.005; PairedCre+ versus PairedCre−, *P = 0.025;
PairedCre− versus UnpairedCre+, P = 0.26); there was a trend for a group effect for the three-trial average (main effect of group, P = 0.055). Data are
presented as means and error bars represent s.e.m.
unconditioned stimulus delivery was contingent on the rat’s pres- neurons and lead to learning observed under control conditions.
ence in the reward port during the cue (Fig. 1b). Thus, the critical To test this hypothesis, we optogenetically activated VTA dopamine
difference between experimental groups is the predictability of the neurons at the time of unconditioned stimulus delivery on com-
unconditioned stimulus during the compound phase; because of pound trials in our blocking task to drive learning under conditions
its prior association with the previously trained auditory cue, the in which learning normally does not occur. We used parameters that
unconditioned stimulus is expected for the blocking group, whereas, we have previously established elicit robust, time-locked activation
for the control group, its occurrence is unexpected. We measured of dopamine neurons and neurotransmitter release in anesthetized
conditioned responding as the amount of time spent in the reward animals or in vitro preparations21. We predicted that phasic dopamine
port during the cue, normalized to an immediately preceding pre-cue neuron activation delivered coincidently with fully-predicted reward
period of equal length. Both groups showed equivalently high levels would be sufficient to cause new learning about preceding cues.
of conditioned behavior at the end of the single cue phase (two-way Female transgenic rats expressing Cre recombinase under the con-
repeated-measures ANOVA, no effect of group or group × day inter- trol of the tyrosine hydroxylase (Th) promoter (Th-cre+ rats) and their
action, all P values > 0.05), but differed in their performance when the wild-type littermates (Th-cre− rats) were used to gain selective control
compound cue was introduced (two-way repeated-measures ANOVA, of dopamine neuron activity as described previously21. Th-cre+ and
npg
main effect of group, F1,21 = 21.15, P < 0.001; group × day interaction, Th-cre− littermates received identical injections of a Cre-dependent
F3,63 = 11.63, P < 0.001), consistent with the fact that the association virus expressing channelrhodopsin-2 (ChR2) in the VTA; chronic
between the compound cue and unconditioned stimulus had to be optical fiber implants were targeted dorsal to this region to allow for
learned by the control group (Fig. 1c). selective unilateral optogenetic dopamine neuron activation (Fig. 2a
To determine whether learning about the visual cue introduced dur- and Supplementary Fig. 1). Three groups of rats were trained under
ing compound training was affected by the predictability of reward, conditions that normally result in blocked learning to the light cue
we assessed conditioned responding to unreinforced presentations of (cue X; Fig. 2b). The behavioral performance of an experimental
the visual cue alone 1 d later. Conditioned responding was reduced group (PairedCre+) consisting of Th-cre+ rats that received optical
in the blocking group as compared with controls (two-way repeated- stimulation (1-s train, 5-ms pulse, 20 Hz) paired with the uncondi-
measures ANOVA, main effect of group, F1,21 = 11.27, P = 0.003, no tioned stimulus during compound training (see Online Methods)
group × trial interaction, F2,42 = 1.29, P = 0.286; Fig. 1d,e), indicating was compared to the performance of two control groups that received
that new learning about preceding environmental cues occurs after identical training, but differed either in genotype (PairedCre −) or
unpredicted, but not predicted, reward in this procedure, consistent the time at which optical stimulation was delivered (UnpairedCre+,
with previous findings28,30. optical stimulation during the intertrial interval, ITI; Fig. 2c). Groups
performed equivalently during single cue and compound train-
Reward-paired dopamine neuron activation drives learning ing (Fig. 2d), suggesting that all rats learned the task and that the
Putative dopamine neurons recorded in monkeys are strongly acti- optical stimulation delivered during compound training did not
vated by unexpected reward, but fail to respond to the same reward if disrupt ongoing behavior (two-way repeated-measures ANOVA
it is fully predicted10,11, including when delivered in a blocking condi- revealed no significant effect of group or group × day interaction, all
tion28. The close correspondence between dopamine neural activity P values > 0.111).
and behavioral evidence of learning in this task suggests that posi- The critical comparison among groups occurred when the visual
tive RPEs caused by unexpected reward delivery activate dopamine cue introduced during compound training was tested alone in an

a r t ic l e s
Figure 3 Dopamine neuron stimulation a Training Cue sucrose

Cue
Paired
attenuates behavioral decrements associated Time in port
stimulation
with a downshift in reward value. Water + stim
(a) Experimental design for reward downshift Downshift test Cue water
experiment. Optical stimulation (3-s train, 5-ms Cue
Unpaired
pulse, 20 Hz, 473 nm) was either paired with Time in port
stimulation
+
the water reward (PairedCre and PairedCre − Downshift recall Cue water Water
groups) or explicitly unpaired (UnpairedCre+) Stim
during the downshift test. (b) Percent time in
port during the cue across training sessions. b Training c
Downshift test dDownshift recall PairedCre+
(n = 11)
Inset, no difference in average performance

100 100 50 *** 100 50 UnpairedCre
+
during the last two training sessions. (c) Percent (n = 10)

40 40 –
time in port during the cue for the downshift 75 75 75 PairedCre
(n = 10)
test. Data are displayed for single trials (left) 30 30
n.s.
and as a session average (right). PairedCre+ rats 50 100 50 50
20 20
exhibited increased time in port compared with 50
+ + 25 25 25
controls (PairedCre versus UnpairedCre , 10 10
***P < 0.001; PairedCre+ versus PairedCre−, 0
0
0 0 0 0
***P < 0.001; PairedCre− versus UnpairedCre+,
P = 0.691). (d) Percent time in port during the
cue for downshift recall. Data are displayed
e f g
30 10 30 20 30 20
for single trials (left) and as a session average
Latency to enter port (s)
*
(right). There were no group differences during 5
15 15
this phase (two-way repeated-measures ANOVA, 20 20 *** 20
0
main effect of group, P = 0.835). (e) Latency 10 10
to enter the reward port after cue onset. Inset,
10 10 10
no group differences during last two training 5 5
sessions. (f) Data are presented as in c, but
for latency. PairedCre+ rats responded faster 0 0 0 0 0
1 3 5 7 9 11 5 10 Session 5 10 Session
to the cue compared with controls during the
Training day Trial Trial
downshift test (PairedCre+ versus UnpairedCre+,
***P < 0.001; PairedCre+ versus PairedCre−, ***P < 0.001; PairedCre− versus UnpairedCre+, P = 0.375). (g) Data are presented as in d, but for
latency. PairedCre+ rats responded faster to the cue than controls during downshift recall (PairedCre + versus UnpairedCre+, P = 0.024; PairedCre+
versus PairedCre−, P = 0.025; PairedCre− versus UnpairedCre+, P = 0.706; *P < 0.05). Data are presented as means and error bars represent s.e.m.
unreinforced session. PairedCre+ subjects responded more strongly Online Methods and Supplementary Figs. 3 and 4). This suggests
to the visual cue on the first test trial than subjects from either control that the unblocked learning about the newly added cue X was not the
group (Fig. 2e,f), indicating greater learning. A two-way repeated- result of increased reward value induced by manipulating dopamine
measures ANOVA revealed a significant interaction between group neuron activity.
and trial (F4,50 = 3.819, P = 0.009) and a trend toward a main effect
of group (F2,25 = 3.272, P = 0.055). Planned post hoc comparisons Dopamine neuron activation slows extinction
showed a significant difference between the PairedCre+ group and Negative prediction errors also drive learned behavioral changes. For
PairedCre− (P = 0.005) or UnpairedCre+ (P < 0.001) controls on the example, after a cue-reward association has been learned, decrementing
npg
first test trial, whereas control groups did not differ (UnpairedCre+ or omitting the expected reward results in decreased reward-seeking
versus PairedCre−, P = 0.155; Fig. 2e,f). This result indicates that uni- behavior. Dopamine neurons show a characteristic pause in firing in
lateral VTA dopamine neuron activation at the time of unconditioned response to reward decrements or omissions 8–10, and this pause is
stimulus delivery was sufficient to cause new learning about preced- proposed to contribute to decreased behavioral responding to cues
ing environmental cues. The observed dopamine neuron–induced after reward decrement4,29. Having established that optogenetically
learning enhancement was temporally specific, as responding to the activating dopamine neurons can drive new learning about cues under
visual cue was blocked in the UnpairedCre+ group receiving optical conditions in which dopamine neurons normally do not change their
stimulation outside of the cue and unconditioned stimulus periods. firing patterns from baseline levels, we next tested whether similar
Notably, PairedCre+ and UnpairedCre+ rats received equivalent stim- artificial activation at a time when dopamine neurons normally decrease
ulation, and this stimulation was equally reinforcing (Supplementary firing could counter decrements in behavioral performance associ-
Fig. 2a–c), so discrepancies in the efficacy of optical stimulation ated with reducing the value of the unconditioned stimulus. Th-cre+
between the PairedCre+ and UnpairedCre+ groups cannot explain and Th-cre− rats that received unilateral ChR2-containing virus infu-
the observed behavioral differences. sions and optical fiber implants targeted to the VTA (Supplementary
One possible explanation for the behavioral changes that we Fig. 1) were trained to respond for sucrose whose availability was pre-
observed in the blocking experiment is that optical stimulation of dicted by an auditory cue. The auditory cue was presented 1 d after the
dopamine neurons during compound training served to increase last training session, but water was substituted for the sucrose uncon-
the value of the paired sucrose reward. Such an increase in value ditioned stimulus (downshift test; Fig. 3a). PairedCre+ and PairedCre−
would result in a RPE (although not encoded by dopamine neurons) rats received dopamine neuron optical stimulation (3-s train, 5-ms
and unblock learning. We found, however, that the manipulation pulse, 20 Hz) concurrent with water delivery when they entered the
of dopamine neuron activity during the consumption of one of two reward port during the cue; UnpairedCre+ rats received stimulation
equally preferred, distinctly flavored sucrose solutions did not change during the ITI. Rats were subjected to a downshift recall session
the relative value of these rewards (measured as reward preference; later; the recall session was identical to the initial extinction test,

a r t ic l e s
Figure 4 Dopamine neuron stimulation a Training Cue sucrose

Cue
Paired
attenuates behavioral decrements associated Time in port
stimulation
with reward omission. (a) Experimental design Stim
for extinction experiment. Note that the same Extinction test Cue Ø
Cue
subjects from the downshift experiment were Unpaired
+ Time in port
used for this procedure, with Cre groups stimulation
shuffled between experiments (see Online Extinction recall Cue Ø Stim
Methods). Optical stimulation (3-s train, 5-ms
pulse, 20 Hz, 473 nm) was delivered at the b Training cExtinction test d Extinction recall
time of expected reward for Paired groups and PairedCre+

+ 100 60 40 60 40 (n = 11)
during ITI for UnpairedCre rats during the
*** UnpairedCre+
extinction test. (b) Percent time in port during 30 (n = 10)
75 30
the cue across training sessions. Inset, no 40 40 PairedCre−
difference was observed in average performance 50 100 20 20
(n = 10)
during the last two training sessions. (c) Percent ***
50 20 20
time in port during the cue for the extinction 25 10 10
test. Data are displayed for single trials (left) 0
and as a session average (right). PairedCre+ rats 0 0 0 0 0
exhibited increased time in port compared with
controls (PairedCre+ versus UnpairedCre+, e f g
***P < 0.001; PairedCre+ versus PairedCre−, 30 10 30 30 30 30
***P < 0.001; PairedCre− versus UnpairedCre+, ***
P = 0.920). (d) Percent time in port during the 5

20 20 20 * 20 20
cue for extinction recall. Data are displayed 0
for single trials (left) and as a session average

(right). PairedCre+ rats exhibited increased 10 10 10 10 10
time in port compared with controls (PairedCre+
+
versus UnpairedCre , ***P < 0.001; PairedCre +
0 0 0 0 0
versus PairedCre−, ***P < 0.001; PairedCre−
2 4 6 5 10 Session 5 10 Session
+
versus UnpairedCre , P = 0.984). (e) Latency to Training day Trial Trial
enter the reward port after cue onset. Inset, no
group differences were observed during the last two training sessions. (f) Data are presented as in c, but for latency. PairedCre+ rats responded faster to
the cue than controls during the extinction test (PairedCre + versus UnpairedCre+, P = 0.038; PairedCre+ versus PairedCre−, P = 0.04; PairedCre− versus
UnpairedCre+, P = 0.727; *P < 0.05). (g) Data are presented as in d, but for latency. PairedCre+ rats responded faster to the cue than controls during
extinction recall (PairedCre+ versus UnpairedCre+, ***P < 0.001; PairedCre+ versus PairedCre−, ***P < 0.001; PairedCre− versus UnpairedCre+,
P = 0.211). Data are presented as means and error bars represent s.e.m.
except that no optical stimulation was given. The purpose of the recall We next examined whether our optical manipulation would be effec-
session was to determine whether optical stimulation had caused tive if the expected reinforcer was omitted entirely (Fig. 4). Rats used
long-lasting behavioral changes. Cue responding was measured as in the downshift experiment (see Online Methods) were trained on a
the percent time spent in the reward port during the cue normalized new cue-reward association (Fig. 4a). All rats learned the new associa-
to a pre-cue baseline (Fig. 3b–d) and as the latency to enter the reward tion (Fig. 4b,e); a two-way repeated-measures ANOVA revealed no
port after cue onset (Fig. 3e–g). significant effects of group or group × day interactions at the end of
All groups acquired the initial cue–reward association (Fig. 3b,e); training (all P values > 0.242). Subsequently, all rats were subjected to
npg
a two-way repeated-measures ANOVA revealed no significant an extinction test in which the expected sucrose reward was withheld.
effects of group or group × day interactions at the end of training Instead, PairedCre+ and PairedCre− rats received optical stimulation
(all P values > 0.277). During the downshift test, PairedCre− and (3-s train, 5-ms pulse, 20 Hz) of dopamine neurons at the time of
UnpairedCre+ group performance rapidly deteriorated. This was expected unconditioned stimulus delivery, whereas UnpairedCre+ rats
evident on a trial-by-trial basis (Fig. 3c,f) and when cue responding received optical stimulation during the ITI. Rats were subjected to an
was averaged across the entire downshift test session (Fig. 3c,f). extinction recall session 1 d later in which neither the unconditioned
In contrast, PairedCre+ rats receiving optical stimulation concur- stimulus nor optical stimulation were delivered to determine whether
rent with water delivery showed much reduced (Fig. 3c) or no prior optical stimulation results in long-lasting behavioral changes.
(Fig. 3f) decrement in behavioral responding. Two-way repeated- During the extinction test, PairedCre+ rats spent more time
measures ANOVAs revealed significant effects of group and group in the reward port during the cue and responded to the cue more
× trial interactions for both time spent in the port during the quickly than both PairedCre− and UnpairedCre+ rats (Fig. 4c,f);
cue (group, F2,28 = 11.12, P < 0.001; group × trial, F18,252 = 1.953, two-way repeated-measures ANOVAs revealed significant effects of
P = 0.013) and latency to respond after cue onset (group, F2,28 = group and/or group × trial interactions for both measures (percent
12.463, P < 0.001; group × trial, F18,252 = 4.394, P < 0.001). Planned time: group, F2,28 = 40.054, P < 0.001; group × trial, F18,252 = 0.419,
post hoc comparisons revealed that PairedCre+ rats differed signifi- P = 0.983; latency: group, F2,28 = 3.827, P = 0.034; group × trial,
cantly from controls in both time and latency (P < 0.001), whereas F18,252 = 2.047, P = 0.008), and these behavioral differences per-
control groups did not differ from each other (P > 0.375). Notably, sisted into the extinction recall session (two-way repeated-measures
some group differences persisted into the downshift recall session in ANOVAs, significant main effects of group and group × trial inter
which no stimulation was delivered (latency: main effect of group, actions, F > 2, P < 0.01 in all cases; Fig. 4d,g). Thus, VTA dopamine
F2,28 = 4.597, P = 0.019; Fig. 3g). These data indicate that phasic VTA neuron activation at the time of expected reward is sufficient to sustain
dopamine neuron activation can partially counteract performance conditioned behavioral responding when expected reward is
changes associated with reducing reward value. omitted. For both reward downshift and omission, the behavioral

a r t ic l e s
effects of dopamine neuron stimulation were temporally specific, sensory representation of reward, as they do not discriminate among
as UnpairedCre+ rats responded less than PairedCre+ rats despite rewards on the basis of their sensory properties29; thus, it is not obvi-
receiving more stimulation during the test sessions (Supplementary ous how dopamine neuron activation coincident with natural reward
Fig. 2d,g) and despite verification that this stimulation is equally delivery could be perceived as distinct from that reward.
reinforcing in both Cre+ groups (Supplementary Fig. 2e,f,h,i). Although previous studies have suggested otherwise 34,35, another
Despite causing substantial behavioral changes during extinc- related possibility is that optical activation of dopamine neurons
tion, optogenetic activation of dopamine neurons failed to maintain induces behavioral changes by directly enhancing the value of the
reward-seeking behavior at pre-extinction levels. This may be a result paired natural reward. To address this possibility, we conducted a
of the inability of our dopamine neuron stimulation to fully counter control study based on the idea that high-value rewards are preferred
the expected decrease in dopamine neuron firing during reward omis- over less valuable alternatives. We paired dopamine neuron stimula-
sion or downshift. Alternatively, this may reflect competition between tion with consumption of a flavored, and therefore discriminable,
the artificially imposed dopamine signal and other neural circuits sucrose solution; we reasoned that if dopamine neuron stimulation
specialized to inhibit conditioned responding when this behavior is served to increase the value of a paired reward, this should manifest
no longer advantageous, as has been proposed31,32. as an increased preference for the stimulation-paired reward over a
Estrus cycle can modulate dopaminergic transmission under some distinctly flavored, but otherwise identical sucrose solution. However,
circumstances33. Notably, although female rats were used in these we observed that reward that was previously paired with dopamine
studies, we tracked estrus stage during a behavioral session in which neuron stimulation was preferred equivalently to one that was not.
dopamine neurons were stimulated, and we failed to observe cor- This result does not support the interpretation that optical dopamine
relations between estrus cycle stage and behavioral performance neuron stimulation supported learning in our experiments by increas-
(Supplementary Fig. 5). ing the value of the sucrose reward.
Alternatively, the behavioral changes that we observed in PairedCre+
DISCUSSION rats could reflect the development of a conditioned place preference

We found that RPE signaling by dopamine neurons is causally for the location at which optical stimulation was delivered (that is,
related to cue-reward learning. We leveraged the temporal precision the reward port), as has been demonstrated20. If this were the case,
afforded by optogenetic tools to mimic endogenous RPE signaling we should have observed generalized increases in reward-seeking
in VTA dopamine neurons to examine how these artificial signals behavior across the entire test session. Notably, our primary
affect subsequent behavior. Using an associative blocking proce- behavioral metric (time spent in the reward port during the cue)
dure, we observed that increasing dopamine neuron activity during was normalized to pre-cue baseline levels. If optical stimulation
reward delivery could drive new learning about antecedent cues that had induced nonspecific increases in reward-seeking behavior, our
would not normally guide behavior. Using extinction procedures, we normalized measure should have approached zero. However, we found
observed that reductions in conditioned responding that normally that reward-seeking was specifically elevated during cue presentation.
accompany decreases in reward value are attenuated when dopamine Although we observed robust group differences in our normalized
neuron activity is increased at the time of expected reward. Notably, measures, a separate analysis of the absolute percent time spent in the
the behavioral changes we observed in all experiments were long last- port in the pre-cue baseline period during any test session revealed
ing, persisting 24 h after dopamine neurons were optogenetically acti- no significant group differences (all P values > 0.17). Together, these
vated, and temporally specific, failing to occur if dopamine neurons findings indicate that the behavioral changes that we observed are
were activated at times outside of the reward consumption period. unlikely to be the result of a conditioned place preference.
Taken together, our results indicate that RPE signaling by dopamine Instead, the most parsimonious explanation for our results is that
neurons is sufficient to support new cue-reward learning and modify dopamine stimulation reproduced a RPE. Theories of associative
npg
previously learned cue-reward associations. learning hold that simple pairing, or contiguity, between a stimu-
Our results clearly establish that artificially activating VTA lus and reward or punishment is not sufficient for conditioning to
dopamine neurons at the time that a natural reward is delivered (or occur; learning requires the subject to detect a discrepancy or predic-
expected) supports cue-elicited responding. A question of funda- tion error between the expected and actual outcome that serves to
mental importance is why this occurs. In particular, for the block- correct future predictions1. Although compelling correlative evidence
ing study, one possibility is that dopamine stimulation acted as an suggests that dopamine neurons are well-suited to provide such a
independent reward, discriminable from the paired sucrose reward, teaching signal, little proof exists to support this notion. For this
which initiated the formation of a parallel association between the reason, our results represent an advance over previous work. Although
reward-predictive cue and dopamine stimulation itself. However, this prior studies that also used optogenetic tools to permit temporally
explanation assumes cue independence and would require the rat to precise control of dopamine neuron activity found that dopamine
compute two simultaneous, yet separate, prediction errors controlling neuron activation is reinforcing, these studies did not establish the
the strength of two separate associations (cue A → sucrose, cue X → means by which this stimulation can reinforce behavior. Because we
dopamine stimulation). Indeed, the assumption of cue independence used behavioral procedures in which learning is driven by reward
was challenged1 specifically because separate prediction errors cannot prediction errors, our data establish the critical behavioral mechanism
account for the phenomenon of blocking. If each cue generated its (RPE) through which phasic dopamine signals timed with reward
own independent prediction error, then the preconditioning of one cause learning.
cue would not affect the future conditioning of other cues, but it does, Through which cellular and circuit mechanisms could this
as the blocking procedure revealed. Blocking showed that cues pre- dopamine signal cause learning to occur? Although few in number,
sented simultaneously interact and compete for associative strength. VTA dopamine neurons send extensive projections to a variety of
Thus, it is unlikely that a parallel association formed between reward- cortical and subcortical areas and are therefore well-positioned to
predictive cues and dopamine stimulation can account for our results. influence neuronal computation2,36–38. Increases in dopamine neuron
Of interest, putative dopamine neurons do not appear to encode a firing during unexpected reward could function as a teaching signal

a r t ic l e s
used broadly in efferent targets to strengthen neural representa- abuse through the University of California, San Francisco (P.H.J.), a National
tions that facilitate reward receipt39,40, possibly via alterations in the Science Foundation Graduate Research Fellowship (E.E.S.), and grants from the
strength and direction of synaptic plasticity37,41–43. Because our arti- National Institute of Mental Health, the National Institute on Drug Abuse, the
Michael J Fox Foundation, the Howard Hughes Medical Institute, and the Defense
ficial manipulation of dopamine neuron activity produced behavioral Advanced Research Projects Agency Reorganization and Plasticity to Accelerate
changes that lasted at least 24 h after the stimulation ended, such Injury Recovery Program (K.D.).
dopamine-induced, downstream changes in synaptic function may
have occurred; in addition, both natural cue-reward learning44 and AUTHOR CONTRIBUTIONS
E.E.S., R.K. and P.H.J. designed the experiments. E.E.S., R.K. and J.R.B. performed
optogenetic stimulation of dopamine neurons45 alter glutamatergic the experiments. I.B.W. and K.D. contributed reagents. E.E.S., R.K. and P.H.J. wrote
synaptic strength onto dopamine neurons themselves, providing the paper with comments from all of the authors.
another possible basis for the long-lasting effects of dopamine neu-
ron activation on behavior. One or both of these synaptic mechanisms COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.
may underlie the behavioral changes reported here. Although the
physiological consequences of optogenetic dopamine neuron activa- Reprints and permissions information is available online at http://www.nature.com/
tion have been investigated in in vitro preparations and in anesthe- reprints/index.html.
tized rats, to fully explore these synaptic mechanisms, a first critical
step is to define the effects of optical activation on neuronal firing and 1. Rescorla, R.A. & Wagner, A.R. A theory of Pavlovian conditioning: variations in the
on dopamine release in the awake behaving subject. effectiveness of reinforcement and nonreinforcement. in Classical Conditioning II:
Current Research and Theory (eds. Black, A.H. & Prokasy, W.F.) 64–99 (Appleton
We focused on the role of dopamine neuron activation at the time Century Crofts, New York, 1972).
of reward. Another hallmark feature of dopamine neuron firing dur- 2. Glimcher, P.W. Understanding dopamine and reinforcement learning: the dopamine
reward prediction error hypothesis. Proc. Natl. Acad. Sci. USA 108 (suppl. 3):
ing associative learning is the gradual transfer of neural activation 15647–15654 (2011).
from reward delivery to cue onset. Early in learning, when cue- 3. Montague, P.R., Dayan, P. & Sejnowski, T.J. A framework for mesencephalic
reward associations are weak, dopamine neurons respond robustly dopamine systems based on predictive Hebbian learning. J. Neurosci. 16,
1936–1947 (1996).
to the occurrence of reward and weakly to reward-predictive cues. 4. Schultz, W., Dayan, P. & Montague, P.R. A neural substrate of prediction and reward.
As learning progresses neural responses to the cue become more Science 275, 1593–1599 (1997).
pronounced and reward responses diminish10. Although our results 5. Schultz, W. & Dickinson, A. Neuronal coding of prediction errors. Annu. Rev.
Neurosci. 23, 473–500 (2000).
support the idea that reward-evoked dopamine neuron activity 6. Sutton, R.S. & Barto, A.G. Toward a modern theory of adaptive networks: expectation
drives conditioned behavioral responding to cues, the function(s) and prediction. Psychol. Rev. 88, 135–170 (1981).
7. Schultz, W., Apicella, P. & Ljungberg, T. Responses of monkey dopamine neurons
of cue-evoked dopamine neuron activity remain a fruitful avenue to reward and conditioned stimuli during successive steps of learning a delayed
for investigation. response task. J. Neurosci. 13, 900–913 (1993).
Possible answers to this question have already been proposed. 8. Cohen, J.Y., Haesler, S., Vong, L., Lowell, B.B. & Uchida, N. Neuron type–specific
signals for reward and punishment in the ventral tegmental area. Nature 482,
This transfer of the dopamine teaching signal from the primary 85–88 (2012).
reinforcer to the preceding cue is predicted by temporal-difference 9. Roesch, M.R., Calu, D.J. & Schoenbaum, G. Dopamine neurons encode the better
models of learning46. In such models, the back-propagation of the option in rats deciding between differently delayed or sized rewards. Nat. Neurosci.
10, 1615–1624 (2007).
teaching signal allows the earliest predictor of the reward to be 10. Hollerman, J.R. & Schultz, W. Dopamine neurons report an error in the temporal
identified, thereby delineating the chain of events leading to reward prediction of reward during learning. Nat. Neurosci. 1, 304–309 (1998).
11. Mirenowicz, J. & Schultz, W. Importance of unpredictability for reward responses
delivery2,6,46. Alternatively, or in addition, cue-evoked dopamine in primate dopamine neurons. J. Neurophysiol. 72, 1024–1027 (1994).
may encode the cue’s incentive value, endowing the cue itself with 12. Day, J.J., Roitman, M.F., Wightman, R.M. & Carelli, R.M. Associative learning
some of the motivational properties originally elicited by the reward, mediates dynamic shifts in dopamine signaling in the nucleus accumbens.
Nat. Neurosci. 10, 1020–1028 (2007).
thereby making the cue desirable in its own right34. Using behavioral
npg
13. Takahashi, Y.K. et al. The orbitofrontal cortex and ventral tegmental area are
procedures that allow a cue’s predictive and incentive properties to be necessary for learning from unexpected outcomes. Neuron 62, 269–280 (2009).
assessed separately, a recent study provided evidence for dopamine’s 14. Iordanova, M.D., Westbrook, R.F. & Killcross, A.S. Dopamine activity in the nucleus
accumbens modulates blocking in fear conditioning. Eur. J. Neurosci. 24,
role in the acquisition of cue-reward learning for the latter, but not 3265–3270 (2006).
the former, process47. Such behavioral procedures could also prove 15. O’Tuathaigh, C.M. et al. The effect of amphetamine on Kamin blocking and
overshadowing. Behav. Pharmacol. 14, 315–322 (2003).
useful to determine in greater detail how learning induced by mimick- 16. Parker, J.G. et al. Absence of NMDA receptors in dopamine neurons attenuates
ing RPE signals affects cue-induced conditioned responding. These dopamine release, but not conditioned approach, during Pavlovian conditioning.
and other future attempts to define the precise behavioral conse- Proc. Natl. Acad. Sci. USA 107, 13491–13496 (2010).
17. Zweifel, L.S. et al. Disruption of NMDAR-dependent burst firing by dopamine
quences of dopamine neuron activity during cues and rewards will neurons provides selective assessment of phasic dopamine-dependent behavior.
further refine our conceptions of the role of dopamine RPE signals in Proc. Natl. Acad. Sci. USA 106, 7281–7288 (2009).
associative learning. 18. Adamantidis, A.R. et al. Optogenetic interrogation of dopaminergic modulation of the
multiple phases of reward-seeking behavior. J. Neurosci. 31, 10829–10835 (2011).
19. Domingos, A.I. et al. Leptin regulates the reward value of nutrient. Nat. Neurosci.
Methods 14, 1562–1568 (2011).
20. Tsai, H.C. et al. Phasic firing in dopaminergic neurons is sufficient for behavioral
Methods and any associated references are available in the online conditioning. Science 324, 1080–1084 (2009).
version of the paper. 21. Witten, I.B. et al. Recombinase-driver rat lines: tools, techniques, and
optogenetic application to dopamine-mediated reinforcement. Neuron 72, 721–733
Note: Supplementary information is available in the online version of the paper. (2011).
22. Boyden, E.S., Zhang, F., Bamberg, E., Nagel, G. & Deisseroth, K. Millisecond-
Acknowledgments timescale, genetically targeted optical control of neural activity. Nat. Neurosci. 8,
1263–1268 (2005).
We are grateful for the technical assistance of M. Olsman, L. Sahuque and
23. Zhang, F., Wang, L.P., Boyden, E.S. & Deisseroth, K. Channelrhodopsin-2 and optical
R. Reese, and for critical feedback from H. Fields during manuscript preparation. control of excitable cells. Nat. Methods 3, 785–792 (2006).
This research was supported by the US National Institutes of Health (grants 24. Kamin, L.J. “Attention-like” processes in classical conditioning. in Miami Symposium
DA015096 and AA17072 to P.H.J. and a New Innovator award to I.B.W.), on the Prediction of Behavior, 1967: Aversive Stimulation (ed. Jones, M.R.)
funds from the State of California for medical research on alcohol and substance 9–31 (University of Miami Press, 1968).

a r t ic l e s
25. Kamin, L.J. Selective association and conditioning. in Fundamental Issues in 36. Beckstead, R.M., Domesick, V.B. & Nauta, W.J. Efferent connections of the
Associative Learning (eds. Mackintosh, N.J. & Honig, F.W.K.) 42–64 (Dalhousie substantia nigra and ventral tegmental area in the rat. Brain Res. 175, 191–217
University Press, 1969). (1979).
26. Kamin, L.J. Predictability, surprise, attention and conditioning. in Punishment and 37. Fields, H.L., Hjelmstad, G.O., Margolis, E.B. & Nicola, S.M. Ventral tegmental area
Aversive Behavior (eds. Campbell, B.A. & Church, R.M.) 279–296 neurons in learned appetitive behavior and positive reinforcement. Annu. Rev.
(Appleton-Century-Crofts, New York, NY, 1969). Neurosci. 30, 289–316 (2007).
27. Holland, P.C. Unblocking in Pavlovian appetitive conditioning. J. Exp. Psychol. 38. Swanson, L.W. The projections of the ventral tegmental area and adjacent regions:
Anim. Behav. Process. 10, 476–497 (1984). a combined fluorescent retrograde tracer and immunofluorescence study in the rat.
28. Waelti, P., Dickinson, A. & Schultz, W. Dopamine responses comply with basic Brain Res. Bull. 9, 321–353 (1982).
assumptions of formal learning theory. Nature 412, 43–48 (2001). 39. Reynolds, J.N. & Wickens, J.R. Dopamine-dependent plasticity of corticostriatal
29. Schultz, W. Predictive reward signal of dopamine neurons. J. Neurophysiol. 80, synapses. Neural Netw. 15, 507–521 (2002).
1–27 (1998). 40. Wickens, J.R., Horvitz, J.C., Costa, R.M. & Killcross, S. Dopaminergic mechanisms
30. Burke, K.A., Franz, T.M., Miller, D.N. & Schoenbaum, G. The role of the orbitofrontal in actions and habits. J. Neurosci. 27, 8181–8183 (2007).
cortex in the pursuit of happiness and more specific rewards. Nature 454, 340–344 41. Gerfen, C.R. & Surmeier, D.J. Modulation of striatal projection systems by dopamine.
(2008). Annu. Rev. Neurosci. 34, 441–466 (2011).
31. Daw, N.D., Kakade, S. & Dayan, P. Opponent interactions between serotonin and 42. Reynolds, J.N., Hyland, B.I. & Wickens, J.R. A cellular mechanism of reward-related
dopamine. Neural Netw. 15, 603–616 (2002). learning. Nature 413, 67–70 (2001).
32. Peters, J., Kalivas, P.W. & Quirk, G.J. Extinction circuits for fear and addiction 43. Tye, K.M. et al. Methylphenidate facilitates learning-induced amygdala plasticity.
overlap in prefrontal cortex. Learn. Mem. 16, 279–288 (2009). Nat. Neurosci. 13, 475–481 (2010).
33. Becker, J.B. Gender differences in dopaminergic function in striatum and nucleus 44. Stuber, G.D. et al. Reward-predictive cues enhance excitatory synaptic strength
accumbens. Pharmacol. Biochem. Behav. 64, 803–812 (1999). onto midbrain dopamine neurons. Science 321, 1690–1692 (2008).
34. Berridge, K.C. & Robinson, T.E. What is the role of dopamine in reward: hedonic 45. Brown, M.T. et al. Drug-driven AMPA receptor redistribution mimicked by selective
impact, reward learning or incentive salience? Brain Res. Rev. 28, 309–369 dopamine neuron stimulation. PLoS ONE 5, e15870 (2010).
(1998). 46. Suri, R.E. TD models of reward predictive responses in dopamine neurons. Neural
35. Wassum, K.M., Ostlund, S.B., Balleine, B.W. & Maidment, N.T. Differential Netw. 15, 523–533 (2002).
dependence of Pavlovian incentive motivation and instrumental incentive learning 47. Flagel, S.B. et al. A selective role for dopamine in stimulus-reward learning. Nature
processes on dopamine signaling. Learn. Mem. 18, 475–483 (2011). 469, 53–57 (2011).
npg

ONLINE METHODS Downshift procedure. Rats received one session where sucrose reward was
Subjects and surgery. 115 female transgenic rats (Long-Evans background) were delivered to the active (right) port (50 deliveries, 30 s, variable interval) to shape
used in these studies; 68 rats expressed Cre recombinase under the control of the reward-seeking behavior to this location. Subsequently, rats were trained to
tyrosine hydroxylase promoter (Th-cre+) and 47 rats were their wild-type litter respond for sucrose during an auditory cue (white noise) as described above
mates (Th-cre−). All rats weighed >225 g at the time of surgery. During testing in 11 daily sessions. A downshift test session was administered 24 h later that
(except the flavor preference study), rats were mildly food restricted to 18 g of lab was identical to previous training sessions except that water was substituted for
chow per day given after the conclusion of daily behavioral sessions; on average, sucrose and optical stimulation was delivered coincidently. A downshift recall
rats maintained >95% free-feeding weight. Water was available ad libitum in the test was administered 24 h later, in which water was delivered during the cue, but
home cage. Rats were singly housed under a 12-h:12-h light/dark cycle, with lights optical stimulation did not occur.
on at 7 a.m. The majority of behavioral experiments were conducted during the
light cycle. Animal care and all experimental procedures were in accordance Extinction procedure. This experiment was conducted 2 weeks after the end of
with guidelines from the US National Institutes of Health and were approved in the downshift experiment with the same subjects; group assignment for Cre+ rats
advance by the Gallo Center Institutional Animal Care and Use Committee. We was shuffled between experiments. Rats received one session of sucrose reward
used stereotaxic surgical procedures for VTA infusion of Cre-dependent virus delivery to the opposite (left) port used in the downshift test to shape reward-
(Ef1α-DIO-ChR2-eYFP)20 and optical fiber placement as previously described21, seeking behavior to this location. Subsequently, rats were trained to respond
with the exception that dorsoventral coordinates were adjusted to account for the for sucrose during an auditory cue (pulsed tone) as described above in six daily
smaller size of female rats as follows: dorsoventral –8.1 and –7.1 mm below skull training sessions. An extinction test session was administered 24 h later that
surface for virus infusions and –7.1 mm for optical fiber implants. was identical to previous training sessions except that no reward was given and
optical stimulation was delivered at the time that the sucrose reward had been
Behavioral procedures. All behavioral experiments were conducted >2 weeks available in previous training. 24 h later, an extinction recall test was administered
post-surgery; sessions that included optical stimulation were conducted >4 weeks in which the auditory cue was presented, but no reward or optical stimulation
post-surgery. was delivered.
Apparatus. Behavioral sessions were conducted in sound-attenuated condition- Intracranial self-stimulation (ICSS). Following completion of the experiments
ing chambers (Med Associates). The left and right walls were fitted with reward described above, all rats were given four daily 1-h sessions of ICSS training, as
delivery ports; computer-controlled syringe pumps located outside of the sound- described previously21. Food restriction ceased at least 24 h before the first ICSS
attenuating cubicle delivered sucrose solution or water to these ports. The left wall session. A response at the nosepoke port designated as active resulted in the deliv-
had two nosepoke ports flanking the central reward delivery port; each nosepoke ery of a train of light pulses matched to the stimulation parameters used in that
port had three LED lights at the rear. Chambers were outfitted with 2,700-Hz pure subject’s previous behavioral experiment (1 s, 20 Hz for rats in blocking or flavor
tone and white noise auditory stimuli, both delivered at 70 dB, as well as a 28-V preference studies, 3 s, 20 Hz for rats in downshift or extinction studies).
chamber light above the left reward port. During behavioral sessions, the pure
tone was pulsed at 3 Hz (0.1 s on, 0.2 s off) to create a stimulus that was easily Flavor preference study. Rats were initially trained to drink unflavored 15%
distinguished from continuous white noise. sucrose solution (wt/vol) from the reward port in the conditioning chambers
(0.1 ml delivered on variable interval 30-s schedule, 50 deliveries). Rats were
Reward delivery. All experiments (except the flavor preference study) involved then given overnight access to 40 ml each of the flavored sucrose solutions (15%
delivery of a liquid sucrose solution (15%, wt/vol) during the presentation of sucrose + 0.15% Kool-Aid, tropical punch or grape flavors, wt/vol) in their
auditory or combined auditory-visual cues. During each cue, entry into the active home cage to ensure that all subjects had sampled both flavors before critical
port triggered a 3-s delivery of sucrose solution (0.1 ml). After a 2-s timeout, consumption tests. Home cages were equipped with two bottle slots; before
another entry into the port (or the rat’s continued presence at the port) trig- the start of the experiment both slots were occupied by water bottles to reduce
gered an additional 3-s reward delivery. This 5-s cycle could be repeated up to possible side bias.
six times per 30-s trial, depending on the rat’s behavior. For sessions in which For home cage consumption tests, water bottles were removed from the home
optical stimulation was delivered, the laser was activated each time sucrose was cage 15–30 min before the start of consumption tests. A standardized procedure
npg
delivered (or expected; Figs. 2c, 3a and 4a). This method of reward delivery, was used to ensure that rats briefly sampled both flavors before free access to the
where reward and optical stimulation were both contingent on the rat’s presence solutions began. The purpose of this procedure was to make sure that rats were
in the active port, was used for all experiments as it allowed for the coincident aware that both flavors were available, so that any measured preference reflected
delivery of natural rewards and optical stimulation and maximized the temporal true choice. The experimenter placed a flavor bottle on the left side of the cage
precision of reward expectation. until the rat consumed the solution for 2–3 s. This bottle was removed and a sec-
ond bottle containing the other flavored solution was placed on the right side of
Blocking procedure. Rats received a 1-d habituation session where all auditory the cage until the rat consumed the new solution for 2–3 s. The second bottle was
and visual cues used during future training sessions, as well as the sucrose reward, then removed and both bottles were simultaneously placed on the home cage to
were presented individually (three presentations of each cue, 5-min ITI; ~60 start the test. The bottles were removed 10 min later and the amounts consumed
reward deliveries, 1-min ITI). This session was intended to minimize uncon- were recorded. The cage side assigned to each flavor (left or right) alternated
ditioned responses to novel stimuli and shape reward-seeking behavior to the between consumption tests to control for possible side bias.
correct (left) reward port. Next, rats underwent single-cue training where one Flavor training began in the conditioning chambers 24 h after the home cage
of two auditory cues (white noise or pulsed tone, counterbalanced across sub- baseline consumption test (eight sessions total). Only one flavored sucrose solu-
jects) was presented for 30 s on a variable interval, 4-min schedule for ten trials tion was available per day; training days with each flavor were interleaved. One
per session. Sucrose was delivered during each cue as described above. After of the two flavors was randomly assigned for each rat to be the stimulated flavor.
14–15 sessions of single cue training, compound cue training commenced and On training days when the stimulated flavor was available, optical stimulation was
lasted for 4 d. During this phase, either the same auditory cue used in single cue either paired with reward consumption for PairedCre+ and PairedCre− groups,
training (blocking groups) or a new auditory cue (control group) was presented or explicitly unpaired (presented during the ITI at times when no reward was
simultaneously with a visual cue. The visual cue consisted of the chamber light, available) for UnpairedCre+ rats (Supplementary Fig. 4). Flavored sucrose was
which was the sole source of chamber illumination, flashing on/off at 0.3 Hz (1 s delivered to a reward port on a variable interval 30-s schedule, with the excep-
on, 2 s off). Sucrose reward was delivered as described above during this phase. tion that each reward had to be consumed before the next would be delivered. A
A probe test was administered 24 h after the conclusion of the last compound reward was considered to be consumed if the rat maintained presence in the port
training session to assess conditioned responding to the visual cue. During this for 1s or longer. Sessions lasted until the maximum of 50 rewards were consumed
session the visual cue was presented alone in the absence of sucrose, auditory or 1 h elapsed, whichever occurred first. The final home cage preference test was
cues or optical stimulation. conducted 24 h after the last flavor training session.
nature NEUROSCIENCE doi:10.1038/nn.3413

Optical activation. Intracranial light delivery in behaving rats was achieved as e xperimental intervention. Conditioned responding was measured as the
described21. For all experiments, 5-ms light pulses were delivered with a 50-ms amount of time spent in the reward port during cue presentation, normalized
inter-pulse interval (that is, 20 Hz). For blocking and flavor preference experi- by subtracting the time spent in the port during a pre-cue period of equal
ments, 20 pulses were used (1 s of stimulation). For downshift and extinction length. Note that, during reinforced training sessions, this measure is not a pure
experiments, 60 pulses were used (3 s). Data from sessions where light output was index of learning, as the time spent in the port during the cue also reflects time
compromised because of broken or disconnected optical cables was discarded and spent consuming sucrose. For the blocking experiment, we focused exclusively
these subjects were excluded from the study. This criterion led to the exclusion on this measure because it proved to be particularly robust. Notably, during the
of one rat from each of the blocking and extinction experiments, and four rats blocking test itself, this measure is a pure index of learning because no reward
from the self-stimulation protocol. is delivered during this session. For other experiments, we also measured the
latency to enter the reward port after cue onset. Pilot experiments and power
Assessment of estrus cycle. Stage of estrus cycle was assessed by vaginal cyto- analyses for both the blocking and the extinction study indicated that 8–10
logical examination using well-established methods48. After behavioral sessions subjects per group allowed for detection of differences between experimental
(downshift study), the tip of a moistened cotton swab was gently inserted into the and control conditions, with α = 0.05 and β = 0.80. In cases in which behav-
exterior portion of the vaginal canal and then rotated to dislodge cells from the ioral data from individual subjects varied from the group mean by more than
vaginal wall. The swab was immediately rolled onto a glass slide, and the sample two s.d. (calculated with data from all subjects included), these subjects were
preserved with spray fixative (Spray-Cyte, Fisher Scientific) without allowing the excluded as statistical outliers (two rats from the blocking experiment and
cells to dry. Samples were collected over five consecutive days to ensure observation three each from the downshift and extinction experiments) and their data
of multiple estrus cycle stages. This was done to improve the accuracy of determin- were not further analyzed. Behavioral measures were analyzed using a mixed
ing estrus cycle stage on any single day of the experiment. Slides were then stained factorial ANOVA with the between-subjects factor of experimental group and
with a modified Papanicolau staining procedure as follows: 50% ethyl alcohol, the within-subjects factor of session or trial, followed by planned Student
3 min; tap water, 10 dips (×2); Gill’s hematoxylin 1, 6 min; tap water, 10 dips (×2); Newman-Keul’s tests when indicated by significant main effects or interac-
Scott’s water, 4 min; tap water, 10 dips (×2); 95% ethyl alcohol, 10 dips (×2); modi- tions. For all tests, α = 0.05, and all statistical tests were two-sided. By their
fied orange-greenish 6 (OG-6), 1 min; 95% ethyl alcohol, 10 dips; 95% ethyl alcohol design, the experiments focused on three planned comparisons (PairedCre+
8 dips; 95% ethyl alcohol 6 dips; modified eosin azure 36 (EA-36), 20 min; 95% versus UnpairedCre+, PairedCre+ versus PairedCre−, UnpairedCre+ ver-
ethyl alcohol, 40 dips; 95% ethyl alcohol, 30 dips; 95% ethyl alcohol, 20 dips; 100% sus PairedCre−). We found no major deviation from the assumptions of the
ethyl alcohol, 10 dips (×2); xylene, 10 dips (×2); coverslip immediately. All staining ANOVA. For the cases in which normality or equal variance was question-
solutions (Gill’s hematoxylin 1, OG-6, and EA-36) were sourced from Richard Allen able, the results of the ANOVA were confirmed by non-parametric tests
Scientific. Estrus cycle stage was determined by identifying cellular morphology (Kruskal-Wallis followed by post hoc Dunn’s test). Although explicit blinding
characteristic to each phase according to previously described criteria48. procedures were not employed, experimental group allocation was not noted
on subject cage cards, and all behavioral data were collected automatically
Histology. Immunohistochemical detection of YFP and tyrosine hydroxylase via computer. Blocking and extinction experiments are replications of pilot
was performed as described previously21. Although optical fiber placements and experiments. Although based on a pilot study, the flavor preference study was
virus expression varied slightly from subject to subject, no subject was excluded conducted as described only one time, but with sufficient sample size to make
on the basis of histology (Supplementary Fig. 1). statistical inferences.
Data analysis. Counterbalancing procedures were used to form experimental 48. Karim, B.O. et al. Estrous cycle and ovarian changes in a rat mammary carcinogenesis
groups that were balanced in terms of age, weight, conditioning chamber model after irradiation, tamoxifen chemoprevention and aging. Comp. Med. 53,
used, cue identity and behavioral performance in the sessions preceding the 532–538 (2003).
npg
doi:10.1038/nn.3413 nature NEUROSCIENCE

Steinberg, Keiflin et al. Supplemental materials 1

Supplemental materials for:
A causal link between prediction errors, dopamine neurons and learning
Elizabeth E. Steinberg1-2*, Ronald Keiflin1*, Josiah R. Boivin1-2, Ilana B. Witten5, Karl

Deisseroth6, and Patricia H. Janak1-4†
1
Ernest Gallo Clinic & Research Center, University of California, San Francisco, San Francisco,
CA 94143, USA
2
Graduate program in Neuroscience, University of California, San Francisco, San Francisco CA
94143, USA.
3
Dept. of Neurology, University of California, San Francisco, San Francisco CA 94143, USA.
4
Wheeler Center for the Neurobiology of Addiction, University of California, San Francisco, San
Francisco CA 94143, USA.
5
Princeton Neuroscience Institute & Department of Psychology, Princeton University, Princeton
NJ, 08544, USA.
6
Department of Bioengineering, Department of Psychiatry and Behavioral Sciences, Howard
Hughes Medical Institute, CNC Program, Stanford University, Stanford, CA 94305, USA.
†Correspondence to: pjanak@gallo.ucsf.edu
*Denotes equal contribution.

Nature Neuroscience: doi:10.1038/nn.3413


Figure S1
Fig. S1. Histological reconstruction of optical fiber placements for subjects in all studies. (a)
Top, ChR2-YFP expression for a representative PairedCre+ rat. Blue line denotes the location of
the optical fiber tip in this rat. Note that virus injection and optical fiber placements were
unilateral in this and all other studies. Bottom, optical fiber tip placement for all rats in this group
(n=37). (b) Top, ChR2-YFP expression for a representative UnpairedCre+ rat. Orange line
denotes optical fiber tip location in this rat. Bottom, optical fiber tip placement for all rats in this
group (n=36). (c) Top, ChR2-YFP expression for a representative PairedCre– rat. Grey line
denotes optical fiber tip location in this rat. Bottom, optical fiber tip placement for all rats in this
group (n=41). Distance is posterior relative to bregma.

Figure S2
a b c
300
Nosepoke
5 *
Stimulation trains (compound)
n.s. Fixed 12 n.s. 1

4 Active Paired Cre+
8
Nosepokes (x1000)
Inactive (n=9)
200 4 n.s.
3
Active Inactive 0 Active
0 Unpaired Cre+
Active Inactive
Inactive (n=9)
2
100 Active
Paired Cre−
1 Inactive (n=9)
Stim
0 1s 0
Paired Paired Unpaired 1 2 3 4
*
Cre+ Cre− Cre+
d e f 16 n.s. 1
5 8
80 n.s.
Nosepoke
Stimulation trains (downshift test)
Fixed 4 0 0 Active
Nosepokes (x1000)
Paired Cre+
***
Active Inactive
60 (n=10)
Inactive
3
Active Inactive Active
40 Unpaired Cre+
2 Inactive (n=9)
20 Active
1 Paired Cre−
Inactive (n=8)
0 Stim 0
Paired Paired Unpaired 3s 1 2 3 4
Cre+ Cre− Cre+
**
g h i 5
12
n.s.
1
8
80 Nosepoke
Stimulation trains (extinction test)
4 n.s.
Fixed 4 Active Paired Cre+
0 0
60 (n=9)
Nosepokes (x1000)
Active Inactive Inactive
40
*** Active Inactive
3
Active
Unpaired Cre+
2 Inactive (n=9)
Active
20 Paired Cre−
1
Inactive (n=9)
0
Stim
0
3s
Paired Paired Unpaired 1 2 3 4
Cre+ Cre− Cre+ Training day


Fig. S2. Optical activation of dopamine neurons is equally reinforcing for Cre+ groups in
Blocking, Downshift and Extinction studies. (a) Stimulation trains delivered to all groups
across four days of compound training in the Blocking study (Fig. 2). PairedCre+ and
PairedCre– rats received equivalent amounts of optical stimulation (t-test, p=0.723) because of
equivalent behavioral performance during these sessions. UnpairedCre+ rats received a fixed
number of stimulation trains during the ITIs, set at the maximum number of trains that could be
earned by Paired groups. (b) Schematic of intracranial self-stimulation (ICSS) task for rats in the
Blocking study (see Methods). Stimulation parameters were 1s train, 5ms pulse, 20 Hz to match
those used in the Blocking experiment. (c) Responding across four daily ICSS sessions for rats
in the Blocking study. PairedCre+ and UnpairedCre+ groups made significantly more active
nosepoke responses than PairedCre– controls (two-way RM ANOVA, main effect of group and
group x day interaction F>4.589, p<0.008; SNK post-hoc test PairedCre+ vs. PairedCre–
p=0.018, UnpairedCre+ vs. Paired Cre–p=0.005) but did not differ from each other (p=0.355).
There were no group differences in inactive nosepoke responding (two-way RM ANOVA, main
effect of group and group x day interaction, F<2.17, p>0.056). Inset, summed responding across
all 4 sessions. (d) Stimulation trains delivered to all groups during the downshift test session
(Fig. 3). PairedCre+ rats earned more stimulation trains than PairedCre– rats (t-test, p<0.001)
because of differences in behavioral performance during this test. (e) Schematic of ICSS task for
rats in Downshift study. Stimulation parameters were 3s train, 5ms pulse, 20 Hz to match those
used in the Downshift/Extinction experiments. (f) As in C, but for rats in the Downshift study.
PairedCre+ and UnpairedCre+ groups made significantly more active nosepoke responses than
PairedCre– controls (two-way RM ANOVA, main effect of group and group x day interaction
F>3.711, p<0.003; SNK post-hoc test PairedCre+ vs. PairedCre– p<0.001, UnpairedCre+ vs.
PairedCre– p=0.025) but did not differ from each other (p=0.068). There were no group
differences in inactive nosepoke responding (two-way RM ANOVA, main effect of group and
group x day interaction, F<1.7, p>0.154). (g) Stimulation trains delivered to all groups during the
extinction test session (Fig. 4). PairedCre+ rats earned more stimulation trains than PairedCre–
rats (t-test, p<0.001) because of differences in behavioral performance during this test. (h)
Schematic of ICSS task for rats in Extinction study. Stimulation parameters were 3s train, 5ms
pulse, 20 Hz to match those used in the Downshift/Extinction experiments. (i) As in C, but for
rats in the Extinction study. PairedCre+ and UnpairedCre+ groups made significantly more
active nosepoke responses than PairedCre– controls (two-way RM ANOVA, main effect of
group and group x day interaction F>3.041, p<0.01; SNK post-hoc test PairedCre+ vs.
PairedCre– p=0.003, UnpairedCre+ vs. PairedCre– p=0.004) but did not differ from each other
(p=0.612). There were no group differences in inactive nosepoke responding (two-way RM
ANOVA, main effect of group and group x day interaction, F<2.187, p>0.054). Note that the
same behavioral data was used in F and I, but data from Cre+ rats was sorted differentially in
each case according to each subject’s group assignment in Downshift and Extinction studies, as
the same rats were used for both experiments. *p<0.05; ***p<0.001.

Figure S3
a b Flavor A
Home cage 300
Pre-training Flavor A Flavor B
preference n.s.
test Flavor B 200
Rewards
100
Paired stimulation (Cre+, Cre−)

Flavor A 0
Paired Unpaired Paired
Flavor training Stimulation Cre+ Cre+ Cre−
Flavors tested on
alternate days Unpaired stimulation (Cre+)
Flavor A c *
Flavor B x 4 days Flavor A x 4 days Stimulation 300
Stimulation trains
200
Home cage
Post-training Flavor A
preference 100
test Flavor B
0
Paired Unpaired Paired
Cre+ Cre+ Cre−
d 1
Pre-training e 12
n.s. f
Post-training 8
******
Flavor A pre post
Nosepokes (x1000)
n.s. Flavor B pre post
Consumption (mL)
Preference index
0.5 6
8 n.s.
0 4 ***
4 Paired Cre+
2 Unpaired Cre+
-0.5
Paired Cre−
0 0
-1 1 2 3 4
Paired Cre+ Unpaired Cre+ Paired Cre− Paired Cre+ Unpaired Cre+ Paired Cre−
n=7 n=7 n=10 n=7 n=7 n=10 ICSS training day


Fig. S3. Dopamine neuron activation does not alter preference for a paired natural reward.
(a) Experimental design for flavor preference study. Cre+ rats, or their Cre– littermates, were
exposed to distinctly-flavored sucrose solutions in their home cage to quantify initial preference
between the two flavors during a brief (10 min) pre-training test. The next day, flavor training
commenced. Flavor training sessions were conducted in operant chambers; subjects received 50
reward deliveries on a VI-30s schedule. Only one flavor was available per day; flavor A and
flavor B training days were interleaved. On days where flavor A was available, rats also received
optical stimulation (1s train, 5ms pulse, 20 Hz) of VTA DA neurons either paired or unpaired
with reward delivery. 24 hours after the last flavor training session, subjects were again
simultaneously exposed to both flavors during a brief (10 min) post-training test in their home
cage to assess flavor preference. (b) Total number of rewards (flavor A and flavor B) consumed
by each group during flavor training. There were no differences between groups or between
flavors (two-way RM ANOVA, main effect of group, flavor and group x flavor interaction
F<1.296, p>0.268). (c) Total number of stimulation trains delivered during flavor A training
sessions. Note similarity to stimulation parameters used in the blocking experiment (Fig. S2A).
PairedCre+ and UnpairedCre+ rats received more stimulation trains than PairedCre– rats (one-
way ANOVA, main effect of group F2,21=4.869, p=0.018; SNK post-hoc test PairedCre+ vs.
PairedCre– p=0.027, UnpairedCre+ vs. PairedCre– p-0.026) because of slight differences in
behavior during training sessions. (d) Pre-training and post-training flavor preference quantified
using a preference index which was calculated as follows: (mL flavor A – mL flavor B) ÷ (mL
flavor A + mL flavor B). Positive values indicate a preference for flavor A, and negative values
indicate a preference for flavor B. There were no group differences in flavor preference before or
after flavor training (two-way RM ANOVA, main effect of group, time and group x time
interaction F<0.815, p>0.453). (e) Total consumption of flavor A and flavor B during 10 min
home cage tests. There were no group differences in the consumption of either flavor before or
after flavor training (flavor A; two-way RM ANOVA, main effect of group, time and group x
time interaction F<0.137, p>0.715; flavor B; two-way RM ANOVA, main effect of group, time
and group x time interaction F<2.061, p>0.152). (f) Although no group differences were
observed in the flavor preference study, subsequent ICSS training revealed that optical
stimulation was highly reinforcing in the Cre+ rats used in these experiments (two-way RM
ANOVA, main effect of group and group x day interaction F>14.738, p<0.001; SNK post-hoc
test PairedCre+ vs. PairedCre–, p<0.001; UnpairedCre+ vs. PairedCre–, p<0.001; PairedCre+ vs.
UnpairedCre+, p=0.406). *p<0.05; ***p<0.001.


Figure S4
Fig. S4. Relative timing of reward delivery and optical stimulation for the flavor preference
study. Relative timing of flavored sucrose delivery, the rats’ presence in the reward port, and
optical stimulation for two example trials. (a) For rats assigned to Paired groups, optical
stimulation is delivered the first time the rat enters the reward port when a new reward is
available. If the rat does not maintain presence in the port for 1s or longer, indicating that the rat
has not fully consumed the reward, subsequent port entries will trigger additional stimulation
trains until the 1s requirement is met (second trial example). This is done to ensure that reward
consumption and optical stimulation are always experienced coincidently. Because of this, more
stimulation trains were delivered than rewards. (b) To ensure that UnpairedCre+ rats received
equivalent amounts of stimulation as their PairedCre+ counterparts, the number of stimulation
trains delivered to the UnpairedCre+ rats was determined using the same criteria, but instead of
being delivered during flavored sucrose consumption, the stimulation trains were delivered
during the ITI (grey line), when no sucrose solution was present.


Figure S5
a b
Paired Cre+ Unpaired Cre+ Paired Cre−
Diestrus 5 2 3
Proestrus 0 1 1
Diestrus Proestrus Estrus 3 4 4
Metestrus 3 3 2
Cycling 8 10 8
Irregular/
Noncycling 3 0 2
Estrus Metestrus
c 80 d 20
Paired Cre+ (n=11)
Unpaired Cre+ (n=10)
Paired Cre− (n=10)

60 15
40 10
20 5
0 0
Metestrus Diestrus Proestrus Estrus Metestrus Diestrus Proestrus Estrus


Fig. S5. Estrus cycle stage is not related to behavioral performance during a test session
with dopamine neuron activation. (a) Typical cytology observed during diestrus, proestrus,
estrus and metestrus from cell samples collected from the vaginal wall. Stages were classified
according to established criteria (see Methods). (b) Vaginal cytology results from all rats in the
Downshift experiment during the downshift test when optical stimulation was delivered.
Assessments of cycle regularity were based on an examination of cytology samples over five
consecutive days. Estrus cycle stage was determined by identifying cellular morphology
characteristic to each phase (diestrus, proestrus, estrus, metestrus) according to previously
described criteria48. Rats were considered to be cycling if consecutive daily samples represented
three stages out of four. If morphology consistent with diestrus, metestrus, or a combination of
both was observed for four consecutive days, the rat was considered to be non-cycling. If estrus
was observed but further samples did not indicate progression through successive stages, the
rat’s cycle was considered to be irregular. (c) Scatterplot of behavioral performance and estrus
cycle stage for all rats during the downshift test. A one-way ANOVA revealed no main effect of
estrus cycle stage on performance (pooled data from all subjects; main effect of stage,
F3,27=0.502, p=0.684). Behavioral performance was measured as percent time spent in the reward
port during the cue. (d). As in C, but with behavioral performance measured as the latency to
enter the reward port after cue onset. A one-way ANOVA revealed no main effect of estrus cycle
stage on performance (pooled data from all subjects; main effect of stage, F3,27=0.41, p=0.747).

A Causal Link Between Prediction Errors PDF

Hochgeladen von

Dokumentinformationen

Originaltitel

Copyright

Verfügbare Formate

Dieses Dokument teilen

Dokument teilen oder einbetten

Freigabeoptionen

Stufen Sie dieses Dokument als nützlich ein?

Sind diese Inhalte unangemessen?

Copyright:

Verfügbare Formate

A Causal Link Between Prediction Errors PDF

Hochgeladen von

Copyright:

Verfügbare Formate

a r t ic l e s

A causal link between prediction errors, dopamine

Received 15 January; accepted 2 May; published online 26 May 2013; doi:10.1038/nn.3413

966 VOLUME 16 | NUMBER 7 | JULY 2013 nature NEUROSCIENCE

Figure 1 Behavioral demonstration of the

presentation of cues A and X; US, unconditioned

nature NEUROSCIENCE VOLUME 16 | NUMBER 7 | JULY 2013 967

Figure 2 Dopamine neuron stimulation drives a b Paired stimulation

blocking task with optogenetics. All groups AP – 5.8 Unpaired stimulation

with sucrose delivery in Paired (Cre+ and

Percent time in port during cue

968 VOLUME 16 | NUMBER 7 | JULY 2013 nature NEUROSCIENCE

Figure 3 Dopamine neuron stimulation a Training Cue sucrose

Percent time in port during cue

during the last two training sessions. (c) Percent (n = 10)

nature NEUROSCIENCE VOLUME 16 | NUMBER 7 | JULY 2013 969

Figure 4 Dopamine neuron stimulation a Training Cue sucrose

Percent time in port during cue

P = 0.920). (d) Percent time in port during the 5

for single trials (left) and as a session average

970 VOLUME 16 | NUMBER 7 | JULY 2013 nature NEUROSCIENCE

DISCUSSION rats could reflect the development of a conditioned place preference

nature NEUROSCIENCE VOLUME 16 | NUMBER 7 | JULY 2013 971

972 VOLUME 16 | NUMBER 7 | JULY 2013 nature NEUROSCIENCE

nature NEUROSCIENCE VOLUME 16 | NUMBER 7 | JULY 2013 973

nature NEUROSCIENCE doi:10.1038/nn.3413

doi:10.1038/nn.3413 nature NEUROSCIENCE

Supplemental materials for:

A causal link between prediction errors, dopamine neurons and learning

Elizabeth E. Steinberg1-2*, Ronald Keiflin1*, Josiah R. Boivin1-2, Ilana B. Witten5, Karl

Nature Neuroscience: doi:10.1038/nn.3413

Nature Neuroscience: doi:10.1038/nn.3413

n.s. Fixed 12 n.s. 1

Active Inactive Inactive

Nature Neuroscience: doi:10.1038/nn.3413

Nature Neuroscience: doi:10.1038/nn.3413

Paired stimulation (Cre+, Cre−)

Nature Neuroscience: doi:10.1038/nn.3413

Nature Neuroscience: doi:10.1038/nn.3413

Nature Neuroscience: doi:10.1038/nn.3413

Diestrus Proestrus Estrus 3 4 4

Unpaired Cre+ (n=10)

Paired Cre− (n=10)

Nature Neuroscience: doi:10.1038/nn.3413

Nature Neuroscience: doi:10.1038/nn.3413

Das könnte Ihnen auch gefallen

Elizabeth E. Steinberg1-2, Ronald Keiflin1, Josiah R. Boivin1-2, Ilana B. Witten5, Karl