Wolfram Schultz, Peter Dayan, P. Read Montague*
W. Schultz is at the Institute of Physiology, University of Fribourg, CH-1700 Fribourg, Switzerland. E-mail: Wolfram.Schultz@unifr.ch
P. Dayan is in the Department of Brain and Cognitive Sciences, Center for Biological and Computational Learning, E-25 MIT, Cambridge, MA 02139, USA. E-mail: dayan@ai.mit.edu
P. R. Montague is in the Division of Neuroscience, Center for Theoretical Neuroscience, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA. E-mail: read@bcm.tmc.edu
http://101.96.10.63/www.gatsby.ucl.ac.uk/~dayan/papers/sdm97.pdf
Abstract
The capacity to predict future events permits a creature to detect, model, and manipulate the causal structure of its interactions with its environment. Behavioral experiments suggest that learning is driven by changes in the expectations about future salient events such as rewards and punishments. Physiological work has recently complemented these studies by identifying dopaminergic neurons in the primate whose fluctuating output apparently signals changes or errors in the predictions of future salient and rewarding events. Taken together, these findings can be understood through quantitative theories of adaptive optimizing control.