10.5061/DRYAD.J7H6F69
Lehmann, Marco P.
École Polytechnique Fédérale de Lausanne
Xu, He A.
École Polytechnique Fédérale de Lausanne
Liakoni, Vasiliki
École Polytechnique Fédérale de Lausanne
Herzog, Michael H.
École Polytechnique Fédérale de Lausanne
Gerstner, Wulfram
École Polytechnique Fédérale de Lausanne
Preuschoff, Kerstin
0000-0001-7254-833X
University of Geneva
Data from: One-shot learning and behavioral eligibility traces in
sequential decision making
Dryad
dataset
2019
human
2019-11-11T00:00:00Z
2019-11-11T00:00:00Z
en
https://doi.org/10.7554/eLife.47463
166794306 bytes
4
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
In many daily tasks we make multiple decisions before reaching a goal. In
order to learn such sequences of decisions, a mechanism to link earlier
actions to later reward is necessary. Reinforcement learning theory
suggests two classes of algorithms for solving this credit assignment problem:
In classic temporal-difference learning, earlier actions receive reward
information only after multiple repetitions of the task, whereas models
with eligibility traces reinforce entire sequences of actions from a
single experience (one-shot). Here we show one-shot learning of sequences.
We developed a novel paradigm to directly observe which actions and states
along a multi-step sequence are reinforced after a single reward. By
focusing our analysis on those states for which RL with and without
eligibility traces makes qualitatively distinct predictions, we find direct
behavioral (choice probability) and physiological (pupil dilation)
signatures of reinforcement learning with eligibility traces across
multiple sensory modalities.
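The contrast drawn in the abstract can be illustrated with a minimal sketch (not the authors' analysis code): tabular TD(lambda) on a short state chain with a single reward at the end. With lambda = 1, one episode propagates credit back to every visited state (one-shot learning of the sequence); with lambda = 0 (classic temporal-difference learning), only the state immediately preceding the reward is updated, so earlier states need repeated episodes to receive reward information. The chain length and learning parameters are illustrative assumptions.

```python
def td_lambda_episode(n_states=5, lam=1.0, alpha=0.5, gamma=1.0):
    """One episode of tabular TD(lambda) on a deterministic chain
    0 -> 1 -> ... -> terminal, with reward 1 on reaching the terminal state."""
    V = [0.0] * (n_states + 1)   # state values; V[n_states] is terminal (stays 0)
    e = [0.0] * (n_states + 1)   # eligibility traces, one per state
    for s in range(n_states):
        s_next = s + 1
        r = 1.0 if s_next == n_states else 0.0
        delta = r + gamma * V[s_next] - V[s]   # TD error
        e[s] += 1.0                            # accumulating trace for visited state
        for i in range(n_states):
            V[i] += alpha * delta * e[i]       # all eligible states share the credit
            e[i] *= gamma * lam                # traces decay by gamma * lambda
    return V[:n_states]

V_trace = td_lambda_episode(lam=1.0)  # eligibility traces: every visited state updated
V_td0 = td_lambda_episode(lam=0.0)    # classic TD(0): only the last state updated
```

After a single episode, `V_trace` has non-zero values along the whole chain, whereas `V_td0` is non-zero only for the final pre-reward state, which is the qualitative distinction the paradigm exploits.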
Behaviour and Pupil Dilation during Human Learning
Pupil dilations and behavioral data recorded during a sequential decision making task. Matlab (mat) and text (csv) files.
DataDryadUpload.zip