Gershman SJ., Lak A.

Limits on information processing capacity impose limits on task performance. We show that male and female mice achieve performance on a perceptual decision task that is near-optimal given their capacity limits, as measured by policy complexity (the mutual information between states and actions). This behavioral profile could be achieved by reinforcement learning with a penalty on high-complexity policies, realized through modulation of dopaminergic learning signals. In support of this hypothesis, we find that policy complexity suppresses midbrain dopamine responses to reward outcomes. Furthermore, neural and behavioral reward sensitivity were positively correlated across sessions. Our results suggest that policy compression shapes basic mechanisms of reinforcement learning in the brain.
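
The abstract defines policy complexity as the mutual information between states and actions. To make that quantity concrete, the following is a minimal sketch that assumes discrete states and actions with known probabilities. The function name `policy_complexity` and the example numbers are hypothetical and are not taken from the authors' analysis; estimating the quantity from behavioral data would need an estimation step that is not shown here.

```python
import math

def policy_complexity(state_probs, policy):
    """Mutual information I(S;A), in bits, between states and actions.

    state_probs: list of P(s), one entry per state.
    policy: list of rows, where policy[s][a] = pi(a|s).
    """
    n_actions = len(policy[0])
    # Marginal action distribution: P(a) = sum_s P(s) * pi(a|s)
    action_probs = [
        sum(p_s * row[a] for p_s, row in zip(state_probs, policy))
        for a in range(n_actions)
    ]
    mi = 0.0
    for p_s, row in zip(state_probs, policy):
        for a, p_a_given_s in enumerate(row):
            # Terms with zero probability contribute nothing to the sum.
            if p_s > 0 and p_a_given_s > 0:
                mi += p_s * p_a_given_s * math.log2(p_a_given_s / action_probs[a])
    return mi

# Two equally likely states and two actions (hypothetical numbers).
uniform_states = [0.5, 0.5]
state_independent = [[0.5, 0.5], [0.5, 0.5]]  # same action distribution in every state
deterministic = [[1.0, 0.0], [0.0, 1.0]]      # a distinct action for each state

print(policy_complexity(uniform_states, state_independent))  # 0.0
print(policy_complexity(uniform_states, deterministic))      # 1.0
```

A policy that ignores the state has zero complexity, while a policy that maps each of two equally likely states to its own action carries one bit. A capacity limit in this sense is an upper bound on that value.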

Policy Complexity Suppresses Dopamine Responses.
