
Longitudinal tracking of long-term learning behaviour and striatal dopamine revealed that dopamine teaching signals shape individuals’ diverse yet systematic learning trajectories, captured mathematically by a deep neural network.

The brain’s dopamine signals play important roles in learning. However, we know little about how they govern diverse learning across individuals over long periods of time. New research from the Lak Lab, published today in Cell, set out to investigate the role of dopamine in long-term learning.


The study, led by Samuel Liebana, a DPhil student in the Lak Lab, together with several Lak Lab researchers and computational collaborators Andrew Saxe and Rafal Bogacz, shows that long-term learning involves dopamine teaching signals that shape individual learning trajectories. ‘The findings could have future implications for understanding individual differences in learning, revealing the mechanisms of disorders involving dopamine such as addiction - and this will enable us to further AI’, commented Samuel Liebana. ‘Our study shows how small differences in dopamine signals early in learning lead to large differences in the behaviour of individuals later on in learning, a finding that helped us to formulate new theories of dopamine function’, added Armin Lak, senior author of the paper.

Learning over long time periods

Long-term learning defines many aspects of our life, from our career to our hobbies. However, it has been difficult to study in a lab setting because it requires long periods of time. The researchers overcame this by designing a learning paradigm for mice that was simple enough to be learned over multiple days, yet complex enough to reveal diversity in learning. The task involved showing mice a visual stimulus on either the left or right half of a screen. Over a few weeks, the mice learned to use their front paws to turn a wheel that moved the visual stimulus to the centre of the screen, earning a reward.

Understanding diversity in learning

The team found considerable diversity in behaviour across mice, particularly early in learning when the mice were moving the wheel without using the visual stimuli. Some mice were moving mostly left, others mostly right, and some in a more balanced fashion.

The diverse behaviour early in learning predicted the mice's strategies later in learning, when they were able to perform the task accurately. Animals that were initially right-biased developed a strategy in which they responded only to the stimulus on the right side of the screen and never on the left. Conversely, left-biased animals used only the left stimulus. The balanced animals, which moved the wheel both left and right early on, developed a strategy using both left and right stimuli.

Investigating the role of dopamine

The team imaged dopamine release across the dorsal striatum. They demonstrated that dopamine contributes to long-term learning by updating associations between stimuli and actions. However, the common description of dopamine as encoding a 'total' reward prediction error (the difference between actual and predicted reward) did not capture the patterns of dopamine they observed. Instead, dopamine acted as a ‘partial’ reward prediction error: each signal reflected, and updated associations in proportion to, the difference between the actual reward and a reward prediction based on the partial cortical information available to that striatal circuit. To test whether dopamine plays a causal role in driving learning, the researchers used optogenetics to activate and silence dopamine neurons, confirming the partial reward prediction error hypothesis.
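The distinction between the two error signals can be illustrated with a minimal numerical sketch. This is not the paper's model: the four-feature setup, the uniform rewards, and all variable names are illustrative assumptions. The point is only that a circuit with access to a subset of inputs computes its prediction error, and updates its weights, from that subset alone.

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 4             # full set of cortical inputs (assumed)
w = np.zeros(n_features)   # stimulus-action association weights

# This toy circuit only 'sees' a subset of the cortical inputs.
partial_idx = np.array([0, 1])

alpha = 0.1  # learning rate

for trial in range(500):
    x = rng.random(n_features)   # stimulus features on this trial
    reward = x.sum()             # reward depends on ALL features

    # 'Total' RPE: reward minus a prediction using every input.
    total_rpe = reward - w @ x

    # 'Partial' RPE: reward minus a prediction from only the inputs
    # this circuit has access to.
    partial_rpe = reward - w[partial_idx] @ x[partial_idx]

    # The circuit updates only the weights it can see, in proportion
    # to its partial error.
    w[partial_idx] += alpha * partial_rpe * x[partial_idx]

# Weights outside the circuit's view (w[2], w[3]) are never updated,
# while the visible weights grow to absorb the unexplained reward.
```

Because part of the reward is driven by features the circuit cannot see, its visible weights overshoot the values a full-information learner would settle on, which is exactly why a partial error signal behaves differently from a total one.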

Using neural networks to understand long-term learning rules

Finally, the team used a deep neural network model to capture the mice's learning trajectories. The network was trained with teaching signals that mimicked the recorded dopamine and thus updated parameters based on partial reward prediction errors. The model explained the mice's diverse progression through strategies, as well as why mice often remained stuck using certain strategies.
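A much-simplified sketch of this idea: a two-layer linear network trained by gradient descent on a left/right stimulus task, where a small initial bias toward one input pathway changes which side of the task is mastered first. This is a generic deep-linear-network toy, not the paper's model; the bias size, learning rate, and initialisation are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

X = np.eye(2)   # inputs: left stimulus, right stimulus
Y = np.eye(2)   # targets: turn-left action, turn-right action

def train(bias, steps=2000, lr=0.1):
    """Train W2 @ W1 on the task; return per-side error over training."""
    W1 = 0.01 * rng.standard_normal((2, 2))
    W2 = 0.01 * rng.standard_normal((2, 2))
    W1[:, 1] += bias            # small initial bias toward the 'right' pathway
    errs = []
    for _ in range(steps):
        pred = W2 @ W1 @ X
        err = pred - Y
        # Track the error on each stimulus side separately.
        errs.append(np.abs(err).sum(axis=0))
        # Gradient descent on the squared error, layer by layer.
        W2 -= lr * err @ (W1 @ X).T
        W1 -= lr * W2.T @ err @ X.T
    return np.array(errs)

errs = train(bias=0.2)
# In this toy setup, the initially biased (right) pathway tends to reduce
# its error earlier in training than the unbiased (left) pathway, even
# though both sides are eventually learned.
```

Deep linear networks like this also exhibit plateau-then-drop learning curves, which is the qualitative feature the press release alludes to when it says mice often remained stuck using certain strategies before progressing.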


Read the paper here