The role of the dorsomedial striatum in instrumental conditioning
Henry H. Yin, Sean B. Ostlund, Barbara J. Knowlton and Bernard W. Balleine
European Journal of Neuroscience (2005)
doi:10.1111/j.1460-9568.2005.04218.x
The gap/question: What is the specific role of the posterior dorsomedial striatum (pDMS) in learning action-outcome associations?
How the authors arrived at the question: this study follows up on their previous findings (shared yesterday).
Brief summary: in this study, Yin and colleagues showed that excitotoxic lesions of the pDMS impaired both the acquisition and the expression of action-outcome (A-O) conditioning. The impairments could be attributed to a deficit in learning the A-O contingency, but not to an inability to discriminate between different outcomes or different actions.
This is a classic paper in both the instrumental learning and the striatum fields (with 1000 citations to date). It is an in-depth study of the specific role of the pDMS in instrumental conditioning. In their previous study (Blockade of NMDA receptors in the dorsomedial striatum prevents action–outcome learning in instrumental conditioning, shared on 2021-07-25), they found that blocking NMDA receptors in the pDMS disrupted the acquisition of A-O associations (instrumental conditioning). Notably, no mechanism was tested experimentally in that study, so I offered my own interpretation in the last sentence of that summary: ‘The deficits may more possibly be attributed to the failure in learning the A-O associations and, the failure could be caused by the insensitive detection of the reward-outcome when pDMS was inhibited.’ Based on the findings of the current study, that interpretation holds up.
Intuitively, an impairment in learning the A-O contingency could arise from at least three sources: a failure to learn the contingency itself, a failure to discriminate between distinct actions, or a failure to discriminate between different outcomes. To dissociate these possibilities, the authors used three behavioral tests: a contingency-degradation test, a reward-reinstatement test, and a heterogeneous chain of instrumental actions test.
In the study, the authors trained rats to press two levers for two distinct rewards. Lesions of the pDMS (made by injecting NMDA), but not of the anterior DMS, impaired sensitivity to reward value after devaluation, whether they were made before or after acquisition of the action-outcome (reward) associations. Next, they explored the possible mechanisms mentioned above.
In the normal A-O conditioning framework, a reward is delivered only if a specific action is performed. Under the contingency-degradation condition, a reward was delivered randomly, at a certain probability and independently of the action, which degraded the corresponding A-O contingency. Immediately after the degradation, animals were tested in extinction. Sham-treated animals made fewer choices of the action associated with the degraded reward, whereas the pDMS-lesioned group made comparable choices of the actions associated with both rewards, suggesting a failure to discriminate between the two A-O contingencies. This serves as a positive control.
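One way to make the idea of a degraded contingency concrete is the common ΔP measure, the probability of the outcome given the action minus the probability of the outcome without the action. The sketch below is only an illustration: the function name and the probability values are hypothetical and are not taken from the paper, and ΔP is not necessarily the measure the authors analyzed.

```python
def contingency(p_outcome_given_action: float,
                p_outcome_given_no_action: float) -> float:
    """Delta-P contingency: P(O | A) - P(O | no A)."""
    return p_outcome_given_action - p_outcome_given_no_action


# Hypothetical values, chosen only for illustration (not from the paper).
p_earned = 0.05  # probability that a lever press earns the reward
p_free = 0.05    # probability of a "free" reward when no press is made

# Intact contingency: the reward never arrives without the action.
intact = contingency(p_earned, 0.0)

# Degraded contingency: the same reward is equally likely without the action.
degraded = contingency(p_earned, p_free)

print(f"intact contingency:   {intact:.2f}")
print(f"degraded contingency: {degraded:.2f}")
```

With these assumed numbers, the degraded action no longer predicts its reward any better than doing nothing, which is the situation in which an animal sensitive to the contingency would be expected to reduce that action.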
To test whether the impaired acquisition of the A-O contingency was caused by insensitivity to the rewards, they used a reward-reinstatement test. On the day after the degradation test, responding for both rewards was extinguished for 20 minutes. Immediately after extinction, a single reward was delivered (the reinstatement), and responding over the next 2 minutes was assessed. Compared with the control group, which increased responding on the action leading to the reinstated reward, the pDMS-lesioned group made comparable choices of the actions leading to both rewards, implying a reduced sensitivity to the reward per se.
Finally, to test the effects of the lesions on motor control, the authors trained rats on action sequences (e.g. press the left lever and then the right lever to obtain a reward, or the opposite sequence). The lesions did not significantly affect acquisition of the action sequences, which rules out a contribution of motor deficits to the impairments in the acquisition and expression of A-O associations observed in the devaluation and degradation tests.
Taken together, based on a series of carefully designed behavioral tests, the authors concluded that the pDMS is critical for both the acquisition and the expression of A-O associations (instrumental conditioning), and that the pDMS may specifically function through the discrimination of reward outcomes.
This study is an excellent example of how scientists disentangle competing hypotheses with well-controlled experiments. It deserves an in-depth reading.