Electrophysiological measures reveal the role of anterior cingulate cortex in learning from unreliable feedback
Although a growing number of studies have investigated the neural mechanisms of reinforcement learning, it remains unclear how the brain responds to feedback that is unreliable. A recent theory proposes that the reward positivity (RewP) component of the event-related brain potential (ERP) and frontal midline theta (FMT) power reflect separate feedback-related processing functions of anterior cingulate cortex (ACC). In the present study, the electroencephalogram (EEG) was recorded from participants as they engaged in a time estimation task in which feedback reliability was manipulated across conditions. After each response, they received a cue indicating that the following feedback stimulus was 100%, 75%, or 50% reliable. The results showed that participants adjusted their time estimates linearly according to feedback reliability. Moreover, presentation of the cue indicating 100% reliability elicited a larger RewP-like ERP component than the other cues did, whereas feedback presentation elicited a RewP of approximately equal amplitude in all three reliability conditions. By contrast, FMT power elicited by negative feedback decreased linearly from the 100% condition to the 75% and 50% conditions, and only FMT power predicted behavioral adjustments on the following trials. In addition, an analysis of beta power and cross-frequency coupling (CFC) of beta power with FMT phase suggested that beta-FMT communication modulated motor areas for the purpose of adjusting behavior. We interpret these findings in terms of the hierarchical reinforcement learning account of ACC, in which the RewP and FMT are proposed to reflect reward processing and control functions of ACC, respectively.
Keywords: Frontal midline theta; Reward positivity; Anterior cingulate cortex; Feedback reliability
This study was supported by the National Natural Science Foundation of China (31671158 & 31671150), the (Key) Project of DEGP (2015WTSCX094), the Shenzhen Peacock Plan (grant no. KQTD2015033016104926), and the Youth Project of Humanities and Social Sciences of Shenzhen University (16QNFC51).