Electrophysiological measures reveal the role of anterior cingulate cortex in learning from unreliable feedback

Article

Abstract

Although a growing number of studies have investigated the neural mechanisms of reinforcement learning, it remains unclear how the brain responds to unreliable feedback. A recent theory proposes that the reward positivity (RewP) component of the event-related brain potential (ERP) and frontal midline theta (FMT) power reflect separate feedback-related processing functions of anterior cingulate cortex (ACC). In the present study, the electroencephalogram (EEG) was recorded from participants as they performed a time estimation task in which feedback reliability was manipulated across conditions. After each response, participants received a cue indicating whether the upcoming feedback stimulus was 100%, 75%, or 50% reliable. Participants’ time estimates adjusted linearly with feedback reliability. Moreover, the cue indicating 100% reliability elicited a larger RewP-like ERP component than the other cues did, whereas feedback presentation elicited a RewP of approximately equal amplitude in all three reliability conditions. By contrast, FMT power elicited by negative feedback decreased linearly from the 100% condition to the 75% and 50% conditions, and only FMT power predicted behavioral adjustments on the following trials. In addition, an analysis of beta power and of cross-frequency coupling (CFC) between beta power and FMT phase suggested that beta-FMT communication modulated motor areas for the purpose of adjusting behavior. We interpreted these findings in terms of the hierarchical reinforcement learning account of ACC, in which the RewP and FMT are proposed to reflect the reward processing and control functions of ACC, respectively.
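As a rough illustration of what coupling between beta power and FMT phase means, the sketch below computes a normalized mean vector length, one common phase-amplitude coupling index. It is not taken from the study: the abstract does not state which CFC measure was used, and the function names, the synthetic data, and all parameter values (6 Hz theta, 500 Hz sampling rate, coupling strength) are assumptions made for illustration. With real EEG, the phase and amplitude series would first have to be extracted, for example by band-pass filtering followed by a Hilbert or wavelet transform, which this sketch skips.

```python
import cmath
import math
import random


def mean_vector_length(phase, amplitude):
    """Normalized mean vector length: |mean(A * exp(i*phi))| / mean(A).

    Values near 0 mean the amplitude does not depend on phase;
    larger values mean the amplitude is concentrated at some phase.
    """
    if len(phase) != len(amplitude) or not phase:
        raise ValueError("phase and amplitude must be non-empty and equal length")
    vec = sum(a * cmath.exp(1j * p) for p, a in zip(phase, amplitude)) / len(phase)
    return abs(vec) / (sum(amplitude) / len(amplitude))


def synthetic_trial(coupling, n_samples=1000, theta_hz=6.0, srate=500.0, seed=0):
    """Hypothetical data: a theta phase series and a beta envelope tied to it."""
    rng = random.Random(seed)
    phase, amplitude = [], []
    for k in range(n_samples):
        phi = (2 * math.pi * theta_hz * k / srate) % (2 * math.pi)
        # Envelope peaks at theta phase 0 when coupling > 0, plus small noise.
        env = 1.0 + coupling * math.cos(phi) + 0.05 * rng.random()
        phase.append(phi)
        amplitude.append(env)
    return phase, amplitude


coupled = mean_vector_length(*synthetic_trial(coupling=0.8))
uncoupled = mean_vector_length(*synthetic_trial(coupling=0.0))
print(f"coupled MVL = {coupled:.3f}, uncoupled MVL = {uncoupled:.3f}")
```

In this toy setting the coupled series yields a clearly larger index than the uncoupled one. In practice such an index is usually compared against a surrogate distribution (for example, from shuffled amplitude series) before any coupling is inferred.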

Keywords

Frontal midline theta, Reward positivity, Anterior cingulate cortex, Feedback reliability

Notes

Acknowledgements

This study was supported by the National Natural Science Foundation of China (31671158 & 31671150), the (Key) Project of DEGP (2015WTSCX094), the Shenzhen Peacock Plan (KQTD2015033016104926), and the Youth Project of Humanities and Social Sciences of Shenzhen University (16QNFC51).

Supplementary material

ESM 1: 13415_2018_615_MOESM1_ESM.docx (DOCX, 457 kb)

Copyright information

© Psychonomic Society, Inc. 2018

Authors and Affiliations

  • Peng Li (1, 2)
  • Weiwei Peng (1)
  • Hong Li (1, 2, 3)
  • Clay B. Holroyd (4)

  1. Brain Function and Psychological Science Research Center, Shenzhen University, Shenzhen, China
  2. Shenzhen Key Laboratory of Affective and Social Cognitive Science, Shenzhen University, Shenzhen, China
  3. Center for Language and Brain, Shenzhen Institute of Neuroscience, Shenzhen, China
  4. Department of Psychology, University of Victoria, Victoria, Canada