Progress in speech decoding from the electrocorticogram

  • Review Article
  • Biomedical Engineering Letters

Abstract

Recent advances in neuroimaging methods have improved our ability to explore the neurological processes underlying speech and language. As a result of these investigations, it is now possible to decode aspects of speech directly from neural activity, a step toward the development of neuroprosthetic devices for individuals with severe neuromuscular and communication disorders. Much of what is known about the neural correlates of speech articulation and perception is based on lesion and cortical electrical stimulation studies, as well as modern non-invasive neuroimaging. Though extremely important to the current understanding of brain function, traditional neuroimaging methods are limited primarily by the spatial and temporal resolution of the imaging technique. Electrocorticography (ECoG), the measurement of electrical activity from the cortex, offers several advantages over other neuroimaging modalities for characterization and real-time decoding of brain activity. Specifically, ECoG is well suited to the study of speech and language because its combination of spatial and temporal resolution allows it to accurately capture the fast-changing dynamics of the large cortical networks underlying speech processing. This review presents the current progress of ECoG-based speech characterization and decoding studies, including an overview of prior neuroimaging studies, ECoG representations of speech production and perception, and a discussion of future directions.

Author information

Corresponding author

Correspondence to Dean J. Krusienski.

About this article

Cite this article

Chakrabarti, S., Sandberg, H.M., Brumberg, J.S. et al. Progress in speech decoding from the electrocorticogram. Biomed. Eng. Lett. 5, 10–21 (2015). https://doi.org/10.1007/s13534-015-0175-1

  • DOI: https://doi.org/10.1007/s13534-015-0175-1
