Communication Between Speech Production and Perception Within the Brain—Observation and Simulation

Journal of Computer Science and Technology

Abstract

Realizing an intelligent human-machine interface requires investigating human mechanisms and learning from them. This study focuses on communication between speech production and perception within the human brain and on realizing it in an artificial system. A physiological study based on electromyographic signals (Honda, 1996) suggested that speech communication in the human brain might rest on a topological mapping between speech production and perception, owing to an analogous topology between motor and sensory representations. Following this hypothesis, this study first investigated the topologies of the vowel system across the motor, kinematic, and acoustic spaces by means of a model simulation, and then examined the linkage between vowel production and perception by means of a transformed auditory feedback (TAF) experiment. The model simulation indicated that there is an invariant mapping from muscle activations (motor space) to articulations (kinematic space) via a coordinate system consisting of force-dependent equilibrium positions, and that the mapping from the motor space to the kinematic space is unique. The motor-kinematic-acoustic deduction in the model simulation showed that the topologies were compatible from one space to another. In the TAF experiment, vowel production exhibited a compensatory response to a perturbation in the feedback sound. This implies that vowel production is controlled with reference to perceptual monitoring.
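The compensatory response reported for the TAF experiment can be illustrated with a toy closed-loop simulation. This is a minimal sketch and not the authors' model: the function name `simulate_taf`, the single formant value, the shift size, and the correction gain are all assumptions chosen for illustration. The speaker hears a shifted version of the produced formant and corrects production against the perceived error. This toy loop converges to full compensation, which real speakers need not do.

```python
def simulate_taf(target_hz=700.0, shift_hz=100.0, gain=0.3, steps=30):
    """Toy model of one transformed auditory feedback (TAF) session.

    All parameter values are hypothetical; they are not taken from the paper.
    Returns the produced formant frequency after each utterance.
    """
    produced = target_hz  # the speaker starts by producing the intended formant
    history = []
    for _ in range(steps):
        heard = produced + shift_hz   # feedback is perturbed before reaching the ear
        error = target_hz - heard     # mismatch between intended and perceived sound
        produced += gain * error      # production moves against the perceived error
        history.append(produced)
    return history


if __name__ == "__main__":
    trace = simulate_taf()
    print(f"first utterance: {trace[0]:.1f} Hz, last utterance: {trace[-1]:.1f} Hz")
```

With an upward shift in the feedback, the produced value drifts downward until the perceived sound matches the target again, which is the direction of compensation the abstract describes.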

References

  1. Denes P, Pinson E. The Speech Chain. 2nd Edition, New York: W.H. Freeman and Co. 1993.

  2. Lombard E. Le signe de l'elevation de la voix. Annales Maladies Oreilles Larynx Nez Pharynx, 1911, 37: 101–119.

  3. Lee B S. Effects of delayed speech feedback. J. Acoust. Soc. Am., 1950, 22: 824–826.

  4. Kawahara H. Interactions between speech production and perception under auditory feedback perturbations on fundamental frequencies. J. Acoust. Soc. Jpn., 1994, 15(3): 201–202.

  5. Liberman A M, Cooper F S, Shankweiler D P, Studdert-Kennedy M. Perception of the speech code. Psych. Rev., 1967, 74(6): 431–461.

  6. Liberman A M, Mattingly I G. The motor theory of speech perception revised. Cognition, 1985, 21: 1–36.

  7. Savariaux C, Perrier P, Orliaguet J. Compensation strategies for the perturbation of the rounded vowel [u] using a lip tube: A study of the control space in speech production. J. Acoust. Soc. Am., 1995, 98(5): 2428–2442.

  8. Honda M, Fujino A, Kaburagi T. Compensatory responses of articulators to unexpected perturbation of the palate shape. J. Phonetics, 2002, 30: 281–302.

  9. Nota Y, Honda K. Brain regions involved in control of speech. Acoust. Sci. & Tech., 2004, 25(4): 286–289.

  10. Sakai K L, Homae F, Hashimoto R, Suzuki K. Functional imaging of the human temporal cortex during auditory sentence processing. Am. Lab., 2002, 34: 34–40.

  11. Honda K. Organization of tongue articulation for vowels. J. Phonetics, 1996, 24: 39–52.

  12. Dang J, Honda K. Construction and control of a physiological articulatory model. J. Acoust. Soc. Am., 2004, 115(2): 853–870.

  13. Baer T, Alfonso J, Honda K. Electromyography of the tongue muscle during vowels in /əpVp/ environment. Ann. Bull. R. I. L. P., Univ. Tokyo, 1988, 7: 7–18.

  14. Maeda S. Compensatory Articulation During Speech: Evidence from the Analysis of Vocal Tract Shapes Using an Articulatory Model. Speech Production and Speech Modeling, Hardcastle W, Marchal A (eds.), Dordrecht: Kluwer Academic Publishers, 1990, pp.131–149.

  15. Carré R, Mrayati M. Articulatory-Acoustic-Phonetic Relations and Modeling, Regions and Modes. Speech Production and Speech Modeling, Hardcastle W, Marchal A (eds.), Netherlands: Kluwer Academic Publishers, 1990, pp.211–240.

  16. Honda K, Kusakawa N. Compatibility between auditory and articulatory representations of vowels. Acta Otolaryngol. (Stockh), Suppl., 532: 103–105.

  17. Niimi S, Kumada M, Niitsu M. Functions of tongue-related muscles during production of the five Japanese vowels. Ann. Bull. R. I. L. P., Univ. Tokyo, 1994, 28: 33–40.

  18. Stone M, Davis E, Douglas A, NessAiver M, Gullapalli R, Levine W, Lundberg A. Modeling motion of the internal tongue from tagged cine-MRI images. J. Acoust. Soc. Am., 2001, 109(6): 2974–2982.

  19. Dang J, Honda K. Estimation of vocal tract shape from sounds via a physiological articulatory model. J. Phonetics, 2002, 30: 511–532.

  20. Houde J, Jordan M. Sensorimotor adaptation in speech production. Science, 1998, 279(5354): 1213–1216.

  21. Callan E, Kent D, Guenther H, Vorperian K. An auditory-feedback-based neural network model of speech production that is robust to developmental changes in the size and shape of the articulatory system. Journal of Speech, Language, and Hearing Research, 2000, 43: 721–736.

  22. Purcell D, Johnsrude I, Munhall K. Perception of altered formant feedback influences speech production. In Proc. ISCA Workshop on Plasticity in Speech Perception, London, UK, 2005, pp.15–17.

  23. Masaki S, Honda K. Estimation of temporal processing unit of speech motor programming for Japanese words based on the measurement of reaction time. In Proc. ICSLP 94, Yokohama Japan, 1994, pp.663–666.

Author information

Corresponding author

Correspondence to Jianwu Dang.

Additional information

This research has been supported in part by the National Institute of Information and Communications Technology and in part by a Grant-in-Aid for Scientific Research of Japan (Grant No. 16300053).

Jianwu Dang received his B.E. and M.S. degrees from Tsinghua Univ., China, in 1982 and 1984, respectively. He worked for Tianjin University as a lecturer from 1984 to 1988. He was awarded the Ph.D. Eng. from Shizuoka Univ., Japan, in 1992. Dr. Dang worked for ATR Human Information Processing Lab., Japan, from 1992 to 2001. He joined the University of Waterloo, Canada, as a visiting scholar for one year in 1998. He has been with the Japan Advanced Institute of Science and Technology (JAIST) since 2001, where he is a professor. He joined the Institut de la Communication Parlée (ICP), Centre National de la Recherche Scientifique (CNRS), France, as a first-class research scientist for one year in 2002. His research interests span all fields of speech science, especially speech production. He is a member of the Acoustical Societies of America and Japan, and also a member of the Institute of Electronics, Information and Communication Engineers.

Masato Akagi received the B.E. degree in electronic engineering from Nagoya Institute of Technology in 1979, and the M.E. and Ph.D. Eng. degrees in computer science from Tokyo Institute of Technology in 1981 and 1984, respectively. In 1984, he joined the Electrical Communication Laboratory, Nippon Telegraph and Telephone Corporation (NTT). From 1986 to 1990, he worked at the ATR Auditory and Visual Perception Research Laboratories. Since 1992, he has been with the School of Information Science, JAIST, where he is currently a professor. His research interests include human speech perception mechanisms and speech signal processing. Dr. Akagi received the IEICE Excellent Paper Award from the IEICE in 1987, and the Sato Prize for Outstanding Paper from the ASJ in 1998.

Kiyoshi Honda graduated from Nara Medical University in 1976 and joined the Faculty of Medicine at the University of Tokyo to work in the voice clinic and conduct speech research. He was also a visiting scholar at Haskins Laboratories, New Haven, for three years from 1981. He was awarded a Ph.D. degree in medical science in 1985. Dr. Honda moved to Kanazawa Institute of Technology in 1986 and continued speech research as an associate professor. He then moved to ATR in 1991 to be the supervisor of the Auditory and Visual Processing Research Laboratories and Human Information Processing Research Laboratories. He was also a senior scientist in the field at the University of Wisconsin for three years from 1995. Currently he is the head of the Department of Biophysical Imaging (speech production group) at ATR Human Information Science Laboratories. His research focuses on speech science and physiological experimental phonetics, using MRI to investigate the form and function of the speech organs.

About this article

Cite this article

Dang, J., Akagi, M. & Honda, K. Communication Between Speech Production and Perception Within the Brain—Observation and Simulation. J Comput Sci Technol 21, 95–105 (2006). https://doi.org/10.1007/s11390-006-0095-8

  • DOI: https://doi.org/10.1007/s11390-006-0095-8
