Communication Between Speech Production and Perception Within the Brain—Observation and Simulation

Dang, Jianwu; Akagi, Masato; Honda, Kiyoshi

doi:10.1007/s11390-006-0095-8

Communication Between Speech Production and Perception Within the Brain—Observation and Simulation

Published: January 2006

Volume 21, pages 95–105, (2006)
Cite this article

Journal of Computer Science and Technology Aims and scope Submit manuscript

Jianwu Dang^1,2,
Masato Akagi¹ &
Kiyoshi Honda²

76 Accesses
1 Citation
Explore all metrics

Abstract

Realization of an intelligent human-machine interface requires us to investigate human mechanisms and learn from them. This study focuses on communication between speech production and perception within human brain and realizing it in an artificial system. A physiological research study based on electromyographic signals (Honda, 1996) suggested that speech communication in human brain might be based on a topological mapping between speech production and perception, according to an analogous topology between motor and sensory representations. Following this hypothesis, this study first investigated the topologies of the vowel system across the motor, kinematic, and acoustic spaces by means of a model simulation, and then examined the linkage between vowel production and perception in terms of a transformed auditory feedback (TAF) experiment. The model simulation indicated that there exists an invariant mapping from muscle activations (motor space) to articulations (kinematic space) via a coordinate consisting of force-dependent equilibrium positions, and the mapping from the motor space to kinematic space is unique. The motor-kinematic-acoustic deduction in the model simulation showed that the topologies were compatible from one space to another. In the TAF experiment, vowel production exhibited a compensatory response for a perturbation in the feedback sound. This implied that vowel production is controlled in reference to perception monitoring.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Neural Network Dynamics and Audiovisual Integration

Auditory and somatosensory feedback mechanisms of laryngeal and articulatory speech motor control

Article 23 June 2022

Differentiation of Speech and Language Functional Systems and Analysis of the Differences in Related Neural Networks

Article 23 June 2023

References

Denes P, Pinson E. The Speech Chain. 2nd Edition, New York: W.H. Freeman and Co. 1993.
Google Scholar
Lombard E. Le signe de l'elevation de la voix. Annales Maladies Oreilles Larynx Nez Pharynx, 1911, 37: 101–119.
Google Scholar
Lee B S. Effects of delayed speech feedback. J. Acoust. Soc. Ame., 1950, 22: 824–826.
Google Scholar
Kawahara H. Interactions between speech production and perception under auditory feedback perturbations on fundamental frequencies. J. Acoust. Soc. Jpn., 1994, 15(3): 201–202.
Google Scholar
Liberman A M, Cooper F S, Shankweiler D P, Studdert-Kennedy M. Perception of the speech code. Psych. Rev., 1967, 74(6): 431–461.
Google Scholar
Liberman A M, Mattingly I G. The motor theory of speech perception revised. Cognition, 1985, 21: 1–36.
Article Google Scholar
Savariaux C, Perrier P, Orliaguet J. Compensation strategies for the perturbation of the rounded vowel [u] using a lip tube: A study of the control space in speech production. J. Acoust. Soc. Ame., 1995, 98(5): 2428–2442.
Google Scholar
Honda M, Fujino A, Kaburagi T. Compensatory responses of articulators to unexpected perturbation of the palate shape. J. Phonetics, 2002, 30: 281–302.
Google Scholar
Nota Y, Honda K. Brain regions involved in control of speech. Acoust. Sci. & Tech., 2004, 25(4): 286–289.
Article Google Scholar
Sakai K L, Homae F, Hashimoto R, Suzuki K. Functional imaging of the human temporal cortex during auditory sentence processing. Am. Lab., 2002, 34: 34–40.
Google Scholar
Honda K. Organization of tongue articulation for vowels, J. Phonetics, 1996, 24: 39–52.
Google Scholar
Dang J, Honda K. Construction and control of a physiological articulatory model. J. Acoust. Soc. Ame., 2004, 115(2): 853–870.
Google Scholar
Baer T, Alfonso J, Honda K. Electromyography of the tongue muscle during vowels in /∂pvp/environment. Ann. Bull. R. I. L. P., Univ. Tokyo, 1988, 7: 7–18.
Google Scholar
Maeda S. Compensatory Articulation During Speech: Evidence from the Analysis of Vocal Tract Shapes Using an Articulatory Model. Hardcastle, Marchal Speech Production and Speech Modeling, Dordrecht: Kluwer Academic Publishers, 1990, pp.131–149.
Google Scholar
Carré R, Mrayati M. Articulatory-Acoustic-Phonetic Relations and Modeling, Regions and Modes. Speech Production and Speech Modeling, Hardcastle W, Marchal A (eds.), Netherland: Kluwer Academic Publishers, 1990, pp.211–240.
Google Scholar
Honda K, Kusakawa N. Compatibility between auditory and articulatory representations of vowels. Acta Otolaryngol. (Stockh), Suppl., 532: 103–105.
Niimi S, Kumada M, Niitsu M. Functions of tongue-related muscles during production of the five Japanese vowels. Ann. Bull. R. I. L. P., Univ. Tokyo, 1994, 28: 33–40.
Google Scholar
Stone M, Davis E, Douglas A, Ness Aiver M, Gullapalli R, Levine W, Lundberg A. Modeling motion of the internal tongue from tagged cine—MRI images. J. Acoust. Soc. Am., 2001, 109(6): 2974–2982.
Article Google Scholar
Dang J, Honda K. Estimation of vocal tract shape from sounds via a physiological articulatory model. J. Phonetics, 2002, 30: 511–532.
Google Scholar
Houde J, Jordan M. Sensorimotor adaptation in speech production. Science, 1998, 279(5354): 1213–1216.
Article Google Scholar
Callan E, Kent D, Guenther H, Vorperian K. An auditory-feedback-based neural network model of speech production that is robust to developmental changes in the size and shape of the articulatory system. Journal of Speech, Language, and Hearing Research, 2000, 43: 721–736.
Google Scholar
Purcell D, Johnsrude I, Munhall K. Perception of altered formant feedback influences speech production. In Proc. ISCA Workshop on Plasticity in Speech Perception, London, UK, 2005, pp.15–17.
Masaki S, Honda K. Estimation of temporal processing unit of speech motor programming for Japanese words based on the measurement of reaction time. In Proc. ICSLP 94, Yokohama Japan, 1994, pp.663–666.

Download references

Author information

Authors and Affiliations

Japan Advanced Institute of Science and Technology, Ishikawa, Japan
Jianwu Dang & Masato Akagi
ATR Human Information Science Laboratories, Kyoto, Japan
Jianwu Dang & Kiyoshi Honda

Authors

Jianwu Dang
View author publications
You can also search for this author in PubMed Google Scholar
Masato Akagi
View author publications
You can also search for this author in PubMed Google Scholar
Kiyoshi Honda
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jianwu Dang.

Additional information

This research has been supported in part by the National Institute of Information and Communications Technology and in part by a Grant-in-Aid for Scientific Research of Japan (Grant No. 16300053).

Jianwu Dang received his B.E. and M.S. degrees from Tsinghua Univ., China, in 1982 and 1984, respectively. He worked for Tianjin University as a lecturer from 1984 to 1988. He was awarded the Ph.D. Eng. from Shizuoka Univ., Japan in 1992. Dr. Dang worked for ATR Human Information Processing Lab., Japan from 1992 to 2001. He joined the University of Waterloo, Canada, as a visiting scholar for one year in 1998. He has been with the Japan Advanced Institute of Science and Technology (JAIST) since 2001, where he is a professor. He joined the Institute of Communication Parlee, Center of National Research Scientific (CNRS), France, as a research scientist the first class for one year in 2002. His research interests are in all of the fields of speech science, especially in speech production. He is a member of the Acoustic Societies of America and Japan, and also a member of the Institute of Electronics, Information and Communication Engineers.

Masato Akagi received the B.E. degree in electronic engineering from Nagoya Institute of Technology in 1979, and the M.E. and the Ph.D. Eng. degrees in computer science from Tokyo Institute of Technology in 1981 and 1984. In 1984, he joined the Electrical Communication Laboratory, Nippon Telegraph and Telephone Corporation (NTT). From 1986 to 1990, he worked at the ATR Auditory and Visual Perception Research Laboratories. Since 1992, he has been with the School of Information Science, JAIST, where he is currently a professor. His research interests include speech perception mechanisms of humans, and speech signal processing. Dr. Akagi received the IEICE Excellent Paper Award from the IEICE in 1987, and the Sato Prize for Outstanding Paper from the ASJ in 1998.

Kiyoshi Honda graduated from Nara Medical University in 1976 and joined the Faculty of Medicine at the University of Tokyo to work in the voice clinic and conduct speech research. He was also a visiting scholar at Haskins Laboratory, New Haven, for three years from 1981. He was awarded a Ph.D. degree in medical science in 1985. Dr. Honda moved to Kanazawa Institute of Technology in 1986 and continued speech research as an associate professor. He then moved to ATR in 1991 to be the supervisor of the Auditory and Visual Processing Research Laboratories and Human Information Processing Research Laboratories. He was also a senior scientist in the field at the University of Wisconsin for three years from 1995. Currently he is the head of Department of Biophysical Imaging (speech production group) at ATR Human Information Science Laboratories. His research work focuses on speech science and physiological experimental phonetics using MRI to investigate the form and function of the speech organs.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Dang, J., Akagi, M. & Honda, K. Communication Between Speech Production and Perception Within the Brain—Observation and Simulation. J Comput Sci Technol 21, 95–105 (2006). https://doi.org/10.1007/s11390-006-0095-8

Download citation

Received: 24 February 2005
Accepted: 29 August 2005
Issue Date: January 2006
DOI: https://doi.org/10.1007/s11390-006-0095-8

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Communication Between Speech Production and Perception Within the Brain—Observation and Simulation

Abstract

Access this article

Similar content being viewed by others

Neural Network Dynamics and Audiovisual Integration

Auditory and somatosensory feedback mechanisms of laryngeal and articulatory speech motor control

Differentiation of Speech and Language Functional Systems and Analysis of the Differences in Related Neural Networks

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Keywords

Navigation

Communication Between Speech Production and Perception Within the Brain—Observation and Simulation

Abstract

Access this article

Similar content being viewed by others

Neural Network Dynamics and Audiovisual Integration

Auditory and somatosensory feedback mechanisms of laryngeal and articulatory speech motor control

Differentiation of Speech and Language Functional Systems and Analysis of the Differences in Related Neural Networks

References

Author information

Authors and Affiliations

Corresponding author

Additional information

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Navigation