A Speech Recognition System using an Auditory Model and TOM Neural Network

Hartwich, E.; Alexandre, F.

doi:10.1007/978-3-7091-6492-1_23

Abstract

This paper presents a neurobiologically plausible approach to the design of speech processing systems. The temporal organization map (TOM) neural network is a connectionist model for representing time. Its generic neural unit, inspired by the neurobiological model of the cortical column, allows the model to be applied to problems that involve a temporal dimension. In automatic speech recognition, TOM has previously been tested with conventional signal processing techniques. Here an auditory model is used as the front-end processor for TOM, in order to test the efficiency and accuracy of a physiologically based speech recognition system. Preliminary results are presented for speaker-dependent and speaker-independent speech recognition experiments. The value of the auditory model lies in the possibility of developing more useful processing and communication strategies between TOM and the front-end processor, including afferent and efferent information flow.

Author information

Authors and Affiliations

CRIN-CNRS/INRIA Lorraine, BP 239, F-54506, Vandœuvre-lès-Nancy, France
E. Hartwich & F. Alexandre

About this paper

Cite this paper

Hartwich, E., Alexandre, F. (1998). A Speech Recognition System using an Auditory Model and TOM Neural Network. In: Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-6492-1_23

DOI: https://doi.org/10.1007/978-3-7091-6492-1_23
Publisher Name: Springer, Vienna
Print ISBN: 978-3-211-83087-1
Online ISBN: 978-3-7091-6492-1
eBook Packages: Springer Book Archive
