A Speech Recognition System using an Auditory Model and TOM Neural Network

  • Conference paper
  • In: Artificial Neural Nets and Genetic Algorithms

Abstract

This paper presents a neurobiologically plausible approach to the design of speech processing systems. The temporal organization map (TOM) neural network is a connectionist model for representing time. Its generic neural unit, inspired by the neurobiological model of the cortical column, allows the model to be applied to problems with a temporal dimension. In automatic speech recognition, TOM has previously been tested with conventional signal processing techniques. Here, an auditory model is used as the front-end processor for TOM, in order to test the efficiency and accuracy of a physiologically based speech recognition system. Preliminary results are presented for speaker-dependent and speaker-independent speech recognition experiments. The value of the auditory model lies in the possibility of developing richer processing and communication strategies between TOM and the front-end processor, including afferent and efferent information flow.

© 1998 Springer-Verlag Wien

About this paper

Cite this paper

Hartwich, E., Alexandre, F. (1998). A Speech Recognition System using an Auditory Model and TOM Neural Network. In: Artificial Neural Nets and Genetic Algorithms. Springer, Vienna. https://doi.org/10.1007/978-3-7091-6492-1_23

  • DOI: https://doi.org/10.1007/978-3-7091-6492-1_23

  • Publisher Name: Springer, Vienna

  • Print ISBN: 978-3-211-83087-1

  • Online ISBN: 978-3-7091-6492-1

  • eBook Packages: Springer Book Archive
