Automatic speech recognition with neural networks: Beyond nonparametric models

  • Part II The Quest of Perceptual Primitives
  • Chapter
Intelligent Perceptual Systems

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 745)

Abstract

In the last few years, connectionist models have been applied to many perceptual tasks. Considerable effort has focused on speech recognition in particular, in an attempt to explore the remarkable learning potential of these models. In this paper we briefly review the most successful approaches to speech recognition in order to assess their actual contribution to the field. A detailed analysis of the problems encountered in speech recognition allows us to identify some “desiderata” to be met in building competitive models. One of the most important goals is an effective model of the time dimension of speech. Moreover, many proposed connectionist models turn out to be severely limited by their inherently nonparametric structure, which makes many tasks very hard to learn. We suggest methods for introducing prior knowledge into recurrent networks and briefly discuss how they can learn more effectively in the presence of “structured tasks”.

Editor information

Vito Roberto

Copyright information

© 1993 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Frasconi, P., Gori, M., Soda, G. (1993). Automatic speech recognition with neural networks: Beyond nonparametric models. In: Roberto, V. (ed.) Intelligent Perceptual Systems. Lecture Notes in Computer Science, vol 745. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57379-8_6

  • DOI: https://doi.org/10.1007/3-540-57379-8_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-57379-1

  • Online ISBN: 978-3-540-48103-4

  • eBook Packages: Springer Book Archive
