Automatic speech recognition with neural networks: Beyond nonparametric models

Frasconi, Paolo; Gori, Marco; Soda, Giovanni

doi:10.1007/3-540-57379-8_6

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 745)

Abstract

In the last few years, different connectionist models have been applied to many perceptual tasks. Much effort has focused in particular on speech recognition tasks, in an attempt to explore the remarkable learning potential of connectionist models. In this paper we briefly review the most successful approaches to speech recognition in an attempt to assess their actual contribution to the field. A detailed analysis of the problems found in speech recognition allows us to identify some “desiderata” to be met when building challenging models. One of the most important goals is an effective model of the time dimension of speech. Moreover, many proposed connectionist models turn out to be severely limited by their inherent nonparametric structure, which makes learning many tasks very hard. We suggest methods for introducing prior knowledge into recurrent networks and briefly discuss how they can learn more effectively in the presence of “structured tasks”.

Author information

Authors and Affiliations

Dipartimento di Sistemi e Informatica, Via di Santa Marta 3, 50139, Firenze, Italy
Paolo Frasconi, Marco Gori & Giovanni Soda

Editor information

Vito Roberto

About this chapter

Cite this chapter

Frasconi, P., Gori, M., Soda, G. (1993). Automatic speech recognition with neural networks: Beyond nonparametric models. In: Roberto, V. (eds) Intelligent Perceptual Systems. Lecture Notes in Computer Science, vol 745. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57379-8_6

DOI: https://doi.org/10.1007/3-540-57379-8_6
Published: 30 May 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57379-1
Online ISBN: 978-3-540-48103-4
eBook Packages: Springer Book Archive
