Abstract
In the last few years, connectionist models have been applied to many perceptual tasks. Much effort has focused on speech recognition tasks in particular, in an attempt to explore the remarkable potential learning capabilities of connectionist models. In this paper we briefly review the most successful approaches to speech recognition in order to assess their actual contribution to the field. A detailed analysis of the problems found in speech recognition allows us to identify some “desiderata” that challenging models must meet. One of the most important goals is an effective model of the time dimension of speech. Moreover, many proposed connectionist models turn out to be severely limited by their inherently nonparametric structure, which makes many tasks very hard to learn. We suggest methods for introducing prior knowledge into recurrent networks and briefly discuss how they can learn more effectively in the presence of “structured tasks”.
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
Cite this chapter
Frasconi, P., Gori, M., Soda, G. (1993). Automatic speech recognition with neural networks: Beyond nonparametric models. In: Roberto, V. (ed.) Intelligent Perceptual Systems. Lecture Notes in Computer Science, vol 745. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-57379-8_6
DOI: https://doi.org/10.1007/3-540-57379-8_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-57379-1
Online ISBN: 978-3-540-48103-4
eBook Packages: Springer Book Archive