The use of multilayer perceptrons in isolated word recognition

Castro, M. J.; Casacuberta, F.

doi:10.1007/BFb0035913


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 540)

Included in the following conference series:

International Workshop on Artificial Neural Networks

Abstract

Multilayer Perceptrons are the most widely used Artificial Neural Networks in Isolated Word Recognition. However, these networks do not adequately model the temporal structure of speech. In “static” classification of speech segments with Multilayer Perceptrons, the number of input nodes is fixed a priori, whereas the length of speech utterances varies. In this paper, the technique called Trace Segmentation is explored as a way to fit variable-length utterances to the fixed-size input layer. The technique has been studied with the conventional Multilayer Perceptron trained by back propagation, with a combination of back propagation and stochastic learning, and with other strategies such as Scaly Multilayer Perceptrons. Experimental results are reported, with performance ranging from about 80% to nearly 100%, depending on the task (Spanish Digits and E-Set). These results are comparable to or better than those obtained with Hidden Markov Models or with more conventional and expensive Multilayer Perceptrons applied to the same task.

On the other hand, the use of articulatory features has not been sufficiently explored in Isolated Word Recognition. In this paper we also deal with the problem of combining a module that exploits acoustic-articulatory relations with a conventional formalism such as a Multilayer Perceptron for an Isolated Word Recognition task as explained above. Some very preliminary experiments are presented under this framework.
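
As an illustration of the Trace Segmentation idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes the common formulation of the technique: the utterance is treated as a trajectory (the "trace") of feature vectors, and a fixed number of vectors is taken at equal distances along that trajectory. The function name `trace_segmentation`, the Euclidean distance, and the linear interpolation between frames are all assumptions made for this example.

```python
import math


def trace_segmentation(frames, n_points):
    """Resample a variable-length list of feature vectors to n_points
    vectors spaced evenly along the trajectory in feature space.

    Hypothetical helper: Euclidean distance and linear interpolation
    are assumptions, not details taken from the paper.
    """
    if n_points < 2 or not frames:
        raise ValueError("need at least one frame and n_points >= 2")

    # Cumulative length of the trace up to each frame.
    cum = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        cum.append(cum[-1] + math.dist(prev, cur))
    total = cum[-1]

    if total == 0.0:
        # Stationary input: every output vector equals the first frame.
        return [[float(x) for x in frames[0]] for _ in range(n_points)]

    out = []
    j = 0  # index of the frame pair (j, j + 1) that brackets the target
    for k in range(n_points):
        target = total * k / (n_points - 1)
        while j < len(frames) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0.0 else (target - cum[j]) / seg
        t = min(max(t, 0.0), 1.0)  # guard against rounding error
        out.append([a + t * (b - a) for a, b in zip(frames[j], frames[j + 1])])
    return out


# A 4-frame, 1-dimensional "utterance" reduced to 3 vectors.
fixed = trace_segmentation([[0.0], [1.0], [2.0], [4.0]], 3)
print(fixed)
```

Under these assumptions, the output always has `n_points` vectors regardless of the utterance length, so the flattened result could feed an input layer of `n_points` times the feature dimension nodes. Stationary stretches of the signal contribute no trace length and are therefore compressed.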

This work has been partially supported by CICYT, under contract TIC-0048/89, and by ESPRIT (BRA), under contract 3279.

Supported by a postgraduate grant from Conselleria Cultura, Educació i Ciència of Valencia.

Author information

Authors and Affiliations

Departamento de Sistemas Informáticos y Computación, Universidad Politécnica Valencia, Spain
M. J. Castro & F. Casacuberta

Editor information

Alberto Prieto

About this paper

Cite this paper

Castro, M.J., Casacuberta, F. (1991). The use of multilayer perceptrons in isolated word recognition. In: Prieto, A. (eds) Artificial Neural Networks. IWANN 1991. Lecture Notes in Computer Science, vol 540. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0035913

DOI: https://doi.org/10.1007/BFb0035913
Published: 22 June 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-54537-8
Online ISBN: 978-3-540-38460-1
eBook Packages: Springer Book Archive
