The use of multilayer perceptrons in isolated word recognition

  • Applications
  • Conference paper
Artificial Neural Networks (IWANN 1991)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 540)

Abstract

Multilayer Perceptrons are the most widely used Artificial Neural Networks in Isolated Word Recognition. However, these networks do not adequately model the temporal structure of speech. In “static” classification of speech segments with Multilayer Perceptrons, the number of input nodes is fixed a priori, whereas the length of speech utterances varies. In this paper, the technique known as Trace Segmentation is explored as a way of fitting variable-length utterances to the fixed-size input layer. This technique has been studied with the conventional Multilayer Perceptron trained by back propagation, with a combination of back propagation and stochastic learning, and with other strategies such as Scaly Multilayer Perceptrons. Experimental results are reported, with performance ranging from about 80% to nearly 100% depending on the task (Spanish Digits and E-Set). These results are comparable to or better than those obtained with Hidden Markov Models or with more conventional and expensive Multilayer Perceptrons applied to the same task.
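The abstract does not spell out how Trace Segmentation is computed, so the following is only a minimal sketch of one common formulation, not the authors' implementation. It assumes an utterance is a list of per-frame feature vectors, measures the length of the trajectory ("trace") those vectors follow in feature space, and resamples it at points equally spaced along that length, interpolating linearly between frames. The function name `trace_segmentation`, the Euclidean distance, and the interpolation step are all assumptions made for illustration.

```python
import math


def trace_segmentation(frames, n_points):
    """Resample a variable-length list of feature vectors to n_points vectors
    equally spaced along the trajectory ("trace") in feature space.

    Hypothetical helper: Euclidean distance and linear interpolation are
    assumptions, not details taken from the paper.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    if not frames:
        raise ValueError("frames must not be empty")

    # Cumulative length of the trace up to each frame.
    cum = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        cum.append(cum[-1] + math.dist(prev, cur))
    total = cum[-1]

    if total == 0.0:
        # Degenerate trace (one frame, or no change at all): repeat the first frame.
        return [list(frames[0]) for _ in range(n_points)]

    out = []
    seg = 0  # index of the segment [frames[seg], frames[seg + 1]] being sampled
    for k in range(n_points):
        target = total * k / (n_points - 1)
        # Move forward until the current segment contains the target length.
        while seg < len(frames) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (target - cum[seg]) / span
        t = min(max(t, 0.0), 1.0)  # guard against rounding at the end points
        a, b = frames[seg], frames[seg + 1]
        out.append([x + t * (y - x) for x, y in zip(a, b)])
    return out


# Five frames of one feature each; the first three frames are stationary.
utterance = [[0.0], [0.0], [0.0], [1.0], [4.0]]
fixed = trace_segmentation(utterance, 5)
# Flatten to a vector sized for a fixed input layer of 5 * 1 nodes.
mlp_input = [value for vector in fixed for value in vector]
```

In this toy utterance the three identical leading frames collapse into a single sample, while the fast-changing final stretch receives most of the samples. That is the property that makes this kind of resampling attractive for a fixed-size input layer: stationary stretches contribute little, whatever their duration.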

Separately, the use of articulatory features has not been sufficiently explored in Isolated Word Recognition. In this paper we also address the problem of combining a module that exploits acoustic-articulatory relations with a conventional formalism such as a Multilayer Perceptron, for the Isolated Word Recognition task described above. Some very preliminary experiments within this framework are presented.

This work has been partially supported by CICYT under contract TIC-0048/89 and by ESPRIT (BRA) under contract 3279.

Supported by a postgraduate grant from Conselleria Cultura, Educació i Ciència of Valencia.

Editor information

Alberto Prieto

Copyright information

© 1991 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Castro, M.J., Casacuberta, F. (1991). The use of multilayer perceptrons in isolated word recognition. In: Prieto, A. (eds) Artificial Neural Networks. IWANN 1991. Lecture Notes in Computer Science, vol 540. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0035913

  • DOI: https://doi.org/10.1007/BFb0035913

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-54537-8

  • Online ISBN: 978-3-540-38460-1

  • eBook Packages: Springer Book Archive
