Abstract
Multilayer Perceptrons are the most widely used Artificial Neural Networks in Isolated Word Recognition. However, these networks do not adequately model the temporal structure of speech. In “static” classification of speech segments with Multilayer Perceptrons, the number of input nodes is fixed a priori, whereas the length of speech utterances varies. In this paper, the technique called Trace Segmentation is explored as a way to match the variable length of utterances to the fixed size of the input layer. This technique has been studied with the conventional Multilayer Perceptron trained with back propagation, with a combination of back propagation and stochastic learning, and with other strategies such as Scaly Multilayer Perceptrons. Experimental results are reported, with performance ranging from about 80% to nearly 100%, depending on the task (Spanish Digits and E-Set). These results are comparable to or better than those obtained with Hidden Markov Models or with more conventional and more expensive Multilayer Perceptrons applied to the same tasks.
On the other hand, the use of articulatory features has not been sufficiently explored in Isolated Word Recognition. This paper therefore also addresses the problem of combining a module that exploits acoustic-articulatory relations with a conventional formalism, such as a Multilayer Perceptron, for the Isolated Word Recognition task described above. Some very preliminary experiments within this framework are presented.
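The Trace Segmentation step named in the abstract can be illustrated with a minimal sketch. It assumes the common formulation of the technique: an utterance is a sequence of feature vectors, and a fixed number of vectors is picked at equal distances along the path (the "trace") that the sequence draws in feature space. The function name `trace_segmentation`, the use of Euclidean distance, and the linear interpolation between frames are assumptions of this sketch, not details taken from the paper.

```python
import math


def trace_segmentation(frames, n_out):
    """Resample a variable-length list of feature vectors to n_out vectors
    spaced evenly along the trace they draw in feature space.

    Hypothetical helper: Euclidean distance and linear interpolation are
    assumptions of this sketch, not the authors' exact procedure.
    """
    if n_out < 2 or len(frames) < 2:
        raise ValueError("need at least two frames and two output points")

    # Cumulative length of the trace at each input frame.
    cum = [0.0]
    for prev, cur in zip(frames, frames[1:]):
        cum.append(cum[-1] + math.dist(prev, cur))
    total = cum[-1]

    if total == 0.0:
        # Degenerate case: every frame is identical.
        return [list(frames[0]) for _ in range(n_out)]

    out = []
    j = 0  # index of the segment [j, j + 1] that contains the target
    for k in range(n_out):
        target = total * k / (n_out - 1)
        while j < len(frames) - 2 and cum[j + 1] < target:
            j += 1
        seg = cum[j + 1] - cum[j]
        t = 0.0 if seg == 0.0 else (target - cum[j]) / seg
        t = min(max(t, 0.0), 1.0)  # guard against rounding error
        out.append([a + t * (b - a) for a, b in zip(frames[j], frames[j + 1])])
    return out


# Four input frames, with a stationary stretch between frames 2 and 3.
utterance = [[0.0, 0.0], [1.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
fixed = trace_segmentation(utterance, 4)
```

Under these assumptions the output always has `n_out` vectors whatever the utterance length, so the flattened result can feed an input layer whose size is fixed a priori. Stationary stretches of the signal contribute no trace length and are therefore compressed.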
This work has been partially supported by CICYT under contract TIC-0048/89 and by ESPRIT (BRA) under contract 3279.
Supported by a postgraduate grant from Conselleria Cultura, Educació i Ciència of Valencia.
© 1991 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Castro, M.J., Casacuberta, F. (1991). The use of multilayer perceptrons in isolated word recognition. In: Prieto, A. (eds) Artificial Neural Networks. IWANN 1991. Lecture Notes in Computer Science, vol 540. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0035913
DOI: https://doi.org/10.1007/BFb0035913
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-54537-8
Online ISBN: 978-3-540-38460-1
eBook Packages: Springer Book Archive