Abstract
The aim of this work is to integrate segments of words into a category-based language model. Two models of this kind are proposed. In addition, an interpolation of a category-based model with a classical word-based language model is studied. The models were integrated into an automatic speech recognition (ASR) system and evaluated in terms of word error rate (WER). Experiments on a corpus of spontaneous dialogues in Spanish are reported. They show that integrating word segments into a category-based language model improves the performance of the model.
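As an illustration of the evaluation metric only, the sketch below computes WER in its standard form: the word-level edit distance (substitutions, deletions and insertions) between a reference transcription and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the example sentences are hypothetical and do not come from the paper, whose own scoring setup is not described in the abstract.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / number of reference words.

    Illustrative sketch; whitespace tokenization is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Made-up example: one deletion ("un") and one substitution ("a" -> "para").
    print(word_error_rate("quiero un billete a madrid",
                          "quiero billete para madrid"))
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.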
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Justo, R., Torres, M.I. (2007). Word Segments in Category-Based Language Models for Automatic Speech Recognition. In: Martí, J., Benedí, J.M., Mendonça, A.M., Serrat, J. (eds) Pattern Recognition and Image Analysis. IbPRIA 2007. Lecture Notes in Computer Science, vol 4477. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72847-4_33
DOI: https://doi.org/10.1007/978-3-540-72847-4_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72846-7
Online ISBN: 978-3-540-72847-4
eBook Packages: Computer Science (R0)