A New Word Clustering Method for Building N-Gram Language Models in Continuous Speech Recognition Systems
In this paper, a new method for automatic word clustering is presented and used to build n-gram language models for Persian continuous speech recognition (CSR) systems. Each word is represented by a feature vector that captures the statistics of the parts of speech (POS) of that word, and the feature vectors are clustered with the k-means algorithm. The method reduces the time complexity that is a drawback of other automatic clustering methods, and it alleviates the high perplexity of manual clustering methods. The experiments are based on the "Persian Text Corpus", which contains about 9 million words. The extracted language models are evaluated by the perplexity criterion, and the results show a considerable reduction in perplexity. The word error rate of the CSR system is also reduced by about 16% compared with a manual clustering method.
Keywords: Class n-gram Models, Continuous Speech Recognition, Part Of Speech, Persian Text Corpus, Word Clustering
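The clustering idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy tagged corpus, the two-tag set, the relative-frequency features, the function names, and the deterministic k-means initialization are all assumptions made for illustration. Each word gets a vector of relative POS frequencies, and the vectors are grouped with a plain k-means loop.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (word, POS tag) pairs standing in for a tagged corpus.
TAGGED = (
    [("book", "N")] * 3 + [("book", "V")]
    + [("table", "N")] * 4
    + [("run", "V")] * 3 + [("run", "N")]
    + [("eat", "V")] * 4
)
TAGS = ["N", "V"]  # assumed tag set; a real POS tag set is much larger


def pos_feature_vectors(tagged, tags):
    """Map each word to its vector of relative POS frequencies."""
    counts = defaultdict(Counter)
    for word, tag in tagged:
        counts[word][tag] += 1
    vectors = {}
    for word, tag_counts in counts.items():
        total = sum(tag_counts.values())
        vectors[word] = [tag_counts[t] / total for t in tags]
    return vectors


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def cluster_words(tagged, tags, k):
    """Return a dict mapping each word to a cluster index."""
    vectors = pos_feature_vectors(tagged, tags)
    words = sorted(vectors)  # fixed order keeps the result reproducible
    labels = kmeans([vectors[w] for w in words], k)
    return dict(zip(words, labels))


vectors = pos_feature_vectors(TAGGED, TAGS)
clusters = cluster_words(TAGGED, TAGS, k=2)
```

In this toy setting the mostly-noun words and the mostly-verb words end up in separate clusters. The resulting word-to-class map is the kind of input a class n-gram model would be trained on; the vector length equals the number of POS tags, which keeps each distance computation cheap regardless of vocabulary size.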