A New Word Clustering Method for Building N-Gram Language Models in Continuous Speech Recognition Systems

  • Mohammad Bahrani
  • Hossein Sameti
  • Nazila Hafezi
  • Saeedeh Momtazi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5027)

Abstract

This paper presents a new method for automatic word clustering. We used the method to build n-gram language models for Persian continuous speech recognition (CSR) systems. Each word is represented by a feature vector that holds the statistics of the parts of speech (POS) of that word, and the feature vectors are clustered with the k-means algorithm. The method reduces time complexity, which is a drawback of other automatic clustering methods. It also alleviates the high perplexity associated with manual clustering methods. The experiments are based on the "Persian Text Corpus", which contains about 9 million words. The extracted language models are evaluated by the perplexity criterion, and the results show a considerable reduction in perplexity. The word error rate of the CSR system is also reduced by about 16% compared with a manual clustering method.
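The clustering idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact features, distance measure, number of clusters, or initialization, so everything below is an assumption: the feature vector is taken to be the relative frequency of each POS tag for a word, the distance is squared Euclidean, and the k-means variant is plain Lloyd's algorithm with seeded random initialization. The function names (`pos_feature_vectors`, `kmeans`) and the toy English corpus are hypothetical and only stand in for the Persian data.

```python
import random
from collections import Counter, defaultdict


def pos_feature_vectors(tagged_corpus, tagset):
    """Map each word to its relative POS-tag frequencies, in tagset order."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1
    vectors = {}
    for word, tag_counts in counts.items():
        total = sum(tag_counts.values())
        vectors[word] = [tag_counts[t] / total for t in tagset]
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain k-means (Lloyd's algorithm); returns {word: cluster index}."""
    words = sorted(vectors)
    rng = random.Random(seed)
    # Assumed initialization: k distinct words chosen at random as centroids.
    centroids = [list(vectors[w]) for w in rng.sample(words, k)]
    assignment = {}
    for _ in range(iterations):
        # Assignment step: each word goes to its nearest centroid.
        new_assignment = {
            w: min(range(k), key=lambda c: squared_distance(vectors[w], centroids[c]))
            for w in words
        }
        if new_assignment == assignment:
            break  # converged: no word changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [vectors[w] for w in words if assignment[w] == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Toy (word, tag) pairs; a hypothetical stand-in for a POS-tagged corpus.
tagset = ["N", "V", "ADJ"]
tagged_corpus = [
    ("book", "N"), ("book", "N"), ("book", "V"),
    ("table", "N"),
    ("run", "V"), ("run", "V"), ("run", "V"), ("run", "N"),
    ("eat", "V"),
    ("red", "ADJ"), ("big", "ADJ"),
]

vectors = pos_feature_vectors(tagged_corpus, tagset)
clusters = kmeans(vectors, k=3)
for word in sorted(clusters):
    print(word, clusters[word], vectors[word])
```

In a class n-gram model, the resulting cluster indices would play the role of word classes. Words with identical POS distributions (here "red" and "big") always receive the same class, whereas the grouping of the remaining words depends on the initial centroids.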

Keywords

Class n-gram Models, Continuous Speech Recognition, Part Of Speech, Persian Text Corpus, Word Clustering

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Mohammad Bahrani (1)
  • Hossein Sameti (1)
  • Nazila Hafezi (1)
  • Saeedeh Momtazi (1)
  1. Speech Processing Lab, Computer Engineering Department, Sharif University of Technology, Tehran, Iran
