Efficient Model Evaluation

  • Gernot A. Fink
Part of the Advances in Computer Vision and Pattern Recognition book series (ACVPR)

Abstract

The algorithms for the evaluation and decoding of HMMs or n-gram models presented so far are only the basic methods for handling these models. To achieve the efficiency required in practical applications, these methods have to be extended and modified so that as many “unnecessary” computations as possible are avoided. This can be achieved by suitably reorganizing the data structures involved or by explicitly discarding “less promising” solutions early in the evaluation process.
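The idea of discarding “less promising” solutions early can be illustrated with a beam-pruned variant of Viterbi decoding. The following is a minimal sketch, not the chapter's own formulation: the function name `viterbi_beam`, the log-domain list representation, and the toy two-state model are assumptions made for illustration only. At every time step, states whose score falls more than `beam_width` below the current best score are deactivated and are not expanded further.

```python
import math

NEG_INF = float("-inf")


def viterbi_beam(log_start, log_trans, log_emit, observations, beam_width):
    """Viterbi decoding in the log domain with beam pruning (illustrative sketch).

    log_start[j]    : log probability of starting in state j
    log_trans[i][j] : log probability of a transition from state i to state j
    log_emit[j][o]  : log probability of emitting symbol o in state j
    beam_width      : states scoring more than this many log units below the
                      best state of the current time step are not expanded
    Returns (best_log_score, best_state_sequence).
    """
    n_states = len(log_start)
    scores = [log_start[j] + log_emit[j][observations[0]] for j in range(n_states)]
    backpointers = []

    for obs in observations[1:]:
        best = max(scores)
        # Pruning step: keep only states inside the beam around the best score.
        active = [i for i in range(n_states)
                  if scores[i] > NEG_INF and scores[i] >= best - beam_width]
        if not active:
            raise ValueError("no active state left; the observation has zero probability")

        new_scores = [NEG_INF] * n_states
        pointers = [-1] * n_states
        for i in active:  # only active states propagate their scores
            for j in range(n_states):
                candidate = scores[i] + log_trans[i][j]
                if candidate > new_scores[j]:
                    new_scores[j] = candidate
                    pointers[j] = i
        for j in range(n_states):
            if new_scores[j] > NEG_INF:
                new_scores[j] += log_emit[j][obs]

        scores = new_scores
        backpointers.append(pointers)

    # Backtracking from the best final state.
    last = max(range(n_states), key=lambda j: scores[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return scores[last], path


# Hypothetical toy model: two states, three discrete output symbols.
START = [math.log(0.6), math.log(0.4)]
TRANS = [[math.log(0.7), math.log(0.3)],
         [math.log(0.4), math.log(0.6)]]
EMIT = [[math.log(0.5), math.log(0.4), math.log(0.1)],
        [math.log(0.1), math.log(0.3), math.log(0.6)]]

exact_score, exact_path = viterbi_beam(START, TRANS, EMIT, [0, 1, 2], float("inf"))
pruned_score, pruned_path = viterbi_beam(START, TRANS, EMIT, [0, 1, 2], math.log(5.0))
```

With an infinite beam the procedure reduces to ordinary Viterbi decoding. With a finite beam the result is no longer guaranteed to be optimal, which is the usual trade-off between accuracy and the number of states that have to be evaluated.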

This chapter gives an overview of the most important methods for the efficient evaluation of Markov models. First, methods for speeding up the computation of output probability densities on the basis of mixture models are presented. Then the standard method for the efficient application of Viterbi decoding to larger HMMs is described. The following section presents techniques for efficiently generating the first-best segmentation result as well as alternative solutions organized in the form of so-called n-best lists. Subsequently, methods are explained that apply search space pruning to accelerate the parameter training of HMMs. The chapter concludes with a section on tree-like model structures, which can be used in both HMMs and n-gram models to increase the efficiency of processing these models.
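The benefit of a tree-like model structure can be sketched with a small prefix tree over a lexicon. This is only an illustration under stated assumptions: the class `PrefixTreeNode`, the helper functions, and the four-word lexicon are hypothetical, and single characters stand in for the sub-word units that would normally be modeled by HMMs. Words that share a prefix share the corresponding nodes, so the shared part has to be evaluated only once.

```python
class PrefixTreeNode:
    """One node of the prefix tree; each node stands for one sub-word unit."""

    def __init__(self):
        self.children = {}  # unit label -> child node
        self.words = []     # words that end at this node


def build_prefix_tree(lexicon):
    """Build a prefix tree from a mapping word -> sequence of units."""
    root = PrefixTreeNode()
    for word, units in lexicon.items():
        node = root
        for unit in units:
            # Reuse an existing child if another word already has this prefix.
            node = node.children.setdefault(unit, PrefixTreeNode())
        node.words.append(word)
    return root


def count_nodes(node):
    """Number of unit nodes below the given node (the root itself is not counted)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


# Hypothetical lexicon; characters are used as stand-ins for sub-word units.
LEXICON = {
    "speech": list("speech"),
    "speed": list("speed"),
    "speak": list("speak"),
    "spin": list("spin"),
}

tree = build_prefix_tree(LEXICON)
linear_units = sum(len(units) for units in LEXICON.values())  # one model chain per word
tree_units = count_nodes(tree)                                # shared prefixes merged
```

For this toy lexicon the separate word models contain 20 units in total, whereas the prefix tree needs only 11 nodes. A consequence of the sharing is that the identity of a word is known only once a node with a non-empty `words` list is reached.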

Keywords

Viterbi Algorithm, Partial Path, Word Model, Prefix Tree, Segmentation Hypothesis
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


Copyright information

© Springer-Verlag London 2014

Authors and Affiliations

  • Gernot A. Fink, Department of Computer Science, TU Dortmund University, Dortmund, Germany
