Maximum Mutual Information Estimation of Hidden Markov Models

  • Chapter
Automatic Speech and Speaker Recognition

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 355)

Abstract

This chapter describes how maximum mutual information estimation (MMIE) can be used to improve the performance of HMM-based speech recognition systems. First, the basic MMIE concept is introduced, with some intuition on how it works. Then we show how the concept can be extended to improve the power of the basic models. Since estimating HMM parameters with MMIE training can be computationally expensive, this problem is studied at length, and some solutions are proposed and demonstrated. Experiments are presented to demonstrate the usefulness of the MMIE technique.

Copyright information

© 1996 Kluwer Academic Publishers

About this chapter

Cite this chapter

Normandin, Y. (1996). Maximum Mutual Information Estimation of Hidden Markov Models. In: Lee, CH., Soong, F.K., Paliwal, K.K. (eds) Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science, vol 355. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1367-0_3

  • DOI: https://doi.org/10.1007/978-1-4613-1367-0_3

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4612-8590-8

  • Online ISBN: 978-1-4613-1367-0

  • eBook Packages: Springer Book Archive
