Maximum Mutual Information Estimation of Hidden Markov Models

Normandin, Yves

doi:10.1007/978-1-4613-1367-0_3

Part of the book series: The Kluwer International Series in Engineering and Computer Science ((SECS,volume 355))

Abstract

This chapter describes how maximum mutual information estimation (MMIE) can be used to improve the performance of HMM-based speech recognition systems. First, the basic MMIE concept is introduced, with some intuition on how it works. Then we show how the concept can be extended to improve the power of the basic models. Since estimating HMM parameters with MMIE training can be computationally expensive, this problem is studied at length, and some solutions are proposed and demonstrated. Experiments are presented to demonstrate the usefulness of the MMIE technique.

Author information

Authors and Affiliations

Centre de recherche informatique de Montréal (CRIM), Montréal, Québec, Canada
Yves Normandin

Editor information

Editors and Affiliations

AT&T Bell Laboratories, Murray Hill, NJ, 07974, USA
Chin-Hui Lee & Frank K. Soong
School of Microelectronic Engineering, Griffith University, Australia
Kuldip K. Paliwal

About this chapter

Cite this chapter

Normandin, Y. (1996). Maximum Mutual Information Estimation of Hidden Markov Models. In: Lee, CH., Soong, F.K., Paliwal, K.K. (eds) Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science, vol 355. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1367-0_3

DOI: https://doi.org/10.1007/978-1-4613-1367-0_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8590-8
Online ISBN: 978-1-4613-1367-0
eBook Packages: Springer Book Archive