Statistical and Discriminative Methods for Speech Recognition

Juang, B.-H.; Chou, W.; Lee, C.-H.

doi:10.1007/978-1-4613-1367-0_5

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 355)

Abstract

A critical component of the pattern-matching approach to speech recognition is the training algorithm, which aims to produce typical (reference) patterns or models for accurate pattern comparison. In this chapter, we discuss speech recognizer training from a broad perspective rooted in classical Bayes decision theory. We distinguish classifier design by distribution estimation from discriminative training, motivated by the fact that in many realistic applications, such as speech recognition, the form of the true signal distribution is rarely known precisely. We argue that traditional methods relying on distribution estimation are suboptimal when the assumed distribution form is not the true one, and that “optimality” in distribution estimation does not automatically translate into “optimality” in classifier design. We compare the two methods in the context of hidden Markov modeling for speech recognition, and we show the superiority of the discriminative method over the distribution estimation method through the results of several key speech recognition experiments.
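The gap between "optimal" distribution estimation and "optimal" classifier design can be illustrated with a toy sketch. It is not the chapter's hidden Markov model setup or its training algorithm: the one-dimensional data, the sample size, and the names `error_rate`, `ml_threshold`, and `disc_threshold` are all assumptions made for illustration. One class is drawn from an exponential distribution, so the assumed equal-variance Gaussian form is wrong, and the maximum likelihood fit is compared with a threshold chosen by directly minimizing the empirical error.

```python
import random

rng = random.Random(0)
n = 500

# Class 0 is skewed (exponential) and class 1 is Gaussian, so the
# equal-variance Gaussian form assumed below does not match the data.
x0 = [rng.expovariate(1.0) for _ in range(n)]   # class 0 samples
x1 = [rng.gauss(3.0, 0.5) for _ in range(n)]    # class 1 samples


def error_rate(threshold):
    """Empirical error of the rule: decide class 1 when x > threshold."""
    errors = sum(x > threshold for x in x0) + sum(x <= threshold for x in x1)
    return errors / (2 * n)


# Distribution estimation: maximum likelihood fit of one Gaussian per class
# with a shared variance and equal priors. The Bayes rule for that assumed
# model is a threshold at the midpoint of the two estimated class means.
ml_threshold = 0.5 * (sum(x0) / n + sum(x1) / n)

# Discriminative alternative (a stand-in, not the chapter's method): choose
# the threshold that minimizes the empirical error directly. The error only
# changes at sample values, so trying every sample plus -inf covers every
# achievable error count for this decision rule.
candidates = [float("-inf")] + x0 + x1
disc_threshold = min(candidates, key=error_rate)

ml_error = error_rate(ml_threshold)
disc_error = error_rate(disc_threshold)
print(f"ML threshold {ml_threshold:.3f}, training error {ml_error:.4f}")
print(f"discriminative threshold {disc_threshold:.3f}, training error {disc_error:.4f}")
```

On the training samples the directly optimized threshold can never do worse than the midpoint threshold, because the search covers every error count a threshold rule can achieve. How large the gap is depends on how far the true distributions depart from the assumed form, and a held-out set would be needed to judge generalization.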

Author information

Authors and Affiliations

Speech Research Department, AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ, USA, 07974
B.-H. Juang, W. Chou & C.-H. Lee

Editor information

Editors and Affiliations

AT&T Bell Laboratories, Murray Hill, NJ, 07974, USA
Chin-Hui Lee & Frank K. Soong
School of Microelectronic Engineering, Griffith University, Australia
Kuldip K. Paliwal

About this chapter

Cite this chapter

Juang, B.-H., Chou, W., Lee, C.-H. (1996). Statistical and Discriminative Methods for Speech Recognition. In: Lee, C.-H., Soong, F.K., Paliwal, K.K. (eds) Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science, vol 355. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1367-0_5

DOI: https://doi.org/10.1007/978-1-4613-1367-0_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8590-8
Online ISBN: 978-1-4613-1367-0
eBook Packages: Springer Book Archive
