Abstract
A critical component of the pattern matching approach to speech recognition is the training algorithm, which aims to produce typical (reference) patterns or models for accurate pattern comparison. In this chapter, we discuss speech recognizer training from a broad perspective rooted in classical Bayes decision theory. We distinguish classifier design by distribution estimation from classifier design by discriminative training, motivated by the fact that in many realistic applications, such as speech recognition, the form of the real signal distribution is rarely known precisely. We argue that traditional methods relying on distribution estimation are suboptimal when the assumed distribution form is not the true one, and that “optimality” in distribution estimation does not automatically translate into “optimality” in classifier design. We compare the two methods in the context of hidden Markov modeling for speech recognition, and we show the superiority of the discriminative method over the distribution estimation method with the results of several key speech recognition experiments.
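The argument that an optimal distribution estimate need not give an optimal classifier can be illustrated with a toy one-dimensional problem. The sketch below is not the chapter's algorithm or data: the two classes, the shared-variance Gaussian assumption, and the names `class_a`, `class_b`, and `train_threshold` are all assumptions made for illustration. It compares a maximum-likelihood Gaussian fit, where the assumed form is wrong for one class, with a threshold trained by gradient descent on a sigmoid-smoothed error count, in the spirit of minimum-error discriminative training.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Hypothetical toy data. Class A lies on [0, 1]. Class B lies mostly on
# [1, 2] with a small cluster on [9, 10], so a single Gaussian is the
# wrong distribution form for class B.
class_a = [(i + 0.5) / 100 for i in range(100)]
class_b = ([1 + (i + 0.5) / 90 for i in range(90)]
           + [9 + (i + 0.5) / 10 for i in range(10)])

def error_rate(threshold):
    # Decision rule: x > threshold is labeled B, otherwise A.
    errors = (sum(x > threshold for x in class_a)
              + sum(x <= threshold for x in class_b))
    return errors / (len(class_a) + len(class_b))

# Distribution estimation: maximum-likelihood fit of one Gaussian per class
# with a shared variance and equal priors (an assumed model). The plug-in
# decision boundary is then the midpoint of the two sample means,
# whatever value the shared variance takes.
mean_a = sum(class_a) / len(class_a)
mean_b = sum(class_b) / len(class_b)
ml_threshold = 0.5 * (mean_a + mean_b)

def train_threshold(start, gamma=10.0, lr=0.05, steps=500):
    # Discriminative training (illustrative): gradient descent on the
    # average of sigmoid(gamma * d), where d > 0 means the sample is
    # on the wrong side of the threshold t.
    t = start
    n = len(class_a) + len(class_b)
    for _ in range(steps):
        grad = 0.0
        for x in class_a:
            s = sigmoid(gamma * (x - t))   # smoothed error of an A sample
            grad -= gamma * s * (1.0 - s)
        for x in class_b:
            s = sigmoid(gamma * (t - x))   # smoothed error of a B sample
            grad += gamma * s * (1.0 - s)
        t -= lr * grad / n
    return t

disc_threshold = train_threshold(ml_threshold)
ml_error = error_rate(ml_threshold)
disc_error = error_rate(disc_threshold)

print(f"ML threshold:             {ml_threshold:.3f}  error: {ml_error:.3f}")
print(f"Discriminative threshold: {disc_threshold:.3f}  error: {disc_error:.3f}")
```

In this construction the distant cluster pulls the estimated mean of class B to 2.3, so the Gaussian fit places the boundary at 1.4 and misclassifies 36 of the 200 samples (18%). The smoothed-error objective gives almost no weight to samples far from the boundary, so the trained threshold moves back toward the gap between the classes near 1.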
Copyright information
© 1996 Kluwer Academic Publishers
About this chapter
Cite this chapter
Juang, BH., Chou, W., Lee, CH. (1996). Statistical and Discriminative Methods for Speech Recognition. In: Lee, CH., Soong, F.K., Paliwal, K.K. (eds) Automatic Speech and Speaker Recognition. The Kluwer International Series in Engineering and Computer Science, vol 355. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1367-0_5
DOI: https://doi.org/10.1007/978-1-4613-1367-0_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8590-8
Online ISBN: 978-1-4613-1367-0
eBook Packages: Springer Book Archive