Abstract
Motivated by applications to automatic speech recognition, we study nongaussian parametric density estimation for the classification of acoustic feature vectors, which are known to have very high dimensionality, in a maximum likelihood framework. EM-type algorithms are used to estimate the parameters of a mixture model with nongaussian component densities. We report our experience with these techniques in the context of large-vocabulary continuous speech recognition; the experimental results indicate that nongaussian mixture component densities model speech data more effectively for this classification task. We also comment on the convergence of the iterative estimation algorithms developed.
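The chapter's own estimation algorithms are not reproduced on this page. As an illustrative sketch only, an EM-type update for a mixture with a nongaussian component family might look like the following; the choice of the Laplace (double-exponential) density, the univariate setting, and all function names are assumptions for this example, not the authors' method.

```python
import numpy as np

def laplace_pdf(x, mu, b):
    """Univariate Laplace (double-exponential) density."""
    return np.exp(-np.abs(x - mu) / b) / (2.0 * b)

def em_laplace_mixture(x, n_components=2, n_iter=100):
    """EM for a mixture of univariate Laplace densities.

    The M-step uses the Laplace maximum likelihood estimates:
    a weighted median for each location parameter and a weighted
    mean absolute deviation for each scale parameter.
    """
    # Deterministic initialisation: spread locations over quantiles of the data
    mu = np.quantile(x, np.linspace(0.25, 0.75, n_components))
    b = np.full(n_components, np.std(x))
    w = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = np.stack([w[k] * laplace_pdf(x, mu[k], b[k])
                         for k in range(n_components)], axis=1)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: mixture weights, weighted medians, weighted scales
        w = resp.mean(axis=0)
        order = np.argsort(x)
        for k in range(n_components):
            cum = np.cumsum(resp[order, k])
            mu[k] = x[order][np.searchsorted(cum, 0.5 * cum[-1])]
            b[k] = np.average(np.abs(x - mu[k]), weights=resp[:, k])
    return w, mu, b
```

For multivariate acoustic feature vectors the same E-step/M-step structure applies component-wise, but the M-step updates depend on the particular nongaussian family chosen.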
Copyright information
© 1998 Springer Science+Business Media Dordrecht
Cite this chapter
Basu, S., Micchelli, C.A. (1998). Parametric Density Estimation for the Classification of Acoustic Feature Vectors in Speech Recognition. In: Suykens, J.A.K., Vandewalle, J. (eds) Nonlinear Modeling. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5703-6_4
DOI: https://doi.org/10.1007/978-1-4615-5703-6_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-7611-8
Online ISBN: 978-1-4615-5703-6
eBook Packages: Springer Book Archive