Parametric Density Estimation for the Classification of Acoustic Feature Vectors in Speech Recognition

Chapter in: Nonlinear Modeling

Abstract

Motivated by applications to automatic machine recognition of speech, we study non-Gaussian parametric density estimation for the classification of acoustic feature vectors, which are known to be of very high dimensionality, in a maximum-likelihood framework. We use EM-type algorithms to estimate the parameters of a mixture model with non-Gaussian component densities, and we report our experience with these techniques in the context of large-vocabulary continuous speech recognition. The experimental results indicate that non-Gaussian mixture component densities model speech data more effectively than Gaussian ones. We also comment on the convergence of the iterative estimation algorithms developed.
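The chapter develops EM-type updates for mixtures of non-Gaussian densities, but the details are behind the paywall. As an illustrative sketch only (not the authors' algorithm), the following fits a one-dimensional mixture of Laplace densities by an EM-type iteration; the Laplace family is a simple heavier-tailed alternative to Gaussian components, and the function names here are hypothetical.

```python
import numpy as np

def laplace_pdf(x, mu, b):
    # Univariate Laplace density: heavier-tailed than a Gaussian
    return np.exp(-np.abs(x - mu) / b) / (2.0 * b)

def em_laplace_mixture(x, n_components=2, n_iter=50):
    """EM-type estimation for a mixture of Laplace densities (1-D sketch)."""
    w = np.full(n_components, 1.0 / n_components)                # mixture weights
    mu = np.quantile(x, np.linspace(0.25, 0.75, n_components))   # spread-out init
    b = np.full(n_components, np.std(x))                         # scales
    order = np.argsort(x)
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each sample
        dens = np.stack([w[k] * laplace_pdf(x, mu[k], b[k])
                         for k in range(n_components)])
        resp = dens / dens.sum(axis=0, keepdims=True)
        # M-step: for a Laplace component the ML update is the weighted
        # median (location) and the weighted mean absolute deviation (scale)
        for k in range(n_components):
            r = resp[k]
            w[k] = r.mean()
            cum = np.cumsum(r[order])
            mu[k] = x[order][np.searchsorted(cum, 0.5 * cum[-1])]
            b[k] = np.sum(r * np.abs(x - mu[k])) / r.sum()
    return w, mu, b
```

Note the M-step: replacing the Gaussian mean/variance updates with median/absolute-deviation updates is exactly what changes when the component family changes, which is why EM extends naturally beyond the Gaussian case. Real acoustic feature vectors are high-dimensional, so a practical version would use multivariate components (e.g. with diagonal covariance structure) rather than this scalar toy.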




Copyright information

© 1998 Springer Science+Business Media Dordrecht

Cite this chapter

Basu, S., Micchelli, C.A. (1998). Parametric Density Estimation for the Classification of Acoustic Feature Vectors in Speech Recognition. In: Suykens, J.A.K., Vandewalle, J. (eds) Nonlinear Modeling. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5703-6_4

  • DOI: https://doi.org/10.1007/978-1-4615-5703-6_4

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7611-8

  • Online ISBN: 978-1-4615-5703-6
