Variational Bayes Adapted GMM Based Models for Audio Clip Classification

  • Ved Prakash Sahu
  • Harendra Kumar Mishra
  • C. Chandra Sekhar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5909)


The most commonly used method for parameter estimation in Gaussian mixture models (GMMs) is maximum likelihood (ML). However, ML estimation suffers from overfitting when the model complexity is high. The adapted GMM is an extension of the GMM that helps to reduce overfitting. The variational Bayesian method helps to determine the optimal model complexity and thereby avoids overfitting. In this paper we propose a variational Bayes learning method for training adapted GMMs. The proposed approach is free from the overfitting and singularity problems that arise in the other approaches. It is also faster in training and allows a fast-scoring technique during testing, which reduces the testing time. Studies on the classification of audio clips show that the proposed approach performs better than GMMs, adapted GMMs, and variational Bayes GMMs.
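To make the idea of an adapted GMM concrete, the following is a minimal sketch of classical mean-only MAP adaptation, in which the means of a background GMM are shifted toward new data in proportion to how much data each component explains. It is not the variational Bayes method proposed in the paper. The function names, the 1-D setting, and the relevance factor of 16 are assumptions chosen for illustration.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def map_adapt_means(data, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a 1-D background GMM to new data.

    `relevance` is a hypothetical default; it controls how much data a
    component must explain before its mean moves away from the prior.
    """
    k = len(means)
    soft_count = [0.0] * k   # sum of responsibilities per component
    weighted_sum = [0.0] * k  # responsibility-weighted sum of observations
    for x in data:
        likes = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(likes)
        if total == 0.0:
            continue  # observation is numerically outside every component
        for i in range(k):
            r = likes[i] / total
            soft_count[i] += r
            weighted_sum[i] += r * x
    adapted = []
    for i in range(k):
        # alpha is 0 with no data and approaches 1 as the soft count grows
        alpha = soft_count[i] / (soft_count[i] + relevance)
        data_mean = weighted_sum[i] / soft_count[i] if soft_count[i] > 0.0 else means[i]
        adapted.append(alpha * data_mean + (1.0 - alpha) * means[i])
    return adapted


if __name__ == "__main__":
    background_means = [-2.0, 2.0]
    clip_features = [2.5, 3.0, 3.5, 2.8, 3.2]
    print(map_adapt_means(clip_features, [0.5, 0.5], background_means, [1.0, 1.0]))
```

With only five observations, the second mean moves part of the way from 2.0 toward the data mean while the first stays almost unchanged. This interpolation toward a prior is what limits overfitting when little data is available.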


Keywords: GMM, variational learning, Bayesian adaptation



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ved Prakash Sahu, Harendra Kumar Mishra, C. Chandra Sekhar: Speech and Vision Lab, Dept. of Computer Sc. & Engg., Indian Institute of Technology-Madras, India
