Variational Bayesian Methods for Audio Indexing

  • Fabio Valente
  • Christian Wellekens
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3869)


In this paper we investigate the use of Variational Bayesian methods for audio indexing. Variational Bayesian (VB) techniques are approximate techniques for fully Bayesian learning. Unlike non-Bayesian methods (e.g. Maximum Likelihood) or partially Bayesian criteria (e.g. Maximum a Posteriori), VB benefits from important model selection properties. VB learning is based on optimization of the free energy, which can serve at the same time as an objective function and as a model selection criterion, allowing simultaneous model learning and model selection. Here we explore the use of VB learning and VB model selection in a speaker clustering task, comparing results with classical learning techniques (ML and MAP) and a classical model selection criterion (BIC). Experiments are run on the NIST 1996 HUB-4 evaluation data set, and results show that VB can outperform classical methods.
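For orientation, the double role of the free energy can be written out in the standard textbook form. This is a sketch of the general VB bound, not necessarily the notation used in the paper: the symbols Y (observed data), X (hidden variables), θ (parameters), m (candidate model), q (variational distribution), d (number of free parameters) and N (number of observations) are assumptions introduced here.

```latex
% Free energy as a lower bound on the log evidence of model m
% (integrals over X become sums when X is discrete):
\ln p(Y \mid m)
  = F_m(q) + \mathrm{KL}\big( q(X,\theta) \,\|\, p(X,\theta \mid Y, m) \big)
  \;\ge\; F_m(q),
\qquad
F_m(q) = \int q(X,\theta)\,
         \ln \frac{p(Y, X, \theta \mid m)}{q(X,\theta)} \, dX \, d\theta .

% With the usual factorization q(X,\theta) = q(X)\, q(\theta):
F_m(q) = \mathbb{E}_{q(X)\,q(\theta)}\big[ \ln p(Y, X \mid \theta, m) \big]
         + \mathrm{H}\big[ q(X) \big]
         - \mathrm{KL}\big( q(\theta) \,\|\, p(\theta \mid m) \big).

% For comparison, the BIC score in its usual log-likelihood form:
\mathrm{BIC}(m) = \ln p(Y \mid \hat{\theta}_{\mathrm{ML}}, m)
                  - \frac{d}{2} \ln N .
```

Maximizing F_m(q) over q for a fixed m is the learning step, while comparing the optimized values of F_m across candidate models gives the selection criterion. The KL term between q(θ) and the prior acts as a complexity penalty, which is the role the explicit (d/2) ln N term plays in BIC.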


Hidden Markov Model, Universal Background Model, Variational Bayesian, Model Selection Problem, Complete Data Likelihood
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Fabio Valente (1)
  • Christian Wellekens (1)
  1. Institut Eurecom, Sophia Antipolis, France
