Variational learning for Dirichlet process mixtures of Dirichlet distributions and applications

Fan, Wentao; Bouguila, Nizar

doi:10.1007/s11042-012-1191-0

Published: 03 August 2012

Volume 70, pages 1685–1702, (2014)

Multimedia Tools and Applications

Wentao Fan¹ &
Nizar Bouguila²

Abstract

In this paper, we propose a Bayesian nonparametric approach for modeling and selection based on a Dirichlet process mixture of Dirichlet distributions, which can also be seen as an infinite Dirichlet mixture model. The proposed model uses a stick-breaking representation and is learned by a variational inference method. Owing to the nature of the Bayesian nonparametric approach, the problems of overfitting and underfitting are avoided. Moreover, the obstacle of estimating the correct number of clusters is sidestepped by assuming an infinite number of clusters. Other approximation techniques, such as Markov chain Monte Carlo (MCMC), incur a high computational cost and have convergence that is difficult to diagnose. In contrast, the whole inference process in the proposed variational learning framework is analytically tractable with closed-form solutions. Additionally, the proposed infinite Dirichlet mixture model with variational learning requires only a modest amount of computational power, which makes it suitable for large-scale applications. The effectiveness of our model is experimentally investigated on both synthetic data sets and challenging real-life multimedia applications, namely image spam filtering and the categorization of human action videos.
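
As an illustration of the stick-breaking representation mentioned above, the following is a minimal sketch of the generative side of a truncated infinite Dirichlet mixture. It is not the authors' implementation and does not perform variational inference. The truncation level `T`, the concentration value `alpha=1.0`, the three-dimensional components, and all function names are assumptions chosen for illustration only.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process prior."""
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component absorbs the leftover mass
    return weights


def sample_dirichlet(params, rng):
    """Draw one vector from Dirichlet(params) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in params]
    total = sum(draws)
    return [d / total for d in draws]


def sample_mixture(weights, component_params, n, rng):
    """Draw n (label, observation) pairs from the truncated mixture."""
    labels = rng.choices(range(len(weights)), weights=weights, k=n)
    return [(j, sample_dirichlet(component_params[j], rng)) for j in labels]


rng = random.Random(0)
T = 10  # truncation level (hypothetical value)
weights = stick_breaking_weights(alpha=1.0, truncation=T, rng=rng)
# Hypothetical parameters: each component is a 3-dimensional Dirichlet.
component_params = [[rng.uniform(0.5, 10.0) for _ in range(3)] for _ in range(T)]
data = sample_mixture(weights, component_params, n=5, rng=rng)
```

Each observation in `data` is a proportion vector that sums to one, paired with the index of the component that generated it. Because the weights decay stochastically with the component index, most of the mass typically falls on a few components, which is how an infinite mixture can adapt its effective number of clusters to the data.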

Notes

The complete source code is available upon request.
http://www.cs.jhu.edu/~mdredze/datasets/image_spam

Acknowledgements

The completion of this research was made possible thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC). The authors would like to thank the anonymous referees and the associate editor for their helpful comments. The complete source code of this work is available upon request.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, Concordia University, Montreal, QC, Canada
Wentao Fan
Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, QC, Canada
Nizar Bouguila

Corresponding author

Correspondence to Nizar Bouguila.

About this article

Cite this article

Fan, W., Bouguila, N. Variational learning for Dirichlet process mixtures of Dirichlet distributions and applications. Multimed Tools Appl 70, 1685–1702 (2014). https://doi.org/10.1007/s11042-012-1191-0

Published: 03 August 2012
Issue Date: June 2014
DOI: https://doi.org/10.1007/s11042-012-1191-0
