Soft Computing, Volume 23, Issue 14, pp 5799–5813

Bayesian inference by reversible jump MCMC for clustering based on finite generalized inverted Dirichlet mixtures

  • Sami Bourouis
  • Faisal R. Al-Osaimi
  • Nizar Bouguila
  • Hassen Sallay
  • Fahd Aldosari
  • Mohamed Al Mashrgy
Methodologies and Application

Abstract

The goal of constructing models from examples has been approached from different perspectives. Statistical methods have been widely used and have proved effective in generating accurate models. Finite Gaussian mixture models have been applied to a wide variety of random phenomena and have played a prominent role in many attempts to develop expressive statistical models in machine learning. However, their effectiveness is limited to applications where the underlying modeling assumptions (e.g., that the per-component densities are Gaussian) are reasonably satisfied. Thus, much research effort has been devoted to developing better alternatives. In this paper, we focus on constructing statistical models from positive vectors (i.e., vectors whose elements are strictly greater than zero), for which the generalized inverted Dirichlet (GID) mixture has been shown to be a flexible and powerful parametric framework. In particular, we propose a Bayesian density estimation method based upon mixtures of GIDs. Bayesian learning is attractive in several respects: it takes uncertainty into account by introducing prior information about the parameters, it allows simultaneous parameter estimation and model selection, and it helps overcome learning problems related to over- or under-fitting. Specifically, we develop a reversible jump Markov chain Monte Carlo (RJMCMC) sampler for GID mixtures, which we apply to simultaneous clustering and feature selection in challenging real-world applications concerning scene classification, action recognition, and video forgery detection.
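To make the central object of the abstract concrete, the following is a minimal sketch of how the log-density of a finite GID mixture could be evaluated at one positive vector. The abstract itself does not state the density, so the formula in the code is an assumption taken from the commonly used GID parameterization (exponents gamma_d = alpha_d + beta_d - beta_{d+1}, with beta_{D+1} = 0). The function names `gid_logpdf` and `gid_mixture_logpdf` are hypothetical. The sketch covers density evaluation only and does not implement the RJMCMC sampler or the feature selection step.

```python
import math


def gid_logpdf(x, alpha, beta):
    """Log-density of one GID component at a strictly positive vector x.

    Assumed form (not stated in the abstract):
      p(x) = prod_d B(alpha_d, beta_d)^-1 * x_d^(alpha_d - 1)
                    / (1 + x_1 + ... + x_d)^gamma_d,
      gamma_d = alpha_d + beta_d - beta_{d+1}, with beta_{D+1} = 0.
    """
    dim = len(x)
    if not (len(alpha) == len(beta) == dim):
        raise ValueError("x, alpha and beta must have the same length")
    if min(x) <= 0 or min(alpha) <= 0 or min(beta) <= 0:
        raise ValueError("all entries must be strictly positive")

    logp = 0.0
    running_sum = 0.0  # x_1 + ... + x_d
    for d in range(dim):
        running_sum += x[d]
        beta_next = beta[d + 1] if d + 1 < dim else 0.0
        gamma_d = alpha[d] + beta[d] - beta_next
        # log of 1 / B(alpha_d, beta_d)
        logp += (math.lgamma(alpha[d] + beta[d])
                 - math.lgamma(alpha[d]) - math.lgamma(beta[d]))
        logp += (alpha[d] - 1.0) * math.log(x[d])
        logp -= gamma_d * math.log1p(running_sum)
    return logp


def gid_mixture_logpdf(x, weights, alphas, betas):
    """Log-density of a finite mixture: log sum_j w_j * p(x | alpha_j, beta_j)."""
    terms = [math.log(w) + gid_logpdf(x, a, b)
             for w, a, b in zip(weights, alphas, betas) if w > 0]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))


# Illustrative two-component mixture in two dimensions (made-up parameters).
x = [0.5, 2.0]
print(gid_mixture_logpdf(x,
                         weights=[0.3, 0.7],
                         alphas=[[2.0, 3.0], [1.5, 1.0]],
                         betas=[[4.0, 2.0], [2.0, 2.5]]))
```

Working in log space with a log-sum-exp keeps the mixture evaluation stable when component densities differ by many orders of magnitude, which is typical for high-dimensional positive feature vectors.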

Keywords

Finite mixtures, Generalized inverted Dirichlet, Bayesian inference, RJMCMC, Gibbs sampling, Scene classification, Action recognition, Video forgery

Notes

Acknowledgements

The authors would like to thank Umm Al-Qura University, Kingdom of Saudi Arabia, for its funding support under Grant Number 15-COM-3-1-0007.

Compliance with ethical standards

Conflict of interest

All authors declare that they have no conflict of interest.

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Information Technology, College of Computers and Information Technology, Taif University, Taif, Kingdom of Saudi Arabia
  2. Department of Computer Engineering, College of Computer Systems, Umm Al-Qura University, Mecca, Kingdom of Saudi Arabia
  3. The Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, Canada
  4. College of Computer and Information Systems, Umm Al-Qura University, Mecca, Kingdom of Saudi Arabia
