Toward an Efficient Computation of Log-Likelihood Functions in Statistical Inference: Overdispersed Count Data Clustering

Daghyani, Masoud; Zamzami, Nuha; Bouguila, Nizar

doi:10.1007/978-3-030-23876-6_8

Chapter
First Online: 14 August 2019

Part of the book series: Unsupervised and Semi-Supervised Learning (UNSESUL)

Abstract

This work presents an unsupervised learning algorithm that uses the mesh method to compute the log-likelihood function. The multinomial Dirichlet distribution (MDD) is one of the most widely used models for multicategorical count data with overdispersion. Recent work has shown that traditional numerical computation of the MDD log-likelihood function is either unstable or so slow that it becomes infeasible for large datasets. We therefore propose using the mesh algorithm, which approximates the MDD log-likelihood function using Bernoulli polynomials. We also extend the mesh algorithm to compute the log-likelihood function of a more flexible distribution, the multinomial generalized Dirichlet (MGD). We demonstrate the efficiency of this method in statistical inference, specifically maximum likelihood estimation, by fitting finite mixture models based on MDD and MGD, two distributions well suited to count data. In a set of experiments, the proposed approach shows its merits on two real-world clustering problems: natural scene categorization and facial expression recognition.
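
For orientation, the "traditional numerical computation" mentioned in the abstract can be illustrated with the standard log-gamma form of the MDD (Dirichlet-multinomial) log-likelihood. The sketch below is not the chapter's mesh algorithm, which is not reproduced on this page; it only shows the conventional baseline that such methods aim to replace. The function name `mdd_log_likelihood` and its argument names are hypothetical, chosen for illustration.

```python
import math


def mdd_log_likelihood(counts, alpha):
    """Direct log-gamma evaluation of the Dirichlet-multinomial log-likelihood.

    Conventional baseline only, NOT the mesh algorithm from the chapter.
    counts: non-negative integer counts x_1..x_K for one observation.
    alpha:  positive Dirichlet parameters alpha_1..alpha_K.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    if any(x < 0 for x in counts) or any(a <= 0 for a in alpha):
        raise ValueError("counts must be >= 0 and alpha must be > 0")

    n = sum(counts)          # total count in the observation
    alpha_sum = sum(alpha)   # sum of the Dirichlet parameters

    # log multinomial coefficient: log n! - sum_k log x_k!
    ll = math.lgamma(n + 1) - sum(math.lgamma(x + 1) for x in counts)
    # log Gamma(A) - log Gamma(n + A), with A = sum_k alpha_k
    ll += math.lgamma(alpha_sum) - math.lgamma(n + alpha_sum)
    # sum_k [log Gamma(x_k + alpha_k) - log Gamma(alpha_k)]
    ll += sum(math.lgamma(x + a) - math.lgamma(a) for x, a in zip(counts, alpha))
    return ll
```

As a sanity check, with two categories and alpha = (1, 1) the distribution is uniform over the n + 1 possible outcomes, so `mdd_log_likelihood([1, 2], [1.0, 1.0])` should be close to log(1/4). Each term here is a difference of log-gamma values, and such differences can lose precision through cancellation when the parameters are large relative to the counts, which is one plausible source of the instability the abstract refers to.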

Notes

1. https://groups.csail.mit.edu/vision/SUN/

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering (ECE), Concordia University, Montreal, QC, Canada
Masoud Daghyani
Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada
Nuha Zamzami & Nizar Bouguila
Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
Nuha Zamzami

Corresponding author

Correspondence to Masoud Daghyani.

Editor information

Editors and Affiliations

Concordia Institute for Information Systems Engineering, Concordia University, Montreal, QC, Canada
Nizar Bouguila
Department of Computer Science and Technology, Huaqiao University, Xiamen, China
Wentao Fan

About this chapter

Cite this chapter

Daghyani, M., Zamzami, N., Bouguila, N. (2020). Toward an Efficient Computation of Log-Likelihood Functions in Statistical Inference: Overdispersed Count Data Clustering. In: Bouguila, N., Fan, W. (eds) Mixture Models and Applications. Unsupervised and Semi-Supervised Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-23876-6_8

DOI: https://doi.org/10.1007/978-3-030-23876-6_8
Published: 14 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23875-9
Online ISBN: 978-3-030-23876-6
eBook Packages: Engineering, Engineering (R0)