Incremental Mixture Learning for Clustering Discrete Data

Blekas, Konstantinos; Likas, Aristidis

doi:10.1007/978-3-540-24674-9_23

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3025)

Included in the following conference series:

Hellenic Conference on Artificial Intelligence

Abstract

This paper presents an efficient approach to clustering discrete data that incrementally builds multinomial mixture models by maximizing the likelihood with the Expectation-Maximization (EM) algorithm. At each step the method adds one new multinomial component to the mixture, using a combined scheme of global and local search to address the initialization problem of the EM algorithm. In the global search phase, several initial values are examined for the parameters of the new component. These values are selected from an appropriately defined set of initialization candidates, and two methods are proposed for constructing this set, based on agglomerative and kd-tree clustering algorithms. We evaluate the incremental learning technique on a synthetic and a real dataset and compare it with the standard EM-based multinomial mixture model.
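
The incremental scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the function names (`em`, `kd_candidates`, `incremental_fit`), the initial weight `alpha` of a new component, the smoothing constant `eps`, and the variance-based median splitting used to build candidates are all assumptions made for illustration. The sketch also simplifies the local search to a full EM run from each candidate, and it covers only a kd-tree-style candidate set, not the agglomerative one.

```python
import numpy as np

def log_joint(X, weights, thetas):
    # log pi_k + sum_j x_ij * log theta_kj; the multinomial coefficient is
    # omitted because it does not depend on the parameters.
    return np.log(weights)[None, :] + X @ np.log(thetas).T

def logsumexp_rows(A):
    m = A.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(A - m).sum(axis=1, keepdims=True))).ravel()

def em(X, weights, thetas, n_iter=100, tol=1e-6, eps=1e-3):
    """EM for a multinomial mixture on a count matrix X (n_samples x n_categories)."""
    prev = -np.inf
    for _ in range(n_iter):
        lj = log_joint(X, weights, thetas)
        ll_i = logsumexp_rows(lj)
        resp = np.exp(lj - ll_i[:, None])              # E-step: responsibilities
        weights = np.maximum(resp.mean(axis=0), 1e-12)  # M-step: mixing weights
        weights = weights / weights.sum()
        counts = resp.T @ X + eps                       # eps is assumed smoothing
        thetas = counts / counts.sum(axis=1, keepdims=True)
        ll = ll_i.sum()
        if ll - prev < tol:
            break
        prev = ll
    final_ll = logsumexp_rows(log_joint(X, weights, thetas)).sum()
    return weights, thetas, final_ll

def kd_candidates(X, depth=2, eps=1e-3):
    """Assumed candidate set: split the data kd-tree style (median of the
    highest-variance dimension) and use each leaf's normalized counts."""
    parts = [np.arange(X.shape[0])]
    for _ in range(depth):
        new_parts = []
        for idx in parts:
            if idx.size < 2:
                new_parts.append(idx)
                continue
            j = int(np.argmax(X[idx].var(axis=0)))
            med = np.median(X[idx, j])
            left, right = idx[X[idx, j] <= med], idx[X[idx, j] > med]
            new_parts.extend(p for p in (left, right) if p.size > 0)
        parts = new_parts
    counts = np.array([X[p].sum(axis=0) for p in parts]) + eps
    return counts / counts.sum(axis=1, keepdims=True)

def incremental_fit(X, max_components=3, alpha=0.5):
    """Start with one component, then add components one at a time."""
    total = X.sum(axis=0) + 1e-3
    weights, thetas, ll = em(X, np.array([1.0]), (total / total.sum())[None, :])
    cands = kd_candidates(X)
    history = [ll]
    while len(weights) < max_components:
        best = None
        for c in cands:                                  # global search over candidates
            w0 = np.append((1.0 - alpha) * weights, alpha)
            t0 = np.vstack([thetas, c])
            trial = em(X, w0, t0)                        # local search: EM from this start
            if best is None or trial[2] > best[2]:
                best = trial
        weights, thetas, ll = best
        history.append(ll)
    return weights, thetas, history
```

Calling `incremental_fit(X, max_components=K)` on a matrix of category counts returns the mixing weights, the per-component category probabilities, and the log-likelihood recorded after each component is added. Keeping the best of several candidate starts is what stands in for the global search here; a single random start would correspond to the standard EM baseline mentioned in the abstract.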

Author information

Authors and Affiliations

Department of Computer Science, University of Ioannina, 45110, Ioannina, Greece
Konstantinos Blekas & Aristidis Likas

Editor information

Editors and Affiliations

Info and Communication Systems Eng, Aegean University, 83200, Karlovassi, Samos, Greece
George A. Vouros
Department of Informatics, University of Piraeus, Piraeus, Greece
Themistoklis Panayiotopoulos

About this paper

Cite this paper

Blekas, K., Likas, A. (2004). Incremental Mixture Learning for Clustering Discrete Data. In: Vouros, G.A., Panayiotopoulos, T. (eds) Methods and Applications of Artificial Intelligence. SETN 2004. Lecture Notes in Computer Science, vol 3025. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24674-9_23

DOI: https://doi.org/10.1007/978-3-540-24674-9_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21937-8
Online ISBN: 978-3-540-24674-9
eBook Packages: Springer Book Archive
