Skip to main content

Incremental Mixture Learning for Clustering Discrete Data

  • Conference paper
Methods and Applications of Artificial Intelligence (SETN 2004)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 3025))

Included in the following conference series:

Abstract

This paper elaborates on an efficient approach for clustering discrete data by incrementally building multinomial mixture models through likelihood maximization using the Expectation-Maximization (EM) algorithm. The method adds sequentially at each step a new multinomial component to a mixture model based on a combined scheme of global and local search in order to deal with the initialization problem of the EM algorithm. In the global search phase several initial values are examined for the parameters of the multinomial component. These values are selected from an appropriately defined set of initialization candidates. Two methods are proposed here to specify the elements of this set based on the agglomerative and the kd-tree clustering algorithms. We investigate the performance of the incremental learning technique on a synthetic and a real dataset and also provide comparative results with the standard EM-based multinomial mixture model.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Cheeseman, P., Stutz, J.: Bayesian classification (AutoClass): Theory and resutls. In: Fayyad, U., Piatesky-Shapiro, G., Smyth, P., Uthurusamy, R. (eds.) Advances in Knowledge Discovery and Data Mining, pp. 153–180. AAAI Press, CA (1995)

    Google Scholar 

  2. Bengio, Y., Bengio, S.: Modeling high-dimensional discrete data with multi-layer neural networks. In: Solla, S.A., Leen, T.K., Móller, K.-R. (eds.) Advances in Neural Processing Systems 12, pp. 400–406. MIT Press, Cambridge (2000)

    Google Scholar 

  3. Meil˘a, M., Hecherman, D.: An experimental comparison of model-based clustering methods. Machine Learning 42, 9–29 (2001)

    Article  Google Scholar 

  4. Blekas, K., Fotiadis, D.I., Likas, A.: Greedy mixture learning for multiple motif discovering in biological sequences. Bioinformatics 19(5), 607–617 (2003)

    Article  Google Scholar 

  5. Chickering, D., Heckerman, D.: Efficient approximations for the marginal likelihood of Bayesian networks with hidden variables. Machine Learning 29, 181–212 (1997)

    Article  MATH  Google Scholar 

  6. Render, R.A., Walker, H.F.: Mixture densities, maximum likelihood and the EM algorithm. SIAM Review 26(2), 195–239 (1984)

    Article  MathSciNet  Google Scholar 

  7. Vlassis, N., Likas, A.: A greedy EM algorithm for Gaussian mixture learning. Neural Processing Letters 15, 77–87 (2002)

    Article  MATH  Google Scholar 

  8. Bentley, J.L.: Multidimensional binary search trees used for associative searching. Commun. ACM 18(9), 509–517 (1975)

    Article  MATH  MathSciNet  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Blekas, K., Likas, A. (2004). Incremental Mixture Learning for Clustering Discrete Data. In: Vouros, G.A., Panayiotopoulos, T. (eds) Methods and Applications of Artificial Intelligence. SETN 2004. Lecture Notes in Computer Science(), vol 3025. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24674-9_23

Download citation

  • DOI: https://doi.org/10.1007/978-3-540-24674-9_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-21937-8

  • Online ISBN: 978-3-540-24674-9

  • eBook Packages: Springer Book Archive

Publish with us

Policies and ethics