A Mixture Model Approach for Binned Data Clustering

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2810)

Abstract

In some data analysis problems, the available data take the form of a histogram. Such data are also called binned data. This paper addresses the problem of clustering binned data using mixture models. Cadez et al. [2] proposed a specific EM algorithm for such data, but it is computationally expensive. This paper proposes a classification version of that algorithm, which is much faster. The two approaches are compared on simulated data. The simulation results show that both algorithms yield comparable partitions if the histogram is accurate enough.
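
As a rough illustration of the setting, and not the algorithm of this paper or of Cadez et al. [2], the sketch below evaluates a mixture on binned data and makes a hard (classification-style) assignment of each bin to one component. It assumes a univariate Gaussian mixture, known bin edges, and the standard truncated multinomial likelihood for bin counts; all function names are hypothetical, and the parameter update step of an EM-type algorithm is deliberately omitted.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bin_component_mass(edges, weights, means, sigmas):
    """Return mass[r][k] = weight_k * P(X in bin r | component k).

    Assumption: univariate Gaussian components; bin r is [edges[r], edges[r+1]).
    """
    mass = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        row = [
            w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s))
            for w, m, s in zip(weights, means, sigmas)
        ]
        mass.append(row)
    return mass


def binned_log_likelihood(counts, mass):
    """Multinomial log-likelihood of the bin counts, up to an additive constant.

    Bin probabilities are renormalized over the observed range, which treats
    the data as truncated to [edges[0], edges[-1]].
    """
    total = sum(sum(row) for row in mass)
    return sum(
        n * (math.log(sum(row)) - math.log(total))
        for n, row in zip(counts, mass)
        if n > 0
    )


def classify_bins(mass):
    """Hard assignment: each bin goes to the component with the largest mass."""
    return [max(range(len(row)), key=row.__getitem__) for row in mass]
```

Assigning whole bins to a single component avoids carrying fractional memberships for every bin, which is the general reason classification-style variants of EM tend to be cheaper per iteration.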

References

  1. McLachlan, G.J., Jones, P.N.: Fitting mixture models to grouped and truncated data via the EM algorithm. Biometrics 44(2), 571–578 (1988)

  2. Cadez, I.V., Smyth, P., McLachlan, G.J., McLaren, C.E.: Maximum likelihood estimation of mixture densities for binned and truncated multivariate data. Machine Learning 47, 7–34 (2001)

  3. Celeux, G., Govaert, G.: A classification EM algorithm for clustering and two stochastic versions. Computational Statistics and Data Analysis 14, 315–332 (1992)

  4. Celeux, G., Govaert, G.: Gaussian parsimonious clustering models. Pattern Recognition 28(5), 781–793 (1995)

  5. Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. Royal Stat. Soc. B 39(1), 1–38 (1977)

  6. Symons, M.J.: Clustering criteria and multivariate normal mixtures. Biometrics 37, 35–43 (1981)

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Samé, A., Ambroise, C., Govaert, G. (2003). A Mixture Model Approach for Binned Data Clustering. In: Berthold, M.R., Lenz, HJ., Bradley, E., Kruse, R., Borgelt, C. (eds) Advances in Intelligent Data Analysis V. IDA 2003. Lecture Notes in Computer Science, vol 2810. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45231-7_25

  • DOI: https://doi.org/10.1007/978-3-540-45231-7_25

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-40813-0

  • Online ISBN: 978-3-540-45231-7

  • eBook Packages: Springer Book Archive
