A New Efficient Approach in Clustering Ensembles

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4881)

Abstract

Previous clustering ensemble algorithms usually apply a consensus function to the outputs of the initial clusterings to obtain a final partition. In this paper, we propose a new clustering ensemble method that instead generates a new feature space from the initial clustering outputs. Multiple runs of an initial clustering algorithm such as k-means produce this feature space, which is significantly better suited to clustering than the raw or normalized feature space. Running a simple clustering algorithm on the generated feature space can therefore yield a significantly better final partition than running it on the raw data. For the initial clustering runs we use a modification of k-means, named “Intelligent k-means”, which is defined specifically for clustering ensembles. Results of the proposed method are presented using both simple k-means and Intelligent k-means. Fast convergence and appropriate behavior are the most notable properties of the proposed method. Experimental results on real data sets show its effectiveness.
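
The abstract does not say how the new feature space is built, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes each base run contributes a one-hot encoding of its cluster labels, that these encodings are concatenated into one row per data point, and that plain k-means (not the paper's Intelligent k-means, which is not specified here) is used both for the base runs and for the final clustering. All function names (`sqdist`, `kmeans`, `ensemble_features`, `cluster_via_ensemble`) are hypothetical.

```python
import random


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, rng, max_iter=100):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    # Start from k distinct data points so that no two centers coincide.
    distinct = list(dict.fromkeys(tuple(p) for p in points))
    centers = [list(p) for p in rng.sample(distinct, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sqdist(p, centers[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def ensemble_features(points, k, runs, rng):
    """One row per point: the one-hot labels of every base run, concatenated.

    Assumption: this encoding is an illustrative choice, not taken from the paper.
    """
    features = [[] for _ in points]
    for _ in range(runs):
        for row, lab in zip(features, kmeans(points, k, rng)):
            row.extend(1.0 if c == lab else 0.0 for c in range(k))
    return features


def cluster_via_ensemble(points, k, runs=10, seed=0):
    """Build the ensemble feature space, then cluster it with plain k-means."""
    rng = random.Random(seed)
    features = ensemble_features(points, k, runs, rng)
    return features, kmeans(features, k, rng)


# Usage on a small toy data set (values chosen only for illustration).
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]]
features, final_labels = cluster_via_ensemble(data, k=2, runs=5, seed=1)
```

In this sketch, points that share a cluster in every base run receive identical feature rows and therefore always end up in the same final cluster. The final run needs at least `k` distinct feature rows, otherwise `rng.sample` raises a `ValueError`.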

Editor information

Hujun Yin Peter Tino Emilio Corchado Will Byrne Xin Yao

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Azimi, J., Abdoos, M., Analoui, M. (2007). A New Efficient Approach in Clustering Ensembles. In: Yin, H., Tino, P., Corchado, E., Byrne, W., Yao, X. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2007. IDEAL 2007. Lecture Notes in Computer Science, vol 4881. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77226-2_41

  • DOI: https://doi.org/10.1007/978-3-540-77226-2_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-77225-5

  • Online ISBN: 978-3-540-77226-2

  • eBook Packages: Computer Science, Computer Science (R0)
