Adaptive Evidence Accumulation Clustering Using the Confidence of the Objects’ Assignments

  • Conference paper
Emerging Trends in Knowledge Discovery and Data Mining (PAKDD 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7769)

Abstract

Ensemble methods are known to improve the performance of learning algorithms in both supervised and unsupervised learning. Boosting algorithms are quite successful supervised ensemble methods. These algorithms incrementally build an ensemble of classifiers, focusing on previously misclassified objects while training the current classifier. In this paper we propose an extension to the Evidence Accumulation Clustering method inspired by Boosting algorithms. In supervised learning, identifying misclassified objects is trivial because the label of each object is known; in unsupervised learning the labels are unknown, which makes it difficult to identify the objects on which the clustering algorithm should focus. The proposed approach uses the information contained in the co-association matrix to estimate the degree of confidence of the assignment of each object to its cluster. The degree of confidence is then used to select which objects should be emphasized in the learning process of the clustering algorithm. New consensus partition validity measures, based on the notion of degree of confidence, are also proposed. To evaluate the performance of our approaches, experiments were performed on several artificial and real data sets; the results show that the adaptive clustering ensemble method and the consensus partition validity measures help to improve the quality of data clustering.
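
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It builds a co-association matrix as the fraction of base partitions in which two objects share a cluster, which is the usual Evidence Accumulation Clustering definition. The confidence score (mean co-association of an object with the other members of its consensus cluster) and the weighting rule (weight proportional to one minus confidence) are assumptions made for illustration, and the function names `co_association`, `assignment_confidence`, and `emphasis_weights` are hypothetical.

```python
def co_association(partitions):
    """Return C where C[i][j] is the fraction of partitions in which
    objects i and j are assigned to the same cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def assignment_confidence(C, consensus):
    """Assumed confidence score (not taken from the paper): the mean
    co-association of each object with the other members of its
    consensus cluster. Singleton clusters get 0.0 (no supporting evidence)."""
    conf = []
    for i, label in enumerate(consensus):
        peers = [j for j, other in enumerate(consensus)
                 if other == label and j != i]
        conf.append(sum(C[i][j] for j in peers) / len(peers) if peers else 0.0)
    return conf


def emphasis_weights(conf):
    """Assumed weighting rule: low-confidence objects receive more weight.
    Falls back to uniform weights when every confidence equals 1."""
    raw = [1.0 - c for c in conf]
    total = sum(raw)
    if total == 0:
        return [1.0 / len(conf)] * len(conf)
    return [r / total for r in raw]


# Three toy base partitions of six objects, plus a consensus partition.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
consensus = [0, 0, 0, 1, 1, 1]

C = co_association(partitions)
conf = assignment_confidence(C, consensus)
weights = emphasis_weights(conf)
```

In this toy example, objects 2 and 3 sit on the boundary where the base partitions disagree, so they receive the lowest confidence and the largest weights. A boosting-style scheme could use such weights as sampling probabilities when building the next base partition.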

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Duarte, J.M.M., Fred, A.L.N., Duarte, F.J.F. (2013). Adaptive Evidence Accumulation Clustering Using the Confidence of the Objects’ Assignments. In: Washio, T., Luo, J. (eds) Emerging Trends in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol 7769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36778-6_7

  • DOI: https://doi.org/10.1007/978-3-642-36778-6_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-36777-9

  • Online ISBN: 978-3-642-36778-6

  • eBook Packages: Computer Science (R0)
