
Evaluation of Relative Indexes for Multi-objective Clustering

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9121)

Abstract

One of the biggest challenges in clustering is finding a robust and versatile criterion to evaluate the quality of clustering results. In this paper, we investigate the extent to which unsupervised criteria can be used to obtain clusters highly correlated with external labels. We show that the usefulness of these criteria is data-dependent and that, for most data sets, multiple criteria are required in order to identify the best-performing clustering algorithm. We present a multi-objective evolutionary clustering algorithm capable of finding a set of high-quality solutions. For the real-world data sets examined, the Pareto front can offer better clusterings than simply optimizing a single unsupervised criterion.
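To make the notion of a Pareto front over several unsupervised criteria concrete, the following is a minimal sketch and not the algorithm presented in the paper. It assumes every criterion is oriented so that larger values are better; the function names `dominates` and `pareto_front` and the scores are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every criterion
    and strictly better on at least one (all criteria are maximized)."""
    pairs = list(zip(a, b))
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Return the indices of candidates that no other candidate dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical scores of four candidate clusterings on two unsupervised
# criteria (illustrative numbers, not results from the paper).
candidates = [(0.62, 0.40), (0.55, 0.71), (0.50, 0.35), (0.30, 0.80)]

front = pareto_front(candidates)
# Optimizing only the first criterion would keep a single candidate.
best_on_first = max(range(len(candidates)), key=lambda i: candidates[i][0])
print(front, best_on_first)
```

In this toy setting the front keeps several mutually non-dominated candidates, whereas optimizing a single criterion keeps only one of them. Whether any of the retained clusterings agrees better with external labels depends on the data.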

Acknowledgements

We would like to thank Petr Bartůněk, Ph.D. from the IMG CAS institute for supporting our research and letting us publish all details of our work. This research is partially supported by CTU grant SGS15/117/OHK3/1T/18 "New data processing methods for data mining" and by Program NPU I (LO1419) by the Ministry of Education, Youth and Sports of the Czech Republic.

Author information

Corresponding author

Correspondence to Tomáš Bartoň.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bartoň, T., Kordík, P. (2015). Evaluation of Relative Indexes for Multi-objective Clustering. In: Onieva, E., Santos, I., Osaba, E., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2015. Lecture Notes in Computer Science, vol 9121. Springer, Cham. https://doi.org/10.1007/978-3-319-19644-2_39

  • DOI: https://doi.org/10.1007/978-3-319-19644-2_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-19643-5

  • Online ISBN: 978-3-319-19644-2

  • eBook Packages: Computer Science, Computer Science (R0)
