Exploiting the Trade-off — The Benefits of Multiple Objectives in Data Clustering

  • Conference paper
Evolutionary Multi-Criterion Optimization (EMO 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3410)

Abstract

In previous work, we have proposed a novel approach to data clustering based on the explicit optimization of a partitioning with respect to two complementary clustering objectives [6]. Here, we extend this idea by describing an advanced multiobjective clustering algorithm, MOCK, with the capacity to identify good solutions from the Pareto front, and to automatically determine the number of clusters in a data set. The algorithm has been subject to a thorough comparison with alternative clustering techniques and we briefly summarize these results. We then present investigations into the mechanisms at the heart of MOCK: we discuss a simple example demonstrating the synergistic effects at work in multiobjective clustering, which explain its superiority to single-objective clustering techniques, and we analyse how MOCK’s Pareto fronts compare to the performance curves obtained by single-objective algorithms run with a range of different numbers of clusters specified.
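
As a rough illustration of what identifying solutions "from the Pareto front" involves, the following Python sketch filters a set of candidate partitionings down to the non-dominated ones. It is not MOCK itself: the score pairs are made-up numbers, both objectives are assumed to be minimized, and `dominates` and `pareto_front` are hypothetical helper names introduced only for this example.

```python
from typing import List, Tuple

# One candidate partitioning, scored on two objectives that are both
# assumed to be minimized (illustrative assumption, not taken from the paper).
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b on both objectives
    and strictly better on at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates, sorted by objective 1."""
    front = [p for p in points if not any(dominates(q, p) for q in points)]
    return sorted(set(front))


# Hypothetical objective scores for six candidate partitionings.
candidates = [(9.0, 1.0), (6.0, 2.0), (7.0, 2.5), (4.0, 4.0), (3.0, 7.0), (3.5, 8.0)]
front = pareto_front(candidates)
print(front)
```

Here (7.0, 2.5) is discarded because (6.0, 2.0) is better on both objectives, and (3.5, 8.0) is discarded because (3.0, 7.0) is. The four remaining points are trade-offs: improving one objective means giving up some of the other.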

References

  1. Supporting material for MOCK, http://dbk.ch.umist.ac.uk/handl/mock/

  2. Branke, J., Deb, K., Dierolf, H., Osswald, M.: Finding knees in multi-objective optimization. In: Proceedings of the Eighth International Conference on Parallel Problem Solving from Nature, pp. 722–731. Springer, Heidelberg (2004)

  3. Corne, D.W., Knowles, J.D., Oates, M.J.: PESA-II: Region-based selection in evolutionary multiobjective optimization. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 283–290. Morgan Kaufmann, San Francisco (2001)

  4. Falkenauer, E.: Genetic Algorithms and Grouping Problems. John Wiley & Sons Ltd, Chichester (1998)

  5. Fleury, G., Hero, A., Zareparsi, S., Swaroop, A.: Gene discovery using Pareto depth sampling distributions. Special Number on Genomics, Signal Processing and Statistics, Journal of the Franklin Institute 341(1–2), 55–75 (2004)

  6. Handl, J., Knowles, J.: Evolutionary multiobjective clustering. In: Proceedings of the Eighth International Conference on Parallel Problem Solving from Nature, pp. 1081–1091. Springer, Heidelberg (2004)

  7. Handl, J., Knowles, J.: Multiobjective clustering with automatic determination of the number of clusters. Technical Report COMPYSYBIO-TR-2004-02, Department of Chemistry, UMIST, UK (August 2004)

  8. Jain, A.K., Murty, M.N., Flynn, P.J.: Data clustering: A review. ACM Computing Surveys 31(3), 264–323 (1999)

  9. Kim, Y., Street, W.N., Menczer, F.: Evolutionary model selection in unsupervised learning. Intelligent Data Analysis 6, 531–556 (2002)

  10. Kleinberg, J.: An impossibility theorem for clustering. In: Proceedings of the 15th Conference on Neural Information Processing Systems (2002), http://www.cs.cornell.edu/home/kleinber/nips15.ps

  11. Law, M.H.C.: Multiobjective data clustering. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 424–430. IEEE Press, Los Alamitos (2004)

  12. Maulik, U., Bandyopadhyay, S.: Genetic algorithm-based clustering technique. Pattern Recognition 33, 1455–1465 (2000)

  13. Pan, H., Zhu, J., Han, D.: Genetic algorithms applied to multi-class clustering for gene expression data. Genomics, Proteomics & Bioinformatics 1(4) (2003)

  14. Park, Y.-J., Song, M.-S.: A genetic algorithm for clustering problems. In: Proceedings of the Third Annual Conference on Genetic Programming, pp. 568–575. Morgan Kaufmann, San Francisco (1998)

  15. Pena, J.M., Lozano, J.A., Larranaga, P.: An empirical comparison of four initialization methods for the k-means algorithm. Pattern Recognition Letters 20(10), 1027–1040 (1999)

  16. Strehl, A., Ghosh, J.: Cluster ensembles — a knowledge reuse framework for combining multiple partitions. Journal of Machine Learning Research 3, 583–617 (2002)

  17. Tibshirani, R., Walther, G., Hastie, T.: Estimating the number of clusters in a dataset via the Gap statistic. Technical Report 208, Department of Statistics, Stanford University, USA (2000)

  18. Topchy, A., Jain, A.K., Punch, W.: Clustering ensembles: Models of consensus and weak partitions. Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (2004)

  19. van Rijsbergen, C.: Information Retrieval, 2nd edn. Butterworths (1979)

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Handl, J., Knowles, J. (2005). Exploiting the Trade-off — The Benefits of Multiple Objectives in Data Clustering. In: Coello Coello, C.A., Hernández Aguirre, A., Zitzler, E. (eds) Evolutionary Multi-Criterion Optimization. EMO 2005. Lecture Notes in Computer Science, vol 3410. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-31880-4_38

  • DOI: https://doi.org/10.1007/978-3-540-31880-4_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-24983-2

  • Online ISBN: 978-3-540-31880-4

  • eBook Packages: Computer Science, Computer Science (R0)
