Multi-Objective Clustering and Cluster Validation

  • Chapter

Part of the book series: Studies in Computational Intelligence (SCI, volume 16)

Abstract

This chapter is concerned with unsupervised classification, that is, the analysis of data sets for which no (or very little) training data is available. The main goals in this data-driven type of analysis are the discovery of a data set’s underlying structure, and the identification of groups (or clusters) of homogeneous data items — a process commonly referred to as cluster analysis.

Copyright information

© 2006 Springer

About this chapter

Cite this chapter

Handl, J., Knowles, J. (2006). Multi-Objective Clustering and Cluster Validation. In: Jin, Y. (eds) Multi-Objective Machine Learning. Studies in Computational Intelligence, vol 16. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33019-4_2

  • DOI: https://doi.org/10.1007/3-540-33019-4_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30676-4

  • Online ISBN: 978-3-540-33019-6

  • eBook Packages: Engineering, Engineering (R0)
