Partition Selection Approach for Hierarchical Clustering Based on Clustering Ensemble

  • Sandro Vega-Pons
  • José Ruiz-Shulcloper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6419)


Hierarchical clustering algorithms are widely used in many fields of investigation. They provide a hierarchy of partitions of the same dataset. However, many practical problems require selecting a single representative level (partition) from this hierarchy. The classical approach is to use a cluster validity index and select the partition that is best according to the criterion this index imposes. In this paper, we present a new approach based on the clustering ensemble philosophy. The representative level is defined here as the consensus partition in the hierarchy. In computing the consensus, we take into account both the similarity between partitions and the information obtained by evaluating the partitions with different cluster validity indexes. An experimental comparison on several datasets shows that the proposed approach outperforms the classical approach.
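The idea of choosing the consensus partition among the levels of a hierarchy can be illustrated with a minimal sketch. It is not the authors' method: the Rand index stands in here for whatever partition similarity the paper uses, the per-level weights are assumed to be supplied already (for example, derived from cluster validity indexes), and the names `rand_index`, `select_level`, `levels`, and `weights` are hypothetical.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs on which two labelings agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def select_level(levels, weights):
    """Return the index of the level whose weighted sum of similarities
    to all levels of the hierarchy is largest (ties go to the first)."""
    def score(p):
        return sum(w * rand_index(p, q) for q, w in zip(levels, weights))
    return max(range(len(levels)), key=lambda k: score(levels[k]))


# Toy nested hierarchy over 6 objects, from singletons to a single cluster.
levels = [
    [0, 1, 2, 3, 4, 5],
    [0, 0, 1, 2, 3, 4],
    [0, 0, 1, 2, 2, 3],
    [0, 0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 0, 0],
]

# Assumed weights, one per level; uniform here for illustration only.
weights = [1.0] * len(levels)

best = select_level(levels, weights)
print("selected level:", best, "partition:", levels[best])
```

Restricting the search to the levels of the hierarchy keeps the selection to a single pass over all pairs of levels; a larger weight makes agreement with that level count for more in the final choice.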


Hierarchical clustering, partition selection, clustering ensemble, cluster validity index



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Sandro Vega-Pons (1)
  • José Ruiz-Shulcloper (1)
  1. Advanced Technologies Application Center (CENATAV), Havana, Cuba
