Abstract

Cluster ensemble techniques are a means of boosting clustering performance. However, many cluster ensemble methods suffer from high computational complexity; indeed, the median partition problem is \(\mathcal{NP}\)-complete. While a variety of approximate approaches yielding suboptimal solutions have been proposed in the literature, their performance is typically evaluated against ground truth. In contrast, this work explores how well cluster ensemble methods perform in an absolute sense without ground truth, i.e., how they compare to the (unknown) optimal solution. We present a study that applies and extends a lower bound in an attempt to answer this question. In particular, we demonstrate the tightness of the lower bound, which indicates that there is no room left for further improvement (for the particular data set at hand). The lower bound can thus be regarded as a means of exploring the performance limit of cluster ensemble techniques.
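To make the idea of judging a consensus without ground truth concrete, here is a minimal sketch. It is not the lower bound studied in the paper. It assumes only that the distance between partitions is a metric, and it uses the elementary triangle-inequality bound: for any candidate \(P\) and ensemble \(P_1, \dots, P_N\), \(\sum_i d(P, P_i) \ge \frac{1}{N-1} \sum_{i<j} d(P_i, P_j)\). The pair-counting distance, the function names, and the toy ensemble are all illustrative assumptions.

```python
from itertools import combinations


def pair_disagreement(a, b):
    """Pair-counting distance (assumed here): number of object pairs that are
    co-clustered in one labelling but separated in the other. This is a metric
    on partitions."""
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def sum_of_distances(candidate, ensemble):
    """Objective of the median partition problem for one candidate consensus."""
    return sum(pair_disagreement(candidate, p) for p in ensemble)


def pairwise_lower_bound(ensemble):
    """Triangle-inequality bound: d(Pi, Pj) <= d(P, Pi) + d(P, Pj) for every
    pair, and each d(P, Pi) occurs in N - 1 pairs, so the sum of all pairwise
    distances divided by N - 1 cannot exceed the sum of distances of any P."""
    n = len(ensemble)
    total = sum(pair_disagreement(p, q) for p, q in combinations(ensemble, 2))
    return total / (n - 1)


# Hypothetical toy ensemble: three clusterings of four objects.
ensemble = [
    [0, 0, 1, 1],
    [0, 0, 1, 2],
    [0, 1, 1, 1],
]
candidate = [0, 0, 1, 1]  # consensus produced by some heuristic (assumed)

sod = sum_of_distances(candidate, ensemble)
bound = pairwise_lower_bound(ensemble)
gap = (sod - bound) / sod  # relative distance from the bound; 0 means optimal

print(f"SOD = {sod}, lower bound = {bound}, relative gap = {gap:.2%}")
```

On this toy ensemble the candidate's sum of distances equals the bound (both are 4), so no other partition can do better, and this is established without any ground truth. When the gap is positive, it only caps the possible improvement, since the bound itself may not be attainable.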

Keywords

Distance Function; Ground Truth; Cluster Ensemble; Consensus Cluster; Cluster Ensemble Method
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Xiaoyi Jiang (1)
  • Daniel Abdala (1)
  1. Department of Mathematics and Computer Science, University of Münster, Münster, Germany
