Exploring the Performance Limit of Cluster Ensemble Techniques

Jiang, Xiaoyi; Abdala, Daniel

doi:10.1007/978-3-642-14980-1_39

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6218)

Included in the following conference series:

Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)

Abstract

Cluster ensemble techniques are a means of boosting clustering performance. However, many cluster ensemble methods face high computational complexity; indeed, the median partition problem is \(\mathcal{NP}\)-complete. While a variety of approximate approaches yielding suboptimal solutions have been proposed in the literature, their performance is typically evaluated against ground truth. In contrast, this work explores the question of how well cluster ensemble methods perform in an absolute sense without ground truth, i.e., how they compare to the (unknown) optimal solution. We present a study that applies and extends a lower bound in an attempt to answer this question. In particular, we demonstrate the tightness of the lower bound, which indicates that no room remains for further improvement (for the particular data set at hand). The lower bound can thus be considered a means of exploring the performance limit of cluster ensemble techniques.
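
To make the notion of a lower bound concrete, the following is a minimal sketch under stated assumptions, not the bound studied in the chapter. It assumes the Mirkin (pair-counting) distance between partitions, which is a metric, and uses only the triangle inequality: for an ensemble p_1, ..., p_n and any candidate partition p, d(p, p_i) + d(p, p_j) >= d(p_i, p_j), and summing over all pairs gives SOD(p) >= (1 / (n - 1)) * sum over i < j of d(p_i, p_j), where SOD(p) is the sum of distances from p to the ensemble. The function names and the toy ensemble are hypothetical.

```python
from itertools import combinations


def mirkin_distance(a, b):
    """Count object pairs co-clustered in exactly one of the two labelings."""
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def sum_of_distances(candidate, ensemble):
    """Objective of the median partition problem for one candidate."""
    return sum(mirkin_distance(candidate, p) for p in ensemble)


def pairwise_lower_bound(ensemble):
    """Triangle-inequality bound valid for every candidate partition.

    Assumes the distance is a metric and the ensemble has at least
    two partitions.
    """
    n = len(ensemble)
    total = sum(mirkin_distance(p, q) for p, q in combinations(ensemble, 2))
    return total / (n - 1)


# Hypothetical toy ensemble: three labelings of four objects.
ensemble = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]

bound = pairwise_lower_bound(ensemble)
best = min(sum_of_distances(p, ensemble) for p in ensemble)
print(f"lower bound = {bound}, best ensemble member SOD = {best}")
```

If a consensus method returns a partition whose sum of distances is close to such a bound, little improvement is possible for that data set, whatever the unknown optimum is. A gap does not by itself show that the method is poor, since a simple bound of this kind may be loose.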

Author information

Authors and Affiliations

Department of Mathematics and Computer Science, University of Münster, Einsteinstrasse 62, D-48149, Münster, Germany
Xiaoyi Jiang & Daniel Abdala

Editor information

Editors and Affiliations

Computer Vision and Pattern Recognition Group, Computer Science, University of York, Heslington, YO10 5DD, York, United Kingdom
Edwin R. Hancock
Department of Computer Science, University of York, YO10 5DD, UK
Richard C. Wilson
Centre for Vision, Speech and Signal Processing (CVSSP), University of Surrey, Guildford, GU2 7XH, Surrey, United Kingdom
Terry Windeatt
Electrical and Electronics Engineering Department, Middle East Technical University, 06531, Ankara, Turkey
Ilkay Ulusoy
Department of Computer Science and Artificial Intelligence, University of Alicante, P.O.B. 99, E-03080, Alicante, Spain
Francisco Escolano

About this paper

Cite this paper

Jiang, X., Abdala, D. (2010). Exploring the Performance Limit of Cluster Ensemble Techniques. In: Hancock, E.R., Wilson, R.C., Windeatt, T., Ulusoy, I., Escolano, F. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2010. Lecture Notes in Computer Science, vol 6218. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14980-1_39

DOI: https://doi.org/10.1007/978-3-642-14980-1_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14979-5
Online ISBN: 978-3-642-14980-1
eBook Packages: Computer Science, Computer Science (R0)
