Abstract
Cluster ensembles are collections of individual solutions to a given clustering problem which are useful or necessary to consider in a wide range of applications. Aggregating these into a “common” solution amounts to finding a consensus clustering, which can be characterized in a general optimization framework. We discuss recent conceptual and computational advances in this area, and indicate how these can be used for analyzing the structure in cluster ensembles by clustering their elements.
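As a rough illustration of the optimization view of consensus clustering, the following sketch picks, from a small ensemble of hard partitions, the member with the smallest summed dissimilarity to all members. Everything in it is an assumption made for illustration and is not taken from the chapter: the function names `pair_disagreement` and `medoid_consensus`, the toy ensemble, the choice of counting disagreeing object pairs as the dissimilarity, and the restriction of candidate solutions to the ensemble itself (a medoid-style shortcut rather than a search over all partitions).

```python
from itertools import combinations


def pair_disagreement(p, q):
    """Count object pairs co-clustered in one partition but not in the other.

    Partitions are label lists of equal length; label names do not matter.
    """
    return sum(
        (p[i] == p[j]) != (q[i] == q[j])
        for i, j in combinations(range(len(p)), 2)
    )


def medoid_consensus(ensemble):
    """Return the ensemble member minimizing the summed disagreement.

    Assumption: candidates are restricted to the ensemble members, so this
    is only a cheap stand-in for a full consensus optimization.
    """
    return min(
        ensemble,
        key=lambda cand: sum(pair_disagreement(cand, p) for p in ensemble),
    )


# Hypothetical ensemble: four clusterings of the same five objects.
ensemble = [
    [0, 0, 1, 1, 2],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1],
    [0, 1, 1, 2, 2],
]

consensus = medoid_consensus(ensemble)
print(consensus)
```

Because the dissimilarity compares only which pairs of objects share a cluster, the second and third partitions count as identical even though their labels differ, which is why no explicit label matching is needed in this sketch.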
Copyright information
© 2005 Springer-Verlag Berlin · Heidelberg
About this paper
Cite this paper
Hornik, K. (2005). Cluster Ensembles. In: Weihs, C., Gaul, W. (eds) Classification — the Ubiquitous Challenge. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28084-7_6
DOI: https://doi.org/10.1007/3-540-28084-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-25677-9
Online ISBN: 978-3-540-28084-2
eBook Packages: Mathematics and Statistics (R0)