Evidence Accumulation Clustering Based on the K-Means Algorithm

Fred, Ana; Jain, Anil K.

doi:10.1007/3-540-70659-3_46

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2396)

Included in the following conference series:

Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)

Abstract

The idea of evidence accumulation for the combination of multiple clusterings was recently proposed [7]. Taking K-means as the basic algorithm for decomposing the data into a large number, k, of compact clusters, evidence on pattern association is accumulated, by a voting mechanism, over multiple clusterings obtained from random initializations of the K-means algorithm. This maps the clusterings into a new similarity measure between patterns. The final data partition is obtained by applying the single-link method to this similarity matrix. In this paper we further explore and extend this idea by proposing: (a) the combination of multiple K-means clusterings using variable k; (b) the use of cluster lifetime as the criterion for extracting the final clusters; and (c) the adaptation of this approach to string patterns. This leads to a more robust clustering technique, with fewer design parameters than the previous approach and potential applications in a wider range of problems.
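
The pipeline outlined in the abstract (K-means runs with variable k, voting into a similarity matrix, single-link, lifetime-based selection) can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names and parameters (kmeans_ensemble, co_association, single_link_lifetime, n_runs, k_min, k_max, seed) are hypothetical, and the lifetime rule used here is an assumption: the lifetime of a partition is taken to be the gap between the similarity level at which it is formed and the level of the next single-link merge, and the all-singletons partition is not considered.

```python
import random


def kmeans_labels(points, k, rng, iters=20):
    """Plain Lloyd's K-means on tuples of floats; returns one label per point."""
    centers = rng.sample(points, k)  # random initialization from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest center (squared Euclidean distance).
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        # Move each center to the mean of its members; empty clusters stay put.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def kmeans_ensemble(points, n_runs, k_min, k_max, seed=0):
    """Run K-means n_runs times, drawing k uniformly from [k_min, k_max]."""
    rng = random.Random(seed)
    return [kmeans_labels(points, rng.randint(k_min, k_max), rng)
            for _ in range(n_runs)]


def co_association(partitions):
    """Voting step: fraction of partitions in which each pair shares a cluster."""
    n = len(partitions[0])
    votes = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    votes[i][j] += 1
    return [[v / len(partitions) for v in row] for row in votes]


def single_link_lifetime(sim):
    """Single-link merging on a similarity matrix; return the longest-lived partition."""
    n = len(sim)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Single-link: visit pairs from most to least similar and merge components.
    pairs = sorted(((sim[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
                   reverse=True)
    merge_levels = []                 # similarity at which each merge happens
    snapshots = [list(range(n))]      # snapshots[m] = component roots after m merges
    for s, i, j in pairs:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            merge_levels.append(s)
            snapshots.append([find(x) for x in range(n)])
    # Assumed lifetime of the partition after m merges: level that created it
    # minus the level of the next merge (0.0 once everything is merged).
    levels = merge_levels + [0.0]
    best = max(range(1, n), key=lambda m: levels[m - 1] - levels[m])
    ids = {}                          # relabel roots as 0, 1, 2, ... by first appearance
    return [ids.setdefault(r, len(ids)) for r in snapshots[best]]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
           (9.0, 9.0), (9.0, 10.0), (10.0, 9.0), (10.0, 10.0)]
    ensemble = kmeans_ensemble(pts, n_runs=30, k_min=2, k_max=4)
    print(single_link_lifetime(co_association(ensemble)))
```

Note that the number of final clusters is not passed in anywhere: it falls out of the lifetime rule, which is the sense in which the approach needs fewer design parameters. The quadratic voting loop is kept for readability and would need a vectorized form for larger data sets.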

References

[1] T. A. Bailey and R. Dubes. Cluster validity profiles. Pattern Recognition, 15(2):61–83, 1982.
[2] J. Buhmann and M. Held. Unsupervised learning without overfitting: Empirical risk approximation as an induction principle for reliable clustering. In Sameer Singh, editor, International Conference on Advances in Pattern Recognition, pages 167–176. Springer Verlag, 1999.
[3] R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. Wiley, second edition, 2001.
[4] Y. El-Sonbaty and M. A. Ismail. On-line hierarchical clustering. Pattern Recognition Letters, pages 1285–1291, 1998.
[5] M. Figueiredo and A. K. Jain. Unsupervised learning of finite mixture models. IEEE Trans. Pattern Analysis and Machine Intelligence, 24(3):381–396, 2002.
[6] B. Fischer, T. Zoller, and J. Buhmann. Path based pairwise data clustering with application to texture segmentation. In M. Figueiredo, J. Zerubia, and A. K. Jain, editors, Energy Minimization Methods in Computer Vision and Pattern Recognition, volume 2134 of LNCS, pages 235–266. Springer Verlag, 2001.
[7] A. L. Fred. Finding consistent clusters in data partitions. In Josef Kittler and Fabio Roli, editors, Multiple Classifier Systems, volume 2096 of LNCS, pages 309–318. Springer, 2001.
[8] A. L. Fred and J. Leitão. Clustering under a hypothesis of smooth dissimilarity increments. In Proc. of the 15th Int'l Conference on Pattern Recognition, volume 2, pages 190–194, Barcelona, 2000.
[9] A. L. Fred, J. S. Marques, and P. M. Jorge. Hidden Markov models vs syntactic modeling in object recognition. In ICIP'97, 1997.
[10] M. Har-Even and V. L. Brailovsky. Probabilistic validation approach for clustering. Pattern Recognition, 16:1189–1196, 1995.
[11] A. Jain. Fundamentals of Digital Image Processing. Prentice-Hall, 1989.
[12] A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice Hall, 1988.
[13] A. K. Jain, M. N. Murty, and P. J. Flynn. Data clustering: A review. ACM Computing Surveys, 31(3):264–323, September 1999.
[14] J. Kittler, M. Hatef, R. P. W. Duin, and J. Matas. On combining classifiers. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(3):226–239, 1998.
[15] R. Kothari and D. Pitts. On finding the number of clusters. Pattern Recognition Letters, 20:405–416, 1999.
[16] Y. Man and I. Gath. Detection and separation of ring-shaped clusters using fuzzy clusters. IEEE Trans. Pattern Analysis and Machine Intelligence, 16(8):855–861, August 1994.
[17] A. Marzal and E. Vidal. Computation of normalized edit distance and applications. IEEE Trans. Pattern Analysis and Machine Intelligence, 15(9):926–932, 1993.
[18] G. McLachlan and K. Basford. Mixture Models: Inference and Application to Clustering. Marcel Dekker, New York, 1988.
[19] B. Mirkin. Concept learning and feature selection based on square-error clustering. Machine Learning, 35:25–39, 1999.
[20] N. R. Pal and J. C. Bezdek. On cluster validity for the fuzzy c-means model. IEEE Trans. Fuzzy Systems, 3:370–379, 1995.
[21] E. J. Pauwels and G. Frederix. Finding regions of interest for content-extraction. In Proc. of IS&T/SPIE Conference on Storage and Retrieval for Image and Video Databases VII, SPIE Vol. 3656, pages 501–510, San Jose, January 1999.
[22] E. S. Ristad and P. N. Yianilos. Learning string-edit distance. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(5):522–531, May 1998.
[23] S. Roberts, D. Husmeier, I. Rezek, and W. Penny. Bayesian approaches to Gaussian mixture modelling. IEEE Trans. Pattern Analysis and Machine Intelligence, 20(11), November 1998.
[24] D. Stanford and A. E. Raftery. Principal curve clustering with noise. Technical report, University of Washington, http://www.stat.washington.edu/raftery, 1997.
[25] H. Tenmoto, M. Kudo, and M. Shimbo. MDL-based selection of the number of components in mixture models for pattern recognition. In Adnan Amin, Dov Dori, Pavel Pudil, and Herbert Freeman, editors, Advances in Pattern Recognition, volume 1451 of Lecture Notes in Computer Science, pages 831–836. Springer Verlag, 1998.
[26] C. Zahn. Graph-theoretical methods for detecting and describing gestalt structures. IEEE Trans. Computers, C-20(1):68–86, 1971.

Author information

Authors and Affiliations

Instituto de Telecomunicações, Instituto Superior Técnico, Lisbon, Portugal
Ana Fred
Department of Computer Science and Engineering, Michigan State University, USA
Anil K. Jain

Editor information

Editors and Affiliations

Dept. of Computing Science, University of Alberta, Athabasca Hall, Room 409, Edmonton, Alberta, Canada, T6G 2H1
Terry Caelli
School of Computer Science and Engineering, University of New South Wales, Sydney, 2052, NSW, Australia
Adnan Amin
Dept. of Applied Physics Pattern Recognition Group, Delft University of Technology, Lorentzweg 1, 2628 CJ, Delft, The Netherlands
Robert P. W. Duin & Dick de Ridder
Dept. of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada, N2L 3G1
Mohamed Kamel

Cite this paper

Fred, A., Jain, A.K. (2002). Evidence Accumulation Clustering Based on the K-Means Algorithm. In: Caelli, T., Amin, A., Duin, R.P.W., de Ridder, D., Kamel, M. (eds) Structural, Syntactic, and Statistical Pattern Recognition. SSPR/SPR 2002. Lecture Notes in Computer Science, vol 2396. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-70659-3_46

DOI: https://doi.org/10.1007/3-540-70659-3_46
Published: 21 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44011-6
Online ISBN: 978-3-540-70659-5
eBook Packages: Springer Book Archive
