Abstract
This paper proposes combining more than one clustering method to improve clustering performance. Clustering is an optimization procedure based on a specific clustering criterion, so clustering combination can be regarded as a technique that constructs and processes multiple clustering criteria. Because global and local clustering criteria are complementary rather than competitive, combining the two types may enhance clustering performance. In our earlier work, we proposed a simultaneous clustering combination algorithm based on multi-objective programming, which incorporates multiple criteria into a single objective function by a weighting method and solves the resulting problem by constrained nonlinear optimization. That algorithm, however, has high computational complexity. Here we investigate a sequential combination approach: clustering based on the global criterion first produces an initial result, and information based on the local criterion is then used to improve that result with a probabilistic relaxation algorithm or a linear additive model. Compared with the simultaneous combination method, sequential combination has low computational complexity. Results on simulated data and standard test data are reported. They suggest that sequential combination can improve clustering performance at low cost.
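The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not give the criteria or update rules, so everything here is an assumption chosen for illustration. Plain k-means stands in for the global criterion, a k-nearest-neighbor graph stands in for the local information, and the refinement is a simple linear additive blend of each point's membership with the average membership of its neighbors. The function names and the parameters `n_neighbors`, `alpha`, and `n_iter` are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means, used here as a stand-in global (sum-of-squares) criterion."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's center unchanged
                centers[j] = X[labels == j].mean(axis=0)
    return centers


def soft_memberships(X, centers):
    """Convert squared distances to centers into row-normalized memberships."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    w = np.exp(-(d2 - d2.min(axis=1, keepdims=True)))
    return w / w.sum(axis=1, keepdims=True)


def knn_indices(X, n_neighbors):
    """Indices of each point's nearest neighbors (assumed local structure)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :n_neighbors]


def refine(P, nbrs, alpha=0.5, n_iter=10):
    """Assumed linear additive update: blend own membership with neighbor mean."""
    for _ in range(n_iter):
        support = P[nbrs].mean(axis=1)  # shape (n_points, n_clusters)
        P = (1 - alpha) * P + alpha * support
        P = P / P.sum(axis=1, keepdims=True)  # guard against float drift
    return P


def sequential_clustering(X, k, n_neighbors=5, alpha=0.5, n_iter=10, seed=0):
    """Stage 1: global clustering. Stage 2: local refinement of memberships."""
    X = np.asarray(X, dtype=float)
    centers = kmeans(X, k, seed=seed)
    P0 = soft_memberships(X, centers)
    P = refine(P0, knn_indices(X, n_neighbors), alpha, n_iter)
    return P0.argmax(axis=1), P.argmax(axis=1), P
```

Calling `sequential_clustering(X, k=2)` on an `(n, d)` array returns the initial labels, the refined labels, and the refined membership matrix. The second stage costs only a few neighbor-averaging passes over the memberships, which is the kind of low-cost post-processing the sequential approach relies on, in contrast to solving one joint constrained optimization problem.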
Additional information
This work is supported in part by the China State Education Commission Laboratory for Image Processing and Intelligent Control under grant TKLJ9901, and Zhejiang Education Commission under grant No. 19990119.
QIAN Yuntao received the B.E. and M.E. degrees in automatic control from Xi’an Jiaotong University in 1989 and 1992, respectively, and his Ph.D. degree in signal processing from Xidian University in 1996. From 1996 to 1998, he was a postdoctoral fellow at Northwestern Polytechnical University. Since 1998, he has been an associate professor in the Department of Computer Science, Zhejiang University. From 1999 to 2001, he was a visiting scholar at the Centre of Pattern Recognition and Machine Intelligence, Concordia University, Canada, and at the Department of Computer Science, Hong Kong Baptist University. He has published more than 20 technical papers in academic journals and conference proceedings. His present research interests include data clustering analysis, pattern recognition, image processing, wavelet theory, and neural networks.
Ching Y. Suen received his M.S. degree in engineering from the University of Hong Kong, followed by the Ph.D. degree from the University of British Columbia, Canada. In 1972, he joined the Department of Computer Science, Concordia University, Canada, and became a professor in 1979. He is the director of the Centre of Pattern Recognition and Machine Intelligence. He is the author/editor of 11 books and more than 260 papers on subjects ranging from computer vision and handwriting recognition to expert systems and computational linguistics. He is the founder of a journal and an associate editor of several journals related to pattern recognition. He is a fellow of IEEE, IAPR, and the Academy of Sciences of the Royal Society of Canada.
TANG Yuanyan received his B.S. degree in electrical and computer engineering from Chongqing University, his M.Eng. degree in electrical engineering from the Beijing University of Posts and Telecommunications, and his Ph.D. degree in computer science from Concordia University, Canada. He is presently a professor in the Department of Computer Science, Hong Kong Baptist University. He has published more than 160 technical papers and is the author/co-author of 15 books. He is an associate editor of the International Journal of Pattern Recognition and Artificial Intelligence. His current research interests include wavelet theory and applications, pattern recognition, document processing, and artificial intelligence.
About this article
Cite this article
Qian, Y., Suen, C.Y. & Tang, Y. Sequential combination methods for data clustering analysis. J. Comput. Sci. & Technol. 17, 118–128 (2002). https://doi.org/10.1007/BF02962204