Abstract
Clustering ensembles have emerged as a powerful method for improving both the robustness and the stability of unsupervised classification solutions. However, finding a consensus clustering from multiple partitions is a difficult problem that can be approached from graph-based, combinatorial, or statistical perspectives. We offer a probabilistic model of consensus based on a finite mixture of multinomial distributions in the space of clusterings. A combined partition is found as a solution to the corresponding maximum likelihood problem using a genetic algorithm (GA). The excellent scalability of this algorithm and its comprehensible underlying model are particularly important for clustering large datasets. The study proceeds in two stages. In the first, we compute a correlation matrix that captures the pairwise correlations between samples and use it to identify the samples best suited to serve as cluster centers. In the second, a genetic algorithm is applied to an evolving ensemble (population) of clustering algorithms, together with a special objective function, to produce the most stable partitions. The objective function evaluates multiple partitions according to the changes caused by data perturbations and prefers those clusterings that are least susceptible to such perturbations.
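The paper provides no code here; the following Python sketch only illustrates the two stages the abstract describes, under simplifying assumptions. The function names (`pick_centers`, `fitness`, `ga_consensus`) are ours, and the GA fitness below scores a candidate consensus by its average pairwise (co-association) agreement with the ensemble's partitions, as a simple stand-in for the paper's perturbation-stability objective.

```python
import numpy as np

def pick_centers(X, k):
    """Stage 1: use the sample-by-sample correlation matrix to find
    the k samples most correlated with the rest, as candidate cluster
    centers (a heuristic reading of the paper's first stage)."""
    C = np.corrcoef(X)                    # (n, n) correlations between samples
    scores = np.abs(C).sum(axis=1)        # how strongly each sample relates to all others
    return np.argsort(scores)[-k:]        # indices of the k best-connected samples

def fitness(labels, ensemble):
    """Average pairwise agreement between a candidate consensus labeling
    and every partition in the ensemble (co-association agreement)."""
    same = labels[:, None] == labels[None, :]
    score = 0.0
    for p in ensemble:
        score += np.mean(same == (p[:, None] == p[None, :]))
    return score / len(ensemble)

def ga_consensus(ensemble, k, pop=30, gens=60, seed=0):
    """Stage 2: evolve a population of candidate labelings toward the
    consensus that best agrees with the ensemble."""
    rng = np.random.default_rng(seed)
    n = len(ensemble[0])
    population = rng.integers(0, k, size=(pop, n))
    for _ in range(gens):
        fit = np.array([fitness(ind, ensemble) for ind in population])
        elite = population[np.argsort(fit)[::-1][: pop // 2]]
        # uniform crossover of random elite parents, plus point mutation
        pa = elite[rng.integers(0, len(elite), pop)]
        pb = elite[rng.integers(0, len(elite), pop)]
        mask = rng.random((pop, n)) < 0.5
        children = np.where(mask, pa, pb)
        mut = rng.random((pop, n)) < 0.02
        children[mut] = rng.integers(0, k, mut.sum())
        children[0] = elite[0]            # elitism: keep the best individual
        population = children
    fit = np.array([fitness(ind, ensemble) for ind in population])
    return population[np.argmax(fit)]
```

With three identical partitions in the ensemble, the GA should recover a labeling equivalent (up to label permutation) to that shared partition, since its fitness is maximal.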
Copyright information
© 2006 International Federation for Information Processing
Cite this paper
Analoui, M., Sadighian, N. (2006). Solving Cluster Ensemble Problems by Correlation’s matrix & GA. In: Shi, Z., Shimohara, K., Feng, D. (eds) Intelligent Information Processing III. IIP 2006. IFIP International Federation for Information Processing, vol 228. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-44641-7_24
Print ISBN: 978-0-387-44639-4
Online ISBN: 978-0-387-44641-7