Using Genetic Algorithm for Selection of Initial Cluster Centers for the K-Means Method
The K-means algorithm is one of the most widely used clustering methods. However, the solutions it produces depend strongly on the initialization of the cluster centers. In this paper, a novel genetic algorithm for selecting initial cluster centers, called GAKMI (Genetic Algorithm for the K-Means Initialization), is proposed. In contrast to most approaches described in the literature, which encode the coordinates of cluster centers directly in a chromosome, our method uses binary encoding: bits set to one select elements of the learning set as initial cluster centers. Because not every binary chromosome encodes a feasible solution in this approach, we propose two repair algorithms that convert infeasible chromosomes into feasible ones. GAKMI was tested on three datasets using varying numbers of clusters. The experimental results are encouraging.
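The binary encoding described above can be illustrated with a minimal sketch. It assumes that a chromosome is feasible when exactly k bits are set, which the abstract does not state explicitly. The `repair` function is a hypothetical random bit-flipping repair written for illustration only; it is not one of the two repair algorithms proposed in the paper, which the abstract does not describe. The names `repair` and `decode` and the toy data are likewise assumptions.

```python
import random


def repair(chromosome, k, rng):
    """Hypothetical repair: flip randomly chosen bits until exactly k are set.

    Assumes feasibility means "exactly k ones"; this is an illustrative
    stand-in, not a repair algorithm from the paper.
    """
    bits = list(chromosome)
    ones = [i for i, b in enumerate(bits) if b == 1]
    zeros = [i for i, b in enumerate(bits) if b == 0]
    if len(ones) > k:
        # Too many centers selected: clear the surplus bits at random.
        for i in rng.sample(ones, len(ones) - k):
            bits[i] = 0
    elif len(ones) < k:
        # Too few centers selected: set the missing bits at random.
        for i in rng.sample(zeros, k - len(ones)):
            bits[i] = 1
    return bits


def decode(chromosome, data):
    """Return the learning-set elements selected by bits set to one."""
    return [point for bit, point in zip(chromosome, data) if bit == 1]


# Toy learning set: one bit per element.
data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (9.0, 0.5)]
rng = random.Random(0)

chromosome = [1, 1, 0, 1, 1]  # four bits set, but k = 3, so infeasible here
feasible = repair(chromosome, k=3, rng=rng)
centers = decode(feasible, data)  # initial cluster centers for K-means
```

Under this encoding every initial center is an actual element of the learning set, and the chromosome length equals the number of elements rather than k times the number of features.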
Keywords: Genetic Algorithm; Feature Vector; Cluster Center; Binary String; Initial Center