
A K-means Based Genetic Algorithm for Data Clustering

  • Conference paper
International Joint Conference SOCO’16-CISIS’16-ICEUTE’16 (SOCO 2016, CISIS 2016, ICEUTE 2016)

Abstract

A genetic algorithm that exploits the K-means principles for dividing objects into groups with high similarity is proposed. The method evolves a population of chromosomes, each representing a division of the objects into a different number of clusters. A group-based crossover, enriched with the one-step K-means operator, and a mutation strategy that reassigns objects to clusters on the basis of their distance to the clusters computed so far allow the approach to determine the best number of groups present in the dataset. The method has been tested with four different fitness functions on both synthetic and real-world datasets for which the ground-truth division is known, and compared with the K-means method. Results show that the approach obtains higher values of the evaluation indexes than those obtained by the K-means method.
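To make the "one-step K-means operator" mentioned in the abstract more concrete, the following is a minimal sketch of a single K-means iteration applied to one candidate partition. It is an illustration only, not the authors' implementation: the label-vector encoding of a chromosome, the Euclidean distance, and the function names `centroids_of` and `one_step_kmeans` are all assumptions made for this example.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroids_of(points: List[Point], labels: List[int]) -> Dict[int, List[float]]:
    """Mean of the points in each non-empty cluster, keyed by cluster label."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for p, lab in zip(points, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(p)
            counts[lab] = 0
        for d, x in enumerate(p):
            sums[lab][d] += x
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def one_step_kmeans(points: List[Point], labels: List[int]) -> List[int]:
    """One K-means iteration (assumed encoding: labels[i] is the cluster of points[i]).

    Centroids are recomputed from the current labels, then every point is
    reassigned to its nearest centroid. Ties go to the smallest label.
    """
    cents = centroids_of(points, labels)
    ordered = sorted(cents)  # fixed order so tie-breaking is deterministic
    return [min(ordered, key=lambda lab: euclidean(p, cents[lab])) for p in points]


# Hypothetical toy data: two well-separated pairs, with one point mislabeled.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
labels = [0, 0, 0, 1]
refined = one_step_kmeans(points, labels)
print(refined)  # [0, 0, 1, 1]
```

In this sketch a cluster that loses all of its points simply disappears from the next centroid computation, so the number of clusters in a partition can shrink from one step to the next. Whether the paper handles empty clusters this way is not stated in the abstract.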



Acknowledgment

This work has been partially supported by MIUR D.D. n 0001542, under the project BA2KNOW - PON03PE_00001_1.


Corresponding author

Correspondence to Clara Pizzuti.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Pizzuti, C., Procopio, N. (2017). A K-means Based Genetic Algorithm for Data Clustering. In: Graña, M., López-Guede, J.M., Etxaniz, O., Herrero, Á., Quintián, H., Corchado, E. (eds) International Joint Conference SOCO’16-CISIS’16-ICEUTE’16 (SOCO 2016, CISIS 2016, ICEUTE 2016). Advances in Intelligent Systems and Computing, vol 527. Springer, Cham. https://doi.org/10.1007/978-3-319-47364-2_21


  • DOI: https://doi.org/10.1007/978-3-319-47364-2_21


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-47363-5

  • Online ISBN: 978-3-319-47364-2

  • eBook Packages: Engineering, Engineering (R0)
