AKM—Augmentation of K-Means Clustering Algorithm for Big Data

  • Puja Shrivastava
  • Laxman Sahoo
  • Manjusha Pandey
  • Sandeep Agrawal
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 695)


Clustering for big data analytics is a growing research area because large, varied data sets must be analyzed in distributed and parallel environments. This paper proposes an augmentation of the K-Means clustering algorithm and evaluates it on the MapReduce framework, borrowing steps from genetic algorithms. Chromosome formation, fitness calculation, optimization, and crossover are used to overcome the suboptimal solutions of the K-Means clustering algorithm and to reduce the time complexity of the genetic K-Means algorithm for big data. The proposed algorithm omits the selection of parents for the mating pool and the mutation step, which improves its running time.
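The abstract names the steps but not their details, so the following is only a minimal in-memory sketch of one possible reading, not the authors' implementation. It assumes a chromosome is a list of k centroids, fitness is the inverse of the within-cluster sum of squared errors, "optimization" is one K-Means update per chromosome, and crossover is a single-point swap between neighbouring chromosomes. All function names and parameters (`akm_sketch`, `pop_size`, `generations`, `seed`) are hypothetical, and the assign/update pair merely stands in for what a MapReduce job would distribute.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Map-like step: index of the nearest centroid for each point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def update(points, labels, centroids):
    """Reduce-like step: recompute each centroid as the mean of its members."""
    new = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            new.append(tuple(sum(c) / len(members) for c in zip(*members)))
        else:
            new.append(old)  # assumption: an empty cluster keeps its centroid
    return new

def sse(points, chromosome):
    """Within-cluster sum of squared errors for one chromosome."""
    labels = assign(points, chromosome)
    return sum(sq_dist(p, chromosome[lab]) for p, lab in zip(points, labels))

def fitness(points, chromosome):
    """Assumed fitness: higher when the clustering error is lower."""
    return 1.0 / (1.0 + sse(points, chromosome))

def crossover(a, b, cut):
    """Single-point crossover: swap the centroid tails after position cut."""
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def akm_sketch(points, k, pop_size=4, generations=5, seed=0):
    rng = random.Random(seed)
    # Chromosome formation: each chromosome is k points drawn from the data.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    best = max(population, key=lambda c: fitness(points, c))
    for _ in range(generations):
        # Optimization: one K-Means update per chromosome.
        population = [update(points, assign(points, c), c) for c in population]
        # Track the fittest chromosome seen so far.
        best = max(population + [best], key=lambda c: fitness(points, c))
        # Crossover of neighbours in order: no parent selection, no mutation.
        children = []
        for i in range(0, len(population) - 1, 2):
            cut = rng.randint(1, k - 1) if k > 1 else 0
            children.extend(crossover(population[i], population[i + 1], cut))
        population = children
    return best

if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(akm_sketch(data, k=2))
```

Pairing neighbours in list order is one simple way to honour the abstract's statement that no parents are selected for a mating pool; the sketch works best with an even `pop_size`, since an unpaired last chromosome is dropped.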


Big data analytics, K-Means, Genetic clustering, Chromosome, Optimized cluster



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Puja Shrivastava (1)
  • Laxman Sahoo (1)
  • Manjusha Pandey (1)
  • Sandeep Agrawal (1)
  1. School of Computer Engineering, KIIT University, Bhubaneswar, India
