AKM—Augmentation of K-Means Clustering Algorithm for Big Data

Shrivastava, Puja; Sahoo, Laxman; Pandey, Manjusha; Agrawal, Sandeep

doi:10.1007/978-981-10-7566-7_11

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 695)

Abstract

Clustering for big data analytics is a growing subject because large and varied data sets must be analyzed in distributed and parallel environments. This paper proposes and evaluates an augmentation of the K-Means clustering algorithm for the MapReduce framework that borrows steps from genetic algorithms. Chromosome formation, fitness calculation, optimization, and crossover are used to overcome the suboptimal solutions of the K-Means clustering algorithm and to reduce the time complexity of the genetic K-Means algorithm for big data. The proposed algorithm omits the selection of parents for the mating pool and the mutation step, which improves its running time.
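
The abstract names the steps but not their details, so the following is only a minimal single-machine sketch of how such a loop might look, not the authors' implementation. Everything specific here is an assumption: a chromosome is taken to be a list of K centroids, fitness is the sum of squared distances to the nearest centroid (lower is better), "optimization" is one K-Means assignment and update pass, and crossover is single-point between fitness-ranked neighbours. There is no mating-pool selection and no mutation, as the abstract states. The function names (akm_sketch, kmeans_step, sse, crossover) are hypothetical, and the MapReduce distribution is not modelled.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(points, centroids):
    """Assumed fitness: total squared distance to the nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)


def kmeans_step(points, centroids):
    """One K-Means pass: assign points, then recompute each centroid."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: sq_dist(p, centroids[i]))
        groups[nearest].append(p)
    # An empty cluster keeps its previous centroid.
    return [
        tuple(sum(col) / len(g) for col in zip(*g)) if g else c
        for g, c in zip(groups, centroids)
    ]


def crossover(a, b, rng):
    """Single-point crossover, cutting between whole centroids."""
    if len(a) < 2:
        return a, b
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def akm_sketch(points, k, pop_size=6, generations=10, seed=0):
    """Return the best chromosome (list of k centroids) seen so far."""
    rng = random.Random(seed)
    # Chromosome formation: each chromosome is k points sampled from the data.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    best = min(population, key=lambda ch: sse(points, ch))
    for _ in range(generations):
        # Optimization: refine every chromosome with one K-Means pass.
        population = [kmeans_step(points, ch) for ch in population]
        # Fitness calculation: rank chromosomes, remember the best one.
        population.sort(key=lambda ch: sse(points, ch))
        if sse(points, population[0]) < sse(points, best):
            best = population[0]
        # Crossover between ranked neighbours; no mating pool, no mutation.
        children = []
        for i in range(0, len(population) - 1, 2):
            children.extend(crossover(population[i], population[i + 1], rng))
        if len(population) % 2:
            children.append(population[-1])
        population = children
    return best
```

A call such as akm_sketch(points, k=2), with points given as a list of equal-length tuples, returns a list of two centroid tuples. Keeping the best chromosome outside the population means a disruptive crossover cannot lose a good solution, which is one possible way to do without a selection step.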

Author information

Authors and Affiliations

School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India
Puja Shrivastava, Laxman Sahoo, Manjusha Pandey & Sandeep Agrawal

Corresponding author

Correspondence to Puja Shrivastava.

Editor information

Editors and Affiliations

Department of Electronics and Communication Engineering, SRMGPC, Lucknow, Uttar Pradesh, India
Vikrant Bhateja
Departamento de Computación, CINVESTAV-IPN, Mexico City, Mexico
Carlos A. Coello Coello
Department of Computer Science and Engineering, PVP Siddhartha Institute of Technology, Vijayawada, Andhra Pradesh, India
Suresh Chandra Satapathy
School of Computer Engineering, KIIT University, Bhubaneswar, Odisha, India
Prasant Kumar Pattnaik

About this paper

Cite this paper

Shrivastava, P., Sahoo, L., Pandey, M., Agrawal, S. (2018). AKM—Augmentation of K-Means Clustering Algorithm for Big Data. In: Bhateja, V., Coello Coello, C., Satapathy, S., Pattnaik, P. (eds) Intelligent Engineering Informatics. Advances in Intelligent Systems and Computing, vol 695. Springer, Singapore. https://doi.org/10.1007/978-981-10-7566-7_11

DOI: https://doi.org/10.1007/978-981-10-7566-7_11
Published: 11 April 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7565-0
Online ISBN: 978-981-10-7566-7
eBook Packages: Engineering, Engineering (R0)
