Abstract
Clustering for big data analytics is a growing research area, driven by the large and varied data sets that must be analyzed in distributed and parallel environments. This paper proposes and evaluates an augmentation of the K-Means clustering algorithm for the MapReduce framework that borrows steps from genetic algorithms. Chromosome formation, fitness calculation, optimization, and crossover are used to overcome the suboptimal solutions of the K-Means clustering algorithm and to reduce the time complexity of the genetic K-Means algorithm for big data. The proposed algorithm omits both the selection of parents for the mating pool and the mutation step, which improves its running time.
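The abstract names the steps but not their details, so the following is only a minimal single-machine sketch of how such a pipeline might fit together, not the authors' implementation. Every concrete choice is an assumption: a chromosome is taken to be a list of k centroids, fitness is the sum of squared errors (lower is better), "optimization" is one K-Means assign/update pass written as a map step and a reduce step, crossover is single-point between neighbouring chromosomes, and the fittest chromosomes are kept after each generation. All function names are hypothetical.

```python
import random
from collections import defaultdict

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(point, centroids):
    """Index of the centroid closest to point (ties go to the lower index)."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))

def map_phase(points, centroids):
    """Map step: emit (cluster index, point) pairs."""
    return [(nearest(p, centroids), p) for p in points]

def reduce_phase(pairs, centroids):
    """Reduce step: average the points of each cluster into a new centroid."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    updated = []
    for i, old in enumerate(centroids):
        members = groups.get(i)
        if members:
            updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
        else:
            updated.append(old)  # an empty cluster keeps its previous centroid
    return updated

def sse(points, centroids):
    """Assumed fitness: sum of squared errors, lower is better."""
    return sum(sq_dist(p, centroids[nearest(p, centroids)]) for p in points)

def crossover(a, b, rng):
    """Assumed operator: single-point crossover at a centroid boundary."""
    if len(a) < 2:
        return a, b
    cut = rng.randint(1, len(a) - 1)
    return a[:cut] + b[cut:], b[:cut] + a[cut:]

def akm_sketch(points, k, pop_size=4, generations=5, seed=0):
    rng = random.Random(seed)
    # Chromosome formation: each chromosome is k points sampled as centroids.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        # Optimization: one K-Means pass per chromosome.
        population = [reduce_phase(map_phase(points, c), c) for c in population]
        # Crossover of neighbouring chromosomes; no mating pool, no mutation.
        children = []
        for a, b in zip(population[::2], population[1::2]):
            children.extend(crossover(a, b, rng))
        # Assumption: keep the pop_size fittest of parents and children.
        population = sorted(population + children, key=lambda c: sse(points, c))[:pop_size]
    return population[0]

if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    print(sorted(akm_sketch(data, k=2)))
```

In a real MapReduce deployment, `map_phase` and `reduce_phase` would run as distributed tasks over partitions of the data; here they are ordinary functions so the control flow stays visible.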
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Shrivastava, P., Sahoo, L., Pandey, M., Agrawal, S. (2018). AKM—Augmentation of K-Means Clustering Algorithm for Big Data. In: Bhateja, V., Coello Coello, C., Satapathy, S., Pattnaik, P. (eds) Intelligent Engineering Informatics. Advances in Intelligent Systems and Computing, vol 695. Springer, Singapore. https://doi.org/10.1007/978-981-10-7566-7_11
DOI: https://doi.org/10.1007/978-981-10-7566-7_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-7565-0
Online ISBN: 978-981-10-7566-7
eBook Packages: Engineering, Engineering (R0)