Soft Computing

Volume 23, Issue 21, pp 10717–10737

Dynamic clustering with binary social spider algorithm for streaming dataset

  • Urvashi Prakash Shukla
  • Satyasai Jagannath Nanda
Methodologies and Application


Technical advances in fields such as social networks, health instruments and astronomical devices have created massive capturing and sensing capacity, which in turn generates huge volumes of data. This demands substantial storage space and data processing capacity. Streaming data clustering offers an efficient way to handle such data by extracting the significant information. In this article, dynamic estimation of clusters in an evolving data stream is designed by incorporating a swarm optimization technique. A recently reported algorithm inspired by the social behavior of spiders living in large colonies is reformulated in the binary domain. The main contribution is the use of binary social spider optimization (BSSO) for dynamic clustering of evolving datasets (DSC-BSSO). The proposed work demonstrates efficiency and efficacy compared with other recent algorithms. BSSO is tested on various benchmark unimodal, multimodal and binary optimization functions, and the results are reported in terms of parametric and nonparametric tests. DSC-BSSO is also tested on various streaming datasets in terms of time and memory complexity. The proposed work obtains compact and well-separated clusters in less than a quarter of a minute for about 10,000 samples.


Dynamic clustering, Page-Hinkley statistical test, Social spider optimization, Wilcoxon’s pair test
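
The keywords name the Page-Hinkley statistical test, a standard change-detection test for data streams. This page does not say how DSC-BSSO applies it, so the following is only a minimal, generic sketch of the textbook test for an upward shift in the stream mean. The class name, the helper `first_alarm`, and the `delta` and `threshold` values are illustrative assumptions, not the authors' settings.

```python
class PageHinkley:
    """Textbook Page-Hinkley test for an increase in the mean of a stream."""

    def __init__(self, delta=0.005, threshold=5.0):
        self.delta = delta          # tolerated magnitude of change (assumed value)
        self.threshold = threshold  # alarm threshold (assumed value)
        self.n = 0                  # number of samples seen so far
        self.mean = 0.0             # running mean of the samples
        self.cum = 0.0              # cumulative deviation from the running mean
        self.cum_min = 0.0          # smallest cumulative deviation seen so far

    def update(self, x):
        """Feed one sample; return True when a change is signalled."""
        self.n += 1
        self.mean += (x - self.mean) / self.n
        self.cum += x - self.mean - self.delta
        self.cum_min = min(self.cum_min, self.cum)
        return (self.cum - self.cum_min) > self.threshold


def first_alarm(stream, **kwargs):
    """Return the index of the first sample that triggers an alarm, or None."""
    detector = PageHinkley(**kwargs)
    for i, x in enumerate(stream):
        if detector.update(x):
            return i
    return None


if __name__ == "__main__":
    # Synthetic stream whose mean jumps from 0 to 5 at index 50.
    stream = [0.0] * 50 + [5.0] * 50
    print("change signalled at index:", first_alarm(stream))
```

In a streaming clustering setting, an alarm of this kind could serve as the trigger for re-estimating the clusters. That use is an assumption here rather than a description of the published method.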



The research work is funded by an institute fellowship from the Ministry of HRD, Govt. of India, awarded to Urvashi P. Shukla to pursue her PhD work at MNIT Jaipur.

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with animals or humans performed by any of the authors.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Electronics and Communication Engineering, Malaviya National Institute of Technology Jaipur, Jaipur, India
