A new metaheuristic algorithm based on water wave optimization for data clustering

Abstract

Data clustering is an important activity in the field of data analytics. It can be described as unsupervised learning that groups similar objects into clusters, where the similarity between objects is computed through a distance measure. Clustering has also proven its significance in solving a wide range of real-world optimization problems. This work presents a water wave optimization (WWO) based metaheuristic algorithm for the clustering task. WWO is an effective algorithm for solving constrained and unconstrained optimization problems. However, WWO sometimes fails to obtain a promising solution for complex optimization problems because it lacks a global best information component and converges prematurely. To address both issues, several improvements are incorporated into the WWO algorithm to make it more promising and efficient. These improvements take the form of a modified search mechanism and a decay operator. The lack of a global best information component is handled through the updated search mechanism, while premature convergence is addressed through the decay operator. The performance of the proposed WWO algorithm is evaluated on thirteen benchmark clustering datasets using accuracy and F-score. The simulation results are compared with several state-of-the-art clustering algorithms, and the proposed WWO clustering algorithm achieves higher accuracy and F-score rates on most of the datasets. The proposed WWO algorithm improves the accuracy and F-score rates by an average of 4% and 7%, respectively, compared to the existing clustering algorithms. Further, a statistical test is conducted to validate the proposed WWO algorithm, and the statistical results support its use in the clustering field.
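The abstract does not give the objective function, so the following is only a minimal sketch of how a distance-based clustering objective is commonly set up for a metaheuristic such as WWO. It assumes that a candidate solution encodes K centroids, that similarity is measured with Euclidean distance, and that fitness is the sum of distances from each object to its nearest centroid. The function names `assign_clusters` and `clustering_fitness` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def assign_clusters(data, centroids):
    """Return, for each object, the index of and distance to its nearest centroid.

    Assumption: Euclidean distance is the similarity measure.
    """
    # Pairwise distances between n objects and k centroids, shape (n, k).
    distances = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
    return distances.argmin(axis=1), distances.min(axis=1)


def clustering_fitness(data, centroids):
    """Sum of distances from each object to its nearest centroid (lower is better).

    Assumption: this is the quantity a candidate solution is scored on.
    """
    _, nearest = assign_clusters(data, centroids)
    return float(nearest.sum())


# Toy data with two well-separated groups (illustrative only).
data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])

# Two hypothetical candidate solutions, each encoding K = 2 centroids.
good = np.array([[0.0, 0.5], [10.0, 10.5]])
poor = np.array([[5.0, 5.0], [6.0, 6.0]])

labels, _ = assign_clusters(data, good)
print(labels.tolist())
print(clustering_fitness(data, good), clustering_fitness(data, poor))
```

Under these assumptions, an optimizer would search over centroid positions to minimize `clustering_fitness`; the resulting `labels` could then be compared with known class labels to compute accuracy and F-score.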

Author information

Corresponding author

Correspondence to Yugal Kumar.

About this article

Cite this article

Kaur, A., Kumar, Y. A new metaheuristic algorithm based on water wave optimization for data clustering. Evol. Intel. (2021). https://doi.org/10.1007/s12065-020-00562-x

Keywords

  • Clustering
  • Data analysis
  • Meta-heuristic algorithms
  • Water wave optimization
  • Unsupervised learning