Binary Whale Optimization Algorithm and Binary Moth Flame Optimization with Clustering Algorithms for Clinical Breast Cancer Diagnoses

Abstract

Models based on machine learning algorithms have been developed to detect breast cancer early. Feature selection is commonly applied to improve the performance of these models by selecting only relevant features. However, selecting relevant features in unsupervised learning is much more difficult, because there are no class labels to guide the search for relevant information. This problem has rarely been studied in the literature. This paper presents a hybrid intelligence model that combines cluster analysis algorithms with bio-inspired algorithms, used as feature selectors, for analyzing clinical breast cancer data. Binary versions of both the moth flame optimization algorithm and the whale optimization algorithm are proposed. Two kinds of evaluation criteria are adopted to evaluate the proposed algorithms: clustering-based measurements and statistics-based measurements. The experimental results demonstrate that the proposed bio-inspired feature selection algorithms can produce both meaningful data partitions and significant feature subsets.
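To make the idea of a binary search agent concrete, the following is a minimal sketch of how a continuous position vector might be turned into a feature mask and applied to a data set. The abstract does not say which transfer function the authors use; the sigmoid rule here is an assumption borrowed from common practice in binary metaheuristics, and the names `binarize` and `select_features`, the position values, and the toy records are all hypothetical.

```python
import math
import random


def sigmoid(x):
    """Logistic transfer function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Turn a continuous position vector into a 0/1 feature mask.

    Assumed rule (not taken from the paper): feature i is kept with
    probability sigmoid(position[i]).
    """
    return [1 if rng.random() < sigmoid(v) else 0 for v in position]


def select_features(rows, mask):
    """Keep only the columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in rows]


rng = random.Random(0)                      # seeded for repeatability
position = [2.5, -3.0, 0.1, 4.0]            # hypothetical search-agent position
rows = [[5.0, 1.0, 1.0, 2.0],
        [3.0, 1.0, 2.0, 7.0]]               # toy records, not WBCD data

mask = binarize(position, rng)
reduced = select_features(rows, mask)
```

In a full pipeline of the kind the abstract describes, the reduced data would then be clustered and the resulting partition scored, so that the score can serve as the fitness of the mask in the absence of class labels.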

Acknowledgements

We would like to thank the Editor for suggesting that we implement different initialization strategies. We found that these strategies can achieve better results for the proposed clinical decision support system model.

Author information

Corresponding author

Correspondence to Gehad Ismail Sayed.

Cite this article

Sayed, G.I., Darwish, A. & Hassanien, A.E. Binary Whale Optimization Algorithm and Binary Moth Flame Optimization with Clustering Algorithms for Clinical Breast Cancer Diagnoses. J Classif 37, 66–96 (2020). https://doi.org/10.1007/s00357-018-9297-3

Keywords

  • Intelligent systems
  • Breast cancer
  • Feature selection
  • Whale optimization algorithm
  • Moth flame optimization
  • WBCD