Prediction of Breast Cancer Recurrence Using Ensemble Machine Learning Classifiers

  • M. S. Dawngliani
  • N. Chandrasekaran
  • Samuel Lalmuanawma
  • H. Thangkhanhau
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1145)


Breast cancer is the most common cancer among women and the second leading cause of cancer death among women. This study proposes new criteria for predicting the survival of breast cancer patients, based on an analysis using four ensemble machine learning techniques: AdaBoost M1, Bagging, Voting, and Stacking. We used a breast cancer dataset of 575 samples with 23 attributes, obtained from the Mizoram State Cancer Institute, Aizawl, Mizoram, India. The ensemble classifiers were used to predict recurrence of breast cancer within three years and were evaluated by comparing their performance. Results were obtained using 10-fold cross-validation and ROC curves. The attributes in the dataset were also ranked according to their contribution to the prediction.
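The evaluation protocol summarised above (four ensemble classifiers compared under 10-fold cross-validation with an ROC-based score) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: it uses scikit-learn, a synthetic dataset that merely matches the reported shape (575 samples, 23 attributes), and base learners chosen for illustration. scikit-learn's `AdaBoostClassifier` stands in for AdaBoost M1, and the function name `evaluate_ensembles` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


def evaluate_ensembles(seed=0):
    """Return the mean 10-fold ROC AUC of four ensemble classifiers."""
    # Assumption: synthetic stand-in data with the same shape as the
    # dataset described in the abstract (575 samples, 23 attributes).
    X, y = make_classification(n_samples=575, n_features=23,
                               n_informative=8, random_state=seed)

    def base_learners():
        # Assumption: illustrative base learners, not taken from the paper.
        return [
            ("tree", DecisionTreeClassifier(max_depth=3, random_state=seed)),
            ("nb", GaussianNB()),
            ("lr", LogisticRegression(max_iter=1000)),
        ]

    models = {
        "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=seed),
        "Bagging": BaggingClassifier(n_estimators=50, random_state=seed),
        "Voting": VotingClassifier(base_learners(), voting="soft"),
        "Stacking": StackingClassifier(
            base_learners(),
            final_estimator=LogisticRegression(max_iter=1000)),
    }

    # Stratified 10-fold cross-validation, scored by area under the ROC curve.
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, auc in evaluate_ensembles().items():
        print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Because the data here are synthetic, the printed scores say nothing about the clinical dataset; the sketch only shows how such a comparison could be set up.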


AdaBoost M1, Bagging, Data mining, Ensemble method, Stacking, Voting



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Martin Luther Christian University, Shillong, India
  2. CDAC, Pune, India
  3. Department of Management, Mizoram University, Aizawl, India
  4. Department of Computer Science, GZRSC, Aizawl, India
