Breast Cancer Recurrence Prediction Using Random Forest Model

  • Tahsien Al-Quraishi
  • Jemal H. Abawajy
  • Morshed U. Chowdhury
  • Sutharshan Rajasegarar
  • Ahmad Shaker Abdalrada
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 700)


Breast cancer is the second most common cause of death among Australian females. Early detection and prevention are crucial to reducing the probability of death, and estimating the probability of recurrence is an important part of breast cancer prognosis. The aim of this paper is to predict the probability of breast cancer recurrence among patients. We applied Random Forest and Deep Neural Network classifiers separately, with the goal of improving prediction accuracy. The Wisconsin Prognostic Breast Cancer (WPBC) dataset was obtained from the UCI Machine Learning Repository. Our experimental results indicate that the Random Forest technique achieved higher accuracy than the existing works.
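The general idea behind a Random Forest classifier can be illustrated with a small sketch. This is not the authors' implementation: it assumes synthetic two-feature data in place of the Wisconsin dataset, uses depth-1 trees (stumps) instead of full decision trees, and all function names (make_synthetic, fit_stump, fit_forest, predict_forest) are hypothetical. It only shows the three ingredients of the technique: bootstrap sampling, random feature selection, and majority voting.

```python
import random
from collections import Counter


def make_synthetic(n, rng):
    """Made-up stand-in data: two numeric features, label 1 = 'recurrence'."""
    rows = []
    for _ in range(n):
        label = 1 if rng.random() < 0.3 else 0
        # Assumption: recurrence cases are shifted upward on both features.
        x = [rng.gauss(2.0 * label, 1.0), rng.gauss(1.5 * label, 1.0)]
        rows.append((x, label))
    return rows


def predict_stump(stump, x):
    """A stump predicts `above` when the feature exceeds the threshold."""
    feature, threshold, above = stump
    return above if x[feature] > threshold else 1 - above


def fit_stump(rows, feature):
    """Choose the threshold and direction with the fewest training errors."""
    best_errors, best_stump = None, None
    for x, _ in rows:
        for above in (0, 1):
            stump = (feature, x[feature], above)
            errors = sum(predict_stump(stump, xi) != yi for xi, yi in rows)
            if best_errors is None or errors < best_errors:
                best_errors, best_stump = errors, stump
    return best_stump


def fit_forest(rows, n_trees, rng):
    """Train each stump on a bootstrap sample and one randomly chosen feature."""
    n_features = len(rows[0][0])
    forest = []
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # sample with replacement
        feature = rng.randrange(n_features)        # random feature choice
        forest.append(fit_stump(sample, feature))
    return forest


def predict_forest(forest, x):
    """Majority vote over all stumps."""
    votes = Counter(predict_stump(stump, x) for stump in forest)
    return votes.most_common(1)[0][0]


rng = random.Random(0)
train = make_synthetic(200, rng)
test = make_synthetic(100, rng)
forest = fit_forest(train, n_trees=25, rng=rng)
accuracy = sum(predict_forest(forest, x) == y for x, y in test) / len(test)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

In practice one would more likely rely on an established library implementation with full-depth trees, such as scikit-learn's RandomForestClassifier, and evaluate on the actual dataset; the accuracy printed here says nothing about the paper's results.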


Keywords: Breast cancer, Random forest, Deep neural network



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Tahsien Al-Quraishi
    • 1
  • Jemal H. Abawajy
    • 1
  • Morshed U. Chowdhury
    • 1
  • Sutharshan Rajasegarar
    • 1
  • Ahmad Shaker Abdalrada
    • 1
  1. Deakin University, Burwood, Australia
