Predicting Cancer Survivability: A Comparative Study

  • Ola Abu Elberak
  • Loai Alnemer
  • Majdi Sawalha
  • Jamal Alsakran
Conference paper
Part of the Lecture Notes on Data Engineering and Communications Technologies book series (LNDECT, volume 29)


Predicting cancer survivability in patients remains a challenging task because of the complexity and heterogeneity of the disease. Nevertheless, the study of cancer survivability has been receiving increasing attention, largely because of its positive impact on patients and physicians: it helps physicians determine suitable treatment options, gives hope to patients, and improves their psychological state. This paper aims to predict the survival period of a patient after a cancer diagnosis by comparing the performance of three regression algorithms: Decision Tree Regression, Multilayer Perceptron Regression, and Support Vector Regression. The algorithms are trained and tested on nine cancer types selected from the SEER dataset. The prediction models for each regression algorithm are built using cross-validation and an ensemble method. Our experimental results show that Decision Tree Regression outperforms the other two algorithms in predicting the survival period for all nine cancer types.
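The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_regression` in place of SEER records, takes bagging as the ensemble method (the abstract does not name one), and picks all hyperparameters arbitrarily. The helper name `compare_regressors` is hypothetical.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor


def compare_regressors(X, y, n_splits=5, seed=0):
    """Return the mean cross-validated RMSE of each bagged regressor."""
    # Hyperparameters below are illustrative assumptions, not values from the paper.
    base_models = {
        "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=seed),
        # MLP and SVR are sensitive to feature scale, so standardize first.
        "mlp": make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=seed),
        ),
        "svr": make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0)),
    }
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    results = {}
    for name, model in base_models.items():
        # Bagging is assumed here as the ensemble method.
        ensemble = BaggingRegressor(model, n_estimators=5, random_state=seed)
        scores = cross_val_score(
            ensemble, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        # scikit-learn reports negated RMSE; flip the sign back.
        results[name] = float(-scores.mean())
    return results


# Synthetic stand-in for (patient features, survival period in months).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)
results = compare_regressors(X, y)
for name, rmse in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE = {rmse:.2f}")
```

The ranking printed on synthetic data says nothing about the ranking on SEER data; the sketch only shows how the three regressors can be evaluated under the same folds and the same error metric.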



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Ola Abu Elberak (1)
  • Loai Alnemer (1)
  • Majdi Sawalha (1)
  • Jamal Alsakran (2, corresponding author)
  1. The University of Jordan, Amman, Jordan
  2. Higher Colleges of Technology, Fujairah, UAE
