Predicting Postoperative Complications for Gastric Cancer Patients Using Data Mining

  • Hugo Peixoto
  • Alexandra Francisco
  • Ana Duarte
  • Márcia Esteves
  • Sara Oliveira
  • Vítor Lopes
  • António Abelha
  • José Machado
Conference paper
Part of the Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering book series (LNICST, volume 273)


Gastric cancer refers to the development of malignant cells that can grow in any part of the stomach. With the vast amount of data collected daily in healthcare environments, it is possible to develop new algorithms that support decision-making in the treatment of gastric cancer patients. Using the CRISP-DM methodology, this paper aims to predict the outcome of hospitalization for gastric cancer patients who have undergone surgery, as well as the occurrence of postoperative complications. The study showed that, on the one hand, the Random Forest (RF) and Naive Bayes (NB) algorithms performed best at predicting the outcome of hospitalization from patients' clinical data. On the other hand, the J48, RF, and NB algorithms offered better results in predicting postoperative complications.


Keywords: Data Mining, Clinical Decision Support Systems, CRISP-DM, Gastric cancer, WEKA



This work has been supported by Compete: POCI-01-0145-FEDER-007043 and FCT within the Project Scope UID/CEC/00319/2013.



Copyright information

© ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2019

Authors and Affiliations

  • Hugo Peixoto (1, corresponding author)
  • Alexandra Francisco (2)
  • Ana Duarte (2)
  • Márcia Esteves (2)
  • Sara Oliveira (2)
  • Vítor Lopes (3)
  • António Abelha (1)
  • José Machado (1)

  1. Algoritmi Research Center, University of Minho, Campus Gualtar, Braga, Portugal
  2. University of Minho, Campus Gualtar, Braga, Portugal
  3. Tâmega e Sousa Hospital Center, Penafiel, Portugal
