Breast Cancer Diagnosis and Prognosis Using Machine Learning Techniques

  • Conference paper
Intelligent Systems Technologies and Applications (ISTA 2017)

Abstract

Breast cancer is one of the major types of cancer and the leading cause of death in women. This research is carried out on real patient records obtained from HealthCare Global Enterprises Ltd (HCG) hospitals. The work analyzes the four major class variables in the dataset, namely death, progression, recurrence and metastasis. The influence of the same 11 predictor variables is explored for each class. Various machine learning algorithms, namely Support Vector Machine, Decision Tree, Multi-layer Perceptron and Naive Bayes, are explored for classifying the patient data into the various classes. The imbalance in the data is handled using an over-sampling technique. The contribution of the various attributes to classifying the instances into the different classes is also explored. The model helps in predicting these factors and thus supports early diagnosis of breast cancer.
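
The abstract does not say which over-sampling technique was used, so the following is only a minimal sketch of the general idea, assuming the simplest variant: random over-sampling with replacement. The function name `random_oversample`, the toy records and the 8-to-2 class split are all hypothetical and are not taken from the paper or the HCG dataset.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement to fill the gap to the majority count.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Hypothetical toy data: 8 records of class 0 (e.g. no recurrence)
# and 2 records of class 1 (e.g. recurrence).
rows = [[i, i % 3] for i in range(10)]
labels = [0] * 8 + [1] * 2

bal_rows, bal_labels = random_oversample(rows, labels)
balanced_counts = Counter(bal_labels)
```

In a setup like this, over-sampling would normally be applied to the training split only, so that duplicated minority records do not leak into the data used to evaluate the classifiers.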

Author information

Corresponding author

Correspondence to Sunil Suresh Shastri.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Shastri, S.S., Nair, P.C., Gupta, D., Nayar, R.C., Rao, R., Ram, A. (2018). Breast Cancer Diagnosis and Prognosis Using Machine Learning Techniques. In: Thampi, S., Mitra, S., Mukhopadhyay, J., Li, KC., James, A., Berretti, S. (eds) Intelligent Systems Technologies and Applications. ISTA 2017. Advances in Intelligent Systems and Computing, vol 683. Springer, Cham. https://doi.org/10.1007/978-3-319-68385-0_28

  • DOI: https://doi.org/10.1007/978-3-319-68385-0_28

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68384-3

  • Online ISBN: 978-3-319-68385-0

  • eBook Packages: Engineering, Engineering (R0)
