On the Performance of Ensemble Learning for Automated Diagnosis of Breast Cancer

  • Conference paper
Artificial Intelligence Perspectives and Applications

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 347)

Abstract

The automated diagnosis of diseases with a high accuracy rate is one of the most crucial problems in medical informatics, and machine learning algorithms are widely used for the automatic detection of illnesses. Breast cancer is one of the most common cancer types in females and the second most common cause of cancer death among them. Hence, an efficient classifier for the automated diagnosis of breast cancer is essential to improve the chances of detecting the disease at an earlier stage and treating it properly. Ensemble learning is a branch of machine learning that combines multiple learning algorithms to obtain better predictive performance than any single constituent, and it is a promising approach for improving the performance of base classifiers. This paper presents a comparative assessment of six popular ensemble methods (Bagging, Dagging, AdaBoost, MultiBoost, Decorate, and Random Subspace) applied to fourteen base learners (Bayes Net, FURIA, K-nearest Neighbors, C4.5, RIPPER, Kernel Logistic Regression, K-star, Logistic Regression, Multilayer Perceptron, Naïve Bayes, Random Forest, Simple CART, Support Vector Machine, and LMT) for the automatic detection of breast cancer. The empirical results indicate that ensemble learning can improve the predictive performance of base learners in the medical domain; the best results in the comparative experiments were obtained with the Random Subspace ensemble method. The experiments show that ensemble learning methods are appropriate for improving the performance of classifiers for medical diagnosis.
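The Random Subspace method highlighted in the abstract trains each ensemble member on all training samples but only a random subset of the features, and aggregates their votes. The sketch below is an illustration only, not the paper's actual experimental setup: it assumes scikit-learn (rather than the WEKA implementations such comparisons typically use) and the UCI Wisconsin breast cancer dataset bundled with scikit-learn, emulating Random Subspace via `BaggingClassifier` with `bootstrap=False` and `max_features=0.5`.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# UCI Wisconsin breast cancer data: 569 samples, 30 features, 2 classes.
X, y = load_breast_cancer(return_X_y=True)

base = DecisionTreeClassifier(random_state=0)

# Random Subspace: every tree sees all training samples (bootstrap=False)
# but only a random half of the features (max_features=0.5).
subspace = BaggingClassifier(
    DecisionTreeClassifier(random_state=0),
    n_estimators=50,
    bootstrap=False,
    max_features=0.5,
    random_state=0,
)

# 10-fold cross-validated accuracy, as is common in such comparisons.
base_acc = cross_val_score(base, X, y, cv=10).mean()
ens_acc = cross_val_score(subspace, X, y, cv=10).mean()
print(f"single tree: {base_acc:.3f}  random subspace: {ens_acc:.3f}")
```

Feature subsampling decorrelates the individual trees, so their majority vote typically outperforms a single tree trained on the full feature set, consistent with the paper's finding that Random Subspace improved its base learners.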



Author information

Correspondence to Aytuğ Onan.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Onan, A. (2015). On the Performance of Ensemble Learning for Automated Diagnosis of Breast Cancer. In: Silhavy, R., Senkerik, R., Oplatkova, Z., Prokopova, Z., Silhavy, P. (eds) Artificial Intelligence Perspectives and Applications. Advances in Intelligent Systems and Computing, vol 347. Springer, Cham. https://doi.org/10.1007/978-3-319-18476-0_13

  • DOI: https://doi.org/10.1007/978-3-319-18476-0_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18475-3

  • Online ISBN: 978-3-319-18476-0

  • eBook Packages: Engineering (R0)
