On the Performance of Ensemble Learning for Automated Diagnosis of Breast Cancer

  • Conference paper
Artificial Intelligence Perspectives and Applications

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 347)

Abstract

The automated diagnosis of diseases with a high accuracy rate is one of the most crucial problems in medical informatics, and machine learning algorithms are widely used for the automatic detection of illnesses. Breast cancer is one of the most common cancer types in females and the second most common cause of cancer death among them. Hence, an efficient classifier for the automated diagnosis of breast cancer is essential to improve the chances of detecting the disease at an earlier stage and treating it properly. Ensemble learning is a branch of machine learning that combines multiple learning algorithms to obtain better predictive performance than any single constituent, and it is a promising approach for improving the performance of base classifiers. This paper presents a comparative assessment of six popular ensemble methods (Bagging, Dagging, AdaBoost, MultiBoost, Decorate, and Random Subspace) applied to fourteen base learners (Bayes Net, FURIA, K-nearest Neighbors, C4.5, RIPPER, Kernel Logistic Regression, K-star, Logistic Regression, Multilayer Perceptron, Naïve Bayes, Random Forest, Simple CART, Support Vector Machine, and LMT) for the automatic detection of breast cancer. The empirical results indicate that ensemble learning can improve the predictive performance of base learners in the medical domain; the best results in the comparative experiments were obtained with the Random Subspace ensemble method. The experiments show that ensemble learning methods are appropriate for improving the performance of classifiers for medical diagnosis.
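The Random Subspace method highlighted in the abstract trains each ensemble member on all training samples but only a random subset of the features, and aggregates their votes. The sketch below is an illustration only, not the paper's actual experimental setup: it assumes scikit-learn (rather than the WEKA implementations such comparisons typically use) and the UCI Wisconsin breast cancer dataset bundled with scikit-learn, emulating Random Subspace via `BaggingClassifier` with `bootstrap=False` and `max_features=0.5`.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# UCI Wisconsin breast cancer data: 569 samples, 30 features, 2 classes.
X, y = load_breast_cancer(return_X_y=True)

base = DecisionTreeClassifier(random_state=0)

# Random Subspace: every tree sees all training samples (bootstrap=False)
# but only a random half of the features (max_features=0.5).
subspace = BaggingClassifier(
    DecisionTreeClassifier(random_state=0),
    n_estimators=50,
    bootstrap=False,
    max_features=0.5,
    random_state=0,
)

# 10-fold cross-validated accuracy, as is common in such comparisons.
base_acc = cross_val_score(base, X, y, cv=10).mean()
ens_acc = cross_val_score(subspace, X, y, cv=10).mean()
print(f"single tree: {base_acc:.3f}  random subspace: {ens_acc:.3f}")
```

Feature subsampling decorrelates the individual trees, so their majority vote typically outperforms a single tree trained on the full feature set, consistent with the paper's finding that Random Subspace improved its base learners.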



Author information

Correspondence to Aytuğ Onan.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Onan, A. (2015). On the Performance of Ensemble Learning for Automated Diagnosis of Breast Cancer. In: Silhavy, R., Senkerik, R., Oplatkova, Z., Prokopova, Z., Silhavy, P. (eds) Artificial Intelligence Perspectives and Applications. Advances in Intelligent Systems and Computing, vol 347. Springer, Cham. https://doi.org/10.1007/978-3-319-18476-0_13

  • DOI: https://doi.org/10.1007/978-3-319-18476-0_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18475-3

  • Online ISBN: 978-3-319-18476-0

  • eBook Packages: Engineering (R0)
