Using Feature Selection Techniques to Improve the Accuracy of Breast Cancer Classification

  • Hajar Saoud
  • Abderrahim Ghadi
  • Mohamed Ghailani
  • Boudhir Anouar Abdelhakim
Conference paper
Part of the Lecture Notes in Intelligent Transportation and Infrastructure book series (LNITI)


Classification is a data mining process that divides data into classes to facilitate decision-making; it is therefore an important task in the medical field. In this paper we try to improve the classification accuracy of six machine learning algorithms, Bayes Network (BN), Support Vector Machine (SVM), k-nearest neighbors (k-NN), Artificial Neural Network (ANN), Decision Tree (C4.5) and Logistic Regression, by applying feature selection techniques to breast cancer classification and diagnosis. We evaluated these classification methods and feature selection techniques in the WEKA tool (Waikato Environment for Knowledge Analysis) on two databases, the Wisconsin Breast Cancer Original (WBC) and Diagnostic (WBCD) datasets, available in the UCI Machine Learning Repository.
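To illustrate the general idea of pairing feature selection with a classifier, here is a minimal Python sketch. It is not the authors' WEKA workflow: the correlation-based filter, the 1-nearest-neighbour classifier, the leave-one-out evaluation, the function names and the tiny hand-made dataset are all assumptions chosen for illustration, and the data is contrived, not taken from WBC or WBCD.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(vx * vy)

def select_top_k(X, y, k):
    """Filter-style selection: keep the k features most correlated with the label."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])

def loo_accuracy_1nn(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the given features."""
    correct = 0
    for i, xi in enumerate(X):
        best_label, best_dist = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue  # the held-out sample may not vote for itself
            d = sum((xi[f] - xj[f]) ** 2 for f in features)
            if d < best_dist:
                best_dist, best_label = d, y[j]
        correct += best_label == y[i]
    return correct / len(X)

# Contrived toy data (hypothetical): feature 0 separates the classes,
# features 1 and 2 are large-scale noise unrelated to the label.
X = [
    [1.0,  5.0, 0.0], [1.2, -5.0, 0.0], [0.9,  5.0, 9.0], [1.1, -5.0, 9.0],
    [3.0,  5.0, 0.0], [3.2, -5.0, 0.0], [2.9,  5.0, 9.0], [3.1, -5.0, 9.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected = select_top_k(X, y, k=1)
print("selected features:", selected)
print("accuracy, all features:", loo_accuracy_1nn(X, y, [0, 1, 2]))
print("accuracy, selected features:", loo_accuracy_1nn(X, y, selected))
```

On this deliberately extreme data the noise features dominate the distance computation, so removing them changes which neighbour is nearest. Real datasets such as WBC and WBCD should not be expected to show so sharp a contrast.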


Keywords: Breast cancer, Diagnostic, Machine learning algorithms, Feature selection, Classification, WEKA



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Hajar Saoud (1), email author
  • Abderrahim Ghadi (1)
  • Mohamed Ghailani (2)
  • Boudhir Anouar Abdelhakim (1)

  1. LIST Laboratory, University of Abdelmalek Essaadi (UAE), Tangier, Morocco
  2. LabTIC Laboratory, University of Abdelmalek Essaadi (UAE), Tangier, Morocco
