Comparison of Different Classification Techniques Using Different Datasets

  • Nitesh Kumar
  • Souvik Mitra
  • Madhurima Bhattacharjee
  • Lopa Mandal
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 811)


Big data analytics is widely regarded as the future of information technology, and data mining is one of its most promising tools. The present work is a comparative study of which kinds of classifiers work better with which kinds of datasets. It compares the efficiency of different classifiers on numeric and text data, using datasets from IMDb and 20newsgroups. For numeric classification, the work compares Decision Stump, Decision Table, K-Star, REPTree and ZeroR; for text classification, it evaluates the efficiency of the Naive Bayes classifier. The results suggest the best and worst performers across the test parameters, and they widen the scope of the classifiers' usage on the basis of dataset type and size.
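
As a rough illustration of the kind of comparison described above, the following minimal Python sketch pits a ZeroR baseline (always predict the majority class) against a single-split Decision Stump on a tiny numeric dataset. The toy data, the function names, and the use of plain test-set accuracy as the metric are all assumptions made for illustration; none of them is taken from the paper.

```python
from collections import Counter

def zero_r_fit(labels):
    """ZeroR: ignore all features and remember the most common label."""
    return Counter(labels).most_common(1)[0][0]

def stump_fit(rows, labels):
    """Decision stump: pick the single (feature, threshold) split with the
    fewest training errors, predicting the majority label on each side."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put data on both sides
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    return best[1:]  # (feature index, threshold, left label, right label)

def stump_predict(model, row):
    f, t, l_lab, r_lab = model
    return l_lab if row[f] <= t else r_lab

def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Hypothetical toy data (not from the paper): (rating, votes in thousands)
train_X = [(8.1, 120), (7.9, 95), (6.9, 300), (4.2, 10), (5.0, 40), (3.8, 5), (8.5, 200)]
train_y = ["hit", "hit", "hit", "flop", "flop", "flop", "hit"]
test_X = [(7.5, 80), (4.5, 20), (9.0, 150), (5.5, 60)]
test_y = ["hit", "flop", "hit", "flop"]

majority = zero_r_fit(train_y)
stump = stump_fit(train_X, train_y)

zero_r_acc = accuracy([majority] * len(test_y), test_y)
stump_acc = accuracy([stump_predict(stump, r) for r in test_X], test_y)
print(f"ZeroR accuracy: {zero_r_acc:.2f}")
print(f"Decision stump accuracy: {stump_acc:.2f}")
```

On this toy data the stump beats the baseline, which is the usual role of ZeroR in such studies: it sets the floor that any feature-using classifier should exceed.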


Data mining, Text classification, Numeric data classification, Classifier algorithms



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Tata Consultancy Services Limited, Bengaluru, India
  2. Institute of Engineering & Management, Kolkata, India
