Improving Accuracy of Classification Based on C4.5 Decision Tree Algorithm Using Big Data Analytics

  • Bhavna Rawal
  • Ruchi Agarwal
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 711)

Abstract

C4.5 is a decision tree algorithm that is widely used for classification. In the era of big data, building a decision tree faces many challenges, such as data size, time, and cost. The aim of decision tree construction is to maximize accuracy on the training data. Predictive modeling requires splitting the training datasets, and MATLAB is a good choice for this. Decision trees also make data analysis straightforward, even for heterogeneous data. In this paper, C4.5 is implemented in MATLAB on four different datasets, producing a confusion matrix in terms of target and output classes. Finally, the features of the datasets are compared. The main objective of this research is to increase classification accuracy and reduce the time needed to build a classification model. We reduce the input space using the Bhattacharya distance (BD). The proposed method shows better performance on the data file. With the help of BD, the improved C4.5 performs better than the original C4.5 in every test case.
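
The abstract does not say how the Bhattacharya distance is applied to reduce the input space. One plausible reading is a per-feature filter: score each feature by how well it separates two classes, then keep the highest-scoring features before building the tree. The following is a minimal sketch of that idea in Python (not the authors' MATLAB code). It assumes two classes, numeric features, and that each feature is modelled as a univariate Gaussian within each class. The function names `bhattacharyya_gaussian`, `rank_features`, and `select_top_k` are hypothetical.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_gaussian(xs, ys, eps=1e-12):
    """Bhattacharyya distance between two samples, each modelled as a
    univariate Gaussian (an assumption of this sketch, not of the paper)."""
    m1, m2 = mean(xs), mean(ys)
    # eps keeps the variances strictly positive for constant features
    v1 = pvariance(xs) + eps
    v2 = pvariance(ys) + eps
    # Term driven by the difference of the class means
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    # Term driven by the difference of the class variances
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def rank_features(rows, labels, class_a, class_b):
    """Return feature indices sorted from most to least class-separating."""
    scores = []
    for j in range(len(rows[0])):
        a = [r[j] for r, y in zip(rows, labels) if y == class_a]
        b = [r[j] for r, y in zip(rows, labels) if y == class_b]
        scores.append((bhattacharyya_gaussian(a, b), j))
    return [j for _, j in sorted(scores, reverse=True)]


def select_top_k(rows, ranked, k):
    """Project every row onto the k best-ranked features (original order kept)."""
    keep = sorted(ranked[:k])
    return [[r[j] for j in keep] for r in rows]
```

The reduced rows returned by `select_top_k` could then be passed to any decision tree learner. How many features to keep (`k`) is a tuning choice that the abstract does not specify.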

Keywords

Bhattacharya distance, Big data analytics, C4.5, Decision tree

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Computer Science & Engineering, School of Engineering & Technology, Sharda University, Greater Noida, India
