Abstract
C4.5 is a decision tree algorithm that is widely used for classification. In the era of big data, building a decision tree poses challenges of data size, time, and cost. The aim of decision tree construction is to maximize accuracy on the training data. Predictive modeling requires splitting the training datasets, and MATLAB is a good choice for this. Decision trees also make data analysis easier, even for heterogeneous data. In this paper, C4.5 is implemented in MATLAB and evaluated on four different datasets, with results reported as a confusion matrix of target versus output classes. Finally, the features of the datasets are compared. The main objective of this research is to improve classification accuracy and reduce the time needed to build a classification model. We reduce the input space using the Bhattacharyya distance (BD). The proposed method shows better performance on the data tested. With BD, the improved C4.5 performs better than the original C4.5 in every test case.
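The abstract does not say how the Bhattacharyya distance is used to reduce the input space. One common approach, sketched below as an assumption rather than as the authors' procedure, is to score each feature by the distance between its two class-conditional distributions and keep the highest-scoring features. The sketch is in Python rather than MATLAB, models each feature as a univariate Gaussian per class, and handles only two classes. The names `bhattacharyya_gaussian` and `rank_features` are hypothetical.

```python
import math
from statistics import fmean, pvariance


def bhattacharyya_gaussian(a, b, eps=1e-12):
    """Bhattacharyya distance between two 1-D samples.

    Assumption: each sample is modeled as a univariate Gaussian.
    eps guards against zero variance in constant features.
    """
    mu_a, mu_b = fmean(a), fmean(b)
    var_a = pvariance(a) + eps
    var_b = pvariance(b) + eps
    # Term driven by the separation of the class means.
    mean_term = 0.25 * (mu_a - mu_b) ** 2 / (var_a + var_b)
    # Term driven by the mismatch of the class variances.
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term


def rank_features(rows, labels):
    """Return (order, scores): feature indices sorted from most to least
    class-separating, plus the per-feature distances (two classes only)."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(rows[0])):
        a = [r[j] for r, y in zip(rows, labels) if y == classes[0]]
        b = [r[j] for r, y in zip(rows, labels) if y == classes[1]]
        scores.append(bhattacharyya_gaussian(a, b))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[0.0, 5.0], [0.2, 5.1], [0.1, 4.9],
        [3.0, 5.0], [3.2, 5.1], [3.1, 4.9]]
labels = [0, 0, 0, 1, 1, 1]
order, scores = rank_features(rows, labels)
```

Keeping only the first few indices in `order` before training the tree would be one way to shrink the input space. How many features to keep, and how to extend the score to more than two classes, are choices the abstract leaves open.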
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Rawal, B., Agarwal, R. (2019). Improving Accuracy of Classification Based on C4.5 Decision Tree Algorithm Using Big Data Analytics. In: Behera, H., Nayak, J., Naik, B., Abraham, A. (eds) Computational Intelligence in Data Mining. Advances in Intelligent Systems and Computing, vol 711. Springer, Singapore. https://doi.org/10.1007/978-981-10-8055-5_19
DOI: https://doi.org/10.1007/978-981-10-8055-5_19
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-8054-8
Online ISBN: 978-981-10-8055-5
eBook Packages: Intelligent Technologies and Robotics (R0)