
Enhancing the Efficiency of Decision Tree C4.5 Using Average Hybrid Entropy

  • Poonam Rani
  • Kamaldeep Kaur
  • Ranjit Kaur
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 955)

Abstract

Building an efficient and effective decision tree is important because of its numerous applications in data mining and machine learning, and various modifications to the splitting criterion of the decision tree have been proposed. Different entropy measures have been introduced by different scholars: Shannon's entropy, Rényi's entropy, and Tsallis entropy each affect the overall efficiency of the C4.5 decision tree. This research implements a new average hybrid entropy that combines statistical properties of Rényi's and Tsallis entropy: the average hybrid entropy is the average of the maxima of Rényi's and Tsallis entropy. The overall idea is to apply the average hybrid entropy on subsets of instances and to integrate those subsets after pruning, which simplifies the pruning process and gives better results. Experiments on three standard datasets taken from the UCI repository (Credit-g, Diabetes, and Glass) show that the average hybrid entropy yields more efficient results.
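The abstract does not spell out the entropy formulas or the order parameters used, so the following is a minimal sketch built on the standard Rényi and Tsallis definitions. The function names, the choice of α = q = 2, and the plain averaging of the two values (the paper averages their maxima, a construction not detailed in the abstract) are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def renyi_entropy(p, alpha=2.0):
    """Renyi entropy H_a(p) = log2(sum(p_i**a)) / (1 - a), for a != 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # ignore empty classes
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

def tsallis_entropy(p, q=2.0):
    """Tsallis entropy S_q(p) = (1 - sum(p_i**q)) / (q - 1), for q != 1."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def average_hybrid_entropy(p, alpha=2.0, q=2.0):
    """Sketch of a hybrid criterion: the mean of the Renyi and Tsallis
    entropies of the same class distribution at a candidate split."""
    return 0.5 * (renyi_entropy(p, alpha) + tsallis_entropy(p, q))

# Example: class proportions at a node of a C4.5-style tree.
print(average_hybrid_entropy([0.5, 0.3, 0.2]))
```

In a C4.5-style learner, a criterion like this would stand in for Shannon entropy inside the information-gain and gain-ratio computations evaluated at each candidate split.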

Keywords

Shannon entropy · Rényi entropy · Tsallis entropy · Data mining · Machine learning · C4.5 · Decision tree · J48 classifier

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Lovely Professional University, Phagwara, India
