New Results on Minimum Error Entropy Decision Trees

  • J. P. Marques de Sá
  • Raquel Sebastião
  • João Gama
  • Tânia Fontes
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7042)

Abstract

We present new results on the performance of Minimum Error Entropy (MEE) decision trees, which use a novel node split criterion. The results were obtained in a comparative study with popular alternative algorithms on 42 real-world datasets, using careful validation and statistical methods. The evidence gathered from this body of results shows that the error performance of MEE trees compares well with that of the alternative algorithms. An important aspect to emphasize is that MEE trees generalize better on average without sacrificing error performance.
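Loosely, an MEE split chooses the partition whose resulting classification errors have minimum Shannon entropy, rather than minimizing a classical impurity measure of the class labels themselves. The sketch below illustrates that idea for a single numeric feature on a two-class problem; the threshold scan and the majority-vote labelling of each branch are assumptions of this illustration, not necessarily the exact procedure used in the paper.

```python
import numpy as np

def error_entropy(t, y):
    """Shannon entropy of the error variable e = t - y.

    With {0, 1} class coding, e takes values in {-1, 0, +1};
    only the empirical distribution of e matters here.
    """
    e = t - y
    _, counts = np.unique(e, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def best_mee_split(x, t):
    """Scan candidate thresholds on one feature and return the threshold
    that minimises the error entropy of the induced two-branch classifier.
    Each branch predicts its majority class (an assumption of this sketch).
    """
    best_thr, best_h = None, np.inf
    for thr in np.unique(x)[:-1]:
        left = x <= thr
        y = np.empty_like(t)
        for mask in (left, ~left):
            if mask.any():
                vals, counts = np.unique(t[mask], return_counts=True)
                y[mask] = vals[np.argmax(counts)]   # majority vote in branch
        h = error_entropy(t, y)
        if h < best_h:
            best_thr, best_h = thr, h
    return best_thr, best_h
```

For example, with x = np.array([1.0, 2.0, 3.0, 4.0]) and t = np.array([0, 0, 1, 1]), best_mee_split(x, t) selects the threshold 2.0, where all errors are zero and the error entropy vanishes.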

Keywords

decision trees · entropy-of-error · node split criteria


Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • J. P. Marques de Sá (1)
  • Raquel Sebastião (2)
  • João Gama (2)
  • Tânia Fontes (1)

  1. INEB - Instituto de Engenharia Biomédica, FEUP, Universidade do Porto, Porto, Portugal
  2. LIAAD - INESC Porto, L.A., Porto, Portugal