Tree-based Models in Statistics: Three Decades of Research

  • Eugeniusz Gatnar
Conference paper
Part of the Studies in Classification, Data Analysis, and Knowledge Organization book series (STUDIES CLASS)

Abstract

Interest in tree-structured methods has been growing rapidly in statistics; indeed, all commercial statistical packages and Data Mining tools are now equipped with tree-building modules. Research in this field has its roots in the early 1970s, when the first papers on recursive partitioning of the feature space (whose result takes the form of a tree) were published in statistical journals. These papers initiated intensive research into nonparametric statistical methods for classification, regression, survival analysis, etc. The aim of this paper is to summarize the achievements of this research and to point out some problems that remain open.
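The recursive partitioning the abstract refers to can be illustrated with a minimal sketch in the CART style of Breiman et al. (1984): greedily choose the axis-parallel split that most reduces Gini impurity, then recurse on each half until nodes are pure or too small. All function and parameter names below are illustrative, not taken from the paper.

```python
def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(X, y):
    """Return the (feature, threshold) pair minimizing the
    size-weighted impurity of the two child nodes, or None."""
    best, best_score = None, float("inf")
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[j] <= t]
            right = [y[i] for i, row in enumerate(X) if row[j] > t]
            if not left or not right:
                continue  # split must separate the data
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best_score, best = score, (j, t)
    return best

def grow(X, y, min_size=2):
    """Recursively partition the feature space; leaves predict
    the majority class of the observations they contain."""
    split = None if len(set(y)) == 1 or len(y) < min_size else best_split(X, y)
    if split is None:
        return {"leaf": max(set(y), key=y.count)}
    j, t = split
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return {"feature": j, "threshold": t,
            "left": grow([X[i] for i in li], [y[i] for i in li], min_size),
            "right": grow([X[i] for i in ri], [y[i] for i in ri], min_size)}

def predict(tree, x):
    """Route an observation down the tree to a leaf."""
    while "leaf" not in tree:
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree["leaf"]
```

For example, `grow([[1.0], [2.0], [8.0], [9.0]], ["a", "a", "b", "b"])` splits once at the threshold 2.0 and yields two pure leaves. Production implementations differ mainly in the splitting criterion, the stopping rule, and the pruning strategy applied afterward.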

Keywords

Recursive Partitioning, Multivariate Adaptive Regression Splines, Misclassification Error, Data Mining



Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Eugeniusz Gatnar
    Department of Statistics, Katowice University of Economics, Katowice, Poland
