Abstract
Interest in tree-structured methods has been growing rapidly in statistics. In fact, all commercial statistical packages and data mining tools have been equipped with tree-building modules. Research in this field has its roots in the early 1970s, when the first papers on recursive partitioning of the feature space (whose result takes the form of a tree) were published in statistical journals. These papers initiated intensive research on nonparametric statistical methods for classification, regression, survival analysis, etc. The aim of this paper is to summarize the achievements of this research and to point out some problems that remain open.
Keywords
- Frontal Lobe
- Multivariate Adaptive Regression Spline
- Recursive Partitioning
- Misclassification Error
- Data Mining Tool
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gatnar, E. (2002). Tree-based Models in Statistics: Three Decades of Research. In: Jajuga, K., Sokołowski, A., Bock, HH. (eds) Classification, Clustering, and Data Analysis. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-56181-8_44
DOI: https://doi.org/10.1007/978-3-642-56181-8_44
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43691-1
Online ISBN: 978-3-642-56181-8
eBook Packages: Springer Book Archive