Abstract
In this chapter, we describe tree-based methods for regression and classification. These involve stratifying or segmenting the predictor space into a number of simple regions. In order to make a prediction for a given observation, we typically use the mean or the mode of the training observations in the region to which it belongs. Since the set of splitting rules used to segment the predictor space can be summarized in a tree, these types of approaches are known as decision tree methods.
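As a concrete illustration of the idea in the abstract, the sketch below fits a small regression tree and checks that the prediction for a new observation is simply the mean response of the training observations in the region (terminal node) it falls into. This is a minimal, non-authoritative sketch: it assumes the tree and ISLR packages are installed, and the names hitters, logSalary, and new_player are illustrative choices, not anything defined in the chapter.

```r
# Minimal sketch, assuming the ISLR and tree packages are installed.
library(ISLR)   # provides the Hitters data
library(tree)   # provides tree(), cv.tree(), prune.tree()

# Drop players with a missing response and log-transform Salary.
hitters <- na.omit(Hitters)
hitters$logSalary <- log(hitters$Salary)

# Fit a regression tree on two predictors.
fit <- tree(logSalary ~ Years + Hits, data = hitters)

# The prediction for a new observation is read off the terminal node
# (region) into which it falls.
new_player <- data.frame(Years = 5, Hits = 100)  # hypothetical observation
predict(fit, newdata = new_player)

# The same number is recovered by averaging the training responses in
# that region: fit$where records each training observation's leaf, and
# type = "where" returns the leaf of the new observation.
leaf_of_new <- predict(fit, newdata = new_player, type = "where")
mean(hitters$logSalary[fit$where == leaf_of_new])
```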
Notes
1. Both Years and Hits are integers in these data; the tree() function in R labels the splits at the midpoint between two adjacent values.
2. Although the CV error is computed as a function of α, it is convenient to display the result as a function of |T|, the number of leaves; this is based on the relationship between α and |T| in the original tree grown to all the training data (see the sketch after these notes).
3. This relates to Exercise 2 of Chapter 5.
4. The null rate results from simply classifying each observation to the dominant class overall, which in this case is the normal class.
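The sketch below illustrates Notes 2 and 4. It is a non-authoritative example, assuming the tree and ISLR packages are available; the Carseats data and the derived High variable are used purely for illustration and are not the data discussed in the notes. It shows that cv.tree() reports the cross-validation error alongside both α (its $k component) and the number of leaves |T| (its $size component), so the error can be plotted against either, and that the null rate is obtained by assigning every observation to the dominant class.

```r
# Minimal sketch, assuming the tree and ISLR packages are installed.
library(tree)
library(ISLR)

# Illustrative classification problem: high vs. low sales in Carseats.
carseats <- Carseats
carseats$High <- factor(ifelse(carseats$Sales > 8, "Yes", "No"))

fit <- tree(High ~ . - Sales, data = carseats)

# Note 2: cv.tree() evaluates a sequence of subtrees; $k holds the
# corresponding values of alpha and $size the number of leaves |T|.
set.seed(1)
cv_fit <- cv.tree(fit, FUN = prune.misclass)
cv_fit$size   # number of leaves |T| for each subtree
cv_fit$k      # matching values of the tuning parameter alpha
cv_fit$dev    # cross-validation error (misclassification counts here)
plot(cv_fit$size, cv_fit$dev, type = "b",
     xlab = "Number of leaves |T|", ylab = "CV error")

# Note 4: the null rate is the error incurred by assigning every
# observation to the most common class.
null_rate <- 1 - max(table(carseats$High)) / nrow(carseats)
null_rate
```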
Copyright information
© 2013 Springer Science+Business Media New York
Cite this chapter
James, G., Witten, D., Hastie, T., Tibshirani, R. (2013). Tree-Based Methods. In: An Introduction to Statistical Learning. Springer Texts in Statistics, vol 103. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7138-7_8
Print ISBN: 978-1-4614-7137-0
Online ISBN: 978-1-4614-7138-7