Abstract
This paper examines the problem of detecting optimal split points for linear regression trees. It proposes a novel approach, Turning Point Regression Tree Induction (TPRTI), which uses turning points to identify the best split points. The approach has three steps. First, a general trend is derived from the original dataset: a sliding window divides the dataset into subsets, and a centroid is computed for each subset. Second, those centroids are used to identify a set of turning points, that is, points in the input space at which the regression function associated with neighboring subsets changes direction. Third, the turning points are passed as candidate split points to a novel linear regression tree induction algorithm. TPRTI is compared with state-of-the-art regression tree approaches, such as M5, in a set of experiments on artificial and real-world datasets. The experimental results indicate that TPRTI achieves high predictive accuracy and induces less complex trees than the competing approaches, while still scaling to larger datasets.
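The first two steps can be illustrated with a minimal sketch for a single input variable. This is not the authors' implementation: the abstract does not say how windows are sized or how a change of direction is measured, so the function names `window_centroids` and `turning_points`, the `window`, `step`, and `min_angle_deg` parameters, and the angle-based test are all assumptions made for illustration.

```python
import numpy as np

def window_centroids(x, y, window, step=None):
    """Sort points by x, slide a fixed-size window, return one (x, y) centroid per window.

    Assumption: windows hold `window` consecutive points and advance by `step`
    points (non-overlapping when step is None).
    """
    step = step or window
    order = np.argsort(x)
    xs = np.asarray(x, dtype=float)[order]
    ys = np.asarray(y, dtype=float)[order]
    starts = range(0, max(len(xs) - window, 0) + 1, step)
    return np.array([[xs[s:s + window].mean(), ys[s:s + window].mean()]
                     for s in starts])

def turning_points(centroids, min_angle_deg=30.0):
    """Return x-coordinates of interior centroids where the trend changes direction.

    Assumption: a direction change is flagged when the angle between the
    segments entering and leaving a centroid exceeds `min_angle_deg`.
    """
    c = np.asarray(centroids, dtype=float)
    d = np.diff(c, axis=0)                       # segments between centroids
    angles = np.degrees(np.arctan2(d[:, 1], d[:, 0]))
    change = np.abs(np.diff(angles))             # one value per interior centroid
    return c[1:-1, 0][change > min_angle_deg]

# Usage: a V-shaped trend whose slope flips near x = 10.
x = np.arange(20)
y = np.abs(x - 10)
candidates = turning_points(window_centroids(x, y, window=4))
```

The returned x-coordinates would then serve as candidate split points for the tree induction step. Note that an angle test of this kind depends on the relative scaling of input and output, so in practice the data would likely need to be normalized first.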
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Amalaman, P.K., Eick, C.F., Rizk, N. (2013). Using Turning Point Detection to Obtain Better Regression Trees. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2013. Lecture Notes in Computer Science, vol 7988. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39712-7_25
DOI: https://doi.org/10.1007/978-3-642-39712-7_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39711-0
Online ISBN: 978-3-642-39712-7
eBook Packages: Computer Science, Computer Science (R0)