Learning Model Trees from Data Streams

Ikonomovska, Elena; Gama, Joao

doi:10.1007/978-3-540-88411-8_8

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5255)

Abstract

In this paper we propose FIMT, a fast and incremental algorithm for learning model trees from data streams for regression problems. The algorithm works online, processes each example once at the speed at which it arrives, and maintains an anytime regression model. The leaves contain linear models trained online from the examples that fall into them, a process with low complexity. Using linear models in the leaves improves the anytime global performance of the tree. FIMT achieves accuracy competitive with batch learners even on medium-sized datasets, with training time lower by an order of magnitude. We study the properties of FIMT on several artificial and real datasets and evaluate its sensitivity to the order of examples and to the noise level.
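
The abstract does not say how the linear models in the leaves are updated. As a minimal sketch, assuming a plain stochastic-gradient (delta-rule) update, a leaf that sees each example once and can predict at any time might look like the following. The class name `OnlineLinearLeaf`, the learning rate, and the synthetic target function are illustrative assumptions and are not taken from the paper.

```python
import random


class OnlineLinearLeaf:
    """Hypothetical leaf model: a linear predictor updated one example at a time."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate  # assumed constant step size
        self.n_seen = 0

    def predict(self, x):
        # The model is usable at any time, even before many examples arrive.
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn_one(self, x, y):
        # Delta-rule update: move the weights along the prediction error.
        # Cost is linear in the number of features; the example is not stored.
        step = self.learning_rate * (y - self.predict(x))
        self.weights = [w + step * xi for w, xi in zip(self.weights, x)]
        self.bias += step
        self.n_seen += 1


def demo(n_examples=5000, seed=0):
    """Feed a synthetic, noise-free stream y = 3*x0 - 2*x1 + 1 to one leaf."""
    rng = random.Random(seed)
    leaf = OnlineLinearLeaf(n_features=2)
    for _ in range(n_examples):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        y = 3.0 * x[0] - 2.0 * x[1] + 1.0
        leaf.learn_one(x, y)
    return leaf


if __name__ == "__main__":
    leaf = demo()
    print(leaf.weights, leaf.bias)
```

Each update touches every weight once and keeps no history, which is one way to read the abstract's claim that training the leaf models is a low-complexity process. A full model tree would also need a rule for deciding when to split a leaf, which this sketch leaves out.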

Author information

Authors and Affiliations

FEIT – Ss. Cyril and Methodius University, Karpos II bb, 1000, Skopje, Macedonia
Elena Ikonomovska
LIAAD/INESC, FEP – University of Porto, Rua Campo Alegre 823, 4150, Porto, Portugal
Joao Gama

Editor information

Editors and Affiliations

INSA Lyon, LIRIS CNRS UMR 5205, University of Lyon, 69621, Villeurbanne Cedex, France
Jean-François Boulicaut
Department of Computer and Information Science, University of Konstanz, Box M 712, 78457, Konstanz, Germany
Michael R. Berthold
University of Bonn and Fraunhofer IAIS, Schloss Birlinghoven, 53754, Sankt Augustin, Germany
Tamás Horváth

About this paper

Cite this paper

Ikonomovska, E., Gama, J. (2008). Learning Model Trees from Data Streams. In: Boulicaut, JF., Berthold, M.R., Horváth, T. (eds) Discovery Science. DS 2008. Lecture Notes in Computer Science, vol 5255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88411-8_8

DOI: https://doi.org/10.1007/978-3-540-88411-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88410-1
Online ISBN: 978-3-540-88411-8
eBook Packages: Computer Science, Computer Science (R0)
