Fast and Light Boosting for Adaptive Mining of Data Streams

Chu, Fang; Zaniolo, Carlo

doi:10.1007/978-3-540-24775-3_36

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3056)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Supporting continuous mining queries on data streams requires algorithms that (i) are fast, (ii) make light demands on memory resources, and (iii) adapt easily to concept drift. We propose a novel boosting ensemble method that achieves these objectives. The technique is based on a dynamic sample-weight assignment scheme that achieves the accuracy of traditional boosting without requiring multiple passes through the data. It assures faster learning and competitive accuracy using simpler base models. The scheme is then extended to handle concept drift via change detection. The change detection approach targets significant data changes that could seriously degrade ensemble performance, and replaces the obsolete ensemble with one built from scratch. Experimental results confirm the advantages of our adaptive boosting scheme over previous approaches.
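The abstract does not give the weighting formula or the change-detection test, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' algorithm. It assumes binary labels, a stream that arrives in fixed-size blocks, a single pass per block, a hypothetical reweighting of `(1 - err) / err` for samples the current ensemble misclassifies, and a plain error-jump threshold standing in for the paper's change detection. The names `Stump`, `AdaptiveBoostingEnsemble`, `max_models` and `drop_threshold` are illustrative only.

```python
from collections import deque


class Stump:
    """One-feature threshold classifier trained on weighted samples (labels 0/1)."""

    def fit(self, X, y, w):
        best = None  # (weighted_error, feature, threshold, sign)
        for j in range(len(X[0])):
            for t in sorted({x[j] for x in X}):
                for sign in (1, -1):
                    err = sum(
                        wi for xi, yi, wi in zip(X, y, w)
                        if (1 if sign * (xi[j] - t) >= 0 else 0) != yi
                    )
                    if best is None or err < best[0]:
                        best = (err, j, t, sign)
        _, self.j, self.t, self.sign = best
        return self

    def predict(self, x):
        return 1 if self.sign * (x[self.j] - self.t) >= 0 else 0


class AdaptiveBoostingEnsemble:
    """Block-based ensemble: one new base model per block, bounded memory."""

    def __init__(self, max_models=5, drop_threshold=0.3):
        self.models = deque(maxlen=max_models)  # oldest model is evicted when full
        self.drop_threshold = drop_threshold    # assumed change-detection cutoff
        self.best_error = None                  # lowest block error seen so far

    def predict(self, x):
        if not self.models:
            return 0
        votes = sum(m.predict(x) for m in self.models)
        return 1 if 2 * votes >= len(self.models) else 0  # majority vote

    def update(self, X, y):
        """Process one block in a single pass; return True if the ensemble was reset."""
        weights = [1.0] * len(y)
        reset = False
        if self.models:
            wrong = [self.predict(x) != yi for x, yi in zip(X, y)]
            err = sum(wrong) / len(y)
            jumped = (self.best_error is not None
                      and err - self.best_error > self.drop_threshold)
            if jumped:
                # Assumed stand-in for change detection: discard the obsolete
                # ensemble and rebuild from scratch, starting with this block.
                self.models.clear()
                self.best_error = None
                reset = True
            else:
                self.best_error = (err if self.best_error is None
                                   else min(self.best_error, err))
                if 0 < err < 0.5:
                    # Assumed weighting: emphasize samples the ensemble got wrong.
                    boost = (1 - err) / err
                    weights = [boost if miss else 1.0 for miss in wrong]
        self.models.append(Stump().fit(X, y, weights))
        return reset
```

Calling `update(X, y)` once per block keeps memory bounded by `max_models` base models, and a `True` return value signals that the sketch judged the old ensemble obsolete and restarted it on the current block.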

Author information

Authors and Affiliations

University of California, Los Angeles, CA, 90095, USA
Fang Chu & Carlo Zaniolo

Editor information

Editors and Affiliations

School of Engineering and Information Technology, Deakin University, VIC 3125, Australia
Honghua Dai
University of Illinois at Urbana-Champaign, 61801, Urbana, IL, USA
Ramakrishnan Srikant
Faculty of Engineering and Information Technology, Centre for Quantum Computation and Intelligent Systems, and Australian ACS National Committee for Artificial Intelligence, University of Technology, Sydney, Australia
Chengqi Zhang

Cite this paper

Chu, F., Zaniolo, C. (2004). Fast and Light Boosting for Adaptive Mining of Data Streams. In: Dai, H., Srikant, R., Zhang, C. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2004. Lecture Notes in Computer Science, vol 3056. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24775-3_36

DOI: https://doi.org/10.1007/978-3-540-24775-3_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22064-0
Online ISBN: 978-3-540-24775-3
eBook Packages: Springer Book Archive