Adaptive Ensembles for Evolving Data Streams – Combining Block-Based and Online Solutions

  • Conference paper
New Frontiers in Mining Complex Patterns (NFMCP 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9607)

Abstract

This paper discusses learning ensemble classifiers from concept-drifting data streams. It starts with a general overview of such ensembles and then examines in detail the differences between block-based and online ensembles. We hypothesize that it is still possible to develop new ensembles that combine the most beneficial properties of both types of classifiers. Two such ensembles are described: the Accuracy Updated Ensemble (AUE), designed to process data blocks, and its incremental version, the Online Accuracy Updated Ensemble (OAUE), which learns from single examples.
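To make the distinction between block-based and online processing concrete, here is a minimal Python sketch. It is not the AUE or OAUE algorithm from the paper. The toy learner and every name in it (`MajorityClass`, `block_based`, `online`, `block_size`, `max_members`) are hypothetical, and weighting members by their accuracy on the newest block is a simplifying assumption chosen only for illustration.

```python
from collections import Counter


class MajorityClass:
    """Toy incremental learner (assumption): predicts the most frequent label seen."""

    def __init__(self):
        self.counts = Counter()

    def learn(self, x, y):
        self.counts[y] += 1

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None


def accuracy(model, block):
    """Fraction of examples in the block that the model labels correctly."""
    return sum(model.predict(x) == y for x, y in block) / len(block)


def weighted_vote(members, x):
    """Combine member predictions, each vote scaled by the member's weight."""
    votes = Counter()
    for model, weight in members:
        votes[model.predict(x)] += weight
    return votes.most_common(1)[0][0] if votes else None


def block_based(stream, block_size=4, max_members=3):
    """Block-based processing: the ensemble changes only when a block is full."""
    members, block, predictions = [], [], []
    for x, y in stream:
        predictions.append(weighted_vote(members, x))  # test first, then train
        block.append((x, y))
        if len(block) == block_size:
            # Re-weight existing members by their accuracy on the newest block.
            members = [(m, accuracy(m, block)) for m, _ in members]
            # Train a fresh member on the newest block only.
            fresh = MajorityClass()
            for bx, by in block:
                fresh.learn(bx, by)
            members.append((fresh, 1.0))
            # Keep the highest-weighted members.
            members = sorted(members, key=lambda mw: mw[1], reverse=True)[:max_members]
            block = []
    return predictions


def online(stream):
    """Online processing: a single learner is updated after every example."""
    model, predictions = MajorityClass(), []
    for x, y in stream:
        predictions.append(model.predict(x))  # test first, then train
        model.learn(x, y)
    return predictions


# Sudden drift: the label switches from 0 to 1 halfway through the stream.
stream = [(i, 0) for i in range(8)] + [(i, 1) for i in range(8)]
print(block_based(stream))
print(online(stream))
```

On this toy stream the block-based loop cannot predict anything until its first block is complete, and it reacts to the drift only at the next block boundary. The online loop updates after every example, but its single non-forgetting learner keeps predicting the old label. Both limitations are artifacts of this simplified sketch, and they illustrate why combining properties of the two processing modes is attractive.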

Acknowledgment

The research on this paper was supported by the Polish National Science Center under grant no. DEC-2013/11/B/ST6/00963. The close co-operation with Dariusz Brzezinski on developing the new AUE and OAUE ensembles, and their experimental evaluation, is also acknowledged.

Author information

Corresponding author

Correspondence to Jerzy Stefanowski.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Stefanowski, J. (2016). Adaptive Ensembles for Evolving Data Streams – Combining Block-Based and Online Solutions. In: Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2015. Lecture Notes in Computer Science, vol 9607. Springer, Cham. https://doi.org/10.1007/978-3-319-39315-5_1

  • DOI: https://doi.org/10.1007/978-3-319-39315-5_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-39314-8

  • Online ISBN: 978-3-319-39315-5

  • eBook Packages: Computer Science, Computer Science (R0)
