CPF: Concept Profiling Framework for Recurring Drifts in Data Streams

Anderson, Robert; Koh, Yun Sing; Dobbie, Gillian

doi:10.1007/978-3-319-50127-7_17

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9992)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

We propose the Concept Profiling Framework (CPF), a meta-learner that combines a concept drift detector with a collection of classification models to classify data streams with recurrent concept drifts effectively. CPF relates models to one another by the similarity of their classifying behaviour. We introduce a memory-efficient version of our framework and show that it runs faster and uses less memory than a naïve implementation while achieving similar accuracy. We compare this memory-efficient version of CPF to a state-of-the-art meta-learner designed to handle recurrent drift and show that it regularly achieves better classification accuracy, runtime and memory use. Results from testing on synthetic and real-world datasets demonstrate CPF’s value in classifying data streams with recurrent concepts.
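The idea of relating models by the similarity of their classifying behaviour can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes that, after a drift is detected, a candidate model trained on a buffer of recent instances is compared with each stored model by prediction agreement on that buffer, and that a stored model is reused when agreement reaches a threshold. The names `agreement`, `select_model`, and `min_similarity`, and the default of 0.95, are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: a "model" is any callable mapping a feature vector to a class label.
Model = Callable[[Sequence[float]], int]


def agreement(model_a: Model, model_b: Model,
              instances: Sequence[Sequence[float]]) -> float:
    """Fraction of instances on which the two models predict the same label."""
    if not instances:
        return 0.0
    same = sum(1 for x in instances if model_a(x) == model_b(x))
    return same / len(instances)


def select_model(pool: List[Model], candidate: Model,
                 buffer: Sequence[Sequence[float]],
                 min_similarity: float = 0.95) -> Tuple[int, bool]:
    """Pick the model to use after a detected drift.

    Returns (index into pool, reused flag). If a stored model agrees with
    the candidate on at least `min_similarity` of the buffered instances,
    the most similar stored model is reused; otherwise the candidate is
    appended to the pool as a new concept.
    """
    best_idx, best_sim = -1, 0.0
    for i, stored in enumerate(pool):
        sim = agreement(stored, candidate, buffer)
        if sim > best_sim:
            best_idx, best_sim = i, sim
    if best_idx >= 0 and best_sim >= min_similarity:
        return best_idx, True
    pool.append(candidate)
    return len(pool) - 1, False


# Hypothetical one-feature example: two near-identical threshold rules
# and one rule that predicts the opposite label.
buffer = [[v / 10] for v in range(10)]
pool: List[Model] = [lambda x: int(x[0] > 0.5)]
recurring = select_model(pool, lambda x: int(x[0] > 0.52), buffer)
novel = select_model(pool, lambda x: int(x[0] <= 0.5), buffer)
```

In this toy run the first candidate behaves like the stored model on every buffered instance, so the stored model is reused; the second disagrees everywhere, so it is stored as a new model. A real stream setting would pair this with an actual drift detector and incremental classifiers.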

Author information

Authors and Affiliations

Department of Computer Science, University of Auckland, Auckland, New Zealand
Robert Anderson, Yun Sing Koh & Gillian Dobbie

Corresponding author

Correspondence to Robert Anderson.

Editor information

Editors and Affiliations

University of Tasmania, Hobart, Australia
Byeong Ho Kang
Auckland University of Technology, Auckland, New Zealand
Quan Bai

About this paper

Cite this paper

Anderson, R., Koh, Y.S., Dobbie, G. (2016). CPF: Concept Profiling Framework for Recurring Drifts in Data Streams. In: Kang, B.H., Bai, Q. (eds) AI 2016: Advances in Artificial Intelligence. AI 2016. Lecture Notes in Computer Science (LNAI), vol 9992. Springer, Cham. https://doi.org/10.1007/978-3-319-50127-7_17

DOI: https://doi.org/10.1007/978-3-319-50127-7_17
Published: 29 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50126-0
Online ISBN: 978-3-319-50127-7
eBook Packages: Computer Science, Computer Science (R0)
