Abstract
This chapter gives a brief overview of three special topics in online machine learning, each of which is covered extensively in recent surveys. Section “Reinforcement Learning” surveys reinforcement learning. Section “Unsupervised Data Mining” describes unsupervised data mining methods, including clustering, frequent itemset mining, dimensionality reduction, and topic modeling. Section “Concept Drift and Adaptive Learning” introduces the notion of dataset drift, also called concept drift, and lists the most important drift adaptation methods. We discuss only representative results in these areas. This chapter extends three other chapters in this handbook: “Overview of Online Machine Learning in Big Data Streams,” “Online Machine Learning Algorithms Over Data Streams,” and “Recommender Systems Over Data Streams.”
Acknowledgements
Support from the EU H2020 grant Streamline No 688191 and the “Big Data—Momentum” grant of the Hungarian Academy of Sciences.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this entry
Cite this entry
Benczúr, A.A., Kocsis, L., Pálovics, R. (2018). Reinforcement Learning, Unsupervised Methods, and Concept Drift in Stream Learning. In: Sakr, S., Zomaya, A. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_327-1
DOI: https://doi.org/10.1007/978-3-319-63962-8_327-1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering