Abstract
Big data research has become a pioneering area in the field of information systems. Data stream mining is a well-studied problem in traditional data mining environments, but it still needs exploration in the context of big data. This paper reviews the research activities, scientific practices, and methods that have been developed for stream big data. In addition, it examines well-known real-time platforms that are evolving to handle streaming problems and that share similarities with technologies for non-real-time data in their use of main memory and distributed computing. Finally, it summarizes the open issues and challenges that current technologies face in acquiring and processing big data in real time.
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Tidke, B., Mehta, R. (2018). A Comprehensive Review and Open Challenges of Stream Big Data. In: Pant, M., Ray, K., Sharma, T., Rawat, S., Bandyopadhyay, A. (eds) Soft Computing: Theories and Applications. Advances in Intelligent Systems and Computing, vol 584. Springer, Singapore. https://doi.org/10.1007/978-981-10-5699-4_10
DOI: https://doi.org/10.1007/978-981-10-5699-4_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5698-7
Online ISBN: 978-981-10-5699-4
eBook Packages: Engineering, Engineering (R0)