A Comprehensive Review and Open Challenges of Stream Big Data

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 584)

Abstract

Big data research has come to the forefront of the field of information systems. Data streams are a well-studied problem in traditional data mining environments, but they still need exploration in the context of big data. This paper reviews the research activities, scientific practices, and methods that have been developed for stream big data. In addition, it examines well-known real-time platforms that are evolving to handle streaming problems and that resemble platforms for non-real-time data in their use of main memory and distributed computing technologies. Finally, it summarizes the open issues and challenges that current technologies face when acquiring and processing big data in real time.

Author information

Correspondence to Bharat Tidke.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Tidke, B., Mehta, R. (2018). A Comprehensive Review and Open Challenges of Stream Big Data. In: Pant, M., Ray, K., Sharma, T., Rawat, S., Bandyopadhyay, A. (eds) Soft Computing: Theories and Applications. Advances in Intelligent Systems and Computing, vol 584. Springer, Singapore. https://doi.org/10.1007/978-981-10-5699-4_10

  • DOI: https://doi.org/10.1007/978-981-10-5699-4_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-5698-7

  • Online ISBN: 978-981-10-5699-4

  • eBook Packages: Engineering, Engineering (R0)
