Fault Tolerant Data Stream Processing in Cooperation with OLTP Engine

  • Yoshiharu Ishikawa
  • Kento Sugiura
  • Daiki Takao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11297)

Abstract

In recent years, the growth of big data, the spread of IoT technology, and the continual evolution of hardware have further increased the demand for data stream processing. Meanwhile, in the field of database systems, a new demand is emerging for HTAP (hybrid transactional and analytical processing), which integrates the functions of on-line transaction processing (OLTP) and on-line analytical processing (OLAP). Against this background, our group has started a new project, in cooperation with other research groups in Japan, to develop data stream processing technologies for the HTAP environment. Our main focus is to develop new data stream processing methodologies, such as fault tolerance, in cooperation with the OLTP engine. In this paper, we describe the background, objectives, and issues of the research.

Keywords

Data stream processing, Fault tolerance, Query processing, OLTP, HTAP

Notes

Acknowledgments

This paper is based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization (NEDO) and a project supported by JSPS KAKENHI Grant Number 16H01722.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Yoshiharu Ishikawa (1, corresponding author)
  • Kento Sugiura (1)
  • Daiki Takao (1)
  1. Graduate School of Informatics, Nagoya University, Nagoya, Japan
