Stateful Load Balancing for Parallel Stream Processing

  • Qingsong Guo
  • Yongluan Zhou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10659)

Abstract

Timely parallel stream processing requires dynamic load balancing to mitigate data skew. In this paper, we study this problem for stateful operators with key grouping, for which load balancing involves substantial state movement. Consequently, load balancing becomes a bi-objective optimization problem, Minimum-Cost-Load-Balance (MCLB), which must both reduce load imbalance and limit the cost of state movement. We address MCLB with two approximate algorithms, each relaxing one of the objectives: (1) ELB, a greedy algorithm that performs load balancing eagerly but relaxes the load-imbalance objective to a range; and (2) CLB, a periodic algorithm that reduces load imbalance through a greedy procedure minimizing the covariance of substreams, and that disregards the state-movement objective by amortizing its overhead over a relatively long period. We evaluate our approaches on both synthetic and real data. The results show that they adapt effectively to load variations and improve latency compared with existing solutions, which ignore the overhead of state movement in stateful load balancing.
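To make the trade-off between load imbalance and state movement concrete, the following is a minimal Python sketch loosely in the spirit of the eager strategy described above. It is not the paper's ELB algorithm, whose details the abstract does not give. The function name `eager_rebalance`, the `tolerance` parameter, and the load-per-state-size selection heuristic are all assumptions made for illustration.

```python
def eager_rebalance(assignment, load, state_size, num_workers, tolerance=0.1):
    """Hypothetical greedy sketch (not the paper's ELB).

    assignment: dict key -> worker index
    load:       dict key -> load contributed by that key
    state_size: dict key -> cost of migrating that key's state
    Moves keys off the most loaded worker until its load is within
    (1 + tolerance) * mean, preferring keys with a high load per unit of state.
    """
    assignment = dict(assignment)  # copy so the caller's mapping is untouched
    worker_load = [0.0] * num_workers
    for key, worker in assignment.items():
        worker_load[worker] += load[key]
    limit = (1 + tolerance) * sum(worker_load) / num_workers

    migrations, moved_state = [], 0.0
    while True:
        src = max(range(num_workers), key=worker_load.__getitem__)
        dst = min(range(num_workers), key=worker_load.__getitem__)
        if worker_load[src] <= limit:
            break  # imbalance is inside the relaxed range
        gap = worker_load[src] - worker_load[dst]
        # Only keys lighter than the gap lower the larger of the two loads.
        candidates = [k for k, w in assignment.items()
                      if w == src and 0 < load[k] < gap]
        if not candidates:
            break  # no single move helps; give up rather than loop forever
        # Assumed heuristic: most load relieved per unit of state moved.
        best = max(candidates, key=lambda k: load[k] / (state_size[k] + 1e-9))
        assignment[best] = dst
        worker_load[src] -= load[best]
        worker_load[dst] += load[best]
        moved_state += state_size[best]
        migrations.append((best, src, dst))
    return assignment, migrations, moved_state, worker_load
```

Raising `tolerance` widens the accepted imbalance range and so tends to reduce the amount of state moved, which is the kind of relaxation the abstract attributes to the eager approach.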

Keywords

Stream processing, Load balancing, State movement

Notes

Acknowledgements

The author from North University of China is supported by NSFC No. 61602427 and NSF of Shanxi No. 201601D202037.

Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. North University of China, Taiyuan, China
  2. University of Copenhagen, Copenhagen, Denmark