A Fuzzy Load Balancer for Adaptive Fault Tolerance Management in Cloud Platforms

  • Hamid Arabnejad
  • Claus Pahl
  • Giovani Estrada
  • Areeg Samir
  • Frank Fowley
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10465)


To achieve high levels of reliability, availability and performance in cloud environments, a fault tolerance approach that handles failures effectively is needed. Most existing research has focused on explicit, specification-driven solutions, which require too much effort from application developers and lead to inflexibility. We propose a fuzzy job distributor (load balancer) for fault tolerance management that reduces management complexity for the user. The proposed approach aims to reduce the possibility of fault occurrences in the system through a fair distribution of user job requests among the available resources. In our self-adaptive approach, the system manages anomalous situations that might lead to failure by distributing incoming job requests based on the reliability of the processing nodes, i.e., virtual machines (VMs). The reliability of a VM is a variable parameter that changes during its lifetime. Our approach is implemented and comparatively analysed using OpenStack. The experimental results show a significant reduction in the occurrence of faults in comparison with other load balancing algorithms.
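The idea of distributing jobs according to VM reliability can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the rule base, so the `VM` class, the three fuzzy sets (low, medium, high), their breakpoints, the output weights in `RULE_OUTPUT`, and the greedy `pick_vm` policy are all hypothetical choices made for illustration. Reliability is assumed to be a score in [0, 1].

```python
from dataclasses import dataclass


def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership with peak at b; a == b or b == c gives a shoulder."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over a reliability score in [0, 1].
FUZZY_SETS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}

# Hypothetical rule outputs: the dispatch weight each fuzzy set recommends.
RULE_OUTPUT = {"low": 0.1, "medium": 0.5, "high": 1.0}


def dispatch_weight(reliability: float) -> float:
    """Fuzzify the reliability, then defuzzify by weighted average."""
    memberships = {
        name: triangular(reliability, *abc) for name, abc in FUZZY_SETS.items()
    }
    total = sum(memberships.values())
    if total == 0.0:  # reliability outside [0, 1]: treat the VM as unusable
        return 0.0
    return sum(memberships[n] * RULE_OUTPUT[n] for n in memberships) / total


@dataclass
class VM:
    name: str
    reliability: float  # assumed to be refreshed by some monitoring component
    active_jobs: int = 0


def pick_vm(vms: list[VM]) -> VM:
    """Pick the VM with the best weight-to-load ratio (greedy policy)."""
    return max(vms, key=lambda vm: dispatch_weight(vm.reliability) / (1 + vm.active_jobs))


def distribute(vms: list[VM], n_jobs: int) -> None:
    """Assign n_jobs one at a time, updating each VM's load as we go."""
    for _ in range(n_jobs):
        pick_vm(vms).active_jobs += 1


if __name__ == "__main__":
    pool = [VM("vm-a", 0.9), VM("vm-b", 0.2)]
    distribute(pool, 10)
    for vm in pool:
        print(vm.name, vm.active_jobs)
```

Because the reliability field is read on every call to `pick_vm`, updating it between requests changes subsequent placements, which mirrors the point that VM reliability varies over the VM's lifetime.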


Load balancing, Job distributor, Fault tolerance, Fuzzy logic, Cloud computing, Anomaly detection, OpenStack



This work was partly supported by IC4 (the Irish Centre for Cloud Computing and Commerce), funded by EI and the IDA.



Copyright information

© IFIP International Federation for Information Processing 2017

Authors and Affiliations

  • Hamid Arabnejad (1)
  • Claus Pahl (2)
  • Giovani Estrada (3)
  • Areeg Samir (2)
  • Frank Fowley (1)
  1. IC4, Dublin City University, Dublin, Ireland
  2. Free University of Bozen-Bolzano, Bolzano, Italy
  3. Intel, Leixlip, Ireland
