Wireless Personal Communications, Volume 107, Issue 1, pp 23–40

Task Failure Prediction using Combine Bagging Ensemble (CBE) Classification in Cloud Workflow

  • P. Padmakumari
  • A. Umamakeswari


Scientific applications adopt cloud environments to execute their workflows as tasks. Because workflow tasks depend on one another, the failure of a single task degrades the overall performance of the execution. An efficient failure prediction mechanism is therefore needed to execute workflows efficiently. This paper proposes a failure prediction method implemented using various machine learning classifiers. Among the classifiers evaluated, Naïve Bayes predicts failures with the highest accuracy, 94.4%. To improve prediction accuracy further, a novel ensemble method called combine bagging ensemble (CBE) is introduced, which achieves an overall accuracy of 95.8%. The proposed method is validated by comparing results from simulation and from a real-time cloud testbed.
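The abstract does not spell out how the combine bagging ensemble is built, so the following is only a minimal sketch of plain bagging with Naïve Bayes base learners, not the authors' CBE method. The feature names (CPU load, memory use), the toy data, the variance smoothing constant, and the function names `bagging_fit` and `bagging_predict` are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


class GaussianNB:
    """Tiny Gaussian Naive Bayes for numeric features (illustrative only)."""

    def fit(self, X, y):
        # label -> (log prior, [(mean, variance) per feature])
        self.stats = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            params = []
            for col in zip(*rows):
                mean = sum(col) / len(col)
                # 1e-2 is an assumed smoothing term that keeps variances away from zero
                var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-2
                params.append((mean, var))
            self.stats[label] = (math.log(len(rows) / len(y)), params)
        return self

    def predict_one(self, x):
        def score(label):
            log_prior, params = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
                for v, (mean, var) in zip(x, params)
            )
        return max(self.stats, key=score)


def bagging_fit(X, y, n_models=11, seed=0):
    """Train n_models base learners, each on a bootstrap resample of (X, y)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(GaussianNB().fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Combine the base learners by majority vote."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]


# Hypothetical task records: [cpu_load, memory_use] -> task outcome
X = [[0.90, 0.80], [0.95, 0.85], [0.88, 0.90], [0.92, 0.75],
     [0.20, 0.30], [0.10, 0.25], [0.15, 0.20], [0.25, 0.35]]
y = ["fail"] * 4 + ["ok"] * 4

models = bagging_fit(X, y)
print(bagging_predict(models, [0.90, 0.80]))
print(bagging_predict(models, [0.15, 0.25]))
```

Each base learner sees a different bootstrap resample of the training records, and the ensemble takes the majority vote; averaging over resamples in this way is the general reason bagging tends to reduce the variance of a single classifier.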


Keywords: Machine learning, Ensemble, Cloud computing, Fault prediction, Task failure, Scientific workflow




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computing, SASTRA Deemed University, Thanjavur, India
