Energy-Aware Fault-Tolerant Scheduling Under Reliability and Time Constraints in Heterogeneous Systems

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10956)

Abstract

As heterogeneous systems are deployed widely in various fields, reliability has become a major concern. Fault tolerance therefore receives a great deal of attention in both industry and academia, especially for safety-critical systems. Such systems require tasks to be carried out correctly within a given deadline even when an error occurs, so it is imperative that they support fault tolerance. Scheduling is an efficient approach to achieving fault tolerance: it allocates multiple copies of tasks to processors. Existing fault-tolerant scheduling algorithms realize fault tolerance without considering an energy limit. To address this issue, this paper proposes an energy-aware fault-tolerant scheduling algorithm, DRB-FTSA-E. The algorithm adopts the active replication strategy and aims for a high utilization of energy consumption while completing a set of tasks under given reliability and time constraints. It finds all schemes that meet the time and system reliability constraints and chooses the one with the maximum utilization of energy consumption as the final scheduling scheme. Experimental simulation results show that the proposed algorithm can effectively achieve the maximum utilization of energy consumption while meeting the reliability and time constraints.
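The selection step described in the abstract (enumerate schemes, keep those that meet the time and reliability constraints, pick the one with the highest energy utilization) can be illustrated with a small brute-force sketch. It is not DRB-FTSA-E itself. Everything concrete here is an assumption made for illustration: tasks are treated as independent (no precedence graph), each replica's reliability follows an exponential failure model `exp(-fail_rate * time)`, replicas on the same processor run back to back, and "energy utilization" is taken to be consumed energy divided by a hypothetical energy budget. All function and parameter names are hypothetical.

```python
import math
from itertools import combinations, product


def replica_sets(n_procs):
    """All non-empty subsets of processor indices (candidate replica placements)."""
    procs = range(n_procs)
    return [c for k in range(1, n_procs + 1) for c in combinations(procs, k)]


def evaluate(scheme, exec_time, fail_rate, power):
    """Return (makespan, reliability, energy) for one scheme.

    scheme[i] is the tuple of processors that each run a copy of task i
    (active replication: every copy is executed).
    """
    load = [0.0] * len(fail_rate)  # busy time per processor
    reliability, energy = 1.0, 0.0
    for task, procs in enumerate(scheme):
        all_fail = 1.0  # probability that every copy of this task fails
        for p in procs:
            t = exec_time[task][p]
            load[p] += t
            energy += power[p] * t
            # Assumed exponential failure model for a single copy.
            all_fail *= 1.0 - math.exp(-fail_rate[p] * t)
        reliability *= 1.0 - all_fail
    return max(load), reliability, energy


def select_scheme(exec_time, fail_rate, power,
                  deadline, min_reliability, energy_budget):
    """Exhaustively pick the feasible scheme with the highest energy utilization.

    Utilization is ASSUMED to be energy / energy_budget; the paper's
    actual definition may differ. Returns (None, -1.0) if nothing is feasible.
    """
    best, best_util = None, -1.0
    options = replica_sets(len(fail_rate))
    for scheme in product(options, repeat=len(exec_time)):
        makespan, rel, energy = evaluate(scheme, exec_time, fail_rate, power)
        if makespan > deadline or rel < min_reliability or energy > energy_budget:
            continue  # violates a time, reliability, or energy constraint
        util = energy / energy_budget
        if util > best_util:
            best, best_util = scheme, util
    return best, best_util
```

For example, with two tasks on two processors, `select_scheme([[2, 3], [4, 2]], [0.01, 0.02], [1.0, 2.0], deadline=6, min_reliability=0.95, energy_budget=15)` replicates the first task on both processors and runs a single copy of the second. The search is exponential in the number of tasks and processors, so it only serves to make the filter-then-select idea concrete on toy inputs.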

Acknowledgment

The authors would like to express their sincere gratitude to the editors and the referees. This work was supported by the National Natural Science Foundation of China (Grant Nos. 61602350, 61602349) and the Open Foundation of Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (2016znss26C).

Corresponding author

Correspondence to Jing Liu.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Guo, T., Liu, J., Hu, W., Wei, M. (2018). Energy-Aware Fault-Tolerant Scheduling Under Reliability and Time Constraints in Heterogeneous Systems. In: Huang, DS., Gromiha, M., Han, K., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2018. Lecture Notes in Computer Science, vol 10956. Springer, Cham. https://doi.org/10.1007/978-3-319-95957-3_5

  • DOI: https://doi.org/10.1007/978-3-319-95957-3_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-95956-6

  • Online ISBN: 978-3-319-95957-3

  • eBook Packages: Computer Science, Computer Science (R0)
