Active Accelerated Self-healing as a Key Design Knob for Cross-Layer Resilience

Abstract

Cross-layer resilience comes closer to optimal reliability by breaking the abstraction boundaries between the layers of the system stack. In this chapter, we discuss how accelerated and active self-healing methods can be applied effectively at different levels of the system hierarchy. The circuit blocks presented in the previous chapter serve as the underlying infrastructure for recovery. At the architecture level, unit-level self-healing and intrinsic heat exploit architectural opportunities to reduce the hardware cost of recovery. At the system level, scheduling that follows a circadian rhythm can be implemented to deeply heal the circuit. Used together, these techniques can compensate for the trade-offs that recovery requires.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Guo, X., Stan, M.R. (2020). Active Accelerated Self-healing as a Key Design Knob for Cross-Layer Resilience. In: Circadian Rhythms for Future Resilient Electronic Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-20051-0_5

  • DOI: https://doi.org/10.1007/978-3-030-20051-0_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20050-3

  • Online ISBN: 978-3-030-20051-0

  • eBook Packages: Engineering, Engineering (R0)
