Abstract
Cross-layer resiliency is a closer-to-optimal way of maximizing reliability: it breaks the boundaries between abstraction layers across the system stack. In this chapter, we discuss how accelerated and active self-healing methods can be applied effectively at different levels of the system hierarchy. The circuit blocks presented in the previous chapter serve as the underlying infrastructure for recovery. At the architecture level, unit-level self-healing and the use of intrinsic heat exploit architectural opportunities to reduce the hardware cost of recovery. At the system level, scheduling that follows a circadian rhythm can be implemented to heal the circuit deeply. Together, these techniques can compensate for the trade-offs that recovery requires.
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Guo, X., Stan, M.R. (2020). Active Accelerated Self-healing as a Key Design Knob for Cross-Layer Resilience. In: Circadian Rhythms for Future Resilient Electronic Systems. Springer, Cham. https://doi.org/10.1007/978-3-030-20051-0_5
DOI: https://doi.org/10.1007/978-3-030-20051-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20050-3
Online ISBN: 978-3-030-20051-0
eBook Packages: Engineering, Engineering (R0)