Abstract
The advent of deep sub-micron technology has exacerbated reliability issues in on-chip interconnects. In particular, soft errors, such as single event upsets, and hard faults are rapidly becoming a major design concern. This trend highlights the importance of detailed analyses of these reliability hazards and the incorporation of comprehensive protection measures into all NoC designs. In this chapter, the author examines the impact of these transient and permanent failures on the reliability of on-chip interconnects and develops comprehensive counter-measures to either prevent or recover from them. In this regard, several novel schemes are proposed to remedy various kinds of soft and hard error symptoms, while keeping area and power overhead to a minimum. The proposed solutions are architected to fully exploit the available infrastructure in an NoC and enable versatile reuse of valuable resources. The effectiveness of the proposed techniques has been validated using a cycle-accurate simulator.
Copyright information
© 2009 Springer Science+Business Media B.V.
About this chapter
Cite this chapter
Nicopoulos, C., Narayanan, V., Das, C.R. (2009). Exploring Fault-Tolerant Network-on-Chip Architectures [37]. In: Network-on-Chip Architectures. Lecture Notes in Electrical Engineering, vol 45. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-3031-3_5
DOI: https://doi.org/10.1007/978-90-481-3031-3_5
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-3030-6
Online ISBN: 978-90-481-3031-3
eBook Packages: Engineering, Engineering (R0)