Abstract
To achieve sustainability, computing systems demand a high-performance and energy-efficient on-chip communication infrastructure. Because of their scalability, reusability and high throughput, Networks-on-Chip (NoCs) have been increasingly adopted in sustainable computing systems. The growing number of transient and permanent errors induced by scaled technologies adds a new challenge, reliability, to sustainable computing system design. This chapter gives an overview of the commonly used techniques for reliable NoC design. It also presents recent energy-efficient NoC link and router design approaches.
Copyright information
© 2013 Springer Science+Business Media New York
About this chapter
Cite this chapter
Ampadu, P., Yu, Q., Fu, B. (2013). Reliable Networks-on-Chip Design for Sustainable Computing Systems. In: Pande, P., Ganguly, A., Chakrabarty, K. (eds) Design Technologies for Green and Sustainable Computing Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4975-1_2
DOI: https://doi.org/10.1007/978-1-4614-4975-1_2
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-4974-4
Online ISBN: 978-1-4614-4975-1
eBook Packages: Engineering, Engineering (R0)