Abstract
This is the second and last chapter of this book devoted to on-line Network-on-Chip (NoC) testing strategies. As mentioned before, the main difference between on-line and off-line tests is that the former detect run-time faults during the system’s mission mode, while the latter are typically used to detect manufacturing defects while the system is in test mode. Whereas the previous chapter focuses on link- and router-level techniques, this one presents techniques used at the router, NoC, and system levels. The most widely used techniques at these levels are fault-tolerant and adaptive routing algorithms (where an alternative path is found, avoiding the defective part of the NoC) and fault reconfiguration (where the hardware or the software is reconfigured to mask and isolate the defective block). However, both techniques assume that the exact location of a hardware defect can be pinpointed. This task alone, called fault location, can be a challenge in itself, since NoCs are scalable and can have hundreds or even thousands of switching elements. As in the previous chapter, the test approaches presented in this chapter also have costs in terms of, for instance, silicon area, network performance, network congestion, and energy consumption. Thus, the challenge for the designer is, again, to find a good trade-off between these costs and the potential benefit of the test approach in terms of reliability. However, this trade-off evaluation is typically much more complex at the NoC level than at the link or router level, due to the size of NoCs and the complex data communication patterns of the applications. This chapter presents the most relevant on-line NoC testing strategies at the NoC and system levels and their results in terms of costs and reliability.
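The idea of finding an alternative path around a defective part of the NoC can be illustrated with a minimal sketch. It is not one of the chapter's algorithms: it assumes a 2D mesh topology, assumes the set of faulty routers is already known (that is, fault location has already been done), and uses a centralized breadth-first search rather than a distributed routing algorithm. The function name `route_around_faults` and all of its parameters are hypothetical.

```python
from collections import deque


def route_around_faults(width, height, src, dst, faulty):
    """Return a shortest path of (x, y) routers from src to dst on a
    width x height mesh that avoids every router in `faulty`, or None
    if no such path exists. Illustrative assumption: faults are known."""
    if src in faulty or dst in faulty:
        return None
    parent = {src: None}          # also serves as the visited set
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the parent links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in faulty and nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None
```

For example, on a 3 x 3 mesh with router `(1, 0)` marked faulty, a call such as `route_around_faults(3, 3, (0, 0), (2, 0), {(1, 0)})` returns a detour through the second row instead of the direct two-hop route. A real fault-tolerant routing algorithm must also address deadlock freedom and distributed decision making, which this sketch ignores.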
Notes
1. Self checking can also be implemented in software, but in this case it loses the on-line testing capability and perhaps the ability to locate transient faults. The system needs to be in test mode periodically to locate a permanent fault.
2. They do not require changes to the router's internal design.
3. A total of 500 functional ports, assuming 100 functional routers with five ports each.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Cota, É., de Morais Amory, A., Lubaszewski, M.S. (2012). Error Location and Reconfiguration. In: Reliability, Availability and Serviceability of Networks-on-Chip. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-0791-1_9
DOI: https://doi.org/10.1007/978-1-4614-0791-1_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-0790-4
Online ISBN: 978-1-4614-0791-1
eBook Packages: Engineering, Engineering (R0)