Power and Thermal Effects and Management

Abstract

As semiconductor processes scale to ever smaller feature sizes, the difficulty of manufacturing reliable digital designs is challenging the way systems are traditionally designed. Specifically, as transistors and wires shrink, these components become more prone to complete or parametric failure at manufacturing time. In addition, the resulting systems are increasingly expensive to produce and less likely to function correctly for as long as intended. To address these challenges, NoC-based systems have to be designed with reliability and fault tolerance in mind. Toward this goal, a number of design techniques and methodologies are available that promise to provide sufficient fault coverage with controllable overhead in terms of hardware redundancy and performance (e.g., delay/power) degradation. This chapter studies the origin of faults in modern technologies and explains their classification into transient, intermittent, and permanent faults. A survey of fault tolerance methods is presented to demonstrate the diversity of available approaches. Fault tolerance methods for NoCs are studied at different layers of the OSI reference model.

Author information

Corresponding author

Correspondence to Konstantinos Tatas.

Copyright information

© 2014 Springer Science+Business Media New York

About this chapter

Cite this chapter

Tatas, K., Siozios, K., Soudris, D., Jantsch, A. (2014). Power and Thermal Effects and Management. In: Designing 2D and 3D Network-on-Chip Architectures. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-4274-5_4

  • DOI: https://doi.org/10.1007/978-1-4614-4274-5_4

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-4273-8

  • Online ISBN: 978-1-4614-4274-5

  • eBook Packages: Engineering, Engineering (R0)
