Architecting Dependable Many-Core Processors Using Core-Level Dynamic Redundancy

  • Conference paper
Trustworthy Computing and Services (ISCTCS 2012)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 320)


Abstract

Future many-core processors will probably contain more than 1000 cores on a single die. However, continued scaling of silicon fabrication technology makes such chips orders of magnitude more vulnerable to errors, so reliability techniques must be an essential part of many-core processors. Redundant execution is an efficient way to improve reliability. Existing redundant execution mechanisms such as SRT, CRT, DIVA and RECVF aim to reduce the performance penalty of redundancy by using execution assistance and other speculative mechanisms. We take a different approach and use idle cores in many-core processors to perform the redundant execution. We propose core-level dynamic redundancy (CDR), which has the following distinctive properties: (i) it removes hardware restrictions and supports redundancy on an arbitrary core; (ii) it dynamically chooses the core that executes the redundant copy based on core conditions, and thereby effectively balances reliability, performance and power. Experimental results show the effectiveness of the proposed techniques.
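
As a rough illustration of the idea described in the abstract, the Python sketch below picks an idle core for a redundant run and compares the two results. It is not the authors' CDR design: the `Core` fields, the temperature-based selection rule, the `max_temp_c` threshold and both function names are assumptions made only for this example, since the abstract does not say which core conditions are used or how results are compared.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Core:
    core_id: int
    busy: bool
    temperature_c: float  # hypothetical "core condition" metric


def pick_redundant_core(cores: List[Core], primary_id: int,
                        max_temp_c: float = 85.0) -> Optional[Core]:
    """Pick an idle core other than the primary; prefer the coolest one.

    The selection rule is an assumption for illustration only.
    """
    candidates = [
        c for c in cores
        if not c.busy and c.core_id != primary_id
        and c.temperature_c <= max_temp_c
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda c: c.temperature_c)


def run_with_redundancy(task: Callable[[], int], cores: List[Core],
                        primary_id: int) -> Tuple[int, bool]:
    """Run task once, then again if an idle core is available.

    Returns (result, verified). Both runs happen in this one process;
    the second call stands in for execution on the chosen core.
    """
    primary_result = task()
    checker = pick_redundant_core(cores, primary_id)
    if checker is None:
        # No idle core: trade reliability for performance and skip the check.
        return primary_result, False
    redundant_result = task()
    if primary_result != redundant_result:
        raise RuntimeError("redundant execution mismatch")
    return primary_result, True
```

In this sketch a workload that leaves no idle core simply runs unverified, which is one simple way to express the reliability and performance trade-off the abstract mentions.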


References

  1. Borkar, S.: Thousand core chips: a technology perspective. In: Proceedings of the 44th annual Design Automation Conference (June 2007)

  2. Srinivasan, J., Adve, S.V., Bose, P., Rivers, J.A.: The impact of technology scaling on lifetime reliability. In: Intl. Conf. on Dependable Systems and Networks (June 2004)

  3. Reinhardt, S.K., Mukherjee, S.S.: Transient fault detection via simultaneous multithreading. In: Intl. Symp. on Computer Architecture (June 2000)

  4. Mukherjee, S.S., Kontz, M., Reinhardt, S.K.: Detailed design and evaluation of redundant multithreading alternatives. In: Intl. Symp. on Computer Architecture (May 2002)

  5. Austin, T.: DIVA: A Reliable Substrate For Deep Submicron Microarchitecture Design. In: Proceedings of the 32nd MICRO, pp. 196–207 (1999)

  6. Subramanyan, P., Singh, V., Saluja, K.K., Larsson, E.: Energy-Efficient Fault Tolerance in Chip Multiprocessors Using Critical Value Forwarding. In: Intl. Conf. on Dependable Systems and Networks (June 2010)

  7. Rotenberg, E.: AR-SMT: A microarchitectural approach to fault tolerance in microprocessors. In: Intl. Symp. on Fault-Tolerant Computing (June 1999)

  8. Vijaykumar, T.N., Pomeranz, I., Cheng, K.: Transient-fault recovery using simultaneous multithreading. In: Intl. Symp. on Computer Architecture (May 2002)

  9. Gomaa, M., Scarbrough, C., Vijaykumar, T.N., Pomeranz, I.: Transient-fault recovery for chip multiprocessors. In: Intl. Symp. on Computer Architecture (June 2003)

  10. Smolens, J.C., Gold, B.T., Falsafi, B., Hoe, J.C.: Reunion: Complexity-effective multicore redundancy. In: Intl. Symp. on Microarchitecture (December 2006)

  11. Nomura, S., Sinclair, M.D., Ho, C., Govindaraju, V., de Kruijf, M., Sankaralingam, K.: Sampling + DMR: Practical and Low-overhead Permanent Fault Detection. In: Intl. Symp. on Computer Architecture (June 2011)

  12. Smolens, J.C., Gold, B.T., Kim, J., Falsafi, B., Hoe, J.C., Nowatzyk, A.G.: Fingerprinting: bounding soft-error detection latency and bandwidth. In: Intl. Conf. on ASPLOS (October 2004)

  13. Greskamp, B., Torrellas, J.: Paceline: Improving Single-Thread Performance in Nanoscale CMPs through Core Overclocking. In: Proceedings of the 16th PACT (September 2007)

  14. Subramanyan, P., Singh, V., Saluja, K.K., Larsson, E.: Multiplexed Redundant Execution: A Technique for Efficient Fault Tolerance in Chip Multiprocessors. In: Proc. of DATE (2010)

  15. Zhang, L., Han, Y., Xu, Q., Li, X.: Defect tolerance in homogeneous manycore processors using core-level redundancy with unified topology. In: Proc. Design, Automation and Test in Europe, DATE 2008, pp. 891–896 (2008)

  16. Wentzlaff, D., Griffin, P., Hoffmann, H., Bao, L.W., Edwards, B., Ramey, C., Mattina, M., Miao, C.C., Brown, J.F., Agarwal, A.: On-chip interconnection architecture of the tile processor. IEEE Micro 27(5), 15–31 (2007)


Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jia, W., Zhang, C., Fu, J., Li, R. (2013). Architecting Dependable Many-Core Processors Using Core-Level Dynamic Redundancy. In: Yuan, Y., Wu, X., Lu, Y. (eds) Trustworthy Computing and Services. ISCTCS 2012. Communications in Computer and Information Science, vol 320. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35795-4_86


  • DOI: https://doi.org/10.1007/978-3-642-35795-4_86

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-35794-7

  • Online ISBN: 978-3-642-35795-4

  • eBook Packages: Computer Science, Computer Science (R0)
