Network Fault Detection and Recovery in the Chaos Router

  • Kevin Bolding
  • Lawrence Snyder
Chapter
Part of The Kluwer International Series in Engineering and Computer Science book series (SECS, volume 285)

Abstract

Chaotic routing, which allows packets to follow non-minimal routes, provides a basic level of fault tolerance by allowing messages to be routed around faults without requiring a priori knowledge of their locations. However, the mechanisms for doing this can be slow and clumsy at times. We augment Chaotic routing with a limited amount of hardware to support fault detection, identification, and reconfiguration so that the network can automatically reconfigure itself when faults occur. We present a high-level design of these mechanisms, driven by the goal of achieving reasonable reliability without exorbitant cost.

Copyright information

© Kluwer Academic Publishers 1994

Authors and Affiliations

  • Kevin Bolding
  • Lawrence Snyder
    • 1
  1. Department of Computer Science and Engineering, University of Washington, Seattle
