Abstract
In this paper we propose a sufficient condition for minimal routing in 3-dimensional (3-D) meshes with faulty nodes. It builds on the author's earlier work on minimal routing in 2-dimensional (2-D) meshes. Unlike many traditional models, which assume that every node knows either the global fault distribution or only the status of its adjacent nodes, our approach is based on the concept of limited global fault information. First, we propose a fault model called the faulty cube, in which all faulty nodes in the system are contained in a set of faulty cubes. Fault information is then distributed to a limited number of nodes, yet it remains sufficient to support minimal routing. The limited fault information collected at each node is represented by a vector called the extended safety level. The extended safety level associated with a node can be used to determine whether a minimal path exists from this node to a given destination. Specifically, we study the existence of minimal paths at a given source node, limited distribution of fault information, minimal routing, and deadlock-free and livelock-free routing. Our results show that any minimal routing that is partially adaptive can be applied in our model as long as the destination node meets a certain condition. We also propose a dynamic planar-adaptive routing scheme that offers better fault tolerance and adaptivity than the planar-adaptive routing scheme in 3-D meshes. Our approach is the first attempt to address adaptive and minimal routing in 3-D meshes with faulty nodes using limited fault information.
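To make concrete what "existence of a minimal path" means in a faulty 3-D mesh, the following is a minimal sketch of a brute-force check that assumes full global knowledge of the faulty nodes. It is not the extended safety level method of the paper, which is designed to avoid needing global information; the function name `minimal_path_exists` and the representation of nodes as coordinate triples are assumptions made for illustration only.

```python
from itertools import product


def minimal_path_exists(source, dest, faulty):
    """Return True if a fault-free minimal (shortest) path joins source to dest.

    Nodes are (x, y, z) tuples; `faulty` is a set of such tuples.
    A minimal path only ever moves one hop closer to the destination,
    so it stays inside the bounding box spanned by source and dest.
    """
    if source in faulty or dest in faulty:
        return False
    # Per-dimension step toward the destination: +1, -1, or 0 if aligned.
    steps = [(d > s) - (d < s) for s, d in zip(source, dest)]
    # Offsets from the source along each dimension, 0 .. |d - s|.
    ranges = [range(abs(d - s) + 1) for s, d in zip(source, dest)]
    reachable = {source}
    # Lexicographic order guarantees every predecessor is visited first.
    for offs in product(*ranges):
        node = tuple(s + st * o for s, st, o in zip(source, steps, offs))
        if node == source or node in faulty:
            continue
        for dim in range(3):
            if offs[dim] > 0:
                prev = list(node)
                prev[dim] -= steps[dim]  # neighbor one hop closer to source
                if tuple(prev) in reachable:
                    reachable.add(node)
                    break
    return dest in reachable
```

For example, `minimal_path_exists((0, 0, 0), (2, 2, 2), {(1, 1, 1)})` finds a path around the single faulty node, whereas blocking all three forward neighbors of the source leaves no minimal path. The cost of this check grows with the volume of the bounding box, which is the kind of global computation that limited fault information is meant to replace.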
Additional information
This work was supported in part by the NSF of USA under Grant Nos. CCR 9900646 and ANI 0073736.
A preliminary version of this paper was presented at the International Conference on Parallel and Distributed Systems (ICPADS), 2000.
WU Jie received the B.S. and M.S. degrees from Shanghai University of Science and Technology (now Shanghai University) in 1982 and 1985, respectively, and the Ph.D. degree from Florida Atlantic University in 1989. He is currently a professor at the Department of Computer Science and Engineering, Florida Atlantic University. He has published over 150 papers in various journals and conference proceedings. His research interests are in the areas of mobile computing, routing protocols, fault-tolerant computing, and interconnection networks. Dr. Wu has served on many conference committees and editorial boards. He was a co-guest-editor of IEEE Transactions on Parallel and Distributed Systems and Journal of Parallel and Distributed Computing. He is the author of the text “Distributed System Design” published by the CRC Press. Dr. Wu was the recipient of the 1996–1997 and 2001–2002 Researcher of the Year Award at Florida Atlantic University. He served as an IEEE Computer Society Distinguished Visitor. Dr. Wu is a member of ACM and a senior member of IEEE.
Cite this article
Wu, J. A simple fault-tolerant adaptive and minimal routing approach in 3-D meshes. J. Comput. Sci. & Technol. 18, 1–13 (2003). https://doi.org/10.1007/BF02946645
DOI: https://doi.org/10.1007/BF02946645