Abstract
Many applications in parallel processing have to traverse large, implicitly defined trees with irregular shape. The receiver-initiated load balancing algorithm random polling has long been known to be very efficient for these problems in practice. For any \( \varepsilon > 0 \), we prove that its parallel execution time is at most \( (1 + \varepsilon) T_{seq}/P + \mathcal{O}(T_{atomic} + h(\frac{1}{\varepsilon} + T_{rout} + T_{split})) \) with high probability, where \( T_{rout} \), \( T_{split} \) and \( T_{atomic} \) bound the time for sending a message, splitting a subproblem and finishing a small unsplittable subproblem, respectively. The maximum splitting depth \( h \) is related to the depth of the computation tree. Previous work did not prove efficiency close to one and used less accurate models. In particular, our machine model allows asynchronous communication with nonconstant message delays and does not assume that communication takes place in rounds. This model is compatible with the LogP model.
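To give a feel for receiver-initiated random polling, the following is a minimal toy simulation, not the model analyzed in the paper. It makes several simplifying assumptions that the abstract does not state: execution proceeds in synchronous rounds (the paper explicitly avoids this), work is a plain count of unit-cost atomic pieces rather than a tree, a polled processor hands over half of its remaining work, and messages and splits cost nothing. The function name `simulate_random_polling` and all of its parameters are hypothetical.

```python
import random


def simulate_random_polling(total_work, num_pes, seed=0):
    """Toy round-based simulation of random polling (illustrative only).

    Assumptions (not from the paper): synchronous rounds, unit-cost atomic
    work pieces, free messages, and a split that hands over half the load.
    Returns the number of rounds until all work is processed.
    """
    rng = random.Random(seed)
    load = [0] * num_pes
    load[0] = total_work  # all work starts on processor 0
    processed = 0
    rounds = 0

    while processed < total_work:
        rounds += 1

        # Busy processors each finish one atomic piece; the rest are idle.
        idle = []
        for pe in range(num_pes):
            if load[pe] > 0:
                load[pe] -= 1
                processed += 1
            else:
                idle.append(pe)

        # Each idle processor polls one other processor chosen uniformly
        # at random and receives half of that processor's remaining work.
        if num_pes > 1:
            for pe in idle:
                victim = rng.randrange(num_pes - 1)
                if victim >= pe:
                    victim += 1  # skip self
                if load[victim] >= 2:
                    share = load[victim] // 2
                    load[victim] -= share
                    load[pe] += share

    return rounds


if __name__ == "__main__":
    work, pes = 10_000, 16
    rounds = simulate_random_polling(work, pes, seed=1)
    print(f"rounds = {rounds}, ideal = {work / pes:.0f}")
```

In this toy setting the round count can be compared against the ideal `total_work / num_pes`; the gap loosely plays the role of the additive overhead term in the bound above, although the simulation does not model \( T_{rout} \), \( T_{split} \) or asynchronous message delays.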
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sanders, P. (1999). Asynchronous Random Polling Dynamic Load Balancing. In: Algorithms and Computation. ISAAC 1999. Lecture Notes in Computer Science, vol 1741. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46632-0_5
DOI: https://doi.org/10.1007/3-540-46632-0_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66916-6
Online ISBN: 978-3-540-46632-1
eBook Packages: Springer Book Archive