Abstract
Concurrent programs often behave nondeterministically because the execution order of concurrent events is partly arbitrary. This indeterminacy makes it difficult to locate the sources of program errors. We propose a debugging scheme for fine-grain parallel programs on massively parallel processors. The scheme (1) replays a specific execution from a small amount of log information, provided that the intra-node scheduling policy is deterministic and known, and (2) uses scalar timestamps to detect “race” conditions in which message arrival order causes indeterminacy. We evaluate its performance with a prototype debugging system for the concurrent object-oriented language ABCL/f on an AP1000+ multicomputer with 32–1024 nodes.
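The abstract does not spell out the mechanism, so the following is only a minimal sketch of the general idea, not the scheme from the paper. It assumes a Lamport-style scalar clock on each node and a log that records only the order in which messages were delivered. The names `Node`, `arrival_log`, and `replay_order` are hypothetical and chosen for illustration.

```python
class Node:
    """A node with a scalar logical clock that logs message delivery order.

    Hypothetical illustration; not the ABCL/f runtime or the paper's design.
    """

    def __init__(self, name):
        self.name = name
        self.clock = 0
        self.arrival_log = []  # (sender, timestamp) pairs in delivery order

    def send(self, payload):
        # Tick the clock and stamp the outgoing message with it.
        self.clock += 1
        return (self.name, self.clock, payload)

    def receive(self, message):
        sender, timestamp, payload = message
        # Lamport rule: move past both the local clock and the sender's stamp.
        self.clock = max(self.clock, timestamp) + 1
        # Only the small (sender, timestamp) key is logged, not the payload.
        self.arrival_log.append((sender, timestamp))
        return payload


def replay_order(recorded_log, arrived_messages):
    """Return the messages reordered to match a previously recorded log."""
    pending = {(s, t): (s, t, p) for (s, t, p) in arrived_messages}
    return [pending[key] for key in recorded_log]


# Two senders whose messages reach node c in an order that could vary per run.
a, b, c = Node("a"), Node("b"), Node("c")
m1 = a.send("x")
m2 = b.send("y")
c.receive(m2)
c.receive(m1)

# During replay the messages may arrive as [m1, m2]; the log restores [m2, m1].
replayed = replay_order(c.arrival_log, [m1, m2])
```

In this sketch the two messages carry equal scalar timestamps, so the timestamps alone give no reason for one to precede the other. That is the kind of arrival-order indeterminacy the abstract calls a race, though how the paper actually detects it is not stated here.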
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kamada, T., Yonezawa, A. (1996). A debugging scheme for fine-grain threads on massively parallel processors with a small amount of log information —Replay and race detection—. In: Ito, T., Halstead, R.H., Queinnec, C. (eds) Parallel Symbolic Languages and Systems. PSLS 1995. Lecture Notes in Computer Science, vol 1068. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0023057
DOI: https://doi.org/10.1007/BFb0023057
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-61143-1
Online ISBN: 978-3-540-68332-2
eBook Packages: Springer Book Archive