Abstract
This paper presents a record-replay mechanism for parallel programs written in a parallel programming model based on remote procedure calls (RPC). The mechanism, which will serve as the basis for a user-level debugger, exploits properties of the programming model to drastically limit the number of recordings required. A formal proof of the equivalence between recorded and replayed executions is given. Systematic measurements of the recording time overhead indicate that it is low enough for recording mode to be used as the normal execution mode. Similar techniques can be applied to other programming models.
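The general idea of record-replay can be illustrated with a minimal sketch. This is not the mechanism of the paper, whose details are not given in the abstract; it is a generic illustration under stated assumptions. The class names `Recorder` and `Replayer` are hypothetical, requests are modelled as `(sender, payload)` pairs, and each sender is assumed to issue its requests in the same order in both runs. Under these assumptions the only nondeterminism is which sender is served next, so only that choice is logged, not the payloads.

```python
from collections import deque


class Recorder:
    """Record mode: log only the nondeterministic choice (the sender served next)."""

    def __init__(self):
        self.log = []

    def deliver(self, arrivals):
        # Serve requests in arrival order and remember the sender of each one.
        served = []
        for sender, payload in arrivals:
            self.log.append(sender)
            served.append((sender, payload))
        return served


class Replayer:
    """Replay mode: force the service order stored in the log."""

    def __init__(self, log):
        self.log = list(log)

    def deliver(self, arrivals):
        # Buffer arrivals per sender (assumption: per-sender order is deterministic).
        pending = {}
        for sender, payload in arrivals:
            pending.setdefault(sender, deque()).append(payload)
        # Serve in the recorded order, whatever the actual arrival order was.
        return [(sender, pending[sender].popleft()) for sender in self.log]


# Recorded run: requests happen to arrive in the order a, b, a.
recorder = Recorder()
recorded = recorder.deliver([("a", 1), ("b", 2), ("a", 3)])

# Replayed run: the network delivers b first, but the log restores the order.
replayer = Replayer(recorder.log)
replayed = replayer.deliver([("b", 2), ("a", 1), ("a", 3)])
```

In this sketch the log holds one sender identifier per request, which shows why exploiting determinism guaranteed by the programming model can keep the recorded information small.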
This work was partially supported by the French Ministry of Research under the inter-PRC project Trace.
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fagot, A., de Kergommeaux, J.C. (1995). Formal and experimental validation of a low overhead execution replay mechanism. In: Haridi, S., Ali, K., Magnusson, P. (eds) EURO-PAR '95 Parallel Processing. Euro-Par 1995. Lecture Notes in Computer Science, vol 966. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0020463
DOI: https://doi.org/10.1007/BFb0020463
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60247-7
Online ISBN: 978-3-540-44769-6
eBook Packages: Springer Book Archive