Monitoring for detecting bugs and blocking communication

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 854)

Abstract

“Writing parallel programs is more difficult than serial programming.” Most papers about parallel processing start this way, or in some similar fashion. Parallel programming in itself is not that difficult, but writing efficient and error-free parallel programs can be a very tedious job.

In this paper we propose a strategy for debugging parallel programs on distributed-memory machines. We follow a method called trace-driven simulation [10, 11, 12], which has proven successful in finding several types of errors in parallel programs. In the course of this work we developed a monitoring strategy as well as tools for visualizing and inspecting recorded program traces. The whole concept follows a philosophy similar to that of UNIX (“keep the tools small and clear”), extended by the “look and feel” concept known from modern graphical user interface design. We are therefore developing handy tools in a modular way; integrating all of these tools yields our debugging environment.

References

  1. D.P. Agrawal, V.K. Janakiram, G.C. Pathak, “Evaluating the performance of multicomputer configurations”, IEEE Computer 19 (7), pp. 23–37, July 1986

  2. W.J. Dally, C.L. Seitz, “Deadlock-Free Message Routing in Multiprocessor Interconnection networks”, IEEE Trans. Computers 36 (5), pp. 547–553, May 1987

  3. A. Erzmann, “Messung des Kommunikationsverhaltens des nCUBE 2-Parallelrechners”, Technical Report, University Hannover, May 1993

  4. C.J. Fidge, “Partial orders for parallel debugging”, Proc. Workshop on Parallel and Distributed Debugging, ACM, pp. 183–194, 1988

  5. G.A. Geist, M.T. Heath, B.W. Peyton, P.H. Worley, “A Users' Guide to PICL: A Portable Instrumented Communication Library”, ORNL/TM-11616, Oak Ridge National Lab, August 1990

  6. S. Grabner, D. Kranzlmüller, “ATEMPT — A Tool for Event Manipulation”, submitted to HICSS-28, Maui, Hawaii, 1995

  7. S. Grabner, J. Volkert, “Debugging Parallel Programs using Event Manipulation”, Proc. 1st Intl. Meeting on Vector and Parallel Processing, Porto, Portugal, Sept 1993

  8. M.T. Heath, J.E. Finger, “ParaGraph: A Tool for Visualizing Performance of Parallel Programs”, Technical Report, Oak Ridge Natl. Lab., Sept. 1993

  9. R. Kolmhofer, “Kommunikation in Parallelrechnern mit verteiltem Speicher”, Master's Thesis, Institute of Computer Science, Johannes Kepler University Linz, May 1993

  10. T.J. LeBlanc, J.M. Mellor-Crummey, “Debugging parallel programs with instant replay”, IEEE Trans. Computers 36 (4), pp. 471–482, April 1987

  11. D.C. Marinescu, J.E. Lumpp Jr., T.L. Casavant, “Models for Monitoring and Debugging Tools for Parallel and Distributed Software”, Journal of Parallel and Distributed Computing 9 (2), pp. 171–184, 1990

  12. C.E. McDowell, D.P. Helmbold, “Debugging Concurrent Programs”, ACM Computing Surveys 21 (4), pp. 593–622, Dec. 1989

  13. nCUBE Corporation, nCUBE 2 Processor Manual Rel. 3.0, 1992

  14. nCUBE Corporation, nCUBE 2 Programmer's Guide Rel. 3.0, 1992

  15. L.M. Ni, P.K. McKinley, “A Survey of Wormhole Routing Techniques in Direct Networks”, IEEE Computer 26 (2), pp. 62–76, Feb. 1993

  16. D.F. Snelling, G.-R. Hoffmann, “A comparative study of libraries for parallel processing”, Proc. of the Intl. Conference on Vector and Parallel Processors, in Computational Science III, Parallel Computing 8 (1–3), pp. 255–266, 1988

Editor information

Bruno Buchberger, Jens Volkert

Copyright information

© 1994 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Grabner, S., Kranzlmüller, D. (1994). Monitoring for detecting bugs and blocking communication. In: Buchberger, B., Volkert, J. (eds) Parallel Processing: CONPAR 94 — VAPP VI. VAPP CONPAR 1994 1994. Lecture Notes in Computer Science, vol 854. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58430-7_7

  • DOI: https://doi.org/10.1007/3-540-58430-7_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-58430-8

  • Online ISBN: 978-3-540-48789-0

  • eBook Packages: Springer Book Archive
