Abstract

Parallel programs are far more difficult to develop, debug, maintain, and understand than their sequential counterparts. One reason is the difficulty of establishing correctness, which must take temporal conditions into account: liveness, deadlock-freeness, and process synchronization and communication. This activity is often called correctness debugging. Another reason is the diversity of parallel architectures and the need to produce a highly efficient program finely tuned to the specific target architecture. The impact of task granularity on a parallel algorithm, the properties of the memory hierarchy, and the intricacies of exploiting multilevel parallelism should all be carefully analyzed and used to devise a transformation strategy for the program. Adapting an initially inefficient algorithm to specific hardware is often called performance debugging, a term suggesting that the correctness criteria for a parallel algorithm should include requirements on its performance on a given architecture. An inefficient but otherwise correct program is of practically no use on a parallel system.
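The idea that performance requirements belong among a parallel program's correctness criteria can be made concrete as a "performance assertion": a check that fails when measured efficiency falls below an acceptable level for the target architecture. The sketch below is a minimal illustration of that notion; the timings, processor count, and threshold are all hypothetical, not taken from this chapter.

```python
# Performance debugging treats efficiency targets as correctness criteria.
# Minimal sketch of a performance assertion; all numbers are hypothetical.

def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of sequential to parallel execution time."""
    return t_serial / t_parallel

def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """Speedup normalized by the number of processors p."""
    return speedup(t_serial, t_parallel) / p

# Hypothetical measurements: 10.0 s sequentially, 1.6 s on 8 processors.
t_seq, t_par, p = 10.0, 1.6, 8
print(round(speedup(t_seq, t_par), 3))        # 6.25
print(round(efficiency(t_seq, t_par, p), 3))  # 0.781

# The program fails its performance criterion if efficiency drops below
# the threshold chosen for the target architecture.
assert efficiency(t_seq, t_par, p) >= 0.75
```

In this view an inefficient run is a bug of the same standing as a deadlock: the assertion fires, and the developer is pointed toward granularity, memory-hierarchy, or parallelism-exploitation problems to analyze.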




Copyright information

© 1999 Springer Science+Business Media New York

Cite this chapter

Wu, X. (1999). Parallel Performance Debugging. In: Performance Evaluation, Prediction and Visualization of Parallel Systems. The Kluwer International Series on Asian Studies in Computer and Information Science, vol 4. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5147-8_7

  • DOI: https://doi.org/10.1007/978-1-4615-5147-8_7

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7343-8

  • Online ISBN: 978-1-4615-5147-8

