Abstract

Parallel programs are far more difficult to develop, debug, maintain, and understand than their sequential counterparts. One reason is the difficulty of establishing correctness, which must take temporal conditions into account: liveness, deadlock-freeness, and process synchronization and communication. This activity is often called correctness debugging. Another reason is the diversity of parallel architectures and the need to produce a highly efficient program finely tuned to the specific target architecture. The impact of task granularity on a parallel algorithm, the properties of the memory hierarchy, and the intricacies of exploiting multilevel parallelism should all be carefully analyzed and used to devise a transformation strategy for the program. Adapting an initially inefficient algorithm to specific hardware is often called performance debugging, a term suggesting that the correctness criteria for a parallel algorithm should include requirements on its performance on a given architecture. An inefficient but otherwise correct program is of practically no use on a parallel system.
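The idea that performance requirements belong among a parallel program's correctness criteria can be made concrete as a "performance assertion": a check that fails when measured efficiency falls below an acceptable level for the target architecture. The sketch below is a minimal illustration of that notion; the timings, processor count, and threshold are all hypothetical, not taken from this chapter.

```python
# Performance debugging treats efficiency targets as correctness criteria.
# Minimal sketch of a performance assertion; all numbers are hypothetical.

def speedup(t_serial: float, t_parallel: float) -> float:
    """Ratio of sequential to parallel execution time."""
    return t_serial / t_parallel

def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """Speedup normalized by the number of processors p."""
    return speedup(t_serial, t_parallel) / p

# Hypothetical measurements: 10.0 s sequentially, 1.6 s on 8 processors.
t_seq, t_par, p = 10.0, 1.6, 8
print(round(speedup(t_seq, t_par), 3))        # 6.25
print(round(efficiency(t_seq, t_par, p), 3))  # 0.781

# The program fails its performance criterion if efficiency drops below
# the threshold chosen for the target architecture.
assert efficiency(t_seq, t_par, p) >= 0.75
```

In this view an inefficient run is a bug of the same standing as a deadlock: the assertion fires, and the developer is pointed toward granularity, memory-hierarchy, or parallelism-exploitation problems to analyze.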




Copyright information

© 1999 Springer Science+Business Media New York

Cite this chapter

Wu, X. (1999). Parallel Performance Debugging. In: Performance Evaluation, Prediction and Visualization of Parallel Systems. The Kluwer International Series on Asian Studies in Computer and Information Science, vol 4. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-5147-8_7

  • DOI: https://doi.org/10.1007/978-1-4615-5147-8_7

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-7343-8

  • Online ISBN: 978-1-4615-5147-8

