Performance Prediction Methodology for Parallel Programs with MPI in NOW Environments
We present a methodology for parallel programming, together with MPI performance measurement and prediction, for one class of distributed computing environments: networks of workstations. Our approach is based on a two-level model. At the top level, a new parallel version of the timing graph representation makes explicit the communication and code segments of a given parallel program. At the bottom level, analytical models represent the execution behavior of those communication and code segments. Measured execution times, together with the problem size and the number of nodes, are input to the model, which then predicts the performance of similar cluster computing systems with a different number of nodes. The analytical model is validated by experiments on a homogeneous cluster of workstations. Final results show that our approach produces accurate predictions, within 5% of actual results.
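To make the two-level idea concrete, the following is a minimal sketch and not the authors' model. It assumes a purely sequential timing graph (a list of segments), a compute segment whose time scales as n/p, and a tree-style broadcast that takes ceil(log2 p) steps. The names `Segment`, `predict_time`, `compute_model`, `broadcast_model` and all coefficient values are hypothetical; in practice the coefficients would be fitted from measured execution times.

```python
import math
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical coefficients (assumptions for illustration only);
# in practice they would be fitted from measured execution times.
T_ELEM = 2e-8      # seconds of computation per data element
LATENCY = 1e-4     # seconds of start-up cost per message
BYTE_TIME = 1e-8   # seconds per byte transferred


@dataclass
class Segment:
    """One node of a simplified, sequential timing graph."""
    name: str
    kind: str                           # "code" or "comm"
    model: Callable[[int, int], float]  # (problem size n, nodes p) -> seconds


def compute_model(n: int, p: int) -> float:
    """Bottom level, code segment: work is assumed to divide evenly over p nodes."""
    return T_ELEM * n / p


def broadcast_model(n: int, p: int) -> float:
    """Bottom level, communication segment: assumed tree broadcast of n
    8-byte values, taking ceil(log2 p) steps (zero steps on one node)."""
    steps = math.ceil(math.log2(p)) if p > 1 else 0
    return steps * (LATENCY + BYTE_TIME * 8 * n)


def predict_time(graph: List[Segment], n: int, p: int) -> float:
    """Top level: sum the bottom-level estimates of every segment in order."""
    return sum(segment.model(n, p) for segment in graph)


graph = [
    Segment("broadcast input", "comm", broadcast_model),
    Segment("local computation", "code", compute_model),
]

if __name__ == "__main__":
    for nodes in (1, 2, 4, 8):
        print(nodes, predict_time(graph, 1_000_000, nodes))
```

Calling `predict_time(graph, n, p)` with a different `p` mirrors the use described above: the model is calibrated on one configuration and then evaluated for another number of nodes. A real timing graph would also need to express parallel branches, which this sequential list does not.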
Keywords: Execution Time, Parallel Program, Message Passing Interface, Communication Time, Parallel Application