STAPL: An Adaptive, Generic Parallel C++ Library

  • Ping An
  • Alin Jula
  • Silvius Rus
  • Steven Saunders
  • Tim Smith
  • Gabriel Tanase
  • Nathan Thomas
  • Nancy Amato
  • Lawrence Rauchwerger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2624)

Abstract

The Standard Template Adaptive Parallel Library (STAPL) is a parallel library designed as a superset of the ANSI C++ Standard Template Library (STL). It is sequentially consistent for functions with the same name, and executes on uniprocessor or multiprocessor systems that use shared or distributed memory. STAPL is implemented using simple parallel extensions of C++ that currently provide an SPMD model of parallelism, and it supports nested parallelism. The library is intended to be general purpose, but emphasizes irregular programs to allow the exploitation of parallelism for applications that use dynamically linked data structures, such as particle transport calculations, molecular dynamics, geometric modeling, and graph algorithms. STAPL provides several different algorithms for some library routines, and selects among them adaptively at runtime. STAPL can replace STL automatically by invoking a preprocessing translation phase. In the applications studied, the performance of translated code was within 5% of the results obtained using STAPL directly. STAPL also provides functionality to allow the user to further optimize the code and achieve additional performance gains. We present results obtained using STAPL for a molecular dynamics code and a particle transport code.
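The idea of providing several algorithms for one routine and choosing among them at runtime can be illustrated with a minimal sequential sketch. This is not STAPL's API or its selection policy: the names `SortChoice`, `disorder`, `choose_sort`, `insertion_sort`, and `adaptive_sort`, the size threshold of 32, and the 5% disorder cutoff are all assumptions made for illustration, and the heuristic is deliberately crude.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; this is an illustration, not STAPL code.
enum class SortChoice { Insertion, StdSort };

// Fraction of adjacent pairs that are out of order (0.0 for sorted input).
// A cheap, approximate measure of how much work a sort has to do.
double disorder(const std::vector<int>& v) {
    if (v.size() < 2) return 0.0;
    std::size_t descents = 0;
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i] < v[i - 1]) ++descents;
    }
    return static_cast<double>(descents) / static_cast<double>(v.size() - 1);
}

// Runtime selection: inspect the input, then pick one of the candidates.
// The thresholds are assumed values, not measured ones.
SortChoice choose_sort(const std::vector<int>& v) {
    if (v.size() <= 32 || disorder(v) < 0.05) return SortChoice::Insertion;
    return SortChoice::StdSort;
}

// Candidate 1: insertion sort, cheap on tiny or almost-sorted inputs.
void insertion_sort(std::vector<int>& v) {
    for (std::size_t i = 1; i < v.size(); ++i) {
        const int key = v[i];
        std::size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            --j;
        }
        v[j] = key;
    }
}

// Same interface for every input; the algorithm behind it varies.
// Returns the choice made so callers can observe the selection.
SortChoice adaptive_sort(std::vector<int>& v) {
    const SortChoice choice = choose_sort(v);
    if (choice == SortChoice::Insertion) {
        insertion_sort(v);
    } else {
        std::sort(v.begin(), v.end());  // Candidate 2: the standard library sort.
    }
    return choice;
}
```

Calling `adaptive_sort(v)` on a three-element vector takes the insertion-sort path, while a 1000-element vector in reverse order takes the `std::sort` path. A parallel library would presumably base the choice on more than input properties, for example on machine characteristics, but the dispatch structure is the same.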

Keywords

Work Function; Automatic Translation; Data Dependence Graph; Standard Template Library; Execution Schedule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Ping An (1)
  • Alin Jula (1)
  • Silvius Rus (1)
  • Steven Saunders (1)
  • Tim Smith (1)
  • Gabriel Tanase (1)
  • Nathan Thomas (1)
  • Nancy Amato (1)
  • Lawrence Rauchwerger (1)
  1. Dept. of Computer Science, Texas A&M University, College Station
