Evolution of a Parallel Performance System

  • Conference paper
Tools for High Performance Computing

Abstract

The TAU Performance System® is an integrated suite of tools for instrumentation, measurement, and analysis of parallel programs targeting large-scale, high-performance computing (HPC) platforms. TAU represents over fifteen calendar years and fifty person-years of research and development effort, and its driving concerns have been portability, flexibility, interoperability, and scalability. The result is a performance system that has evolved into a leading framework for parallel performance evaluation and problem solving. This paper presents the current state of TAU, gives an overview of the design and function of TAU’s main features, discusses best practices for using TAU, and outlines future development.

Author information

Corresponding author

Correspondence to Allen D. Malony.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Malony, A.D. et al. (2008). Evolution of a Parallel Performance System. In: Resch, M., Keller, R., Himmler, V., Krammer, B., Schulz, A. (eds) Tools for High Performance Computing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-68564-7_11

  • DOI: https://doi.org/10.1007/978-3-540-68564-7_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-68561-6

  • Online ISBN: 978-3-540-68564-7

  • eBook Packages: Computer Science, Computer Science (R0)
