A Loop-Aware Search Strategy for Automated Performance Analysis

  • Conference paper
High Performance Computing and Communications (HPCC 2005)

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 3726)

Abstract

Automated online search is a powerful technique for performance diagnosis. Such a search can change the types of experiments it performs while the program is running, making decisions based on live performance data. Previous research has addressed search speed and scaling searches to large codes and many nodes. This paper explores using a finer granularity for the bottlenecks that we locate in an automated online search, i.e., refining the search to bottlenecks localized to loops. The ability to insert and remove instrumentation on-the-fly means an online search can utilize fine-grain program structure in ways that are infeasible using other performance diagnosis techniques. We automatically detect loops in a program’s binary control flow graph and use this information to efficiently instrument loops. We implemented our new strategy in an existing automated online performance tool, Paradyn. Results for several sequential and parallel applications show that a loop-aware search strategy can increase bottleneck precision without compromising search time or cost.
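The abstract does not say how loops are detected in the binary control flow graph. As an illustration only, the sketch below uses one standard technique, natural-loop detection via dominators and back edges. The function names (`dominators`, `natural_loops`), the dictionary representation of the graph, and the example graph are all assumptions made for this sketch; they are not taken from the paper or from Paradyn.

```python
from collections import defaultdict


def dominators(cfg, entry):
    """Compute dominator sets with the iterative data-flow method.

    cfg maps each basic block to a list of its successors. This sketch
    assumes every block appears as a key, even if it has no successors.
    """
    nodes = set(cfg)
    preds = defaultdict(set)
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)

    # Start from "every block dominates every block" and shrink to a fixed point.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = set.intersection(*pred_doms) if pred_doms else set()
            new = new | {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds


def natural_loops(cfg, entry):
    """Return {loop header: set of blocks in the loop body}."""
    dom, preds = dominators(cfg, entry)
    loops = defaultdict(set)
    for tail, succs in cfg.items():
        for head in succs:
            if head not in dom[tail]:
                continue  # not a back edge
            # Back edge tail -> head: collect blocks that reach tail
            # without passing through head.
            body = {head, tail}
            stack = [tail] if tail != head else []
            while stack:
                n = stack.pop()
                for p in preds[n]:
                    if p not in body:
                        body.add(p)
                        stack.append(p)
            loops[head] |= body  # merge loops that share a header
    return dict(loops)


# Hypothetical CFG: an outer loop headed by A containing an inner loop headed by B.
example_cfg = {
    "entry": ["A"],
    "A": ["B", "exit"],
    "B": ["C"],
    "C": ["B", "D"],
    "D": ["A"],
    "exit": [],
}
loops = natural_loops(example_cfg, "entry")
```

For this example graph, `loops` maps header `B` to the inner loop body `{B, C}` and header `A` to the outer loop body `{A, B, C, D}`. A tool that knows these block sets could place timing instrumentation at a loop's entry and exit edges rather than around the whole enclosing function, which is the kind of finer-grained localization the abstract describes.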

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Collins, E.D., Miller, B.P. (2005). A Loop-Aware Search Strategy for Automated Performance Analysis. In: Yang, L.T., Rana, O.F., Di Martino, B., Dongarra, J. (eds) High Performance Computing and Communications. HPCC 2005. Lecture Notes in Computer Science, vol 3726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11557654_68

  • DOI: https://doi.org/10.1007/11557654_68

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29031-5

  • Online ISBN: 978-3-540-32079-1

  • eBook Packages: Computer Science, Computer Science (R0)
