
Studying Performance Changes with Tracking Analysis

  • Conference paper
Tools for High Performance Computing 2014

Abstract

Scientific applications can have so many parameters, usage scenarios and target architectures that a single experiment is often not enough to gain a sound understanding of their performance behavior. Different software and hardware settings may strongly affect the results, but measuring in detail even a few of the possible combinations to decide which configuration is better quickly floods the user with more information than can be compared.

In this chapter we introduce a novel methodology for performance analysis based on object tracking techniques. The most compute-intensive parts of the program are automatically identified via cluster analysis, and the evolution of these regions is then tracked across different experiments to see how the behavior of the program changes with the varying settings and over time. This methodology addresses an important problem in HPC performance analysis: the volume of data that can be collected expands rapidly in a potentially high-dimensional space of performance metrics. Tracking makes this complexity manageable and identifies coarse properties that change when parameters are varied, which can then guide tuning and more detailed performance studies.
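The two steps described above, clustering and then tracking, can be illustrated with a minimal sketch. It is not the chapter's implementation, which the abstract does not detail. It assumes that each computation burst is summarized by two normalized metrics (here hypothetically IPC and instruction count), that a small DBSCAN-style density clustering stands in for the cluster analysis, and that clusters are tracked between two experiments by nearest centroid. All function names, data values and parameters are invented for illustration.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = [j for j in range(len(points))
                     if dist(points[i], points[j]) <= eps]
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = [k for k in range(len(points))
                         if dist(points[j], points[k]) <= eps]
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


def centroids(points, labels):
    """Mean position of every non-noise cluster, keyed by cluster id."""
    groups = {}
    for point, label in zip(points, labels):
        if label != -1:
            groups.setdefault(label, []).append(point)
    return {label: tuple(sum(axis) / len(axis) for axis in zip(*members))
            for label, members in groups.items()}


def track(prev, curr):
    """Match each cluster of one experiment to the nearest cluster of the next."""
    return {a: min(curr, key=lambda b: dist(prev[a], curr[b])) for a in prev}


# Hypothetical bursts as (IPC, normalized instructions) for two experiments.
# In experiment B one region keeps its behavior and the other loses IPC.
exp_a = [(1.00, 0.20), (1.02, 0.21), (0.98, 0.19),
         (0.50, 0.80), (0.52, 0.81), (0.48, 0.79)]
exp_b = [(0.35, 0.80), (0.36, 0.81), (0.34, 0.79),
         (1.00, 0.20), (1.01, 0.21), (0.99, 0.19)]

cent_a = centroids(exp_a, dbscan(exp_a, eps=0.05, min_pts=2))
cent_b = centroids(exp_b, dbscan(exp_b, eps=0.05, min_pts=2))
matches = track(cent_a, cent_b)

for a, b in matches.items():
    delta_ipc = cent_b[b][0] - cent_a[a][0]
    print(f"region A{a} -> B{b}: IPC change {delta_ipc:+.2f}")
```

Nearest-centroid matching is the simplest possible tracker and would likely fail when regions split, merge or move far between experiments. It is used here only to show how per-experiment clusters become comparable objects whose metrics can be followed across runs.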

Author information

Corresponding author

Correspondence to Germán Llort.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Llort, G., Servat, H., Gonzalez, J., Gimenez, J., Labarta, J. (2015). Studying Performance Changes with Tracking Analysis. In: Niethammer, C., Gracia, J., Knüpfer, A., Resch, M., Nagel, W. (eds) Tools for High Performance Computing 2014. Springer, Cham. https://doi.org/10.1007/978-3-319-16012-2_9
