Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5898)

Abstract

Although automated empirical performance optimization and tuning is well studied for kernels and domain-specific libraries, a current research grand challenge is how to extend these methodologies and tools to significantly larger sequential and parallel applications. In this context, we present the ROSE source-to-source outliner, which addresses the problem of extracting tunable kernels out of whole programs, thereby helping to convert the challenging whole-program tuning problem into a set of more manageable kernel tuning tasks. Our outliner aims to handle large-scale C/C++, Fortran, and OpenMP applications. A set of program analysis and transformation techniques is used to enhance the portability, scalability, and interoperability of source-to-source outlining. More importantly, the generated kernels preserve the performance characteristics of the tuning targets and can be easily handled by other tools. Preliminary evaluations have shown that the ROSE outliner serves as a key component within an end-to-end empirical optimization system and enables a wide range of sequential and parallel optimization opportunities.
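As a rough illustration of what outlining means in this setting, the sketch below moves a loop out of its enclosing routine into a separate function and replaces it with a call. It is a hand-written example, not output of the ROSE outliner: the function names (`smooth_original`, `outlined_kernel_1`, `smooth_outlined`), the stencil computation, and the choice to pass every variable as a plain parameter are all assumptions made for illustration. The abstract does not describe how the actual tool names kernels or passes variables.

```c
#include <assert.h>
#include <stddef.h>

/* Before outlining: the tuning target (the loop) is embedded in a larger
 * routine. This routine is hypothetical and stands in for application code. */
void smooth_original(double *out, const double *in, size_t n, double w)
{
    /* ... surrounding application code would sit here ... */
    for (size_t i = 1; i + 1 < n; ++i)
        out[i] = w * in[i] + (1.0 - w) * 0.5 * (in[i - 1] + in[i + 1]);
}

/* After outlining: the loop body lives in its own function, so a tuning
 * tool could compile, transform, and time it in isolation.
 * The name and the parameter list are assumptions for this sketch. */
void outlined_kernel_1(double *out, const double *in, size_t n, double w)
{
    for (size_t i = 1; i + 1 < n; ++i)
        out[i] = w * in[i] + (1.0 - w) * 0.5 * (in[i - 1] + in[i + 1]);
}

/* The original routine now only calls the outlined kernel. */
void smooth_outlined(double *out, const double *in, size_t n, double w)
{
    /* ... surrounding application code would sit here ... */
    outlined_kernel_1(out, in, n, w);
}

/* Small check that the outlined version computes the same values. */
void check_outlining_preserves_results(void)
{
    const double in[6] = {1.0, 4.0, 2.0, 8.0, 5.0, 7.0};
    double a[6] = {0.0};
    double b[6] = {0.0};

    smooth_original(a, in, 6, 0.25);
    smooth_outlined(b, in, 6, 0.25);

    for (size_t i = 0; i < 6; ++i)
        assert(a[i] == b[i]);
}
```

Once the kernel is a standalone function, each candidate optimization can be applied to `outlined_kernel_1` alone while the rest of the program stays untouched, which is the sense in which whole-program tuning reduces to a set of kernel tuning tasks.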

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Liao, C., Quinlan, D.J., Vuduc, R., Panas, T. (2010). Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization. In: Gao, G.R., Pollock, L.L., Cavazos, J., Li, X. (eds) Languages and Compilers for Parallel Computing. LCPC 2009. Lecture Notes in Computer Science, vol 5898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13374-9_21

  • DOI: https://doi.org/10.1007/978-3-642-13374-9_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13373-2

  • Online ISBN: 978-3-642-13374-9

  • eBook Packages: Computer Science, Computer Science (R0)
