Abstract
Conventional compilers provide limited external control over the optimizations they automatically apply to attain high performance. Consequently, these optimizations have become increasingly ineffective, because compilers have difficulty understanding the higher-level semantics of user applications. This paper presents a framework that gives external users interactive, fine-grained control of compiler optimizations as part of an integrated program development environment. Through a source-level optimization specification language and a Graphical User Interface (GUI), users can interactively select regions of their source code as optimization targets and then explicitly compose and configure how each optimization should be applied to maximize performance. The optimization specifications can then be downloaded and fed into a backend transformation engine, which empirically tunes the optimization configurations on varying architectures. When used to optimize a collection of matrix and stencil kernels, our framework attained average speedups of 1.84X and 3.83X over using icc and gcc alone, respectively.
This research is funded by NSF through awards CCF1261811, CCF1421443, and CCF1261778, and by DOE through award DE-SC0001770.
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Nesterenko, B., Wang, W., Yi, Q. (2016). Interactive Composition of Compiler Optimizations. In: Shen, X., Mueller, F., Tuck, J. (eds) Languages and Compilers for Parallel Computing. LCPC 2015. Lecture Notes in Computer Science, vol 9519. Springer, Cham. https://doi.org/10.1007/978-3-319-29778-1_6
DOI: https://doi.org/10.1007/978-3-319-29778-1_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29777-4
Online ISBN: 978-3-319-29778-1
eBook Packages: Computer Science, Computer Science (R0)