ALIC: A Low Overhead Compiler Optimization Prediction Model
Iterative compilation based on machine learning can automatically predict the best optimizations for new programs. However, effective prediction models often require repeated training, which incurs a high training-time overhead and hinders wider adoption of the technique. Existing approaches to constructing prediction models typically use a random sample search strategy, which easily leads to data redundancy. In addition, to counter run-time noise, each sample program is observed a fixed number of times; when there is very little noise, these repeated observations waste a substantial share of the iterative compilation time. The key problems in reducing iterative compilation overhead are therefore how to collect the most useful samples for the prediction model and how to choose an appropriate number of observations per sample. We propose ALIC, a low-overhead prediction model for iterative compilation optimization parameters. First, we describe the target programs with a static-dynamic feature representation based on feature-class relevance, and construct an initial optimization prediction model with a classifier. Then we use a dynamic number of observations for each sample: the most profitable sample in the candidate set is selected and marked, and each mark is equivalent to increasing the number of observations of that sample. Finally, the optimization prediction model is constructed from the intermediate prediction model, which actively learns from the candidate samples. Experimental results show that, when predicting optimization parameters for new programs on the Intel Xeon E5520 and Chinese Shenwei 26010 platforms, the ALIC model achieves average performance improvements of 1.38× (with the ICC 14.0 compiler) and 1.35× (with the GCC 5.4 compiler) on the Xeon platform, and 1.42× (with the SW compiler) on the Shenwei platform. In addition, the ALIC model significantly reduces the training-time overhead of iterative compilation compared with existing approaches.
Keywords: Iterative compilation · Time overheads · Active learning · Dynamic observations
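The dynamic-observation idea in the abstract can be illustrated with a small, self-contained sketch. This is not the ALIC implementation: the candidate names, runtimes, and noise levels are invented for illustration, and the "profit" of a candidate is assumed here to be the standard error of its mean runtime, a stand-in for whatever criterion the model actually uses. Each selection ("mark") is assumed to add exactly one timing run, so noisy candidates end up with more observations than quiet ones instead of every candidate receiving the same fixed count.

```python
import random
import statistics


def measure(true_runtime, noise, rng):
    """Simulate one noisy timing run of a compiled program variant."""
    return true_runtime + rng.gauss(0.0, noise)


def standard_error(observations):
    """Uncertainty of the mean runtime; infinite until two runs exist."""
    if len(observations) < 2:
        return float("inf")
    return statistics.stdev(observations) / len(observations) ** 0.5


def dynamic_observations(candidates, budget, rng):
    """Spend `budget` timing runs one at a time, each on the candidate whose
    runtime estimate is currently least certain (assumed profit score)."""
    runs = {name: [] for name in candidates}
    for _ in range(budget):
        # Select ("mark") the most profitable candidate; each mark adds one run.
        name = max(runs, key=lambda n: standard_error(runs[n]))
        true_runtime, noise = candidates[name]
        runs[name].append(measure(true_runtime, noise, rng))
    return runs


# Hypothetical optimization settings: (true runtime in seconds, noise std dev).
candidates = {
    "O2": (10.0, 0.01),
    "O3": (9.0, 0.01),
    "O3-unroll": (8.5, 1.0),
}
runs = dynamic_observations(candidates, budget=30, rng=random.Random(0))
counts = {name: len(obs) for name, obs in runs.items()}
print(counts)
```

With a fixed-count strategy, the 30 runs would be split evenly at 10 per candidate. In this sketch the two low-noise settings need only a few runs before their estimates stabilise, and most of the remaining budget goes to the noisy one.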
This work was funded by the National Key Research and Development Program, “High-Performance Computing” key special project (2016YFB0200503).