Compile-Time Based Performance Prediction

  • Calin Cascaval
  • Luiz DeRose
  • David A. Padua
  • Daniel A. Reed
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1863)


In this paper we present results obtained using a compiler to predict the performance of scientific codes. The compiler, Polaris [3], is both the primary tool for estimating the performance of a range of codes and the beneficiary of the results obtained from predicting program behavior at compile time. We show that a simple compile-time model, augmented with profiling data obtained using very light instrumentation, can predict performance to within 20% of the measured value, on average, for codes using both dense and sparse computational methods.


Keywords (added by machine, not by the authors): Execution Time; Performance Prediction; Cache Line; Memory Hierarchy; Symbolic Expression




References

  1. T. Ball and J. R. Larus. Branch prediction for free. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation '93, pages 300–313, 1993.
  2. U. Banerjee. Dependence Analysis. Kluwer Academic Publishers, 1997.
  3. W. Blume, R. Doallo, R. Eigenmann, J. Grout, J. Hoeflinger, T. Lawrence, J. Lee, D. Padua, Y. Paek, W. Pottenger, L. Rauchwerger, and P. Tu. Parallel Programming with Polaris. IEEE Computer, December 1996.
  4. R. Bramley, D. Gannon, T. Stuckey, J. Villacis, J. Balasubramanian, E. Akman, F. Breg, S. Diwan, and M. Govindaraju. The Linear System Analyzer, chapter PSEs. IEEE, 1998.
  5. C. Cascaval and D. A. Padua. Compile-time cache misses estimation using stack distances. In preparation.
  6. P. P. Chang, S. A. Mahlke, and W.-M. W. Hwu. Using profile information to assist classic compiler code optimizations. Software Practice and Experience, 21(12):1301–1321, December 1991.
  7. R. P. Colwell, R. P. Nix, J. J. O'Donnell, D. B. Papworth, and P. K. Rodman. A VLIW architecture for a trace scheduling compiler. In Proceedings of ASPLOS II, pages 180–192, Palo Alto, CA, October 1987.
  8. L. DeRose, Y. Zhang, and D. A. Reed. SvPablo: A multi-language performance analysis system. In 10th International Conference on Computer Performance Evaluation - Modelling Techniques and Tools - Performance Tools '98, pages 352–355, Palma de Mallorca, Spain, September 1998.
  9. T. Fahringer. Evaluation of benchmark performance estimation for parallel Fortran programs on massively parallel SIMD and MIMD computers. In IEEE Proceedings of the 2nd Euromicro Workshop on Parallel and Distributed Processing, Malaga, Spain, January 1994.
  10. T. Fahringer. Automatic Performance Prediction of Parallel Programs. Kluwer Academic Press, 1996.
  11. T. Fahringer. Estimating cache performance for sequential and data parallel programs. Technical Report TR 97-9, Institute for Software Technology and Parallel Systems, Univ. of Vienna, Vienna, Austria, October 1997.
  12. J. A. Fisher. Trace scheduling: A technique for global microcode compaction. IEEE Transactions on Computers, C-30(7):478–490, July 1981.
  13. J. D. Gee, M. D. Hill, and A. J. Smith. Cache performance of the SPEC92 benchmark suite. In Proceedings of the IEEE Micro, pages 17–27, August 1993.
  14. S. Ghosh, M. Martonosi, and S. Malik. Precise miss analysis for program transformations with caches of arbitrary associativity. In Proceedings of ASPLOS VIII, San Jose, CA, October 1998.
  15. M. D. Hill and A. J. Smith. Evaluating associativity in CPU caches. IEEE Transactions on Computers, 38(12):1612–1630, December 1989.
  16. Y. Kang, M. Huang, S.-M. Yoo, Z. Ge, D. Keen, V. Lam, P. Pattnaik, and J. Torrellas. FlexRAM: Toward an advanced intelligent memory system. In International Conference on Computer Design (ICCD), October 1999.
  17. R. L. Mattson, J. Gecsei, D. Slutz, and I. Traiger. Evaluation techniques for storage hierarchies. IBM Systems Journal, 9(2), 1970.
  18. J. Reilly. SPEC95 Products and Benchmarks. SPEC Newsletter, September 1995.
  19. R. Saavedra and A. Smith. Measuring cache and TLB performance and their effect on benchmark run times. IEEE Transactions on Computers, 44(10):1223–1235, October 1995.
  20. R. H. Saavedra-Barrera and A. J. Smith. Analysis of benchmark characteristics and benchmark performance prediction. Technical Report CSD 92-715, Computer Science Division, UC Berkeley, 1992.
  21. R. H. Saavedra-Barrera, A. J. Smith, and E. Miya. Machine characterization based on an abstract high-level language machine. IEEE Transactions on Computers, 38(12):1659–1679, December 1989.
  22. V. Sarkar. Determining average program execution times and their variance. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation '89, pages 298–312, Portland, Oregon, July 1989.
  23. R. A. Sugumar and S. G. Abraham. Set-associative cache simulation using generalized binomial trees. ACM Transactions on Computer Systems, 13(1), 1995.
  24. J. G. Thompson and A. J. Smith. Efficient (stack) algorithms for analysis of write-back and sector memories. ACM Transactions on Computer Systems, 7(1), 1989.
  25. W.-H. Wang and J.-L. Baer. Efficient trace-driven simulation methods for cache performance analysis. ACM Transactions on Computer Systems, 9(3), 1991.

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Calin Cascaval (1)
  • Luiz DeRose (1)
  • David A. Padua (1)
  • Daniel A. Reed (1)
  1. Department of Computer Science, University of Illinois at Urbana-Champaign, USA
