Phase-Based Miss Rate Prediction Across Program Inputs

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3602)

Abstract

Previous work has shown that the cache miss rate (CMR) of a program can be predicted for all of its inputs. However, most optimization techniques need to know more than the miss rate of the whole program. Many of them benefit from knowing the miss rate of each execution phase of a program for all inputs.

In this paper, we describe a method that divides a program into phases, each of which has a regular locality pattern. Using a regression model, it predicts the reuse signature and then the cache miss rate of each phase for all inputs. We compare the predictions with actual measurements. The average prediction accuracy is over 98% for a set of floating-point programs. The predicted CMR traces match the simulated ones in spite of dramatic fluctuations of the miss rate over time. This technique can be used to improve dynamic optimization, benchmarking, and compiler design.
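
As a rough illustration of the pipeline the abstract outlines (reuse signature, regression across inputs, then miss rate), the following Python sketch may help. It is not the authors' implementation. It assumes a fully associative LRU cache, a reuse signature stored as one average reuse distance per equal-population bin, and a simple linear model fitted per bin from two training inputs. All function names and the sample numbers are hypothetical.

```python
def reuse_distances(trace):
    """Return the LRU stack distance of each access (None for a first access).

    The distance is the number of distinct addresses touched since the
    previous access to the same address.
    """
    stack = []  # least recently used first, most recently used last
    distances = []
    for addr in trace:
        if addr in stack:
            idx = stack.index(addr)
            distances.append(len(stack) - 1 - idx)
            stack.pop(idx)
        else:
            distances.append(None)  # cold (compulsory) miss
        stack.append(addr)
    return distances


def miss_rate(distances, cache_blocks):
    """Miss rate of a fully associative LRU cache holding cache_blocks blocks."""
    misses = sum(1 for d in distances if d is None or d >= cache_blocks)
    return misses / len(distances)


def predict_signature(sig1, size1, sig2, size2, new_size):
    """Extrapolate a per-phase reuse signature to a new input size.

    Assumption: each bin's distance follows d(s) = c + e * s, with c and e
    fitted from two training runs of the same phase.
    """
    predicted = []
    for d1, d2 in zip(sig1, sig2):
        e = (d2 - d1) / (size2 - size1)  # slope for this bin
        c = d1 - e * size1               # intercept for this bin
        predicted.append(c + e * new_size)
    return predicted


def miss_rate_from_signature(signature, cache_blocks):
    """Fraction of bins whose predicted distance does not fit in the cache."""
    return sum(1 for d in signature if d >= cache_blocks) / len(signature)


if __name__ == "__main__":
    # Measured distances for a toy trace.
    dists = reuse_distances("abcab")
    print(dists, miss_rate(dists, 3))

    # Hypothetical signatures of one phase at input sizes 1 and 2.
    sig = predict_signature([1, 10, 100], 1, [1, 20, 200], 2, 4)
    print(sig, miss_rate_from_signature(sig, 64))
```

Applying the last two functions to each phase separately would yield one predicted miss rate per phase for an unseen input. How the phases themselves are detected is outside the scope of this sketch.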

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shen, X., Zhong, Y., Ding, C. (2005). Phase-Based Miss Rate Prediction Across Program Inputs. In: Eigenmann, R., Li, Z., Midkiff, S.P. (eds) Languages and Compilers for High Performance Computing. LCPC 2004. Lecture Notes in Computer Science, vol 3602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11532378_5

  • DOI: https://doi.org/10.1007/11532378_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28009-5

  • Online ISBN: 978-3-540-31813-2

  • eBook Packages: Computer Science, Computer Science (R0)