Phase-Based Miss Rate Prediction Across Program Inputs

Shen, Xipeng; Zhong, Yutao; Ding, Chen

doi:10.1007/11532378_5

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3602)

Abstract

Previous work shows that the cache miss rate (CMR) of a program can be predicted for all of its inputs. However, most optimization techniques need to know more than the miss rate of the whole program: many benefit from knowing the miss rate of each execution phase for all inputs.

In this paper, we describe a method that divides a program into phases, each with a regular locality pattern. Using a regression model, it predicts the reuse signature and then the cache miss rate of each phase for all inputs. We compare the predictions with actual measurements. The average prediction accuracy is over 98% for a set of floating-point programs. The predicted CMR traces match the simulated ones in spite of dramatic fluctuations of the miss rate over time. This technique can be used to improve dynamic optimization, benchmarking, and compiler design.
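The two prediction steps named in the abstract (regression over inputs to obtain a reuse signature, then a miss rate derived from that signature) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: a signature is represented as a list of bins that each hold an equal share of a phase's accesses and store an average reuse distance, each bin is fitted with a simple linear model in the input size, and the cache is taken to be fully associative with LRU replacement, so an access misses when its reuse distance is at least the cache capacity in blocks. The function names `fit_line`, `predict_signature`, and `miss_rate` are hypothetical.

```python
from typing import List, Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_signature(
    train_sizes: Sequence[float],
    train_signatures: Sequence[Sequence[float]],
    new_size: float,
) -> List[float]:
    """Predict the reuse signature of one phase for an unseen input size.

    train_signatures[i][k] is the average reuse distance of bin k for
    training input i (assumed representation; every bin covers the same
    fraction of the phase's memory accesses).
    """
    num_bins = len(train_signatures[0])
    predicted = []
    for k in range(num_bins):
        ys = [sig[k] for sig in train_signatures]
        a, b = fit_line(train_sizes, ys)
        # Reuse distances cannot be negative, so clamp the extrapolation.
        predicted.append(max(0.0, a + b * new_size))
    return predicted


def miss_rate(signature: Sequence[float], cache_blocks: int) -> float:
    """Fraction of accesses whose reuse distance reaches the cache capacity.

    Assumes a fully associative LRU cache and ignores cold misses.
    """
    misses = sum(1 for distance in signature if distance >= cache_blocks)
    return misses / len(signature)


# Made-up training data for a single phase: two inputs, four bins each.
sizes = [100, 200]
signatures = [[1, 8, 100, 100], [1, 8, 200, 200]]

predicted = predict_signature(sizes, signatures, new_size=400)
print(predicted)                  # predicted reuse distance per bin
print(miss_rate(predicted, 256))  # estimated miss rate for a 256-block cache
```

In this toy data, half of the bins keep a constant reuse distance while the other half grow with the input size, so the estimated miss rate for a fixed cache size changes once the input is large enough. Applying the same two functions to each phase separately would give a per-phase miss-rate trace for a new input.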

Author information

Authors and Affiliations

Computer Science Department, University of Rochester
Xipeng Shen, Yutao Zhong & Chen Ding

Editor information

Editors and Affiliations

School of ECE, Purdue University, 47907, West Lafayette, IN
Rudolf Eigenmann
Department of Computer Science, Purdue University, 47906, West Lafayette, IN, USA
Zhiyuan Li
School of Electrical and Computer Engineering, Purdue University, 47907, West Lafayette, IN, USA
Samuel P. Midkiff

About this paper

Cite this paper

Shen, X., Zhong, Y., Ding, C. (2005). Phase-Based Miss Rate Prediction Across Program Inputs. In: Eigenmann, R., Li, Z., Midkiff, S.P. (eds) Languages and Compilers for High Performance Computing. LCPC 2004. Lecture Notes in Computer Science, vol 3602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11532378_5

DOI: https://doi.org/10.1007/11532378_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28009-5
Online ISBN: 978-3-540-31813-2
eBook Packages: Computer Science, Computer Science (R0)
