PSnAP: Accurate Synthetic Address Streams through Memory Profiles

  • Conference paper
Languages and Compilers for Parallel Computing (LCPC 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5898)

Abstract

Memory address traces are an important information source; they drive memory simulations for performance modeling, systems design and application tuning. For long-running applications, the direct use of an address trace is complicated by its size. Previous attempts to reduce trace size incurred a substantial penalty with respect to trace accuracy. We propose a novel method of memory profiling that enables the generation of highly accurate synthetic traces with space requirements typically under 1% of the original traces. We demonstrate the synthetic trace accuracy in terms of cache hit rates, spatial-temporal locality scores and locality surfaces. Simulated cache hit rates from synthetic traces are within 3.5% of the observed rates and, on average, within 1.0% for the L1 cache. Our profiles are on average 60 times smaller than compressed traces. The combination of small profile sizes and high similarity to original traces makes our technique uniquely applicable to performance modeling and trace-driven simulation of large-scale parallel scientific applications.
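
As a rough illustration of the cache hit rate comparison mentioned above, the following minimal sketch replays an address stream through a small set-associative LRU cache and reports the hit rate gap between two streams. It is not the paper's simulator: the function names, the cache geometry defaults and the use of an absolute difference as the error measure are all assumptions made for illustration.

```python
from collections import OrderedDict


def lru_hit_rate(addresses, num_sets=64, ways=2, line_bytes=64):
    """Hit rate of a set-associative LRU cache over an address stream.

    The geometry defaults are hypothetical, not taken from the paper.
    """
    # One ordered dict per set; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        line = addr // line_bytes      # cache line holding this address
        cache_set = sets[line % num_sets]
        if line in cache_set:
            hits += 1
            cache_set.move_to_end(line)  # mark as most recently used
        else:
            if len(cache_set) >= ways:
                cache_set.popitem(last=False)  # evict least recently used
            cache_set[line] = True
    return hits / total if total else 0.0


def hit_rate_error(original, synthetic, **cache):
    """Absolute hit rate difference between two streams (assumed metric)."""
    return abs(lru_hit_rate(original, **cache) - lru_hit_rate(synthetic, **cache))
```

A synthetic stream would be judged close to the original under this measure when `hit_rate_error(original, synthetic)` is small across the cache configurations of interest.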

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Olschanowsky, C.M., Tikir, M.M., Carrington, L., Snavely, A. (2010). PSnAP: Accurate Synthetic Address Streams through Memory Profiles. In: Gao, G.R., Pollock, L.L., Cavazos, J., Li, X. (eds) Languages and Compilers for Parallel Computing. LCPC 2009. Lecture Notes in Computer Science, vol 5898. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13374-9_24

  • DOI: https://doi.org/10.1007/978-3-642-13374-9_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13373-2

  • Online ISBN: 978-3-642-13374-9

  • eBook Packages: Computer Science (R0)
