Runtime Biased Pointer Reuse Analysis and Its Application to Energy Efficiency

  • Yao Guo
  • Saurabh Chheda
  • Csaba Andras Moritz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3164)

Abstract

Compiler-enabled memory systems have been successful in reducing chip energy consumption. A major challenge lies in their applicability to complex pointer-intensive programs. State-of-the-art high-precision pointer analysis techniques have limitations when applied to such programs, and therefore see restricted use. This paper describes runtime biased pointer reuse analysis, which captures the behavior of pointers in programs of arbitrary complexity. The proposed technique is runtime biased and speculative in the sense that the possible targets for each pointer access are statically predicted based on the likelihood of their occurrence at runtime, rather than on conservative static analysis alone. This idea, implemented as a flow-sensitive dataflow analysis, enables high precision in capturing pointer behavior, reduces complexity, and extends the approach to arbitrary programs. Besides memory accesses with good reuse/locality, the technique identifies irregular accesses that typically result in energy and performance penalties when managed statically. The approach is validated in the context of a compiler-managed memory system targeting energy efficiency. On a suite of pointer-intensive benchmarks, the techniques increase the fraction of memory accesses that can be mapped statically to energy-efficient memory access paths by 7-72%, giving a 4-31% additional L1 data cache energy reduction.
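The contrast the abstract draws between conservative analysis and likelihood-biased prediction can be illustrated with a toy sketch. This is not the authors' algorithm: the `PointerAccess` record, the likelihood values, the target names, and the 0.9 threshold are all hypothetical, chosen only to show how biasing toward the likely target might raise the fraction of accesses that can be mapped statically while leaving irregular accesses unmapped.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class PointerAccess:
    """Hypothetical record: one pointer access and its candidate targets."""
    name: str
    # candidate target -> assumed likelihood of being the target at runtime
    targets: Dict[str, float]


def conservative_target(access: PointerAccess) -> Optional[str]:
    """Map statically only when exactly one target is possible."""
    if len(access.targets) == 1:
        return next(iter(access.targets))
    return None


def biased_target(access: PointerAccess, threshold: float = 0.9) -> Optional[str]:
    """Predict the most likely target if its likelihood meets the
    (assumed) threshold; otherwise treat the access as irregular."""
    best, likelihood = max(access.targets.items(), key=lambda kv: kv[1])
    return best if likelihood >= threshold else None


def static_fraction(accesses: List[PointerAccess],
                    predict: Callable[[PointerAccess], Optional[str]]) -> float:
    """Fraction of accesses for which `predict` yields a static target."""
    mapped = sum(1 for a in accesses if predict(a) is not None)
    return mapped / len(accesses)


# Illustrative data only; not taken from the paper's benchmarks.
ACCESSES = [
    PointerAccess("p->next", {"node_pool": 0.97, "free_list": 0.03}),
    PointerAccess("q->val", {"array_a": 1.0}),
    PointerAccess("r->data", {"heap_x": 0.5, "heap_y": 0.5}),
    PointerAccess("s->key", {"table": 0.95, "overflow": 0.05}),
]

if __name__ == "__main__":
    print("conservative:", static_fraction(ACCESSES, conservative_target))
    print("biased:      ", static_fraction(ACCESSES, biased_target))
```

In this toy data the conservative rule maps one access in four, the biased rule maps three in four, and `r->data` stays unmapped under both. A speculative prediction can be wrong at runtime, so a real system would need some way to handle mispredicted accesses; that is not modeled here.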

Keywords

  • Pointer Access
  • Memory Access
  • Loop Iteration
  • Cache Line
  • Access Path
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Yao Guo (1)
  • Saurabh Chheda (1)
  • Csaba Andras Moritz (1)
  1. Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, USA
