Abstract
We present an automatic approach for prefetching data for linked list data structures. The approach rests on the observation that linked list elements are frequently allocated at a constant distance from one another in the heap. When such a list is traversed, a regular pattern of memory accesses with constant stride emerges. This regularity in the memory footprint of linked lists enables a prefetching framework in which the address of the element accessed in a future iteration of the loop is predicted dynamically from the stride observed in previous iterations.
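The stride-based prediction described above can be illustrated with a minimal hand-written sketch. It is not the compiler transformation from the paper: the `node` type, the function names, and the `lookahead` parameter are hypothetical, and `__builtin_prefetch` is the GCC/Clang builtin, used here as a stand-in for whatever prefetch instruction the target provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node, for illustration only. */
struct node {
    struct node *next;
    long value;
};

/* Predict the address of the node `lookahead` iterations ahead, assuming the
 * stride between the previous and the current node stays constant.
 * Unsigned arithmetic keeps the computation well defined for negative strides. */
uintptr_t predict_address(const struct node *prev, const struct node *cur,
                          unsigned lookahead)
{
    uintptr_t stride = (uintptr_t)cur - (uintptr_t)prev;
    return (uintptr_t)cur + stride * lookahead;
}

/* Sum a list while speculatively prefetching the predicted future node. */
long sum_list(const struct node *head, unsigned lookahead)
{
    assert(lookahead > 0);
    long sum = 0;
    const struct node *prev = NULL;
    for (const struct node *p = head; p != NULL; p = p->next) {
        if (prev != NULL) {
            /* A prefetch is only a hint: a wrong prediction wastes
             * bandwidth but does not fault or change the result. */
            __builtin_prefetch((const void *)predict_address(prev, p, lookahead),
                               0 /* read */, 1 /* low temporal locality */);
        }
        sum += p->value;
        prev = p;
    }
    return sum;
}
```

When the nodes really are laid out at a constant distance, the predicted address is exactly the node reached `lookahead` iterations later. When they are not, the loop still computes the same sum, which is why the prefetch can be issued speculatively.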
We automatically identify pointer-chasing recurrences in loops that access linked lists. This identification uses a surprisingly simple method that looks for induction pointers, that is, pointers that are updated in each loop iteration by a load with a constant offset. We integrate induction pointer prefetching with loop scheduling. A key design choice in our framework is to insert prefetches only when processor resources and memory bandwidth are available. To estimate the available memory bandwidth, we calculate the number of potential cache misses in one loop iteration. Our estimation algorithm applies graph coloring to a memory access interference graph derived from the control flow graph. We implemented the prefetching framework in an industry-strength production compiler and performed experiments on ten benchmark programs with linked lists. We observed performance improvements between 15% and 35% in three of them.
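As a rough illustration of the induction pointer definition above, the following sketch checks a toy loop body for a register that is updated by a load from itself plus a constant offset. The instruction encoding, the register numbering, and the function name are all assumptions made for this example. The sketch also assumes a straight-line loop body and recognizes only a direct self-update, so it should not be read as the identification algorithm used in the compiler.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical three-address instruction forms for a toy loop body. */
enum opcode { OP_LOAD, OP_ADD, OP_OTHER };

struct instr {
    enum opcode op;
    int dst;     /* destination register */
    int src;     /* base register (OP_LOAD) or source operand */
    long offset; /* constant offset, meaningful for OP_LOAD only */
};

/* A register counts as an induction pointer here if the loop body writes it
 * exactly once, and that write has the form  reg = load [reg + constant]. */
bool is_induction_pointer(const struct instr *body, size_t n, int reg)
{
    size_t defs = 0;
    bool self_load = false;
    for (size_t i = 0; i < n; i++) {
        if (body[i].dst != reg)
            continue;
        defs++;
        if (body[i].op == OP_LOAD && body[i].src == reg)
            self_load = true;
    }
    return defs == 1 && self_load;
}
```

For a loop such as `for (; p; p = p->next) sum += p->value;`, the register holding `p` satisfies this test, because its only update is a load from `p` at the fixed offset of the `next` field. The registers holding `p->value` and `sum` do not.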
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Stoutchinin, A., Amaral, J.N., Gao, G.R., Dehnert, J.C., Jain, S., Douillet, A. (2001). Speculative Prefetching of Induction Pointers. In: Wilhelm, R. (eds) Compiler Construction. CC 2001. Lecture Notes in Computer Science, vol 2027. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45306-7_20
DOI: https://doi.org/10.1007/3-540-45306-7_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41861-0
Online ISBN: 978-3-540-45306-2
eBook Packages: Springer Book Archive