Strength Reduction of Integer Division and Modulo Operations
Integer division, modulo, and remainder are expressive and useful operations. They are natural candidates for expressing complex data accesses, such as the wrap-around behavior of queues implemented as ring buffers. In addition, they appear frequently in address computations as a result of compiler optimizations that improve data locality, perform data distribution, or enable parallelization. Experienced application programmers, however, avoid them because they are slow. Furthermore, while advances in both hardware and software have improved the performance of many parts of a program, few of these advances apply to division and modulo operations. This trend makes these operations increasingly detrimental to program performance.
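As an illustration of the ring-buffer case, the following minimal C sketch contrasts a modulo-based index advance with a division-free one. The names `RING_CAPACITY`, `next_index_mod`, and `next_index_branch` are hypothetical and chosen only for this example; the rewrite assumes the index is always kept in the range 0 to `RING_CAPACITY - 1`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring-buffer capacity, chosen only for illustration. */
#define RING_CAPACITY 8u

/* Straightforward form: one modulo operation per advance. */
static size_t next_index_mod(size_t i)
{
    return (i + 1) % RING_CAPACITY;
}

/* Division-free form: a compare-and-reset replaces the modulo.
 * Assumption: the caller keeps i within [0, RING_CAPACITY). */
static size_t next_index_branch(size_t i)
{
    assert(i < RING_CAPACITY);
    size_t next = i + 1;
    return (next == RING_CAPACITY) ? 0 : next;
}

/* Returns 1 if both forms agree on every valid index, 0 otherwise. */
static int ring_forms_agree(void)
{
    for (size_t i = 0; i < RING_CAPACITY; ++i) {
        if (next_index_mod(i) != next_index_branch(i))
            return 0;
    }
    return 1;
}
```

The two functions are interchangeable only under the stated range assumption; for an arbitrary `i` the modulo form remains the general one.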
This paper describes a suite of optimizations for eliminating division, modulo, and remainder operations from programs. These techniques are analogous to strength reduction techniques used for multiplications. In addition to some algebraic simplifications, we present a set of optimization techniques that eliminates division and modulo operations that are functions of loop induction variables and loop constants. The optimizations rely on algebra, integer programming, and loop transformations.
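A minimal C sketch of the general idea, for a quotient and remainder that depend on a loop induction variable and a loop-constant divisor. It is a simplified illustration under stated assumptions (the loop starts at 0, steps by 1, and the divisor `n` is positive), not a reproduction of the paper's transformations; the function names `fill_divmod` and `fill_reduced` are hypothetical.

```c
#include <assert.h>

/* Reference version: one division and one modulo per iteration. */
static void fill_divmod(int *quot, int *rem, int count, int n)
{
    assert(n > 0);
    for (int i = 0; i < count; ++i) {
        quot[i] = i / n;
        rem[i] = i % n;
    }
}

/* Strength-reduced version: q and r track i / n and i % n
 * incrementally, so the loop body needs only an increment,
 * a comparison, and an occasional reset.
 * Assumptions: i starts at 0 and advances by 1; n > 0. */
static void fill_reduced(int *quot, int *rem, int count, int n)
{
    assert(n > 0);
    int q = 0; /* running value of i / n */
    int r = 0; /* running value of i % n */
    for (int i = 0; i < count; ++i) {
        quot[i] = q;
        rem[i] = r;
        if (++r == n) { /* remainder wrapped: bump the quotient */
            r = 0;
            ++q;
        }
    }
}
```

Both functions fill the output arrays with the same values; the second trades the per-iteration division and modulo for cheaper operations, in the same spirit as replacing a multiplication by a running sum.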
Keywords: Strength Reduction, Iteration Space, Division Operation, Address Computation, Loop Transformation