Compositional Approach Applied to Loop Specialization

  • Lamia Djoudi
  • Jean-Thomas Acquaviva
  • Denis Barthou
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4641)


An optimizing compiler has a hard time generating code that performs at top speed for an arbitrary data set size. In general, the low-level optimization process must take into account parameters such as the loop trip count to generate efficient code. The code can be specialized for ranges of data set sizes, at the expense of code expansion and decision-tree overhead.
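As a rough illustration of the trade-off described above, the following C sketch specializes one loop into two versions selected by trip count. It is not the method of the paper: the function names, the unroll factor of 4, and the threshold `SMALL_TRIP_COUNT` are all hypothetical choices made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical threshold: trip counts below it take the short-loop version. */
#define SMALL_TRIP_COUNT 8

/* Version specialized for short loops: no unrolling, minimal setup cost. */
static void saxpy_small(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Version specialized for long loops: unrolled by 4 (an assumed factor),
 * followed by a remainder loop for the last n % 4 iterations. */
static void saxpy_large(size_t n, float a, const float *x, float *y)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }
    for (; i < n; i++)
        y[i] += a * x[i];
}

/* Dispatcher: a one-level decision tree on the trip count.
 * The extra test is the "decision tree overhead"; keeping two loop
 * bodies in the binary is the "code expansion". */
void saxpy(size_t n, float a, const float *x, float *y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    if (n < SMALL_TRIP_COUNT)
        saxpy_small(n, a, x, y);
    else
        saxpy_large(n, a, x, y);
}
```

Both versions compute the same result; only the code shape differs. Adding more size ranges would deepen the decision tree and enlarge the binary further, which is the cost the abstract refers to.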


Keywords: Cache Line; Assembly Code; Cycle Count; Software Pipeline; Spec Benchmark



Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Lamia Djoudi (1)
  • Jean-Thomas Acquaviva (1)
  • Denis Barthou (1)

  1. Université de Versailles, France
