Advertisement

On Code Generation for Multi-generator WITH-Loops in SAC

  • Clemens Grelck
  • Dietmar Kreye
  • Sven-Bodo Scholz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1868)

Abstract

Most array operations in Sac are specified in terms of so-called with-loops, a sac-specific form of array comprehension. Due to the map-like semantics of with-loops its loop instances can be computed in any order which provides considerable freedom when it comes to compiling them into nestings of for-loops in C. This paper discusses several different execution orders and their impact on compilation complexity, size of generated code, and execution runtimes. As a result, a multiply parameterized compilation scheme is proposed which achieves speedups of up to a factor of 16 when compared against a naïve compilation scheme.

Keywords

Array Element Iteration Space Iteration Order Cache Line Index Vector 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. 1.
    Bacon, D.F., Graham, S.L., Sharp, O.J.: Compiler Transformations for High-Performance Computing. ACM Computing Surveys 26(4), 345–420 (1994)CrossRefGoogle Scholar
  2. 2.
    Burke, C.: J and APL. Iverson Software Inc., Toronto, Canada (1996)Google Scholar
  3. 3.
    Cann, D.C.: Compilation Techniques for High Performance Applicative Computation. Techn. Report CS-89-108, Lawrence Livermore National Laboratory, Liver-more California (1989)Google Scholar
  4. 4.
    Cann, D.C.: The Optimizing SISAL Compiler: Version 12.0. Lawrence Livermore National Laboratory. Livermore California (1993); Part of the SISAL distributionGoogle Scholar
  5. 5.
    Cann, D.C., Evripidou, P.: Advanced Array Optimizations for High Performance Functional Languages. IEEE Transactions on Parallel and Distributed Systems 6(3), 229–239 (1995)CrossRefGoogle Scholar
  6. 6.
    Cerniak, M.: Optimizing Programs by Data and Control Transformations. PhD thesis, University of Rochester, Rochester, New York (1997)Google Scholar
  7. 7.
    Coleman, S., McKinley, K.: Tile Size Selection Using Cache Organization and Data Layout. In: Proceedings of the ACM SIGPLAN 1995 Conference on Programming Language Design and Implementation, La Jolla, California, pp. 279–290 (1995)Google Scholar
  8. 8.
    Grelck, C.: Shared Memory Multiprocessor Support for SAC. In: Hammond, K., Davie, T., Clack, C. (eds.) IFL 1998. LNCS, vol. 1595, pp. 38–53. Springer, Heidelberg (1999) ISBN 3-540-66229-4Google Scholar
  9. 9.
    Grelck, C., Scholz, S.-B.: Accelerating APL Programs with SAC. In: Lefevre, O. (ed.) Proceedings of the Array Processing Language Conference (APL 1999), Scranton, Pennsylvania. APL Quote Quad, vol. 29(1), pp. 50–57. ACM Press, New York (1999)Google Scholar
  10. 10.
    Jenkins, M.A., Jenkins, W.H.: The Q’Nial Language and Reference Manuals. Nial Systems Ltd., Ottawa (1993)Google Scholar
  11. 11.
    Kreye, D.J.: Zur Generierung von effizient ausführbarem Code aus SAC-spezifischen Schleifenkonstrukten. Diploma thesis, Institut für Informatik und Praktische Mathematik, Universität Kiel (1998)Google Scholar
  12. 12.
    Lam, M.S., Rothberg, E.E., Wolf, M.E.: The Cache Performance of Blocked Algorithms. In: Proceedings of the 4th International Conference on Architectural Support for Programming Languages and Operating Systems, Palo Alto, California, pp. 63–74 (1991)Google Scholar
  13. 13.
    Lewis, E.C., Lin, C., Snyder, L.: The Implementation and Evaluation of Fusion and Contraction in Array Languages. In: Proceedings of the ACM SIGPLAN 1998 Conference on Programming Language Design and Implementation. ACM, New York (1998)Google Scholar
  14. 14.
    Manjikian, N., Abdelrahman, T.S.: Array Data Layout for the Reduction of Cache Conflicts. In: Proceedings of the International Conference on Parallel and Distributed Computing Systems (1995)Google Scholar
  15. 15.
    Maydan, D.E.: Accurate Analysis of Array References. PhD thesis, Stanford University (1992)Google Scholar
  16. 16.
    McKinley, K., Carr, S., Tseng, C.-W.: Improving Data Locality with Loop Transformations. ACM Transactions on Programming Languages and Systems 18(4), 424–453 (1996)CrossRefGoogle Scholar
  17. 17.
    Roth, G., Kennedy, K.: Dependence Analysis of Fortran90 Array Syntax. In: Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (1996)Google Scholar
  18. 18.
    Roth, G., Kennedy, K.: Loop Fusion in High Performance Fortran. CRPC TR98745, Rice University, Houston, Texas (1998)Google Scholar
  19. 19.
    Sarkar, V., Thekkath, R.: A General Framework for Iteration-Reordering Loop Transformations. In: Proceedings of the ACM SIGPLAN 1992 Conference on Programming Language Design and Implementation, San Francisco, California, pp. 175–187. ACM Press, New York (1992)CrossRefGoogle Scholar
  20. 20.
    Scholz, S.-B.: Single Assignment C — Entwurf und Implementierung einer funktionalen C-Variante mit spezieller Unterstätzung shape-invarianter Array-Operationen. PhD thesis, Institut für Informatik und Praktische Mathematik, Universität Kiel (1996) ISBN 3-8265-3138-8Google Scholar
  21. 21.
    Scholz, S.-B.: A Case Study: Effects of With-Loop-Folding on the NAS Benchmark MG in SAC. In: Hammond, K., Davie, T., Clack, C. (eds.) IFL 1998. LNCS, vol. 1595, pp. 216–228. Springer, Heidelberg (1999) ISBN 3-540-66229-4Google Scholar
  22. 22.
    van Groningen, J.: The Implementation and Efficiency of Arrays in Clean 1.1. In: Kluge, W.E. (ed.) IFL 1996. LNCS, vol. 1268, pp. 105–124. Springer, Heidelberg (1997) ISBN 3-540-63237-9Google Scholar
  23. 23.
    Wolf, M.E., Lam, M.S.: A Data Locality Optimizing Algorithm. In: Proceedings of the ACM SIGPLAN 1991 Conference on Programming Language Design and Implementation, pp. 30–44. ACM Press, New York (1991)CrossRefGoogle Scholar
  24. 24.
    Wolfe, M.J.: High-Performance Compilers for Parallel Computing. Addison-Wesley, Reading (1995) ISBN 0-8053-2730-4Google Scholar
  25. 25.
    Zima, H., Chapman, B.: Supercompilers for Parallel and Vector Computers. Addison-Wesley, Reading (1991)Google Scholar

Copyright information

© Springer-Verlag Berlin Heidelberg 2000

Authors and Affiliations

  • Clemens Grelck
    • 1
  • Dietmar Kreye
    • 1
  • Sven-Bodo Scholz
    • 1
  1. 1.Department of Computer Science and Applied MathematicsUniversity of KielKielGermany

Personalised recommendations