Supercomputing pp 157-171

Modeling the Memory of the Cray2 for Compile Time Optimization

  • C. Eisenbeis
  • W. Jalby
  • A. Lichnewsky
Conference paper
Part of the NATO ASI Series book series (volume 62)


In previous work [3], a cyclic scheduling method was shown to be efficient for generating vector code for the Cray-2 architecture and was compared with existing compilers. That method used the framework of microcode compaction through a simplified model of the Cray-2 vector instruction stream. In this paper, we elaborate further on how to model the machine architecture within the underlying cyclic scheduling method. We analyze the impact of the choice of model on the generated code and present performance results.


Functional Unit, Memory Access, Memory Bandwidth, Strip Mining, Loop Body
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




  [1] Arya, S., Optimal Instruction Scheduling for a Class of Vector Processors: an Integer Programming Approach, Report CRL-TR-19-83, University of Michigan, 1983.
  [2] Cytron, R., Ferrante, J., What’s in a name? The value of renaming for parallelism detection and storage allocation, Proc. ICPP, 1987.
  [3] Eisenbeis, C., Jalby, W., Lichnewsky, A., Squeezing more CPU performance out of a Cray-2 by vector block scheduling, Proc. Supercomputing 88, Kissimmee, Florida, 1988.
  [4] Eisenbeis, C., Jalby, W., Lichnewsky, A., Compile-time optimization of memory and register usage on the CRAY2, INRIA rep., to appear, 1989.
  [5] Fisher, J.A., The optimization of horizontal microcode within and beyond basic blocks: an application of processor scheduling with resources, PhD thesis, New York Univ., 1979.
  [6] Fisher, J.A., Ellis, J.R., Ruttenberg, J.C., Nicolau, A., Parallel processing: a smart compiler and a dumb machine, Proc. SIGPLAN Symp. on Compiler Construction, 1984.
  [7] Kennedy, K., Automatic Vectorization of Fortran Programs to Vector Form, Technical Report, Rice University, Houston, TX, October 1980.
  [8] Kuck, D.J., Kuhn, R., Padua, D., Leasure, B., Wolfe, M., Dependence Graphs and Compiler Optimizations, Proc. 8th ACM Symp. POPL, Williamsburg, VA, 1981.
  [9] Lam, M.S.L., A systolic array optimizing compiler, PhD thesis, Carnegie Mellon University, 1987.
  [10] Lichnewsky, A., Thomasset, F., Techniques de base pour l’exploitation automatique du parallelisme dans les programmes, Rapport de Recherche INRIA, N 460, 1985.
  [11] Lichnewsky, A., Thomasset, F., Eisenbeis, C., Automatic Detection of Parallelism in Scientific Programs with Application to Array-Processors, Proc. of IBM Institute, North Holland, 1986.
  [12] Nicolau, A., Uniform Parallelism Exploitation in Ordinary Programs, Proc. of ICPP, 1985.
  [13] Patel, J.H., Davidson, E.S., Improving the throughput of a pipeline by insertion of delays, Proc. 3rd Ann. Symp. Comp. Arch., 1976.
  [14] Touzeau, R.F., A Fortran Compiler for the FPS-164 Scientific Computer, SIGPLAN Notices, Vol. 19, N 6, 1984.

Copyright information

© Springer-Verlag Berlin Heidelberg 1990

Authors and Affiliations

  • C. Eisenbeis (1)
  • W. Jalby (1)
  • A. Lichnewsky (1)
  1. Domaine de Voluceau, INRIA, Le Chesnay Cedex, France
