Improving Off-Chip Memory Energy Behavior in a Multi-processor, Multi-bank Environment
Many embedded and portable applications in the image and video processing domains spend a large fraction of their energy executing load/store instructions that access off-chip memory. Although most performance-oriented locality optimization techniques reduce the number of memory instructions and, consequently, memory energy consumption, energy-oriented approaches are also needed if we are to improve energy behavior further.
Our focus in this paper is on a system with multiple homogeneous processors and a multi-bank memory architecture that processes large arrays of signals. To reduce energy consumption in such a system, we use a compiler-based approach that exploits low-power operating modes. A major problem in such an architecture is reconciling the conflicting requirements of maximizing parallelism and reducing energy consumption. The conflict arises because maximizing parallelism requires independent concurrent accesses to different memory banks, whereas reducing energy consumption means limiting the accesses in any given period of time to a small set of memory banks, so that the remaining banks can be placed into a low-power operating mode. Our approach consists of three complementary steps: parallel access pattern detection, array allocation across memory banks, and data layout transformations. Our preliminary results indicate that our approach leads to significant off-chip memory energy savings without sacrificing the available parallelism.
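The tension between parallelism and energy described above can be illustrated with a toy model. The sketch below is not the paper's algorithm: the cost constants, the wake-up penalty, the `evaluate` function, and the four-array schedule are all hypothetical and chosen only to make the trade-off visible. It scores an array-to-bank placement by total energy and by the number of same-step accesses that collide on one bank (and would therefore serialize).

```python
from typing import Dict, List, Set, Tuple

# Hypothetical per-step costs in arbitrary units (assumptions, not from the paper).
ACTIVE_COST = 10.0     # a bank that serves at least one access in a step
LOW_POWER_COST = 1.0   # a bank left in a low-power mode for a step
WAKE_COST = 5.0        # penalty when a bank leaves low-power mode


def evaluate(schedule: List[Set[str]],
             placement: Dict[str, int],
             num_banks: int) -> Tuple[float, int]:
    """Return (total_energy, bank_conflicts) for one array-to-bank placement.

    schedule[i] is the set of arrays accessed concurrently in step i.
    A conflict is counted whenever two concurrent accesses hit the same bank.
    """
    energy = 0.0
    conflicts = 0
    previously_active: Set[int] = set()  # all banks start in low-power mode
    for step in schedule:
        banks = [placement[name] for name in sorted(step)]
        active = set(banks)
        energy += len(active) * ACTIVE_COST
        energy += (num_banks - len(active)) * LOW_POWER_COST
        energy += len(active - previously_active) * WAKE_COST
        conflicts += len(banks) - len(active)
        previously_active = active
    return energy, conflicts


# Two processors alternate between array pairs (A, B) and (C, D).
schedule = [{"A", "B"}, {"C", "D"}, {"A", "B"}, {"C", "D"}]

spread = {"A": 0, "B": 1, "C": 2, "D": 3}     # one array per bank
clustered = {"A": 0, "B": 1, "C": 0, "D": 1}  # concurrent arrays on distinct banks
packed = {"A": 0, "B": 0, "C": 1, "D": 1}     # concurrent arrays share a bank

for name, placement in [("spread", spread), ("clustered", clustered), ("packed", packed)]:
    print(name, evaluate(schedule, placement, num_banks=4))
```

Under these assumed costs, the `clustered` placement keeps every concurrent access on a distinct bank (no conflicts) while using less energy than `spread`, because two banks stay in low-power mode throughout. The `packed` placement uses the least energy but serializes every concurrent access pair.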
Keywords: Reduce Energy Consumption, Array Element, Access Pattern, Memory Bank, Memory Controller