Advanced Data Layout Optimization for Multimedia Applications
The growing disparity between processor and memory speeds has motivated the design of systems with deep memory hierarchies. Most data-dominated multimedia applications do not use their caches efficiently and spend much of their time waiting for memory accesses. Beyond increasing the average memory access time, this inefficiency carries a significant additional cost in memory bandwidth, system bus load, and the associated power consumption.
Keywords: Multimedia Application, Loop Nest, Cache Size, Spatial Reuse, Tile Size