Abstract
We analyze an on-line pattern-recognition algorithm that dynamically controls the configuration of the L1 data cache of a high-performance processor. The microarchitecture achieves higher performance and saves energy by adapting operating frequency, capacity, set-associativity, line size, hit latency, energy per access, and chip area to the program workload and ILP. We show that at an operating frequency of 4.5 GHz, execution time is always reduced, by 12.1% on average, compared with a non-adaptive high-performance processor. Additionally, the energy saving is 2.7% on average, and the time-energy product is reduced by 14.9% on average. We also consider profile-based reconfiguration of the data cache, in which different cache configurations are available but only one can be chosen for each program. Experimental results indicate that this approach yields a high percentage of the performance improvement and energy saving achieved by the on-line algorithm.
This work was supported by the MCyT-Spain under contract TIN 2004-03388, the Gobierno de Canarias, the Generalitat de Catalunya (SGR-00218), and the HiPEAC European Network of Excellence.
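The time-energy product and the profile-based selection mentioned in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the configuration labels, the measurement values, and the function names are hypothetical and do not come from the paper, and choosing the configuration with the lowest time-energy product is only one plausible selection criterion, not necessarily the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One profiling run of a program under one cache configuration."""
    config: str      # hypothetical configuration label
    time_s: float    # execution time in seconds
    energy_j: float  # energy consumed in joules


def time_energy_product(m: Measurement) -> float:
    """Time-energy product of a run: lower is better."""
    return m.time_s * m.energy_j


def pick_profile_config(runs: list[Measurement]) -> Measurement:
    """Profile-based choice: one fixed configuration for the whole program.

    Assumed criterion: the lowest time-energy product among profiled runs.
    """
    return min(runs, key=time_energy_product)


def combined_reduction(time_reduction: float, energy_reduction: float) -> float:
    """Reduction in time-energy product for ONE program, given its
    fractional reductions in time and in energy."""
    return 1.0 - (1.0 - time_reduction) * (1.0 - energy_reduction)


# Made-up profile data for a single program (illustration only).
profile = [
    Measurement("8KB-direct", 1.00, 2.00),
    Measurement("16KB-2way", 0.90, 2.10),
    Measurement("32KB-4way", 0.88, 2.40),
]
best = pick_profile_config(profile)
print(best.config, round(time_energy_product(best), 3))

# Plugging the abstract's AVERAGE figures into the single-program formula.
print(round(combined_reduction(0.121, 0.027), 4))
```

Applying the single-program formula to the reported averages (12.1% and 2.7%) gives roughly 14.5%, not the reported 14.9%. That is not a contradiction: the average of per-program products generally differs from the product of the per-program averages.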
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Benítez, D., Moure, J.C., Rexachs, D.I., Luque, E. (2005). Performance and Power Evaluation of an Intelligently Adaptive Data Cache. In: Bader, D.A., Parashar, M., Sridhar, V., Prasanna, V.K. (eds) High Performance Computing – HiPC 2005. HiPC 2005. Lecture Notes in Computer Science, vol 3769. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11602569_39
DOI: https://doi.org/10.1007/11602569_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-30936-9
Online ISBN: 978-3-540-32427-0
eBook Packages: Computer Science, Computer Science (R0)