Dynamic Capping of Physical Register Files in Simultaneous Multi-threading Processors for Performance
Simultaneous Multi-Threading (SMT) processors share many datapath elements among concurrently running applications. This resource sharing helps keep the area requirement of an SMT processor modest. However, a major performance problem arises from resource conflicts when multiple threads compete for the same shared resource. An earlier study proposed capping one such shared resource, the Physical Register File (PRF): making fewer PRF entries available improves processor performance and also reduces power consumption. For the sake of simplicity, that study proposed a fixed PRF-capping amount, which its authors claim is sufficient for all workload combinations. We show, however, that a fixed PRF-capping strategy does not always give the optimum performance, since any thread's behavior may change at any time during execution. In this study, we extend that earlier work with an adaptive PRF-capping mechanism, which tracks the behavior of all running threads and moves the cap value to a near-optimal position with the help of a hill-climbing algorithm. As a result, we achieve up to 21% performance improvement over the fixed capping method, and 7.2% better performance on average, in a 4-threaded system.
Keywords: Processor performance, Shared resource management, Simultaneous Multi-Threading
This work is supported by the Scientific and Technological Research Council of Turkey (TUBITAK) under Grant No. 117E866.
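The adaptive capping idea in the abstract can be illustrated with a minimal hill-climbing sketch. The abstract does not give the step size, the epoch length, the performance metric, or the exact update rule, so everything below is an assumption made for illustration: the function `hill_climb_cap`, its parameters, and the toy model `toy_throughput` are hypothetical names, not the authors' implementation. The sketch tries a neighboring cap value each epoch, keeps it if the measured throughput does not drop, and otherwise reverses the search direction.

```python
from typing import Callable, List, Tuple


def hill_climb_cap(
    measure: Callable[[int], float],  # hypothetical: throughput observed for one epoch at a given cap
    cap: int,                         # initial PRF cap (number of usable entries)
    step: int,                        # assumed fixed step size per epoch
    lo: int,                          # smallest cap allowed
    hi: int,                          # largest cap allowed (e.g. full PRF size)
    epochs: int,                      # number of adjustment epochs to run
) -> Tuple[int, List[int]]:
    """Return the final cap and the cap chosen after every epoch."""
    direction = -1                    # assumption: start by tightening the cap
    best = measure(cap)
    history = [cap]
    for _ in range(epochs):
        candidate = min(hi, max(lo, cap + direction * step))
        if candidate == cap:
            # Hit a bound: nothing new to try on this side, so turn around.
            direction = -direction
        else:
            observed = measure(candidate)
            if observed >= best:
                # The trial epoch was at least as good: keep climbing this way.
                cap, best = candidate, observed
            else:
                # The trial epoch was worse: stay put and search the other side.
                direction = -direction
        history.append(cap)
    return cap, history


def toy_throughput(cap: int) -> float:
    """Made-up single-peak model with its optimum at 96 entries (illustration only)."""
    return -float((cap - 96) ** 2)


final_cap, trace = hill_climb_cap(toy_throughput, cap=160, step=8, lo=32, hi=256, epochs=20)
```

On this toy model the cap walks down from 160 to 96 and then stays there, probing one step on either side. A real controller would measure throughput over hardware epochs and would need to re-evaluate the current cap periodically, because, as the abstract notes, thread behavior can change at any time and a stored best value can go stale.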