A Fetch Policy Maximizing Throughput and Fairness for Two-Context SMT Processors
In Simultaneous Multithreading (SMT) processors, co-scheduled threads share the processor’s resources but also compete for them. A thread that misses in the L2 cache may hold a large number of resources that other threads could use to make forward progress, and as a result the overall performance of the SMT processor degrades. Many instruction fetch policies address this problem. However, none of them is perfect, and each has its own disadvantages. In particular, these policies are designed for SMT processors with an arbitrary number of hardware contexts, and their disadvantages may become more serious when they are used in two-context SMT processors.
In this paper, we propose a novel fetch policy called RG-FP (Resource Gating based on Fetch Priority), which is designed specifically for two-context SMT processors. RG-FP combines reducing fetch priority with controlling shared resource allocation to prevent the negative effects caused by loads that miss in the L2 cache. Simulation results show that RG-FP outperforms previously proposed fetch policies in both throughput and fairness for all types of workloads, especially memory-bound workloads. The degree of improvement varies with the policy compared against; it is greatest over PDG, reaching 41.8% in throughput and 50.0% in Hmean on average.
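The abstract reports results in throughput and Hmean without defining them. The sketch below assumes the definitions commonly used in SMT evaluations: throughput as the sum of per-thread IPCs under co-scheduling, and Hmean as the harmonic mean of each thread's relative IPC (its IPC when co-scheduled divided by its IPC when running alone). The function names and the sample IPC values are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def throughput(smt_ipcs: Sequence[float]) -> float:
    """Total IPC of all co-scheduled threads (assumed definition)."""
    return sum(smt_ipcs)


def hmean(smt_ipcs: Sequence[float], alone_ipcs: Sequence[float]) -> float:
    """Harmonic mean of per-thread relative IPCs (assumed definition).

    relative_i = IPC of thread i when co-scheduled / IPC of thread i alone.
    The harmonic mean penalizes a policy that speeds up one thread by
    starving the other, so it reflects fairness as well as performance.
    """
    if not smt_ipcs or len(smt_ipcs) != len(alone_ipcs):
        raise ValueError("need one single-thread IPC per co-scheduled thread")
    relative = [s / a for s, a in zip(smt_ipcs, alone_ipcs)]
    if any(r <= 0 for r in relative):
        raise ValueError("relative IPCs must be positive")
    return len(relative) / sum(1.0 / r for r in relative)


# Hypothetical two-context workload: both threads reach IPC 2.0 alone.
balanced = hmean([0.75, 0.75], [2.0, 2.0])
skewed = hmean([1.0, 0.5], [2.0, 2.0])
```

In this made-up example both runs have the same throughput (1.5), but the skewed run gets a lower Hmean than the balanced one, which is why the two metrics are reported together.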
Keywords: Shared Resource, Dynamic Resource Allocation, Translation Lookaside Buffer, Data Gating, Idle Cycle