Abstract
The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by flushing the pipeline to make the CPU available to execute handler instructions. In doing so, the CPU flushes many instructions that have already been brought into the reorder buffer. Many of these instructions have reached a very deep stage in the pipeline, so flushing them wastes significant work. In addition, re-fetching and re-executing these instructions can be expected to cost several cycles. This paper concentrates on improving the performance of precisely handling software-managed translation lookaside buffer (TLB) interrupts, one of the most frequently occurring interrupts. It presents a novel method of in-lining the interrupt handler within the reorder buffer. Since the first-level interrupt handlers of TLBs are usually small, they could potentially fit in the reorder buffer along with the user-level code already there. In that case, the instructions that would otherwise be flushed from the pipe need not be re-fetched and re-executed. In-lining also allows instructions independent of the exceptional instruction to continue executing in parallel with the handler code. We simulate two different schemes of in-lining the interrupt handler on a processor with a 4-way out-of-order core similar to the Alpha 21264. We also analyze the overhead of re-fetching and re-executing instructions when an interrupt is handled by the traditional method. We find that our schemes reduce the number of re-fetched instructions by 50-90% and provide a performance improvement of 5-25%.
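The core decision described above can be illustrated with a minimal sketch. It is not the paper's simulator, and it does not reproduce either of the two in-lining schemes, which the abstract does not detail. The names `ReorderBuffer`, `handle_tlb_miss`, and `handler_len` are hypothetical, and the model assumes that a traditional flush discards every in-flight instruction while in-lining requires the whole handler to fit in the free reorder buffer entries.

```python
from dataclasses import dataclass


@dataclass
class ReorderBuffer:
    """Toy model of a reorder buffer (hypothetical, for illustration only)."""
    capacity: int   # total number of entries
    occupied: int   # entries holding in-flight user-level instructions

    @property
    def free(self) -> int:
        # Entries still available for newly fetched instructions.
        return self.capacity - self.occupied


def handle_tlb_miss(rob: ReorderBuffer, handler_len: int) -> dict:
    """Choose between in-lining the handler and a traditional flush.

    Assumption: the handler is in-lined only if all of its instructions
    fit in the free entries; otherwise the pipeline is flushed and every
    in-flight instruction must later be re-fetched.
    """
    if handler_len <= rob.free:
        # Handler shares the buffer with user code; nothing is discarded.
        return {"mode": "inline", "refetched": 0}
    # Traditional path: all in-flight work is thrown away.
    return {"mode": "flush", "refetched": rob.occupied}


if __name__ == "__main__":
    rob = ReorderBuffer(capacity=80, occupied=60)
    print(handle_tlb_miss(rob, handler_len=10))
    print(handle_tlb_miss(rob, handler_len=30))
```

In this toy model the benefit of in-lining shows up as the `refetched` count: a short handler that fits alongside the user-level code leaves it at zero, while a handler that does not fit forces all occupied entries to be re-fetched.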
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jaleel, A., Jacob, B. (2001). Improving the Precise Interrupt Mechanism of Software-Managed TLB Miss Handlers. In: Monien, B., Prasanna, V.K., Vajapeyam, S. (eds) High Performance Computing — HiPC 2001. HiPC 2001. Lecture Notes in Computer Science, vol 2228. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45307-5_25
DOI: https://doi.org/10.1007/3-540-45307-5_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43009-4
Online ISBN: 978-3-540-45307-9
eBook Packages: Springer Book Archive