IOMMU: Strategies for Mitigating the IOTLB Bottleneck

  • Conference paper
Computer Architecture (ISCA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6161)

Abstract

The input/output memory management unit (IOMMU) was recently introduced into mainstream computer architecture when both Intel and AMD added IOMMUs to their chipsets. An IOMMU protects memory from I/O devices by enabling system software to control which areas of physical memory an I/O device may access. However, this protection incurs additional direct memory access (DMA) overhead due to the required address resolution and validation.

IOMMUs include an input/output translation lookaside buffer (IOTLB) to speed up address resolution, but every IOTLB miss still causes a substantial increase in DMA latency and degrades the performance of DMA-intensive workloads. In this paper we first demonstrate the potential negative impact of IOTLB misses on workload performance. We then propose both system software and hardware enhancements to reduce the IOTLB miss rate and accelerate address resolution. These enhancements can lead to a reduction of over 60% in IOTLB miss rate for common I/O-intensive workloads.
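
To illustrate what an IOTLB miss means, here is a minimal toy model. It is not taken from the paper: the class name ToyIOTLB, the 4 KiB page size, the LRU replacement policy, and the dictionary standing in for the I/O page table are all assumptions made for the sketch. It caches I/O-page-to-frame translations, rejects DMA to unmapped pages, and counts hits and misses so that a miss rate can be computed.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # assumption: 4 KiB pages


class ToyIOTLB:
    """Hypothetical fully associative IOTLB with LRU replacement."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # I/O page number -> physical frame
        self.hits = 0
        self.misses = 0

    def translate(self, io_addr, page_table):
        """Translate a device-visible address, walking page_table on a miss."""
        page = io_addr >> PAGE_SHIFT
        offset = io_addr & ((1 << PAGE_SHIFT) - 1)
        if page in self.entries:
            self.hits += 1
            self.entries.move_to_end(page)  # mark as most recently used
        else:
            self.misses += 1
            if page not in page_table:
                # Protection: the device may not touch unmapped memory.
                raise PermissionError(f"DMA to unmapped I/O page {page:#x}")
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[page] = page_table[page]
        return (self.entries[page] << PAGE_SHIFT) | offset

    def miss_rate(self):
        total = self.hits + self.misses
        return self.misses / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical mappings: three I/O pages, but only two IOTLB entries.
    page_table = {0: 100, 1: 101, 2: 102}
    iotlb = ToyIOTLB(capacity=2)
    for addr in (0x0010, 0x0020, 0x1000, 0x2000, 0x0000):
        iotlb.translate(addr, page_table)
    print(f"miss rate: {iotlb.miss_rate():.2f}")
```

In this sketch, a device that cycles through more pages than the cache holds misses on almost every access, which is the kind of behavior the abstract describes as costly; a real IOTLB's size, associativity, and replacement policy are hardware-specific and are not modeled here.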

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Amit, N., Ben-Yehuda, M., Yassour, BA. (2011). IOMMU: Strategies for Mitigating the IOTLB Bottleneck. In: Varbanescu, A.L., Molnos, A., van Nieuwpoort, R. (eds) Computer Architecture. ISCA 2010. Lecture Notes in Computer Science, vol 6161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24322-6_22

  • DOI: https://doi.org/10.1007/978-3-642-24322-6_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24321-9

  • Online ISBN: 978-3-642-24322-6

  • eBook Packages: Computer Science, Computer Science (R0)
