Hardware-Assisted Context Management for Accelerator Virtualization: A Case Study with RSA

  • Conference paper
Architecture of Computing Systems – ARCS 2016 (ARCS 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9637)

Abstract

The advantages of virtualization, including the ability to migrate, schedule, and manage software processes, continue to drive the demand for hardware and software support. However, the packaging of software state required by virtualization is in direct conflict with the trend toward accelerator-rich architectures, where state is distributed between the processor and a set of heterogeneous devices, a problem that is particularly acute in the mobile SoC market. Virtualizing such systems requires that the VMM explicitly manage the internal state of all of the accelerators over which a process’s computation may be spread. Public-key crypto engines are particularly problematic because of both the sensitivity of the information that they carry and the long compute times required to complete a single task.

In this paper, we examine a set of hardware design approaches to public-key crypto accelerator virtualization and study the trade-off between sharing granularity and management overhead in time and space. Based on observations made during the design of several such systems, we propose a hybrid local-remote scheduling approach that promotes more intelligent decisions during hardware context switches and enables quick and safe state packaging. We find that performance can vary significantly among the examined approaches, and that our new design, with explicit accelerator support for state management and a modicum of scheduling flexibility, can allow highly contended resources to be efficiently shared with only moderate increases in area and power consumption.
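
The trade-off between sharing granularity and management overhead can be illustrated with a small software model. The sketch below is not the hardware design evaluated in the paper. It assumes a plain left-to-right square-and-multiply modular exponentiation (the core RSA operation) that may be preempted only at exponent-bit boundaries, so the state to package is just the running value and a bit counter. The names `ModExpContext`, `run`, `share`, and `quantum` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ModExpContext:
    """Hypothetical packaged state of one in-flight modular exponentiation."""
    base: int
    exponent: int
    modulus: int
    acc: int = 1        # running result so far
    bits_done: int = 0  # exponent bits already consumed, most significant first


def run(ctx: ModExpContext, max_steps: int) -> bool:
    """Advance ctx by at most max_steps exponent bits; return True when done."""
    total = ctx.exponent.bit_length()
    steps = 0
    while ctx.bits_done < total and steps < max_steps:
        shift = total - 1 - ctx.bits_done
        ctx.acc = (ctx.acc * ctx.acc) % ctx.modulus        # square
        if (ctx.exponent >> shift) & 1:
            ctx.acc = (ctx.acc * ctx.base) % ctx.modulus   # conditional multiply
        ctx.bits_done += 1
        steps += 1
    return ctx.bits_done == total


def share(tasks, quantum: int) -> int:
    """Round-robin one modeled engine among tasks, preempting every `quantum` bits.

    Returns the number of preemptions, a stand-in for management overhead.
    """
    pending = list(tasks)
    switches = 0
    while pending:
        ctx = pending.pop(0)
        if not run(ctx, quantum):
            pending.append(ctx)  # save the context and requeue the task
            switches += 1
    return switches
```

In this model, a smaller `quantum` lets a waiting task reach the engine sooner but increases the number of context saves, while a `quantum` at least as long as the exponent runs each task to completion with no preemption at all. Real accelerators hold considerably more internal state than the two fields modeled here, so this only indicates the shape of the trade-off.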

Author information

Corresponding author

Correspondence to Ying Gao.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Gao, Y., Sherwood, T. (2016). Hardware-Assisted Context Management for Accelerator Virtualization: A Case Study with RSA. In: Hannig, F., Cardoso, J.M.P., Pionteck, T., Fey, D., Schröder-Preikschat, W., Teich, J. (eds) Architecture of Computing Systems – ARCS 2016. ARCS 2016. Lecture Notes in Computer Science, vol 9637. Springer, Cham. https://doi.org/10.1007/978-3-319-30695-7_6

  • DOI: https://doi.org/10.1007/978-3-319-30695-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30694-0

  • Online ISBN: 978-3-319-30695-7

  • eBook Packages: Computer Science, Computer Science (R0)
