qCUDA-ARM: Virtualization for Embedded GPU Architectures

  • Conference paper

Internet of Vehicles. Technologies and Services Toward Smart Cities (IOV 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11894)

Abstract

The emergence of the Internet of Things (IoT) is changing how computing resources are acquired, shifting from centralized cloud data centers to distributed, pervasive edge nodes. To cope with the small-volume but highly diverse nature of IoT devices and applications, two research trends are being investigated for the system design of edge nodes: heterogeneity and virtualization. In this paper, we consider the integration of these two important trends and present qCUDA-ARM, a virtualization system for embedded GPU architectures. The design of qCUDA-ARM is based on the framework of qCUDA, a GPU virtualization system for x86 servers. Because of the architectural differences between x86 servers and ARM-based embedded systems, several subsystems of qCUDA-ARM, such as memory management, had to be redesigned. We evaluated the performance of qCUDA-ARM with three CUDA benchmarks and two real-world applications. For computation-intensive jobs, qCUDA-ARM reaches performance similar to that of the native system; for memory-bound programs, it achieves up to 90% of native performance.
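
The full text describes the design in detail; as a rough, hypothetical illustration of the API-remoting idea behind qCUDA-style virtualization (guest-side CUDA runtime calls are packed into commands and forwarded to the host GPU through a virtio-backed channel), the sketch below shows what a guest shim for cudaMalloc might look like. All names here (/dev/qcuda, qcu_cmd, QCU_CUDA_MALLOC) are invented for illustration and are not taken from the actual qCUDA or qCUDA-ARM implementation.

```c
/* Hypothetical sketch of guest-side API remoting: a shim library
 * intercepts a CUDA runtime call, packs it into a command block,
 * and forwards it to the host through a character device backed
 * by a virtio transport. Names and layout are illustrative only. */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Command identifiers and the argument block exchanged with the host. */
enum { QCU_CUDA_MALLOC = 1, QCU_CUDA_MEMCPY = 2, QCU_CUDA_FREE = 3 };

struct qcu_cmd {
    uint32_t op;        /* which CUDA runtime call to perform on the host */
    uint64_t size;      /* allocation or copy size in bytes */
    uint64_t dev_ptr;   /* device pointer returned by / passed to the host */
    int32_t  result;    /* cudaError_t value filled in by the host */
};

/* Guest-side replacement for cudaMalloc(): pack the request, hand it
 * to the front-end driver, and return the host's answer. */
static int shim_cudaMalloc(void **devPtr, size_t size)
{
    struct qcu_cmd cmd = { .op = QCU_CUDA_MALLOC, .size = size };
    int fd = open("/dev/qcuda", O_RDWR);        /* hypothetical device node */
    if (fd < 0)
        return -1;
    int rc = ioctl(fd, QCU_CUDA_MALLOC, &cmd);  /* forwarded over virtio */
    close(fd);
    if (rc < 0 || cmd.result != 0)
        return -1;
    *devPtr = (void *)(uintptr_t)cmd.dev_ptr;   /* opaque host-side pointer */
    return 0;
}

int main(void)
{
    void *d = NULL;
    if (shim_cudaMalloc(&d, 1 << 20) == 0)
        printf("allocated 1 MiB on the virtualized GPU at %p\n", d);
    return 0;
}
```

On ARM-based boards with unified memory, the memory-management path of such a shim (how guest buffers are pinned and exposed to the host GPU) is exactly the part the paper reports having to redesign relative to the x86 version.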

Author information

Corresponding author

Correspondence to Che-Rung Lee.



Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, BY., Lee, CR. (2020). qCUDA-ARM: Virtualization for Embedded GPU Architectures. In: Hsu, CH., Kallel, S., Lan, KC., Zheng, Z. (eds) Internet of Vehicles. Technologies and Services Toward Smart Cities. IOV 2019. Lecture Notes in Computer Science, vol 11894. Springer, Cham. https://doi.org/10.1007/978-3-030-38651-1_23

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-38651-1_23

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-38650-4

  • Online ISBN: 978-3-030-38651-1

  • eBook Packages: Computer Science, Computer Science (R0)
