Pushing Big Data into Accelerators: Can the JVM Saturate Our Hardware?

Peltenburg, Johan; Hesam, Ahmad; Al-Ars, Zaid

doi:10.1007/978-3-319-67630-2_18

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10524)

Included in the following conference series:

International Conference on High Performance Computing

Abstract

Advancements in the field of big data have led to an increasing interest in accelerator-based computing as a solution for computationally intensive problems. However, many prevalent big data frameworks are built and run on top of the Java Virtual Machine (JVM), which does not explicitly offer support for accelerated computing with, e.g., GPGPUs or FPGAs. One major challenge in combining JVM-based big data frameworks with accelerators is transferring data from objects that reside in JVM-managed memory to the accelerator. In this paper, a rigorous analysis of possible solutions to this challenge is presented. Furthermore, a tool is presented that generates the required code for four alternative solutions and measures the attainable data transfer speed for a specific object graph. This can give researchers and designers quick insight into whether the interface between the JVM and the accelerator can saturate the computational resources of their accelerator. The benchmarking tool was run on a POWER8 system, for which results show that, depending on the size of the objects and the size of the collections, an approach based on the Java Native Interface can achieve between 0.9 and 12 GB/s, ByteBuffers can achieve between 0.7 and 3.3 GB/s, the Unsafe library can achieve between 0.8 and 16 GB/s, and finally an approach that accesses the data directly can achieve between 3 and 67 GB/s. From our measurements, we conclude that the HotSpot VM does not yet have standardized interfaces by design that can saturate common bandwidths to accelerators seen today or in the future, although one of the approaches presented in this paper can overcome this limitation.
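
To make the ByteBuffer approach mentioned in the abstract more concrete, the following is a minimal sketch and not the code generated by the authors' tool. It copies the fields of a small array of objects into an off-heap direct buffer, which native code could then read, and times one pass. The class `Point`, its packed field layout, and the names `serialize` and `measureGBps` are hypothetical and chosen only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical example object; not taken from the paper's benchmark. */
final class Point {
    final int id;
    final double x;
    final double y;

    Point(int id, double x, double y) {
        this.id = id;
        this.x = x;
        this.y = y;
    }
}

final class ByteBufferTransferSketch {
    // Assumed packed layout: one 4-byte int followed by two 8-byte doubles.
    static final int BYTES_PER_POINT = Integer.BYTES + 2 * Double.BYTES;

    /** Copies the fields of each object into an off-heap buffer in native byte order. */
    static ByteBuffer serialize(Point[] points) {
        ByteBuffer buf = ByteBuffer.allocateDirect(points.length * BYTES_PER_POINT)
                                   .order(ByteOrder.nativeOrder());
        for (Point p : points) {
            buf.putInt(p.id).putDouble(p.x).putDouble(p.y);
        }
        buf.flip(); // switch the buffer from writing to reading
        return buf;
    }

    /** Returns the copy rate in GB/s for one pass, including buffer allocation. */
    static double measureGBps(Point[] points) {
        long start = System.nanoTime();
        ByteBuffer buf = serialize(points);
        long elapsedNs = Math.max(1, System.nanoTime() - start);
        return buf.remaining() / (double) elapsedNs; // bytes per ns equals GB/s
    }

    public static void main(String[] args) {
        Point[] points = new Point[1 << 16];
        for (int i = 0; i < points.length; i++) {
            points[i] = new Point(i, i * 0.5, i * 2.0);
        }
        System.out.printf("ByteBuffer copy: %.2f GB/s (single pass, no warm-up)%n",
                measureGBps(points));
    }
}
```

A single untuned pass like this says little about attainable bandwidth; a meaningful measurement would need JIT warm-up, repeated runs, and a range of object and collection sizes, as the abstract indicates.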

Notes

1. This depends on whether the representation of the array in the VM is the same as the native representation, and on whether the VM garbage collector supports “pinning”.

Acknowledgment

The authors would like to thank Erik Vermij for his help using the POWER8 system and the Texas Advanced Computing Center and their partners for access to the hardware. This work was supported by the European Commission in the context of the ARTEMIS project ALMARVI (project #621439).

Author information

Authors and Affiliations

Computer Engineering Lab, Delft University of Technology, Delft, Netherlands
Johan Peltenburg, Ahmad Hesam & Zaid Al-Ars

Corresponding author

Correspondence to Johan Peltenburg.

Editor information

Editors and Affiliations

Deutsches Klimarechenzentrum (DKRZ), Hamburg, Hamburg, Germany
Julian M. Kunkel
TITECH, Tokyo, Japan
Rio Yokota
Department of Computer Science, University of Delaware, Newark, Delaware, USA
Michela Taufer
Lawrence Berkeley National Laboratory, Berkeley, California, USA
John Shalf

Cite this paper

Peltenburg, J., Hesam, A., Al-Ars, Z. (2017). Pushing Big Data into Accelerators: Can the JVM Saturate Our Hardware? In: Kunkel, J., Yokota, R., Taufer, M., Shalf, J. (eds) High Performance Computing. ISC High Performance 2017. Lecture Notes in Computer Science, vol 10524. Springer, Cham. https://doi.org/10.1007/978-3-319-67630-2_18

DOI: https://doi.org/10.1007/978-3-319-67630-2_18
Published: 20 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67629-6
Online ISBN: 978-3-319-67630-2
eBook Packages: Computer Science, Computer Science (R0)
