Proposal of Parallel Processing Area Extraction and Data Transfer Number Reduction for Automatic GPU Offloading of IoT Applications

  • Conference paper
Smart Computing and Communication (SmartCom 2018)

Abstract

IoT (Internet of Things) technologies have progressed in recent years. To overcome the high cost of developing IoT services by vertically integrating devices and services, Open IoT enables various IoT services to be developed by integrating horizontally separated devices and services. For Open IoT, we have proposed Tacit Computing, a technology that discovers the devices holding the data users need on demand and uses them dynamically, along with an automatic GPU (graphics processing unit) offloading technology as an elementary technology of Tacit Computing. However, the existing offloading technology improves only a limited range of applications because it optimizes only the extraction of parallelizable loop statements. Therefore, in this paper, to automatically improve the performance of more applications, we propose an improved method that also reduces data transfer between the CPU and GPU. This can improve the performance of many IoT applications. We evaluate the proposed GPU offloading method by applying it to Darknet, a general large application written for CPUs, and find that, within 10 h of tuning time, it runs 3 times as fast as CPU-only execution.
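To make the idea of reducing the number of CPU-GPU data transfers concrete, the following is a minimal sketch in Python. It is not the authors' implementation and runs no GPU code: `TransferCounter`, `scale_on_device`, `run_naive`, and `run_hoisted` are hypothetical names, and the "device" is simulated by simply counting copies. It only contrasts copying data around every offloaded loop with copying once outside the enclosing loop.

```python
from dataclasses import dataclass


@dataclass
class TransferCounter:
    """Counts simulated host<->device copies (a stand-in for real CPU-GPU transfers)."""
    to_device: int = 0
    to_host: int = 0

    @property
    def total(self) -> int:
        return self.to_device + self.to_host


def scale_on_device(data, factor):
    """Stand-in for a parallelizable loop that would be offloaded to the GPU."""
    return [x * factor for x in data]


def run_naive(data, steps, counter):
    """Copy the array to the device and back around every offloaded loop."""
    for _ in range(steps):
        counter.to_device += 1           # host -> device before each loop
        data = scale_on_device(data, 2)
        counter.to_host += 1             # device -> host after each loop
    return data


def run_hoisted(data, steps, counter):
    """Copy once before the outer loop and once after it."""
    counter.to_device += 1               # single host -> device copy
    for _ in range(steps):
        data = scale_on_device(data, 2)  # data stays resident on the device
    counter.to_host += 1                 # single device -> host copy
    return data


naive, hoisted = TransferCounter(), TransferCounter()
result_naive = run_naive([1, 2, 3], steps=10, counter=naive)
result_hoisted = run_hoisted([1, 2, 3], steps=10, counter=hoisted)
print(naive.total, hoisted.total)  # 20 2
```

Both variants compute the same result; only the number of simulated transfers differs (two per step versus two in total). In a real program the saving depends on whether the host needs the data between offloaded loops, which this toy model assumes it does not.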

Author information

Corresponding author

Correspondence to Yoji Yamato.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Yamato, Y., Noguchi, H., Kataoka, M., Isoda, T., Demizu, T. (2018). Proposal of Parallel Processing Area Extraction and Data Transfer Number Reduction for Automatic GPU Offloading of IoT Applications. In: Qiu, M. (eds) Smart Computing and Communication. SmartCom 2018. Lecture Notes in Computer Science, vol 11344. Springer, Cham. https://doi.org/10.1007/978-3-030-05755-8_5

  • DOI: https://doi.org/10.1007/978-3-030-05755-8_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-05754-1

  • Online ISBN: 978-3-030-05755-8

  • eBook Packages: Computer Science, Computer Science (R0)
