Abstract
During the past few years, increases in computational power have been realized by using more processors with multiple cores and specialized processing units such as graphics processing units (GPUs). In addition, programming models such as CUDA and OpenCL make it easy, even for non-graphics programmers, to exploit the computational power of the massively parallel processors available in current GPUs. Although CUDA and OpenCL relieve programmers from considering many low-level details of parallel programming across the multiple cores of a single GPU, equivalent support at the higher level of parallelization across multiple GPUs is still a subject of research. In particular, fundamental issues of memory management and synchronization must be dealt with directly by the programmer. In this chapter, we introduce concepts for CUDA-based frameworks designed for stateful stream data processing with graph-like arrangements of processing modules on two or more GPUs in a single compute node. We evaluate these concepts and elaborate further on the approach we chose. Our approach relieves the programmer of the error-prone chores of memory management and synchronization. The chapter presents detailed evaluation results that demonstrate the scalability of the proposed framework. To demonstrate its usability, we apply the framework to demanding online processing tasks in crystallographic structure detection and video decryption.
Acknowledgments
This research was partially funded by the German Ministry for Research and Education (BMBF) under grant No. 05k10PSB.
Copyright information
© 2015 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Alghabi, F., Schipper, U., Kolb, A. (2015). A Scalable Software Framework for Stateful Stream Data Processing on Multiple GPUs and Applications. In: Cai, Y., See, S. (eds) GPU Computing and Applications. Springer, Singapore. https://doi.org/10.1007/978-981-287-134-3_7
DOI: https://doi.org/10.1007/978-981-287-134-3_7
Publisher Name: Springer, Singapore
Print ISBN: 978-981-287-133-6
Online ISBN: 978-981-287-134-3
eBook Packages: Engineering, Engineering (R0)