Abstract
Task-based runtime systems have gained considerable interest in recent years because they separate the specification of parallel computations from their concrete mapping onto a parallel architecture. This separation of concerns is considered key to coping with the increased complexity, performance variability, and heterogeneity of future parallel systems, and to facilitating the portability of applications across different architectures. In this paper we present our work on a programming framework that enables the expression of pipeline patterns at a high level of abstraction by adding pragma directives to sequential C++ codes. These high-level abstractions are then transformed to a runtime coordination layer, which utilizes different task-based runtime systems, including StarPU and OCR, to realize efficient parallel execution on single-node multi-core architectures. We describe the major aspects of our approach for mapping pipeline patterns to task-based runtimes and present experimental results for a real-world face-detection application, indicating that performance competitive with low-level programming approaches can be achieved.
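To make the pipeline pattern concrete, the following is a minimal sketch of a sequential loop restructured into three concurrently running stages. It is not the framework described above: the pragma syntax and the coordination layer are not shown in this abstract, and the sketch uses plain standard-library threads rather than StarPU or OCR. All names (`Channel`, `process_item`, `run_sequential`, `run_pipeline`) and the buffer capacity of 4 are assumptions chosen for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <optional>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded buffer connecting two pipeline stages.
template <typename T>
class Channel {
public:
    explicit Channel(std::size_t capacity) : capacity_(capacity) {
        assert(capacity > 0);
    }

    // Blocks while the buffer is full.
    void push(T value) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(value));
        not_empty_.notify_one();
    }

    // Signals that the producing stage will send no further items.
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        not_empty_.notify_all();
    }

    // Blocks until an item arrives; returns std::nullopt once the
    // channel has been closed and drained.
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return std::nullopt;
        T value = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();
        return value;
    }

private:
    std::size_t capacity_;
    std::queue<T> items_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
    bool closed_ = false;
};

// Placeholder for the per-item work of the middle stage.
int process_item(int x) { return 2 * x + 1; }

// Sequential reference version: one loop, one item at a time.
std::vector<int> run_sequential(int n) {
    std::vector<int> out;
    for (int i = 0; i < n; ++i) out.push_back(process_item(i));
    return out;
}

// Pipelined version: source, processing, and sink stages overlap in time.
std::vector<int> run_pipeline(int n) {
    Channel<int> to_process(4);
    Channel<int> to_sink(4);
    std::vector<int> out;

    std::thread source([&] {
        for (int i = 0; i < n; ++i) to_process.push(i);
        to_process.close();
    });
    std::thread worker([&] {
        while (auto item = to_process.pop()) to_sink.push(process_item(*item));
        to_sink.close();
    });
    std::thread sink([&] {
        while (auto item = to_sink.pop()) out.push_back(*item);
    });

    source.join();
    worker.join();
    sink.join();
    return out;
}
```

Because each stage here runs on a single thread, items leave the pipeline in the order they entered, so `run_pipeline(n)` yields the same vector as `run_sequential(n)`. A task-based runtime would typically replace the fixed threads with dynamically scheduled tasks and could replicate a stage across several workers, which this sketch does not attempt.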
Acknowledgement
The work was supported in part by the Austrian Science Fund (FWF) project P 29783 Dynamic Runtime System for Future Parallel Architectures.
Copyright information
© 2019 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Bajrovic, E., Benkner, S., Dokulil, J. (2019). Pipeline Patterns on Top of Task-Based Runtimes. In: Park, J., Shen, H., Sung, Y., Tian, H. (eds) Parallel and Distributed Computing, Applications and Technologies. PDCAT 2018. Communications in Computer and Information Science, vol 931. Springer, Singapore. https://doi.org/10.1007/978-981-13-5907-1_11
DOI: https://doi.org/10.1007/978-981-13-5907-1_11
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-5906-4
Online ISBN: 978-981-13-5907-1
eBook Packages: Computer Science, Computer Science (R0)