Abstract
Writing efficient general-purpose programs for Graphics Processing Units (GPUs) is a complex task. To program these processors efficiently, one has to understand their intricate architecture and memory subsystem, as well as their interaction with the Central Processing Unit (CPU). This paper presents GAP, an automatic parallelizer designed to translate sequential ANSI C code into parallel CUDA C programs. The general processing architecture of GAP is presented. The implemented compiler was tested on a series of ANSI C programs. The generated code performed very well, achieving significant speed-ups for programs that expose a high degree of data parallelism. The results show that applying automatic parallelization to generate CUDA C code is feasible and realistic.
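To make the notion of a data-parallel loop concrete, the following is a minimal sketch in plain C. It is not GAP's output and does not describe its internals. It shows a sequential loop whose iterations are independent, next to the same loop body rewritten as a per-thread function in the style of a CUDA kernel. All names (`saxpy_sequential`, `saxpy_thread`, `saxpy_emulated_launch`) are hypothetical, and the parameters `block_idx`, `block_dim` and `thread_idx` merely stand in for CUDA's `blockIdx.x`, `blockDim.x` and `threadIdx.x`. The launch is emulated with ordinary host loops so the example runs without a GPU.

```c
#include <stddef.h>

/* Sequential form: iteration i reads x[i] and y[i] and writes only y[i],
   so no iteration depends on the result of another. */
void saxpy_sequential(size_t n, float alpha, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = alpha * x[i] + y[i];
}

/* Data-parallel form: the loop body becomes the work of one logical thread.
   The three index parameters are stand-ins (an assumption of this sketch)
   for CUDA's blockIdx.x, blockDim.x and threadIdx.x. */
static void saxpy_thread(size_t block_idx, size_t block_dim, size_t thread_idx,
                         size_t n, float alpha, const float *x, float *y)
{
    size_t i = block_idx * block_dim + thread_idx;  /* global element index */
    if (i < n)  /* guard: the last block may extend past the end of the array */
        y[i] = alpha * x[i] + y[i];
}

/* Host-side emulation of a one-dimensional launch: enough blocks of
   block_dim threads to cover n elements. Assumes block_dim > 0. */
void saxpy_emulated_launch(size_t n, float alpha, const float *x, float *y,
                           size_t block_dim)
{
    size_t grid_dim = (n + block_dim - 1) / block_dim;  /* ceiling division */
    for (size_t b = 0; b < grid_dim; ++b)
        for (size_t t = 0; t < block_dim; ++t)
            saxpy_thread(b, block_dim, t, n, alpha, x, y);
}
```

Because each element is updated exactly once and independently of the others, the two forms produce the same result regardless of the order in which the emulated threads run. That independence is the property that makes a loop a candidate for this kind of translation.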
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Kwiatkowski, J., Bajgoric, D., Fras, M. (2019). GAP - General Autonomous Parallelizer for CUDA Environment. In: Borzemski, L., Świątek, J., Wilimowska, Z. (eds) Information Systems Architecture and Technology: Proceedings of 39th International Conference on Information Systems Architecture and Technology – ISAT 2018. ISAT 2018. Advances in Intelligent Systems and Computing, vol 852. Springer, Cham. https://doi.org/10.1007/978-3-319-99981-4_17
DOI: https://doi.org/10.1007/978-3-319-99981-4_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99980-7
Online ISBN: 978-3-319-99981-4
eBook Packages: Intelligent Technologies and Robotics (R0)