Abstract
Configurable architectures, with multiple independent on-chip RAM modules, offer a unique opportunity to exploit the parallel memory accesses inherent in a sequential program by tailoring not only the number and configuration of the modules in the resulting hardware design but also the accesses to them. In this paper we explore the possibility of array replication for loop computations that is beyond the reach of traditional privatization and parallelization analyses. We present a compiler analysis that identifies portions of array variables that can be temporarily replicated within the execution of a given loop iteration, enabling the concurrent execution of statements or even non-perfectly nested loops. For configurable architectures, where array replication is essentially free in terms of execution time, this replication not only enables parallel execution but also reduces or even eliminates memory contention. We present preliminary experiments applying the proposed technique to hardware designs for commercially available FPGA devices.
© 2006 Springer-Verlag Berlin Heidelberg
Cite this paper
Ziegler, H.E., Malusare, P.L., Diniz, P.C. (2006). Array Replication to Increase Parallelism in Applications Mapped to Configurable Architectures. In: Ayguadé, E., Baumgartner, G., Ramanujam, J., Sadayappan, P. (eds) Languages and Compilers for Parallel Computing. LCPC 2005. Lecture Notes in Computer Science, vol 4339. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69330-7_5
Print ISBN: 978-3-540-69329-1
Online ISBN: 978-3-540-69330-7