Abstract
We present new proposals, based on the owner-compute rule, for parallelizing irregular loops with dependences. The parallel code increases the available parallelism by distributing the statements inside each iteration rather than whole iterations of the loop. The main features of our proposal are the reordering of the layout of the indirection entries, the optimization of data locality, and efficient load balancing. The inspector and executor phases are fully parallel and free of synchronizations, and they are decoupled, so the information gathered by the inspector can be reused. Experimental results on an SGI O2000 system show that our approach achieves high performance, even when compared with well-known parallelization strategies.
This work was funded by CICYT under project TIC 2001-3694.
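As a rough illustration of the idea in the abstract, the following sketch applies an owner-compute rule to individual statements rather than to whole iterations. It is not the authors' algorithm. The loop shape (each statement reads and writes a single array element selected through an indirection array), the block distribution, and all names (`owner`, `inspector`, `executor`, `statement`) are assumptions made for this example. Under these assumptions, every statement that touches a given element lands on the same processor in its original order, so the executor needs no synchronization. The reordering of indirection entries and the load balancing mentioned in the abstract are not modeled.

```python
from collections import defaultdict


def owner(element, num_elements, num_procs):
    """Block distribution (assumed): processor that owns an array element."""
    block = -(-num_elements // num_procs)  # ceiling division
    return element // block


def inspector(index, num_elements, num_procs):
    """Assign each statement instance to the owner of the element it writes.

    index[i] lists the element written by each statement of iteration i.
    The result can be reused for any later executor run on the same index.
    """
    schedule = defaultdict(list)
    for i, writes in enumerate(index):
        for s, element in enumerate(writes):
            proc = owner(element, num_elements, num_procs)
            schedule[proc].append((i, s, element))
    return schedule


def statement(s, old, x):
    """Two hypothetical, non-commutative updates, so execution order matters."""
    return 2 * old + x if s == 0 else old - x


def sequential(a, index, x):
    """Reference execution of the original irregular loop."""
    a = list(a)
    for i, writes in enumerate(index):
        for s, element in enumerate(writes):
            a[element] = statement(s, a[element], x[i])
    return a


def executor(a, schedule, x):
    """Run each processor's statement list; the lists are mutually independent.

    The outer loop is sequential here only for simplicity. Each list could
    run on its own processor because no list touches another list's elements.
    """
    a = list(a)
    for proc in sorted(schedule):
        for i, s, element in schedule[proc]:
            a[element] = statement(s, a[element], x[i])
    return a


if __name__ == "__main__":
    data = list(range(8))
    index = [(0, 5), (5, 2), (2, 7), (7, 0)]  # two statements per iteration
    x = [1, 2, 3, 4]
    schedule = inspector(index, num_elements=8, num_procs=2)
    print(executor(data, schedule, x) == sequential(data, index, x))
```

In this example the two statements of iteration 0 write elements 0 and 5, which belong to different processors, so a single iteration is split across processors. This is the extra parallelism that statement-level distribution is meant to expose.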
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Singh, D.E., Martín, M.J., Rivera, F.F. (2003). Increasing the Parallelism of Irregular Loops with Dependences. In: Kosch, H., Böszörményi, L., Hellwagner, H. (eds) Euro-Par 2003 Parallel Processing. Euro-Par 2003. Lecture Notes in Computer Science, vol 2790. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45209-6_43
DOI: https://doi.org/10.1007/978-3-540-45209-6_43
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40788-1
Online ISBN: 978-3-540-45209-6
eBook Packages: Springer Book Archive