ASPEN: An Efficient Algorithm for Data Redistribution Between Producer and Consumer Grids
HPC applications and libraries have frequently moved parallel data from one distribution scheme to another for performance reasons. More recently, interest in this data redistribution problem has resurged, driven by the need to relocate data distributed across one Producer grid onto a different distribution scheme across a Consumer grid. In this paper, we study efficient algorithms for performing redistribution, and show that the best methods from the literature still depend on the number of processors in both grids. We describe a new algorithm, ASPEN, that exploits more cyclic patterns and relations in the distribution, does not depend on the total number of processors, and is thus well suited for use in workflow management systems. We describe a preliminary implementation of the algorithm within such a workflow system and show performance results that indicate a significant performance benefit in data redistribution generation.
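To make the problem concrete, the following is a minimal sketch of a naive baseline, not the ASPEN algorithm itself. It assumes a one-dimensional array with a block-cyclic layout on both grids; the names `owner`, `naive_schedule`, and `pattern_period`, and the parameters `p`, `x`, `q`, `y` (producer ranks, producer block size, consumer ranks, consumer block size) are hypothetical and chosen only for illustration. The sketch shows the well-known property that the sender/receiver pattern repeats with period lcm(p*x, q*y), which is the kind of cyclic structure a redistribution algorithm can exploit.

```python
from collections import defaultdict
from math import gcd


def owner(i, block, nprocs):
    """Rank owning global index i under a block-cyclic layout (assumed 1-D)."""
    return (i // block) % nprocs


def naive_schedule(n, p, x, q, y):
    """Element-wise baseline: map (producer, consumer) -> global indices to send.

    Visits every one of the n elements, so its cost grows with the data size.
    """
    sched = defaultdict(list)
    for i in range(n):
        sched[(owner(i, x, p), owner(i, y, q))].append(i)
    return dict(sched)


def pattern_period(p, x, q, y):
    """Number of elements after which the (producer, consumer) pattern repeats."""
    a, b = p * x, q * y
    return a * b // gcd(a, b)


if __name__ == "__main__":
    # 2 producers and 3 consumers, both with block size 1 (plain cyclic).
    period = pattern_period(2, 1, 3, 1)
    print("period:", period)
    for pair, indices in sorted(naive_schedule(12, 2, 1, 3, 1).items()):
        print(pair, indices)
```

Because the pattern is periodic, a schedule computed for one period can in principle be reused for the rest of the array by offsetting indices, rather than being recomputed element by element.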
Keywords: Data distribution, Redistribution, Data placement, Data locality, Memory layout, Communication pattern, Parallel programming, Distributed memory
This work was partly funded by the EXPERTISE project (http://www.msca-expertise.eu/), which has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 721865.