Finish Accumulators: An Efficient Reduction Construct for Dynamic Task Parallelism
Parallel reductions are a common pattern for aggregating data supplied by parallel tasks using an associative and commutative operation, such as summation. In this poster, we introduce finish accumulators, a unified construct that supports predefined and user-defined parallel reductions for dynamic task parallelism. Finish accumulators are designed to be integrated into structured task parallelism constructs, such as the async and finish constructs found in the X10 and Habanero-Java (HJ) languages, so as to guarantee deterministic accumulation and to avoid race conditions when reading intermediate results. In contrast to lower-level reduction constructs such as atomic variables, the high-level semantics of finish accumulators permit a wide range of implementations with different accumulation policies, e.g., eager computation vs. lazy computation. The best implementation can thus be selected for a given application and target platform. We have integrated finish accumulators into the Habanero-Java task-parallel language and used them for research and teaching. Beyond offering higher-level semantics, our Java-based implementation of finish accumulators delivers, in our experiments, comparable or better performance for computing reductions relative to Java’s atomic variables and concurrent collections.
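To make the idea concrete, the following is a minimal sketch in plain Java of how an accumulator tied to a finish scope might behave. It is not the Habanero-Java API: the names `FinishAccumulatorSketch`, `SumAccumulator`, `FinishScope`, `finish`, `async`, `put`, `get`, and `publish` are hypothetical and chosen only for illustration. The sketch assumes an eager accumulation policy, backed by `java.util.concurrent.atomic.LongAdder`, and assumes that `get` exposes contributions only once the enclosing scope has ended.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Phaser;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;

// Illustrative sketch only; all names here are hypothetical, not the HJ API.
class FinishAccumulatorSketch {

    /** Sum accumulator whose result becomes visible only at the end of a finish scope. */
    static final class SumAccumulator {
        private final LongAdder pending = new LongAdder(); // eager: combined as values arrive
        private long visible = 0;                          // last published result

        void put(long value) { pending.add(value); }       // safe to call from any task

        long get() { return visible; }                     // never exposes partial sums

        void publish() { visible += pending.sumThenReset(); } // called when the scope ends
    }

    /** Tracks every task spawned in the scope, including tasks spawned by other tasks. */
    static final class FinishScope {
        private final ExecutorService pool;
        private final Phaser phaser = new Phaser(1);       // one party for the scope owner

        FinishScope(ExecutorService pool) { this.pool = pool; }

        void async(Runnable task) {
            phaser.register();                             // count the task before it starts
            pool.execute(() -> {
                try { task.run(); } finally { phaser.arriveAndDeregister(); }
            });
        }

        void awaitAll() { phaser.arriveAndAwaitAdvance(); } // block until all tasks are done
    }

    /** Runs body, waits for all tasks it spawned, then publishes the accumulated value. */
    static void finish(ExecutorService pool, SumAccumulator acc, Consumer<FinishScope> body) {
        FinishScope scope = new FinishScope(pool);
        body.accept(scope);
        scope.awaitAll();
        acc.publish();
    }

    static long sumOfSquares(int n) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            SumAccumulator acc = new SumAccumulator();
            finish(pool, acc, scope -> {
                for (int i = 1; i <= n; i++) {
                    final long v = i;
                    scope.async(() -> acc.put(v * v));     // one task per contribution
                }
            });
            return acc.get();                              // reads the published total
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(100));
    }
}
```

Because callers see only `put` and `get`, the internal policy could be swapped without touching user code, for example by buffering contributions per task and combining them lazily in `publish`. The sketch calls `finish` from a thread outside the pool; nesting blocking scopes inside a fixed-size pool could deadlock and would need a different scheduler.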