Contention-Free Communication Scheduling for Group Communication in Data Parallelism

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4804)

Abstract

Group communication significantly influences the performance of data-parallel applications. It is typically required in two situations: array redistribution between program phases, and array remapping after loop partitioning. Nevertheless, an important factor in the efficiency of group communication is often neglected: when node contention occurs, or when message lengths differ within a single communication step, processors can spend considerable time idle. This paper develops an efficient scheduling strategy that uses compile-time information provided by array subscripts, the array distribution pattern, and the array access period. The strategy not only avoids inter-processor contention but also minimizes the real communication cost of each communication step. Experimental results show that the strategy outperforms the traditional implementation of MPI_Alltoallv, alltoall-based scheduling, and greedy scheduling.
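The two effects named in the abstract, node contention and unequal message lengths within a step, can be illustrated with a small sketch. This is not the authors' scheduling strategy; it only compares two generic, well-known contention-free all-to-all schedules (cyclic shift and XOR pairing) under an assumed cost model in which a step takes as long as its longest message. The function names, the cost model, and the `MSG_LEN` matrix are all hypothetical and chosen for illustration.

```python
from typing import List, Tuple

Step = List[Tuple[int, int]]  # (sender, receiver) pairs active in one step


def cyclic_schedule(num_procs: int) -> List[Step]:
    """Step k: every processor i sends to (i + k) mod P."""
    return [
        [(src, (src + shift) % num_procs) for src in range(num_procs)]
        for shift in range(1, num_procs)
    ]


def xor_schedule(num_procs: int) -> List[Step]:
    """Step k: every processor i sends to i XOR k (P must be a power of two)."""
    return [
        [(src, src ^ mask) for src in range(num_procs)]
        for mask in range(1, num_procs)
    ]


def is_contention_free(step: Step) -> bool:
    """No processor sends twice or receives twice within the same step."""
    senders = [s for s, _ in step]
    receivers = [r for _, r in step]
    return len(set(senders)) == len(senders) and len(set(receivers)) == len(receivers)


def total_cost(schedule: List[Step], msg_len: List[List[int]]) -> int:
    """Assumed cost model: each step lasts as long as its longest message."""
    return sum(max(msg_len[s][r] for s, r in step) for step in schedule)


# Hypothetical message lengths: MSG_LEN[i][j] is the data sent from i to j.
MSG_LEN = [
    [0, 4, 1, 1],
    [1, 0, 4, 1],
    [1, 1, 0, 4],
    [4, 1, 1, 0],
]

if __name__ == "__main__":
    for name, sched in (("cyclic", cyclic_schedule(4)), ("xor", xor_schedule(4))):
        ok = all(is_contention_free(step) for step in sched)
        print(name, "contention-free:", ok, "cost:", total_cost(sched, MSG_LEN))
```

Both schedules are contention-free, yet under this cost model the cyclic schedule costs 6 units and the XOR schedule costs 9 for the same data, because the XOR schedule mixes long and short messages in two of its three steps and leaves the processors with short messages idle. Grouping messages of similar length into the same step is the kind of saving a length-aware scheduler aims for.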

Editor information

Robert Meersman, Zahir Tari

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Wang, J., Hu, C., Li, J. (2007). Contention-Free Communication Scheduling for Group Communication in Data Parallelism. In: Meersman, R., Tari, Z. (eds) On the Move to Meaningful Internet Systems 2007: CoopIS, DOA, ODBASE, GADA, and IS. OTM 2007. Lecture Notes in Computer Science, vol 4804. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76843-2_16

  • DOI: https://doi.org/10.1007/978-3-540-76843-2_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76835-7

  • Online ISBN: 978-3-540-76843-2

  • eBook Packages: Computer Science, Computer Science (R0)
