Parallel Prefix (Scan) Algorithms for MPI

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 4192)

Abstract

We describe and experimentally compare four theoretically well-known algorithms for the parallel prefix operation (scan, in MPI terms), and give a presumably novel, doubly-pipelined implementation of the in-order binary tree parallel prefix algorithm. Bidirectional interconnects can benefit from this implementation. We present results from a 32-node AMD cluster with Myrinet 2000 and a 72-node SX-8 parallel vector system. The doubly-pipelined algorithm is more than a factor of two faster than the straightforward binomial-tree algorithm found in many MPI implementations. However, due to its small constant factors, the simple linear pipeline algorithm is preferable for systems with a moderate number of processors. We also discuss adapting the algorithms to clusters of SMP nodes.
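To make the linear pipeline idea concrete, the following is a minimal sketch that simulates it sequentially in Python. It is not the authors' implementation and uses no MPI calls. The function name `linear_pipeline_scan`, the `block_size` parameter, and the round-counting model (one block per neighbor transfer per round) are assumptions made for illustration. Each simulated rank holds a vector, receives the running prefix block by block from its left neighbor, combines it with its own block, and forwards the result, so that different ranks work on different blocks in the same round.

```python
from operator import add


def linear_pipeline_scan(vectors, block_size, op=add):
    """Simulate an inclusive scan over ranks using a blocked linear pipeline.

    vectors[i] is the input vector of simulated rank i; all have equal length.
    Returns (result, num_rounds), where result[i] is the element-wise
    combination of vectors[0..i] and num_rounds is the number of
    communication rounds in this simplified model (an assumption, not a
    measured cost).
    """
    p = len(vectors)
    m = len(vectors[0])
    # Split the index range [0, m) into consecutive blocks.
    blocks = [(lo, min(lo + block_size, m)) for lo in range(0, m, block_size)]
    result = [list(v) for v in vectors]  # rank 0 already holds its own prefix

    num_rounds = len(blocks) + p - 2 if p > 1 else 0
    for t in range(num_rounds):
        for rank in range(1, p):
            b = t - (rank - 1)  # block that reaches this rank in round t
            if 0 <= b < len(blocks):
                lo, hi = blocks[b]
                for k in range(lo, hi):
                    # Left operand is the prefix from rank - 1, so the order
                    # is preserved for non-commutative (associative) ops.
                    result[rank][k] = op(result[rank - 1][k], vectors[rank][k])
    return result, num_rounds


if __name__ == "__main__":
    data = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
    prefix, rounds = linear_pipeline_scan(data, block_size=2)
    print(prefix)
    print(rounds)
```

In this model the number of rounds grows as the number of blocks plus the number of ranks, rather than as their product. That is the usual motivation for pipelining long vectors over a chain of processes.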




Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sanders, P., Träff, J.L. (2006). Parallel Prefix (Scan) Algorithms for MPI. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2006. Lecture Notes in Computer Science, vol 4192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11846802_15

  • DOI: https://doi.org/10.1007/11846802_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-39110-4

  • Online ISBN: 978-3-540-39112-8

  • eBook Packages: Computer Science, Computer Science (R0)
