Oblivious vs. Distribution-Based Sorting: An Experimental Evaluation

  • Conference paper
Algorithms – ESA 2005 (ESA 2005)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3669)

Abstract

We compare two algorithms for sorting out-of-core data on a distributed-memory cluster. One algorithm, Csort, is a 3-pass oblivious algorithm. The other, Dsort, makes two passes over the data and is based on the paradigm of distribution-based algorithms. In the context of out-of-core sorting, this study is the first comparison between the paradigms of distribution-based and oblivious algorithms. Dsort avoids two of the four steps of a typical distribution-based algorithm by making simplifying assumptions about the distribution of the input keys. Csort makes no assumptions about the keys. Despite the simplifying assumptions, the I/O and communication patterns of Dsort depend heavily on the exact sequence of input keys. Csort, on the other hand, takes advantage of predetermined I/O and communication patterns, governed entirely by the input size, in order to overlap computation, communication, and I/O. Experimental evidence shows that, even on inputs that followed Dsort’s simplifying assumptions, Csort fared well. The running time of Dsort showed great variation across five input cases, whereas Csort sorted all of them in approximately the same amount of time. In fact, Dsort ran significantly faster than Csort in just one out of the five input cases: the one that was the most unrealistically skewed in favor of Dsort. A more robust implementation of Dsort, one without the simplifying assumptions, would run even slower.
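The contrast between key-dependent and predetermined communication patterns can be illustrated with a toy sketch. It is not the code of Csort or Dsort; the function names, the evenly spaced splitters, and the round-robin deal are all assumptions chosen for illustration. In the distribution-based routine, the destination of a record is found by comparing its key against splitters, so the load on each destination varies with the input. In the oblivious routine, the destination depends only on the record's position and the input size, so every destination always receives the same number of records.

```python
import bisect


def distribution_destinations(keys, splitters):
    """Key-dependent routing: each record goes to the bucket whose
    splitter range contains its key (hypothetical helper)."""
    return [bisect.bisect_right(splitters, k) for k in keys]


def oblivious_destinations(n, p):
    """Key-independent routing: record i goes to destination i mod p,
    a fixed pattern determined only by n and p (hypothetical helper)."""
    return [i % p for i in range(n)]


def bucket_sizes(dests, p):
    """Count how many records each of the p destinations receives."""
    sizes = [0] * p
    for d in dests:
        sizes[d] += 1
    return sizes


if __name__ == "__main__":
    p = 4
    splitters = [0.25, 0.5, 0.75]      # assumes keys spread evenly over [0, 1)
    uniform = [0.1, 0.3, 0.6, 0.9, 0.2, 0.4, 0.7, 0.8]
    skewed = [0.1] * 8                 # every key falls below the first splitter

    # Distribution-based: the load per destination changes with the keys.
    print(bucket_sizes(distribution_destinations(uniform, splitters), p))
    print(bucket_sizes(distribution_destinations(skewed, splitters), p))

    # Oblivious: the load per destination is fixed by n and p alone.
    print(bucket_sizes(oblivious_destinations(len(skewed), p), p))
```

With the skewed input, all eight records land on a single destination under key-based routing, whereas the position-based pattern assigns two records to each of the four destinations whatever the keys are. A pattern that is known in advance is what allows buffers and transfers to be scheduled ahead of time, which is the property the abstract attributes to Csort.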




Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chaudhry, G., Cormen, T.H. (2005). Oblivious vs. Distribution-Based Sorting: An Experimental Evaluation. In: Brodal, G.S., Leonardi, S. (eds) Algorithms – ESA 2005. ESA 2005. Lecture Notes in Computer Science, vol 3669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11561071_30

  • DOI: https://doi.org/10.1007/11561071_30

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29118-3

  • Online ISBN: 978-3-540-31951-1

  • eBook Packages: Computer Science, Computer Science (R0)
