Abstract
We present a fast radix sorting algorithm that builds upon a microarchitecture-aware variant of counting sort. Taking advantage of virtual memory and making use of write-combining yields a per-pass throughput corresponding to at least 89% of the system’s peak memory bandwidth. Our implementation outperforms Intel’s recently published radix sort by a factor of 1.64. It also compares favorably to the reported performance of an algorithm for Fermi GPUs when data-transfer overhead is included. These results indicate that scalar, bandwidth-sensitive sorting algorithms remain competitive on current architectures. Various other memory-intensive applications can benefit from the techniques described herein.
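As a rough illustration of the kind of algorithm the abstract refers to, the following is a minimal single-threaded sketch of a least-significant-digit radix sort built from counting-sort passes. It is not the authors' implementation. The names `counting_sort_pass`, `radix_sort`, `kRadixBits` and `kBufferSize` are hypothetical, as are the 8-bit digit and the 64-byte staging buffers. The per-bucket buffers only imitate write-combining in software. A tuned version would presumably flush them with non-temporal stores. The virtual-memory technique and multi-core parallelism mentioned above are omitted entirely.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative parameters (assumptions, not values taken from the paper).
constexpr std::size_t kRadixBits = 8;
constexpr std::size_t kBuckets = std::size_t{1} << kRadixBits;
constexpr std::size_t kBufferSize = 16;  // 16 x 4-byte keys = 64 bytes

// One stable counting-sort pass on the digit selected by `shift`.
void counting_sort_pass(const std::vector<std::uint32_t>& in,
                        std::vector<std::uint32_t>& out, unsigned shift) {
    // 1. Histogram of digit values.
    std::array<std::size_t, kBuckets> count{};
    for (std::uint32_t key : in) ++count[(key >> shift) & (kBuckets - 1)];

    // 2. Exclusive prefix sum: first output index of each bucket.
    std::array<std::size_t, kBuckets> next{};
    std::size_t sum = 0;
    for (std::size_t b = 0; b < kBuckets; ++b) {
        next[b] = sum;
        sum += count[b];
    }

    // 3. Scatter via small per-bucket staging buffers, so that the output
    //    is written in bursts of kBufferSize keys instead of one at a time.
    std::vector<std::uint32_t> staged(kBuckets * kBufferSize);
    std::array<std::size_t, kBuckets> fill{};
    for (std::uint32_t key : in) {
        const std::size_t b = (key >> shift) & (kBuckets - 1);
        staged[b * kBufferSize + fill[b]++] = key;
        if (fill[b] == kBufferSize) {
            std::copy_n(staged.data() + b * kBufferSize, kBufferSize,
                        out.data() + next[b]);
            next[b] += kBufferSize;
            fill[b] = 0;
        }
    }

    // 4. Flush the partially filled buffers.
    for (std::size_t b = 0; b < kBuckets; ++b) {
        std::copy_n(staged.data() + b * kBufferSize, fill[b],
                    out.data() + next[b]);
    }
}

// Sorts 32-bit keys in four passes, least significant digit first.
void radix_sort(std::vector<std::uint32_t>& keys) {
    std::vector<std::uint32_t> tmp(keys.size());
    for (unsigned shift = 0; shift < 32; shift += kRadixBits) {
        counting_sort_pass(keys, tmp, shift);
        keys.swap(tmp);  // the output of this pass is the next pass's input
    }
}
```

Calling `radix_sort(v)` on a `std::vector<std::uint32_t>` sorts it in ascending order. Each pass reads the input twice and writes the output once, which is why the throughput of such a pass is naturally compared against memory bandwidth.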
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wassenberg, J., Sanders, P. (2011). Engineering a Multi-core Radix Sort. In: Jeannot, E., Namyst, R., Roman, J. (eds) Euro-Par 2011 Parallel Processing. Euro-Par 2011. Lecture Notes in Computer Science, vol 6853. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23397-5_16
DOI: https://doi.org/10.1007/978-3-642-23397-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23396-8
Online ISBN: 978-3-642-23397-5
eBook Packages: Computer Science, Computer Science (R0)