Engineering a Multi-core Radix Sort

  • Jan Wassenberg
  • Peter Sanders
Conference paper

DOI: 10.1007/978-3-642-23397-5_16

Part of the Lecture Notes in Computer Science book series (LNCS, volume 6853)
Cite this paper as:
Wassenberg J., Sanders P. (2011) Engineering a Multi-core Radix Sort. In: Jeannot E., Namyst R., Roman J. (eds) Euro-Par 2011 Parallel Processing. Euro-Par 2011. Lecture Notes in Computer Science, vol 6853. Springer, Berlin, Heidelberg


We present a fast radix sorting algorithm that builds upon a microarchitecture-aware variant of counting sort. Taking advantage of virtual memory and making use of write-combining yields a per-pass throughput corresponding to at least 89% of the system’s peak memory bandwidth. Our implementation outperforms Intel’s recently published radix sort by a factor of 1.64. It also compares favorably to the reported performance of an algorithm for Fermi GPUs when data-transfer overhead is included. These results indicate that scalar, bandwidth-sensitive sorting algorithms remain competitive on current architectures. Various other memory-intensive applications can benefit from the techniques described herein.
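As a rough illustration of the building block the abstract names, the following is a minimal sketch of a radix sort composed of stable counting-sort passes. It is a generic least-significant-digit formulation and not the authors' implementation: the function names, the 8-bit digit width, the 32-bit key width, and the pass order are assumptions made for this example, and it omits the virtual-memory and write-combining techniques the paper relies on for its throughput.

```python
def counting_sort_pass(keys, shift, radix_bits=8):
    """One stable counting-sort pass on the digit at bit offset `shift`.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    buckets = 1 << radix_bits
    mask = buckets - 1

    # Step 1: histogram of digit values.
    counts = [0] * buckets
    for k in keys:
        counts[(k >> shift) & mask] += 1

    # Step 2: exclusive prefix sum gives each bucket's first output index.
    offsets = [0] * buckets
    total = 0
    for d in range(buckets):
        offsets[d] = total
        total += counts[d]

    # Step 3: scatter keys to their buckets, preserving input order (stable).
    out = [0] * len(keys)
    for k in keys:
        d = (k >> shift) & mask
        out[offsets[d]] = k
        offsets[d] += 1
    return out


def radix_sort(keys, key_bits=32, radix_bits=8):
    """Sort non-negative integers below 2**key_bits, lowest digit first."""
    for shift in range(0, key_bits, radix_bits):
        keys = counting_sort_pass(keys, shift, radix_bits)
    return keys
```

Each pass reads every key twice (once for the histogram, once for the scatter) and writes it once, which is why the per-pass cost of such an algorithm is dominated by memory traffic rather than computation.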



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Jan Wassenberg (1)
  • Peter Sanders (2)
  1. Fraunhofer IOSB, Ettlingen, Germany
  2. Karlsruhe Institute of Technology, Karlsruhe, Germany
