Abstract
This chapter is dedicated to sorting networks with regular and easily scalable structures, which allow data to be sorted and a number of supplementary problems to be solved. Two core architectures are discussed: (1) an iterative architecture based on a highly parallel combinational sorting network with minimal propagation delay, and (2) a communication-time architecture that processes each data item as soon as it is received, thus minimizing the communication overhead that is frequently identified as the main bottleneck in system performance. The architectures are modeled in software (in Java) and implemented in FPGA. It is shown that sorting is a basis for many other data processing techniques, some of which have already been discussed in Chap. 3. Several new problems that are important for practical applications are highlighted, namely retrieving maximum and/or minimum sorted subsets, filtering (which makes it possible to extract a set of data with the desired characteristics), processing non-repeated items using the address-based technique already applied in the previous chapter, and traditional pipelining together with a newly introduced ring pipeline. The primary emphasis is on such important features as efficient pre-processing, uniformity of core components, rational combination of parallel, pipelined, and sequential computations, and regularity of the circuits and interconnections. Potential alternative solutions are demonstrated and discussed. Many examples are given and analyzed in full detail.
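As a rough illustration of the iterative architecture mentioned in the abstract, the following Java sketch models a small combinational network that is reused over several iterations until the data stop changing. The abstract does not specify the network topology, so the two-level even-odd transition layout used here is an assumption, and the names IterativeSortingNetwork, iterate, and sort are hypothetical. In hardware, all comparators of one level would operate in parallel; the loops below only model that behavior sequentially.

```java
import java.util.Arrays;

public class IterativeSortingNetwork {

    // One pass through the assumed two-level network:
    // level 0 compares pairs (0,1), (2,3), ...; level 1 compares (1,2), (3,4), ...
    // Returns true if at least one comparator swapped its inputs.
    static boolean iterate(int[] a) {
        boolean swapped = false;
        for (int start = 0; start <= 1; start++) {
            for (int i = start; i + 1 < a.length; i += 2) {
                if (a[i] > a[i + 1]) {          // comparator/swapper element
                    int t = a[i];
                    a[i] = a[i + 1];
                    a[i + 1] = t;
                    swapped = true;
                }
            }
        }
        return swapped;
    }

    // Feeds the network output back to its input until a pass makes no swaps.
    // Returns the number of passes that changed the data.
    static int sort(int[] a) {
        int iterations = 0;
        while (iterate(a)) {
            iterations++;
        }
        return iterations;
    }

    public static void main(String[] args) {
        int[] data = {7, 3, 9, 1, 4, 8, 2, 6};   // illustrative input only
        int iterations = sort(data);
        System.out.println(Arrays.toString(data) + " after " + iterations + " iterations");
    }
}
```

Stopping as soon as a pass produces no swaps means that nearly sorted data finish in fewer iterations than the worst case, which is one reason an iterative reuse of a shallow network can be attractive when circuit resources are limited.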
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Skliarova, I., Sklyarov, V. (2019). Hardware Accelerators for Data Sort. In: FPGA-BASED Hardware Accelerators. Lecture Notes in Electrical Engineering, vol 566. Springer, Cham. https://doi.org/10.1007/978-3-030-20721-2_4
DOI: https://doi.org/10.1007/978-3-030-20721-2_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20720-5
Online ISBN: 978-3-030-20721-2
eBook Packages: Engineering, Engineering (R0)