This paper addresses the parallel execution of algorithms with global and/or irregular data dependencies on a regularly and locally connected processor array. The associated communication problems are solved by a two-dimensional sorting algorithm. The proposed architecture, which is based on a two-dimensional sorting network, offers a high degree of flexibility and allows many irregularly structured algorithms to be mapped efficiently. In this architecture, a one-dimensional processor array performs all required control and arithmetic operations, whereas the sorter handles the complex data transfers. The storage capability of the sorting network also serves as memory for data elements. Algorithms for sparse matrix computations, the fast Fourier transform, and the convex hull problem are mapped onto this architecture, and a shared-memory computer is simulated on it; these examples show that the utilization of the most complex components, the processors, is O(1).
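The division of labor described above, with a sorter that moves data and a linear processor array that does the arithmetic, can be illustrated with a sparse matrix-vector product. The following is a minimal sketch and not the mapping used in the paper: the function names `sort_network` and `sparse_matvec`, the record layout, and the use of Python's built-in `sorted` as a stand-in for the two-dimensional sorting network are all assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def sort_network(records, key):
    """Hypothetical stand-in for the two-dimensional sorter (here: a plain sort)."""
    return sorted(records, key=key)


def sparse_matvec(nonzeros, x, n_rows):
    """Compute y = A x using only sorting for data movement.

    nonzeros: list of (row, col, value) triples of the sparse matrix A.
    x: dense input vector as a list.
    n_rows: number of rows of A.
    """
    # Routing step 1: tag vector entries (kind 0) and matrix entries (kind 1)
    # with their column index, then sort so that x[j] lands directly in
    # front of the nonzeros of column j.
    records = [(j, 0, None, xj) for j, xj in enumerate(x)]
    records += [(col, 1, row, val) for row, col, val in nonzeros]
    records = sort_network(records, key=lambda r: (r[0], r[1]))

    # Arithmetic step: one linear pass, as a processor array could do it.
    # Each vector entry is carried along its segment and multiplied in.
    products = []
    current_x = None
    for col, kind, row, val in records:
        if kind == 0:
            current_x = val
        else:
            products.append((row, val * current_x))

    # Routing step 2: sort the partial products by destination row,
    # then sum each contiguous segment.
    products = sort_network(products, key=itemgetter(0))
    y = [0.0] * n_rows
    for row, group in groupby(products, key=itemgetter(0)):
        y[row] = sum(p for _, p in group)
    return y


if __name__ == "__main__":
    # A = [[2, 0, 1], [0, 0, 3], [0, 4, 0]], x = [1, 2, 3]
    A = [(0, 0, 2.0), (0, 2, 1.0), (1, 2, 3.0), (2, 1, 4.0)]
    print(sparse_matvec(A, [1.0, 2.0, 3.0], 3))
```

In this sketch all irregular communication is expressed as two sorts, while the work between the sorts is a regular, local pass over neighboring records. That is the property the abstract attributes to the architecture; the cost and structure of the real sorting network are not modeled here.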
Cite this article
Krammer, J.G. A sorter-based architecture for a parallel implementation of communication intensive algorithms. J VLSI Sign Process Syst Sign Image Video Technol 3, 93–103 (1991). https://doi.org/10.1007/BF00927837
- Data Element
- Local Memory
- Connected Region
- Sorting Algorithm
- Processor Array