Abstract
This paper presents a parallel blocked algorithm for the algebraic path problem (APP). The APP is known to have the same complexity as classical matrix-matrix multiplication; however, solving it takes much longer because its data dependencies drastically limit data reuse. We examine a parallel implementation of a blocked algorithm for the APP on the single-chip Intrinsity FastMATH adaptive processor, which consists of a scalar MIPS processor extended with a SIMD matrix coprocessor. The matrix coprocessor supports native matrix instructions on an array of 4 × 4 processing elements. Using matrix instructions requires us to reformulate algorithms in terms of matrix-matrix operations. Conventional vectorization for SIMD vector processing deals with only the innermost loop; on the FastMATH processor, however, we need to vectorize two or three nested loops in order to convert them into one equivalent matrix operation. Our experimental results show a peak performance of 9.27 GOPS and high usage rates of matrix instructions for solving the APP. These results indicate that a SIMD matrix extension to a (super)scalar processor would be very useful for the fast solution of many matrix-formulated problems.
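The blocked scheme can be illustrated on one common instance of the APP, all-pairs shortest paths in the (min, +) semiring. The following is a minimal sketch of a generic three-phase tiled Floyd-Warshall computation, not the FastMATH implementation described in the paper; the function names, the default tile size of 4 (chosen only to echo the 4 × 4 processing-element array), and the sample graph are assumptions made for illustration.

```python
INF = float("inf")  # marks "no edge" in the (min, +) semiring


def floyd_warshall(dist):
    """Unblocked reference: the k loop carries the data dependency."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d


def update_tile(d, i0, j0, k0, b):
    """Relax tile (i0, j0) through the intermediate vertices of block k0."""
    n = len(d)
    for k in range(k0, min(k0 + b, n)):
        for i in range(i0, min(i0 + b, n)):
            for j in range(j0, min(j0 + b, n)):
                via = d[i][k] + d[k][j]
                if via < d[i][j]:
                    d[i][j] = via


def blocked_floyd_warshall(dist, b=4):
    """Three-phase tiled variant (hypothetical helper, b = tile size)."""
    n = len(dist)
    d = [row[:] for row in dist]
    starts = range(0, n, b)
    for k0 in starts:
        # Phase 1: the diagonal tile depends only on itself.
        update_tile(d, k0, k0, k0, b)
        # Phase 2: tiles sharing a block row or column with the diagonal tile.
        for t in starts:
            if t != k0:
                update_tile(d, k0, t, k0, b)
                update_tile(d, t, k0, k0, b)
        # Phase 3: all remaining tiles; these are mutually independent.
        for i0 in starts:
            for j0 in starts:
                if i0 != k0 and j0 != k0:
                    update_tile(d, i0, j0, k0, b)
    return d


# Small sample graph (illustrative data only).
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
print(blocked_floyd_warshall(graph, b=2) == floyd_warshall(graph))
```

In this sketch the phase 3 tile updates have the shape of a matrix multiply-accumulate in the (min, +) semiring, which is the kind of step that can be expressed as a single matrix operation; phases 1 and 2 keep the sequential dependency on k.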
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Takahashi, A., Sedukhin, S. (2005). Parallel Blocked Algorithm for Solving the Algebraic Path Problem on a Matrix Processor. In: Yang, L.T., Rana, O.F., Di Martino, B., Dongarra, J. (eds) High Performance Computing and Communications. HPCC 2005. Lecture Notes in Computer Science, vol 3726. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11557654_89
DOI: https://doi.org/10.1007/11557654_89
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29031-5
Online ISBN: 978-3-540-32079-1
eBook Packages: Computer Science, Computer Science (R0)