Cache Oblivious Sparse Matrix Multiplication

Dusefante, Matteo; Jacob, Riko

doi:10.1007/978-3-319-77404-6_32

Matteo Dusefante¹⁶ &
Riko Jacob¹⁶

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 10807))

Included in the following conference series:

Latin American Symposium on Theoretical Informatics

2759 Accesses
2 Citations

Abstract

We study the problem of sparse matrix multiplication in the Random Access Machine and in the Ideal Cache-Oblivious model. We present a simple algorithm that exploits randomization to compute the product of two sparse matrices with elements over an arbitrary field. Let \(A \in \mathbb {F}^{n \times n}\) and \(C \in \mathbb {F}^{n \times n}\) be matrices with h nonzero entries in total from a field \(\mathbb {F}\). In the RAM model, we are able to compute all the k nonzero entries of the product matrix \(AC \in \mathbb {F}^{n \times n}\) using \(\tilde{\mathcal {O}}(h + kn)\) time and \(\mathcal {O}(h)\) space, where the notation \(\tilde{\mathcal {O}}(\cdot )\) suppresses logarithmic factors. In the External Memory model, we are able to compute cache obliviously all the k nonzero entries of the product matrix \(AC \in \mathbb {F}^{n \times n}\) using \(\tilde{\mathcal {O}}(h/B + kn/B)\) I/Os and \(\mathcal {O}(h)\) space. In the Parallel External Memory model, we are able to compute all the k nonzero entries of the product matrix \(AC \in \mathbb {F}^{n \times n}\) using \(\tilde{\mathcal {O}}(h/PB + kn/PB)\) time and \(\mathcal {O}(h)\) space, which makes the analysis in the External Memory model a special case of Parallel External Memory for \(P=1\). The guarantees are given in terms of the size of the field and by bounding the size of \(\mathbb {F}\) as \({|}\mathbb {F}{|} > kn \log (n^2/k)\) we guarantee an error probability of at most \(1{\text{/ }}n\) for computing the matrix product.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
A cancellation occurs when \((AC)_{i,j} = 0\) while elementary products do not evaluate to zero, i.e. \(A_{i,\kappa } \cdot C_{\kappa ,j} \ne 0\), for some \(\kappa \in [n]\).
2.
Initializing \(\mathcal {A}\) and \(\mathcal {C}\) corresponds to computing prefix sums of each row and column vector of A and C respectively, which requires a linear scan of the input matrices.
3.
We do not consider the last layer, i.e. \(\log (n^2/k)\), as it does not involve any stochastic process.

References

Aggarwal, A., Vitter, J.S.: The input/output complexity of sorting and related problems. Commun. ACM 31(9), 1116–1127 (1988)
Article MathSciNet Google Scholar
Frigo, M., Leiserson, C.E., Prokop, H., Ramachandran, S.: Cache-oblivious Algorithms. In: 40th Annual Symposium on Foundations of Computer Science, pp. 285–297. IEEE (1999)
Google Scholar
Arge, L., Goodrich, M.T., Nelson, M., Sitchinava, N.: Fundamental parallel algorithms for private-cache chip multiprocessors. In: Proceedings of the 20th Annual Symposium on Parallelism in Algorithms and Architectures, SPAA 2008, pp. 197–206. ACM, New York (2008)
Google Scholar
Bender, M.A., Fineman, J.T., Gilbert, S., Kuszmaul, B.C.: Concurrent cache-oblivious B-trees. In: Proceedings of the 17th Annual ACM Symposium on Parallelism in Algorithms and Architectures, pp. 228–237. ACM (2005)
Google Scholar
Strassen, V.: Gaussian elimination is not optimal. Numer. Math. 13(4), 354–356 (1969)
Article MathSciNet MATH Google Scholar
Le Gall, F.: Powers of tensors and fast matrix multiplication. In: Proceedings of the 39th International Symposium on Symbolic and Algebraic Computation, pp. 296–303. ACM (2014)
Google Scholar
Yuster, R., Zwick, U.: Fast sparse matrix multiplication. ACM Trans. Algorithms (TALG) 1(1), 2–13 (2005)
Article MathSciNet MATH Google Scholar
Iwen, M.A., Spencer, C.V.: A note on compressed sensing and the complexity of matrix multiplication. Inf. Process. Lett. 109(10), 468–471 (2009)
Article MathSciNet MATH Google Scholar
Amossen, R.R., Pagh, R.: Faster join-projects and sparse matrix multiplications. In: Proceedings of the 12th International Conference on Database Theory, ICDT 2009, pp. 121–126. ACM, New York (2009)
Google Scholar
Lingas, A.: A fast output-sensitive algorithm for Boolean matrix multiplication. In: Fiat, A., Sanders, P. (eds.) ESA 2009. LNCS, vol. 5757, pp. 408–419. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-04128-0_37
Chapter Google Scholar
Pagh, R.: Compressed matrix multiplication. In: Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pp. 442–451. ACM (2012)
Google Scholar
Williams, R., Yu, H.: Finding orthogonal vectors in discrete structures. In: Proceedings of the 25th Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Philadelphia, PA, USA, pp. 1867–1877 (2014)
Google Scholar
Jacob, R., Stöckel, M.: Fast output-sensitive matrix multiplication. In: Bansal, N., Finocchi, I. (eds.) ESA 2015. LNCS, vol. 9294, pp. 766–778. Springer, Heidelberg (2015). https://doi.org/10.1007/978-3-662-48350-3_64
Chapter Google Scholar
Van Gucht, D., Williams, R., Woodruff, D.P., Zhang, Q.: The communication complexity of distributed set-joins with applications to matrix multiplication. In: Proceedings of the 34th ACM Symposium on Principles of Database Systems, PODS 2015, pp. 199–212. ACM, New York (2015)
Google Scholar
Hong, J.W., Kung, H.T.: I/O complexity: the red-blue pebble game. In: Proceedings of the 13th Annual ACM Symposium on Theory of Computing, STOC 1981, pp. 326–333. ACM, New York (1981)
Google Scholar
Pagh, R., Stöckel, M.: The input/output complexity of sparse matrix multiplication. In: Schulz, A.S., Wagner, D. (eds.) ESA 2014. LNCS, vol. 8737, pp. 750–761. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44777-2_62
Google Scholar
Chazelle, B., Guibas, L.J.: Fractional cascading: I. A data structuring technique. Algorithmica 1(1), 133–162 (1986)
Article MathSciNet MATH Google Scholar
Demaine, E.D., Gopal, V., Hasenplaugh, W.: Cache-oblivious iterated predecessor queries via range coalescing. In: Dehne, F., Sack, J.-R., Stege, U. (eds.) WADS 2015. LNCS, vol. 9214, pp. 249–262. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21840-3_21
Chapter Google Scholar
Brodal, G.S., Fagerberg, R.: Cache oblivious distribution sweeping. In: Widmayer, P., Eidenbenz, S., Triguero, F., Morales, R., Conejo, R., Hennessy, M. (eds.) ICALP 2002. LNCS, vol. 2380, pp. 426–438. Springer, Heidelberg (2002). https://doi.org/10.1007/3-540-45465-9_37
Chapter Google Scholar
Bender, M.A., Brodal, G.S., Fagerberg, R., Jacob, R., Vicari, E.: Optimal sparse matrix dense vector multiplication in the I/O-model. Theory Comput. Syst. 47(4), 934–962 (2010)
Article MathSciNet MATH Google Scholar

Download references

Author information

Authors and Affiliations

IT University of Copenhagen, Copenhagen, Denmark
Matteo Dusefante & Riko Jacob

Authors

Matteo Dusefante
View author publications
You can also search for this author in PubMed Google Scholar
Riko Jacob
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Matteo Dusefante .

Editor information

Editors and Affiliations

Stony Brook University, Stony Brook, New York, USA
Michael A. Bender
Rutgers University, New Brunswick, New Jersey, USA
Martín Farach-Colton
Pace University, New York, New York, USA
Miguel A. Mosteiro

Omitted Proofs

Proof

(Lemma 2 ). (i) If \(\langle a,c \rangle \ne 0\), then there exist \(i,j \in [n]\) such that \(u_i,v_j \ne 0\) and \(\langle a_i,c_j \rangle \ne 0\), hence, \((AC)_{i,j} \ne 0\). If there is a nonzero entry then \(\langle a,c \rangle \ne 0\) with probability at least \(1- 2/{|}\mathbb {F}{|} + 1/{|}\mathbb {F}{|}^2\). This is equivalent of saying that if there is a nonzero entry then \(\langle a,c \rangle = 0\) with probability at most \(2/{|}\mathbb {F}{|} - 1/{|}\mathbb {F}{|}^2\). Without loss of generality, let \(i_1=i_2=i\) and \(j_1=j_2=j\). Considering a bigger submatrix with exactly one nonzero entry leaves the probability unchanged, while considering more nonzero entries will only increase the probability of \(\langle a,c \rangle \ne 0\). Therefore, we consider the case where we want to isolate, with high probability, the location of a single nonzero entry in a submatrix of unit size. It follows that, in order to query the submatrix we have to perform the following inner product \(\langle a,c \rangle = \langle u_ia_i,v_jc_j \rangle \), where u, v are chosen uniformly at random from \(\mathbb {F}\). Since \(\langle a_i,c_j \rangle \ne 0\) by hypothesis, we have that \(\mathbf {Pr}(\langle a,c \rangle = 0) \ge 2/{|}\mathbb {F}{|} - 1/{|}\mathbb {F}{|}^2\).

(ii) If the submatrix of AC with indices \([i_1,i_2] \times [j_1,j_2]\) is all zero then \(\langle a,c \rangle = 0\) with probability at least \(1 - 2k\log (n^2/k)/{|}\mathbb {F}{|} + k\log (n^2/k)/{|}\mathbb {F}{|}^2\). By Lemma 1 this is true. If \(\langle a,c \rangle = 0\) then the submatrix of AC with indices \([i_1,i_2] \times [j_1,j_2]\) is all zero with probability at least \(1 - 2k\log (n^2/k)/{|}\mathbb {F}{|} + k\log (n^2/k)/{|}\mathbb {F}{|}^2\). That is, if \(\langle a,c \rangle = 0\) then the submatrix has a nonzero entry with probability at most \(2k\log (n^2/k)/{|}\mathbb {F}{|} - k\log (n^2/k))/{|}\mathbb {F}{|}^2\). Without loss of generality, let \(i_1=i_2=i\) and \(j_1=j_2=j\). We have that \(\langle a,c \rangle = \langle ua_i,vc_j \rangle = 0\), where u, v are chosen uniformly at random from \(\mathbb {F}\). Therefore, \(\mathbf {Pr}(\langle a_i,c_j \rangle \ne 0) \ge 2/{|}\mathbb {F}{|} - 1/{|}\mathbb {F}{|}^2\). The latter is a lower bound on the probability to not detect a nonzero entry in the output matrix. A union bound over the \(k \log (n^2/k)\) queries needed to isolate the k nonzero entries, gives us the probability to incur in at least one false negative. By considering its complement, the claim follows. \(\square \)

Proof

(Lemma 3 ). (i) If \(\langle a,c \rangle \ne 0\), then there exist \(i,j \in [m]\) such that \(u_i,v_j \ne 0\) and \(\langle a_i,c_j \rangle \ne 0\), hence, \((AC)_{i,j} \ne 0\). If there is a nonzero entry then \(\langle a,c \rangle \ne 0\) with probability at least \(1- 1/{|}\mathbb {F}^*{|}\). This is equivalent of saying that if there is a nonzero entry then \(\langle a,c \rangle = 0\) with probability at most \(1/{|}\mathbb {F}^*{|}\). If \(i_1=i_2=i\), \(j_1=j_2=j\) and \(\langle a_i,c_j \rangle \ne 0\) then \(\langle a,c \rangle \ne 0\) since scaling vectors with random elements from \(\mathbb {F}^*\) preserves non orthogonality. If \(i_1<i_2\) and \(j_1<j_2\) and the submatrix contains exactly one nonzero entry, the same reasoning applies. If the submatrix has \(\ell >1\) nonzero entries, then, without loss of generality, there exist \(a_1,\dots ,a_\ell ,c_1, \dots ,c_\ell \) such that \(\langle u_1a_1,v_1c_1 \rangle + \dots + \langle u_\ell a_\ell ,v_\ell c_\ell \rangle = 0\) and \(\langle a_i,c_j \rangle \ne 0\), for all \(i,j \in [\ell ]\). That is, \(\ell \) inner products that generate as many nonzero entries and produce a false negative when the submatrix is queried. By the linearity of the inner product, we have that \(\langle u_1a_1,v_1c_1 \rangle + \dots + \langle u_\ell a_\ell ,v_\ell c_\ell \rangle = u_1v_1 \langle a_1,c_1 \rangle + \dots + u_\ell v_\ell \langle a_\ell , c_\ell \rangle \). Hence, the sum cancels whenever \(u_i = - ( u_1v_1 \langle a_1,c_1 \rangle + \dots + u_\ell v_\ell \langle a_\ell , c_\ell \rangle )/ v_i \langle a_i,c_i \rangle \) for a generic \(i \in [\ell ]\). Note that, such a \(u_i\) is in \(\mathbb {F}\) since fields guarantee the existence of additive and multiplicative inverses. The probability to choose \(u_i\) such that it cancels the other inner products is the same as choosing an element from \(\mathbb {F}^*\) uniformly at random, i.e. \(1/{|}\mathbb {F}^*{|}\).

(ii) If the submatrix of AC with indices \([i_1,i_2] \times [j_1,j_2]\) is all zero then \(\langle a,c \rangle = 0\) with probability at least \(1 - k\log ((n^2/k) -1)/{|}\mathbb {F}^*{|}\). By Lemma 1 this is true. If \(\langle a,c \rangle = 0\) then the submatrix of AC with indices \([i_1,i_2] \times [j_1,j_2]\) is all zero with probability at least \(1 - k\log ((n^2/k) -1)/{|}\mathbb {F}^*{|}\). That is, if \(\langle a,c \rangle = 0\) then the submatrix of AC has a nonzero entry with probability at most \(k\log ((n^2/k) -1)/{|}\mathbb {F}^*{|}\). If \(\langle a,c \rangle = 0\) and \(i_1=i_2=i\), \(j_1=j_2=j\) then \(\langle a_i,c_j \rangle = 0\). The same reasoning applies for \(i_1<i_2\), \(j_1<j_2\) and exactly one nonzero entry in the submatrix. Let \(\langle a,c \rangle = 0\) and suppose there exist \(a_1,\dots ,a_\ell ,c_1, \dots ,c_\ell \) such that \(\langle u_1a_1,v_1c_1 \rangle + \dots + \langle u_\ell a_\ell ,v_\ell c_\ell \rangle = 0\) and \(\langle a_i,c_j \rangle \ne 0\), for all \(i,j \in [\ell ]\). Hence, as in (i), the sum cancels with probability \(1/{|}\mathbb {F}^*{|}\). The latter is a lower bound on the probability to not detect a nonzero entry in the output matrix. A union bound over the \( k\log ((n^2/k) -1)\) queries needed to isolate the k nonzero entries, gives us the probability to incur in at least one false negative.^{Footnote 3} By considering its complement, the claim follows. \(\square \)

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Dusefante, M., Jacob, R. (2018). Cache Oblivious Sparse Matrix Multiplication. In: Bender, M., Farach-Colton, M., Mosteiro, M. (eds) LATIN 2018: Theoretical Informatics. LATIN 2018. Lecture Notes in Computer Science(), vol 10807. Springer, Cham. https://doi.org/10.1007/978-3-319-77404-6_32

Download citation

DOI: https://doi.org/10.1007/978-3-319-77404-6_32
Published: 13 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77403-9
Online ISBN: 978-3-319-77404-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Cache Oblivious Sparse Matrix Multiplication

Abstract

Access this chapter

Notes

References

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Omitted Proofs

Omitted Proofs

Proof

Proof

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation