
Cache Oblivious Sparse Matrix Multiplication

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10807)

Abstract

We study the problem of sparse matrix multiplication in the Random Access Machine and in the Ideal Cache-Oblivious model. We present a simple algorithm that exploits randomization to compute the product of two sparse matrices with elements over an arbitrary field. Let \(A \in \mathbb {F}^{n \times n}\) and \(C \in \mathbb {F}^{n \times n}\) be matrices with h nonzero entries in total from a field \(\mathbb {F}\). In the RAM model, we compute all the k nonzero entries of the product matrix \(AC \in \mathbb {F}^{n \times n}\) using \(\tilde{\mathcal {O}}(h + kn)\) time and \(\mathcal {O}(h)\) space, where the notation \(\tilde{\mathcal {O}}(\cdot )\) suppresses logarithmic factors. In the External Memory model, we compute, cache-obliviously, all the k nonzero entries of \(AC\) using \(\tilde{\mathcal {O}}(h/B + kn/B)\) I/Os and \(\mathcal {O}(h)\) space. In the Parallel External Memory model, we compute all the k nonzero entries of \(AC\) using \(\tilde{\mathcal {O}}(h/PB + kn/PB)\) time and \(\mathcal {O}(h)\) space, which makes the analysis in the External Memory model a special case of Parallel External Memory with \(P=1\). The guarantees are given in terms of the size of the field: by bounding the size of \(\mathbb {F}\) as \({|}\mathbb {F}{|} > kn \log (n^2/k)\) we guarantee an error probability of at most \(1/n\) for computing the matrix product.


Notes

  1.

    A cancellation occurs when \((AC)_{i,j} = 0\) while elementary products do not evaluate to zero, i.e. \(A_{i,\kappa } \cdot C_{\kappa ,j} \ne 0\), for some \(\kappa \in [n]\).

  2.

    Initializing \(\mathcal {A}\) and \(\mathcal {C}\) corresponds to computing prefix sums of each row and column vector of A and C respectively, which requires a linear scan of the input matrices.

  3.

    We do not consider the last layer, i.e. \(\log (n^2/k)\), as it does not involve any stochastic process.
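The initialization described in footnote 2 can be illustrated with a minimal sketch (not the paper's code, and the function name is our own): computing, for each row of a matrix, the prefix sums of its entries, so that the sum over any contiguous index interval can later be read off as a difference of two prefix sums. The same linear scan applies column-wise to C.

```python
def row_prefix_sums(A):
    """Return P with P[i][j] = A[i][0] + ... + A[i][j-1] (and P[i][0] = 0).

    A single linear scan of each row suffices, matching the linear-scan
    cost claimed in footnote 2 for initializing the query structures.
    """
    P = []
    for row in A:
        acc, pref = 0, [0]
        for x in row:
            acc += x
            pref.append(acc)
        P.append(pref)
    return P

# The sum of the entries A[i][j1 .. j2-1] is then P[i][j2] - P[i][j1],
# which is what lets an interval of a scaled row vector be queried in O(1).
```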


Author information

Correspondence to Matteo Dusefante.


Omitted Proofs

Proof

(Lemma 2). (i) If \(\langle a,c \rangle \ne 0\), then there exist \(i,j \in [n]\) such that \(u_i,v_j \ne 0\) and \(\langle a_i,c_j \rangle \ne 0\), hence \((AC)_{i,j} \ne 0\). If there is a nonzero entry then \(\langle a,c \rangle \ne 0\) with probability at least \(1- 2/{|}\mathbb {F}{|} + 1/{|}\mathbb {F}{|}^2\). This is equivalent to saying that if there is a nonzero entry then \(\langle a,c \rangle = 0\) with probability at most \(2/{|}\mathbb {F}{|} - 1/{|}\mathbb {F}{|}^2\). Without loss of generality, let \(i_1=i_2=i\) and \(j_1=j_2=j\): considering a bigger submatrix with exactly one nonzero entry leaves the probability unchanged, while considering more nonzero entries only increases the probability of \(\langle a,c \rangle \ne 0\). Therefore, we consider the case where we want to isolate, with high probability, the location of a single nonzero entry in a submatrix of unit size. In order to query the submatrix we perform the inner product \(\langle a,c \rangle = \langle u_ia_i,v_jc_j \rangle \), where u, v are chosen uniformly at random from \(\mathbb {F}\). Since \(\langle a_i,c_j \rangle \ne 0\) by hypothesis, \(\langle a,c \rangle = 0\) only if \(u_i = 0\) or \(v_j = 0\), hence \(\mathbf {Pr}(\langle a,c \rangle = 0) \le 2/{|}\mathbb {F}{|} - 1/{|}\mathbb {F}{|}^2\).

(ii) If the submatrix of AC with indices \([i_1,i_2] \times [j_1,j_2]\) is all zero then \(\langle a,c \rangle = 0\) with probability at least \(1 - 2k\log (n^2/k)/{|}\mathbb {F}{|} + k\log (n^2/k)/{|}\mathbb {F}{|}^2\); this follows from Lemma 1. If \(\langle a,c \rangle = 0\) then the submatrix of AC with indices \([i_1,i_2] \times [j_1,j_2]\) is all zero with probability at least \(1 - 2k\log (n^2/k)/{|}\mathbb {F}{|} + k\log (n^2/k)/{|}\mathbb {F}{|}^2\). That is, if \(\langle a,c \rangle = 0\) then the submatrix has a nonzero entry with probability at most \(2k\log (n^2/k)/{|}\mathbb {F}{|} - k\log (n^2/k)/{|}\mathbb {F}{|}^2\). Without loss of generality, let \(i_1=i_2=i\) and \(j_1=j_2=j\). We have that \(\langle a,c \rangle = \langle ua_i,vc_j \rangle = 0\), where u, v are chosen uniformly at random from \(\mathbb {F}\). Therefore, \(\mathbf {Pr}(\langle a_i,c_j \rangle \ne 0) \le 2/{|}\mathbb {F}{|} - 1/{|}\mathbb {F}{|}^2\). The latter is an upper bound on the probability of not detecting a nonzero entry in the output matrix. A union bound over the \(k \log (n^2/k)\) queries needed to isolate the k nonzero entries gives the probability of incurring at least one false negative. By considering its complement, the claim follows.   \(\square \)
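To make the randomized query concrete, here is a minimal Python sketch (ours, not the paper's cache-oblivious implementation, and with an arbitrarily chosen prime modulus standing in for the field \(\mathbb {F}\)): rows of A in the query range are scaled by uniform random field elements u, columns of C by v, and the inner product \(\langle a,c \rangle\) is nonzero only if the queried submatrix of AC contains a nonzero entry, with false negatives occurring with probability roughly \(2/{|}\mathbb {F}{|}\).

```python
import random

P = 2_147_483_647  # illustrative prime; arithmetic is over the field F_P


def submatrix_is_probably_nonzero(A, C, i1, i2, j1, j2, p=P):
    """Randomized query in the spirit of Lemma 2: report whether the
    submatrix [i1, i2] x [j1, j2] of AC has a nonzero entry.

    Never reports a false positive; a false negative requires the random
    scalars to cancel, which happens with probability O(1/p)."""
    n = len(A)
    u = [random.randrange(p) for _ in range(i2 - i1 + 1)]
    v = [random.randrange(p) for _ in range(j2 - j1 + 1)]
    # a = sum over queried rows i of u_i * A[i, :]
    a = [sum(u[i - i1] * A[i][k] for i in range(i1, i2 + 1)) % p for k in range(n)]
    # c = sum over queried columns j of v_j * C[:, j]
    c = [sum(v[j - j1] * C[k][j] for j in range(j1, j2 + 1)) % p for k in range(n)]
    # <a, c> = sum_{i,j} u_i * v_j * (AC)_{i,j} over the queried ranges
    return sum(ak * ck for ak, ck in zip(a, c)) % p != 0
```

Note that \(\langle a,c \rangle = \sum _{i,j} u_iv_j(AC)_{i,j}\) over the queried index ranges, so a zero submatrix always answers "zero"; only the converse direction is probabilistic.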

Proof

(Lemma 3). (i) If \(\langle a,c \rangle \ne 0\), then there exist \(i,j \in [m]\) such that \(u_i,v_j \ne 0\) and \(\langle a_i,c_j \rangle \ne 0\), hence \((AC)_{i,j} \ne 0\). If there is a nonzero entry then \(\langle a,c \rangle \ne 0\) with probability at least \(1- 1/{|}\mathbb {F}^*{|}\). This is equivalent to saying that if there is a nonzero entry then \(\langle a,c \rangle = 0\) with probability at most \(1/{|}\mathbb {F}^*{|}\). If \(i_1=i_2=i\), \(j_1=j_2=j\) and \(\langle a_i,c_j \rangle \ne 0\) then \(\langle a,c \rangle \ne 0\), since scaling vectors with random elements from \(\mathbb {F}^*\) preserves non-orthogonality. If \(i_1<i_2\) and \(j_1<j_2\) and the submatrix contains exactly one nonzero entry, the same reasoning applies. If the submatrix has \(\ell >1\) nonzero entries, then, without loss of generality, there exist \(a_1,\dots ,a_\ell ,c_1, \dots ,c_\ell \) such that \(\langle u_1a_1,v_1c_1 \rangle + \dots + \langle u_\ell a_\ell ,v_\ell c_\ell \rangle = 0\) and \(\langle a_i,c_j \rangle \ne 0\), for all \(i,j \in [\ell ]\), i.e. \(\ell \) inner products that generate as many nonzero entries and produce a false negative when the submatrix is queried. By the linearity of the inner product, \(\langle u_1a_1,v_1c_1 \rangle + \dots + \langle u_\ell a_\ell ,v_\ell c_\ell \rangle = u_1v_1 \langle a_1,c_1 \rangle + \dots + u_\ell v_\ell \langle a_\ell , c_\ell \rangle \). Hence, the sum cancels whenever \(u_i = - \big ( \sum _{j \ne i} u_jv_j \langle a_j,c_j \rangle \big )/ v_i \langle a_i,c_i \rangle \) for a generic \(i \in [\ell ]\). Note that such a \(u_i\) is in \(\mathbb {F}\), since fields guarantee the existence of additive and multiplicative inverses. The probability of choosing a \(u_i\) that cancels the remaining inner products is the same as that of choosing a prescribed element from \(\mathbb {F}^*\) uniformly at random, i.e. \(1/{|}\mathbb {F}^*{|}\).

(ii) If the submatrix of AC with indices \([i_1,i_2] \times [j_1,j_2]\) is all zero then \(\langle a,c \rangle = 0\) with probability at least \(1 - k(\log (n^2/k) -1)/{|}\mathbb {F}^*{|}\); this follows from Lemma 1. If \(\langle a,c \rangle = 0\) then the submatrix of AC with indices \([i_1,i_2] \times [j_1,j_2]\) is all zero with probability at least \(1 - k(\log (n^2/k) -1)/{|}\mathbb {F}^*{|}\). That is, if \(\langle a,c \rangle = 0\) then the submatrix of AC has a nonzero entry with probability at most \(k(\log (n^2/k) -1)/{|}\mathbb {F}^*{|}\). If \(\langle a,c \rangle = 0\) and \(i_1=i_2=i\), \(j_1=j_2=j\) then \(\langle a_i,c_j \rangle = 0\). The same reasoning applies for \(i_1<i_2\), \(j_1<j_2\) and exactly one nonzero entry in the submatrix. Let \(\langle a,c \rangle = 0\) and suppose there exist \(a_1,\dots ,a_\ell ,c_1, \dots ,c_\ell \) such that \(\langle u_1a_1,v_1c_1 \rangle + \dots + \langle u_\ell a_\ell ,v_\ell c_\ell \rangle = 0\) and \(\langle a_i,c_j \rangle \ne 0\), for all \(i,j \in [\ell ]\). Hence, as in (i), the sum cancels with probability \(1/{|}\mathbb {F}^*{|}\). The latter is an upper bound on the probability of not detecting a nonzero entry in the output matrix. A union bound over the \(k(\log (n^2/k) -1)\) queries needed to isolate the k nonzero entries gives the probability of incurring at least one false negative (Footnote 3). By considering its complement, the claim follows.   \(\square \)
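The way repeated queries isolate the k nonzero entries can be sketched as a recursive search over index rectangles: probe a rectangle with the random-scaling test, discard it if the answer is zero, and otherwise halve it and recurse, reporting unit rectangles. This is our own minimal illustration (prime modulus and function names are assumptions, and it ignores the cache-oblivious layout the paper actually uses), but it mirrors the layered structure counted in the union bound above.

```python
import random

P = 2_147_483_647  # illustrative prime; arithmetic is over the field F_P


def probe(A, C, i1, i2, j1, j2, p=P):
    """Random-scaling query: nonzero (w.h.p.) iff the submatrix
    [i1, i2] x [j1, j2] of AC contains a nonzero entry."""
    n = len(A)
    u = [random.randrange(p) for _ in range(i2 - i1 + 1)]
    v = [random.randrange(p) for _ in range(j2 - j1 + 1)]
    a = [sum(u[i - i1] * A[i][k] for i in range(i1, i2 + 1)) % p for k in range(n)]
    c = [sum(v[j - j1] * C[k][j] for j in range(j1, j2 + 1)) % p for k in range(n)]
    return sum(x * y for x, y in zip(a, c)) % p != 0


def locate_nonzeros(A, C, i1, i2, j1, j2):
    """Recursively halve the larger side of the rectangle, keeping only
    the halves that probe as nonzero; unit rectangles are reported."""
    if not probe(A, C, i1, i2, j1, j2):
        return []
    if i1 == i2 and j1 == j2:
        return [(i1, j1)]
    if i2 - i1 >= j2 - j1:
        m = (i1 + i2) // 2
        return (locate_nonzeros(A, C, i1, m, j1, j2)
                + locate_nonzeros(A, C, m + 1, i2, j1, j2))
    m = (j1 + j2) // 2
    return (locate_nonzeros(A, C, i1, i2, j1, m)
            + locate_nonzeros(A, C, i1, i2, m + 1, j2))
```

Since a probe never reports a false positive, every reported position is a genuine nonzero of AC; only missed entries (false negatives) are probabilistic, which is exactly the event bounded in the lemma.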


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Dusefante, M., Jacob, R. (2018). Cache Oblivious Sparse Matrix Multiplication. In: Bender, M., Farach-Colton, M., Mosteiro, M. (eds.) LATIN 2018: Theoretical Informatics. Lecture Notes in Computer Science, vol. 10807. Springer, Cham. https://doi.org/10.1007/978-3-319-77404-6_32

  • Print ISBN: 978-3-319-77403-9

  • Online ISBN: 978-3-319-77404-6
