Adaptive Sampling and Fast Low-Rank Matrix Approximation

Deshpande, Amit; Vempala, Santosh

doi:10.1007/11830924_28

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4110)

Included in the following conference series:

International Workshop on Approximation Algorithms for Combinatorial Optimization
International Workshop on Randomization and Approximation Techniques in Computer Science

Abstract

We prove that any real matrix A contains a subset of at most 4k/ε + 2k log(k+1) rows whose span “contains” a matrix of rank at most k with error only (1+ε) times the error of the best rank-k approximation of A. We complement it with an almost matching lower bound by constructing matrices where the span of any k/(2ε) rows does not “contain” a relative (1+ε)-approximation of rank k. Our existence result leads to an algorithm that finds such a rank-k approximation in time

\( O \left( M \left( \frac{k}{\epsilon} + k^{2} \log k \right) + (m+n) \left( \frac{k^{2}}{\epsilon^{2}} + \frac{k^{3} \log k}{\epsilon} + k^{4} \log^{2} k \right) \right), \)

i.e., essentially O(Mk/ε), where M is the number of nonzero entries of A. The algorithm maintains sparsity, and in the streaming model [12,14,15], it can be implemented using only 2(k+1)(log(k+1)+1) passes over the input matrix and \(O \left( \min \{ m, n \} (\frac{k}{\epsilon} + k^{2} \log k) \right)\) additional space. Previous algorithms for low-rank approximation use only one or two passes but obtain an additive approximation.
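The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of adaptive row sampling as it is commonly described in the related literature (for example reference 11), not the paper's exact algorithm. In each round, rows are drawn with probability proportional to the squared norm of their residual after projection onto the span of the rows chosen so far. The function names, the `rounds` and `per_round` parameters, and the numerical tolerances are assumptions made for illustration; the paper's sample sizes and its volume-sampling step are not reproduced here.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def residual(row, basis):
    """Part of `row` orthogonal to the span of the orthonormal `basis`
    (modified Gram-Schmidt)."""
    r = [float(x) for x in row]
    for q in basis:
        c = dot(r, q)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return r


def adaptive_row_sample(A, rounds, per_round, seed=0):
    """Hypothetical helper: pick rows of A over several adaptive rounds.

    Returns the indices of rows that extended the span, together with an
    orthonormal basis of the span of those rows.
    """
    rng = random.Random(seed)
    chosen, basis = [], []
    for _ in range(rounds):
        # Squared residual norms against the current span are the weights.
        weights = [dot(r, r) for r in (residual(row, basis) for row in A)]
        if sum(weights) <= 1e-12:  # assumed tolerance: span already captures A
            break
        for i in rng.choices(range(len(A)), weights=weights, k=per_round):
            r = residual(A[i], basis)  # recompute: the basis may have grown
            norm = dot(r, r) ** 0.5
            if norm > 1e-9:  # assumed tolerance: row adds a new direction
                basis.append([x / norm for x in r])
                chosen.append(i)
    return chosen, basis


def residual_error(A, basis):
    """Squared Frobenius error of projecting every row of A onto the span."""
    return sum(dot(r, r) for r in (residual(row, basis) for row in A))


# Small rank-2 example: every row is a combination of (1,0,2,0) and (0,1,0,3).
A = [[1, 0, 2, 0], [0, 1, 0, 3], [1, 1, 2, 3], [2, -1, 4, -3], [3, 2, 6, 6]]
chosen, basis = adaptive_row_sample(A, rounds=3, per_round=2, seed=1)
print(chosen, residual_error(A, basis))
```

Each round needs the residual norms of all rows with respect to the rows picked so far, which is one way to see why a multi-round scheme of this kind uses several passes over the input rather than one or two.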

References

1. Arora, S., Hazan, E., Kale, S.: A Fast Random Sampling Algorithm for Sparsifying Matrices. In: Díaz, J., Jansen, K., Rolim, J.D.P., Zwick, U. (eds.) APPROX 2006 and RANDOM 2006. LNCS, vol. 4110, pp. 272–279. Springer, Heidelberg (2006)
2. Achlioptas, D., McSherry, F.: Fast Computation of Low Rank Approximations. In: Proceedings of the 33rd Annual Symposium on Theory of Computing (2001)
3. Aggarwal, C., Procopiuc, C., Wolf, J., Yu, P., Park, J.: Fast Algorithms for Projected Clustering. In: Proceedings of SIGMOD (1999)
4. Bar-Yossef, Z.: Sampling Lower Bounds via Information Theory. In: Proceedings of the 35th Annual Symposium on Theory of Computing (2003)
5. de la Vega, W.F., Karpinski, M., Kenyon, C., Rabani, Y.: Approximation schemes for clustering problems. In: Proceedings of the 35th Annual ACM Symposium on Theory of Computing (2003)
6. Drineas, P.: Personal communication (2006)
7. Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering in large graphs and matrices. In: Proceedings of the 10th SODA (1999)
8. Drineas, P., Kannan, R.: Pass Efficient Algorithm for approximating large matrices. In: Proceedings of 14th SODA (2003)
9. Drineas, P., Kannan, R., Mahoney, M.: Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix. Yale University Technical Report, YALEU/DCS/TR-1270 (2004)
10. Drineas, P., Mahoney, M., Muthukrishnan, S.: Polynomial time algorithm for column-row based relative error low-rank matrix approximation. DIMACS Technical Report 2006-04 (2006)
11. Deshpande, A., Rademacher, L., Vempala, S., Wang, G.: Matrix Approximation and Projective Clustering via Volume Sampling. In: Proceedings of the 17th ACM-SIAM Symposium on Discrete Algorithms (SODA) (2006)
12. Feigenbaum, J., Kannan, S., McGregor, A., Suri, S., Zhang, J.: On Graph Problems in a Semi-Streaming Model. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142. Springer, Heidelberg (2004)
13. Frieze, A., Kannan, R., Vempala, S.: Fast Monte-Carlo algorithms for finding low-rank approximations. Journal of the ACM 51(6), 1025–1041 (2004)
14. Guha, S., Koudas, N., Shim, K.: Data-streams and histograms. In: Proceedings of 33rd ACM Symposium on Theory of Computing (2001)
15. Henzinger, M., Raghavan, P., Rajagopalan, S.: Computing on Data Streams. Technical Note 1998-011, Digital Systems Research Center, Palo Alto, CA (May 1998)
16. Matoušek, J.: On approximate geometric k-clustering. Discrete and Computational Geometry, 61–84 (2000)

Author information

Authors and Affiliations

Mathematics Department and CSAIL, MIT
Amit Deshpande & Santosh Vempala

Editor information

Editors and Affiliations

Departament de Llenguatges i Sistemes Informatics, Universitat Politecnica de Catalunya, Campus Nord - Ed. Omega, 240 Jordi Girona Salgado, 1-3 E-08034, Barcelona
Josep Díaz
Institute for Computer Science, University of Kiel, Olshausenstrasse 40, 24118, Kiel, Germany
Klaus Jansen
Centre Universitaire d’Informatique, Battelle Bâtiment A, Route de Drize 7, 1227, Carouge, Geneva, Switzerland
José D. P. Rolim
School of Computer Science, Tel Aviv University, 69978, Tel Aviv, Israel
Uri Zwick

Cite this paper

Deshpande, A., Vempala, S. (2006). Adaptive Sampling and Fast Low-Rank Matrix Approximation. In: Díaz, J., Jansen, K., Rolim, J.D.P., Zwick, U. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. APPROX 2006, RANDOM 2006. Lecture Notes in Computer Science, vol 4110. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11830924_28

DOI: https://doi.org/10.1007/11830924_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38044-3
Online ISBN: 978-3-540-38045-0
eBook Packages: Computer Science, Computer Science (R0)
