Abstract
We prove that any real matrix A contains a subset of at most 4k/ε + 2k log(k+1) rows whose span “contains” a matrix of rank at most k with error only (1+ε) times the error of the best rank-k approximation of A. We complement this with an almost matching lower bound by constructing matrices for which the span of any k/2ε rows does not “contain” a relative (1+ε)-approximation of rank k. Our existence result leads to an algorithm that finds such a rank-k approximation in time
\( O \left( M \left( \frac{k}{\epsilon} + k^{2} \log k \right) + (m+n) \left( \frac{k^{2}}{\epsilon^{2}} + \frac{k^{3} \log k}{\epsilon} + k^{4} \log^{2} k \right) \right), \)
i.e., essentially O(Mk/ε), where M is the number of nonzero entries of A. The algorithm maintains sparsity, and in the streaming model [12,14,15], it can be implemented using only 2(k+1)(log(k+1)+1) passes over the input matrix and \(O \left( \min \{ m, n \} (\frac{k}{\epsilon} + k^{2} \log k) \right)\) additional space. Previous algorithms for low-rank approximation use only one or two passes but obtain an additive approximation.
References
1. Arora, S., Hazan, E., Kale, S.: A Fast Random Sampling Algorithm for Sparsifying Matrices. In: Díaz, J., Jansen, K., Rolim, J.D.P., Zwick, U. (eds.) APPROX 2006 and RANDOM 2006. LNCS, vol. 4110, pp. 272–279. Springer, Heidelberg (2006)
2. Achlioptas, D., McSherry, F.: Fast Computation of Low Rank Approximations. In: Proceedings of the 33rd Annual Symposium on Theory of Computing (2001)
3. Aggarwal, C., Procopiuc, C., Wolf, J., Yu, P., Park, J.: Fast Algorithms for Projected Clustering. In: Proceedings of SIGMOD (1999)
4. Bar-Yossef, Z.: Sampling Lower Bounds via Information Theory. In: Proceedings of the 35th Annual Symposium on Theory of Computing (2003)
5. de la Vega, W.F., Karpinski, M., Kenyon, C., Rabani, Y.: Approximation schemes for clustering problems. In: Proceedings of the 35th Annual ACM Symposium on Theory of Computing (2003)
6. Drineas, P.: Personal communication (2006)
7. Drineas, P., Frieze, A., Kannan, R., Vempala, S., Vinay, V.: Clustering in large graphs and matrices. In: Proceedings of the 10th SODA (1999)
8. Drineas, P., Kannan, R.: Pass Efficient Algorithm for approximating large matrices. In: Proceedings of 14th SODA (2003)
9. Drineas, P., Kannan, R., Mahoney, M.: Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix. Yale University Technical Report, YALEU/DCS/TR-1270 (2004)
10. Drineas, P., Mahoney, M., Muthukrishnan, S.: Polynomial time algorithm for column-row based relative error low-rank matrix approximation. DIMACS Technical Report 2006-04 (2006)
11. Deshpande, A., Rademacher, L., Vempala, S., Wang, G.: Matrix Approximation and Projective Clustering via Volume Sampling. In: Proceedings of the 17th ACM-SIAM Symposium on Discrete Algorithms (SODA) (2006)
12. Feigenbaum, J., Kannan, S., McGregor, A., Suri, S., Zhang, J.: On Graph Problems in a Semi-Streaming Model. In: Díaz, J., Karhumäki, J., Lepistö, A., Sannella, D. (eds.) ICALP 2004. LNCS, vol. 3142. Springer, Heidelberg (2004)
13. Frieze, A., Kannan, R., Vempala, S.: Fast Monte-Carlo algorithms for finding low-rank approximations. Journal of the ACM 51(6), 1025–1041 (2004)
14. Guha, S., Koudas, N., Shim, K.: Data-streams and histograms. In: Proceedings of 33rd ACM Symposium on Theory of Computing (2001)
15. Henzinger, M., Raghavan, P., Rajagopalan, S.: Computing on Data Streams. Technical Note 1998-011, Digital Systems Research Center, Palo Alto, CA (May 1998)
16. Matoušek, J.: On approximate geometric k-clustering. Discrete and Computational Geometry, 61–84 (2000)
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Deshpande, A., Vempala, S. (2006). Adaptive Sampling and Fast Low-Rank Matrix Approximation. In: Díaz, J., Jansen, K., Rolim, J.D.P., Zwick, U. (eds) Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques. APPROX 2006, RANDOM 2006. Lecture Notes in Computer Science, vol 4110. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11830924_28
DOI: https://doi.org/10.1007/11830924_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-38044-3
Online ISBN: 978-3-540-38045-0
eBook Packages: Computer Science, Computer Science (R0)