Abstract
Extracting a well-conditioned submatrix from a rectangular matrix (with, e.g., normalized columns) has been the subject of extensive research, with applications to machine learning (rank-revealing factorization, sparse solutions to least-squares regression problems, clustering, \(\cdots \)) and optimisation (low-stretch spanning trees, \(\cdots \)). It is also connected with problems in functional and harmonic analysis (the Bourgain-Tzafriri restricted invertibility problem).
In this paper, we provide a deterministic algorithm which extracts a submatrix \(X_S\) from any matrix \(X\) with guaranteed individual lower and upper bounds on each singular value of \(X_S\). As an immediate side result, we also deduce a slightly weaker version (up to a \(\log \) factor) of the Bourgain-Tzafriri theorem.
We end the paper with a description of how our method applies to the analysis of a large data set and how its numerical efficiency compares with the method of Spielman and Srivastava.
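To make the quantity discussed in the abstract concrete, the following minimal sketch computes the singular values of a column submatrix \(X_S\) and checks them against a lower and an upper bound. It is not the paper's algorithm: the function names, the example matrix, and the bounds 0.9 and 1.1 are all hypothetical, chosen only for illustration, and the columns are assumed to be nonzero.

```python
import numpy as np

def normalize_columns(X):
    """Scale each column of X to unit Euclidean norm (assumes no zero column)."""
    norms = np.linalg.norm(X, axis=0)
    return X / norms

def submatrix_singular_values(X, S):
    """Return the singular values (in descending order) of the column submatrix X_S."""
    return np.linalg.svd(X[:, S], compute_uv=False)

def is_well_conditioned(X, S, lower, upper):
    """Check that every singular value of X_S lies in [lower, upper]."""
    s = submatrix_singular_values(X, S)
    return bool(s.min() >= lower and s.max() <= upper)

# Hypothetical example: two orthogonal columns plus a near-duplicate of the first.
X = normalize_columns(np.array([[1.0, 0.0, 1.0],
                                [0.0, 1.0, 0.01],
                                [0.0, 0.0, 0.0]]))

good = is_well_conditioned(X, [0, 1], 0.9, 1.1)  # orthonormal pair
bad = is_well_conditioned(X, [0, 2], 0.9, 1.1)   # nearly parallel pair
```

In this toy case the index set `[0, 1]` selects orthonormal columns, so both singular values equal 1, whereas `[0, 2]` selects nearly parallel columns and the smallest singular value is close to zero.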
Keywords
- Bourgain-Tzafriri theorem
- Restricted invertibility
- Column selection problems
References
Arthur, D., Vassilvitskii, S.: k-means++: the advantages of careful seeding. In: Proceedings of the Eighteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035. Society for Industrial and Applied Mathematics (2007)
Avron, H., Boutsidis, C.: Faster subset selection for matrices and applications. SIAM J. Matrix Anal. Appl. 34(4), 1464–1499 (2013)
Bourgain, J., Tzafriri, L.: Invertibility of “large” submatrices with applications to the geometry of Banach spaces and harmonic analysis. Israel J. Math. 57(2), 137–224 (1987)
Boutsidis, C., Drineas, P., Magdon-Ismail, M.: Near-optimal column-based matrix reconstruction. SIAM J. Comput. 43(2), 687–717 (2014)
Boutsidis, C., Drineas, P., Mahoney, M.: On selecting exactly k columns from a matrix (2008, in press)
d’Aspremont, A., Ghaoui, L.E., Jordan, M.I., Lanckriet, G.R.: A direct formulation for sparse PCA using semidefinite programming. In: Advances in Neural Information Processing Systems, pp. 41–48 (2005)
Farahat, A.K., Elgohary, A., Ghodsi, A., Kamel, M.S.: Greedy column subset selection for large-scale data sets. Knowl. Inform. Syst. 45(1), 1–34 (2015)
Mallat, S.: Group invariant scattering. Commun. Pure Appl. Math. 65(10), 1331–1398 (2012)
Bruna, J., Mallat, S.: Invariant scattering convolution networks. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1872–1886 (2013)
Naor, A.: Sparse quadratic forms and their geometric applications [following Batson, Spielman and Srivastava]. Séminaire Bourbaki: Vol. 2010/2011. Exposés 1027–1042. Astérisque No. 348 (2012), Exp. No. 1033, viii, 189–217
Nikolov, A.: Randomized rounding for the largest simplex problem. In: Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pp. 861–870 (2015)
Nikolov, A., Singh, M.: Maximizing determinants under partition constraints. In: STOC 2016, pp. 192–201 (2016)
Spielman, D.A., Srivastava, N.: An elementary proof of the restricted invertibility theorem. Israel J. Math. 190, 83–91 (2012)
Tropp, J.A.: The random paving property for uniformly bounded matrices. Studia Math. 185(1), 67–82 (2008)
Tropp, J.A.: Norms of random submatrices and sparse approximation. C. R. Acad. Sci. Paris, Ser. I 346, 1271–1274 (2008)
Tropp, J.A.: Column subset selection, matrix factorization, and eigenvalue optimization. In: Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 978–986. Society for Industrial and Applied Mathematics (2009)
Vershynin, R.: John’s decompositions: selecting a large part. Israel J. Math. 122, 253–277 (2001)
Youssef, P.: A note on column subset selection. Int. Math. Res. Not. IMRN 23, 6431–6447 (2014)
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Chrétien, S., Darses, S. (2018). An Elementary Approach to the Problem of Column Selection in a Rectangular Matrix. In: Nicosia, G., Pardalos, P., Giuffrida, G., Umeton, R. (eds) Machine Learning, Optimization, and Big Data. MOD 2017. Lecture Notes in Computer Science, vol 10710. Springer, Cham. https://doi.org/10.1007/978-3-319-72926-8_20
DOI: https://doi.org/10.1007/978-3-319-72926-8_20
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72925-1
Online ISBN: 978-3-319-72926-8
eBook Packages: Computer Science (R0)