An Elementary Approach to the Problem of Column Selection in a Rectangular Matrix
The problem of extracting a well-conditioned submatrix from any rectangular matrix (e.g. with normalized columns) has been the subject of extensive research, with applications to machine learning (rank-revealing factorization, sparse solutions to least squares regression problems, clustering, \(\ldots\)) and optimisation (low-stretch spanning trees, \(\ldots\)). It is also connected with problems in functional and harmonic analysis (the Bourgain-Tzafriri restricted invertibility problem).
In this paper, we provide a deterministic algorithm which extracts a submatrix \(X_S\) from any matrix \(X\) with guaranteed individual lower and upper bounds on each singular value of \(X_S\). As an immediate by-product, we also deduce a slightly weaker version of the Bourgain-Tzafriri theorem (weaker by a logarithmic factor).
We end the paper with a description of how our method applies to the analysis of a large data set and how its numerical efficiency compares with the method of Spielman and Srivastava.
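To make the column selection problem concrete, the following is a minimal sketch of a naive greedy baseline. It is not the algorithm of this paper and not the method of Spielman and Srivastava. It only illustrates what it means for a submatrix \(X_S\) to have all its singular values inside prescribed bounds. The function name `greedy_column_selection` and the thresholds `lower` and `upper` are hypothetical choices made for illustration.

```python
import numpy as np


def greedy_column_selection(X, lower=0.5, upper=1.5):
    """Naive greedy baseline (an illustration, not the paper's algorithm).

    Scans the columns of X once and keeps column j only if every singular
    value of the selected submatrix stays inside [lower, upper].
    Assumes X has no zero column, so that normalization is well defined.
    """
    # Normalize each column to unit Euclidean norm.
    Xn = X / np.linalg.norm(X, axis=0, keepdims=True)
    selected = []
    for j in range(Xn.shape[1]):
        candidate = selected + [j]
        # Singular values of the candidate submatrix X_S.
        s = np.linalg.svd(Xn[:, candidate], compute_uv=False)
        if s.min() >= lower and s.max() <= upper:
            selected.append(j)
    return selected, Xn


# Usage on a random 20 x 60 matrix (synthetic data, for illustration only).
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 60))
S, Xn = greedy_column_selection(X)
sv = np.linalg.svd(Xn[:, S], compute_uv=False)
print(len(S), sv.min(), sv.max())
```

This baseline recomputes a full singular value decomposition at every step and offers no guarantee on how many columns it retains. Guarantees of that kind are what deterministic methods with provable bounds are designed to provide.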
Keywords: Bourgain-Tzafriri theorem, Restricted invertibility, Column selection problems