Abstract
Recent work has demonstrated that, with a proper initialization, gradient descent can directly estimate high-dimensional signals via nonconvex optimization in a globally convergent manner. However, its performance is highly sensitive to adversarial outliers that may take arbitrary values. In this chapter, we introduce the median-Truncated Gradient Descent (median-TGD) algorithm, which improves the robustness of gradient descent against outliers, and apply it to two celebrated problems: low-rank matrix recovery and phase retrieval. In each iteration, median-TGD truncates the contributions of samples that deviate significantly from the sample median in order to stabilize the search direction. Encouragingly, when initialized within a neighborhood of the ground truth known as the basin of attraction, median-TGD converges to the ground truth at a linear rate under Gaussian designs with a near-optimal number of measurements, even when a constant fraction of the measurements is arbitrarily corrupted. In addition, we introduce a new median-truncated spectral method that ensures an initialization within the basin of attraction. Stability against additional dense bounded noise is also established. Numerical experiments validate the superior performance of median-TGD.
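To illustrate the median-truncation idea described above, here is a minimal NumPy sketch of median-TGD for phase retrieval: at each iteration, samples whose residual magnitude exceeds a constant multiple of the sample median are truncated before the gradient step. The function name, step size, truncation constant, and toy problem sizes are illustrative choices, not the tuned parameters or guarantees from the chapter.

```python
import numpy as np

def median_tgd(A, y, z0, step=0.2, alpha=5.0, n_iters=500):
    """Median-truncated gradient descent for phase retrieval (sketch).

    Takes gradient steps on the intensity least-squares loss, but keeps
    only the samples whose residual magnitude stays within `alpha` times
    the sample median; samples deviating further (likely outliers) are
    truncated away, which stabilizes the search direction.
    """
    m = A.shape[0]
    z = z0.copy()
    for _ in range(n_iters):
        Az = A @ z                       # linear measurements a_i^T z
        resid = Az ** 2 - y              # intensity residuals
        # Median truncation: retain samples close to the median residual size.
        keep = np.abs(resid) <= alpha * np.median(np.abs(resid))
        grad = A[keep].T @ (resid[keep] * Az[keep]) / m
        z = z - step * grad
    return z

# Toy run: Gaussian design, 10% of the measurements grossly corrupted.
rng = np.random.default_rng(0)
n, m = 10, 400
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                   # ground truth, unit norm
A = rng.standard_normal((m, n))
y = (A @ x) ** 2
outliers = rng.choice(m, size=m // 10, replace=False)
y[outliers] = rng.uniform(0, 50, size=outliers.size)  # arbitrary corruptions
z0 = x + 0.03 * rng.standard_normal(n)   # init inside the basin of attraction
z = median_tgd(A, y, z0)
err = min(np.linalg.norm(z - x), np.linalg.norm(z + x))  # error up to global sign
```

Because the measurements determine the signal only up to a global sign, the recovery error is measured against both x and -x. The sample median is robust to a constant fraction of arbitrary corruptions, so the truncation threshold stays informative even when the outliers are unbounded.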
Notes
1. It is straightforward to handle stochastic noise such as Gaussian noise, by noticing that its infinity norm is bounded with high probability.
2. Our discussions can be extended to the rectangular case; see [8].
3. The algorithm can also be used to estimate complex-valued signals.
References
P. Auer, M. Herbster, M.K. Warmuth, Exponentially many local minima for single neurons, in Advances in Neural Information Processing Systems (1996), pp. 316–322
R. Sun, Z.-Q. Luo, Guaranteed matrix completion via non-convex factorization. IEEE Trans. Inf. Theory 62(11), 6535–6579 (2016)
E.J. Candès, X. Li, M. Soltanolkotabi, Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61(4), 1985–2007 (2015)
J. Sun, Q. Qu, J. Wright, Complete dictionary recovery over the sphere I: overview and the geometric picture. IEEE Trans. Inf. Theory 63(2), 853–884 (2017)
X. Li, S. Ling, T. Strohmer, K. Wei, Rapid, robust, and reliable blind deconvolution via nonconvex optimization. Appl. Comput. Harmon. Anal. (2018)
P.J. Huber, Robust Statistics (Springer, 2011)
Y. Li, Y. Chi, H. Zhang, Y. Liang, Non-convex low-rank matrix recovery from corrupted random linear measurements, in 2017 International Conference on Sampling Theory and Applications (SampTA) (2017)
Y. Li, Y. Chi, H. Zhang, Y. Liang, Nonconvex low-rank matrix recovery with arbitrary outliers via median-truncated gradient descent. arXiv:1709.08114 (2017)
H. Zhang, Y. Chi, Y. Liang, Median-truncated nonconvex approach for phase retrieval with outliers. IEEE Trans. Inf. Theory 64(11), 7287–7310 (2018)
R.J. Tibshirani, Fast computation of the median by successive binning. arXiv:0806.3301 (2008)
B. Recht, M. Fazel, P.A. Parrilo, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)
D. Gross, Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inf. Theory 57(3), 1548–1566 (2011)
S. Negahban, M.J. Wainwright, Estimation of (near) low-rank matrices with noise and high-dimensional scaling. Ann. Stat. 39(2), 1069–1097 (2011)
E.J. Candès, B. Recht, Exact matrix completion via convex optimization. Commun. ACM 55(6), 111–119 (2012)
Y. Chen, Y. Chi, Robust spectral compressed sensing via structured matrix completion. IEEE Trans. Inf. Theory 60(10), 6576–6601 (2014)
Y. Chen, Y. Chi, A. Goldsmith, Exact and stable covariance estimation from quadratic sampling via convex programming. IEEE Trans. Inf. Theory 61(7), 4034–4059 (2015)
Y. Chen, Y. Chi, Harnessing structures in big data via guaranteed low-rank matrix estimation: recent theory and fast algorithms via convex and nonconvex optimization. IEEE Signal Process. Mag. 35(4), 14–31 (2018)
S. Tu, R. Boczar, M. Simchowitz, M. Soltanolkotabi, B. Recht, Low-rank solutions of linear matrix equations via procrustes flow, in Proceedings of the 33rd International Conference on International Conference on Machine Learning (ICML) (2016), pp. 964–973
J. Drenth, X-Ray Crystallography (Wiley Online Library, 2007)
J.R. Fienup, Phase retrieval algorithms: a comparison. Appl. Opt. 21(15), 2758–2769 (1982)
H. Zhang, Y. Zhou, Y. Liang, Y. Chi, A nonconvex approach for phase retrieval: reshaped Wirtinger flow and incremental algorithms. J. Mach. Learn. Res. 18(141), 1–35 (2017)
H. Zhang, Y. Chi, Y. Liang, Provable non-convex phase retrieval with outliers: median-truncated Wirtinger flow, in International Conference on Machine Learning (2016), pp. 1022–1031
Y. Chen, E.J. Candès, Solving random quadratic systems of equations is nearly as easy as solving linear systems, in Advances in Neural Information Processing Systems (NIPS) (2015)
E.J. Candès, Y. Plan, Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements. IEEE Trans. Inf. Theory 57(4), 2342–2359 (2011)
Y. Chi, Y. M. Lu, Y. Chen, Nonconvex optimization meets low-rank matrix factorization: an overview. arXiv:1809.09573 (2018)
P. Netrapalli, P. Jain, S. Sanghavi, Phase retrieval using alternating minimization, in Advances in Neural Information Processing Systems (NIPS) (2013)
G. Wang, G.B. Giannakis, Y.C. Eldar, Solving systems of random quadratic equations via truncated amplitude flow. IEEE Trans. Inf. Theory 64(2), 773–794 (2018)
J. Sun, Q. Qu, J. Wright, A geometric analysis of phase retrieval. Found. Comput. Math. 18(5), 1131–1198 (2018)
C. Ma, K. Wang, Y. Chi, Y. Chen, Implicit regularization in nonconvex statistical estimation: gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. arXiv:1711.10467 (2017)
Y. Li, C. Ma, Y. Chen, Y. Chi, Nonconvex matrix factorization from rank-one measurements. arXiv:1802.06286 (2018)
Y. Chen, Y. Chi, J. Fan, C. Ma, Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval. arXiv:1803.07726 (2018)
R.H. Keshavan, A. Montanari, S. Oh, Matrix completion from a few entries. IEEE Trans. Inf. Theory 56(6), 2980–2998 (2010)
P. Jain, P. Netrapalli, S. Sanghavi, Low-rank matrix completion using alternating minimization, in Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing (2013), pp. 665–674
R. Sun, Z.-Q. Luo, Guaranteed matrix completion via nonconvex factorization, in IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS) (2015), pp. 270–289
M. Hardt, Understanding alternating minimization for matrix completion, in IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS) (2014), pp. 651–660
C. De Sa, C. Re, K. Olukotun, Global convergence of stochastic gradient descent for some non-convex matrix problems, in International Conference on Machine Learning (2015), pp. 2332–2341
Q. Zheng, J. Lafferty, Convergence analysis for rectangular matrix completion using Burer-Monteiro factorization and gradient descent. arXiv:1605.07051 (2016)
C. Jin, S. M. Kakade, P. Netrapalli, Provable efficient online matrix completion via non-convex stochastic gradient descent, in Advances in Neural Information Processing Systems (2016), pp. 4520–4528
R. Ge, J. D. Lee, T. Ma, Matrix completion has no spurious local minimum, in Advances in Neural Information Processing Systems (NIPS) (2016), pp. 2973–2981
S. Bhojanapalli, B. Neyshabur, N. Srebro, Global optimality of local search for low rank matrix recovery, in Advances in Neural Information Processing Systems (2016), pp. 3873–3881
Y. Chen, M.J. Wainwright, Fast low-rank estimation by projected gradient descent: general statistical and algorithmic guarantees. arXiv:1509.03025 (2015)
Q. Zheng, J. Lafferty, A convergent gradient descent algorithm for rank minimization and semidefinite programming from random linear measurements, in Advances in Neural Information Processing Systems (NIPS) (2015)
D. Park, A. Kyrillidis, C. Caramanis, S. Sanghavi, Finding low-rank solutions via nonconvex matrix factorization, efficiently and provably. SIAM J. Imaging Sci. 11(4), 2165–2204 (2018)
K. Wei, J.-F. Cai, T.F. Chan, S. Leung, Guarantees of Riemannian optimization for low rank matrix recovery. SIAM J. Matrix Anal. Appl. 37(3), 1198–1222 (2016)
Q. Li, G. Tang, The nonconvex geometry of low-rank matrix optimizations with general objective functions, in 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP) (IEEE, 2017), pp. 1235–1239
X. Li, J. Haupt, J. Lu, Z. Wang, R. Arora, H. Liu, T. Zhao, Symmetry, saddle points, and global optimization landscape of nonconvex matrix factorization, in Information Theory and Applications Workshop (ITA) (IEEE 2018), pp. 1–9
P. Netrapalli, U. Niranjan, S. Sanghavi, A. Anandkumar, P. Jain, Non-convex robust PCA, in Advances in Neural Information Processing Systems (NIPS) (2014)
X. Yi, D. Park, Y. Chen, C. Caramanis, Fast algorithms for robust PCA via gradient descent, in Advances in Neural Information Processing Systems (2016), pp. 4152–4160
A. Anandkumar, P. Jain, Y. Shi, U.N. Niranjan, Tensor vs. matrix methods: Robust tensor decomposition under block sparse perturbations, in Artificial Intelligence and Statistics (2016), pp. 268–276
S. Arora, R. Ge, T. Ma, A. Moitra, Simple, efficient, and neural algorithms for sparse coding, in Conference on Learning Theory (2015), pp. 113–149
J. Sun, Q. Qu, J. Wright, Complete dictionary recovery using nonconvex optimization, in Proceedings of the 32nd International Conference on Machine Learning (ICML) (2015)
A. S. Bandeira, N. Boumal, V. Voroninski, On the low-rank approach for semidefinite programs arising in synchronization and community detection, in 29th Annual Conference on Learning Theory (2016)
N. Boumal, Nonconvex phase synchronization. SIAM J. Optim. 26(4), 2355–2377 (2016)
K. Lee, Y. Li, M. Junge, Y. Bresler, Blind recovery of sparse signals from subsampled convolution. IEEE Trans. Inf. Theory 63(2), 802–821 (2017)
Y. Chen, E.J. Candès, The projected power method: an efficient algorithm for joint alignment from pairwise differences. Commun. Pure Appl. Math. 71(8), 1648–1714 (2018)
K. Chen, On k-median clustering in high dimensions, in Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm (2006)
D. Wagner, Resilient aggregation in sensor networks, in Proceedings of the 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks (ACM, 2004), pp. 78–87
Y. Chen, C. Caramanis, S. Mannor, Robust sparse regression under adversarial corruption, in Proceedings of the 30th International Conference on Machine Learning (ICML) (2013)
C. Qu, H. Xu, Subspace clustering with irrelevant features via robust dantzig selector, in Advances in Neural Information Processing Systems (NIPS) (2015)
A. Prasad, A. S. Suggala, S. Balakrishnan, P. Ravikumar, Robust estimation via robust gradient estimation. arXiv:1802.06485 (2018)
D. Yin, Y. Chen, R. Kannan, P. Bartlett, Byzantine-robust distributed learning: towards optimal statistical rates, in Proceedings of the 35th International Conference on Machine Learning, 10–15 Jul 2018 (2018), pp. 5650–5659
Y. Chen, L. Su, J. Xu, Distributed statistical machine learning in adversarial settings: byzantine gradient descent, in Proceedings of the ACM on Measurement and Analysis of Computing Systems, vol. 1, no 2 (2017), p. 44
Y. Li, Y. Sun, Y. Chi, Low-rank positive semidefinite matrix recovery from corrupted rank-one measurements. IEEE Trans. Signal Process. 65(2), 397–408 (2017)
P. Hand, PhaseLift is robust to a constant fraction of arbitrary errors. Appl. Comput. Harmon. Anal. 42(3), 550–562 (2017)
D. Weller, A. Pnueli, G. Divon, O. Radzyner, Y. Eldar, J. Fessler, Undersampled phase retrieval with outliers. IEEE Trans. Comput. Imaging 1(4), 247–258 (2015)
J. Wright, A. Ganesh, K. Min, Y. Ma, Compressive principal component pursuit. Inf. Inference 2(1), 32–68 (2013)
Y. Cherapanamjeri, K. Gupta, P. Jain, Nearly optimal robust matrix completion, in Proceedings of the 34th International Conference on Machine Learning (2017), pp. 797–805
X. Zhang, L. Wang, Q. Gu, A unified framework for nonconvex low-rank plus sparse matrix recovery, in International Conference on Artificial Intelligence and Statistics (2018), pp. 1097–1107
Acknowledgements
The work of Y. Chi and Y. Li is supported in part by AFOSR under the grant FA9550-15-1-0205, by ONR under the grant N00014-18-1-2142, by ARO under the grant W911NF-18-1-0303, and by NSF under the grants CAREER ECCS-1818571 and CCF-1806154. The work of Y. Liang is supported in part by NSF under the grants CCF-1761506 and ECCS-1818904.
© 2019 Springer Nature Switzerland AG
Cite this chapter
Chi, Y., Li, Y., Zhang, H., Liang, Y. (2019). Median-Truncated Gradient Descent: A Robust and Scalable Nonconvex Approach for Signal Estimation. In: Boche, H., Caire, G., Calderbank, R., Kutyniok, G., Mathar, R., Petersen, P. (eds) Compressed Sensing and Its Applications. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-73074-5_8
Print ISBN: 978-3-319-73073-8
Online ISBN: 978-3-319-73074-5