Abstract
Recent work has demonstrated that, with a proper initialization, gradient descent can directly estimate high-dimensional signals via nonconvex optimization in a globally convergent manner. However, its performance is highly sensitive to adversarial outliers that may take arbitrary values. In this chapter, we introduce the median-Truncated Gradient Descent (median-TGD) algorithm, which improves the robustness of gradient descent against outliers, and apply it to two celebrated problems: low-rank matrix recovery and phase retrieval. In each iteration, median-TGD truncates the contributions of samples that deviate significantly from the sample median in order to stabilize the search direction. Encouragingly, when initialized within a neighborhood of the ground truth known as the basin of attraction, median-TGD converges to the ground truth at a linear rate under Gaussian designs with a near-optimal number of measurements, even when a constant fraction of the measurements is arbitrarily corrupted. In addition, we introduce a new median-truncated spectral method that ensures an initialization within the basin of attraction. Stability against additional dense bounded noise is also established. Numerical experiments validate the superior performance of median-TGD.
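To illustrate the median-truncation idea described above, here is a minimal NumPy sketch of median-TGD for phase retrieval: at each iteration, samples whose residual magnitude exceeds a constant multiple of the sample median are truncated before the gradient step. The function name, step size, truncation constant, and toy problem sizes are illustrative choices, not the tuned parameters or guarantees from the chapter.

```python
import numpy as np

def median_tgd(A, y, z0, step=0.2, alpha=5.0, n_iters=500):
    """Median-truncated gradient descent for phase retrieval (sketch).

    Takes gradient steps on the intensity least-squares loss, but keeps
    only the samples whose residual magnitude stays within `alpha` times
    the sample median; samples deviating further (likely outliers) are
    truncated away, which stabilizes the search direction.
    """
    m = A.shape[0]
    z = z0.copy()
    for _ in range(n_iters):
        Az = A @ z                       # linear measurements a_i^T z
        resid = Az ** 2 - y              # intensity residuals
        # Median truncation: retain samples close to the median residual size.
        keep = np.abs(resid) <= alpha * np.median(np.abs(resid))
        grad = A[keep].T @ (resid[keep] * Az[keep]) / m
        z = z - step * grad
    return z

# Toy run: Gaussian design, 10% of the measurements grossly corrupted.
rng = np.random.default_rng(0)
n, m = 10, 400
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                   # ground truth, unit norm
A = rng.standard_normal((m, n))
y = (A @ x) ** 2
outliers = rng.choice(m, size=m // 10, replace=False)
y[outliers] = rng.uniform(0, 50, size=outliers.size)  # arbitrary corruptions
z0 = x + 0.03 * rng.standard_normal(n)   # init inside the basin of attraction
z = median_tgd(A, y, z0)
err = min(np.linalg.norm(z - x), np.linalg.norm(z + x))  # error up to global sign
```

Because the measurements determine the signal only up to a global sign, the recovery error is measured against both x and -x. The sample median is robust to a constant fraction of arbitrary corruptions, so the truncation threshold stays informative even when the outliers are unbounded.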
Notes
1. It is straightforward to handle stochastic noise such as Gaussian noise, by noticing that its infinity norm is bounded with high probability.
2. Our discussions can be extended to the rectangular case; see [8].
3. The algorithm can also be used to estimate complex-valued signals.
References
P. Auer, M. Herbster, M.K. Warmuth, Exponentially many local minima for single neurons, in Advances in Neural Information Processing Systems (1996), pp. 316–322
R. Sun, Z.-Q. Luo, Guaranteed matrix completion via non-convex factorization. IEEE Trans. Inf. Theory 62(11), 6535–6579 (2016)
E.J. Candès, X. Li, M. Soltanolkotabi, Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61(4), 1985–2007 (2015)
J. Sun, Q. Qu, J. Wright, Complete dictionary recovery over the sphere I: overview and the geometric picture. IEEE Trans. Inf. Theory 63(2), 853–884 (2017)
X. Li, S. Ling, T. Strohmer, K. Wei, Rapid, robust, and reliable blind deconvolution via nonconvex optimization. Appl. Comput. Harmon. Anal. (2018)
P.J. Huber, Robust Statistics (Springer, 2011)
Y. Li, Y. Chi, H. Zhang, Y. Liang, Non-convex low-rank matrix recovery from corrupted random linear measurements, in 2017 International Conference on Sampling Theory and Applications (SampTA) (2017)
Y. Li, Y. Chi, H. Zhang, Y. Liang, Nonconvex low-rank matrix recovery with arbitrary outliers via median-truncated gradient descent. arXiv:1709.08114 (2017)
H. Zhang, Y. Chi, Y. Liang, Median-truncated nonconvex approach for phase retrieval with outliers. IEEE Trans. Inf. Theory 64(11), 7287–7310 (2018)
R.J. Tibshirani, Fast computation of the median by successive binning. arXiv:0806.3301 (2008)
B. Recht, M. Fazel, P.A. Parrilo, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)
D. Gross, Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inf. Theory 57(3), 1548–1566 (2011)
S. Negahban, M.J. Wainwright, Estimation of (near) low-rank matrices with noise and high-dimensional scaling. Ann. Stat. 39(2), 1069–1097 (2011)
E.J. Candès, B. Recht, Exact matrix completion via convex optimization. Commun. ACM 55(6), 111–119 (2012)
Y. Chen, Y. Chi, Robust spectral compressed sensing via structured matrix completion. IEEE Trans. Inf. Theory 60(10), 6576–6601 (2014)
Y. Chen, Y. Chi, A. Goldsmith, Exact and stable covariance estimation from quadratic sampling via convex programming. IEEE Trans. Inf. Theory 61(7), 4034–4059 (2015)
Y. Chen, Y. Chi, Harnessing structures in big data via guaranteed low-rank matrix estimation: recent theory and fast algorithms via convex and nonconvex optimization. IEEE Signal Process. Mag. 35(4), 14–31 (2018)
S. Tu, R. Boczar, M. Simchowitz, M. Soltanolkotabi, B. Recht, Low-rank solutions of linear matrix equations via procrustes flow, in Proceedings of the 33rd International Conference on International Conference on Machine Learning (ICML) (2016), pp. 964–973
J. Drenth, X-Ray Crystallography (Wiley Online Library, 2007)
J.R. Fienup, Phase retrieval algorithms: a comparison. Appl. Opt. 21(15), 2758–2769 (1982)
H. Zhang, Y. Zhou, Y. Liang, Y. Chi, A nonconvex approach for phase retrieval: reshaped Wirtinger flow and incremental algorithms. J. Mach. Learn. Res. 18(141), 1–35 (2017)
H. Zhang, Y. Chi, Y. Liang, Provable non-convex phase retrieval with outliers: median-truncated Wirtinger flow, in International Conference on Machine Learning (2016), pp. 1022–1031
Y. Chen, E.J. Candès, Solving random quadratic systems of equations is nearly as easy as solving linear systems, in Advances in Neural Information Processing Systems (NIPS) (2015)
E.J. Candès, Y. Plan, Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements. IEEE Trans. Inf. Theory 57(4), 2342–2359 (2011)
Y. Chi, Y. M. Lu, Y. Chen, Nonconvex optimization meets low-rank matrix factorization: an overview. arXiv:1809.09573 (2018)
P. Netrapalli, P. Jain, S. Sanghavi, Phase retrieval using alternating minimization, in Advances in Neural Information Processing Systems (NIPS) (2013)
G. Wang, G.B. Giannakis, Y.C. Eldar, Solving systems of random quadratic equations via truncated amplitude flow. IEEE Trans. Inf. Theory 64(2), 773–794 (2018)
J. Sun, Q. Qu, J. Wright, A geometric analysis of phase retrieval. Found. Comput. Math. 18(5), 1131–1198 (2018)
C. Ma, K. Wang, Y. Chi, Y. Chen, Implicit regularization in nonconvex statistical estimation: gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. arXiv:1711.10467 (2017)
Y. Li, C. Ma, Y. Chen, Y. Chi, Nonconvex matrix factorization from rank-one measurements. arXiv:1802.06286 (2018)
Y. Chen, Y. Chi, J. Fan, C. Ma, Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval. arXiv:1803.07726 (2018)
R.H. Keshavan, A. Montanari, S. Oh, Matrix completion from a few entries. IEEE Trans. Inf. Theory 56(6), 2980–2998 (2010)
P. Jain, P. Netrapalli, S. Sanghavi, Low-rank matrix completion using alternating minimization, in Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing (2013), pp. 665–674
R. Sun, Z.-Q. Luo, Guaranteed matrix completion via nonconvex factorization, in IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS) (2015), pp. 270–289
M. Hardt, Understanding alternating minimization for matrix completion, in IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS) (2014), pp. 651–660
C. De Sa, C. Re, K. Olukotun, Global convergence of stochastic gradient descent for some non-convex matrix problems, in International Conference on Machine Learning (2015), pp. 2332–2341
Q. Zheng, J. Lafferty, Convergence analysis for rectangular matrix completion using Burer-Monteiro factorization and gradient descent. arXiv:1605.07051 (2016)
C. Jin, S. M. Kakade, P. Netrapalli, Provable efficient online matrix completion via non-convex stochastic gradient descent, in Advances in Neural Information Processing Systems (2016), pp. 4520–4528
R. Ge, J. D. Lee, T. Ma, Matrix completion has no spurious local minimum, in Advances in Neural Information Processing Systems (NIPS) (2016), pp. 2973–2981
S. Bhojanapalli, B. Neyshabur, N. Srebro, Global optimality of local search for low rank matrix recovery, in Advances in Neural Information Processing Systems (2016), pp. 3873–3881
Y. Chen, M.J. Wainwright, Fast low-rank estimation by projected gradient descent: general statistical and algorithmic guarantees. arXiv:1509.03025 (2015)
Q. Zheng, J. Lafferty, A convergent gradient descent algorithm for rank minimization and semidefinite programming from random linear measurements, in Advances in Neural Information Processing Systems (NIPS) (2015)
D. Park, A. Kyrillidis, C. Caramanis, S. Sanghavi, Finding low-rank solutions via nonconvex matrix factorization, efficiently and provably. SIAM J. Imaging Sci. 11(4), 2165–2204 (2018)
K. Wei, J.-F. Cai, T.F. Chan, S. Leung, Guarantees of Riemannian optimization for low rank matrix recovery. SIAM J. Matrix Anal. Appl. 37(3), 1198–1222 (2016)
Q. Li, G. Tang, The nonconvex geometry of low-rank matrix optimizations with general objective functions, in 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP) (IEEE, 2017), pp. 1235–1239
X. Li, J. Haupt, J. Lu, Z. Wang, R. Arora, H. Liu, T. Zhao, Symmetry, saddle points, and global optimization landscape of nonconvex matrix factorization, in Information Theory and Applications Workshop (ITA) (IEEE 2018), pp. 1–9
P. Netrapalli, U. Niranjan, S. Sanghavi, A. Anandkumar, P. Jain, Non-convex robust PCA, in Advances in Neural Information Processing Systems (NIPS) (2014)
X. Yi, D. Park, Y. Chen, C. Caramanis, Fast algorithms for robust PCA via gradient descent, in Advances in Neural Information Processing Systems (2016), pp. 4152–4160
A. Anandkumar, P. Jain, Y. Shi, U.N. Niranjan, Tensor vs. matrix methods: Robust tensor decomposition under block sparse perturbations, in Artificial Intelligence and Statistics (2016), pp. 268–276
S. Arora, R. Ge, T. Ma, A. Moitra, Simple, efficient, and neural algorithms for sparse coding, in Conference on Learning Theory (2015), pp. 113–149
J. Sun, Q. Qu, J. Wright, Complete dictionary recovery using nonconvex optimization, in Proceedings of the 32nd International Conference on Machine Learning (ICML) (2015)
A. S. Bandeira, N. Boumal, V. Voroninski, On the low-rank approach for semidefinite programs arising in synchronization and community detection, in 29th Annual Conference on Learning Theory (2016)
N. Boumal, Nonconvex phase synchronization. SIAM J. Optim. 26(4), 2355–2377 (2016)
K. Lee, Y. Li, M. Junge, Y. Bresler, Blind recovery of sparse signals from subsampled convolution. IEEE Trans. Inf. Theory 63(2), 802–821 (2017)
Y. Chen, E.J. Candès, The projected power method: an efficient algorithm for joint alignment from pairwise differences. Commun. Pure Appl. Math. 71(8), 1648–1714 (2018)
K. Chen, On k-median clustering in high dimensions, in Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithm (2006)
D. Wagner, Resilient aggregation in sensor networks, in Proceedings of the 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks (ACM, 2004), pp. 78–87
Y. Chen, C. Caramanis, S. Mannor, Robust sparse regression under adversarial corruption, in Proceedings of the 30th International Conference on Machine Learning (ICML) (2013)
C. Qu, H. Xu, Subspace clustering with irrelevant features via robust dantzig selector, in Advances in Neural Information Processing Systems (NIPS) (2015)
A. Prasad, A. S. Suggala, S. Balakrishnan, P. Ravikumar, Robust estimation via robust gradient estimation. arXiv:1802.06485 (2018)
D. Yin, Y. Chen, R. Kannan, P. Bartlett, Byzantine-robust distributed learning: towards optimal statistical rates, in Proceedings of the 35th International Conference on Machine Learning, 10–15 Jul 2018 (2018), pp. 5650–5659
Y. Chen, L. Su, J. Xu, Distributed statistical machine learning in adversarial settings: byzantine gradient descent, in Proceedings of the ACM on Measurement and Analysis of Computing Systems, vol. 1, no 2 (2017), p. 44
Y. Li, Y. Sun, Y. Chi, Low-rank positive semidefinite matrix recovery from corrupted rank-one measurements. IEEE Trans. Signal Process. 65(2), 397–408 (2017)
P. Hand, PhaseLift is robust to a constant fraction of arbitrary errors. Appl. Comput. Harmon. Anal. 42(3), 550–562 (2017)
D. Weller, A. Pnueli, G. Divon, O. Radzyner, Y. Eldar, J. Fessler, Undersampled phase retrieval with outliers. IEEE Trans. Comput. Imaging 1(4), 247–258 (2015)
J. Wright, A. Ganesh, K. Min, Y. Ma, Compressive principal component pursuit. Inf. Inference 2(1), 32–68 (2013)
Y. Cherapanamjeri, K. Gupta, P. Jain, Nearly optimal robust matrix completion, in Proceedings of the 34th International Conference on Machine Learning (2017), pp. 797–805
X. Zhang, L. Wang, Q. Gu, A unified framework for nonconvex low-rank plus sparse matrix recovery, in International Conference on Artificial Intelligence and Statistics (2018), pp. 1097–1107
Acknowledgements
The work of Y. Chi and Y. Li is supported in part by AFOSR under the grant FA9550-15-1-0205, by ONR under the grant N00014-18-1-2142, by ARO under the grant W911NF-18-1-0303, and by NSF under the grants CAREER ECCS-1818571 and CCF-1806154. The work of Y. Liang is supported in part by NSF under the grants CCF-1761506 and ECCS-1818904.
© 2019 Springer Nature Switzerland AG
Cite this chapter
Chi, Y., Li, Y., Zhang, H., Liang, Y. (2019). Median-Truncated Gradient Descent: A Robust and Scalable Nonconvex Approach for Signal Estimation. In: Boche, H., Caire, G., Calderbank, R., Kutyniok, G., Mathar, R., Petersen, P. (eds) Compressed Sensing and Its Applications. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-73074-5_8
Print ISBN: 978-3-319-73073-8
Online ISBN: 978-3-319-73074-5