
Median-Truncated Gradient Descent: A Robust and Scalable Nonconvex Approach for Signal Estimation

Part of the book series: Applied and Numerical Harmonic Analysis (ANHA)

Abstract

Recent work has demonstrated that, with a proper initialization, gradient descent can directly estimate high-dimensional signals via nonconvex optimization in a globally convergent manner. However, its performance is highly sensitive to adversarial outliers that may take arbitrary values. In this chapter, we introduce the median-truncated gradient descent (median-TGD) algorithm, which improves the robustness of gradient descent against outliers, and apply it to two celebrated problems: low-rank matrix recovery and phase retrieval. In each iteration, median-TGD truncates the contributions of samples that deviate significantly from the sample median, which stabilizes the search direction. Encouragingly, when initialized within a neighborhood of the ground truth known as the basin of attraction, median-TGD converges to the ground truth at a linear rate under Gaussian designs with a near-optimal number of measurements, even when a constant fraction of the measurements is arbitrarily corrupted. In addition, we introduce a new median-truncated spectral method that guarantees an initialization within the basin of attraction. Stability against additional dense bounded noise is also established. Numerical experiments validate the superior performance of median-TGD.
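To make the truncation rule concrete, the following is a minimal sketch of one median-TGD iteration for real-valued phase retrieval with measurements y_i ≈ (a_i^T x)^2. The function name, the Wirtinger-flow-style step-size normalization, and the truncation radius are illustrative assumptions, not the exact rules or constants analyzed in the chapter.

```python
import numpy as np

def median_tgd_step(z, A, y, mu=0.1, trunc=3.0):
    """One median-truncated gradient step for real-valued phase retrieval.

    z     : current estimate, shape (n,)
    A     : design matrix, shape (m, n), rows a_i (e.g., Gaussian)
    y     : measurements y_i ~ (a_i @ x)**2, a fraction possibly corrupted
    mu    : step size, normalized by ||z||^2 as in Wirtinger-flow-type methods
    trunc : truncation radius, in multiples of the median absolute residual
    """
    r = (A @ z) ** 2 - y                 # residuals of all m samples
    med = np.median(np.abs(r))           # sample median of |residuals|
    keep = np.abs(r) <= trunc * med      # truncate samples far from the median
    # gradient of the quadratic loss restricted to the kept samples
    grad = A[keep].T @ (r[keep] * (A[keep] @ z)) / len(y)
    return z - (mu / np.linalg.norm(z) ** 2) * grad
```

Because the median has a 50% breakdown point, a constant fraction of grossly corrupted y_i inflates the residual median only mildly, so those samples fall outside the keep mask and never contaminate the search direction.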


Notes

  1. It is straightforward to handle stochastic noise such as Gaussian noise by noting that its infinity norm is bounded with high probability.

  2. Our discussion can be extended to the rectangular case; see [8].

  3. The algorithm can also be used to estimate complex-valued signals.

References

  1. P. Auer, M. Herbster, M.K. Warmuth, Exponentially many local minima for single neurons, in Advances in Neural Information Processing Systems (1996), pp. 316–322
  2. R. Sun, Z.-Q. Luo, Guaranteed matrix completion via non-convex factorization. IEEE Trans. Inf. Theory 62(11), 6535–6579 (2016)
  3. E.J. Candès, X. Li, M. Soltanolkotabi, Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61(4), 1985–2007 (2015)
  4. J. Sun, Q. Qu, J. Wright, Complete dictionary recovery over the sphere I: overview and the geometric picture. IEEE Trans. Inf. Theory 63(2), 853–884 (2017)
  5. X. Li, S. Ling, T. Strohmer, K. Wei, Rapid, robust, and reliable blind deconvolution via nonconvex optimization. Appl. Comput. Harmon. Anal. (2018)
  6. P.J. Huber, Robust Statistics (Springer, 2011)
  7. Y. Li, Y. Chi, H. Zhang, Y. Liang, Non-convex low-rank matrix recovery from corrupted random linear measurements, in 2017 International Conference on Sampling Theory and Applications (SampTA) (2017)
  8. Y. Li, Y. Chi, H. Zhang, Y. Liang, Nonconvex low-rank matrix recovery with arbitrary outliers via median-truncated gradient descent. arXiv:1709.08114 (2017)
  9. H. Zhang, Y. Chi, Y. Liang, Median-truncated nonconvex approach for phase retrieval with outliers. IEEE Trans. Inf. Theory 64(11), 7287–7310 (2018)
  10. R.J. Tibshirani, Fast computation of the median by successive binning. arXiv:0806.3301 (2008)
  11. B. Recht, M. Fazel, P.A. Parrilo, Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization. SIAM Rev. 52(3), 471–501 (2010)
  12. D. Gross, Recovering low-rank matrices from few coefficients in any basis. IEEE Trans. Inf. Theory 57(3), 1548–1566 (2011)
  13. S. Negahban, M.J. Wainwright, Estimation of (near) low-rank matrices with noise and high-dimensional scaling. Ann. Stat. 39(2), 1069–1097 (2011)
  14. E.J. Candès, B. Recht, Exact matrix completion via convex optimization. Commun. ACM 55(6), 111–119 (2012)
  15. Y. Chen, Y. Chi, Robust spectral compressed sensing via structured matrix completion. IEEE Trans. Inf. Theory 60(10), 6576–6601 (2014)
  16. Y. Chen, Y. Chi, A. Goldsmith, Exact and stable covariance estimation from quadratic sampling via convex programming. IEEE Trans. Inf. Theory 61(7), 4034–4059 (2015)
  17. Y. Chen, Y. Chi, Harnessing structures in big data via guaranteed low-rank matrix estimation: recent theory and fast algorithms via convex and nonconvex optimization. IEEE Signal Process. Mag. 35(4), 14–31 (2018)
  18. S. Tu, R. Boczar, M. Simchowitz, M. Soltanolkotabi, B. Recht, Low-rank solutions of linear matrix equations via Procrustes flow, in Proceedings of the 33rd International Conference on Machine Learning (ICML) (2016), pp. 964–973
  19. J. Drenth, X-Ray Crystallography (Wiley Online Library, 2007)
  20. J.R. Fienup, Phase retrieval algorithms: a comparison. Appl. Opt. 21(15), 2758–2769 (1982)
  21. H. Zhang, Y. Zhou, Y. Liang, Y. Chi, A nonconvex approach for phase retrieval: reshaped Wirtinger flow and incremental algorithms. J. Mach. Learn. Res. 18(141), 1–35 (2017)
  22. H. Zhang, Y. Chi, Y. Liang, Provable non-convex phase retrieval with outliers: median-truncated Wirtinger flow, in International Conference on Machine Learning (2016), pp. 1022–1031
  23. Y. Chen, E.J. Candès, Solving random quadratic systems of equations is nearly as easy as solving linear systems, in Advances in Neural Information Processing Systems (NIPS) (2015)
  24. E.J. Candès, Y. Plan, Tight oracle inequalities for low-rank matrix recovery from a minimal number of noisy random measurements. IEEE Trans. Inf. Theory 57(4), 2342–2359 (2011)
  25. Y. Chi, Y.M. Lu, Y. Chen, Nonconvex optimization meets low-rank matrix factorization: an overview. arXiv:1809.09573 (2018)
  26. P. Netrapalli, P. Jain, S. Sanghavi, Phase retrieval using alternating minimization, in Advances in Neural Information Processing Systems (NIPS) (2013)
  27. G. Wang, G.B. Giannakis, Y.C. Eldar, Solving systems of random quadratic equations via truncated amplitude flow. IEEE Trans. Inf. Theory 64(2), 773–794 (2018)
  28. J. Sun, Q. Qu, J. Wright, A geometric analysis of phase retrieval. Found. Comput. Math. 18(5), 1131–1198 (2018)
  29. C. Ma, K. Wang, Y. Chi, Y. Chen, Implicit regularization in nonconvex statistical estimation: gradient descent converges linearly for phase retrieval, matrix completion and blind deconvolution. arXiv:1711.10467 (2017)
  30. Y. Li, C. Ma, Y. Chen, Y. Chi, Nonconvex matrix factorization from rank-one measurements. arXiv:1802.06286 (2018)
  31. Y. Chen, Y. Chi, J. Fan, C. Ma, Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval. arXiv:1803.07726 (2018)
  32. R.H. Keshavan, A. Montanari, S. Oh, Matrix completion from a few entries. IEEE Trans. Inf. Theory 56(6), 2980–2998 (2010)
  33. P. Jain, P. Netrapalli, S. Sanghavi, Low-rank matrix completion using alternating minimization, in Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing (2013), pp. 665–674
  34. R. Sun, Z.-Q. Luo, Guaranteed matrix completion via nonconvex factorization, in IEEE 56th Annual Symposium on Foundations of Computer Science (FOCS) (2015), pp. 270–289
  35. M. Hardt, Understanding alternating minimization for matrix completion, in IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS) (2014), pp. 651–660
  36. C. De Sa, C. Ré, K. Olukotun, Global convergence of stochastic gradient descent for some non-convex matrix problems, in International Conference on Machine Learning (2015), pp. 2332–2341
  37. Q. Zheng, J. Lafferty, Convergence analysis for rectangular matrix completion using Burer-Monteiro factorization and gradient descent. arXiv:1605.07051 (2016)
  38. C. Jin, S.M. Kakade, P. Netrapalli, Provable efficient online matrix completion via non-convex stochastic gradient descent, in Advances in Neural Information Processing Systems (2016), pp. 4520–4528
  39. R. Ge, J.D. Lee, T. Ma, Matrix completion has no spurious local minimum, in Advances in Neural Information Processing Systems (NIPS) (2016), pp. 2973–2981
  40. S. Bhojanapalli, B. Neyshabur, N. Srebro, Global optimality of local search for low rank matrix recovery, in Advances in Neural Information Processing Systems (2016), pp. 3873–3881
  41. Y. Chen, M.J. Wainwright, Fast low-rank estimation by projected gradient descent: general statistical and algorithmic guarantees. arXiv:1509.03025 (2015)
  42. Q. Zheng, J. Lafferty, A convergent gradient descent algorithm for rank minimization and semidefinite programming from random linear measurements, in Advances in Neural Information Processing Systems (NIPS) (2015)
  43. D. Park, A. Kyrillidis, C. Caramanis, S. Sanghavi, Finding low-rank solutions via nonconvex matrix factorization, efficiently and provably. SIAM J. Imaging Sci. 11(4), 2165–2204 (2018)
  44. K. Wei, J.-F. Cai, T.F. Chan, S. Leung, Guarantees of Riemannian optimization for low rank matrix recovery. SIAM J. Matrix Anal. Appl. 37(3), 1198–1222 (2016)
  45. Q. Li, G. Tang, The nonconvex geometry of low-rank matrix optimizations with general objective functions, in 2017 IEEE Global Conference on Signal and Information Processing (GlobalSIP) (IEEE, 2017), pp. 1235–1239
  46. X. Li, J. Haupt, J. Lu, Z. Wang, R. Arora, H. Liu, T. Zhao, Symmetry, saddle points, and global optimization landscape of nonconvex matrix factorization, in Information Theory and Applications Workshop (ITA) (IEEE, 2018), pp. 1–9
  47. P. Netrapalli, U. Niranjan, S. Sanghavi, A. Anandkumar, P. Jain, Non-convex robust PCA, in Advances in Neural Information Processing Systems (NIPS) (2014)
  48. X. Yi, D. Park, Y. Chen, C. Caramanis, Fast algorithms for robust PCA via gradient descent, in Advances in Neural Information Processing Systems (2016), pp. 4152–4160
  49. A. Anandkumar, P. Jain, Y. Shi, U.N. Niranjan, Tensor vs. matrix methods: robust tensor decomposition under block sparse perturbations, in Artificial Intelligence and Statistics (2016), pp. 268–276
  50. S. Arora, R. Ge, T. Ma, A. Moitra, Simple, efficient, and neural algorithms for sparse coding, in Conference on Learning Theory (2015), pp. 113–149
  51. J. Sun, Q. Qu, J. Wright, Complete dictionary recovery using nonconvex optimization, in Proceedings of the 32nd International Conference on Machine Learning (ICML) (2015)
  52. A.S. Bandeira, N. Boumal, V. Voroninski, On the low-rank approach for semidefinite programs arising in synchronization and community detection, in 29th Annual Conference on Learning Theory (2016)
  53. N. Boumal, Nonconvex phase synchronization. SIAM J. Optim. 26(4), 2355–2377 (2016)
  54. K. Lee, Y. Li, M. Junge, Y. Bresler, Blind recovery of sparse signals from subsampled convolution. IEEE Trans. Inf. Theory 63(2), 802–821 (2017)
  55. Y. Chen, E.J. Candès, The projected power method: an efficient algorithm for joint alignment from pairwise differences. Commun. Pure Appl. Math. 71(8), 1648–1714 (2018)
  56. K. Chen, On k-median clustering in high dimensions, in Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms (2006)
  57. D. Wagner, Resilient aggregation in sensor networks, in Proceedings of the 2nd ACM Workshop on Security of Ad Hoc and Sensor Networks (ACM, 2004), pp. 78–87
  58. Y. Chen, C. Caramanis, S. Mannor, Robust sparse regression under adversarial corruption, in Proceedings of the 30th International Conference on Machine Learning (ICML) (2013)
  59. C. Qu, H. Xu, Subspace clustering with irrelevant features via robust Dantzig selector, in Advances in Neural Information Processing Systems (NIPS) (2015)
  60. A. Prasad, A.S. Suggala, S. Balakrishnan, P. Ravikumar, Robust estimation via robust gradient estimation. arXiv:1802.06485 (2018)
  61. D. Yin, Y. Chen, R. Kannan, P. Bartlett, Byzantine-robust distributed learning: towards optimal statistical rates, in Proceedings of the 35th International Conference on Machine Learning (2018), pp. 5650–5659
  62. Y. Chen, L. Su, J. Xu, Distributed statistical machine learning in adversarial settings: Byzantine gradient descent, in Proceedings of the ACM on Measurement and Analysis of Computing Systems, vol. 1, no. 2 (2017), p. 44
  63. Y. Li, Y. Sun, Y. Chi, Low-rank positive semidefinite matrix recovery from corrupted rank-one measurements. IEEE Trans. Signal Process. 65(2), 397–408 (2017)
  64. P. Hand, PhaseLift is robust to a constant fraction of arbitrary errors. Appl. Comput. Harmon. Anal. 42(3), 550–562 (2017)
  65. D. Weller, A. Pnueli, G. Divon, O. Radzyner, Y. Eldar, J. Fessler, Undersampled phase retrieval with outliers. IEEE Trans. Comput. Imaging 1(4), 247–258 (2015)
  66. J. Wright, A. Ganesh, K. Min, Y. Ma, Compressive principal component pursuit. Inf. Inference 2(1), 32–68 (2013)
  67. Y. Cherapanamjeri, K. Gupta, P. Jain, Nearly optimal robust matrix completion, in Proceedings of the 34th International Conference on Machine Learning (2017), pp. 797–805
  68. X. Zhang, L. Wang, Q. Gu, A unified framework for nonconvex low-rank plus sparse matrix recovery, in International Conference on Artificial Intelligence and Statistics (2018), pp. 1097–1107


Acknowledgements

The work of Y. Chi and Y. Li is supported in part by AFOSR under the grant FA9550-15-1-0205, by ONR under the grant N00014-18-1-2142, by ARO under the grant W911NF-18-1-0303, and by NSF under the grants CAREER ECCS-1818571 and CCF-1806154. The work of Y. Liang is supported in part by NSF under the grants CCF-1761506 and ECCS-1818904.

Author information


Corresponding author

Correspondence to Yuejie Chi.


Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter


Cite this chapter

Chi, Y., Li, Y., Zhang, H., Liang, Y. (2019). Median-Truncated Gradient Descent: A Robust and Scalable Nonconvex Approach for Signal Estimation. In: Boche, H., Caire, G., Calderbank, R., Kutyniok, G., Mathar, R., Petersen, P. (eds) Compressed Sensing and Its Applications. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-73074-5_8
