Compressed Sensing and Its Applications, pp 195-209

# Deep Learning for Trivial Inverse Problems

## Abstract

Deep learning is producing remarkable results when applied to some of the toughest large-scale nonlinear problems, such as classification tasks in computer vision or speech recognition. Recently, deep learning has also been applied to inverse problems, in particular in medical imaging. Some of these applications are motivated by mathematical reasoning, but a solid and at least partially complete mathematical theory for understanding neural networks and deep learning is still missing. In this paper, we do not address large-scale problems but aim at understanding neural networks for solving some small and rather naive inverse problems. Nevertheless, the results of this paper highlight the particular complications of inverse problems: for example, we show that a natural network design for mimicking Tikhonov regularization fails when applied to even the most trivial inverse problems. The proofs in this paper use basic and well-known results from the theory of statistical inverse problems. We include the proofs in order to provide material ready to be used in student projects or general mathematical courses on data analysis. We only assume that the reader is familiar with the standard definitions of feedforward networks, e.g., the backpropagation algorithm for training such networks. We also include, without proof, numerical experiments for analyzing the influence of the network design, which include comparisons with the learned iterative soft-thresholding algorithm (LISTA).
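
For readers who want a concrete picture of the two classical building blocks named in the abstract, the following is a minimal sketch and not the chapter's own code. It shows Tikhonov regularization for a small linear problem and the componentwise soft-thresholding step on which iterative soft-thresholding (and hence LISTA) is built. The 2x2 diagonal operator, the data, the value of `alpha`, and the function names `tikhonov_solve` and `soft_threshold` are all illustrative assumptions.

```python
import numpy as np

def tikhonov_solve(A, y, alpha):
    """Minimize ||A x - y||^2 + alpha * ||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + alpha * np.eye(n), A.T @ y)

def soft_threshold(v, tau):
    """Componentwise soft-thresholding, the nonlinearity used in ISTA-type methods."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

# Hypothetical toy problem (an assumption for illustration):
# a diagonal operator with one small singular value.
A = np.diag([1.0, 0.1])
x_true = np.array([1.0, 1.0])
y = A @ x_true  # noise-free data, chosen only to keep the example simple

# For a diagonal operator with entries s_i, the Tikhonov solution is
# x_i = s_i * y_i / (s_i**2 + alpha), so the second component is damped strongly.
x_alpha = tikhonov_solve(A, y, alpha=0.01)
print(x_alpha)
```

In this sketch the component belonging to the small singular value is recovered only partially, which is the usual trade-off Tikhonov regularization makes between stability and accuracy.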

## Notes

### Acknowledgements

The author acknowledges the financial support provided by the Deutsche Forschungsgemeinschaft (DFG) under grant GRK 2224/1 “Pi3: Parameter Identification – Analysis, Algorithms, Applications”. The numerical examples were done by Hannes Albers and Alexander Denker, who in particular designed the experiment with sparse input data. The statistical analysis was supported by Max Westphal. Furthermore, the author wants to thank Carola Schönlieb for her hospitality; part of the paper was written during the author's sabbatical in Cambridge. Finally, the author wants to thank a reviewer for careful reading and for suggesting several improvements.
