Fast optimal transport regularized projection and application to coefficient shrinkage and filtering


This paper explores solutions to the problem of regularized projections with respect to the optimal transport metric. Expanding recent works on optimal transport dictionary learning and non-negative matrix factorization, we derive general purpose algorithms for projecting on any set of vectors with any regularization, and we further propose fast algorithms for the special cases of projecting onto invertible or orthonormal bases. Noting that pass filters and coefficient shrinkage can be seen as regularized projections under the Euclidean metric, we show how to use our algorithms to perform optimal transport pass filters and coefficient shrinkage. We give experimental evidence that using the optimal transport distance instead of the Euclidean distance for filtering and coefficient shrinkage leads to reduced artifacts and improved denoising results.
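The Euclidean case mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' library: the function names, the random orthonormal basis built with a QR factorization, and the values of `alpha` and `k` are all assumptions made for illustration. For an orthonormal basis D, soft thresholding of the coefficients solves the L1-regularized Euclidean projection, and a low-pass filter is the projection with all but the first k coefficients forced to zero.

```python
import numpy as np

def soft_threshold(c, alpha):
    """Shrink each coefficient toward zero by alpha (soft thresholding)."""
    return np.sign(c) * np.maximum(np.abs(c) - alpha, 0.0)

def euclidean_shrinkage(x, D, alpha):
    """Solve min_l 0.5 * ||x - D l||_2^2 + alpha * ||l||_1 for orthonormal D."""
    coeffs = D.T @ x              # analysis step: coefficients of x in the basis
    lam = soft_threshold(coeffs, alpha)
    return lam, D @ lam           # shrunk coefficients and reconstructed signal

def euclidean_low_pass(x, D, k):
    """Project x onto the span of the first k columns of orthonormal D."""
    lam = D.T @ x
    lam[k:] = 0.0                 # coefficients outside the pass band are forced to zero
    return lam, D @ lam

def objective(x, D, lam, alpha):
    """Value of the L1-regularized Euclidean projection objective."""
    return 0.5 * np.sum((x - D @ lam) ** 2) + alpha * np.sum(np.abs(lam))

# Hypothetical data: a random orthonormal basis (via QR) and a random signal.
rng = np.random.default_rng(0)
n = 16
D, _ = np.linalg.qr(rng.standard_normal((n, n)))
x = rng.standard_normal(n)
alpha = 0.3

lam, x_hat = euclidean_shrinkage(x, D, alpha)
lam_lp, x_lp = euclidean_low_pass(x, D, k=4)

# Perturbing the closed-form solution should never lower the objective.
trial = lam + 0.1 * rng.standard_normal(n)
print(objective(x, D, lam, alpha), objective(x, D, trial, alpha))
```

The optimal transport variants studied in the paper replace the squared Euclidean term with an optimal transport loss; this sketch covers only the Euclidean baseline they are compared against.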



Code availability

A Python library for our methods and all scripts necessary to reproduce our figures and results will be made available on the author's website upon publication of this paper.


  1. Since D is orthonormal, the problem is actually equivalent to simply \(\min_{\boldsymbol{\lambda}} \mathrm{OT}_{\gamma}(\boldsymbol{X}, \boldsymbol{\lambda}) + \alpha \Vert \boldsymbol{\lambda} \Vert_2^2\).




This work was partly supported by JSPS KAKENHI Grant Number 17H01788.

Author information




AR and VS designed the research and wrote the paper. Experiments were performed by AR. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Antoine Rolet.


Cite this article

Rolet, A., Seguy, V. Fast optimal transport regularized projection and application to coefficient shrinkage and filtering. Vis Comput (2021).



  • Optimal transport
  • Wasserstein distance
  • Coefficient shrinkage
  • Sparse decomposition
  • Wavelet thresholding
  • Denoising