CNN-Based DCT-Like Transform for Image Compression

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10705)

Abstract

This paper presents a block transform for image compression that is inspired by the discrete cosine transform (DCT) but obtained by training convolutional neural network (CNN) models. Specifically, we combine convolution, nonlinear mapping, and a linear transform to form a non-linear transform and a corresponding non-linear inverse transform. The transform, quantization, and inverse transform are trained jointly to achieve overall rate-distortion optimization. For training, we propose to estimate the rate by the \(l_1\)-norm of the quantized coefficients. We also explore different combinations of linear/non-linear transform and inverse transform. Experimental results show that the proposed CNN-based transform achieves higher compression efficiency than the fixed DCT and significantly outperforms JPEG at low bit rates.
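The rate-distortion objective described in the abstract can be illustrated with a small sketch. It is not the authors' network: it swaps in the fixed orthonormal DCT (the baseline the abstract compares against) for the learned CNN transform, uses plain rounding as the quantizer, and takes the \(l_1\)-norm of the quantized coefficients as the rate proxy. The block size `N = 8`, the step size `step`, the weight `lam`, and all function names are assumptions made for illustration only.

```python
import math

N = 8  # assumed block size; the abstract does not state one


def dct_matrix(n=N):
    """Rows of the orthonormal DCT-II basis (n x n)."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def forward(block, c):
    """Separable 2-D transform: C * X * C^T."""
    return matmul(matmul(c, block), transpose(c))


def inverse(coeffs, c):
    """Inverse of the orthonormal transform: C^T * Y * C."""
    return matmul(matmul(transpose(c), coeffs), c)


def quantize(coeffs, step):
    """Uniform scalar quantization by rounding (hypothetical quantizer)."""
    return [[round(v / step) for v in row] for row in coeffs]


def dequantize(q, step):
    return [[v * step for v in row] for row in q]


def l1_rate(q):
    """Rate proxy: l1-norm of the quantized coefficients."""
    return sum(abs(v) for row in q for v in row)


def mse(a, b):
    n = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / n


def rd_cost(block, step, lam, c):
    """Return (D + lam * R, R, D) for one block."""
    q = quantize(forward(block, c), step)
    recon = inverse(dequantize(q, step), c)
    rate = l1_rate(q)
    dist = mse(block, recon)
    return dist + lam * rate, rate, dist


if __name__ == "__main__":
    flat = [[100.0] * N for _ in range(N)]
    print(rd_cost(flat, step=10.0, lam=0.01, c=dct_matrix()))
```

In a trained model the two matrix products would be replaced by the learned transform and inverse transform, and the rounding step would need a differentiable treatment during training; how the paper handles that is not stated in the abstract.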

Notes

  1. http://r0k.us/graphics/kodak/

  2. https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.15/

  3. https://github.com/tensorflow/models/tree/master/compression. This network has no entropy coding, since the authors do not provide one.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 61772483, Grant 61390512, and Grant 61425026, and by the Fundamental Research Funds for the Central Universities under Grant WK3490000001.

Author information

Corresponding author

Correspondence to Dong Liu.

Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Liu, D., Ma, H., Xiong, Z., Wu, F. (2018). CNN-Based DCT-Like Transform for Image Compression. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10705. Springer, Cham. https://doi.org/10.1007/978-3-319-73600-6_6

  • DOI: https://doi.org/10.1007/978-3-319-73600-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-73599-3

  • Online ISBN: 978-3-319-73600-6

  • eBook Packages: Computer Science (R0)
