Heterogeneous CPU plus GPU approaches for HEVC

  • Gabriel Cebrián-Márquez
  • Vicente Galiano
  • Héctor Migallón
  • José Luis Martínez
  • Pedro Cuenca
  • Otoniel López-Granado

Abstract

The high efficiency video coding (HEVC) standard has opened the door to high-quality multimedia content and new formats such as ultra-high definition, in response to the unceasing demands of the market. This standard is able to outperform prior standards by up to 50% in terms of perceptual video quality, but at the cost of extremely high computational complexity. For this reason, the development of fast coding algorithms is now a requirement to make HEVC a suitable candidate for real-world scenarios. In this regard, this paper proposes a collaborative CPU + GPU coding architecture for this standard, in which the CPU performs a coarse-grained parallelization of the encoder, while the GPU carries out fast motion estimation. Given that the GPU algorithm can work together with a wide variety of parallel algorithms, this paper evaluates two of them: tiles, defined in the standard, and slices, already present in previous standards. Results indicate that slices are better suited in terms of parallel efficiency (10.75× speedup on average using 12 threads), while tiles achieve better coding efficiency.
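
As a rough illustration of the speedup figure quoted above, parallel efficiency is commonly taken as the speedup divided by the number of threads. The short Python sketch below applies that definition to the 10.75× and 12-thread values from the abstract. The function name parallel_efficiency is hypothetical and is not part of the paper or of any HEVC software; the paper itself may define efficiency differently.

```python
def parallel_efficiency(speedup: float, threads: int) -> float:
    """Return speedup divided by thread count (1.0 means ideal linear scaling)."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    return speedup / threads


# Values quoted in the abstract for the slice-based configuration.
efficiency = parallel_efficiency(10.75, 12)
print(f"{efficiency:.3f}")  # prints 0.896
```

Under this assumed definition, the slice-based configuration would be using roughly 90% of the ideal 12× scaling.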

Keywords

HEVC, H.265, Parallel encoding, GPU, Tiles, Slices

Notes

Acknowledgements

This work was jointly supported by the Spanish Ministry of Economy and Competitiveness and the European Commission (FEDER funds) under the Projects TIN2015-66972-C5-2-R and TIN2015-66972-C5-4-R, and by the Spanish Ministry of Education, Culture and Sports under the Grant FPU13/04601.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Albacete Research Institute of Informatics (I3A), University of Castilla-La Mancha, Albacete, Spain
  2. Department of Physics and Computer Architecture, Miguel Hernández University, Elche, Spain
