3D Video Compression

Müller, Karsten; Merkle, Philipp; Tech, Gerhard

doi:10.1007/978-1-4419-9964-1_8

Chapter
First Online: 01 January 2012

Abstract

This chapter presents compression methods for 3D video (3DV), covering data formats, video and depth compression, evaluation methods, and analysis tools. First, the fundamental principles of video coding for classical 2D video content are reviewed, including signal prediction, quantization, transformation, and entropy coding. These methods are then extended to multi-view video coding (MVC), where inter-view prediction is added to the 2D video coding methods to achieve higher coding efficiency. Next, 3DV coding principles are introduced, which differ from the previous coding methods: in 3DV, a generic input format is used for coding, and a dense set of output views is generated for different types of autostereoscopic displays. This influences format selection, encoder optimization, and evaluation methods, and it requires new modules such as decoder-side view generation, as discussed in this chapter. Finally, different 3DV formats are compared and assessed for their applicability to 3DV systems.

Author information

Authors and Affiliations

Image Processing Department, Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut, Einsteinufer 37, 10587, Berlin, Germany
Karsten Müller, Philipp Merkle & Gerhard Tech

Corresponding author

Correspondence to Karsten Müller.

Editor information

Editors and Affiliations

School of Electrical & Electronic Engineering, Nanyang Technological University, Nanyang Avenue 50, Singapore, 639798, Singapore
Ce Zhu
Department of Information Science & Electronic Engineering, Zhejiang University, Zheda Road 38, Hangzhou, 310027, People's Republic of China
Yin Zhao
Department of Information Science & Electronic Engineering, Zhejiang University, Zheda Road 38, Hangzhou, 310027, People's Republic of China
Lu Yu
Graduate School of Engineering, Department of Electrical Engineering and Computer Science, Nagoya University, Nagoya, 464-8603, Japan
Masayuki Tanimoto

About this chapter

Cite this chapter

Müller, K., Merkle, P., Tech, G. (2013). 3D Video Compression. In: Zhu, C., Zhao, Y., Yu, L., Tanimoto, M. (eds) 3D-TV System with Depth-Image-Based Rendering. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9964-1_8

DOI: https://doi.org/10.1007/978-1-4419-9964-1_8
Published: 15 August 2012
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4419-9963-4
Online ISBN: 978-1-4419-9964-1
eBook Packages: Engineering, Engineering (R0)
