Depth map compression via 3D region-based representation

Duch, Marc Maceira; Morros, Josep-Ramon; Ruiz-Hidalgo, Javier

doi:10.1007/s11042-016-3727-1

Published: 21 July 2016

Volume 76, pages 13761–13784 (2017)

Multimedia Tools and Applications


Abstract

In 3D video, view synthesis is used to create new virtual views between encoded camera views. Errors in the coding of the depth maps introduce geometric inconsistencies in the synthesized views. This paper presents a new 3D plane representation of the scene that improves the performance of current standard video codecs in the view synthesis domain. Two image segmentation algorithms are proposed to generate a color partition and a depth partition. Using both partitions, depth maps are segmented into regions free of sharp discontinuities, without having to explicitly signal all depth edges. The resulting regions are represented using a planar model in the 3D world scene. This 3D representation allows efficient encoding while preserving the 3D characteristics of the scene. The 3D planes also open up the possibility of coding multiview images with a single representation.
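
As a rough illustration of what representing a depth region "using a planar model in the 3D world scene" can look like, the sketch below back-projects pixels to 3D with a pinhole camera model, fits a plane z = a*x + b*y + c to a region by ordinary least squares, and recovers a pixel's depth from the three plane parameters alone. It is a minimal sketch under stated assumptions, not the authors' method: the function names, the intrinsics `fx`, `fy`, `cx`, `cy`, and the choice of a plain least-squares fit are all hypothetical, since the abstract does not specify how the planes are estimated or encoded.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]
Plane = Tuple[float, float, float]  # (a, b, c) in z = a*x + b*y + c


def backproject(u: float, v: float, z: float,
                fx: float, fy: float, cx: float, cy: float) -> Point3D:
    """Map pixel (u, v) with depth z to a camera-space 3D point (assumed pinhole model)."""
    return ((u - cx) * z / fx, (v - cy) * z / fy, z)


def _det3(m: Sequence[Sequence[float]]) -> float:
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points: List[Point3D]) -> Plane:
    """Least-squares fit of z = a*x + b*y + c to the 3D points of one region."""
    if len(points) < 3:
        raise ValueError("need at least three points")
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    # Normal equations A * [a, b, c]^T = rhs.
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [sxz, syz, sz]
    d = _det3(A)
    if abs(d) < 1e-12:
        raise ValueError("degenerate region (points are collinear)")
    # Cramer's rule: replace one column of A with rhs per unknown.
    solution = []
    for col in range(3):
        m = [[rhs[r] if c == col else A[r][c] for c in range(3)] for r in range(3)]
        solution.append(_det3(m) / d)
    return (solution[0], solution[1], solution[2])


def plane_depth_at_pixel(u: float, v: float, plane: Plane,
                         fx: float, fy: float, cx: float, cy: float) -> float:
    """Depth at pixel (u, v) implied by the plane, i.e. the decoder-side reconstruction."""
    a, b, c = plane
    # Substituting x = (u - cx) * z / fx and y = (v - cy) * z / fy into the plane
    # equation and solving for z gives z = c / (1 - a*(u - cx)/fx - b*(v - cy)/fy).
    denom = 1.0 - a * (u - cx) / fx - b * (v - cy) / fy
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the plane")
    return c / denom


if __name__ == "__main__":
    # Synthetic region lying exactly on a known plane.
    region = [(float(x), float(y), 0.5 * x - 0.25 * y + 2.0)
              for x in range(4) for y in range(4)]
    plane = fit_plane(region)
    print("fitted plane (a, b, c):", plane)
    print("depth at pixel (420, 240):",
          plane_depth_at_pixel(420, 240, plane, 500.0, 500.0, 320.0, 240.0))
```

In this sketch a region of any size is summarized by three numbers, which suggests why a planar model can be cheap to encode; a robust estimator could be swapped in for the least-squares fit when the depth data contain outliers.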



Acknowledgments

This work has been developed in the framework of the project BIGGRAPH-TEC2013-43935-R, financed by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF).

Author information

Authors and Affiliations

Universitat Politècnica de Catalunya, Barcelona, Spain
Marc Maceira Duch, Josep-Ramon Morros & Javier Ruiz-Hidalgo


Corresponding author

Correspondence to Marc Maceira Duch.


About this article

Cite this article

Duch, M.M., Morros, JR. & Ruiz-Hidalgo, J. Depth map compression via 3D region-based representation. Multimed Tools Appl 76, 13761–13784 (2017). https://doi.org/10.1007/s11042-016-3727-1


Received: 21 January 2016
Revised: 09 June 2016
Accepted: 27 June 2016
Published: 21 July 2016
Issue Date: June 2017
DOI: https://doi.org/10.1007/s11042-016-3727-1
