Abstract
In 3D video, view synthesis is used to create new virtual views between encoded camera views. Errors in the coding of the depth maps introduce geometric inconsistencies in the synthesized views. This paper presents a new 3D plane representation of the scene that improves the performance of current standard video codecs in the view synthesis domain. Two image segmentation algorithms are proposed, one generating a color segmentation and the other a depth segmentation. Using both partitions, depth maps are segmented into regions free of sharp discontinuities, without having to explicitly signal all depth edges. The resulting regions are represented using a planar model in the 3D world scene. This 3D representation allows efficient encoding while preserving the 3D characteristics of the scene. The 3D planes also open up the possibility of coding multiview images with a single representation.
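The abstract does not say how the planar model is fitted, so the following is only a minimal sketch of the idea of representing one depth region by a plane in 3D. It assumes a pinhole camera with hypothetical intrinsics `f`, `cx`, `cy` and uses ordinary least squares on the model Z = aX + bY + c; the function names are illustrative and are not taken from the paper. A practical implementation would likely use an array library and a robust estimator.

```python
from typing import Iterable, List, Tuple

Point3D = Tuple[float, float, float]
Plane = Tuple[float, float, float]  # (a, b, c) in Z = a*X + b*Y + c


def backproject(u: float, v: float, z: float,
                f: float, cx: float, cy: float) -> Point3D:
    """Map pixel (u, v) with depth z to camera coordinates (assumed pinhole model)."""
    return ((u - cx) * z / f, (v - cy) * z / f, z)


def _solve3(A: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate region: points do not determine a plane")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):  # back substitution
        s = M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / M[i][i]
    return x


def fit_plane(points: Iterable[Point3D]) -> Plane:
    """Least-squares fit of Z = a*X + b*Y + c via the normal equations."""
    sxx = sxy = syy = sx = sy = sxz = syz = sz = 0.0
    n = 0
    for X, Y, Z in points:
        sxx += X * X
        sxy += X * Y
        syy += Y * Y
        sx += X
        sy += Y
        sxz += X * Z
        syz += Y * Z
        sz += Z
        n += 1
    if n < 3:
        raise ValueError("at least three points are needed to fit a plane")
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    a, b, c = _solve3(A, [sxz, syz, sz])
    return (a, b, c)


def plane_depth(plane: Plane, u: float, v: float,
                f: float, cx: float, cy: float) -> float:
    """Depth at pixel (u, v) implied by the plane.

    Substituting X = (u - cx) * Z / f and Y = (v - cy) * Z / f into
    Z = a*X + b*Y + c and solving for Z gives the expression below.
    """
    a, b, c = plane
    return c / (1.0 - a * (u - cx) / f - b * (v - cy) / f)
```

Under these assumptions, a whole region's depth is described by three parameters: a decoder could call `plane_depth` for every pixel of the region to reconstruct its depth values, which is the sense in which a planar model can be compact while still describing 3D geometry.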
Acknowledgments
This work has been developed in the framework of the project BIGGRAPH-TEC2013-43935-R, financed by the Spanish Ministerio de Economía y Competitividad and the European Regional Development Fund (ERDF).
Cite this article
Duch, M.M., Morros, JR. & Ruiz-Hidalgo, J. Depth map compression via 3D region-based representation. Multimed Tools Appl 76, 13761–13784 (2017). https://doi.org/10.1007/s11042-016-3727-1
DOI: https://doi.org/10.1007/s11042-016-3727-1