3D Video Representation and Coding

Abstract

The technologies that enable an immersive user experience in 3D environments are evolving rapidly, and new services have emerged in various fields of application. Most of these services require 3D video combined with appropriate display systems. As a consequence, research and development in 3D video continues to attract sustained interest.

While stereoscopic viewing is already widespread, notably in TV and gaming, new displays and applications, such as free-viewpoint TV (FTV), require a larger number of views. Hence, the multiview video format was considered, which uses N views corresponding to the images captured by N cameras (either real or virtual) with a controlled spatial arrangement. To avoid a linear escalation of the bitrate associated with the use of multiple views, video-plus-depth formats have been proposed. A small number of texture and depth video sequences are used to synthesize intermediate texture views at different spatial positions through a depth-image-based rendering (DIBR) technique. This technology enables advanced stereoscopic display processing and improves support for high-quality autostereoscopic multiview displays.
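
As a rough illustration of the DIBR idea described above, the following sketch forward-warps a single row of texture to a virtual viewpoint. It assumes rectified, parallel cameras, so that a pixel at depth Z shifts horizontally by a disparity d = f·b / Z. The function name `warp_row`, the parameters `focal_px` and `baseline`, and the simple z-buffer with unfilled holes are illustrative assumptions, not the synthesis method used in the chapter.

```python
def warp_row(texture, depth, focal_px, baseline):
    """Forward-warp one image row to a virtual camera moved right by `baseline`.

    Assumes rectified, parallel cameras: a pixel at depth Z shifts left
    by the disparity d = focal_px * baseline / Z (rounded to whole pixels).
    Positions that receive no pixel stay None (holes to be inpainted).
    """
    width = len(texture)
    warped = [None] * width
    nearest = [float("inf")] * width  # z-buffer: keep the closest surface
    for x in range(width):
        z = depth[x]
        disparity = focal_px * baseline / z
        target = x - int(round(disparity))
        # Write only if inside the image and closer than what is already there.
        if 0 <= target < width and z < nearest[target]:
            warped[target] = texture[x]
            nearest[target] = z
    return warped


# Hypothetical data: two foreground pixels (depth 2) over a background (depth 4).
row = warp_row([10, 20, 30, 40, 50, 60],
               [4.0, 4.0, 2.0, 2.0, 4.0, 4.0],
               focal_px=4.0, baseline=1.0)
print(row)  # [30, 40, None, 50, 60, None]
```

In this toy row the foreground pixel overwrites the background pixel that lands on the same position, and the `None` entries mark a disocclusion and an image-border hole. A practical renderer would fill such holes, for example by inpainting or by blending a second reference view.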

To provide true 3D content and fatigue-free 3D visualization, holoscopic imaging has been introduced as an acquisition and display solution. However, efficient coding schemes for this particular type of content are needed to enable proper storage and delivery of the large amount of data involved in these systems; such schemes are also addressed in this chapter.

Acknowledgements

The authors would like to thank the Interactive Visual Media Group of Microsoft Research and the National Institute of Information and Communications Technology (NICT) for providing the Ballet and Breakdancers data sets and the Shark data set, respectively, for research purposes.

Author information

Corresponding author

Correspondence to Sérgio M. M. Faria.

Copyright information

© 2015 Springer Science+Business Media New York

About this chapter

Cite this chapter

Faria, S.M.M., Debono, C.J., Nunes, P., Rodrigues, N.M.M. (2015). 3D Video Representation and Coding. In: Kondoz, A., Dagiuklas, T. (eds) Novel 3D Media Technologies. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2026-6_3

  • DOI: https://doi.org/10.1007/978-1-4939-2026-6_3

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4939-2025-9

  • Online ISBN: 978-1-4939-2026-6

  • eBook Packages: Engineering, Engineering (R0)
