Scene Segmentation Assisted by Depth Data

Chapter in Time-of-Flight and Structured Light Depth Cameras

Abstract

Segmentation, the detection of scene elements within an image, can be drastically simplified by combining depth and color data. This combination yields segmentation tools that outperform techniques based on color alone. This chapter shows how consumer depth camera data can be used for three different tasks. The first is video matting, the separation of foreground objects from the background. The second is scene segmentation, the partitioning of color images and depth maps into regions corresponding to scene elements. The third is semantic segmentation, the task of segmenting the framed scene and associating each segment with a specific category of object. We present various algorithms and methodologies for both single frames and video sequences of color and depth data.
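
As a rough illustration of why depth simplifies the first task, the sketch below separates foreground from background with a single depth threshold and composites the result over a new background. It is not the chapter's method: the function names, the convention that a depth of 0 marks a missing sample, and the 1.5 m threshold are all assumptions made for this example.

```python
import numpy as np


def depth_matte(depth, max_depth_m):
    """Binary alpha matte: 1 where a valid depth sample is nearer than max_depth_m.

    Assumption: a depth value of 0 marks a missing measurement.
    """
    valid = depth > 0
    return (valid & (depth < max_depth_m)).astype(np.float32)


def composite(color, alpha, background):
    """Blend foreground over background: alpha * color + (1 - alpha) * background."""
    a = alpha[..., None]  # add a channel axis so alpha broadcasts over RGB
    return a * color + (1.0 - a) * background


# Toy 2x2 frame: a uniform gray color image and a depth map in meters.
color = np.full((2, 2, 3), 200.0, dtype=np.float32)
depth = np.array([[0.8, 2.5],
                  [0.0, 1.0]], dtype=np.float32)
background = np.zeros_like(color)

alpha = depth_matte(depth, max_depth_m=1.5)  # hypothetical threshold
out = composite(color, alpha, background)
```

A hard threshold like this gives a binary matte only. It ignores depth noise, missing samples near object edges, and the fractional alpha values a true matte would have along boundaries, which is where color information would typically be brought in.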


Notes

  1. In the case of structured light depth cameras, the amplitude image is not very informative about the scene’s color, since it is dominated by the projected pattern.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Zanuttigh, P., Marin, G., Dal Mutto, C., Dominio, F., Minto, L., Cortelazzo, G.M. (2016). Scene Segmentation Assisted by Depth Data. In: Time-of-Flight and Structured Light Depth Cameras. Springer, Cham. https://doi.org/10.1007/978-3-319-30973-6_6

  • DOI: https://doi.org/10.1007/978-3-319-30973-6_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-30971-2

  • Online ISBN: 978-3-319-30973-6

  • eBook Packages: Computer Science, Computer Science (R0)
