Scene Segmentation Assisted by Depth Data

Zanuttigh, Pietro; Marin, Giulio; Dal Mutto, Carlo; Dominio, Fabio; Minto, Ludovico; Cortelazzo, Guido Maria

doi:10.1007/978-3-319-30973-6_6



Abstract

Segmentation, or detecting scene elements within an image, can be drastically simplified by combining depth and color data. This approach delivers segmentation tools that outperform techniques based on color alone. This chapter shows how consumer depth camera data can be used for three different tasks. The first is video matting, the separation of foreground objects from the background. The second is scene segmentation, the partitioning of color images and depth maps into different regions corresponding to scene elements. The third is semantic segmentation, the task of segmenting the framed scene and associating each segment with a specific category of object. We present various algorithms and methodologies for both single frames and video sequences of color and depth data.
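As a rough illustration of why depth simplifies the first task, the sketch below separates foreground from background with a plain depth threshold and falls back to a color test only where depth is missing. It is not one of the chapter's algorithms. The function name `foreground_mask`, the thresholds, the single-color background model, and the convention that a depth of 0.0 marks an invalid sample are all assumptions made for this example.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]


def foreground_mask(
    depth: List[List[float]],
    color: List[List[Color]],
    background_color: Color,
    max_depth_m: float = 1.5,
    color_thresh: float = 60.0,
) -> List[List[bool]]:
    """Label each pixel as foreground (True) or background (False).

    Pixels with a valid depth sample are classified by distance alone.
    Pixels with missing depth (encoded as 0.0 here, an assumption) are
    classified by their Euclidean RGB distance from a known background color.
    """
    mask = []
    for depth_row, color_row in zip(depth, color):
        row = []
        for d, c in zip(depth_row, color_row):
            if d > 0.0:
                # Valid depth: anything closer than the threshold is foreground.
                row.append(d < max_depth_m)
            else:
                # Missing depth: fall back to a color-difference test.
                dist = sum((a - b) ** 2 for a, b in zip(c, background_color)) ** 0.5
                row.append(dist > color_thresh)
        mask.append(row)
    return mask


# Tiny 2x2 example: top row has valid depth, bottom row has missing depth.
depth = [[1.0, 3.0], [0.0, 0.0]]
color = [[(200, 50, 50), (20, 20, 20)], [(210, 60, 40), (25, 18, 22)]]
mask = foreground_mask(depth, color, background_color=(20, 20, 20))
print(mask)  # [[True, False], [True, False]]
```

A fixed threshold of this kind only works when the foreground is clearly closer to the camera than the background. Practical matting methods also have to handle the uncertain pixels along object boundaries, which this sketch ignores.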


Notes

1. In the case of structured light depth cameras, the amplitude image is not very informative about the scene’s color, since it is dominated by the projected pattern.


Author information

Authors and Affiliations

Department of Information Engineering, University of Padova, Padova, Italy
Pietro Zanuttigh, Giulio Marin, Fabio Dominio & Ludovico Minto
Aquifi Inc., Palo Alto, CA, USA
Carlo Dal Mutto
3D Everywhere s.r.l., Padova, Italy
Guido Maria Cortelazzo



About this chapter

Cite this chapter

Zanuttigh, P., Marin, G., Dal Mutto, C., Dominio, F., Minto, L., Cortelazzo, G.M. (2016). Scene Segmentation Assisted by Depth Data. In: Time-of-Flight and Structured Light Depth Cameras. Springer, Cham. https://doi.org/10.1007/978-3-319-30973-6_6

DOI: https://doi.org/10.1007/978-3-319-30973-6_6
Published: 25 May 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30971-2
Online ISBN: 978-3-319-30973-6
eBook Packages: Computer Science, Computer Science (R0)
