Abstract
Segmentation, the detection of scene elements within an image, can be drastically simplified by combining depth and color data. This approach delivers segmentation tools that outperform techniques based on color alone. This chapter shows how consumer depth camera data can be used for three different tasks. The first is video matting, the separation of foreground objects from the background. The second is scene segmentation, the partitioning of color images and depth maps into different regions corresponding to scene elements. The third is semantic segmentation, the task of segmenting the framed scene and associating each segment with a specific category of object. We present various algorithms and methodologies for both single frames and color and depth video sequences.
Notes
1. In the case of structured light depth cameras, the amplitude image is not very informative about the scene’s color, since it is dominated by the projected pattern.
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Zanuttigh, P., Marin, G., Dal Mutto, C., Dominio, F., Minto, L., Cortelazzo, G.M. (2016). Scene Segmentation Assisted by Depth Data. In: Time-of-Flight and Structured Light Depth Cameras. Springer, Cham. https://doi.org/10.1007/978-3-319-30973-6_6
DOI: https://doi.org/10.1007/978-3-319-30973-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30971-2
Online ISBN: 978-3-319-30973-6
eBook Packages: Computer Science, Computer Science (R0)