Abstract
Extracting foreground objects from videos captured by a handheld camera has emerged as a new challenge. Existing approaches exploit cues such as depth and motion to extract the foreground layer, but they are limited in handling partial movement and cast shadows. In this paper, we bring a novel perspective to these two issues by utilizing the occlusion maps induced by object and camera motion and by taking advantage of interactive image segmentation methods. For partial movement, we treat each video frame as an image and synthesize “seeding” user interactions (i.e., the marks a user would manually place on foreground and background) from both forward and backward occlusion maps, thereby leveraging advances in high-quality interactive image segmentation. For cast shadows, we apply a paired-region-based shadow detection method and refine the initial segmentation results by removing the detected shadow regions. Qualitative and quantitative evaluations on the Hopkins dataset demonstrate the effectiveness and efficiency of the proposed approach.
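The two steps named in the abstract, seed synthesis from occlusion maps and shadow-based refinement, can be illustrated with a minimal sketch. The abstract does not state the actual seeding rule, so the rule used here is purely an assumption for illustration: pixels flagged in either occlusion map become foreground seeds, and pixels outside a margin-expanded bounding box of those seeds become background seeds. The names `synthesize_seeds`, `remove_shadow`, and `margin` are hypothetical, and the occlusion and shadow masks are assumed to be already computed.

```python
from typing import List, Tuple

Mask = List[List[bool]]  # binary image as rows of booleans


def synthesize_seeds(fwd_occ: Mask, bwd_occ: Mask,
                     margin: int = 1) -> Tuple[Mask, Mask]:
    """Return (foreground_seeds, background_seeds).

    ASSUMED rule, not taken from the paper: the union of the forward and
    backward occlusion maps gives foreground seeds; everything outside the
    seeds' bounding box, grown by `margin` pixels, gives background seeds.
    """
    h, w = len(fwd_occ), len(fwd_occ[0])
    fg = [[fwd_occ[y][x] or bwd_occ[y][x] for x in range(w)]
          for y in range(h)]
    ys = [y for y in range(h) for x in range(w) if fg[y][x]]
    xs = [x for y in range(h) for x in range(w) if fg[y][x]]
    if not ys:  # no occlusion detected: no seeds of either kind
        return fg, [[False] * w for _ in range(h)]
    y0, y1 = min(ys) - margin, max(ys) + margin
    x0, x1 = min(xs) - margin, max(xs) + margin
    bg = [[not (y0 <= y <= y1 and x0 <= x <= x1) for x in range(w)]
          for y in range(h)]
    return fg, bg


def remove_shadow(segmentation: Mask, shadow: Mask) -> Mask:
    """Refine a foreground mask by clearing pixels detected as shadow."""
    return [[s and not sh for s, sh in zip(seg_row, sh_row)]
            for seg_row, sh_row in zip(segmentation, shadow)]
```

In such a pipeline, the two seed masks would be handed to an interactive segmentation method in place of user strokes, and `remove_shadow` would be applied to the resulting mask together with the output of a shadow detector.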
Acknowledgments
This research was supported by Australian Research Council (ARC) grants.
About this article
Cite this article
Xiong, H., Wang, Z., He, R. et al. Robust foreground object segmentation from handheld camera videos with occlusion map. Multimed Tools Appl 75, 5751–5776 (2016). https://doi.org/10.1007/s11042-015-2538-0
DOI: https://doi.org/10.1007/s11042-015-2538-0