Joint motion boundary detection and CNN-based feature visualization for video object segmentation

Kamranian, Zahra; Naghsh Nilchi, Ahmad Reza; Sadeghian, Hamid; Tombari, Federico; Navab, Nassir

doi:10.1007/s00521-019-04448-7

Original Article
Published: 12 September 2019

Volume 32, pages 4073–4091, (2020)

Neural Computing and Applications

Abstract

This paper presents a video object segmentation method that jointly uses motion boundary maps and convolutional neural network (CNN)-based class-level saliency maps to co-segment the frames of a video. The key characteristic of the proposed approach is that it combines these two sources of information to create initial object and background regions, which are then employed within the co-segmentation energy function. The motion boundary map detects the areas that contain object movement, and the CNN-based class saliency map identifies the regions that contribute most to the network's correct classification. The proposed approach can be applied to unconstrained natural videos that include changes in an object's appearance, a rapidly moving background, object deformation under non-rigid motion, rapid camera motion and even a static object. Experimental results on two challenging datasets (DAVIS and SegTrack v2) demonstrate that the proposed method performs competitively with state-of-the-art approaches.
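
The idea of combining the two maps into initial object and background regions can be illustrated with a minimal sketch. The abstract does not state the combination rule, so everything below is an assumption made for illustration only: the function names `normalize` and `initial_regions`, the equal-weight average of the two maps, and the thresholds `hi` and `lo` are hypothetical and are not taken from the paper. Both input maps are assumed to be precomputed, per-pixel arrays of the same shape.

```python
import numpy as np


def normalize(x, eps=1e-8):
    """Rescale a map to roughly [0, 1]; a constant map becomes all zeros."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.min()) / (x.max() - x.min() + eps)


def initial_regions(motion_map, saliency_map, hi=0.6, lo=0.2):
    """Return per-pixel seed labels from two cue maps.

    1 = initial object region, 0 = initial background region,
    -1 = undecided (left for the energy minimization to resolve).
    The averaging rule and the thresholds are illustrative assumptions,
    not the rule used in the paper.
    """
    combined = 0.5 * (normalize(motion_map) + normalize(saliency_map))
    labels = np.full(combined.shape, -1, dtype=np.int8)
    labels[combined >= hi] = 1   # both cues agree strongly: object seed
    labels[combined <= lo] = 0   # both cues are weak: background seed
    return labels


# Toy 2x2 frame: motion is high in the bottom row,
# class saliency is high in the right column.
motion = np.array([[0.0, 0.0], [1.0, 1.0]])
saliency = np.array([[0.0, 1.0], [0.0, 1.0]])
labels = initial_regions(motion, saliency)
print(labels.tolist())  # [[0, -1], [-1, 1]]
```

In this toy case only the pixel where both cues are strong becomes an object seed, and only the pixel where both are weak becomes a background seed; pixels supported by a single cue stay undecided. In a method of the kind described, seeds like these would typically serve as constraints or priors in the co-segmentation energy function rather than as the final segmentation.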

Notes

https://github.com/zkamranian/Video-Object-Segmentation.

Author information

Authors and Affiliations

Department of Artificial Intelligence, Faculty of Computer Engineering, University of Isfahan, Isfahan, 8174673441, Iran
Zahra Kamranian & Ahmad Reza Naghsh Nilchi
Faculty of Engineering, University of Isfahan, Isfahan, 8174673441, Iran
Hamid Sadeghian
Computer Aided Medical Procedures and Augmented Reality, Technische Universität München, Munich, Germany
Federico Tombari & Nassir Navab

Corresponding author

Correspondence to Ahmad Reza Naghsh Nilchi.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest regarding the publication of this paper.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kamranian, Z., Naghsh Nilchi, A.R., Sadeghian, H. et al. Joint motion boundary detection and CNN-based feature visualization for video object segmentation. Neural Comput & Applic 32, 4073–4091 (2020). https://doi.org/10.1007/s00521-019-04448-7

Received: 14 March 2018
Accepted: 19 August 2019
Published: 12 September 2019
Issue Date: April 2020
DOI: https://doi.org/10.1007/s00521-019-04448-7
