
Geodesic Distance Histogram Feature for Video Segmentation

  • Hieu Le
  • Vu Nguyen
  • Chen-Ping Yu
  • Dimitris Samaras
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10111)

Abstract

This paper proposes a geodesic-distance-based feature that encodes global information to improve video segmentation algorithms. The feature is a joint histogram of intensity and geodesic distances, where the geodesic distances are computed as the shortest paths between superpixels via their boundaries. We also use adaptive voting weights and spatial pyramid configurations to add spatial information to the geodesic histogram feature, and show that this further improves results. The feature is generic and can be used as part of various algorithms. In experiments, we test the geodesic histogram feature by incorporating it into two existing video segmentation frameworks. This leads to significantly better performance on 3D video segmentation benchmarks on two datasets.
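
The construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a superpixel adjacency graph whose edge weights are boundary strengths in some non-negative unit, mean intensities scaled to [0, 1], and arbitrarily chosen bin counts. The function names (`geodesic_distances`, `geodesic_histogram`, `bin_index`) are hypothetical, and the adaptive voting weights and spatial pyramid are left out.

```python
import heapq

INF = float("inf")


def geodesic_distances(num_nodes, edges, source):
    """Shortest-path (Dijkstra) distances from `source` to every superpixel.

    `edges` maps a pair (i, j) of adjacent superpixels to a non-negative
    boundary strength (an assumed input), so crossing a strong boundary
    costs more than crossing a weak one.
    """
    adj = [[] for _ in range(num_nodes)]
    for (i, j), w in edges.items():
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = [INF] * num_nodes
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry, a shorter path was already found
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def bin_index(value, upper, n_bins):
    """Map a value in [0, upper] to a bin index in [0, n_bins - 1]."""
    if upper <= 0:
        return 0
    return min(int(value / upper * n_bins), n_bins - 1)


def geodesic_histogram(intensity, edges, source, n_int_bins=4, n_geo_bins=3):
    """Joint (intensity, geodesic distance) histogram seen from `source`.

    Bin counts are illustrative assumptions; every reachable superpixel
    casts one unweighted vote, and the result is normalized to sum to 1.
    """
    dist = geodesic_distances(len(intensity), edges, source)
    max_d = max(d for d in dist if d != INF)
    hist = [[0.0] * n_geo_bins for _ in range(n_int_bins)]
    for value, d in zip(intensity, dist):
        if d == INF:
            continue  # unreachable superpixels do not vote
        row = bin_index(value, 1.0, n_int_bins)
        col = bin_index(d, max_d, n_geo_bins)
        hist[row][col] += 1.0
    total = sum(map(sum, hist))
    return [[count / total for count in row] for row in hist]


# Toy example: four superpixels, with a strong boundary between 1 and 2.
intensity = [0.1, 0.2, 0.8, 0.9]
edges = {(0, 1): 0.1, (1, 2): 0.9, (2, 3): 0.1, (0, 3): 2.0}
feature = geodesic_histogram(intensity, edges, source=0)
```

In this toy graph the dark superpixels 0 and 1 are geodesically close to the source, while the bright superpixels 2 and 3 lie behind the strong boundary, so the votes fall into two well-separated cells of the joint histogram. That separation is the kind of global cue the feature is meant to capture.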

Keywords

Spectral Cluster, Geodesic Distance, Temporal Consistency, Histogram Feature, Video Segmentation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Notes

Acknowledgement

Partially supported by the Vietnam Education Foundation, NSF IIS-1161876, FRA DTFR5315C00011, the Stony Brook SensorCAT, the SubSample project from the DIGITEO Institute, France, and a gift from Adobe Corporation.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Hieu Le (1, email author)
  • Vu Nguyen (1)
  • Chen-Ping Yu (2)
  • Dimitris Samaras (1)
  1. Stony Brook University, Stony Brook, USA
  2. Harvard University, Cambridge, USA
