
Superpixels for Video Content Using a Contour-Based EM Optimization

  • Conference paper
Computer Vision -- ACCV 2014 (ACCV 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9006)


Abstract

A wide variety of computer vision applications rely on superpixel or supervoxel algorithms as a preprocessing step, which underlines the importance these algorithms have gained in recent years. However, most methods lack temporal consistency or fail to produce temporally stable segmentations. In this paper, we propose a novel, contour-based approach that generates temporally consistent superpixels for video content. It can be expressed in an expectation-maximization framework and uses an efficient label propagation built on backward optical flow to encourage the preservation of superpixel shapes and their spatial constellation over time. Using established benchmark suites, we show the superior performance of our approach compared to state-of-the-art supervoxel and superpixel algorithms for video content.
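As a rough illustration of what label propagation with backward optical flow can look like, the following minimal sketch warps a superpixel label map from one frame to the next. It is not the implementation from the paper: the function name `propagate_labels`, the flow layout `(dx, dy)` per pixel, the nearest-neighbor rounding, and the border clamping are all assumptions made for this example.

```python
import numpy as np

def propagate_labels(prev_labels, backward_flow):
    """Warp a label map from frame t-1 to frame t (hypothetical helper).

    prev_labels:   (H, W) integer label map of frame t-1.
    backward_flow: (H, W, 2) array; backward_flow[y, x] = (dx, dy) points
                   from pixel (x, y) in frame t to its estimated source
                   position in frame t-1 (assumed layout).
    """
    h, w = prev_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Source coordinates in the previous frame, rounded to the nearest pixel.
    src_x = np.rint(xs + backward_flow[..., 0]).astype(int)
    src_y = np.rint(ys + backward_flow[..., 1]).astype(int)
    # Clamp sources that fall outside the image to the border (assumption).
    src_x = np.clip(src_x, 0, w - 1)
    src_y = np.clip(src_y, 0, h - 1)
    # Every pixel of frame t looks up exactly one label in frame t-1.
    return prev_labels[src_y, src_x]

# Usage: content moves one pixel to the right, so the backward flow is dx = -1.
prev = np.array([[0, 0, 1, 1]] * 3)
flow = np.zeros((3, 4, 2))
flow[..., 0] = -1.0
new_labels = propagate_labels(prev, flow)
```

Because each pixel in the new frame reads its label from the previous frame, the propagated map is dense: no pixel is left unlabeled, which is one practical reason to prefer backward over forward flow for this kind of initialization.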


Notes

  1. The underlying assumption is that a temporal superpixel should share the same color in successive frames but not necessarily the same position.

  2. The changes after 5 iterations are only marginal. It should be noted that the boundary can move more than 1 pixel per iteration.


Author information


Correspondence to Matthias Reso.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Reso, M., Jachalsky, J., Rosenhahn, B., Ostermann, J. (2015). Superpixels for Video Content Using a Contour-Based EM Optimization. In: Cremers, D., Reid, I., Saito, H., Yang, M.H. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9006. Springer, Cham. https://doi.org/10.1007/978-3-319-16817-3_45


  • DOI: https://doi.org/10.1007/978-3-319-16817-3_45

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16816-6

  • Online ISBN: 978-3-319-16817-3

  • eBook Packages: Computer Science, Computer Science (R0)
