Unsupervised Learning of Multi-Frame Optical Flow with Occlusions

  • Joel Janai
  • Fatma Güney
  • Anurag Ranjan
  • Michael Black
  • Andreas Geiger
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11220)

Abstract

Learning optical flow with neural networks is hampered by the need to obtain training data with associated ground truth. Unsupervised learning is a promising direction, yet the performance of current unsupervised methods is still limited. In particular, the lack of proper occlusion handling in commonly used data terms constitutes a major source of error. While most optical flow methods process pairs of consecutive frames, more advanced occlusion reasoning can be realized when considering multiple frames. In this paper, we propose a framework for unsupervised learning of optical flow and occlusions over multiple frames. More specifically, we exploit the minimal configuration of three frames to strengthen the photometric loss and explicitly reason about occlusions. We demonstrate that our multi-frame, occlusion-sensitive formulation outperforms existing unsupervised two-frame methods and even produces results on par with some fully supervised methods.
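The abstract's core idea, an occlusion-masked photometric loss over the three-frame configuration (past, reference, future), can be sketched as follows. This is an illustrative reconstruction rather than the authors' implementation: the function names, the simple brightness-constancy penalty, the bilinear warp, and the constant occlusion penalty value are all assumptions.

```python
import numpy as np

def warp(image, flow):
    """Backward-warp a grayscale image by a per-pixel flow field
    using bilinear sampling (toy version, border-clamped)."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    return ((1 - wy) * ((1 - wx) * image[y0, x0] + wx * image[y0, x1])
            + wy * ((1 - wx) * image[y1, x0] + wx * image[y1, x1]))

def photometric_loss(ref, fut, past, flow_fwd, flow_bwd, occ_fwd, occ_bwd):
    """Occlusion-aware photometric loss over three frames.

    occ_fwd / occ_bwd are soft masks in [0, 1], where 1 marks pixels of
    the reference frame that are visible in the future / past frame.
    Pixels occluded in one temporal direction rely on the other frame,
    and a constant penalty discourages the trivial solution of declaring
    everything occluded.
    """
    err_fwd = np.abs(ref - warp(fut, flow_fwd))   # reference vs. future
    err_bwd = np.abs(ref - warp(past, flow_bwd))  # reference vs. past
    penalty = 0.5  # cost of labeling a pixel occluded (assumed value)
    loss = (occ_fwd * err_fwd + (1 - occ_fwd) * penalty
            + occ_bwd * err_bwd + (1 - occ_bwd) * penalty)
    return loss.mean()
```

With zero flow, identical frames, and all pixels marked visible, the loss is zero; setting a mask to zero charges the constant penalty instead of the photometric error, which is what lets occluded regions opt out of the data term.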

Supplementary material

Supplementary material 1: 474218_1_En_42_MOESM1_ESM.mp4 (35.6 MB)
Supplementary material 2: 474218_1_En_42_MOESM2_ESM.pdf (37.4 MB)


Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  • Joel Janai (1, 3)
  • Fatma Güney (4)
  • Anurag Ranjan (2)
  • Michael Black (2)
  • Andreas Geiger (1, 3)

  1. Autonomous Vision Group, MPI for Intelligent Systems, Tübingen, Germany
  2. Perceiving Systems Department, MPI for Intelligent Systems, Tübingen, Germany
  3. University of Tübingen, Tübingen, Germany
  4. Visual Geometry Group, University of Oxford, Oxford, England