3D Scene Flow: Image-Based Estimation of Dense Motion Fields (3D Szenenfluss – bildbasierte Schätzung dichter Bewegungsfelder)

Vogel, Christoph; Roth, Stefan; Schindler, Konrad

doi:10.1007/978-3-662-46900-2_48-1

Living reference work entry
First Online: 01 January 2016

Part of the book series: Springer Reference Naturwissenschaften (SRN)

Abstract

3D scene flow is a dense description of the geometry and the motion field of a dynamic scene. Accordingly, estimating scene flow from binocular video sequences generalizes two classical tasks of image-based measurement: estimating stereo correspondence and estimating optical flow. This chapter presents a model that represents the dynamic 3D scene as a set of planar segments, each of which undergoes a rigid-body motion (translation and rotation). The (over-)segmentation into rigid, planar segments is estimated jointly with their 3D geometry and 3D motion. The model is considerably more compact than the conventional pixel-wise representation, yet flexible enough to describe real scenes with several independent motions. It also allows a-priori assumptions about the scene to be incorporated and occlusions to be taken into account, and it permits the use of robust discrete optimization methods. Furthermore, combined with a dynamic model, it applies directly to several consecutive time steps. To this end, a separate representation is instantiated for each image, and appropriate constraints ensure that the estimate is consistent across views and across points in time. The model improves the accuracy and reliability of scene flow estimation, especially under adverse imaging conditions.
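
As an illustration of the representation described in the abstract, the following sketch computes the 3D motion and the induced 2D image displacement for one pixel of a planar, rigidly moving segment. It is not the authors' implementation: the pinhole camera with parameters `focal`, `cx` and `cy`, the plane parametrization `normal . X = 1` in camera coordinates, and all function names are assumptions made for this example.

```python
def backproject(pixel, focal, cx, cy, normal):
    """Intersect the viewing ray through `pixel` with the plane normal . X = 1."""
    u, v = pixel
    ray = ((u - cx) / focal, (v - cy) / focal, 1.0)
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the plane")
    return tuple(r / denom for r in ray)


def rigid_move(point, rotation, translation):
    """Apply X' = R X + t, with R given as a 3x3 nested tuple."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def project(point, focal, cx, cy):
    """Pinhole projection of a 3D point in camera coordinates."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)


def segment_flow(pixel, focal, cx, cy, normal, rotation, translation):
    """Return (3D displacement, 2D displacement) of the surface point seen at `pixel`."""
    before = backproject(pixel, focal, cx, cy, normal)
    after = rigid_move(before, rotation, translation)
    flow_3d = tuple(b - a for a, b in zip(before, after))
    u1, v1 = project(after, focal, cx, cy)
    flow_2d = (u1 - pixel[0], v1 - pixel[1])
    return flow_3d, flow_2d


IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

# Hypothetical segment: fronto-parallel plane at depth 10 (0.1 * Z = 1),
# translated by one unit along the x-axis without rotation.
flow_3d, flow_2d = segment_flow(
    pixel=(320.0, 240.0), focal=500.0, cx=320.0, cy=240.0,
    normal=(0.0, 0.0, 0.1), rotation=IDENTITY, translation=(1.0, 0.0, 0.0),
)
```

All pixels of a segment share the same plane and the same rigid motion, so in this sketch a segment is described by nine numbers (three for the plane, six for the motion) regardless of how many pixels it covers. This is one way to read the abstract's claim that the model is more compact than a pixel-wise representation.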

This contribution is part of the Handbuch der Geodäsie, volume "Photogrammetrie und Fernerkundung", edited by Christian Heipke, Hannover.

Notes

1. The chapter is based on the publications [40–42].
2. The terms "left" and "right" serve intuition only; the geometric configuration of the cameras can be chosen arbitrarily.
3. The Hamming distance between two Census signatures can in principle be scaled arbitrarily. The values stated later in the chapter for the weights \(\lambda\) and \(\mu\) apply to a scale factor of 1/30.
4. Consistent estimation also has technical advantages for the optimization: the energy function contains a smaller share of non-submodular terms.
5. www.cvlibs.net/datasets/kitti.
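
Note 3 refers to a data cost based on the Hamming distance between Census signatures. A minimal sketch of such a cost follows. Only the scale factor 1/30 comes from the note; the 3×3 window, the bit order, the comparison convention and the function names are hypothetical choices made for illustration.

```python
def census_signature(image, row, col, radius=1):
    """Encode which neighbours in the window are darker than the centre pixel."""
    center = image[row][col]
    signature = 0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if dr == 0 and dc == 0:
                continue  # the centre is not compared with itself
            bit = 1 if image[row + dr][col + dc] < center else 0
            signature = (signature << 1) | bit
    return signature


def census_cost(sig_a, sig_b, scale=1.0 / 30.0):
    """Scaled Hamming distance: number of differing bits times `scale`."""
    return scale * bin(sig_a ^ sig_b).count("1")


patch = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
brighter = [[value + 100 for value in line] for line in patch]
swapped = [[9, 2, 3], [4, 5, 6], [7, 8, 1]]  # two corner pixels exchanged

sig_patch = census_signature(patch, 1, 1)
cost_brighter = census_cost(sig_patch, census_signature(brighter, 1, 1))
cost_swapped = census_cost(sig_patch, census_signature(swapped, 1, 1))
```

Because the signature depends only on the ordering of intensities, adding a constant brightness offset to a patch leaves the cost at zero, whereas exchanging the two corner pixels flips two bits.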

References

Adiv, G.: Determining three-dimensional motion and structure from optical flow generated by several moving objects. IEEE Trans. Pattern Anal. Mach. Intell. 7(4), 384–401 (1985)
Badino, H., Kanade, T.: A head-wearable short-baseline stereo system for the simultaneous estimation of structure and motion. In: IAPR Conference on Machine Vision Application, Nara, pp. 185–189 (2011)
Baker, S., Scharstein, D., Lewis, J., Roth, S., Black, M.J., Szeliski, R.: A database and evaluation methodology for optical flow. Int. J. Comput. Vis. 92(1), 1–31 (2011). vision.middlebury.edu/flow
Basha, T., Moses, Y., Kiryati, N.: Multi-view scene flow estimation: a view centered variational approach. In: IEEE Conference on Computer Vision and Pattern Recognition, San Francisco (2010)
Black, M.J., Anandan, P.: Robust dynamic motion estimation over time. In: IEEE Conference on Computer Vision and Pattern Recognition, Lahaina (1991)
Bleyer, M., Rother, C., Kohli, P.: Surface stereo with soft segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, San Francisco (2010)
Bleyer, M., Rother, C., Kohli, P., Scharstein, D., Sinha, S.N.: Object stereo – joint stereo matching and object segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs (2011)
Brox, T., Malik, J.: Large displacement optical flow: descriptor matching in variational motion estimation. IEEE Trans. Pattern Anal. Mach. Intell. 33(3), 500–513 (2011)
Brox, T., Bruhn, A., Papenberg, N., Weickert, J.: High accuracy optical flow estimation based on a theory for warping. In: European Conference on Computer Vision, Prague (2004)
Carceroni, R.L., Kutulakos, K.N.: Multi-view scene capture by surfel sampling: from video streams to non-rigid 3D motion, shape and reflectance. Int. J. Comput. Vis. 49, 175–214 (2002)
Courchay, J., Pons, J.P., Monasse, P., Keriven, R.: Dense and accurate spatio-temporal multi-view stereovision. In: Asian Conference on Computer Vision, Xi’an (2009)
Devernay, F., Mateus, D., Guilbert, M.: Multi-camera scene flow by tracking 3-D points and surfels. In: IEEE Conference on Computer Vision and Pattern Recognition, New York (2006)
Furukawa, Y., Ponce, J.: Dense 3D motion capture from synchronized video streams. In: IEEE Conference on Computer Vision and Pattern Recognition, Anchorage (2008)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? In: IEEE Conference on Computer Vision and Pattern Recognition, Providence (2012). www.cvlibs.net/datasets/kitti/
Gorelick, L., Veksler, O., Boykov, Y., Ben Ayed, I., Delong, A.: Local submodular approximations for binary pairwise energies. In: IEEE Conference on Computer Vision and Pattern Recognition, Columbus (2014)
Hirschmüller, H.: Stereo processing by semiglobal matching and mutual information. IEEE Trans. Pattern Anal. Mach. Intell. 30(2), 328–341 (2008)
Huguet, F., Devernay, F.: A variational method for scene flow estimation from stereo sequences. In: IEEE International Conference on Computer Vision, Rio de Janeiro (2007)
Hung, C.H., Xu, L., Jia, J.: Consistent binocular depth and scene flow with chained temporal profiles. Int. J. Comput. Vis. 102(1–3), 271–292 (2013)
Lempitsky, V., Roth, S., Rother, C.: FusionFlow: discrete-continuous optimization for optical flow estimation. In: IEEE Conference on Computer Vision and Pattern Recognition, Anchorage (2008)
Lempitsky, V., Rother, C., Roth, S., Blake, A.: Fusion moves for Markov random field optimization. IEEE Trans. Pattern Anal. Mach. Intell. 32(8), 1392–1405 (2010)
Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: International Joint Conference on Artificial Intelligence, vol. 2, Vancouver (1981)
Meister, S., Jähne, B., Kondermann, D.: Outdoor stereo camera system for the generation of real-world benchmark data sets. Opt. Eng. 51(2), 021107 (2012)
Müller, T., Rannacher, J., Rabe, C., Franke, U.: Feature- and depth-supported modified total variation optical flow for 3D motion field estimation in real scenes. In: IEEE Conference on Computer Vision and Pattern Recognition, Colorado Springs (2011)
Murray, D.W., Buxton, B.F.: Scene segmentation from visual motion using global optimization. IEEE Trans. Pattern Anal. Mach. Intell. 9(2), 220–228 (1987)
Nir, T., Bruckstein, A., Kimmel, R.: Over-parameterized variational optical flow. Int. J. Comput. Vis. 76(2), 205–216 (2008)
Park, J., Oh, T.H., Jung, J., Tai, Y.W., Kweon, I.S.: A tensor voting approach for multi-view 3D scene flow estimation and refinement. In: European Conference on Computer Vision, Florence (2012)
Rabe, C., Müller, T., Wedel, A., Franke, U.: Dense, robust, and accurate motion field estimation from stereo image sequences in real-time. In: European Conference on Computer Vision, Heraklion (2010)
Rother, C., Kolmogorov, V., Lempitsky, V., Szummer, M.: Optimizing binary MRFs via extended roof duality. In: IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis (2007)
Schoenemann, T., Cremers, D.: High resolution motion layer decomposition using dual-space graph cuts. In: IEEE Conference on Computer Vision and Pattern Recognition, Anchorage (2008)
Sun, D., Sudderth, E.B., Black, M.J.: Layered image motion with explicit occlusions, temporal consistency, and depth ordering. In: Neural Information Processing Systems, Vancouver (2010)
Sun, D., Wulff, J., Sudderth, E., Pfister, H., Black, M.: A fully-connected layered model of foreground and background flow. In: IEEE Conference on Computer Vision and Pattern Recognition, Portland (2013)
Tao, H., Sawhney, H.S.: Global matching criterion and color segmentation based stereo. In: IEEE Workshop on Applications in Computer Vision, Palm Springs (2000)
Unger, M., Werlberger, M., Pock, T., Bischof, H.: Joint motion estimation and segmentation of complex scenes with label costs and occlusion modeling. In: IEEE Conference on Computer Vision and Pattern Recognition, Providence (2012)
Valgaerts, L., Bruhn, A., Zimmer, H., Weickert, J., Stoll, C., Theobalt, C.: Joint estimation of motion, structure and geometry from stereo sequences. In: European Conference on Computer Vision, Heraklion (2010)
Vaudrey, T., Rabe, C., Klette, R., Milburn, J.: Differences between stereo and motion behaviour on synthetic and real-world stereo sequences. In: International Conference on Image and Vision Computing New Zealand, Christchurch (2008)
Vedula, S., Baker, S., Collins, R., Kanade, T., Rander, P.: Three-dimensional scene flow. In: IEEE Conference on Computer Vision and Pattern Recognition, Ft. Collins (1999)
Veksler, O., Boykov, Y., Mehrani, P.: Superpixels and supervoxels in an energy optimization framework. In: European Conference on Computer Vision, Heraklion (2010)
Vogel, C., Schindler, K., Roth, S.: 3D scene flow estimation with a rigid motion prior. In: IEEE International Conference on Computer Vision, Barcelona (2011)
Vogel, C., Roth, S., Schindler, K.: An evaluation of data costs for optical flow. In: Pattern Recognition (Proceedings of GCPR), Saarbrücken, pp. 343–353 (2013)
Vogel, C., Schindler, K., Roth, S.: Piecewise rigid scene flow. In: IEEE International Conference on Computer Vision, Sydney (2013)
Vogel, C., Roth, S., Schindler, K.: 3D scene flow estimation with a piecewise rigid scene model. Int. J. Comput. Vis. 111(3), 1–28 (2015)
Vogel, C., Roth, S., Schindler, K.: View-consistent 3D scene flow estimation over multiple frames. In: European Conference on Computer Vision, Zurich (2014)
Volz, S., Bruhn, A., Valgaerts, L., Zimmer, H.: Modeling temporal coherence for optical flow. In: IEEE International Conference on Computer Vision, Barcelona (2011)
Wang, J.Y.A., Adelson, E.H.: Representing moving images with layers. IEEE Trans. Image Process. 3, 625–638 (1994)
Wedel, A., Rabe, C., Vaudrey, T., Brox, T., Franke, U., Cremers, D.: Efficient dense scene flow from sparse or dense stereo data. In: European Conference on Computer Vision, Marseille (2008)
Werlberger, M., Trobin, W., Pock, T., Wedel, A., Cremers, D., Bischof, H.: Anisotropic Huber-L1 optical flow. In: British Machine Vision Conference, London (2009)
Yamaguchi, K., Hazan, T., McAllester, D., Urtasun, R.: Continuous Markov random fields for robust stereo estimation. In: European Conference on Computer Vision, Florence (2012)
Yamaguchi, K., McAllester, D., Urtasun, R.: Robust monocular epipolar flow estimation. In: IEEE Conference on Computer Vision and Pattern Recognition, Portland (2013)
Yamaguchi, K., McAllester, D., Urtasun, R.: Efficient joint segmentation, occlusion labeling, stereo and flow estimation. In: European Conference on Computer Vision, Zurich (2014)
Zabih, R., Woodfill, J.: Non-parametric local transforms for computing visual correspondence. In: European Conference on Computer Vision, Stockholm (1994)

Author information

Authors and Affiliations

Photogrammetry and Remote Sensing, ETH Zürich, Stefano-Franscini-Platz 5, 8093, Zürich, Switzerland
Christoph Vogel
Visual Inference, TU Darmstadt, 64283, Darmstadt, Germany
Stefan Roth
Photogrammetry and Remote Sensing, ETH Zürich, Stefano-Franscini-Platz 5, 8093, Zürich, Switzerland
Konrad Schindler

Corresponding author

Correspondence to Christoph Vogel.

Editor information

Editors and Affiliations

Lehr- und Forschungsgebiet "Geomathemati, TU Kaiserslautern, Kaiserslautern, Germany
Willi Freeden
Physikalische Geodäsie, TU München Inst. Astronomische und, München, Germany
Reiner Rummel

About this entry

Cite this entry

Vogel, C., Roth, S., Schindler, K. (2015). 3D Szenenfluss – bildbasierte Schätzung dichter Bewegungsfelder. In: Freeden, W., Rummel, R. (eds) Handbuch der Geodäsie. Springer Reference Naturwissenschaften. Springer Spektrum, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46900-2_48-1

DOI: https://doi.org/10.1007/978-3-662-46900-2_48-1
Received: 25 April 2015
Accepted: 23 June 2015
Published: 26 February 2016
Publisher Name: Springer Spektrum, Berlin, Heidelberg
Online ISBN: 978-3-662-46900-2
eBook Packages: Springer Referenz Naturwissenschaften
