Real Time Direct Visual Odometry for Flexible Multi-camera Rigs

  • Benjamin Resch
  • Jian Wei
  • Hendrik P. A. Lensch
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10114)


We present a Direct Visual Odometry (VO) algorithm for multi-camera rigs that allows for flexible connections between cameras and runs in real time at high frame rates on a GPU for stereo setups. In contrast to feature-based VO methods, Direct VO aligns images directly to depth-enhanced previous images based on the photoconsistency of all high-contrast pixels. By using a multi-camera setup, we can introduce an absolute scale into our reconstruction. Multiple views also allow us to obtain depth from multiple disparity sources: static disparity between the different cameras of the rig, and temporal disparity obtained by exploiting rig motion. We propose a joint optimization of the rig poses and the camera poses within the rig, which enables working with flexible rigs. We show that sub-pixel rigidity is difficult to manufacture for cameras with 720p or higher resolution, which makes this feature important, particularly in current and future (semi-)autonomous cars and drones. Consequently, we evaluate our approach on our own real-world and synthetic datasets that exhibit flexibility in the rig, in addition to sequences from the established KITTI dataset.
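To illustrate why static disparity yields absolute scale and why sub-pixel rigidity matters, the following minimal sketch uses the standard rectified-stereo relation z = f · b / d. It is an illustration under simplifying assumptions, not the method of the paper: the function name `depth_from_disparity`, the rectified two-camera model, and all numeric values are hypothetical and chosen only for this example.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth in metres for a rectified stereo pair: z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the rig")
    return focal_px * baseline_m / disparity_px


# Hypothetical values for illustration only (not taken from the paper).
focal_px = 1000.0       # focal length in pixels
baseline_m = 0.5        # known metric distance between the two cameras
true_disparity = 10.0   # disparity of a scene point in pixels

# The metric baseline is what fixes the absolute scale of the depth.
z_true = depth_from_disparity(focal_px, baseline_m, true_disparity)

# Assume a slight flex of the rig shifts the measured disparity by half a pixel.
z_flexed = depth_from_disparity(focal_px, baseline_m, true_disparity - 0.5)

print(f"depth with rigid rig:  {z_true:.2f} m")
print(f"depth with flexed rig: {z_flexed:.2f} m")
```

With these assumed numbers, a half-pixel disparity error moves a point at 50 m by more than 2 m, which suggests why estimating the camera poses within the rig jointly with the rig pose can be preferable to trusting a fixed calibration.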


Stereo Camera, Structure From Motion, Visual Odometry, Multi View Stereo, Photometric Error
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



This work was supported by Daimler AG, Germany. Real-world flexible stereo rig datasets were kindly provided by Dr. Senya Polikovsky, OSLab, Max Planck Institute for Intelligent Systems Tübingen.

Supplementary material

416263_1_En_31_MOESM1_ESM.pdf (134 kb)
Supplementary material 1 (pdf 134 KB)

Supplementary material 2 (mp4 17297 KB)



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Benjamin Resch (1)
  • Jian Wei (1)
  • Hendrik P. A. Lensch (1)

  1. Computer Graphics Group, University of Tübingen, Tübingen, Germany
