Autonomous Robots, Volume 43, Issue 1, pp 21–35

2D–3D synchronous/asynchronous camera fusion for visual odometry

  • Danda Pani Paudel
  • Cédric Demonceaux
  • Adlane Habed
  • Pascal Vasseur


We propose a robust and direct 2D–3D registration method for camera synchronization. Once the cameras are synchronized, or when the setup is already synchronous, we also propose a visual odometry framework that benefits from both 2D and 3D acquisitions. Our method does not require a precise set of 2D-to-3D correspondences, handles occlusions, and works when the scene is only partially known. It consists of a 2D–3D based initial motion estimation followed by a constrained nonlinear optimization for motion refinement. Occlusions and missing scene parts are handled by comparing the image-based reconstruction with the 3D sensor measurements. Our experiments demonstrate that the proposed framework yields a good initial motion estimate and that the refinement improves it significantly.
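The comparison between image-based reconstruction and 3D sensor measurements can be illustrated with a small sketch. This is not the authors' implementation: the function name `classify_points`, the relative tolerance `rel_tol`, and the three labels are hypothetical choices made only for illustration, and the abstract does not specify the actual test used. The sketch assumes that each reconstructed point already has a candidate sensor depth along the same viewing ray, with a non-finite or non-positive value wherever the sensor returned nothing.

```python
import numpy as np


def classify_points(recon_depth, sensor_depth, rel_tol=0.1):
    """Label points by comparing reconstructed and sensor depths.

    Hypothetical rule (an assumption, not taken from the paper):
      - "missing":      the 3D sensor has no valid measurement
      - "consistent":   relative depth difference is within rel_tol
      - "inconsistent": both depths exist but disagree, e.g. occlusion
    """
    recon = np.asarray(recon_depth, dtype=float)
    sensor = np.asarray(sensor_depth, dtype=float)

    # A sensor depth that is NaN, infinite, or non-positive counts as missing.
    missing = ~np.isfinite(sensor) | (sensor <= 0)

    # Replace missing depths by 1.0 so the division below is always defined.
    safe_sensor = np.where(missing, 1.0, sensor)
    rel_err = np.abs(recon - safe_sensor) / safe_sensor

    consistent = ~missing & (rel_err <= rel_tol)

    labels = np.full(recon.shape, "inconsistent", dtype=object)
    labels[consistent] = "consistent"
    labels[missing] = "missing"
    return labels


if __name__ == "__main__":
    recon = [2.0, 5.0, 3.0]
    sensor = [2.1, 2.0, float("nan")]
    print(classify_points(recon, sensor))
```

In a pipeline of this kind, only the points labelled "consistent" would plausibly contribute 3D constraints to the motion refinement, while the others would rely on image measurements alone; that split is likewise an assumption of this sketch.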


Asynchronous cameras, 2D–3D registration, Structure-from-Motion, Visual odometry



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Computer Vision Laboratory, ETH Zurich, Zürich, Switzerland
  2. Le2i, VIBOT ERL CNRS 6000, Université Bourgogne Franche Comté, Le Creusot, France
  3. ICube UMR 7357, CNRS, University of Strasbourg, Strasbourg, France
  4. LITIS EA 4108, University of Rouen, Rouen, France
