Scale Drift Correction of Camera Geo-Localization Using Geo-Tagged Images

  • Kazuya Iwami
  • Satoshi Ikehata
  • Kiyoharu Aizawa
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11129)

Abstract

Camera geo-localization from a monocular video is a fundamental task for video analysis and autonomous navigation. Although 3D reconstruction is a key technique to obtain camera poses, monocular 3D reconstruction in a large environment tends to accumulate errors in rotation, translation, and especially scale: a problem known as scale drift. To overcome these errors, we propose a novel framework that integrates incremental structure from motion (SfM) and a scale drift correction method utilizing geo-tagged images, such as those provided by Google Street View. Our correction method begins by obtaining sparse 6-DoF correspondences between the reconstructed 3D map coordinate system and the world coordinate system by using geo-tagged images. It then corrects scale drift by applying pose graph optimization over \(\mathrm{Sim}(3)\) constraints and bundle adjustment. Experimental evaluations on large-scale datasets show that the proposed framework not only sufficiently corrects scale drift, but also achieves accurate geo-localization in a kilometer-scale environment.
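The full paper details the pose graph formulation; as a rough illustration of the similarity alignment underlying the correction, the following is a minimal sketch (not the authors' implementation) of fitting a single \(\mathrm{Sim}(3)\) transform between reconstructed camera positions and geo-tagged world positions via the closed-form Umeyama method. The paper's method instead distributes such corrections along the trajectory through \(\mathrm{Sim}(3)\) pose graph optimization and bundle adjustment; all function and variable names here are hypothetical.

```python
# Minimal sketch: closed-form Sim(3) fit (Umeyama, 1991) between a drifting
# SfM trajectory and geo-tagged anchor positions. This recovers a single
# global scale/rotation/translation, whereas the paper corrects scale drift
# locally along the trajectory via Sim(3) pose graph optimization.
import numpy as np

def umeyama_sim3(src: np.ndarray, dst: np.ndarray):
    """Find scale s, rotation R, translation t minimizing
    ||dst - (s * R @ src + t)||^2, given corresponding (N, 3) point sets."""
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)            # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                          # resolve reflection ambiguity
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)     # mean squared deviation of src
    s = np.trace(np.diag(D) @ S) / var_src      # isotropic scale
    t = mu_dst - s * R @ mu_src
    return s, R, t

# Toy usage: a trajectory whose scale has drifted is re-anchored to
# (hypothetical) geo-tagged positions in the world coordinate system.
rng = np.random.default_rng(0)
world = rng.uniform(-50, 50, size=(20, 3))          # geo-tagged positions
s_true, t_true = 0.4, np.array([10.0, -3.0, 2.0])   # unknown drift
recon = (world - t_true) / s_true                   # drifted SfM estimate
s, R, t = umeyama_sim3(recon, world)
print("recovered scale:", s)                        # ~0.4
print("max residual:", np.abs(s * recon @ R.T + t - world).max())
```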

Keywords

3D reconstruction · Localization · Street View

Acknowledgement

This work was partially supported by VTEC laboratories Inc.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. The University of Tokyo, Tokyo, Japan
  2. National Institute of Informatics, Tokyo, Japan
