Localisation via Deep Imagination: Learn the Features Not the Map

  • Jaime Spencer
  • Oscar Mendez
  • Richard Bowden
  • Simon Hadfield
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11133)

Abstract

How many times does a human have to drive through the same area to become familiar with it? To begin with, we might first build a mental model of our surroundings. Upon revisiting this area, we can use this model to extrapolate to new unseen locations and imagine their appearance.

Based on this, we propose an approach where an agent is capable of modelling new environments after a single visitation. To this end, we introduce “Deep Imagination”, a combination of classical Visual-based Monte Carlo Localisation and deep learning. By making use of a feature embedded 3D map, the system can “imagine” the view from any novel location. These “imagined” views are contrasted with the current observation in order to estimate the agent’s current location. In order to build the embedded map, we train a deep Siamese Fully Convolutional U-Net to perform dense feature extraction. By training these features to be generic, no additional training or fine tuning is required to adapt to new environments.
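The localisation loop described above can be pictured with a toy Monte Carlo Localisation sketch. This is only an illustration under strong assumptions: the world is a 1D corridor, the "feature embedded map" is a small hand-made table of feature vectors standing in for learned dense features, and the names `imagine`, `likelihood`, `mcl_step` and `run_demo`, as well as all noise parameters, are hypothetical and not taken from the paper. Each particle is a pose hypothesis; its "imagined" view is looked up from the map and contrasted with the current observation to weight the particle.

```python
import math
import random


def imagine(feature_map, x):
    """Return the feature vector 'imagined' at position x (linear interpolation)."""
    x = min(max(x, 0.0), len(feature_map) - 1.0)  # clamp to the mapped range
    i = min(int(x), len(feature_map) - 2)
    t = x - i
    return [(1.0 - t) * a + t * b
            for a, b in zip(feature_map[i], feature_map[i + 1])]


def likelihood(imagined, observed, sigma):
    """Gaussian-style score: high when imagined and observed features agree."""
    d2 = sum((a - b) ** 2 for a, b in zip(imagined, observed))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def mcl_step(particles, control, observed, feature_map, rng,
             motion_sigma=0.1, obs_sigma=0.1):
    """One predict / weight / resample cycle of a basic particle filter."""
    # Predict: apply the commanded motion plus noise to every hypothesis.
    moved = [p + control + rng.gauss(0.0, motion_sigma) for p in particles]
    # Weight: contrast each imagined view with the current observation.
    weights = [likelihood(imagine(feature_map, p), observed, obs_sigma)
               for p in moved]
    if sum(weights) == 0.0:  # all weights underflowed; fall back to uniform
        weights = [1.0] * len(moved)
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(moved, weights=weights, k=len(moved))


def run_demo(seed=0, n_particles=500, n_steps=10):
    """Localise an agent moving one unit per step along a toy 1D map."""
    rng = random.Random(seed)
    # Assumed toy map: one 3D feature vector per integer position 0..20.
    feature_map = [[math.sin(0.3 * x), math.cos(0.3 * x), x / 20.0]
                   for x in range(21)]
    true_x = 3.0
    particles = [rng.uniform(0.0, 20.0) for _ in range(n_particles)]
    for _ in range(n_steps):
        true_x += 1.0
        # Simulated observation: map features at the true pose plus noise.
        observed = [f + rng.gauss(0.0, 0.02)
                    for f in imagine(feature_map, true_x)]
        particles = mcl_step(particles, 1.0, observed, feature_map, rng)
    estimate = sum(particles) / len(particles)
    return true_x, estimate


if __name__ == "__main__":
    true_x, estimate = run_demo()
    print(f"true position {true_x:.2f}, estimated position {estimate:.2f}")
```

In this sketch the particles start spread uniformly over the map, so no initial pose is assumed. A real system would replace the lookup table with views rendered from a 3D feature map and the 1D state with a full camera pose.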

Our results demonstrate the generality and transfer capability of our learnt dense features by training and evaluating on multiple datasets. Additionally, we include several visualisations of the feature representations and resulting 3D maps, as well as their application to localisation.

Keywords

Localization · Deep Imagination · VMCL · FCU-Net

Notes

Acknowledgements

This work was funded by the EPSRC under grant agreements (EP/R512217/1) and (EP/R03298X/1) and Innovate UK Autonomous Valet Parking Project (Grant No. 104273). We would also like to thank NVIDIA Corporation for their Titan Xp GPU grant.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jaime Spencer (1)
  • Oscar Mendez (1)
  • Richard Bowden (1)
  • Simon Hadfield (1)
  1. University of Surrey, Guildford, UK
