Autonomous Robots, Volume 42, Issue 3, pp 665–685

Are you ABLE to perform a life-long visual topological localization?

  • Roberto Arroyo
  • Pablo F. Alcantarilla
  • Luis M. Bergasa
  • Eduardo Romera

Abstract

Visual topological localization is a task required by many mobile autonomous robots, but it becomes complex over long operating periods because of the appearance variations that a place undergoes: dynamic elements, changing illumination or weather conditions. Due to these problems, long-term visual place recognition across seasons has become a challenge for the robotics community. For this reason, we propose an innovative method for robust and efficient life-long localization using cameras. In this paper, we describe our approach (ABLE), which includes three different versions depending on the type of images: monocular, stereo and panoramic. This distinction makes our proposal more adaptable and effective, because it allows us to exploit the extra information that each type of camera can provide. In addition, we contribute a novel methodology for identifying places based on a fast matching of global binary descriptors extracted from sequences of images. The presented results demonstrate the benefits of ABLE, which is compared against the most representative state-of-the-art algorithms under long-term conditions.
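The core idea of matching global binary descriptors over image sequences can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the descriptors are assumed to be already extracted (e.g. one packed binary descriptor per frame), and sequence matching is approximated by concatenating the descriptors of a few consecutive frames before comparing them with the Hamming distance.

```python
import numpy as np

def hamming_distance_matrix(desc_a, desc_b):
    """Pairwise Hamming distances between two sets of binary descriptors.

    desc_a, desc_b: uint8 arrays of shape (n, d_bytes); each row is one
    global binary descriptor with its bits packed into bytes.
    """
    # XOR every pair of descriptors; differing bits become 1s.
    xor = desc_a[:, None, :] ^ desc_b[None, :, :]
    # unpackbits expands each byte into 8 bits; summing yields the popcount.
    return np.unpackbits(xor, axis=2).sum(axis=2)

def match_sequences(desc_a, desc_b, seq_len=3):
    """Distance matrix between sub-sequences of seq_len consecutive frames.

    Concatenating descriptors over short sequences makes the comparison
    more robust to transient appearance changes than single-image matching.
    """
    def stack(d):
        n = d.shape[0] - seq_len + 1
        return np.stack([d[i:i + seq_len].ravel() for i in range(n)])
    return hamming_distance_matrix(stack(desc_a), stack(desc_b))
```

A loop closure or place match would then correspond to a low entry in the resulting distance matrix, typically selected against a threshold tuned on a validation route.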

Keywords

Localization across seasons · Visual place recognition · Loop closure detection · Image matching · Binary descriptors

Acknowledgements

This work has been funded in part by the Spanish MINECO through the SmartElderlyCar project (TRA2015-70501-C2-1-R) and by the RoboCity2030-III-CM project (Robótica aplicada a la mejora de la calidad de vida de los ciudadanos, Fase III; S2013/MIT-2748), funded by Programas de Actividades I+D (CAM) and co-funded by EU Structural Funds.

Copyright information

© Springer Science+Business Media, LLC 2017

Authors and Affiliations

  • Roberto Arroyo (1)
  • Pablo F. Alcantarilla (2)
  • Luis M. Bergasa (1)
  • Eduardo Romera (1)

  1. Department of Electronics, University of Alcalá (UAH), Alcalá de Henares, Madrid, Spain
  2. iRobot Corporation, Victoria, London, UK
