Efficient vision-based navigation

Learning about the influence of motion blur

Abstract

In this article, we present a novel approach to learning efficient navigation policies for mobile robots that use visual features for localization. As fast movements of a mobile robot typically introduce motion blur in the acquired images, the uncertainty of the robot about its pose increases in such situations. As a result, efficient execution of a navigation task can no longer be guaranteed, since the robot’s pose estimate might not correspond to its true location. We present a reinforcement learning approach to determine a navigation policy that reaches the destination reliably and, at the same time, as quickly as possible. Using our technique, the robot learns to trade off velocity against localization accuracy and implicitly takes the impact of motion blur on observations into account. We furthermore developed a method to compress the learned policy via a clustering approach. In this way, the size of the policy representation is significantly reduced, which is especially desirable in the context of memory-constrained systems. Extensive simulated and real-world experiments carried out with two different robots demonstrate that our learned policy significantly outperforms policies using a constant velocity and more advanced heuristics. We furthermore show that the policy is generally applicable to different indoor and outdoor scenarios with varying landmark densities as well as to navigation tasks of different complexity.
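The trade-off between velocity and localization accuracy described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the article: the one-dimensional corridor, the three discrete uncertainty levels, the two velocity actions, the relocalization penalty `LOST_PENALTY`, and the use of plain tabular Q-learning as the learning method. The abstract does not specify the state features, action set, reward, or learning algorithm actually used.

```python
import random

D, U = 10, 3            # assumed: distance cells to the goal, number of uncertainty levels
SLOW, FAST = 0, 1       # assumed: two velocity actions
LOST_PENALTY = 5.0      # assumed: extra time cost when the pose estimate is lost

def step(d, u, action):
    """Toy dynamics: fast motion (more blur) raises pose uncertainty, slow motion lowers it."""
    if action == FAST:
        d, u = max(d - 2, 0), u + 1
    else:
        d, u = max(d - 1, 0), max(u - 1, 0)
    reward = -1.0                      # one time step elapsed
    if u >= U:                         # too uncertain: pay to relocalize, then reset
        reward -= LOST_PENALTY
        u = 0
    return d, u, reward

def train(episodes=5000, alpha=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and no discounting."""
    rng = random.Random(seed)
    q = {(d, u): [0.0, 0.0] for d in range(D + 1) for u in range(U)}
    for _ in range(episodes):
        d, u = rng.randint(1, D), rng.randrange(U)   # random start state
        while d > 0:                                 # episode ends at the goal
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = max((SLOW, FAST), key=lambda x: q[(d, u)][x])
            d2, u2, r = step(d, u, a)
            target = r if d2 == 0 else r + max(q[(d2, u2)])
            q[(d, u)][a] += alpha * (target - q[(d, u)][a])
            d, u = d2, u2
    return q

def greedy(q, d, u):
    """Velocity action with the highest learned value in state (d, u)."""
    return max((SLOW, FAST), key=lambda a: q[(d, u)][a])

q = train()
for u in range(U):
    print(u, ["slow" if greedy(q, d, u) == SLOW else "fast" for d in range(2, D + 1)])
```

In this toy model the learned policy drives fast while the pose is well known and slows down once one more fast step would cost a relocalization, which mirrors the qualitative behavior the abstract describes.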

Author information

Corresponding author

Correspondence to Armin Hornung.

Additional information

This work has been supported by the German Research Foundation (DFG) under contract number SFB/TR-8.

Electronic Supplementary Material

The article is accompanied by two MPG files (11.3 MB and 18.1 MB).

About this article

Cite this article

Hornung, A., Bennewitz, M. & Strasdat, H. Efficient vision-based navigation. Auton Robot 29, 137–149 (2010). https://doi.org/10.1007/s10514-010-9190-3

Keywords

  • Navigation
  • Reinforcement learning
  • Vision
  • Motion blur