Generating Adaptive Route Instructions Using Hierarchical Reinforcement Learning

  • Heriberto Cuayáhuitl
  • Nina Dethlefs
  • Lutz Frommberger
  • Kai-Florian Richter
  • John Bateman
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6222)


We present a learning approach for efficiently inducing adaptive behaviour in route instructions. To this end, we propose a two-stage approach that learns a hierarchy of wayfinding strategies using hierarchical reinforcement learning. Whilst the first stage learns low-level behaviour, the second stage focuses on learning high-level behaviour. In our proposed approach, only the latter is applied at runtime in user-machine interactions. Our experiments are based on an indoor navigation scenario for a building that is complex to navigate. We compared our approach with flat reinforcement learning and a fully-learnt hierarchical approach. Our experimental results show that our proposed approach learns significantly faster than the baseline approaches. In addition, the learnt behaviour adapts to the type of user and the structure of the spatial environment. This approach is attractive for automatic route giving since it combines fast learning with adaptive behaviour.
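The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy building graph, the sub-goal names, the step cost of -1, the undiscounted returns, and all function names are assumptions made purely for illustration. Stage 1 learns one low-level policy per sub-goal with tabular Q-learning; stage 2 then learns which sub-goal to pursue from each location while the low-level policies stay frozen.

```python
import random

# Hypothetical toy building: each location lists the locations one move away.
GRAPH = {
    "entrance": ["hall"],
    "hall": ["entrance", "stairs"],
    "stairs": ["hall", "corridor"],
    "corridor": ["stairs", "office", "kitchen"],
    "kitchen": ["corridor"],
    "office": ["corridor"],
}
SUBGOALS = ["hall", "corridor", "office"]  # assumed sub-task targets


def q_learn(actions, step, goal, episodes=500, alpha=0.5, eps=0.2):
    """Undiscounted tabular Q-learning; step(s, a) returns (next_state, reward)."""
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    q = {s: {a: 0.0 for a in actions(s)} for s in GRAPH}
    for _ in range(episodes):
        s = rng.choice([x for x in GRAPH if x != goal])
        for _ in range(50):  # cap on episode length
            acts = list(q[s])
            a = rng.choice(acts) if rng.random() < eps else max(acts, key=q[s].get)
            s2, r = step(s, a)
            # The goal is terminal, so no future value is added there.
            target = r if s2 == goal else r + max(q[s2].values())
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if s == goal:
                break
    return q


def greedy(q, s):
    """Action with the highest learnt value in state s."""
    return max(q[s], key=q[s].get)


# Stage 1: one low-level policy per sub-goal; a primitive action is a move
# to a neighbouring location and costs -1.
low = {g: q_learn(lambda s: GRAPH[s], lambda s, a: (a, -1.0), g) for g in SUBGOALS}


def run_subtask(s, subgoal):
    """Follow the frozen low-level policy until the sub-goal is reached."""
    steps = 0
    while s != subgoal and steps < 20:
        s = greedy(low[subgoal], s)
        steps += 1
    return s, -float(max(steps, 1))  # a no-op choice still costs one step


# Stage 2: the high-level policy picks sub-goals; low-level values are not updated.
high = q_learn(lambda s: SUBGOALS, run_subtask, "office")


def route(start, goal="office"):
    """Roll out the greedy high-level policy; returns (final state, cost, sub-goals)."""
    s, cost, chosen = start, 0, []
    while s != goal and len(chosen) < 10:
        sub = greedy(high, s)
        s, r = run_subtask(s, sub)
        chosen.append(sub)
        cost += int(-r)
    return s, cost, chosen
```

Calling `route("entrance")` rolls out the learnt high-level policy and reports the final location, the total step cost, and the sequence of sub-goals chosen. In this sketch only `high` and the frozen `low` policies are consulted at that point, which mirrors the abstract's remark that only high-level behaviour is learnt for use at runtime.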


Reward Function, Route Direction, Dialogue System, Learning Agent, Primitive Action
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Heriberto Cuayáhuitl (1)
  • Nina Dethlefs (2)
  • Lutz Frommberger (1)
  • Kai-Florian Richter (1)
  • John Bateman (1, 2)
  1. Transregional Collaborative Research Center SFB/TR 8 Spatial Cognition, University of Bremen, Bremen, Germany
  2. FB10 Faculty of Linguistics and Literary Sciences, University of Bremen, Bremen, Germany
