Evolutionary Function Approximation for Gait Generation on Legged Robots

Silva, Oscar A.; Solis, Miguel A.

doi:10.1007/978-3-319-26230-7_10


Part of the book series: Studies in Systems, Decision and Control (SSDC, volume 40)


Abstract

Reinforcement learning methods can be computationally expensive, and their cost tends to grow with the cardinality of the state-space representation. This curse of dimensionality plays an important role in our work: generating gaits with more degrees of freedom in each leg implies a larger state space after discretization, so look-up tables become impractical. Appropriate function approximators are therefore needed for this kind of robotics task. This chapter shows the advantage of using reinforcement learning, specifically within the batch framework. A NeuroEvolution of Augmenting Topologies (NEAT) scheme is used as the function approximator; it is a particular case of a topology and weight evolving artificial neural network, which has been shown to outperform fixed-topology networks on certain tasks. Function approximators are compared within the batch reinforcement learning approach on a simulated version of a hexapod robot designed and built by our group of undergraduate and graduate students.
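The growth in state-space size that the abstract describes can be illustrated with a short calculation. This is only a sketch under stated assumptions: the state is taken to be the vector of joint angles, each angle is split into a fixed number of bins, and the bin and action counts are hypothetical values chosen for illustration, not figures from the chapter.

```python
def lookup_table_entries(legs: int, joints_per_leg: int,
                         bins_per_joint: int, actions: int) -> int:
    """Q-table size when every joint angle is discretized independently.

    Assumption (not from the chapter): the state is the tuple of all
    joint angles, so the number of states is bins ** (number of joints).
    """
    total_joints = legs * joints_per_leg
    states = bins_per_joint ** total_joints
    return states * actions


if __name__ == "__main__":
    # Hypothetical settings: a hexapod, 5 bins per joint, 10 discrete actions.
    for joints in (1, 2, 3):
        entries = lookup_table_entries(
            legs=6, joints_per_leg=joints, bins_per_joint=5, actions=10
        )
        print(f"{joints} joint(s) per leg: {entries:.3e} Q-table entries")
```

Under these assumptions each extra joint per leg multiplies the table size by 5^6, which is why a tabular representation quickly stops being practical and a function approximator becomes attractive.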



Author information

Authors and Affiliations

Innovación y Robótica Estudiantil UTFSM, Valparaíso, Chile
Oscar A. Silva
Centro de Robótica UTFSM, Valparaíso, Chile
Miguel A. Solis


Corresponding author

Correspondence to Miguel A. Solis.

Editor information

Editors and Affiliations

Faculty of Engineering, Universidad Panamericana, Mexico City, Mexico
Hiram Eredín Ponce Espinosa


About this chapter

Cite this chapter

Silva, O.A., Solis, M.A. (2016). Evolutionary Function Approximation for Gait Generation on Legged Robots. In: Espinosa, H. (eds) Nature-Inspired Computing for Control Systems. Studies in Systems, Decision and Control, vol 40. Springer, Cham. https://doi.org/10.1007/978-3-319-26230-7_10


DOI: https://doi.org/10.1007/978-3-319-26230-7_10
Published: 17 December 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26228-4
Online ISBN: 978-3-319-26230-7
eBook Packages: Engineering, Engineering (R0)
