Swing Up and Balance Control of the Acrobot Solved by Genetic Programming

  • Dimitris C. Dracopoulos
  • Barry D. Nichols
Conference paper


The evolution of controllers using genetic programming is described for the continuous, limited-torque, minimum-time swing-up and inverted balance problems of the acrobot. The best swing-up controller found swings the acrobot up to a position very close to the inverted ‘handstand’ position in a very short time, comparable to the results achieved by other methods using similar parameters for the dynamic system. The balance controller successfully keeps the acrobot in the unstable inverted position when started from that position.


Genetic Programming, Reinforcement Learning, Balance Position, Linear Quadratic Regulator, Balance Task
These keywords were added by machine and not by the authors.



Copyright information

© Springer-Verlag London 2012

Authors and Affiliations

  1. School of Electronics and Computer Science, University of Westminster, London, UK
