Performance of evolutionary wavelet neural networks in acrobot control tasks

  • Maryam Mahsal Khan
  • Alexandre Mendes
  • Stephan K. Chalup
Original Article

Abstract

Wavelet neural networks (WNN) combine the strength of artificial neural networks and the multiresolution ability of wavelets. Determining the structure and, more specifically, the appropriate number of neurons in a WNN is a time-consuming process. We propose a type of multidimensional evolutionary WNN and, using an acrobot, evaluate this approach with two benchmark nonlinear control tasks: a height task and a hand-stand task. To facilitate direct comparison with other methods, we report on swing-up and balance times. In 50 trials, the controllers produced faster swing-up times (1.0 s for the best controller and 2.3 s on average) than any other method reported in the literature. Moreover, the controller with the best swing-up time had a maximum balance time of 1.25 s, surpassing most other methods.

Keywords

Evolutionary algorithms; Wavelet neural networks; Acrobot; Intelligent control

Notes

Acknowledgements

The first author would like to acknowledge the support through an Australian Government Research Training Program Scholarship.

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  1. CSIRO Energy Technology, Newcastle, Australia
  2. Interdisciplinary Machine Learning Research Group (IMLRG), School of Electrical Engineering and Computing, The University of Newcastle, Callaghan, Australia
