Q-learning-based simulated annealing algorithm for constrained engineering design problems

  • Hussein Samma
  • Junita Mohamad-Saleh
  • Shahrel Azmin Suandi
  • Badr Lahasan
Original Article


Simulated annealing (SA) is recognized as an effective local search optimizer and has been applied successfully to many real-world optimization problems. However, it converges slowly, and its performance depends strongly on the settings of its parameters, namely the annealing factor and the mutation rate. To mitigate these limitations, this study presents an enhanced optimizer, named QLSA, that integrates the Q-learning algorithm with SA in a single optimization model. In particular, the Q-learning algorithm is embedded into SA to enhance its performance by controlling its parameters adaptively at run time. Q-learning applies a reward/penalty technique to keep track of the best-performing values of these parameters, i.e., the annealing factor and the mutation rate. To evaluate the effectiveness of the proposed QLSA algorithm, a total of seven constrained engineering design problems were used in this study. The outcomes show that QLSA reported a mean fitness value of 1.33 on cantilever beam design, 263.60 on three-bar truss design, 1.72 on welded beam design, 5905.42 on pressure vessel design, 0.0126 on compression coil spring design, 0.25 on multiple disk clutch brake design, and 2994.47 on the speed reducer design problem. Further analysis was conducted by comparing QLSA with state-of-the-art population-based optimization algorithms, including PSO, GWO, CLPSO, harmony search, and ABC. The reported results show that QLSA significantly (i.e., at the 95% confidence level) outperforms the other studied algorithms.
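
The idea of letting Q-learning choose the annealing factor and mutation rate at run time can be illustrated with a minimal Python sketch. It is not the authors' QLSA implementation: the function name `qlsa_sketch`, the candidate parameter values, the state definition (the previously chosen action), the +1/-1 reward, and the learning constants are all assumptions made for illustration, and only simple bound constraints are handled.

```python
import math
import random


def qlsa_sketch(f, x0, lo, hi, iters=2000, seed=0):
    """Minimise f with SA whose parameters are picked by a small Q-table.

    Illustrative only: every constant below is an assumed value.
    """
    rng = random.Random(seed)
    # Assumed action set: (annealing factor, mutation rate) pairs.
    actions = [(a, m) for a in (0.90, 0.95, 0.99) for m in (0.1, 0.5, 1.0)]
    n = len(actions)
    # Assumed state: index of the action taken in the previous iteration.
    q = [[0.0] * n for _ in range(n)]
    alpha, gamma, eps = 0.1, 0.9, 0.1  # learning rate, discount, exploration
    x, fx = list(x0), f(x0)
    best, fbest = list(x), fx
    temp, state = 1.0, 0
    for _ in range(iters):
        # Epsilon-greedy choice of the parameter pair for this iteration.
        if rng.random() < eps:
            act = rng.randrange(n)
        else:
            act = max(range(n), key=lambda j: q[state][j])
        anneal, mut = actions[act]
        # Perturb each coordinate with probability `mut`, clipped to bounds.
        cand = [
            min(hi, max(lo, xi + rng.gauss(0.0, 0.1 * (hi - lo))))
            if rng.random() < mut else xi
            for xi in x
        ]
        fc = f(cand)
        # Metropolis acceptance rule of simulated annealing.
        if fc <= fx or rng.random() < math.exp((fx - fc) / max(temp, 1e-12)):
            x, fx = cand, fc
        # Reward the action if it improved the best-so-far, else penalise it.
        reward = 1.0 if fc < fbest else -1.0
        if fc < fbest:
            best, fbest = list(cand), fc
        # Standard Q-learning update; the next state is the action just taken.
        q[state][act] += alpha * (reward + gamma * max(q[act]) - q[state][act])
        state = act
        temp *= anneal  # cool with the annealing factor chosen this step
    return best, fbest
```

For example, `qlsa_sketch(lambda v: sum(t * t for t in v), [3.0, -2.0], -5.0, 5.0)` runs the sketch on a two-dimensional sphere function. A constrained design problem would additionally need a constraint-handling scheme such as a penalty term, which is omitted here.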


Keywords: Simulated annealing · Q-learning algorithm · Constrained engineering design problems


Compliance with ethical standards

Conflict of interest

Hussein Samma, Junita Mohamad-Saleh, Shahrel Azmin Suandi, and Badr Lahasan declare that they have no conflict of interest.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2019

Authors and Affiliations

  • Hussein Samma (1, 2), corresponding author
  • Junita Mohamad-Saleh (1)
  • Shahrel Azmin Suandi (1)
  • Badr Lahasan (2)

  1. Intelligent Biometric Group, School of Electrical and Electronic Engineering, Engineering Campus, Universiti Sains Malaysia, Nibong Tebal, Malaysia
  2. Department of Computer Programming, Faculty of Education–Shabwa, University of Aden, Aden, Republic of Yemen
