A survey of swarm and evolutionary computing approaches for deep learning

Abstract

Deep learning (DL) has become an important machine learning approach that has been widely successful in many applications. Currently, DL is one of the best methods for extracting knowledge from large sets of raw data in a (nearly) self-organized manner. The technical design of DL relies on the feed-forward information flow principle of artificial neural networks with multiple layers of hidden neurons, which form deep neural networks (DNNs). DNNs have various architectures and parameters and are often developed for specific applications. However, training a DNN can take a long time, depending on the application and the training set size (Gong et al. 2015). Moreover, finding the most accurate and efficient architecture of a DL system in a reasonable time is a potential difficulty of this approach. Swarm intelligence (SI) and evolutionary computing (EC) techniques are simulation-driven, non-convex optimization frameworks that make few assumptions about the objective function. These methods are flexible and have proven effective in many applications; therefore, they can be used to improve DL by optimizing the applied learning models. This paper presents a comprehensive survey of the most recent approaches that hybridize SI and EC algorithms with DL to optimize DNN architectures and DNN training and thereby improve classification accuracy. The paper reviews the significant roles of SI and EC in optimizing the hyper-parameters and architectures of a DL system in the context of large-scale data analytics. Finally, we identify some open problems for further research, as well as potential issues related to DL that require improvement, and present an extensive bibliography of the pertinent research.

Figs. 1–10

Figure adopted from https://github.com/nnrg/opennero/wiki/NeuroEvolution

Fig. 11

References

  1. Ackley DH, Hinton GE, Sejnowski TJ (1985) A learning algorithm for Boltzmann machines. Cognit Sci 9(1):147–169

  2. Agapitos A, O’Neill M, Nicolau M, Fagan D, Kattan A, Brabazon A, Curran K (2015) Deep evolution of image representations for handwritten digit recognition. In: 2015 IEEE congress on evolutionary computation (CEC). IEEE, pp 2452–2459

  3. Alejandro M, Lara-Cabrera R, Fuentes-Hurtado F, Naranjo V (2018) EvoDeep: A new evolutionary approach for automatic deep neural networks parametrisation. J Parallel Distrib Comput 117:180–191

  4. Bäck T, Foussette C, Krause P (2013) Contemporary evolution strategies. Springer, Berlin

  5. Badem H, Basturk A, Caliskan A, Yuksel ME (2017) A new efficient training strategy for deep neural networks by hybridization of artificial bee colony and limited-memory BFGS optimization algorithms. Neurocomputing 266:506–526

  6. Bae C, Kang K, Liu G, Chung YY (2016) A novel real time video tracking framework using adaptive discrete swarm optimization. Expert Syst Appl 64:385–399

  7. Banharnsakun A (2018) Towards improving the convolutional neural networks for deep learning using the distributed artificial bee colony method. Int J Mach Learn Cybern. https://doi.org/10.1007/s13042-018-0811-z

  8. Bayer J, Wierstra D, Togelius J, Schmidhuber J (2009) Evolving memory cell structures for sequence learning. In: International conference on artificial neural networks (ICANN 2009), Springer LNCS, pp 755–764

  9. Bengio Y, Lamblin P, Popovici D, Larochelle H (2007) Greedy layer-wise training of deep networks. In: Advances in neural information processing systems, pp 153–160

  10. Bengio Y, Courville A, Vincent P (2013) Representation learning: a review and new perspectives. IEEE Trans Pattern Anal Mach Intell 35(8):1798–1828

  11. Biswas A, Chandrakasan AP (2018) Conv-RAM: an energy-efficient SRAM with embedded convolution computation for low-power CNN-based machine learning applications. In: 2018 IEEE international solid-state circuits conference—(ISSCC), San Francisco, CA, pp 488–490

  12. Bonyadi MR, Michalewicz Z (2017) Particle swarm optimization for single objective continuous space problems: a review. Evolut Comput 25:1–54

  13. Breuel TM (2015) On the convergence of SGD training of neural networks. arXiv preprint arXiv:1508.02790

  14. Carreira-Perpinan MA, Hinton GE (2005) On contrastive divergence learning. In: 10th international workshop on artificial intelligence and statistics (AISTATS 2005), pp 59–66

  15. Chandra R (2015) Competition and collaboration in cooperative coevolution of Elman recurrent neural networks for time-series prediction. IEEE Trans Neural Netw Learn Syst 26(12):3123–3136

  16. Chen XW, Lin X (2014) Big data deep learning: challenges and perspectives. IEEE Access 2:514–525

  17. Chen S, Liu G, Wu C, Jiang Z, Chen J (2016) Image classification with stacked restricted Boltzmann machines and evolutionary function array classification voter. In: 2016 IEEE congress on evolutionary computation (CEC). IEEE, pp 4599–4606

  18. Chen J, Zeng GQ, Zhou W, Du W, Lu KD (2018) Wind speed forecasting using nonlinear-learning ensemble of deep learning time series prediction and extremal optimization. Energy Convers Manag 165:681–695

  19. Cheung B, Sable C (2011) Hybrid evolution of convolutional networks. In: 2011 10th international conference on machine learning and applications workshops. IEEE, pp 293–297

  20. Corne DW, Reynolds A, Bonabeau E (2012) Swarm intelligence. In: Rozenberg G, Bäck T, Kok JN (eds) Handbook of natural computing. Springer, Berlin, pp 1599–1622

  21. Das S (2013) Evaluating the evolutionary algorithms—classical perspectives and recent trends, in computational intelligence. In: Ishibuchi H (ed) Encyclopedia of life support systems (EOLSS), Developed under the Auspices of the UNESCO, Eolss Publishers, Oxford, UK. http://www.eolss.net

  22. Das S, Mullick SS, Suganthan PN (2016) Recent advances in differential evolution—an updated survey. Swarm Evolut Comput 27:1–30

  23. Das S, Datta S, Chaudhuri BB (2018) Handling data irregularities in classification: foundations, trends, and future challenges. Pattern Recognit 81:674–693

  24. David RW (2012) Software review: the ECJ toolkit. Genet Progr Evolvable Mach 13(1):65–67

  25. David OE, Greental I (2014) Genetic algorithms for evolving deep neural networks. In: Proceedings of the companion publication of the 2014 annual conference on genetic and evolutionary computation. ACM, pp 1451–1452

  26. David RC, Precup RE, Petriu EM, Purcaru C, Preitl S (2012) PSO and GSA algorithms for fuzzy controller tuning with reduced process small time constant sensitivity. In: 2012 16th international conference on system theory, control and computing (ICSTCC). IEEE, pp 1–6

  27. Deepa SN, Baranilingesan I (2017) Optimized deep learning neural network predictive controller for continuous stirred tank reactor. Comput Electr Eng 000:1–16

  28. Del Ser J, Osaba E, Molina D, Yang X-S, Salcedo-Sanz S, Camacho D, Das S, Suganthan PN, Coello Coello CC, Herrera F (2019) Bio-inspired computation: where we stand and what’s next. Swarm Evolut Comput 48:220–250

  29. Desell T (2017) Large scale evolution of convolutional neural networks using volunteer computing. In: Proceedings of the genetic and evolutionary computation conference companion. ACM, pp 127–128

  30. Desell T, Clachar S, Higgins J, Wild B (2015) Evolving deep recurrent neural networks using ant colony optimization. In: European conference on evolutionary computation in combinatorial optimization. Springer, Cham, pp 86–98

  31. Duchi J, Hazan E, Singer Y (2011) Adaptive subgradient methods for online learning and stochastic optimization. J Mach Learn Res 12:2121–2159

  32. Dufourq E, Bassett BA (2017) EDEN: evolutionary deep networks for efficient machine learning. In: Pattern recognition association of South Africa and robotics and mechatronics (PRASA-RobMech). IEEE, pp 110–115

  33. Durillo JJ, Nebro AJ (2011) jMetal: a Java framework for multi-objective optimization. Adv Eng Softw 42(10):760–771

  34. Eiben AE, Smit SK (2011) Parameter tuning for configuring and analyzing evolutionary algorithms. Swarm Evolut Comput 1(1):19–31

  35. Elman JL (1990) Finding structure in time. Cognit Sci 14(2):179–211

  36. ElSaid A, Wild B, Jamiy FE, Higgins J, Desell T (2017) Optimizing LSTM RNNs using ACO to predict turbine engine vibration. In: Proceedings of the genetic and evolutionary computation conference companion. ACM, pp 21–22

  37. ElSaid A, Jamiy FE, Higgins J, Wild B, Desell T (2018) Using ant colony optimization to optimize long short-term memory recurrent neural networks. In: Proceedings of the genetic and evolutionary computation conference. ACM, pp 13–20

  38. Erol OK, Eksin I (2006) A new optimization method: big bang-big crunch. Adv Eng Softw 37(2):106–111

  39. Fielding B, Zhang L (2018) Evolving image classification architectures with enhanced particle swarm optimisation. IEEE Access 6:68560–68575

  40. Fogel DB (1995) Phenotypes, genotypes, and operators in evolutionary computation. In: IEEE international conference on evolutionary computation, 1995, vol 1. IEEE, p 193

  41. Fujino S, Mori N, Matsumoto K (2017) Deep convolutional networks for human sketches by means of the evolutionary deep learning. In: 2017 Joint 17th world congress of international fuzzy systems association and 9th international conference on soft computing and intelligent systems (IFSA-SCIS). IEEE, pp 1–5

  42. Fukushima K (1980) Neocognitron: a self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol Cybern 36:193–202

  43. Galloway GS, Catterson VM, Fay T, Robb A, Love C (2016) Diagnosis of tidal turbine vibration data through deep neural networks. In: Third European conference of the prognostics and health management society, pp 172–180

  44. Gascón-Moreno J, Salcedo-Sanz S, Saavedra-Moreno B, Carro-Calvo L, Portilla-Figueras A (2013) An evolutionary-based hyper-heuristic approach for optimal construction of group method of data handling networks. Inf Sci 247:94–108

  45. Gauci J, Stanley K (2007) Generating large-scale neural networks through discovering geometric regularities. In: Proceedings of the 9th annual conference on genetic and evolutionary computation. ACM, pp 997–1004

  46. Gauriau R, Cuingnet R, Lesage D, Bloch I (2015) Multi-organ localization with cascaded global-to-local regression and shape prior. Med Image Anal 23(1):70–83

  47. Geng W (2018) Cognitive deep neural networks prediction method for software fault tendency module based on bound particle swarm optimization. Cognit Syst Res 52:12–20

  48. Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. CVPR 2014:580–587

  49. Glorot X, Bordes A, Bengio Y (2011) Deep sparse rectifier networks. In: AISTATS, vol 15, pp 315–323

  50. Gomes L (2014) Machine-learning maestro Michael Jordan on the delusions of big data and other huge engineering efforts. In: IEEE spectrum, Oct 20

  51. Gong M, Liu J, Li H, Cai Q, Su L (2015) A multiobjective sparse feature learning model for deep neural networks. IEEE Trans Neural Netw Learn Syst 26(12):3263–3277

  52. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. arXiv:1406.2661

  53. Goodfellow I, Bengio Y, Courville A (2015) Modern practical deep networks. In: Goodfellow I, Bengio Y, Courville A (eds) Deep learning. MIT Press, Cambridge, pp 162–481

  54. Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J (2017) LSTM: a search space odyssey. IEEE Trans Neural Netw Learn Syst 28(10):2222–2232

  55. Griewank A (2000) Principles and techniques of algorithmic differentiation: evaluating derivatives. SIAM, Philadelphia

  56. Guo S, Yang Z (2018) Multi-channel-ResNet: an integration framework towards skin lesion analysis. Inform Med Unlocked 12:67–74

  57. Han S, Pool J, Tran J, Dally W (2015) Learning both weights and connections for efficient neural network. In: Advances in neural information processing systems, pp 1135–1143

  58. Hardt M, Recht B, Singer Y (2015) Train faster, generalize better: stability of stochastic gradient descent. arXiv preprint arXiv:1509.01240

  59. Hatamlou A (2013) Black hole: a new heuristic optimization approach for data clustering. Inf Sci 222:175–184

  60. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), Las Vegas, NV, pp 770–778

  61. Hinton GE, Osindero S, Teh YW (2006) A fast learning algorithm for deep belief nets. Neural Comput 18(7):1527–1554

  62. Hinton GE, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov RR (2012a) Improving neural networks by preventing co-adaptation of feature detectors. arXiv preprint arXiv:1207.0580

  63. Hinton G, Deng L, Yu D, Dahl GE, Mohamed AR, Jaitly N et al (2012b) Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups. IEEE Signal Process Mag 29(6):82–97

  64. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780

  65. Holker G, dos Santos MV (2010) Toward an estimation of distribution algorithm for the evolution of artificial neural networks. In: Proceedings of the third C* conference on computer science and software engineering. ACM, pp 17–22

  66. Horng MH (2017) Fine-tuning parameters of deep belief networks using artificial bee colony algorithm. In: 2017 2nd international conference on artificial intelligence: techniques and applications DEStech transactions on computer science and engineering (AITA 2017)

  67. Huang G, Liu Z, Maaten LVD, Weinberger KQ (2017) Densely connected convolutional networks. In: 2017 IEEE conference on computer vision and pattern recognition (CVPR), Honolulu, HI, 2017, pp 2261–2269

  68. Hubel DH, Wiesel TN (1959) Receptive fields of single neurones in the cat’s striate cortex. J Physiol 148(3):574–591

  69. Hubel DH, Wiesel TN (1968) Receptive fields and functional architecture of monkey striate cortex. J Physiol 195(1):215–243

  70. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167

  71. Jain M, Singh V, Rani A (2018) A novel nature-inspired algorithm for optimization: squirrel search algorithm. Swarm Evolut Comput. https://doi.org/10.1016/j.swevo.2018.02.013

  72. Jiang S, Ji Z, Shen Y (2014) A novel hybrid particle swarm optimization and gravitational search algorithm for solving economic emission load dispatch problems with various practical constraints. Int J Electr Power Energy Syst 55:628–644

  73. Jiang S, Chin KS, Wang L, Qu G, Tsui KL (2017) Modified genetic algorithm-based feature selection combined with pre-trained deep neural network for demand forecasting in outpatient department. Expert Syst Appl 82:216–230

  74. Junbo T, Weining L, Juneng A, Xueqian W (2015) Fault diagnosis method study in roller bearing based on wavelet transform and stacked auto-encoder. In: The 27th Chinese control and decision conference (2015 CCDC), IEEE 2015, pp 4608–4613

  75. Justesen N, Risi S (2017) Continual online evolutionary planning for in-game build order adaptation in StarCraft. In: Proceedings of the genetic and evolutionary computation conference. ACM, pp 187–194

  76. Kang K, Bae C, Yeung HWF, Chung YY (2018) A hybrid gravitational search algorithm with swarm intelligence and deep convolutional feature for object tracking optimization. Appl Soft Comput 66:319–329

  77. Kenny A, Li X (2017) A study on pre-training deep neural networks using particle swarm optimisation. In: Asia-Pacific conference on simulated evolution and learning. Springer, Cham, pp 361–372

  78. Khalifa MH, Ammar M, Ouarda W, Alimi AM (2017) Particle swarm optimization for deep learning of convolution neural network. In: 2017 Sudan conference on computer science and information technology (SCCSIT). IEEE, pp 1–5

  79. Kim JK, Han YS, Lee JS (2017) Particle swarm optimization–deep belief network–based rare class prediction model for highly class imbalance problem. Concurr Comput Pract Exp 2017(29):e4128

  80. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980

  81. Koza JR, Rice JP (1991) Genetic generation of both the weights and architecture for a neural network. In: IJCNN-91-seattle international joint conference on neural networks, vol 2. IEEE, pp 397–404

  82. Kriegman S, Cheney N, Corucci F, Bongard JC (2017) A minimal developmental model can increase evolvability in soft robots. In: Proceedings of the genetic and evolutionary computation conference. ACM, pp 131–138

  83. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Advances in neural information processing systems, pp 1097–1105

  84. Kuremoto T, Kimura S, Kobayashi K, Obayashi M (2014) Time series forecasting using a deep belief network with restricted Boltzmann machines. Neurocomputing 137:47–56

  85. Lamos-Sweeney J, Gaborski R (2012) Deep learning using genetic algorithms. Master's thesis, Thomas Golisano College of Computing and Information Sciences

  86. Lander S, Shang Y (2015) EvoAE—a new evolutionary method for training autoencoders for deep learning networks. In: 2015 IEEE 39th annual computer software and applications conference (COMPSAC), vol 2. IEEE, pp 790–795

  87. LeCun Y, Boser BE, Denker JS, Henderson D, Howard RE, Hubbard WE, Jackel LD (1990) Handwritten digit recognition with a back-propagation network. In: Advances in neural information processing systems, pp 396–404

  88. LeCun Y, Bottou L, Bengio Y, Haffner P (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324

  89. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444. https://doi.org/10.1038/nature14539

  90. Lee H, Pham P, Largman Y, Ng AY (2009) Unsupervised feature learning for audio classification using convolutional deep belief networks. In: Advances in neural information processing systems, pp 1096–1104

  91. Leke C, Ndjiongue AR, Twala B, Marwala T (2017) A deep learning-cuckoo search method for missing data estimation in high-dimensional datasets. In: International conference in swarm intelligence. Springer, Cham, pp 561–572

  92. Leung FHF, Lam HK, Ling SH, Tam PKS (2003) Tuning of the structure and parameters of a neural network using an improved genetic algorithm. IEEE Trans Neural Netw 14(1):79–88

  93. Liang J, Meyerson E, Miikkulainen R (2018) Evolutionary architecture search for deep multitask networks. In: GECCO 18: genetic and evolutionary computation conference, July 15–19, Kyoto, Japan. ACM, New York, NY, USA

  94. Lieto A, Radicioni DP, Cruciani M (eds) Proceedings of the second international workshop on artificial intelligence and cognition, pp 164–171

  95. Liu Q, Wang Z, He X, Zhou DH (2015a) Event-based H∞ consensus control of multiagent systems with relative output feedback: the finite-horizon case. IEEE Trans Autom Control 60(9):2553–2558

  96. Liu X, Gao J, He X, Deng L, Duh K, Wang YY (2015b) Representation learning using multi-task deep neural networks for semantic classification and information retrieval. In: Proc. of NAACL, pp 912–921

  97. Liu S, Hou Z, Yin C (2016) Data-driven modeling for UGI gasification processes via an enhanced genetic BP neural network with link switches. IEEE Trans Neural Netw Learn Syst 27(12):2718–2729

  98. Liu Q, Wang Z, He X, Ghinea G, Alsaadi FE (2017) A resilient approach to distributed filter design for time-varying systems under stochastic nonlinearities and sensor degradation. IEEE Trans Signal Process 65(5):1300–1309

  99. Liu H, Simonyan K, Vinyals O, Fernando C, Kavukcuoglu K (2018a) Hierarchical representations for efficient architecture search. In: Sixth international conference on learning representations (ICLR 2018). Canada

  100. Liu J, Gong M, Miao Q, Wang X, Li H (2018b) Structure learning for deep neural networks based on multiobjective optimization. IEEE Trans Neural Netw Learn Syst 29(6):2450–2463

  101. Loh B, Then P (2017) Deep learning for cardiac computer-aided diagnosis: benefits, issues & solutions. Mhealth 3:45. https://doi.org/10.21037/mhealth.2017.09.01

  102. López-Ibáñez M, Stützle T, Dorigo M (2018) Ant colony optimization: a component-wise overview. In: Handbook of heuristics, pp 371–407

  103. Lopez-Rincon A, Tonda A, Elati M, Schwander O, Piwowarski B, Gallinari P (2018) Evolutionary optimization of convolutional neural networks for cancer miRNA biomarkers classification. Appl Soft Comput 65:91–100

  104. Lorenzo PR, Nalepa J (2018) Memetic evolution of deep neural networks. In: Proceedings of the genetic and evolutionary computation conference. ACM, pp 505–512

  105. Lorenzo PR, Nalepa J, Kawulok M, Ramos LS, Pastor JR (2017) Particle swarm optimization for hyper-parameter selection in deep neural networks. In: Proceedings of the genetic and evolutionary computation conference. ACM, pp 481–488

  106. Lu C, Wang ZY, Qin WL, Ma J (2017) Fault diagnosis of rotary machinery components using a stacked denoising autoencoder-based health state identification. Signal Process 130:377–388

  107. Ma L, Wang Z, Lam HK (2017a) Event-triggered mean-square consensus control for time-varying stochastic multi-agent system with sensor saturations. IEEE Trans Autom Control 62(7):3524–3531

  108. Ma L, Wang Z, Lam HK (2017b) Mean-square H∞ consensus control for a class of nonlinear time-varying stochastic multiagent systems: the finite-horizon case. IEEE Trans Syst Man Cybern Syst 47(7):1050–1060

  109. Mandischer M (2002) A comparison of evolution strategies and backpropagation for neural network training. Neurocomputing 42(1–4):87–117

  110. Mandt S, Hoffman M, Blei D (2016) A variational analysis of stochastic gradient algorithms. In: International conference on machine learning, pp 354–363

  111. Maravall D, de Lope J (2009) Hybridizing evolutionary computation and reinforcement learning for the design of almost universal controllers for autonomous robots. Neurocomputing 72(4–6):887–894

  112. Martin A, Lara-Cabrera R, Fuentes-Hurtado F, Naranjo V, Camacho D (2018) EvoDeep: a new evolutionary approach for automatic deep neural networks parametrisation. J Parallel Distrib Comput 117:180–191

  113. McCulloch WS, Pitts W (1943) A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys 5(4):115–133

  114. Miikkulainen R (2017) Neuroevolution. In: Encyclopedia of machine learning and data mining, pp 899–904

  115. Miikkulainen R et al (2017) Evolving deep neural networks. arXiv preprint arXiv:1703.00548

  116. Mirjalili S, Lewis A (2016) The whale optimization algorithm. Adv Eng Softw 95:51–67

  117. Mirjalili S, Mirjalili SM, Lewis A (2014) Grey wolf optimizer. Adv Eng Softw 69:46–61

  118. Mukhopadhyay A, Maulik U, Bandyopadhyay S (2015) A survey of multiobjective evolutionary clustering. ACM Comput Surv 47(4):61:1–61:46

  119. Nair V, Hinton GE (2010) Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814

  120. Neri F, Cotta C (2012) Memetic algorithms and memetic computing optimization: a literature review. Swarm Evolut Comput 2:1–14

  121. Neyshabur B, Salakhutdinov RR, Srebro N (2015) Path-SGD: path-normalized optimization in deep neural networks. In: Advances in neural information processing systems, pp 2422–2430

  122. Papa JP, Scheirer W, Cox DD (2016) Fine-tuning deep belief networks using harmony search. Appl Soft Comput 46:875–885

  123. Parker A, Nitschke G (2017) Autonomous intersection driving with neuro-evolution. In: Proceedings of the genetic and evolutionary computation conference companion. ACM, pp 133–134

  124. Passino KM (2002) Biomimicry of bacterial foraging for distributed optimization and control. IEEE Control Syst 22(3):52–67

  125. Passos LA, Rodrigues DR, Papa JP (2018) Fine tuning deep Boltzmann machines through meta-heuristic approaches. In: 2018 IEEE 12th international symposium on applied computational intelligence and informatics (SACI). IEEE, pp 000419–000424

  126. Pawełczyk K, Kawulok M, Nalepa J (2018) Genetically-trained deep neural networks. In: Proceedings of the genetic and evolutionary computation conference companion. ACM, pp 63–64

  127. Peña-Reyes CA, Sipper M (2000) Evolutionary computation in medicine: an overview. Artif Intell Med 19(1):1–23

  128. Peng L, Liu S, Liu R, Wang L (2018) Effective long short-term memory with differential evolution algorithm for electricity price prediction. Energy 162:1301–1314

  129. Piotrowski AP (2014) Differential evolution algorithms applied to neural network training suffer from stagnation. Appl Soft Comput 21:382–406

  130. Rajasekhar A, Lynn N, Das S, Suganthan PN (2017) Computing with the collective intelligence of honey bees–a survey. Swarm Evolut Comput 32:25–48

  131. Rao RV, Savsani VJ, Vakharia DP (2011) Teaching–learning-based optimization: a novel method for constrained mechanical design optimization problems. Comput Aided Des 43(3):303–315

  132. Rashedi E, Nezamabadi-Pour H, Saryazdi S (2009) GSA: a gravitational search algorithm. Inf Sci 179(13):2232–2248

  133. Rawal A, Miikkulainen R (2016) Evolving deep LSTM-based memory networks using an information maximization objective. In: Friedrich T (ed) Proceedings of the genetic and evolutionary computation conference 2016 (GECCO’16). ACM, New York, NY, USA, pp 501–508

  134. Real E, Moore S, Selle A, Saxena S, Suematsu YL, Tan J, Le QV, Kurakin A (2017) Large-scale evolution of image classifiers. ICML 2017:2902–2911

  135. Real E, Aggarwal A, Huang Y, Le QV (2018) Regularized evolution for image classifier architecture search. arXiv preprint arXiv:1802.01548

  136. Reddy KK, Sarkar S, Venugopalan V, Giering M (2016) Anomaly detection and fault disambiguation in large flight data: a multi-modal deep auto-encoder approach. In: Annual conference of the prognostics and health management society, Denver, Colorado, pp 1–8

  137. Risi S, Stanley KO (2012) A unified approach to evolving plasticity and neural geometry. In: International joint conference on neural networks. IEEE, pp 1–8

  138. Rosa G, Papa J, Marana A, Scheirer W, Cox D (2015) Fine-tuning convolutional neural networks using harmony search. In: Iberoamerican congress on pattern recognition. Springer, Cham, pp 683–690

  139. Rosa G, Papa J, Costa K, Passos L, Pereira C, Yang XS (2016) Learning parameters in deep belief networks through firefly algorithm. In: IAPR workshop on artificial neural networks in pattern recognition. Springer, Cham, pp 138–149

  140. Salakhutdinov R, Hinton GE (2009) Deep Boltzmann machines. In: AISTATS: 1, p 3

  141. Salakhutdinov R, Larochelle H (2010) Efficient learning of deep Boltzmann machines. In: Proceedings of the thirteenth international conference on artificial intelligence and statistics, pp 693–700

  142. Salimans T, Ho J, Chen X, Sidor S, Sutskever I (2017) Evolution strategies as a scalable alternative to reinforcement learning. arXiv:1703.03864

  143. Sánchez D, Melin P, Castillo O (2017) A grey wolf optimizer for modular granular neural networks for human recognition. Comput Intell Neurosci 2017:1–26

  144. Sarikaya R, Hinton GE, Deoras A (2014) Application of deep belief networks for natural language understanding. IEEE/ACM Trans Audio Speech Lang Process (TASLP) 22(4):778–784

  145. Schmidhuber J (2015) Deep learning in neural networks: an overview. Neural Netw 61:85–117

  146. Shafiee M, Wong A (2016) Evolutionary synthesis of deep neural networks via synaptic cluster-driven genetic encoding. In: NIPS Workshop on efficient methods for deep neural networks. Thirtieth conference on neural information processing systems, Barcelona, Spain, Dec 5–10, 2016

  147. Shenfield A, Rostami S (2017) Multi-objective evolution of artificial neural networks in multi-class medical diagnosis problems with class imbalance. In: 2017 IEEE conference on computational intelligence in bioinformatics and computational biology (CIBCB). IEEE, pp 1–8

  148. Shi Y (2011) An optimization algorithm based on brainstorming process. Int J Swarm Intell Res 2(4):35–62

  149. Shinozaki T, Watanabe S (2015) Structure discovery of deep neural network based on evolutionary algorithms. In: 2015 IEEE international conference on acoustics, speech, and signal processing (ICASSP). IEEE, pp 4979–4983. https://doi.org/10.1109/icassp.2015.7178918

  150. Silver D, Huang A, Maddison CJ, Guez A, Sifre L, Van Den Driessche G (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7587):484–489

  151. Simon D (2013) Evolutionary optimization algorithms. Wiley, New York

  152. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: ICLR

  153. Singh P, Dwivedi P (2018) Integration of new evolutionary approach with artificial neural network for solving short term load forecast problem. Appl Energy 217:537–549

  154. Song J, Niu Y (2016) Resilient finite-time stabilization of fuzzy stochastic systems with randomly occurring uncertainties and randomly occurring gain fluctuations. Neurocomputing 171:444–451

  155. Song YS, Hu J, Chen D, Ji D, Liu F (2016) Recursive approach to networked fault estimation with packet dropouts and randomly occurring uncertainties. Neurocomputing 214:340–349

  156. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014a) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

  157. Srivastava N, Hinton G, Krizhevsky A, Sutskever I, Salakhutdinov R (2014b) Dropout: a simple way to prevent neural networks from overfitting. J Mach Learn Res 15(1):1929–1958

  158. Srivastava RK, Greff K, Schmidhuber J (2015) Highway networks. arXiv:1505.00387

  159. Stanley KO, Miikkulainen R (2002) Evolving neural networks through augmenting topologies. Evolut Comput 10(2):99–127

  160. Stanley KO, Clune J, Lehman J, Miikkulainen R (2019) Designing neural networks through neuroevolution. Nat Mach Intell 1:24–35

  161. Suganthan PN (2018) On non-iterative learning algorithms with closed-form solution. Appl Soft Comput 70:1078–1082

  162. Sun Y, Xue B, Zhang M, Yen GG (2018a) A particle swarm optimization-based flexible convolutional autoencoder for image classification. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/tnnls.2018.2881143

  163. Sun Y, Yen GG, Yi Z (2018b) Evolving unsupervised deep neural networks for learning meaningful representations. IEEE Trans Evolut Comput 23:89–103

  164. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: 2015 IEEE conference on computer vision and pattern recognition (CVPR), Boston, MA, pp 1–9

  165. Takase T, Oyama S, Kurihara M (2018) Effective neural network training with adaptive learning rate based on training loss. Neural Netw 101:68–78

  166. Tan Y, Zhu Y (2010) Fireworks algorithm for optimization. In: International conference in swarm intelligence. Springer, Berlin, pp 355–364

  167. Tan SC, Watada J, Ibrahim Z, Khalid M (2015) Evolutionary fuzzy ARTMAP neural networks for classification of semiconductor defects. IEEE Trans Neural Netw Learn Syst 26(5):933–950

  168. The Theano Development Team, Al-Rfou R, Alain G, Almahairi A, Angermueller C, Bahdanau D et al (2016) Theano: a Python framework for fast computation of mathematical expressions. arXiv preprint arXiv:1605.02688

  169. Thirukovalluru R, Dixit S, Sevakula RK, Verma NK, Salour A (2016) Generating feature sets for fault diagnosis using denoising stacked auto-encoder. In: 2016 IEEE international conference on prognostics and health management (ICPHM). IEEE, pp 1–7

  170. Tieleman T, Hinton GE (2012) Lecture 6.5—rmsprop, COURSERA: neural networks for machine learning

  171. Tirumala SS (2014) Implementation of evolutionary algorithms for deep architectures. CEUR workshop proceedings

  172. Tomoumi T, Satoshi O, Masahito K (2018) Effective neural network training with adaptive learning rate based on training loss. Neural Netw 101:68–78

  173. Trivedi A, Srinivasan D, Sanyal K, Ghosh A (2017) A survey of multiobjective evolutionary algorithms based on decomposition. IEEE Trans Evolut Comput 21(3):440–462

  174. Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol PA (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11:3371–3408

  175. Wan L, Zeiler M, Zhang S, Le Cun Y, Fergus R (2013) Regularization of neural networks using dropconnect. In: International conference on machine learning, pp 1058–1066

  176. Wang B, Merrick KE, Abbass HA (2017) Co-operative coevolutionary neural networks for mining functional association rules. IEEE Trans Neural Netw Learn Syst 28(6):1331–1344

  177. Wang B, Sun Y, Xue B, Zhang M (2018a) A hybrid differential evolution approach to designing deep convolutional neural networks for image classification. In: The Australasian joint conference on artificial intelligence (AI 2018). Springer, pp 237–250

  178. Wang B, Sun Y, Xue B, Zhang M (2018b) Evolving deep convolutional neural networks by variable-length particle swarm optimization for image classification. arXiv preprint arXiv:1803.06492

  179. Wang R, Clune J, Stanley KO (2018c) VINE: an open source interactive data visualization tool for neuroevolution. In: GECCO '18 companion: genetic and evolutionary computation conference companion, July 15–19, Kyoto, Japan. ACM, New York, NY, USA

  180. Wang C, Xu C, Yao X, Tao D (2019) Evolutionary generative adversarial networks. IEEE Trans Evolut Comput. https://doi.org/10.1109/tevc.2019.2895748

  181. Wiatowski T, Bölcskei H (2018) A mathematical theory of deep convolutional neural networks for feature extraction. IEEE Trans Inf Theory 64(3):1845–1866

  182. Wu ZY, Rahaman A (2017) Optimized deep learning framework for water distribution data-driven modeling. In: XVIII international conference on water distribution systems analysis, WDSA2016, Procedia Engineering, vol 186, pp 261–268

  183. Xie L, Yuille A (2017) Genetic CNN. In: 2017 IEEE international conference on computer vision (ICCV), Venice, pp 1388–1397

  184. Yang XS (2010) Nature-inspired metaheuristic algorithms, 2nd edn. Luniver Press, Frome

  185. Yang H, Wang Z, Shu H, Alsaadi FE, Hayat T (2016) Almost sure H∞ sliding mode control for nonlinear stochastic systems with Markovian switching and time-delays. Neurocomputing 175(Part A):392–400

  186. Yao X (1999) Evolving artificial neural networks. Proc IEEE 87(9):1423–1447

  187. Yao X, Liu Y (1997) A new evolutionary system for evolving artificial neural networks. IEEE Trans Neural Netw 8(3):694–713

  188. Ye F (2017) Particle swarm optimization-based automatic parameter selection for deep neural networks and its applications in large-scale and high-dimensional data. PLoS ONE 12(12):e0188746

  189. Yuan Y, Sun F, Liu H, Yang H (2014a) Low-frequency robust control for singularly perturbed system. IET Control Theory Appl 9(2):203–210

  190. Yuan Z, Lu Y, Wang Z, Xue Y (2014b) Droid-sec: deep learning in android malware detection. In: ACM SIGCOMM computer communication review, vol 44(4). ACM, pp 371–372

  191. Yuan Z, Lu Y, Xue Y (2016) Droiddetector: android malware characterization and detection using deep learning. Tsinghua Sci Technol 21(1):114–123

  192. Zagoruyko S, Komodakis N (2016) Wide residual networks. arXiv preprint arXiv:1605.07146

  193. Zhang C, Lim P, Qin AK, Tan KC (2017a) Multiobjective deep belief networks ensemble for remaining useful life estimation in prognostics. IEEE Trans Neural Netw Learn Syst 28(10):2306–2318

  194. Zhang C, Tan KC, Li H, Hong GS (2017b) A cost-sensitive deep belief network for imbalanced classification. IEEE Trans Neural Netw Learn Syst. https://doi.org/10.1109/tnnls.2018.2832648

  195. Zhong Z, Yan J, Liu C-L (2018) Practical network blocks design with q-learning. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR 2018), pp 2423–2432

  196. Zhou C, Paffenroth RC (2017) Anomaly detection with robust deep autoencoders. In: Proceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, pp 665–674

  197. Zhou S, Chen Q, Wang X (2010) Discriminative deep belief networks for image classification. In: 2010 17th IEEE international conference on image processing (ICIP). IEEE, pp 1561–1564

  198. Zhou A, Qu BY, Li H, Zhao SZ, Suganthan PN, Zhang Q (2011) Multiobjective evolutionary algorithms: a survey of the state of the art. Swarm Evolut Comput 1(1):32–49

  199. Zhu G, Lizotte D, Hoey J (2014) Scalable approximate policies for Markov decision process models of hospital elective admissions. Artif Intell Med 61(1):21–34

  200. Zoph B, Vasudevan V, Shlens J, Le QV (2017) Learning transferable architectures for scalable image recognition. arXiv preprint arXiv:1707.07012

Author information

Corresponding author

Correspondence to Swagatam Das.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Darwish, A., Hassanien, A.E. & Das, S. A survey of swarm and evolutionary computing approaches for deep learning. Artif Intell Rev 53, 1767–1812 (2020). https://doi.org/10.1007/s10462-019-09719-2

Keywords

  • Deep learning
  • Metaheuristic algorithms
  • Artificial neural networks
  • Deep neural networks
  • Evolutionary computing
  • Swarm intelligence