Using Particle Swarm Optimization with Gradient Descent for Parameter Learning in Convolutional Neural Networks

  • Conference paper
  • In: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications (CIARP 2021)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12702)

Abstract

Gradient-based methods are ubiquitously used to update the internal parameters of neural networks. Problems commonly associated with gradient-based methods are their tendency to get stuck in sub-optimal local minima and their slow convergence rate. Efficacious solutions to these issues, such as the addition of “momentum” and adaptive learning rates, have been offered. In this paper, we investigate the efficacy of using particle swarm optimization (PSO) to help gradient-based methods search for the internal parameters that minimize the loss function of a convolutional neural network (CNN). We compare the metric performance of traditional gradient-based methods with and without the use of a PSO to either guide or refine the search for the optimal weights. The gradient-based methods we examine are stochastic gradient descent with and without a momentum term, as well as Adaptive Moment Estimation (Adam). We find that, with the exception of the Adam-optimized networks, regular gradient-based methods achieve better metric scores than when used in conjunction with a PSO. We also observe that using a PSO to refine the solution found through a gradient-descent technique reduces loss more effectively than using a PSO to dictate the starting solution for gradient descent. Ultimately, the best solution on the MNIST dataset was achieved by the network optimized with stochastic gradient descent and momentum, with an average loss of 0.0092 when evaluated using k-fold cross-validation.
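To make the “refine” scheme from the abstract concrete, the sketch below shows gradient descent producing a weight vector and a particle swarm seeded in its neighbourhood searching for a lower loss. This is a minimal illustration, not the authors' implementation: it uses the standard Kennedy-Eberhart velocity/position update, the function name `pso_refine` and all hyperparameter values are assumptions, and a toy quadratic stands in for the CNN's loss over its flattened weights.

```python
import numpy as np

# Minimal sketch of PSO refinement (standard Kennedy-Eberhart updates).
# Assumption: `loss` maps a flat weight vector to a scalar; in the paper's
# setting this would be the CNN's loss evaluated on its flattened weights.
def pso_refine(loss, x0, n_particles=20, n_iters=50,
               inertia=0.7, c1=1.5, c2=1.5, spread=0.01, seed=0):
    rng = np.random.default_rng(seed)
    dim = x0.size
    # Seed the swarm in a small neighbourhood of the gradient-descent solution.
    pos = x0 + spread * rng.standard_normal((n_particles, dim))
    vel = np.zeros((n_particles, dim))
    pbest, pbest_val = pos.copy(), np.array([loss(p) for p in pos])
    g = pbest[pbest_val.argmin()].copy()   # global best position
    g_val = pbest_val.min()
    for _ in range(n_iters):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Velocity: inertia + cognitive (personal best) + social (global best).
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (g - pos)
        pos = pos + vel
        vals = np.array([loss(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        if pbest_val.min() < g_val:
            g_val = pbest_val.min()
            g = pbest[pbest_val.argmin()].copy()
    return g, g_val

# Toy usage: a quadratic with minimum at 1.0 stands in for the CNN loss.
toy_loss = lambda w: float(np.sum((w - 1.0) ** 2))
x_gd = np.full(10, 0.9)   # pretend gradient descent stopped here
w_refined, best = pso_refine(toy_loss, x_gd)
print(f"loss at GD solution: {toy_loss(x_gd):.4f}, after PSO: {best:.4f}")
```

The “guide” variant described in the abstract reverses the order: the swarm's global best would be handed to gradient descent as its starting point rather than used to polish its output.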



Author information

Correspondence to Dustin van der Haar.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Wessels, S., van der Haar, D. (2021). Using Particle Swarm Optimization with Gradient Descent for Parameter Learning in Convolutional Neural Networks. In: Tavares, J.M.R.S., Papa, J.P., González Hidalgo, M. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2021. Lecture Notes in Computer Science, vol 12702. Springer, Cham. https://doi.org/10.1007/978-3-030-93420-0_12

  • DOI: https://doi.org/10.1007/978-3-030-93420-0_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93419-4

  • Online ISBN: 978-3-030-93420-0

  • eBook Packages: Computer Science, Computer Science (R0)
