A Study on Pre-training Deep Neural Networks Using Particle Swarm Optimisation

  • Conference paper
  • First Online:

Simulated Evolution and Learning (SEAL 2017)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10593)

Abstract

Deep learning is currently a “hot topic” in machine learning. Deep networks are, however, constrained in their size and complexity because the algorithms used to optimise them are computationally expensive. This paper examines the potential of particle swarm optimisation (PSO) as a substitute for the most common optimisation methods, contrastive divergence (CD) and stochastic gradient descent, when pre-training deep neural networks. It investigates the problems caused by applying PSO in such high-dimensional search spaces and the issues around applying divide-and-conquer techniques to neural networks. A novel network architecture, dubbed semi-disjoint expanded networks (SdENs), is proposed to overcome the limitations imposed by PSO's low-dimensional capabilities. A comparative analysis is performed between the proposed model and more popular techniques. Our experimental results suggest that the proposed technique can perform a function similar to the traditional pre-training technique of CD; however, the deeper networks it requires suffer from the vanishing gradient problem. This paper serves to highlight the issues prevalent in this new and fertile ground of research.
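The full paper is paywalled, but the core idea the abstract describes, treating a network's flattened weight vector as a particle's position and letting a swarm minimise the training loss, is straightforward to sketch. The snippet below is a minimal, illustrative sketch of plain global-best PSO (Kennedy and Eberhart, 1995) training a toy one-hidden-layer network in Python; it is not the authors' SdEN architecture, and the swarm size, inertia weight, acceleration coefficients, and toy task are assumptions made for illustration, not values taken from the paper.

    import numpy as np

    # Minimal, illustrative sketch: global-best PSO searching the flattened
    # weight vector of a tiny one-hidden-layer network. Swarm size, inertia
    # and acceleration coefficients are assumed values, not from the paper.
    rng = np.random.default_rng(0)
    N_IN, N_HID = 4, 8
    DIM = N_IN * N_HID + N_HID   # length of the flat weight vector (no biases)

    def forward(w, X):
        # Unpack the flat vector into two weight matrices and run the net.
        W1 = w[:N_IN * N_HID].reshape(N_IN, N_HID)
        W2 = w[N_IN * N_HID:].reshape(N_HID, 1)
        return 1.0 / (1.0 + np.exp(-(np.tanh(X @ W1) @ W2)))  # sigmoid output

    def loss(w, X, y):
        # Mean squared error: the fitness each particle tries to minimise.
        return float(np.mean((forward(w, X) - y) ** 2))

    # Toy task (an assumption): predict XOR of the first two binary inputs.
    X = rng.integers(0, 2, size=(64, N_IN)).astype(float)
    y = (X[:, 0] != X[:, 1]).astype(float).reshape(-1, 1)

    # Swarm state: random positions (candidate weight vectors), zero velocities.
    n_particles, inertia, c1, c2 = 30, 0.72, 1.5, 1.5
    pos = rng.uniform(-1.0, 1.0, size=(n_particles, DIM))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([loss(p, X, y) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()

    for _ in range(200):
        r1, r2 = rng.random((2, n_particles, DIM))
        # Velocity update: inertia plus pulls towards personal and global bests.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([loss(p, X, y) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()

    print("best training MSE:", pbest_f.min())

Even this toy example hints at the dimensionality problem the abstract raises: the length of the weight vector grows with the product of adjacent layer widths, and plain PSO is known to degrade as the search space climbs into the thousands of dimensions, which is what motivates the divide-and-conquer and SdEN ideas the paper investigates.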

Author information

Corresponding author

Correspondence to Xiaodong Li.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Kenny, A., Li, X. (2017). A Study on Pre-training Deep Neural Networks Using Particle Swarm Optimisation. In: Shi, Y., et al. (eds.) Simulated Evolution and Learning. SEAL 2017. Lecture Notes in Computer Science, vol 10593. Springer, Cham. https://doi.org/10.1007/978-3-319-68759-9_30

  • DOI: https://doi.org/10.1007/978-3-319-68759-9_30

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68758-2

  • Online ISBN: 978-3-319-68759-9

  • eBook Packages: Computer Science, Computer Science (R0)
