Abstract
Deep learning is currently a "hot topic" in machine learning. However, deep networks remain constrained in their size and complexity because the algorithms used to optimise them are computationally expensive. This paper examines the potential of optimising deep neural networks using particle swarm optimisation (PSO) as a substitute for the most common methods, contrastive divergence (CD) and stochastic gradient descent. It investigates the problems caused by applying PSO in such high-dimensional search spaces and the issues that arise when divide-and-conquer techniques are applied to neural networks. A novel network architecture, dubbed semi-disjoint expanded networks (SdENs), is proposed to overcome PSO's limited effectiveness in high-dimensional spaces. A comparative analysis is performed between the proposed model and more popular techniques. Our experimental results suggest that the proposed technique can perform a similar role to the more traditional pre-training technique of CD; however, the deeper networks it requires suffer from the vanishing gradient problem. This paper serves to highlight the issues prevalent in this new and fertile ground of research.
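As a concrete illustration of the core idea, the following Python sketch uses a plain global-best PSO to minimise the reconstruction error of a single tied-weight autoencoder layer, which is one way PSO could stand in for CD-style layer-wise pre-training. The layer sizes, swarm hyper-parameters, and synthetic data are assumptions for illustration only, not the paper's actual configuration (which relies on the proposed SdEN architecture to cope with the high dimensionality).

# Illustrative sketch: pre-training one autoencoder layer with PSO
# instead of contrastive divergence. All sizes and PSO hyper-parameters
# below are assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 64, 16          # assumed layer sizes
X = rng.random((200, n_visible))      # stand-in for real training data

dim = n_visible * n_hidden            # each particle encodes one weight matrix
n_particles, n_iters = 30, 100
w_inertia, c1, c2 = 0.7, 1.5, 1.5     # standard PSO coefficients

def reconstruction_error(flat_w):
    """Fitness: mean squared reconstruction error of a tied-weight autoencoder."""
    W = flat_w.reshape(n_visible, n_hidden)
    hidden = np.tanh(X @ W)           # encode
    recon = np.tanh(hidden @ W.T)     # decode with tied weights
    return np.mean((X - recon) ** 2)

# Initialise swarm positions (candidate weight matrices) and velocities.
pos = rng.normal(0.0, 0.1, (n_particles, dim))
vel = np.zeros((n_particles, dim))
pbest = pos.copy()
pbest_err = np.array([reconstruction_error(p) for p in pos])
gbest = pbest[np.argmin(pbest_err)].copy()
gbest_err = pbest_err.min()

for _ in range(n_iters):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = (w_inertia * vel
           + c1 * r1 * (pbest - pos)
           + c2 * r2 * (gbest - pos))
    pos = pos + vel
    errs = np.array([reconstruction_error(p) for p in pos])
    improved = errs < pbest_err
    pbest[improved], pbest_err[improved] = pos[improved], errs[improved]
    if errs.min() < gbest_err:
        gbest, gbest_err = pos[errs.argmin()].copy(), errs.min()

print(f"final reconstruction error: {gbest_err:.4f}")

Note that each particle must encode every weight of the layer, so the search dimensionality grows with layer size; this is precisely the scaling issue the paper's SdEN architecture is designed to mitigate.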
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kenny, A., Li, X. (2017). A Study on Pre-training Deep Neural Networks Using Particle Swarm Optimisation. In: Shi, Y., et al. (eds.) Simulated Evolution and Learning. SEAL 2017. Lecture Notes in Computer Science, vol. 10593. Springer, Cham. https://doi.org/10.1007/978-3-319-68759-9_30
DOI: https://doi.org/10.1007/978-3-319-68759-9_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68758-2
Online ISBN: 978-3-319-68759-9
eBook Packages: Computer Science, Computer Science (R0)