Abstract
Deep learning is currently a "hot topic" in machine learning. However, deep networks remain constrained in their size and complexity because the algorithms used to optimise them are computationally expensive. This paper examines the potential of optimising deep neural networks using particle swarm optimisation (PSO) as a substitute for the most common methods, contrastive divergence (CD) and stochastic gradient descent. It investigates the problems caused by applying PSO in such high-dimensional search spaces and the issues that arise when divide-and-conquer techniques are applied to neural networks. A novel network architecture, dubbed semi-disjoint expanded networks (SdENs), is proposed to overcome PSO's limited effectiveness in high-dimensional spaces. A comparative analysis is performed between the proposed model and more popular techniques. Our experimental results suggest that the proposed technique can perform a similar role to the more traditional pre-training technique of CD; however, the deeper networks it requires suffer from the vanishing gradient problem. This paper serves to highlight the issues prevalent in this new and fertile ground of research.
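As a concrete illustration of the core idea, the following Python sketch uses a plain global-best PSO to minimise the reconstruction error of a single tied-weight autoencoder layer, which is one way PSO could stand in for CD-style layer-wise pre-training. The layer sizes, swarm hyper-parameters, and synthetic data are assumptions for illustration only, not the paper's actual configuration (which relies on the proposed SdEN architecture to cope with the high dimensionality).

# Illustrative sketch: pre-training one autoencoder layer with PSO
# instead of contrastive divergence. All sizes and PSO hyper-parameters
# below are assumptions, not the paper's settings.
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 64, 16          # assumed layer sizes
X = rng.random((200, n_visible))      # stand-in for real training data

dim = n_visible * n_hidden            # each particle encodes one weight matrix
n_particles, n_iters = 30, 100
w_inertia, c1, c2 = 0.7, 1.5, 1.5     # standard PSO coefficients

def reconstruction_error(flat_w):
    """Fitness: mean squared reconstruction error of a tied-weight autoencoder."""
    W = flat_w.reshape(n_visible, n_hidden)
    hidden = np.tanh(X @ W)           # encode
    recon = np.tanh(hidden @ W.T)     # decode with tied weights
    return np.mean((X - recon) ** 2)

# Initialise swarm positions (candidate weight matrices) and velocities.
pos = rng.normal(0.0, 0.1, (n_particles, dim))
vel = np.zeros((n_particles, dim))
pbest = pos.copy()
pbest_err = np.array([reconstruction_error(p) for p in pos])
gbest = pbest[np.argmin(pbest_err)].copy()
gbest_err = pbest_err.min()

for _ in range(n_iters):
    r1, r2 = rng.random((2, n_particles, dim))
    vel = (w_inertia * vel
           + c1 * r1 * (pbest - pos)
           + c2 * r2 * (gbest - pos))
    pos = pos + vel
    errs = np.array([reconstruction_error(p) for p in pos])
    improved = errs < pbest_err
    pbest[improved], pbest_err[improved] = pos[improved], errs[improved]
    if errs.min() < gbest_err:
        gbest, gbest_err = pos[errs.argmin()].copy(), errs.min()

print(f"final reconstruction error: {gbest_err:.4f}")

Note that each particle must encode every weight of the layer, so the search dimensionality grows with layer size; this is precisely the scaling issue the paper's SdEN architecture is designed to mitigate.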
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Kenny, A., Li, X. (2017). A Study on Pre-training Deep Neural Networks Using Particle Swarm Optimisation. In: Shi, Y., et al. (eds.) Simulated Evolution and Learning. SEAL 2017. Lecture Notes in Computer Science, vol. 10593. Springer, Cham. https://doi.org/10.1007/978-3-319-68759-9_30
DOI: https://doi.org/10.1007/978-3-319-68759-9_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-68758-2
Online ISBN: 978-3-319-68759-9
eBook Packages: Computer Science, Computer Science (R0)