Random Regrouping and Factorization in Cooperative Particle Swarm Optimization Based Large-Scale Neural Network Training

  • Cody Dennis
  • Beatrice M. Ombuki-Berman (corresponding author)
  • Andries P. Engelbrecht
Article

Abstract

Previous studies have shown that factorization and random regrouping significantly improve the performance of the cooperative particle swarm optimization (CPSO) algorithm. However, few studies have examined whether this trend continues when CPSO is applied to the training of feed forward neural networks. Neural network training problems often have very high dimensionality and introduce the issue of saturation, which has been shown to significantly affect the behavior of particles in the swarm; it should therefore not be assumed that these trends hold. This study identifies the benefits of random regrouping and factorization for CPSO-based neural network training, and proposes a number of approaches to problem decomposition for use in neural network training. Experiments are performed on 11 problems, with sizes ranging from 35 to 32,811 weights and biases, using a number of general approaches to problem decomposition as well as state-of-the-art algorithms taken from the literature. The study finds that the impact of factorization and random regrouping on solution quality and swarm behavior depends heavily on the general approach to problem decomposition. A random problem decomposition is shown to be effective in feed forward neural network training, and it has the benefit of reducing the issue of problem decomposition to the tuning of a single parameter.
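
As a rough illustration of what a random problem decomposition with regrouping might look like, the Python sketch below splits the indices of a network's weight vector into disjoint random groups, one per subswarm, and then draws a fresh partition. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the 4-4-3 layer layout, and the choice of the number of groups as the single tunable parameter are all hypothetical, since the abstract does not specify them. Factorization and the CPSO update itself are not covered.

```python
import random


def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected feed forward network.

    Each layer contributes (inputs + 1 bias) * outputs parameters.
    """
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def random_decomposition(n_dims, n_groups, rng):
    """Randomly partition the indices 0..n_dims-1 into n_groups disjoint groups.

    Hypothetical helper: n_groups is assumed to be the only parameter to tune.
    """
    if not 1 <= n_groups <= n_dims:
        raise ValueError("n_groups must be between 1 and n_dims")
    indices = list(range(n_dims))
    rng.shuffle(indices)
    # Deal the shuffled indices round-robin so group sizes differ by at most one.
    return [sorted(indices[g::n_groups]) for g in range(n_groups)]


rng = random.Random(0)  # fixed seed only to make the sketch reproducible

# Hypothetical 4-4-3 architecture, chosen purely for illustration.
n_dims = count_parameters([4, 4, 3])

# Initial assignment of weight indices to five subswarms.
groups = random_decomposition(n_dims, 5, rng)

# Random regrouping, in this sketch, is simply a fresh draw of the partition.
regrouped = random_decomposition(n_dims, 5, rng)
```

In a cooperative setting, each group would be optimized by its own subswarm while the remaining dimensions are held fixed at the current best-known values; how often to regroup is left open here.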

Keywords

Feed forward neural network; Particle swarm optimization; Random regrouping; Factorization; Variable interdependence; Saturation

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Cody Dennis (1)
  • Beatrice M. Ombuki-Berman (1, corresponding author)
  • Andries P. Engelbrecht (2)

  1. Department of Computer Science, Brock University, St. Catharines, Canada
  2. Department of Industrial Engineering and Computer Science Division, Stellenbosch University, Stellenbosch, South Africa