A Comparative Analysis of Various Regularization Techniques to Solve Overfitting Problem in Artificial Neural Network

  • Shrikant Gupta
  • Rajat Gupta
  • Muneendra Ojha
  • K. P. Singh
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 799)


Neural networks with a large number of parameters are very effective machine learning tools. However, as the number of parameters grows, the network becomes slow to use and prone to overfitting. This paper discusses several ways to prevent overfitting and presents a comparative study of them, observing the effect of each regularization method on the performance of neural network models.
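For illustration, the regularization techniques compared here can be sketched in plain NumPy. This is a minimal sketch under our own assumptions (the function names and the inverted-dropout formulation are ours, not the authors' code): an L1 penalty, an L2 penalty, and a dropout mask applied to a layer's activations.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_penalty(w, lam):
    # L2 (weight decay) term added to the loss: lam * sum(w^2).
    # Its gradient, 2 * lam * w, shrinks all weights toward zero.
    return lam * np.sum(w ** 2)

def l1_penalty(w, lam):
    # L1 term added to the loss: lam * sum(|w|).
    # Its subgradient, lam * sign(w), drives many weights exactly to zero,
    # which is why L1 tends to produce sparse models.
    return lam * np.sum(np.abs(w))

def dropout(activations, p_drop, train=True):
    # Inverted dropout: during training, zero each unit with probability
    # p_drop and rescale the survivors by 1 / (1 - p_drop) so the expected
    # activation is unchanged; at test time, pass activations through as-is.
    if not train or p_drop == 0.0:
        return activations
    mask = (rng.random(activations.shape) >= p_drop) / (1.0 - p_drop)
    return activations * mask
```

In training, the L1 or L2 term is simply added to the data loss before backpropagation, while dropout is applied to hidden activations on the forward pass; the rescaling in inverted dropout avoids any adjustment at inference time.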


Keywords: Dropout · L1 regularization · L2 regularization · Neural network · Overfitting



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Shrikant Gupta (1)
  • Rajat Gupta (1)
  • Muneendra Ojha (1)
  • K. P. Singh (2)
  1. Dr. SPM International Institute of Information Technology Naya Raipur, Naya Raipur, India
  2. Indian Institute of Information Technology Allahabad, Allahabad, India
