A Penalization Criterion Based on Noise Behaviour for Model Selection

  • Joaquín Pizarro Junquera
  • Pedro Galindo Riaño
  • Elisa Guerrero Vázquez
  • Andrés Yañez Escolano
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2085)


Complexity-penalization strategies are one way to choose the most appropriate network size, addressing the trade-off between overfitted and underfitted models. In this paper we propose a new penalty term, derived from the behaviour of candidate models under noisy conditions, that seems to be much more robust against catastrophic overfitting errors than standard techniques. The strategy is applied to several regression problems using polynomial functions, univariate autoregressive models and RBF neural networks. The simulation study at the end of the paper shows that the proposed criterion is extremely competitive when compared to state-of-the-art criteria.
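
The abstract does not state the proposed penalty term, so the following is only a minimal sketch of the general complexity-penalization idea on a polynomial regression problem. It uses the standard Akaike Information Criterion in its Gaussian form, n·log(RSS/n) + 2k, not the authors' criterion. The function names, the synthetic cubic target, the noise level and the maximum degree are all assumptions made for illustration.

```python
import numpy as np


def aic_score(y, y_hat, n_params):
    """Gaussian AIC up to an additive constant: n * log(RSS / n) + 2 * k.

    Assumes the residual sum of squares is strictly positive.
    """
    n = len(y)
    rss = float(np.sum((y - y_hat) ** 2))
    return n * np.log(rss / n) + 2 * n_params


def select_degree(x, y, max_degree):
    """Fit polynomials of degree 0..max_degree and return (best_degree, scores).

    scores[d] is the penalized score of the degree-d least-squares fit;
    the selected model is the one with the lowest score.
    """
    scores = []
    for degree in range(max_degree + 1):
        coeffs = np.polyfit(x, y, degree)   # least-squares polynomial fit
        y_hat = np.polyval(coeffs, x)       # fitted values on the training inputs
        scores.append(aic_score(y, y_hat, degree + 1))
    return int(np.argmin(scores)), scores


# Hypothetical data: a cubic target observed with additive Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 60)
y = 1.0 - 2.0 * x + 3.0 * x**3 + rng.normal(scale=0.2, size=x.size)

best_degree, scores = select_degree(x, y, max_degree=8)
print("selected degree:", best_degree)
```

The training error alone can only decrease as the degree grows, so without the 2k term the largest candidate would always be selected. The penalty is what lets an intermediate model win, and the choice of penalty determines how often the selection overfits on small or noisy samples.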


Akaike Information Criterion, Candidate Model, Penalty Term, Generalization Error, Model Selection Criterion
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Joaquín Pizarro Junquera (1)
  • Pedro Galindo Riaño (1)
  • Elisa Guerrero Vázquez (1)
  • Andrés Yañez Escolano (1)
  1. Departamento de Lenguajes y Sistemas Informáticos, Grupo de Investigación “Sistemas Inteligentes de Computación”, C.A.S.E.M., Universidad de Cadiz, Puerto Real, Spain
