Automatic Selection for Non-linear Models

  • Jennifer L. Castle
  • David F. Hendry


Our strategy for automatic selection in potentially non-linear processes has three steps: first, test for non-linearity in the unrestricted linear formulation; second, if that test rejects, specify a general model using polynomials, to be simplified to a minimal congruent representation; finally, select by encompassing tests of specific non-linear forms against the selected model. Non-linearity poses many problems: extreme observations leading to non-normal (fat-tailed) distributions; collinearity between non-linear functions; usually more variables than observations when approximating the non-linearity; and excess retention of irrelevant variables. Solutions to these problems are proposed. A returns-to-education empirical application demonstrates the feasibility of the non-linear automatic model selection algorithm Autometrics.
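
As a rough illustration of the first two steps of the strategy, the sketch below fits a linear regression, augments it with squares and cubes of the regressors, and computes an F-statistic for the added polynomial terms. This is a generic polynomial-augmentation test written under stated assumptions, not the specific non-linearity test or the Autometrics algorithm used by the authors. The function names (`solve`, `rss`, `nonlinearity_f_test`) and the simulated data are hypothetical and chosen only for this example.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rss(Z, y):
    """Residual sum of squares from an OLS regression of y on the columns of Z."""
    k = len(Z[0])
    ztz = [[sum(row[i] * row[j] for row in Z) for j in range(k)] for i in range(k)]
    zty = [sum(row[i] * yi for row, yi in zip(Z, y)) for i in range(k)]
    beta = solve(ztz, zty)
    return sum((yi - sum(b * z for b, z in zip(beta, row))) ** 2
               for row, yi in zip(Z, y))

def nonlinearity_f_test(X, y):
    """F-statistic for adding squares and cubes of each regressor to a linear model."""
    n = len(y)
    linear = [[1.0] + list(row) for row in X]                 # intercept + regressors
    general = [lin + [v ** 2 for v in row] + [v ** 3 for v in row]
               for lin, row in zip(linear, X)]                # polynomial extension
    q = len(general[0]) - len(linear[0])                      # number of added terms
    df = n - len(general[0])                                  # residual degrees of freedom
    rss0, rss1 = rss(linear, y), rss(general, y)
    return ((rss0 - rss1) / q) / (rss1 / df), q, df

# Simulated data (an assumption for illustration): one regressor, 200 observations.
random.seed(0)
x = [random.uniform(-2, 2) for _ in range(200)]
X = [[v] for v in x]
y_lin = [1.0 + 0.5 * v + random.gauss(0, 1) for v in x]                   # linear process
y_quad = [1.0 + 0.5 * v + 0.8 * v ** 2 + random.gauss(0, 1) for v in x]   # quadratic process

f_lin, q, df = nonlinearity_f_test(X, y_lin)
f_quad, _, _ = nonlinearity_f_test(X, y_quad)
print(f"F({q}, {df}) linear process:    {f_lin:.2f}")
print(f"F({q}, {df}) quadratic process: {f_quad:.2f}")
```

Each statistic would be compared with a critical value from the F(q, df) distribution. A rejection would suggest moving on to the general polynomial specification, which then needs to be simplified by a selection algorithm. With many regressors the polynomial terms multiply quickly and become collinear, which is one of the problems the abstract lists and which this sketch makes no attempt to address.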


Keywords: Irrelevant Variable · Extreme Observation · Linear Regressor · Model Selection Algorithm · Excess Retention



Copyright information

© Springer-Verlag London Limited 2012

Authors and Affiliations

  1. Magdalen College & Institute for New Economic Thinking at the Oxford Martin School, University of Oxford, Oxford, UK
  2. Economics Department & Institute for New Economic Thinking at the Oxford Martin School, University of Oxford, Oxford, UK
