Choosing a regression model

  • G. Barrie Wetherill
  • P. Duncombe
  • M. Kenward
  • J. Köllerström
  • S. R. Paul
  • B. J. Vowden
Part of the Monographs on Statistics and Applied Probability book series (MSAP)


There is a very large literature on methods of choosing a regression model, but in spite of this there is little clear guidance on what to do in a specific case. For access to the literature see Hocking (1976, 1983), Mosteller and Tukey (1977), Seber (1977), Thompson (1978), Daniel and Wood (1980) and Miller (1984). This chapter reviews the main points; for further details these references should be consulted.
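
To make the problem concrete, here is a minimal sketch (not from the chapter; the data are synthetic and the setup is an assumption for illustration) of one of the classical approaches surveyed in the literature above: fitting all possible subsets of candidate predictors and ranking them by Mallows' Cp, in the spirit of Mallows (1973) and Furnival (1971).

```python
# All-possible-regressions scored by Mallows' Cp (illustrative sketch).
# Synthetic data: only the first two of four candidate predictors matter.
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 4                      # observations, candidate predictors
X = rng.normal(size=(n, k))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)

def rss(cols):
    """Residual sum of squares for a model with an intercept plus cols."""
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    r = y - A @ beta
    return float(r @ r)

# sigma^2 is estimated from the full model containing all k predictors
s2 = rss(tuple(range(k))) / (n - k - 1)

results = []
for size in range(1, k + 1):
    for cols in combinations(range(k), size):
        p = size + 1              # parameters, counting the intercept
        cp = rss(cols) / s2 - (n - 2 * p)
        results.append((cp, cols))

# An adequate subset has Cp close to p; list the best few candidates
for cp, cols in sorted(results)[:3]:
    print(cols, round(cp, 2))
```

A model that omits a needed variable has inflated RSS and hence a large Cp, while the full model has Cp = k + 1 by construction, so subsets with Cp near the parameter count are the natural candidates. This exhaustive search is only feasible for a modest number of predictors, which is what motivates the branch-and-bound and leaps-and-bounds algorithms of Furnival and Wilson (1974) cited below.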




  1. Allen, D. M. (1974) The relationship between variable selection and data augmentation and a method for prediction. Technometrics, 16, 125–127.
  2. Anscombe, F. J. (1981) Computing in Statistical Science Through APL. Springer-Verlag, New York.
  3. Beale, E. M. L., Kendall, M. G. and Mann, D. W. (1967) The discarding of variables in multivariate analysis. Biometrika, 54, 357–366.
  4. Belsley, D., Kuh, E. and Welsch, R. E. (1980) Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley, New York.
  5. Berk, K. N. (1978) Comparing subset regression procedures. Technometrics, 20, 1–6.
  6. Clarke, M. R. B. (1981) A Givens algorithm for moving from one linear model to another without going back to the data. Algorithm AS 153. Appl. Statist., 30, 198–203.
  7. Clarke, M. R. B. (1982) The Gauss–Jordan sweep operator with detection of collinearity. Algorithm AS 178. Appl. Statist., 31, 166–168.
  8. Copas, J. B. (1983) Regression, prediction and shrinkage. J. Roy. Statist. Soc. B, 45, 311–354.
  9. Daniel, C. and Wood, F. S. (1980) Fitting Equations to Data. Wiley, New York.
  10. Forsythe, A. B., Engelman, L., Jennrich, R. and May, P. R. A. (1973) A stopping rule for variable selection in multiple regression. J. Amer. Statist. Assoc., 68, 75–77.
  11. Freedman, D. A. (1983) A note on screening regression equations. Amer. Statist., 37, 147–151.
  12. Furnival, G. M. (1971) All possible regressions with less computation. Technometrics, 13, 403–408.
  13. Furnival, G. M. and Wilson, R. W., Jr. (1974) Regression by leaps and bounds. Technometrics, 16, 499–512.
  14. Gorman, J. W. and Toman, R. J. (1966) Selection of variables for fitting equations to data. Technometrics, 8, 27–51.
  15. Henderson, H. V. and Velleman, P. F. (1981) Building multiple regression models interactively. Biometrics, 37, 391–411.
  16. Hocking, R. R. (1976) The analysis and selection of variables in linear regression. Biometrics, 32, 1–49.
  17. Hocking, R. R. (1983) Developments in linear regression methodology: 1959–1982. Technometrics, 25, 219–249.
  18. Hocking, R. R. and Leslie, R. N. (1967) Selection of the best subset in regression analysis. Technometrics, 9, 531–540.
  19. Judge, G. G., Griffiths, W. E., Hill, R. C. and Lee, Tsoung-Chao (1980) The Theory and Practice of Econometrics. Wiley, New York.
  20. Mallows, C. L. (1973) Some comments on Cp. Technometrics, 15, 661–675.
  21. Miller, A. J. (1984) Selection of subsets of regression variables. J. Roy. Statist. Soc. A, 147, 389–425.
  22. Morgan, J. A. and Tatar, J. F. (1972) Calculation of the residual sum of squares for all possible regressions. Technometrics, 14, 317–325.
  23. Mosteller, F. and Tukey, J. W. (1977) Data Analysis and Regression. Addison-Wesley, Reading, MA.
  24. Newton, R. G. and Spurrell, D. J. (1967a) A development of multiple regression for the analysis of routine data. Appl. Statist., 16, 51–65.
  25. Newton, R. G. and Spurrell, D. J. (1967b) Examples of the use of elements for clarifying regression analysis. Appl. Statist., 16, 165–171.
  26. Pope, P. T. and Webster, J. T. (1972) The use of an F-statistic in stepwise regression procedures. Technometrics, 14, 327–340.
  27. Preece, D. A. (1981) Distribution of final digits in data. The Statistician, 30, 31–60.
  28. Rencher, A. C. and Pun, F. C. (1980) Inflation of R² in best subset regression. Technometrics, 22, 49–54.
  29. Seber, G. A. F. (1977) Linear Regression Analysis. Wiley, New York.
  30. Spjøtvoll, E. (1972a) Multiple comparison of regression functions. Ann. Math. Statist., 43, 1076–1088.
  31. Spjøtvoll, E. (1972b) A note on a theorem of Forsythe and Golub. SIAM J. Appl. Math., 23, 307–311.
  32. Stone, M. (1974) Cross-validatory choice and assessment of statistical predictions. J. Roy. Statist. Soc. B, 36, 111–147.
  33. Thompson, M. L. (1978) Selection of variables in multiple regression. Part I: A review and evaluation; Part II: Chosen procedures, computations and examples. Int. Statist. Rev., 46, 1–19 and 129–146.
  34. Wilkinson, L. and Dallal, G. E. (1981) Tests of significance in forward selection regression with an F-to-enter stopping rule. Technometrics, 23, 377–380.

Copyright information

© G. Barrie Wetherill 1986

Authors and Affiliations

  • G. Barrie Wetherill (1)
  • P. Duncombe (2)
  • M. Kenward (3)
  • J. Köllerström (3)
  • S. R. Paul (4)
  • B. J. Vowden (3)

  1. Department of Statistics, The University of Newcastle upon Tyne, UK
  2. Applied Statistics Research Unit, University of Kent at Canterbury, UK
  3. Mathematical Institute, University of Kent at Canterbury, UK
  4. Department of Mathematics and Statistics, University of Windsor, Canada
