Prescriptions for Working Statisticians, pp. 181–213

# Independent Variable Selection in Multiple Regression

Chapter

## Abstract

The data analyst confronted with a large number of variables, from which he must select a parsimonious subset to serve as independent variables in a multiple regression, faces a number of technical issues. What criterion should he use to judge the adequacy of his selection? What procedure should he use to select the subset of independent variables? And how should he check for, guard against, or correct for possible multicollinearity in his chosen set of independent variables?
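The chapter itself develops answers to these questions; purely as an illustration of the kind of machinery involved (not the chapter's own method), the sketch below runs forward stepwise selection using adjusted R² as the adequacy criterion and then computes variance inflation factors (VIFs) as a multicollinearity check. The simulated data and all variable names are invented for the example; the near-duplicate column is there deliberately so the VIF diagnostic has something to flag.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Simulated data: column 1 is a near-duplicate of column 0, so the candidate
# set is deliberately multicollinear; only columns 0 and 2 actually drive y.
x0 = rng.normal(size=n)
x1 = x0 + 0.05 * rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)          # irrelevant candidate
X = np.column_stack([x0, x1, x2, x3])
y = 2.0 * x0 + 1.0 * x2 + rng.normal(size=n)

def r2(X_sub, y):
    """Plain R^2 of an OLS fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X_sub])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()

def adj_r2(X_sub, y):
    """Adjusted R^2, which penalizes each added variable."""
    n_obs, k = X_sub.shape
    return 1.0 - (1.0 - r2(X_sub, y)) * (n_obs - 1) / (n_obs - k - 1)

# Forward selection: at each step add the candidate that most improves
# adjusted R^2; stop when no remaining candidate improves it.
selected, remaining, current = [], list(range(X.shape[1])), -np.inf
while remaining:
    best, j_best = max((adj_r2(X[:, selected + [j]], y), j) for j in remaining)
    if best <= current:
        break
    current = best
    selected.append(j_best)
    remaining.remove(j_best)

# Multicollinearity check: VIF_j = 1 / (1 - R^2_j), where R^2_j comes from
# regressing x_j on the other candidates.  Values above ~10 are a common
# (rule-of-thumb) warning sign.
vifs = []
for j in range(X.shape[1]):
    others = [k for k in range(X.shape[1]) if k != j]
    vifs.append(1.0 / (1.0 - r2(X[:, others], X[:, j])))

print("selected columns:", sorted(selected))
print("VIFs:", np.round(vifs, 1))
```

On this data the procedure keeps the informative columns and the VIFs for the two near-duplicate columns are very large, signalling that at most one of them belongs in the final equation; ridge regression, one of the remedies surveyed in the chapter, is the usual alternative when such columns cannot simply be dropped.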

## Keywords

Matrix Inversion · Ridge Regression · Stepwise Procedure · Multivariate Normal Distribution · Multiple Correlation Coefficient



## Copyright information

© Springer-Verlag New York Inc. 1988