Selection of main effects
Model specification is the most difficult part of prediction modelling [472]. Especially in smaller data sets, it is virtually impossible to obtain a reliable answer to the question: which predictors are important and which are not? In this chapter, we focus on the advantages and problems associated with model reduction techniques such as stepwise selection, including overfitting and the effect on the quality of predictions from a model. Specific issues include instability of the selection, biased estimation of coefficients, and overstated statistical significance (p-values that are too small). We explore the influence of including noise variables as predictors in a model, and find that their influence is not detrimental enough to justify the widespread use of stepwise methods. Alternative approaches include drawing up a limited list of candidate predictors to consider for the prediction model, e.g. based on a meta-analysis of the available literature, and some more modern selection methods.
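The biased estimation of coefficients mentioned above can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not the analysis used in this chapter: a single standard-normal predictor with a true coefficient of 0.2 in a linear model, 50 subjects per data set, and selection of the predictor whenever |estimate / standard error| exceeds 1.96 (a normal approximation to the usual p < 0.05 rule). The function names fit_slope and simulate are hypothetical.

```python
import math
import random
import statistics


def fit_slope(x, y):
    """Least-squares slope and its standard error for y = a + b*x."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return b, se


def simulate(true_beta=0.2, n=50, n_sim=2000, z_crit=1.96, seed=1):
    """Compare coefficient estimates with and without selection.

    All settings are illustrative assumptions, not values from the text.
    """
    rng = random.Random(seed)
    all_b, selected_b = [], []
    for _ in range(n_sim):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [true_beta * xi + rng.gauss(0, 1) for xi in x]
        b, se = fit_slope(x, y)
        all_b.append(b)
        # Keep the estimate only if the predictor would be "selected"
        # (normal approximation to a two-sided test at the 5% level).
        if abs(b / se) > z_crit:
            selected_b.append(b)
    return {
        "mean_all": statistics.fmean(all_b),
        "mean_selected": statistics.fmean(selected_b),
        "fraction_selected": len(selected_b) / n_sim,
    }


result = simulate()
print(result)
```

Under these assumptions the average over all simulated data sets should lie close to the true value of 0.2, whereas the average over only those data sets in which the predictor was selected should be clearly larger. The predictor is selected in only part of the data sets, which is a simple form of the instability of selection.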