Page 383 - Applied Statistics with R
Chapter 16
Variable Selection and Model Building
“Choose well. Your choice is brief, and yet endless.”
— Johann Wolfgang von Goethe
After reading this chapter you will be able to:
• Understand the trade-off between goodness-of-fit and model complexity.
• Use variable selection procedures to find a good model from a set of
possible models.
• Understand the two uses of models: explanation and prediction.
In the last chapter, we saw how correlation between predictor variables can have
undesirable effects on models. We used variance inflation factors to assess the
severity of the collinearity issues caused by these correlations. We also saw how
fitting a smaller model, leaving out some of the correlated predictors, results
in a model which no longer suffers from collinearity issues. But how should we
choose this smaller model?
In this chapter, we will discuss several criteria and procedures for choosing a
“good” model from among a choice of many.
16.1 Quality Criterion
So far, we have seen criteria such as $R^2$ and RMSE for assessing quality of fit.
However, both of these have a fatal flaw. By increasing the size of a model, that
is, by adding predictors, these criteria can at worst fail to improve. It is impossible to add a

