Page 185 - Applied Statistics with R
Chapter 10
Model Building
“Statisticians, like artists, have the bad habit of falling in love with
their models.”
— George Box
Let’s take a step back and consider the process of finding a model for data at a
higher level. We are attempting to find a model for a response variable $y$ based
on a number of predictors $x_1, x_2, x_3, \ldots, x_{p-1}$.
Essentially, we are trying to discover the functional relationship between $y$ and
the predictors. In the previous chapter we were fitting models for a car’s fuel
efficiency (mpg) as a function of its attributes (wt, year, cyl, disp, hp, acc). We
also consider $y$ to be a function of some noise. Rarely if ever do we expect there
to be an exact functional relationship between the predictors and the response.
$$
y = f(x_1, x_2, x_3, \ldots, x_{p-1}) + \epsilon
$$
We can think of this as
response = signal + noise.
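A small simulation can make this decomposition concrete. The sketch below is only an illustration and does not use the chapter's data: the signal function `f`, the sample size `n`, and the noise standard deviation are all assumed values chosen for the example.

```r
# Hypothetical illustration: simulate response = signal + noise.
set.seed(42)
n  <- 100
x1 <- runif(n, 0, 10)
x2 <- runif(n, 0, 10)

# Assumed "true" signal; in practice f is unknown and must be estimated.
f <- function(x1, x2) 3 + 2 * x1 - 0.5 * x2

signal <- f(x1, x2)
noise  <- rnorm(n, mean = 0, sd = 1)
y      <- signal + noise

# The response differs from the signal only by the noise term.
all.equal(y - signal, noise)
```

In a real analysis we observe only `y` and the predictors; neither `signal` nor `noise` is available separately, which is why a model for the signal is needed.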
We could consider all sorts of complicated functions for $f$. You will likely encounter
several ways of doing this in future machine learning courses. So far in
this course we have focused on (multiple) linear regression. That is
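$$
\begin{aligned}
y &= f(x_1, x_2, x_3, \ldots, x_{p-1}) + \epsilon \\
  &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_{p-1} x_{p-1} + \epsilon
\end{aligned}
$$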
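As a minimal sketch of what fitting this linear form looks like in R, the code below uses `lm()` with three predictors. It relies on the built-in `mtcars` data set as a stand-in, which is an assumption made for convenience and is not the fuel efficiency data from the previous chapter; the choice of predictors is likewise only illustrative.

```r
# Stand-in data: the built-in mtcars data set, not the chapter's data.
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

# Estimates of beta_0 (intercept) and beta_1, beta_2, beta_3.
beta_hat <- coef(fit)
beta_hat

# The fitted values are the linear combination of the predictors
# (plus a column of ones for the intercept) with the estimates.
X <- model.matrix(fit)
all.equal(unname(drop(X %*% beta_hat)), unname(fitted(fit)))
```

The last line checks that the fitted signal is exactly the linear form written above, with the unknown coefficients replaced by their estimates.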
In the big picture of possible models that we could fit to this data, this is a
rather restrictive model. What do we mean by a restrictive model?

