Page 367 - Applied Statistics with R


15.1 Exact Collinearity

As a result of this issue, R essentially chose to fit the model y ~ x1 + x2.
However, notice that two other models would accomplish exactly the same fit.

fit1 = lm(y ~ x1 + x2, data = exact_collin_data)
fit2 = lm(y ~ x1 + x3, data = exact_collin_data)
fit3 = lm(y ~ x2 + x3, data = exact_collin_data)

We see that the fitted values for each of the three models are exactly the same.
This is a result of $x_3$ containing all of the information from $x_1$ and $x_2$. As long
as one of $x_1$ or $x_2$ is included in the model, $x_3$ can be used to recover the
information from the variable not included.

all.equal(fitted(fit1), fitted(fit2))
## [1] TRUE

all.equal(fitted(fit2), fitted(fit3))

## [1] TRUE
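
The reason the fits agree can be sketched on simulated data. The data below are hypothetical, not the book's `exact_collin_data`: the relation `x3 = 2 * x1 + 4 * x2 + 3`, the seed, and every `sim_*` name are assumptions made for illustration. With an exact linear dependence, the full design matrix should have one fewer independent column than it has columns, `lm()` should report `NA` for the redundant predictor, and any two of the three predictors should give the same fitted values.

```r
# Hypothetical data; the exact relation between x3, x1, and x2 is an assumption
set.seed(42)
n  = 100
x1 = rnorm(n, mean = 80, sd = 10)
x2 = rnorm(n, mean = 70, sd = 5)
x3 = 2 * x1 + 4 * x2 + 3
y  = 3 + x1 + x2 + rnorm(n, mean = 0, sd = 1)
sim_data = data.frame(y, x1, x2, x3)

# Design matrix of the full model: four columns, but x3 adds no new direction
X = model.matrix(y ~ x1 + x2 + x3, data = sim_data)
ncol(X)
qr(X)$rank

# lm() cannot estimate a separate coefficient for the redundant column
sim_full = lm(y ~ x1 + x2 + x3, data = sim_data)
coef(sim_full)

# Any two of the three predictors span the same column space
sim_fit1 = lm(y ~ x1 + x2, data = sim_data)
sim_fit2 = lm(y ~ x1 + x3, data = sim_data)
sim_fit3 = lm(y ~ x2 + x3, data = sim_data)
all.equal(fitted(sim_fit1), fitted(sim_fit2))
all.equal(fitted(sim_fit2), fitted(sim_fit3))
```

Comparing `ncol(X)` with `qr(X)$rank` is one way to check a design matrix for exact collinearity before interpreting any coefficients.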

While their fitted values are all the same, their estimated coefficients are wildly
different. The sign of the coefficient for $x_2$ even flips between the two models
that include it, fit1 and fit3! So only fit1 properly explains the relationship
between the variables. Still, fit2 and fit3 predict as well as fit1, even though
their coefficients have little to no meaning, a concept we will return to later.

coef(fit1)

## (Intercept) x1 x2
## 2.9573357 0.9856291 1.0170586

coef(fit2)

## (Intercept) x1 x3
## 2.1945418 0.4770998 0.2542647

coef(fit3)

## (Intercept) x2 x3
## 1.4788921 -0.9541995 0.4928145
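
The three coefficient vectors describe the same fitted surface, which can be checked by substitution. The sketch below assumes the relation `x3 = 2 * x1 + 4 * x2 + 3`; that relation is not stated in this section, though the printed coefficients are consistent with it. The names `b0`, `b1`, `b3`, and `implied` are hypothetical. Replacing `x3` in the `fit2` equation gives a model in `x1` and `x2` only, which can be compared with `coef(fit1)`.

```r
# Coefficients copied from the coef(fit2) output above
b0 = 2.1945418  # (Intercept)
b1 = 0.4770998  # x1
b3 = 0.2542647  # x3

# Assumed relation (not stated in this section): x3 = 2 * x1 + 4 * x2 + 3
# Then b0 + b1 * x1 + b3 * x3
#    = (b0 + 3 * b3) + (b1 + 2 * b3) * x1 + (4 * b3) * x2
implied = c(
  "(Intercept)" = b0 + 3 * b3,
  x1            = b1 + 2 * b3,
  x2            = 4 * b3
)

# Compare with the coef(fit1) output printed above
implied
```

Under this assumption, the implied values agree with the `fit1` coefficients up to rounding in the printed output, so the coefficient on `x1` in `fit2` is not "the effect of `x1`" in any useful sense: part of that effect is carried by `x3`.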
