Page 406 - Applied Statistics with R

406    CHAPTER 16. VARIABLE SELECTION AND MODEL BUILDING


                                 best_aic_ind = which.min(hipcenter_mod_aic)
                                 all_hipcenter_mod$which[best_aic_ind,]


                                 ## (Intercept)          Age       Weight     HtShoes           Ht      Seated
                                 ##         TRUE        TRUE        FALSE       FALSE         TRUE       FALSE
                                 ##          Arm       Thigh          Leg
                                 ##        FALSE       FALSE         TRUE
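The logical row above can also be turned into a model formula programmatically, rather than by reading off the TRUE entries. The following is a minimal sketch, assuming the row is available as a named logical vector. The names `selected`, `chosen_vars`, and `best_formula` are illustrative, and the values are copied by hand from the printed output rather than taken from the fitted objects.

```r
# Hypothetical stand-in for all_hipcenter_mod$which[best_aic_ind, ],
# copied by hand from the printed output (an assumption for illustration).
selected <- c("(Intercept)" = TRUE, Age = TRUE, Weight = FALSE,
              HtShoes = FALSE, Ht = TRUE, Seated = FALSE,
              Arm = FALSE, Thigh = FALSE, Leg = TRUE)

# Keep the names flagged TRUE, dropping the intercept, which lm() adds itself.
chosen_vars <- setdiff(names(selected)[selected], "(Intercept)")

# Build a formula with hipcenter as the response and the chosen predictors.
best_formula <- reformulate(chosen_vars, response = "hipcenter")
print(best_formula)
```

A formula built this way could be passed to lm() in place of one typed by hand, which avoids transcription mistakes when the selected set of predictors is large.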


                                 Let’s fit this model so we can compare it to the models previously chosen
                                 using AIC and search procedures.

                                 hipcenter_mod_best_aic = lm(hipcenter ~ Age + Ht + Leg, data = seatpos)


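                                 The extractAIC() function will calculate the AIC defined above for a fitted
                                 model. It returns two values: the equivalent degrees of freedom (here the
                                 number of parameters, p) followed by the AIC.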
                                 extractAIC(hipcenter_mod_best_aic)



                                 ## [1]    4.0000 274.2418

                                 extractAIC(hipcenter_mod_back_aic)


                                 ## [1]    4.0000 274.2597

                                 extractAIC(hipcenter_mod_forw_aic)


                                 ## [1]    4.0000 274.2418

                                 extractAIC(hipcenter_mod_both_aic)


                                 ## [1]    4.0000 274.2418
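The two numbers returned by extractAIC() can be reproduced by hand. The sketch below assumes that the AIC referred to above is the form with constant terms dropped, n log(RSS / n) + 2p, and it uses the built-in mtcars data rather than seatpos so that it runs without extra packages. The model and the variable names are illustrative only.

```r
# Illustration on the built-in mtcars data (an assumption; not seatpos).
fit <- lm(mpg ~ wt + hp, data = mtcars)

n   <- nrow(mtcars)            # number of observations
p   <- length(coef(fit))       # number of beta parameters, intercept included
rss <- sum(resid(fit)^2)       # residual sum of squares

# AIC with the constant terms dropped: n * log(RSS / n) + 2 * p
aic_by_hand <- n * log(rss / n) + 2 * p

extractAIC(fit)                # first element is p, second is the AIC
all.equal(extractAIC(fit), c(p, aic_by_hand))

# AIC() keeps the constants and counts sigma^2 as a parameter, so it differs
# from extractAIC() by n * log(2 * pi) + n + 2, the same amount for every
# model fit to these data.
AIC(fit) - extractAIC(fit)[2]
n * log(2 * pi) + n + 2
```

Because the two functions differ only by a quantity that does not depend on the chosen predictors, they rank models fit to the same data identically. Values from one should not be compared with values from the other.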


                                 We see that two of the models chosen by search procedures (forward, and
                                 stepwise in both directions) achieve the best possible AIC, as they are the
                                 same model found by the exhaustive search. This is, however, never guaranteed:
                                 the model chosen using backward selection does not achieve the smallest
                                 possible AIC.
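The relationship between an exhaustive search and a search procedure can be sketched with base R alone. The example below is only an illustration under stated assumptions: it uses the built-in mtcars data and an arbitrary set of six predictors, and it enumerates subsets with combn() instead of the function used earlier in the chapter. An exhaustive search examines every candidate model, so its AIC can never be worse than that of a model found by backward selection. Whether the two agree depends on the data.

```r
# Illustration on the built-in mtcars data (an assumption; not seatpos).
predictors <- c("cyl", "disp", "hp", "drat", "wt", "qsec")

# Every subset of the predictors, plus the intercept-only model ("1").
subsets <- c(
  list("1"),
  unlist(lapply(seq_along(predictors),
                function(k) combn(predictors, k, simplify = FALSE)),
         recursive = FALSE)
)

# Exhaustive search: fit each candidate and record its AIC.
subset_aic <- sapply(subsets, function(vars) {
  fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
  extractAIC(fit)[2]
})
best_subset <- subsets[[which.min(subset_aic)]]

# Backward selection by AIC, starting from the full model.
full_mod <- lm(mpg ~ cyl + disp + hp + drat + wt + qsec, data = mtcars)
back_mod <- step(full_mod, direction = "backward", trace = 0)

# Compare the smallest AIC overall with the AIC reached by backward selection.
c(exhaustive = min(subset_aic), backward = extractAIC(back_mod)[2])
best_subset
formula(back_mod)
```

With six predictors there are only 64 candidate models, so enumeration is cheap. The count doubles with each added predictor, which is why search procedures are used when the exhaustive search becomes impractical.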

                                 plot(hipcenter_mod_aic ~ I(2:p), ylab = "AIC", xlab = "p, number of parameters",
                                      pch = 20, col = "dodgerblue", type = "b", cex = 2,
                                      main = "AIC vs Model Complexity")