Page 284 - Applied Statistics with R
P. 284

284                             CHAPTER 13. MODEL DIAGNOSTICS


                                 coef(model_2)[2]


                                 ##           x
                                 ## -0.9507397

                                 We will say that this point has high leverage, is not an outlier due to its small
                                 residual, and has a very small influence.
                                 Lastly, the added point in the third plot has a large effect on the slope, which
                                 is now:

                                 coef(model_3)[2]


                                 ##           x
                                 ## -0.5892241


                                 This added point is influential. It both has high leverage, and is an outlier due
                                 to its large residual.

                                 We’ve now mentioned three new concepts: leverage, outliers, and influential
                                 points, each of which we will discuss in detail.


                                 13.3.1   Leverage


                                 A data point with high leverage, is a data point that could have a large influence
                                 when fitting the model.
                                 Recall that,


                                                            ̂
                                                                         ⊤
                                                                 ⊤
                                                              = (     ) −1       .
                                 Thus,

                                                              ̂
                                                                     ⊤
                                                                             ⊤
                                                         ̂    =      =    (     ) −1       
                                 Now we define,

                                                                    ⊤
                                                             =    (     ) −1     ⊤
                                 which we will refer to as the hat matrix. The hat matrix is used to project
                                 onto the subspace spanned by the columns of   . It is also simply known as a
                                 projection matrix.
   279   280   281   282   283   284   285   286   287   288   289