Page 285 - Applied Statistics with R
P. 285

13.3. UNUSUAL OBSERVATIONS                                        285


                      The hat matrix, is a matrix that takes the original    values, and adds a hat!


                                                       ̂    =     

                      The diagonal elements of this matrix are called the leverages


                                                        = ℎ ,
                                                          
                                                             
                      where ℎ is the leverage for the   th observation.
                               
                      Large values of ℎ indicate extreme values in   , which may influence regression.
                                       
                      Note that leverages only depend on   .
                      Here,    the number of   s is also the trace (and rank) of the hat matrix.


                                                       
                                                    ∑ ℎ =   
                                                           
                                                       =1
                      What is a value of ℎ that would be considered large? There is no exact answer
                                          
                      to this question. A common heuristic would be to compare each leverage to
                      two times the average leverage. A leverage larger than this is considered an
                      observation to be aware of. That is, if


                                                     ℎ > 2ℎ ̄
                                                         
                      we say that observation    has large leverage. Here,


                                                         
                                                     ∑    ℎ      
                                                  ̄
                                                 ℎ =     =1      =  .
                                                                 
                      For simple linear regression, the leverage for each point is given by


                                                     1   (   − ̄  ) 2
                                                ℎ =      +              .
                                                   

                      This expression should be familiar. (Think back to inference for SLR.) It sug-
                      gests that the large leverages occur when    values are far from their mean.
                                                                       
                      Recall that the regression goes through the point ( ̄  , ̄).
                      There are multiple ways to find leverages in R.
   280   281   282   283   284   285   286   287   288   289   290