Page 277 - Jolliffe I. Principal Component Analysis

10. Outlier Detection, Influential Observations and Robust Estimation
                              Figure 10.3. Anatomical measurements: plot of observations with respect to the
                              last two PCs.
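For reading the $d_{4i}$ values of Table 10.1 alongside this plot, a rough reference scale may help. The sketch below is an illustration, not the book's procedure. It assumes the $28q$ standardized scores on the last $q$ PCs behave like independent standard normal variables, which holds only approximately because the PCs are estimated from the same data. Under that assumption it computes how likely the largest absolute value is to exceed a threshold $c$. The function names, the thresholds, and the choices of $q$ are hypothetical and are not taken from Table 10.1.

```python
from statistics import NormalDist


def prob_max_abs_exceeds(c: float, n: int) -> float:
    """P(largest of n independent |Z| values exceeds c), with Z ~ N(0, 1)."""
    p_within = 2.0 * NormalDist().cdf(c) - 1.0  # P(|Z| <= c) for one value
    return 1.0 - p_within ** n


def expected_exceedances(c: float, n: int) -> float:
    """Expected number of the n values |Z| that exceed c."""
    return n * 2.0 * (1.0 - NormalDist().cdf(c))


n_obs = 28  # number of observations in the anatomical example

# Illustrative choices only: q low-variance PCs, thresholds c.
for q in (1, 2, 3):
    n = n_obs * q  # 28q standardized scores in total
    for c in (2.0, 2.5, 3.0):
        print(
            f"q={q}, c={c}: "
            f"P(max > c) = {prob_max_abs_exceeds(c, n):.3f}, "
            f"expected count > c = {expected_exceedances(c, n):.2f}"
        )
```

With as many as $28q$ values in play, a largest absolute score somewhat above 2 is unremarkable under this approximation, which is the sense in which an apparently large $d_{4i}$ need not look extreme.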


                                Most of the observations identified in Table 10.1 are near the edge of the
                              plot given in Figure 10.3. Observations 2, 4, 5, 12, 16, 20, 23 and 27 are
                              close to the main body of the data, but observations 7, 11, 15, and to a
                              lesser extent 1, are sufficiently far from the remaining data to be worthy of
                              further consideration. To roughly judge their ‘significance,’ recall that, if no
                              outliers are present and the data are approximately multivariate normal,
                              then the values of $d_{4i}$ are (approximately) absolute values of a normal
                              random variable with zero mean and unit variance. The quantities given in
                              the relevant columns of Table 10.1 are therefore the six largest among $28q$
                              such variables, and none of them look particularly extreme. Nevertheless,
                              it is of interest to investigate the reasons for the outlying positions of some
                              of the observations, and to do so it is necessary to examine the coefficients
                              of the last few PCs. The final PC, accounting for only 1.7% of the total
                              variation, is largely a contrast between chest and hand measurements with
                              positive coefficients 0.55, 0.51, and waist and height measurements, which
                              have negative coefficients −0.55, −0.32. Looking at observation 15, we find