Page 160 - Jolliffe I. Principal Component Analysis

6.1. How Many Principal Components?
                              95% thresholds. It may therefore be better to look at the size of second and
                              subsequent eigenvalues only with respect to smaller, not larger, eigenval-
                              ues. This could be achieved by removing the first term in the singular value
                              decomposition (SVD) (3.5.2), and viewing the original second eigenvalue
                              as the first eigenvalue in the analysis of this residual matrix. If the second
                              eigenvalue is above its 95% threshold in this analysis, we subtract a second
                              term from the SVD, and so on. An alternative idea, noted in Preisendorfer
                              and Mobley (1988, Section 5f), is to simulate from a given covariance or
                              correlation structure in which not all the variables are uncorrelated.
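The sequential procedure described above can be sketched as follows. This is a minimal illustration, not the authors' implementation, and it rests on several assumptions that are not in the text: the columns are standardized first, each eigenvalue is judged by its share of the variance remaining at that step, and the 95% threshold at step k is simulated from p - k uncorrelated standard normal variables with the original sample size n. The function names and the choice of 200 simulations are hypothetical.

```python
import numpy as np

def leading_share_threshold(n, p, n_sim=200, q=95, rng=None):
    """q-th percentile of the leading eigenvalue's share of total variance
    for n observations on p uncorrelated standard normal variables."""
    rng = np.random.default_rng(rng)
    shares = np.empty(n_sim)
    for i in range(n_sim):
        z = rng.standard_normal((n, p))
        z -= z.mean(axis=0)
        ev = np.linalg.svd(z, compute_uv=False) ** 2
        shares[i] = ev[0] / ev.sum()
    return float(np.percentile(shares, q))

def sequential_rule_n(x, n_sim=200, q=95, seed=0):
    """Number of leading PCs retained by a deflation-style variant of Rule N."""
    x = np.asarray(x, dtype=float)
    n, p = x.shape
    xc = x - x.mean(axis=0)
    xc = xc / xc.std(axis=0, ddof=1)  # assumption: correlation-matrix PCA
    ev = np.linalg.svd(xc, compute_uv=False) ** 2
    rng = np.random.default_rng(seed)
    m = 0
    for k in range(len(ev) - 1):
        # Removing the first k SVD terms leaves a residual matrix whose
        # squared singular values are ev[k:], so ev[k] becomes its "first"
        # eigenvalue and is compared only with the smaller ones.
        share = ev[k] / ev[k:].sum()
        # Assumption: the residual is treated as having p - k variables.
        if share <= leading_share_threshold(n, p - k, n_sim, q, rng):
            break
        m += 1
    return m
```

For data dominated by a single common factor, `sequential_rule_n` should retain at least the first component; how later components fare depends on the simulated thresholds.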
                                If the data are time series, with autocorrelation between successive obser-
                              vations, Preisendorfer and Mobley (1988) suggest calculating an ‘equivalent
                              sample size’, n*, allowing for the autocorrelation. The simulations used to
                              implement Rule N are then carried out with sample size n*, rather than
                              the actual sample size, n. They also note that both Rules A₄ and N tend to
                              retain too few components, and therefore recommend choosing a value for
                              m that is the larger of the two values indicated by these rules. In Section 5k
                              Preisendorfer and Mobley (1988) provide rules for the case of vector-valued
                              fields.
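The text does not state how n* is computed. As a rough illustration only, the sketch below uses a widely quoted approximation for a first-order autoregressive series, n* = n(1 - r1)/(1 + r1), where r1 is the lag-1 autocorrelation; this is an assumption and may differ from the formula in Preisendorfer and Mobley (1988). The helper `combined_m`, also hypothetical, encodes the recommendation to take the larger of the two values of m.

```python
import numpy as np

def lag1_autocorr(series):
    """Sample lag-1 autocorrelation of a one-dimensional series."""
    y = np.asarray(series, dtype=float)
    y = y - y.mean()
    return float(np.dot(y[:-1], y[1:]) / np.dot(y, y))

def equivalent_sample_size(series):
    """AR(1)-style approximation n* = n (1 - r1) / (1 + r1).

    Assumption: this stands in for the 'equivalent sample size' of the
    text, whose exact definition is not given here. Negative
    autocorrelation is clipped to zero so that n* never exceeds n.
    """
    n = len(series)
    r1 = max(lag1_autocorr(series), 0.0)
    return max(2, int(round(n * (1.0 - r1) / (1.0 + r1))))

def combined_m(m_rule_a4, m_rule_n):
    """Take the larger of the values of m indicated by the two rules."""
    return max(m_rule_a4, m_rule_n)
```

The resulting n* would replace n when simulating the thresholds for Rule N; a strongly autocorrelated series gives a much smaller n* and therefore wider simulated thresholds.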
                                Like Besse and de Falguerolles (1993) (see Section 6.1.5), North et al.
                              (1982) argue strongly that a set of PCs with similar eigenvalues should
                              either all be retained or all excluded. The size of gaps between successive
                              eigenvalues is thus an important consideration for any decision rule, and
                              North et al. (1982) provide a rule-of-thumb for deciding whether gaps are
                              too small to split the PCs on either side of the gap.
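The rule of thumb is not spelled out here. It is usually quoted as an approximate sampling error for the kth eigenvalue of λ_k (2/n)^(1/2), with neighbouring eigenvalues treated as effectively inseparable when the gap between them is no larger than that error. The sketch below applies this reading; the grouping criterion and the function names are assumptions made for illustration.

```python
import numpy as np

def north_groups(eigenvalues, n):
    """Group indices of eigenvalues whose gaps are within sampling error.

    Uses the approximate sampling error lam_k * sqrt(2 / n) and joins
    eigenvalue k to its predecessor when the gap between them does not
    exceed the predecessor's error (an assumed reading of the rule).
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    err = lam * np.sqrt(2.0 / n)
    groups = [[0]]
    for k in range(1, len(lam)):
        if lam[k - 1] - lam[k] <= err[k - 1]:
            groups[-1].append(k)   # too close to separate from predecessor
        else:
            groups.append([k])     # clear gap: start a new group
    return groups

def admissible_m(eigenvalues, n):
    """Values of m that keep or drop each group of close eigenvalues whole."""
    sizes = [len(g) for g in north_groups(eigenvalues, n)]
    return [int(c) for c in np.cumsum(sizes)]
```

With eigenvalues 5.0, 2.0, 1.9, 0.5 and n = 100, the second and third eigenvalues fall in one group, so m = 2 is not an admissible cut-off under this reading, whereas m = 1, 3 or 4 is.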
                                The idea of using simulated data to assess significance of eigenvalues
                              has also been explored by other authors, for example, Farmer (1971) (see
                              also Section 6.1.3 above), Cahalan (1983) and, outside the meteorological
                              context, Mandel (1972), Franklin et al. (1995) and the parallel analysis
                              literature.
                                Other methods have also been suggested in the atmospheric science
                              literature. For example, Jones et al. (1983) and Briffa et al. (1986) use a criterion for
                              correlation matrices, which they attribute to Guiot (1981). In this method
                              PCs are retained if their cumulative eigenvalue product exceeds one. This
                              technique retains more PCs than most of the other procedures discussed
                              earlier, but Jones et al. (1983) seem to be satisfied with the results it
                              produces. Preisendorfer and Mobley (1982, Part IV) suggest a rule that
                              considers retaining subsets of m PCs not necessarily restricted to the first
                              m. This is reasonable if the PCs are to be used for an external purpose,
                              such as regression or discriminant analysis (see Chapter 8, Section 9.1),
                              but is not really relevant if we are merely interested in accounting for as
                              much of the variation in x as possible. Richman and Lamb (1987) look
                              specifically at the case where PCs are rotated (see Section 11.1), and give
                              a rule for choosing m based on the patterns in rotated eigenvectors.
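The cumulative-product criterion attributed to Guiot (1981) is simple enough to sketch directly. The version below assumes the eigenvalues come from a correlation matrix and counts the leading PCs for which the running product of eigenvalues stays above one; the function name is hypothetical.

```python
import numpy as np

def cumulative_product_m(eigenvalues):
    """Number of leading PCs whose cumulative eigenvalue product exceeds one."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    m = 0
    for prod in np.cumprod(lam):
        if prod <= 1.0:
            break
        m += 1
    return m
```

For correlation-matrix eigenvalues 2.5, 1.2, 0.8, 0.3, 0.2 the running products are 2.5, 3.0, 2.4, 0.72 and 0.144, so three PCs are retained, one more than the two eigenvalues that individually exceed one. This is consistent with the remark that the technique tends to retain more PCs than most other procedures.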
                                North and Wu (2001), in an application of PCA to climate change
                              detection, use a modification of the percentage of variation criterion of