Page 160 - Jolliffe I. Principal Component Analysis
6.1. How Many Principal Components?
95% thresholds. It may therefore be better to look at the size of second and
subsequent eigenvalues only with respect to smaller, not larger, eigenval-
ues. This could be achieved by removing the first term in the singular value
decomposition (SVD) (3.5.2), and viewing the original second eigenvalue
as the first eigenvalue in the analysis of this residual matrix. If the second
eigenvalue is above its 95% threshold in this analysis, we subtract a second
term from the SVD, and so on. An alternative idea, noted in Preisendorfer
and Mobley (1988, Section 5f), is to simulate from a given covariance or
correlation structure in which not all the variables are uncorrelated.
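The deflation step described above can be sketched as follows. This is a minimal illustration, assuming a column-centred data matrix; the function name `deflate_first_term` and the simulated data are hypothetical. It shows only that removing the first SVD term leaves a residual matrix whose leading singular value is the original second one. How the 95% threshold should be recomputed for the residual analysis is not specified here and is left out.

```python
import numpy as np

def deflate_first_term(X):
    """Subtract the leading SVD term from a column-centred data matrix X.

    Returns the residual matrix and the singular values of X.
    The eigenvalues of the sample covariance matrix are s**2 / (n - 1).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    residual = X - s[0] * np.outer(U[:, 0], Vt[0])
    return residual, s

# Hypothetical data: 50 observations on 4 variables with unequal variances.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) @ np.diag([3.0, 2.0, 1.0, 0.5])
X -= X.mean(axis=0)

residual, s = deflate_first_term(X)
s_residual = np.linalg.svd(residual, compute_uv=False)

# The leading singular values of the residual are the original
# second, third and fourth singular values.
print(np.allclose(s_residual[:3], s[1:]))
```

Repeating the call on `residual` would subtract a second term, and so on.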
If the data are time series, with autocorrelation between successive obser-
vations, Preisendorfer and Mobley (1988) suggest calculating an ‘equivalent
sample size’, n*, allowing for the autocorrelation. The simulations used to
implement Rule N are then carried out with sample size n*, rather than
the actual sample size, n. They also note that both Rules A_4 and N tend to
retain too few components, and therefore recommend choosing a value for
m that is the larger of the two values indicated by these rules. In Section 5k
Preisendorfer and Mobley (1988) provide rules for the case of vector-valued
fields.
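A simulation of this kind might look like the sketch below. It rests on several assumptions that are not taken from the text: the simulated data are uncorrelated standard normal, eigenvalues come from the correlation matrix, thresholds are the 95th percentile at each rank, and components are counted until the first eigenvalue that fails its threshold. The equivalent sample size `n_star` and the Rule A_4 result `m_a4` are treated as given inputs, and all function names are hypothetical.

```python
import numpy as np

def rule_n_thresholds(n_star, p, n_sims=500, quantile=0.95, seed=0):
    """Simulated per-rank eigenvalue thresholds for uncorrelated data.

    Assumption: n_star observations on p independent standard normal
    variables, with eigenvalues taken from the correlation matrix.
    """
    rng = np.random.default_rng(seed)
    sims = np.empty((n_sims, p))
    for i in range(n_sims):
        Z = rng.standard_normal((n_star, p))
        R = np.corrcoef(Z, rowvar=False)
        sims[i] = np.sort(np.linalg.eigvalsh(R))[::-1]  # descending order
    return np.quantile(sims, quantile, axis=0)

def count_leading_exceedances(eigenvalues, thresholds):
    """Count leading eigenvalues that exceed their simulated threshold."""
    m = 0
    for lam, thr in zip(eigenvalues, thresholds):
        if lam <= thr:
            break
        m += 1
    return m

def combine_rules(m_a4, m_n):
    """Take the larger of the two values of m, as recommended in the text."""
    return max(m_a4, m_n)

# Hypothetical example: 5 variables, equivalent sample size 30.
thresholds = rule_n_thresholds(n_star=30, p=5)
observed = [4.0, 0.5, 0.3, 0.15, 0.05]  # made-up correlation-matrix eigenvalues
m_n = count_leading_exceedances(observed, thresholds)
m = combine_rules(m_a4=2, m_n=m_n)      # m_a4=2 is an assumed input
```

Using `n_star` in place of `n` makes the simulated samples smaller, which widens the spread of the simulated eigenvalues and so raises the thresholds for the leading components.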
Like Besse and de Falguerolles (1993) (see Section 6.1.5), North et al.
(1982) argue strongly that a set of PCs with similar eigenvalues should
either all be retained or all excluded. The size of gaps between successive
eigenvalues is thus an important consideration for any decision rule, and
North et al. (1982) provide a rule-of-thumb for deciding whether gaps are
too small to split the PCs on either side of the gap.
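The rule of thumb is not stated in this section. The form usually quoted from North et al. (1982) takes the sampling error of the kth eigenvalue to be roughly λ_k (2/n)^{1/2}, and treats neighbouring eigenvalues as inseparable when the gap between them is comparable to or smaller than that error. The sketch below applies this under two assumptions that are choices made here, not part of the source: the error of the larger eigenvalue in each pair is used, and "comparable" is read as "no larger than". The function name `north_groups` is hypothetical.

```python
import math

def north_groups(eigenvalues, n):
    """Group indices of adjacent eigenvalues that are too close to split.

    eigenvalues: list in decreasing order; n: (effective) sample size.
    Two neighbours share a group when their gap is no larger than the
    approximate sampling error lambda * sqrt(2 / n) of the larger one.
    """
    if not eigenvalues:
        return []
    groups = [[0]]
    for k in range(1, len(eigenvalues)):
        err = eigenvalues[k - 1] * math.sqrt(2.0 / n)
        gap = eigenvalues[k - 1] - eigenvalues[k]
        if gap <= err:
            groups[-1].append(k)   # too close: keep with the previous PC
        else:
            groups.append([k])     # clear gap: start a new group
    return groups

# Hypothetical eigenvalues with n = 50, so sqrt(2 / n) = 0.2.
print(north_groups([5.0, 2.0, 1.8, 0.4], n=50))  # [[0], [1, 2], [3]]
```

Here the second and third PCs fall in one group, so a decision rule would retain or exclude them together.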
The idea of using simulated data to assess significance of eigenvalues
has also been explored by other authors, for example, Farmer (1971) (see
also Section 6.1.3 above), Cahalan (1983) and, outside the meteorological
context, Mandel (1972), Franklin et al. (1995) and the parallel analysis
literature.
Other methods have also been suggested in the atmospheric science liter-
ature. For example, Jones et al. (1983) and Briffa et al. (1986) use a criterion for
correlation matrices, which they attribute to Guiot (1981). In this method
PCs are retained if their cumulative eigenvalue product exceeds one. This
technique retains more PCs than most of the other procedures discussed
earlier, but Jones et al. (1983) seem to be satisfied with the results it
produces. Preisendorfer and Mobley (1982, Part IV) suggest a rule that
considers retaining subsets of m PCs not necessarily restricted to the first
m. This is reasonable if the PCs are to be used for an external purpose,
such as regression or discriminant analysis (see Chapter 8, Section 9.1),
but is not really relevant if we are merely interested in accounting for as
much of the variation in x as possible. Richman and Lamb (1987) look
specifically at the case where PCs are rotated (see Section 11.1), and give
a rule for choosing m based on the patterns in rotated eigenvectors.
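The cumulative eigenvalue product criterion attributed to Guiot (1981) can be illustrated with a short sketch. The function name and the eigenvalues are hypothetical, and the sketch assumes eigenvalues of a correlation matrix supplied in decreasing order.

```python
def cumulative_product_rule(eigenvalues):
    """Number of leading PCs whose cumulative eigenvalue product exceeds one."""
    m = 0
    product = 1.0
    for lam in eigenvalues:
        product *= lam
        if product <= 1.0:
            break
        m += 1
    return m

# Hypothetical correlation-matrix eigenvalues for p = 5 (they sum to 5).
eigenvalues = [2.5, 1.2, 0.7, 0.4, 0.2]
# Cumulative products: 2.5, 3.0, 2.1, 0.84
print(cumulative_product_rule(eigenvalues))  # 3
```

In this example the third PC is retained even though its eigenvalue is below one, because the product of the first three eigenvalues is still above one. This is consistent with the remark that the technique tends to retain more PCs than most of the other procedures.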
North and Wu (2001), in an application of PCA to climate change
detection, use a modification of the percentage of variation criterion of

