Page 158 - Jolliffe I. Principal Component Analysis
6.1. How Many Principal Components?
decide on m, two criteria need to be satisfied. First, the confidence intervals
for $\lambda_m$ and $\lambda_{m+1}$ should not overlap, and second, no component should
be retained unless it has at least two coefficients whose confidence intervals
exclude zero. This second requirement is again relevant for factor analysis,
but not PCA. With regard to the first criterion, it has already been
noted that avoiding small gaps between $l_m$ and $l_{m+1}$ is desirable because
it reduces the likelihood of instability in the retained components.
6.1.6 Partial Correlation
For PCA based on a correlation matrix, Velicer (1976) suggested that the
partial correlations between the p variables, given the values of the first
m PCs, may be used to determine how many PCs to retain. The criterion
proposed is the average of the squared partial correlations

$$V = \sum_{i=1}^{p} \sum_{\substack{j=1 \\ j \neq i}}^{p} \frac{(r^{*}_{ij})^{2}}{p(p-1)},$$

where $r^{*}_{ij}$ is the partial correlation between the ith and jth variables, given
the first m PCs. The statistic $r^{*}_{ij}$ is defined as the correlation between the
residuals from the linear regression of the ith variable on the first m PCs,
and the residuals from the corresponding regression of the jth variable on
these m PCs. It therefore measures the strength of the linear relationship
between the ith and jth variables after removing the common effect of the
first m PCs.
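
The definition above can be followed literally in code. The sketch below is an illustration in NumPy, not code from the text: the function name `velicer_v` and the layout of the data matrix `X` (n observations in rows, p variables in columns) are assumptions. It standardizes the variables so that the PCA is based on the correlation matrix, regresses every variable on the first m PC scores, and averages the squared correlations between the residuals.

```python
import numpy as np


def velicer_v(X, m):
    """Average squared partial correlation V, given the first m PCs.

    X is an (n, p) data matrix (hypothetical layout: rows are observations).
    m must satisfy 0 <= m <= p - 1; at m = p the residuals vanish.
    """
    n, p = X.shape
    if not 0 <= m <= p - 1:
        raise ValueError("m must lie between 0 and p - 1")

    # Standardize so that the PCA corresponds to the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)

    # Eigenvectors of R, ordered by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    A = eigvecs[:, order[:m]]          # p x m coefficient vectors

    if m == 0:
        resid = Z                      # nothing to remove
    else:
        scores = Z @ A                 # n x m PC scores (already centered)
        # Least-squares regression of every variable on the m PC scores.
        coef, *_ = np.linalg.lstsq(scores, Z, rcond=None)
        resid = Z - scores @ coef

    # Partial correlations r*_ij are correlations between the residuals.
    R_star = np.corrcoef(resid, rowvar=False)
    off_diag = ~np.eye(p, dtype=bool)
    return np.sum(R_star[off_diag] ** 2) / (p * (p - 1))
```

For m = 0 no PCs are removed, so V reduces to the average squared off-diagonal element of the correlation matrix.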
The criterion V first decreases, and then increases, as m increases, and
Velicer (1976) suggests that the optimal value of m corresponds to the
minimum value of the criterion. As with Jackson’s (1993) bootstrap rules
of Section 6.1.5, and for the same reasons, this criterion is plausible as
a means of deciding the number of factors in a factor analysis, but it is
inappropriate in PCA. Numerous other rules have been suggested in the
context of factor analysis (Reddon, 1984, Chapter 3). Many are subjective,
although some, such as parallel analysis (see Sections 6.1.3, 6.1.5), attempt
a more objective approach. Few are relevant to, or useful for, PCA unless
they are modified in some way.
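
Velicer's rule of taking m at the minimum of V can be sketched as follows. This is a minimal illustration, not the author's code: the names `velicer_curve` and `choose_m` are hypothetical. It uses the fact that the covariance matrix of the residuals, after removing the first m PCs of a correlation matrix R, equals R minus the contribution of those m PCs, so the data matrix itself is not needed. The scan is assumed to stop at m = p - 2, because at m = p - 1 the residuals have rank one and every partial correlation is ±1. The sketch only shows how the criterion is computed; it does not address the objections to its use in PCA raised above.

```python
import numpy as np


def velicer_curve(R):
    """Return V(m) for m = 0, 1, ..., p - 2 from a correlation matrix R."""
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # decreasing eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    off_diag = ~np.eye(p, dtype=bool)
    values = []
    for m in range(p - 1):
        A = eigvecs[:, :m]
        # Residual covariance after removing the first m PCs.
        C = R - (A * eigvals[:m]) @ A.T
        # Rescale to correlations: these are the partial correlations r*_ij.
        d = np.sqrt(np.diag(C))
        R_star = C / np.outer(d, d)
        values.append(np.sum(R_star[off_diag] ** 2) / (p * (p - 1)))
    return np.array(values)


def choose_m(R):
    """Return the m that minimizes V, together with the whole curve."""
    v = velicer_curve(R)
    return int(np.argmin(v)), v
```

A call such as `m, v = choose_m(np.corrcoef(X, rowvar=False))` returns the selected m and the values V(0), ..., V(p - 2), which can be plotted to check whether the minimum is sharply defined. If a residual variance is exactly zero the rescaling divides by zero, a case this sketch does not handle.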
Beltrando (1990) gives a sketchy description of what appears to be an-
other selection rule based on partial correlations. Instead of choosing m so
that the average squared partial correlation is minimized, Beltrando (1990)
selects m for which the number of statistically significant elements in the
matrix of partial correlations is minimized.
6.1.7 Rules for an Atmospheric Science Context
As mentioned in Section 4.3, PCA has been widely used in meteorology
and climatology to summarize data that vary both spatially and tempo-

