Page 158 - Jolliffe I. Principal Component Analysis

                                                        6.1. How Many Principal Components?
                              decide on m, two criteria need to be satisfied. First, the confidence
                              intervals for $\lambda_m$ and $\lambda_{m+1}$ should not overlap, and second, no
                              component should be retained unless it has at least two coefficients whose
                              confidence intervals exclude zero. This second requirement is again relevant
                              for factor analysis, but not PCA. With regard to the first criterion, it has
                              already been noted that avoiding small gaps between $l_m$ and $l_{m+1}$ is
                              desirable because it reduces the likelihood of instability in the retained
                              components.
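
The first criterion can be checked mechanically once interval endpoints are available. The sketch below is illustrative only: the function name `non_overlapping_gaps` and the input lists `lower` and `upper` (interval endpoints for the eigenvalues, ordered from largest to smallest) are hypothetical, and how the intervals themselves are constructed is not shown here.

```python
def non_overlapping_gaps(lower, upper):
    """Return the values of m (1-based) for which the confidence interval
    for the m-th eigenvalue lies entirely above that for the (m+1)-th.

    lower, upper: hypothetical sequences of interval endpoints, one pair per
    eigenvalue, with eigenvalues ordered from largest to smallest.
    """
    candidates = []
    # Pair the lower bound of interval m with the upper bound of interval m+1.
    for m, (lo_m, up_next) in enumerate(zip(lower[:-1], upper[1:]), start=1):
        if lo_m > up_next:  # the two intervals do not overlap
            candidates.append(m)
    return candidates
```

Only values of m returned by such a check would pass the first criterion; the second criterion, concerning the coefficients, would still have to be examined separately.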

                              6.1.6 Partial Correlation

                              For PCA based on a correlation matrix, Velicer (1976) suggested that the
                              partial correlations between the p variables, given the values of the first
                              m PCs, may be used to determine how many PCs to retain. The criterion
                              proposed is the average of the squared partial correlations
                              \[
                                V = \sum_{i=1}^{p} \sum_{\substack{j=1 \\ j \neq i}}^{p} \frac{(r^{*}_{ij})^{2}}{p(p-1)},
                              \]
                              where $r^{*}_{ij}$ is the partial correlation between the ith and jth variables, given
                              the first m PCs. The statistic $r^{*}_{ij}$ is defined as the correlation between the
                              residuals from the linear regression of the ith variable on the first m PCs,
                              and the residuals from the corresponding regression of the jth variable on
                              these m PCs. It therefore measures the strength of the linear relationship
                              between the ith and jth variables after removing the common effect of the
                              first m PCs.
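
As a rough illustration of the definition above, the following NumPy sketch computes V for m = 1, ..., p − 1 from a correlation matrix. It relies on the fact that, for standardized variables, the covariance matrix of the residuals after regressing on the first m PCs equals the correlation matrix minus the contribution of those m PCs. The function name `velicer_criterion` and the simulated data are assumptions made for illustration, not part of the original text, and the sketch assumes a full-rank correlation matrix so that every residual variance is positive.

```python
import numpy as np

def velicer_criterion(R):
    """Average squared partial correlation V(m) for m = 1, ..., p-1.

    R: p x p correlation matrix, assumed full rank (illustrative sketch).
    Returns an array whose entry m-1 is V for the first m PCs.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Scale eigenvectors so that loadings @ loadings.T reproduces R.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    off_diag = ~np.eye(p, dtype=bool)
    V = []
    for m in range(1, p):
        A = loadings[:, :m]
        C = R - A @ A.T                         # residual covariance given first m PCs
        d = np.sqrt(np.diag(C))
        partial = C / np.outer(d, d)            # partial correlations r*_ij
        V.append(np.sum(partial[off_diag] ** 2) / (p * (p - 1)))
    return np.array(V)

# Hypothetical simulated data: two groups of three correlated variables.
rng = np.random.default_rng(0)
n = 200
X = rng.standard_normal((n, 6))
X[:, :3] += 2.0 * rng.standard_normal((n, 1))
X[:, 3:] += 2.0 * rng.standard_normal((n, 1))
R = np.corrcoef(X, rowvar=False)

V = velicer_criterion(R)
m_best = int(np.argmin(V)) + 1                  # value of m with the smallest V
```

Here `m_best` is simply the value of m at which the computed V is smallest for this simulated example.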
                                The criterion V first decreases, and then increases, as m increases, and
                              Velicer (1976) suggests that the optimal value of m corresponds to the
                              minimum value of the criterion. As with Jackson’s (1993) bootstrap rules
                              of Section 6.1.5, and for the same reasons, this criterion is plausible as
                              a means of deciding the number of factors in a factor analysis, but it is
                              inappropriate in PCA. Numerous other rules have been suggested in the
                              context of factor analysis (Reddon, 1984, Chapter 3). Many are subjective,
                              although some, such as parallel analysis (see Sections 6.1.3, 6.1.5), attempt
                              a more objective approach. Few are relevant to, or useful for, PCA unless
                              they are modified in some way.
                                Beltrando (1990) gives a sketchy description of what appears to be
                              another selection rule based on partial correlations. Instead of choosing m so
                              that the average squared partial correlation is minimized, Beltrando (1990)
                              selects m for which the number of statistically significant elements in the
                              matrix of partial correlations is minimized.

                              6.1.7 Rules for an Atmospheric Science Context

                              As mentioned in Section 4.3, PCA has been widely used in meteorology
                              and climatology to summarize data that vary both spatially and tempo-