Page 161 - Jolliffe I. Principal Component Analysis

6. Choosing a Subset of Principal Components or Variables
                              Section 6.1.1. They use instead the percentage of ‘signal’ accounted for,
                              although the PCA is done on a covariance matrix other than that associ-
                              ated with the signal (see Section 12.4.3). Buell (1978) advocates stability
                              with respect to different degrees of approximation of a continuous spatial
                              field by discrete points as a criterion for choosing m. Section 13.3.4 of von
                              Storch and Zwiers (1999) is dismissive of selection rules.
                              6.1.8 Discussion
                              Although many rules have been examined in the last seven subsections,
                              the list is by no means exhaustive. For example, in Section 5.1 we noted
                              that superimposing a minimum spanning tree on a plot of the observations
                              with respect to the first two PCs gives a subjective indication of whether or
                              not a two-dimensional representation is adequate. It is not possible to give
                              definitive guidance on which rules are best, but we conclude this section
                              with a few comments on their relative merits. First, though, we discuss a
                              small selection of the many comparative studies that have been published.
                                Reddon (1984, Section 3.9) describes nine such studies, mostly from the
                              psychological literature, but all are concerned with factor analysis rather
                              than PCA. A number of later studies in the ecological, psychological and
                              meteorological literatures have examined various rules on both real and
                              simulated data sets. Simulation of multivariate data sets can always be
                              criticized as unrepresentative, because simulations can never explore more than
                              a tiny fraction of the vast range of possible correlation and covariance
                              structures. Several of the published studies, for example Grossman et al.
                              (1991) and Richman (1988), are particularly weak in this respect, looking only
                              at simulations where all p of the variables are uncorrelated, a situation
                              which is extremely unlikely to be of much interest in practice. Another
                              weakness of several psychology-based studies is their confusion between
                              PCA and factor analysis. For example, Zwick and Velicer (1986) state that
                              ‘if PCA is used to summarize a data set each retained component must
                              contain at least two substantial loadings.’ If the word ‘summarize’ implies
                              a descriptive purpose, the statement is nonsense, but in the simulation study
                              that follows all their ‘components’ have three or more large loadings. With
                              this structure, based on factor analysis, it is no surprise that Zwick and
                              Velicer (1986) conclude that some of the rules they compare, which were
                              designed with descriptive PCA in mind, retain ‘too many’ factors.
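
                                To illustrate why simulations restricted to uncorrelated variables say
                              little about how a rule behaves in practice, the following Python sketch
                              applies Kaiser's rule and the broken stick rule to two simulated data
                              sets: one with all p variables uncorrelated, and one with a single
                              strongly correlated group. The function names, the sample size, the
                              noise level, the random seed and the cut-off of 1 for Kaiser's rule are
                              illustrative assumptions, not taken from any of the studies cited. The
                              broken stick proportions are computed in their usual form,
                              (1/p) * sum_{j=k}^{p} 1/j for the kth component.

```python
import numpy as np


def correlation_eigenvalues(X):
    """Eigenvalues of the sample correlation matrix of X, largest first."""
    R = np.corrcoef(X, rowvar=False)
    return np.sort(np.linalg.eigvalsh(R))[::-1]


def kaiser_rule(eigvals, cutoff=1.0):
    """Number of PCs whose eigenvalue exceeds the cut-off (assumed 1 here)."""
    return int(np.sum(eigvals > cutoff))


def broken_stick(p):
    """Expected ordered proportions when a unit stick is broken at random
    into p pieces: b_k = (1/p) * sum_{j=k}^{p} 1/j, for k = 1, ..., p."""
    inv = 1.0 / np.arange(1, p + 1)
    return np.cumsum(inv[::-1])[::-1] / p


def broken_stick_rule(eigvals):
    """Number of leading PCs whose share of total variance exceeds the
    corresponding broken stick proportion."""
    share = eigvals / eigvals.sum()
    expected = broken_stick(len(eigvals))
    m = 0
    for s, b in zip(share, expected):
        if s <= b:
            break
        m += 1
    return m


# Illustrative simulation settings (assumptions, not from the cited studies).
rng = np.random.default_rng(0)
n, p = 500, 6

# Case 1: all p variables uncorrelated.
X_uncorr = rng.standard_normal((n, p))

# Case 2: one group of p variables sharing a common component plus noise.
z = rng.standard_normal((n, 1))
X_group = z + 0.3 * rng.standard_normal((n, p))

for label, X in [("uncorrelated", X_uncorr), ("one group", X_group)]:
    ev = correlation_eigenvalues(X)
    print(label, "eigenvalues:", np.round(ev, 2))
    print("  Kaiser's rule retains", kaiser_rule(ev))
    print("  broken stick rule retains", broken_stick_rule(ev))
```

                                In the uncorrelated case the sample eigenvalues of the correlation
                              matrix sum to p and scatter around 1, so Kaiser's rule with a cut-off of
                              1 retains at least one component through sampling variation alone, even
                              though no low-dimensional structure is present. Behaviour of the rules
                              on the grouped data set depends on the assumed noise level and group
                              size, which is why a range of structures needs to be examined.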
                                Jackson (1993) investigates a rather broader range of structures, includ-
                              ing up to 12 variables in up to 3 correlated groups, as well as the completely
                              uncorrelated case. The range of stopping rules is also fairly wide, includ-
                              ing: Kaiser’s rule; the scree graph; the broken stick rule; the proportion of
                              total variance; tests of equality of eigenvalues; and Jackson’s two bootstrap
                              procedures described in Section 6.1.5. Jackson (1993) concludes that the
                              broken stick and bootstrapped eigenvalue-eigenvector rules give the best
                              results in his study. However, as with the reasoning used to develop his