
2.3 Principal Components Using a Correlation Matrix
The derivation and properties of PCs considered above are based on the eigenvectors and eigenvalues of the covariance matrix. In practice, as will be seen in much of the remainder of this text, it is more common to define principal components as

$$ z = A'x^*, \qquad (2.3.1) $$

where $A$ now has columns consisting of the eigenvectors of the correlation matrix, and $x^*$ consists of standardized variables. The goal in adopting such an approach is to find the principal components of a standardized version $x^*$ of $x$, where $x^*$ has $j$th element $x_j/\sigma_{jj}^{1/2}$, $j = 1, 2, \ldots, p$, $x_j$ is the $j$th element of $x$, and $\sigma_{jj}$ is the variance of $x_j$. Then the covariance matrix for $x^*$ is the correlation matrix of $x$, and the PCs of $x^*$ are given by (2.3.1).
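As a concrete illustration of (2.3.1) (not part of the original text), the following minimal NumPy sketch standardizes each variable by its standard deviation, takes the eigenvectors of the resulting correlation matrix, and projects. The toy data, seed, and variable names are illustrative assumptions, not an example from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 200 observations of p = 4 variables with very different variances
X = rng.normal(size=(200, 4)) * np.array([1.0, 5.0, 0.2, 10.0])

# Standardized variables x*_j = (x_j - mean_j) / sigma_jj^{1/2}
Xc = X - X.mean(axis=0)
X_std = Xc / Xc.std(axis=0, ddof=1)

# The covariance matrix of x* is the correlation matrix of x
R = np.corrcoef(X, rowvar=False)

# Columns of A are the eigenvectors of R, ordered by decreasing eigenvalue
eigvals, A = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, A = eigvals[order], A[:, order]

# Correlation-based PC scores, z = A'x*  (equation 2.3.1)
Z = X_std @ A
```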
A third possibility, instead of using covariance or correlation matrices, is to use covariances of $x_j/w_j$, where the weights $w_j$ are chosen to reflect some a priori idea of the relative importance of the variables. The special case $w_j = \sigma_{jj}^{1/2}$ leads to $x^*$, and to PCs based on the correlation matrix, but various authors have argued that the choice of $w_j = \sigma_{jj}^{1/2}$ is somewhat arbitrary, and that different values of $w_j$ might be better in some applications (see Section 14.2.1). In practice, however, it is relatively unusual that a uniquely appropriate set of $w_j$ suggests itself.
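In the same spirit, a brief sketch (again illustrative; the weights below are hypothetical values, not recommendations) of PCs based on covariances of the weighted variables $x_j/w_j$:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4)) * np.array([1.0, 5.0, 0.2, 10.0])  # toy data as above

# A priori weights w_j reflecting assumed relative importance (hypothetical values)
w = np.array([1.0, 2.0, 0.5, 1.0])

# Weighted variables x_j / w_j; w_j = sigma_jj^{1/2} would recover the correlation case
X_w = (X - X.mean(axis=0)) / w
S_w = np.cov(X_w, rowvar=False)          # covariance matrix of the weighted variables

vals, A_w = np.linalg.eigh(S_w)
A_w = A_w[:, np.argsort(vals)[::-1]]     # eigenvectors by decreasing eigenvalue
Z_w = X_w @ A_w                          # PCs of the weighted variables
```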
All the properties of the previous two sections are still valid for correlation matrices, or indeed for covariances based on other sets of weights, except that we are now considering PCs of $x^*$ (or some other transformation of $x$), instead of $x$.
It might seem that the PCs for a correlation matrix could be obtained fairly easily from those for the corresponding covariance matrix, since $x^*$ is related to $x$ by a very simple transformation. However, this is not the case; the eigenvalues and eigenvectors of the correlation matrix have no simple relationship with those of the corresponding covariance matrix. In particular, if the PCs found from the correlation matrix are expressed in terms of $x$ by transforming back from $x^*$ to $x$, then these PCs are not the same as the PCs found from $\Sigma$, except in very special circumstances (Chatfield and Collins, 1989, Section 4.4). One way of explaining this is that PCs are invariant under orthogonal transformations of $x$ but not, in general, under other transformations (von Storch and Zwiers, 1999, Section 13.1.10). The transformation from $x$ to $x^*$ is not orthogonal. The PCs for correlation and covariance matrices do not, therefore, give equivalent information, nor can they be derived directly from each other. We now discuss the relative merits of the two types of PC.
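To make the lack of equivalence concrete, the following check (an illustrative sketch, not from the text) compares the leading covariance-based PC direction with the leading correlation-based direction re-expressed in terms of the unstandardized variables; in general the two directions differ.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3)) * np.array([1.0, 4.0, 0.3])   # toy data, unequal scales

S = np.cov(X, rowvar=False)          # covariance matrix
R = np.corrcoef(X, rowvar=False)     # correlation matrix
sd = np.sqrt(np.diag(S))

# Leading eigenvector of the covariance matrix (unit length)
a_cov = np.linalg.eigh(S)[1][:, -1]

# Leading eigenvector of the correlation matrix, transformed back to x:
# z = a'x* with x*_j = x_j / sd_j, so in terms of x the direction is a_j / sd_j
a_corr = np.linalg.eigh(R)[1][:, -1]
a_back = a_corr / sd
a_back /= np.linalg.norm(a_back)

# Absolute cosine between the two directions; equals 1 only in special cases
print(np.abs(a_cov @ a_back))
```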