Page 56 - Jolliffe I. Principal Component Analysis
2.3. Principal Components Using a Correlation Matrix

as in allometry (Section 13.2) and for compositional data (Section 13.3).
  We conclude this section by looking at three interesting properties which hold for PCs derived from the correlation matrix. The first is that the PCs depend not on the absolute values of correlations, but only on their ratios. This follows because multiplication of all off-diagonal elements of a correlation matrix by the same constant leaves the eigenvectors of the matrix unchanged (Chatfield and Collins, 1989, p. 67).
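This invariance can be checked numerically: writing the correlation matrix as $R = I + C$, where $C$ holds the off-diagonal correlations, the matrix $I + cC$ shares the eigenvectors of $C$ for any constant $c$, so the eigenvectors (and hence the PCs) coincide. A minimal sketch in Python/NumPy; the 3×3 correlation matrix and the constant $c$ are illustrative choices, not taken from the text:

```python
import numpy as np

# Illustrative correlation matrix (values chosen for the example).
R = np.array([[1.0, 0.6, 0.3],
              [0.6, 1.0, 0.2],
              [0.3, 0.2, 1.0]])

# Multiply every off-diagonal element by the same constant c > 0.
c = 0.5
R_scaled = np.eye(3) + c * (R - np.eye(3))

# Both matrices have the form I + (const) * C, so they share eigenvectors.
_, vecs = np.linalg.eigh(R)
_, vecs_scaled = np.linalg.eigh(R_scaled)

# The eigenvectors agree column by column, up to an arbitrary sign.
for j in range(3):
    v, w = vecs[:, j], vecs_scaled[:, j]
    assert np.allclose(v, w) or np.allclose(v, -w)
```

Note that `eigh` returns eigenvalues in ascending order; scaling by a positive $c$ maps eigenvalues $1 + \mu_i$ of $R$ to $1 + c\mu_i$ and so preserves their ordering, which is why the columns can be compared directly.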
  The second property, which was noted by Hotelling (1933) in his original paper, is that if, instead of the normalization $\alpha_k' \alpha_k = 1$, we use

$$\tilde{\alpha}_k' \tilde{\alpha}_k = \lambda_k, \qquad k = 1, 2, \ldots, p, \tag{2.3.2}$$

then $\tilde{\alpha}_{kj}$, the $j$th element of $\tilde{\alpha}_k$, is the correlation between the $j$th standardized variable $x^*_j$ and the $k$th PC. To see this, note that for $k = 1, 2, \ldots, p$,

$$\tilde{\alpha}_k = \lambda_k^{1/2} \alpha_k, \qquad \mathrm{var}(z_k) = \lambda_k,$$

and the $p$-element vector $\Sigma \alpha_k$ has as its $j$th element the covariance between $x^*_j$ and $z_k$. But $\Sigma \alpha_k = \lambda_k \alpha_k$, so the covariance between $x^*_j$ and $z_k$ is $\lambda_k \alpha_{kj}$. Also $\mathrm{var}(x^*_j) = 1$, and the correlation between $x^*_j$ and $z_k$ is therefore

$$\frac{\lambda_k \alpha_{kj}}{[\mathrm{var}(x^*_j)\,\mathrm{var}(z_k)]^{1/2}} = \lambda_k^{1/2} \alpha_{kj} = \tilde{\alpha}_{kj},$$

as required.
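The property just derived — that under normalization (2.3.2) the coefficient $\tilde{\alpha}_{kj}$ equals the correlation between $x^*_j$ and the $k$th PC — can be verified numerically. A minimal sketch in Python/NumPy; the simulated sample and the mixing matrix are illustrative assumptions, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated correlated sample (illustrative; any full-rank data works).
n, p = 2000, 3
X = rng.standard_normal((n, p)) @ np.array([[1.0, 0.5, 0.2],
                                            [0.0, 1.0, 0.4],
                                            [0.0, 0.0, 1.0]])

# Standardize the variables and form the correlation matrix.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
R = np.corrcoef(Xs, rowvar=False)

# Eigendecomposition: columns of A are the alpha_k, lam holds the lambda_k.
lam, A = np.linalg.eigh(R)

# PC scores z_k = alpha_k' x*, and the renormalized coefficients of (2.3.2):
# tilde-alpha_k = lambda_k^{1/2} alpha_k.
Z = Xs @ A
A_tilde = A * np.sqrt(lam)

# Correlations between each x*_j (rows) and each z_k (columns)
# should equal the corresponding tilde-alpha_kj.
corr = np.corrcoef(np.hstack([Xs, Z]), rowvar=False)[:p, p:]
assert np.allclose(corr, A_tilde, atol=1e-8)
```

Within a sample the identity is exact (up to floating point), since the sample covariance of $x^*$ plays the role of $\Sigma$ in the derivation above.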
  Because of this property the normalization (2.3.2) is quite often used, in particular in computer packages, but it has the disadvantage that it is less easy to informally interpret and compare a set of PCs when each PC has a different normalization on its coefficients. This remark is, of course, relevant to sample, rather than population, PCs, but, as with some other parts of the chapter, it is included here to avoid a possibly disjointed presentation.
  Both of these properties that hold for correlation matrices can be modified for covariance matrices, but the results are, in each case, less straightforward.
  The third property is sufficiently substantial to deserve a label. It is included in this section because, at first sight, it is specific to correlation matrix PCA although, as we will see, its implications are much wider. Proofs of the result are available in the references cited below and will not be reproduced here.
Property A6. For any integer $q$, $1 \le q \le p$, consider the orthonormal linear transformation

$$y = B'x, \tag{2.3.3}$$

as defined in Property A1. Let $R^2_{j:q}$ be the squared multiple correlation between $x_j$ and the $q$ variables $y_1, y_2, \ldots, y_q$, defined by the elements of $y$.