Page 57 - Jolliffe I. Principal Component Analysis

2. Properties of Population Principal Components
                              The criterion
                                  \[ \sum_{j=1}^{p} R^2_{j:q} \]
                              is maximized when $y_1, y_2, \ldots, y_q$ are the first q correlation matrix PCs.
                              The maximized value of the criterion is equal to the sum of the q largest
                              eigenvalues of the correlation matrix.
                                Because the principal components are uncorrelated, the criterion in
                              Property A6 reduces to
                                  \[ \sum_{j=1}^{p} \sum_{k=1}^{q} r^2_{jk} \]
                              where $r^2_{jk}$ is the squared correlation between the jth variable and the
                              kth PC. The criterion will be maximized by any matrix B that gives
                              y spanning the same q-dimensional space as the first q PCs. How-
                              ever, the correlation matrix PCs are special, in that they successively
                              maximize the criterion for q = 1, 2, ..., p. As noted following
                              Property A5, this result was given by Hotelling (1933) alongside his original
                              derivation of PCA, but it has subsequently been largely ignored. It is
                              closely related to Property A5. Meredith and Millsap (1985) derived
                              Property A6 independently and noted that optimizing the multiple cor-
                              relation criterion gives a scale invariant method (as does Property A5;
                              Cadima, 2000). One implication of this scale invariance is that it gives
                              added importance to correlation matrix PCA. The latter is not simply
                              a variance-maximizing technique for standardized variables; its derived
                              variables are also the result of optimizing a criterion which is scale
                              invariant, and hence is relevant whether or not the variables are stan-
                              dardized. Cadima (2000) discusses Property A6 in greater detail and
                              argues that optimization of its multiple correlation criterion is actually
                              a new technique, which happens to give the same results as correla-
                              tion matrix PCA, but is broader in its scope. He suggests that the
                              derived variables be called Most Correlated Components. Looked at from
                              another viewpoint, this broader relevance of correlation matrix PCA
                              gives another reason to prefer it over covariance matrix PCA in most
                              circumstances.
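
As a numerical illustration of Property A6, the following Python sketch evaluates the multiple correlation criterion for a small example. It is not taken from the text. The correlation matrix is computed from simulated data and then treated as if it were a population correlation matrix. The function names `multiple_correlation_criterion` and `squared_correlation_sum`, the dimensions, the seed and the matrix `M` are all illustrative assumptions. The criterion for y = B'x with standardized x is computed as tr(RB(B'RB)^{-1}B'R), which is the sum over j of the squared multiple correlation between the jth variable and y.

```python
import numpy as np


def multiple_correlation_criterion(R, B):
    """Sum over j of the squared multiple correlation between x_j and y = B'x.

    R is the correlation matrix of x (so each x_j has unit variance);
    B is a p x q matrix of coefficients with B'RB nonsingular.
    """
    C = R @ B          # p x q matrix of covariances cov(x_j, y_k)
    V = B.T @ R @ B    # q x q covariance matrix of y
    return float(np.trace(C @ np.linalg.solve(V, C.T)))


def squared_correlation_sum(R, B):
    """Double sum of squared simple correlations r_jk^2 between x_j and y_k."""
    C = R @ B
    var_y = np.diag(B.T @ R @ B)   # variances of y_1, ..., y_q
    return float(np.sum(C**2 / var_y))


# Assumed example: a 5 x 5 correlation matrix built from simulated data.
rng = np.random.default_rng(0)
p, q = 5, 2
Z = rng.standard_normal((500, p)) @ rng.standard_normal((p, p))
R = np.corrcoef(Z, rowvar=False)

# Eigen-decomposition, sorted so that the largest eigenvalue comes first.
eigvals, eigvecs = np.linalg.eigh(R)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
A_q = eigvecs[:, :q]               # coefficients of the first q PCs

pc_value = multiple_correlation_criterion(R, A_q)
double_sum = squared_correlation_sum(R, A_q)
top_q_sum = float(eigvals[:q].sum())

# Any nonsingular M gives y spanning the same q-dimensional space.
M = np.array([[2.0, 1.0], [0.5, 3.0]])   # arbitrary invertible matrix
same_space_value = multiple_correlation_criterion(R, A_q @ M)

# An arbitrary B should not exceed the value attained by the PCs.
B_rand = rng.standard_normal((p, q))
random_value = multiple_correlation_criterion(R, B_rand)

print(pc_value, double_sum, top_q_sum, same_space_value, random_value)
```

Under these assumptions the first three printed values should agree up to rounding error, the value for `A_q @ M` should match them because its columns span the same space as the first q PCs, and the value for the arbitrary `B_rand` should be no larger.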
                                To conclude this discussion, we note that Property A6 can be easily
                              modified to give a new property for covariance matrix PCA. The first q
                              covariance matrix PCs maximize, amongst all orthonormal linear
                              transformations of x, the sum of squared covariances between $x_1, x_2, \ldots, x_p$ and
                              the derived variables $y_1, y_2, \ldots, y_q$. Covariances, unlike correlations, are not
                              scale invariant, and hence neither is covariance matrix PCA.
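
The covariance version can be checked in the same way. The sketch below is again an illustration under stated assumptions rather than anything given in the text: the covariance matrix `S`, the seed, the rescaling factor and the function names `sum_squared_covariances` and `to_correlation` are all hypothetical. It relies on a short side calculation: the vector of covariances between x and the kth covariance matrix PC is λ_k a_k, where λ_k is the kth eigenvalue and a_k the corresponding unit eigenvector, so the sum of squared covariances for the first q PCs equals the sum of the squares of the q largest eigenvalues. The last lines rescale one variable to show that the correlation matrix is unchanged while the covariance matrix eigenvalues are not.

```python
import numpy as np


def sum_squared_covariances(S, B):
    """Sum of squared covariances between x (covariance S) and y = B'x."""
    return float(np.sum((S @ B) ** 2))


def to_correlation(S):
    """Convert a covariance matrix to the corresponding correlation matrix."""
    d = np.sqrt(np.diag(S))
    return S / np.outer(d, d)


# Assumed example: an arbitrary positive definite 5 x 5 covariance matrix.
rng = np.random.default_rng(1)
p, q = 5, 2
G = rng.standard_normal((p, p))
S = G @ G.T + np.eye(p)

# Eigen-decomposition, sorted so that the largest eigenvalue comes first.
eigvals, eigvecs = np.linalg.eigh(S)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
A_q = eigvecs[:, :q]               # coefficients of the first q PCs

pc_cov_value = sum_squared_covariances(S, A_q)
top_q_squared = float(np.sum(eigvals[:q] ** 2))

# A random matrix with orthonormal columns, obtained from a QR factorization.
B_orth, _ = np.linalg.qr(rng.standard_normal((p, q)))
random_cov_value = sum_squared_covariances(S, B_orth)

# Rescale the first variable by 10 (for example, a change of units).
D = np.diag([10.0, 1.0, 1.0, 1.0, 1.0])
S_scaled = D @ S @ D
eigvals_scaled = np.linalg.eigh(S_scaled)[0][::-1]
same_correlation = np.allclose(to_correlation(S), to_correlation(S_scaled))
same_cov_eigenvalues = np.allclose(eigvals, eigvals_scaled)

print(pc_cov_value, top_q_squared, random_cov_value)
print(same_correlation, same_cov_eigenvalues)
```

Here `pc_cov_value` should agree with `top_q_squared`, and `random_cov_value` should be no larger. After rescaling, the correlation matrix is the same but the covariance matrix eigenvalues differ, which is the lack of scale invariance described above.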