Page 57 - Jolliffe I. Principal Component Analysis
2. Properties of Population Principal Components
The criterion

$$\sum_{j=1}^{p} R^2_{j:q}$$

is maximized when $y_1, y_2, \ldots, y_q$ are the first q correlation matrix PCs.
The maximized value of the criterion is equal to the sum of the q largest
eigenvalues of the correlation matrix.
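
The statement above can be checked numerically. The following is a minimal sketch, not part of the original text, taking $R^2_{j:q}$ to be the squared multiple correlation between $x_j$ and the derived variables y = B'x. The simulated data, the function name `multiple_correlation_criterion`, and the choice q = 2 are illustrative assumptions; a sample correlation matrix stands in for the population one.

```python
import numpy as np

def multiple_correlation_criterion(R, B):
    """Sum over j of the squared multiple correlation between x_j and y = B'x.

    R is the correlation matrix of x (unit diagonal), B is p x q with full
    column rank. Hypothetical helper written for this illustration.
    """
    C = R @ B              # covariances between each x_j and each y_k
    V = B.T @ R @ B        # covariance matrix of y
    # With var(x_j) = 1, R^2_{j:q} is the jth diagonal element of C V^{-1} C'.
    return float(np.trace(C @ np.linalg.solve(V, C.T)))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))  # illustrative data
R = np.corrcoef(X, rowvar=False)

eigvals, eigvecs = np.linalg.eigh(R)       # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

q = 2
A_q = eigvecs[:, :q]                       # coefficients of the first q PCs
best = multiple_correlation_criterion(R, A_q)
top_sum = float(eigvals[:q].sum())

# An arbitrary orthonormal B for comparison; it should not beat the PCs.
Q, _ = np.linalg.qr(rng.normal(size=(5, q)))
other = multiple_correlation_criterion(R, Q)

print(best, top_sum, other)
```

Under these assumptions `best` agrees with `top_sum` up to rounding error, and `other` does not exceed `best`.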
Because the principal components are uncorrelated, the criterion in
Property A6 reduces to
$$\sum_{j=1}^{p} \sum_{k=1}^{q} r^2_{jk},$$

where $r^2_{jk}$ is the squared correlation between the jth variable and the
kth PC. The criterion will be maximized by any matrix B that gives
y spanning the same q-dimensional space as the first q PCs. How-
ever, the correlation matrix PCs are special, in that they successively
maximize the criterion for q = 1, 2, ..., p. As noted following Property A5,
this result was given by Hotelling (1933) alongside his original
derivation of PCA, but it has subsequently been largely ignored. It is
closely related to Property A5. Meredith and Millsap (1985) derived
Property A6 independently and noted that optimizing the multiple cor-
relation criterion gives a scale invariant method (as does Property A5;
Cadima, 2000). One implication of this scale invariance is that it gives
added importance to correlation matrix PCA. The latter is not simply
a variance-maximizing technique for standardized variables; its derived
variables are also the result of optimizing a criterion which is scale
invariant, and hence is relevant whether or not the variables are stan-
dardized. Cadima (2000) discusses Property A6 in greater detail and
argues that optimization of its multiple correlation criterion is actually
a new technique, which happens to give the same results as correla-
tion matrix PCA, but is broader in its scope. He suggests that the
derived variables be called Most Correlated Components. Looked at from
another viewpoint, this broader relevance of correlation matrix PCA
gives another reason to prefer it over covariance matrix PCA in most
circumstances.
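
Three of the claims in this discussion lend themselves to a small numerical illustration: the reduction of the criterion to a sum of squared correlations, its invariance to the choice of basis within the space spanned by the first q PCs, and its invariance to rescaling of the variables. The sketch below is an illustration under stated assumptions rather than anything taken from the original text. The covariance matrix `Sigma`, the matrix `M`, the scale factors `d`, and the helper `criterion` are all hypothetical choices.

```python
import numpy as np

def criterion(Sigma, B):
    """Sum over j of the squared multiple correlation between x_j and y = B'x,
    where x has covariance matrix Sigma. Hypothetical helper."""
    C = Sigma @ B                          # cov(x_j, y_k)
    V = B.T @ Sigma @ B                    # covariance matrix of y
    G = C @ np.linalg.solve(V, C.T)        # variance of x_j explained by y on the diagonal
    return float(np.sum(np.diag(G) / np.diag(Sigma)))

rng = np.random.default_rng(1)
p, q = 5, 2
L = rng.normal(size=(p, p))
Sigma = L @ L.T + np.eye(p)                # illustrative positive definite covariance

s = np.sqrt(np.diag(Sigma))
R = Sigma / np.outer(s, s)                 # corresponding correlation matrix
lam, A = np.linalg.eigh(R)
lam, A = lam[::-1], A[:, ::-1]             # descending eigenvalue order
A_q = A[:, :q]

# (a) Reduction: for correlation matrix PCs, r_jk = a_jk * sqrt(lam_k).
r = A_q * np.sqrt(lam[:q])
sum_sq_corr = float(np.sum(r ** 2))
value_pcs = criterion(R, A_q)

# (b) Same q-dimensional space, different basis: B = A_q M with M nonsingular.
M = np.array([[2.0, 1.0], [0.5, 3.0]])
value_same_span = criterion(R, A_q @ M)

# (c) Scale invariance: rescale the variables, x -> D x, and re-evaluate the
# criterion for the same derived variables, whose coefficients on the
# unstandardized variables are diag(1/sd) A_q.
d = np.array([0.01, 1.0, 10.0, 100.0, 1000.0])
Sigma_scaled = Sigma * np.outer(d, d)
value_original = criterion(Sigma, A_q / s[:, None])
value_scaled = criterion(Sigma_scaled, A_q / (d * s)[:, None])

print(sum_sq_corr, value_pcs, value_same_span, value_original, value_scaled)
```

All five printed values should agree with the sum of the q largest eigenvalues of `R`, up to rounding error.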
To conclude this discussion, we note that Property A6 can be easily
modified to give a new property for covariance matrix PCA. The first q
covariance matrix PCs maximize, amongst all orthonormal linear transformations
of x, the sum of squared covariances between $x_1, x_2, \ldots, x_p$ and
the derived variables $y_1, y_2, \ldots, y_q$. Covariances, unlike correlations, are not
scale invariant, and hence neither is covariance matrix PCA.
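
The covariance matrix version of the property can be sketched in the same way. The example below is illustrative only: `Sigma`, the factor 1000, and the helper `sum_squared_covariances` are assumptions introduced here. It relies on the fact that, for y = B'x with B orthonormal, the covariances between x and y are the elements of Sigma B, so the criterion equals trace(B' Sigma^2 B), whose maximum over orthonormal B is the sum of the squares of the q largest eigenvalues of Sigma.

```python
import numpy as np

def sum_squared_covariances(Sigma, B):
    """Sum of squared covariances between each x_j and each y_k, with y = B'x.
    Hypothetical helper written for this illustration."""
    C = Sigma @ B                          # cov(x_j, y_k)
    return float(np.sum(C ** 2))

rng = np.random.default_rng(2)
p, q = 5, 2
L = rng.normal(size=(p, p))
Sigma = L @ L.T + np.eye(p)                # illustrative positive definite covariance

lam, A = np.linalg.eigh(Sigma)
lam, A = lam[::-1], A[:, ::-1]             # descending eigenvalue order
A_q = A[:, :q]                             # first q covariance matrix PCs

best = sum_squared_covariances(Sigma, A_q)
expected = float(np.sum(lam[:q] ** 2))

# An arbitrary orthonormal B for comparison; it should not beat the PCs.
Q, _ = np.linalg.qr(rng.normal(size=(p, q)))
other = sum_squared_covariances(Sigma, Q)

# Lack of scale invariance: multiplying the first variable by 1000 makes the
# first covariance matrix PC almost coincide with that variable.
d = np.ones(p)
d[0] = 1000.0
Sigma_scaled = Sigma * np.outer(d, d)
_, A_scaled = np.linalg.eigh(Sigma_scaled)
first_pc_scaled = A_scaled[:, -1]          # eigenvector of the largest eigenvalue

print(best, expected, other, abs(first_pc_scaled[0]))
```

In this sketch the rescaled first PC has nearly all of its weight on the rescaled variable, whereas the correlation matrix of the data, and hence correlation matrix PCA, is unaffected by such a change of units.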

