This result will prove to be useful later. Looking at diagonal elements,
we see that
\[
  \operatorname{var}(x_j) = \sum_{k=1}^{p} \lambda_k \alpha_{kj}^2 .
\]
However, perhaps the main statistical implication of the result is that not
only can we decompose the combined variances of all the elements of $x$
into decreasing contributions due to each PC, but we can also decompose
the whole covariance matrix into contributions $\lambda_k \alpha_k \alpha_k'$ from each PC. Although
not strictly decreasing, the elements of $\lambda_k \alpha_k \alpha_k'$ will tend to become
smaller as $k$ increases, as $\lambda_k$ decreases for increasing $k$, whereas the elements
of $\alpha_k$ tend to stay ‘about the same size’ because of the normalization
constraints
\[
  \alpha_k' \alpha_k = 1, \qquad k = 1, 2, \ldots, p.
\]
Property A1 emphasizes that the PCs explain, successively, as much as
possible of $\operatorname{tr}(\Sigma)$, but the current property shows, intuitively, that they
also do a good job of explaining the off-diagonal elements of $\Sigma$. This is
particularly true when the PCs are derived from a correlation matrix, and
is less valid when the covariance matrix is used and the variances of the
elements of x are widely different (see Section 2.3).
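
As an illustrative sketch (the $4 \times 4$ covariance matrix and the use of NumPy's
eigendecomposition are assumptions made only for the example), the following code
checks both decompositions numerically: the diagonal identity
$\operatorname{var}(x_j) = \sum_k \lambda_k \alpha_{kj}^2$ and the full spectral
decomposition $\Sigma = \sum_k \lambda_k \alpha_k \alpha_k'$.
\begin{verbatim}
import numpy as np

# Arbitrary 4x4 covariance matrix for illustration (an assumption, not from the text).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Sigma = A @ A.T                       # symmetric, positive semi-definite

# Eigendecomposition: columns of alpha are the coefficient vectors alpha_k,
# lam holds the corresponding variances lambda_k, sorted in decreasing order.
lam, alpha = np.linalg.eigh(Sigma)
order = np.argsort(lam)[::-1]
lam, alpha = lam[order], alpha[:, order]

# var(x_j) = sum_k lambda_k * alpha_{kj}^2   (diagonal elements of Sigma)
var_from_pcs = (alpha ** 2) @ lam
assert np.allclose(var_from_pcs, np.diag(Sigma))

# Sigma = sum_k lambda_k * alpha_k alpha_k'  (the whole covariance matrix)
Sigma_rebuilt = sum(l * np.outer(a, a) for l, a in zip(lam, alpha.T))
assert np.allclose(Sigma_rebuilt, Sigma)
\end{verbatim}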
It is clear from (2.1.10) that the covariance (or correlation) matrix can
be constructed exactly, given the coefficients and variances of the first r
PCs, where r is the rank of the covariance matrix. Ten Berge and Kiers
(1999) discuss conditions under which the correlation matrix can be exactly
reconstructed from the coefficients and variances of the first $q\ (<r)$ PCs.
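
A minimal sketch of this exact rank-$r$ reconstruction, using a deliberately
rank-deficient covariance matrix (the $5 \times 5$ matrix of rank 3 is an arbitrary
choice for illustration):
\begin{verbatim}
import numpy as np

# Covariance matrix of rank r = 3 in p = 5 dimensions (an arbitrary example).
rng = np.random.default_rng(1)
B = rng.standard_normal((5, 3))
Sigma = B @ B.T

lam, alpha = np.linalg.eigh(Sigma)
order = np.argsort(lam)[::-1]
lam, alpha = lam[order], alpha[:, order]

r = np.linalg.matrix_rank(Sigma)      # r = 3 here; eigenvalues beyond r are zero
# The coefficients and variances of the first r PCs reproduce Sigma exactly.
Sigma_r = (alpha[:, :r] * lam[:r]) @ alpha[:, :r].T
assert np.allclose(Sigma_r, Sigma)
\end{verbatim}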
A corollary of the spectral decomposition of $\Sigma$ concerns the conditional
distribution of $x$, given the first $q$ PCs, $z_q$, $q = 1, 2, \ldots, (p-1)$. It can
be shown that the linear combination of $x$ that has maximum variance,
conditional on $z_q$, is precisely the $(q+1)$th PC. To see this, we use the
result that the conditional covariance matrix of $x$, given $z_q$, is
\[
  \Sigma - \Sigma_{xz} \Sigma_{zz}^{-1} \Sigma_{zx},
\]
where $\Sigma_{zz}$ is the covariance matrix for $z_q$, $\Sigma_{xz}$ is the $(p \times q)$ matrix
whose $(j,k)$th element is the covariance between $x_j$ and $z_k$, and $\Sigma_{zx}$ is
the transpose of $\Sigma_{xz}$ (Mardia et al., 1979, Theorem 3.2.4).
It is seen in Section 2.3 that the $k$th column of $\Sigma_{xz}$ is $\lambda_k \alpha_k$. The matrix
$\Sigma_{zz}^{-1}$ is diagonal, with $k$th diagonal element $\lambda_k^{-1}$, so it follows that
\[
  \Sigma_{xz} \Sigma_{zz}^{-1} \Sigma_{zx}
    = \sum_{k=1}^{q} \lambda_k \alpha_k \, \lambda_k^{-1} \, \lambda_k \alpha_k'
    = \sum_{k=1}^{q} \lambda_k \alpha_k \alpha_k',
\]
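
As a concrete check of these relations (again with an arbitrary covariance matrix,
and with $q = 2$ chosen only for illustration), the sketch below confirms that the
$k$th column of $\Sigma_{xz}$ is $\lambda_k \alpha_k$, that $\Sigma_{zz}$ is diagonal with
elements $\lambda_1, \ldots, \lambda_q$, and that the conditional covariance matrix
$\Sigma - \Sigma_{xz}\Sigma_{zz}^{-1}\Sigma_{zx}$ equals
$\sum_{k=q+1}^{p} \lambda_k \alpha_k \alpha_k'$, whose leading eigenvalue is
$\lambda_{q+1}$, the variance of the $(q+1)$th PC.
\begin{verbatim}
import numpy as np

# Arbitrary 5x5 covariance matrix; q = 2 chosen only for illustration.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
Sigma = A @ A.T

lam, alpha = np.linalg.eigh(Sigma)
order = np.argsort(lam)[::-1]
lam, alpha = lam[order], alpha[:, order]

q = 2
A_q = alpha[:, :q]                    # coefficients of the first q PCs, z_q = A_q' x

Sigma_xz = Sigma @ A_q                # cov(x, z_q); kth column is lambda_k * alpha_k
assert np.allclose(Sigma_xz, A_q * lam[:q])

Sigma_zz = A_q.T @ Sigma @ A_q        # diagonal, with elements lambda_1, ..., lambda_q
assert np.allclose(Sigma_zz, np.diag(lam[:q]))

# Conditional covariance of x given z_q equals the remaining spectral terms.
cond = Sigma - Sigma_xz @ np.linalg.inv(Sigma_zz) @ Sigma_xz.T
residual = (alpha[:, q:] * lam[q:]) @ alpha[:, q:].T
assert np.allclose(cond, residual)

# Its largest eigenvalue is lambda_{q+1}, attained (up to sign) by alpha_{q+1}.
w = np.linalg.eigvalsh(cond)
assert np.isclose(w[-1], lam[q])
\end{verbatim}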

