Page 65 - Jolliffe I. Principal Component Analysis
3. Properties of Sample Principal Components
Property G2 may also be carried over from populations to samples as
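follows. Suppose that the observations $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ are transformed by
$$\mathbf{y}_i = \mathbf{B}'\mathbf{x}_i, \quad i = 1, 2, \ldots, n,$$
where $\mathbf{B}$ is a $(p \times q)$ matrix with orthonormal columns, so that
$\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_n$ are projections of $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ onto a $q$-dimensional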
subspace. Then
$$\sum_{h=1}^{n} \sum_{i=1}^{n} (\mathbf{y}_h - \mathbf{y}_i)'(\mathbf{y}_h - \mathbf{y}_i)$$
is maximized when $\mathbf{B} = \mathbf{A}_q$. Conversely, the same criterion is minimized
when $\mathbf{B} = \mathbf{A}_q^{*}$.
This property means that if the n observations are projected onto a
q-dimensional subspace, then the sum of squared Euclidean distances
between all pairs of observations in the subspace is maximized when the
subspace is defined by the first q PCs, and minimized when it is defined
by the last q PCs. The proof that this property holds is again rather
similar to that for the corresponding population property and will not be
repeated.
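
The statement above can be checked numerically. The following is a minimal sketch, not a proof: it assumes NumPy, simulated data, and the sample covariance matrix with divisor n − 1. The helper names `pairwise_sq_dist_sum` and `pc_loadings` are hypothetical and chosen only for illustration. It compares the criterion for the first q PCs, the last q PCs, and one arbitrary orthonormal basis.

```python
import numpy as np

def pairwise_sq_dist_sum(Y):
    """Sum over all ordered pairs (h, i) of the squared Euclidean
    distance between rows y_h and y_i of Y."""
    diffs = Y[:, None, :] - Y[None, :, :]
    return float(np.sum(diffs ** 2))

def pc_loadings(X):
    """Eigenvectors of the sample covariance matrix of X (rows are
    observations), as columns sorted by decreasing eigenvalue."""
    S = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(S)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order]

# Simulated data (an assumption for illustration only).
rng = np.random.default_rng(0)
n, p, q = 50, 5, 2
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))

A = pc_loadings(X)
A_q = A[:, :q]          # first q PCs
A_q_star = A[:, -q:]    # last q PCs

# An arbitrary (p x q) matrix with orthonormal columns, via reduced QR.
B_rand, _ = np.linalg.qr(rng.normal(size=(p, q)))

# Rows of X @ B are y_i' = x_i' B, i.e. y_i = B' x_i.
# Pairwise differences do not depend on the mean, so X need not be centered.
crit_first = pairwise_sq_dist_sum(X @ A_q)
crit_last = pairwise_sq_dist_sum(X @ A_q_star)
crit_rand = pairwise_sq_dist_sum(X @ B_rand)

print(crit_last <= crit_rand <= crit_first)
```

A single random basis only illustrates the ordering; repeating the comparison over many random bases gives a more convincing, though still informal, check.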
The next property to be considered is equivalent to Property A5.
Both are concerned, one algebraically and one geometrically, with least
squares linear regression of each variable $x_j$ on the $q$ variables contained
in $\mathbf{y}$.
Property G3. As before, suppose that the observations $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$
are transformed by $\mathbf{y}_i = \mathbf{B}'\mathbf{x}_i$, $i = 1, 2, \ldots, n$, where $\mathbf{B}$ is a $(p \times q)$
matrix with orthonormal columns, so that $\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_n$ are projections of
$\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ onto a $q$-dimensional subspace. A measure of ‘goodness-of-fit’
of this $q$-dimensional subspace to $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ can be defined as the
sum of squared perpendicular distances of $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ from the subspace.
This measure is minimized when $\mathbf{B} = \mathbf{A}_q$.
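
As a rough numerical illustration of Property G3, the sketch below compares the goodness-of-fit measure for the subspace spanned by the first q PCs against a number of randomly chosen q-dimensional subspaces. It assumes NumPy, simulated data, and observations measured about their sample means, so that the subspace passes through the centroid. The function name `perp_sq_dist_sum` is hypothetical.

```python
import numpy as np

def perp_sq_dist_sum(Xc, B):
    """Sum of squared perpendicular distances of the rows of Xc from
    the subspace spanned by the orthonormal columns of B."""
    M = Xc @ B @ B.T      # rows are the projections B B' x_i
    R = Xc - M            # rows are the perpendicular residuals
    return float(np.sum(R ** 2))

# Simulated, mean-centered data (an assumption for illustration only).
rng = np.random.default_rng(1)
n, p, q = 40, 4, 2
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
Xc = X - X.mean(axis=0)

# First q PCs: eigenvectors of the sample covariance matrix with the
# q largest eigenvalues.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
A_q = eigvecs[:, np.argsort(eigvals)[::-1][:q]]

fit_pc = perp_sq_dist_sum(Xc, A_q)

# Goodness-of-fit for 200 random orthonormal bases (reduced QR).
fits_rand = []
for _ in range(200):
    B, _r = np.linalg.qr(rng.normal(size=(p, q)))
    fits_rand.append(perp_sq_dist_sum(Xc, B))

print(fit_pc <= min(fits_rand))
```

None of the random subspaces should fit better than the one defined by the first q PCs; this is a spot check under the stated assumptions rather than a substitute for the proof.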
Proof. The vector $\mathbf{y}_i$ is an orthogonal projection of $\mathbf{x}_i$ onto a
$q$-dimensional subspace defined by the matrix $\mathbf{B}$. Let $\mathbf{m}_i$ denote the position
of $\mathbf{y}_i$ in terms of the original coordinates, and $\mathbf{r}_i = \mathbf{x}_i - \mathbf{m}_i$. (See
Figure 3.1 for the special case where $p = 2$, $q = 1$; in this case $y_i$ is a scalar,
whose value is the length of $\mathbf{m}_i$.) Because $\mathbf{m}_i$ is an orthogonal projection
of $\mathbf{x}_i$ onto a $q$-dimensional subspace, $\mathbf{r}_i$ is orthogonal to the subspace, so
$\mathbf{r}_i'\mathbf{m}_i = 0$. Furthermore, $\mathbf{r}_i'\mathbf{r}_i$ is the squared perpendicular distance of $\mathbf{x}_i$
from the subspace so that the sum of squared perpendicular distances of
$\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ from the subspace is
$$\sum_{i=1}^{n} \mathbf{r}_i'\mathbf{r}_i.$$
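
The quantities used in this part of the proof can be made concrete for the special case p = 2, q = 1. The sketch below assumes NumPy, a handful of simulated points, and an arbitrarily chosen unit direction (the angle `theta` is a hypothetical value). It forms the projections and residuals explicitly and checks that each residual is orthogonal to its projection, and that the scalar projection equals the length of the projected vector up to sign.

```python
import numpy as np

# A few simulated two-dimensional observations (illustration only).
rng = np.random.default_rng(2)
n = 6
X = rng.normal(size=(n, 2))

# B is (2 x 1) with a single unit-length column; theta is arbitrary.
theta = 0.7
B = np.array([[np.cos(theta)], [np.sin(theta)]])

Y = X @ B        # (n x 1): scalar y_i = B' x_i for each observation
M = Y @ B.T      # (n x 2): m_i = B y_i, the projection in original coordinates
R = X - M        # (n x 2): r_i = x_i - m_i

dots = np.sum(R * M, axis=1)      # r_i' m_i for each i, zero up to rounding
perp_sq = np.sum(R * R, axis=1)   # r_i' r_i, squared perpendicular distances
total = float(perp_sq.sum())      # sum of squared perpendicular distances

print(np.allclose(dots, 0.0))
print(np.allclose(np.abs(Y[:, 0]), np.linalg.norm(M, axis=1)))
```

Because the direction here is arbitrary, `total` is generally larger than the value obtained when the column of B is the first PC of mean-centered data.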

