Page 421 - Jolliffe I. Principal Component Analysis
14. Generalizations and Adaptations of Principal Component Analysis
14.2.2 Metrics
The idea of defining PCA with respect to a metric or an inner-product dates
back at least to Dempster (1969, Section 7.6). Following the publication of
Cailliez and Pagès (1976) it became, together with an associated ‘duality
diagram,’ a popular view of PCA in France in the 1980s (see, for example,
Caussinus, 1986; Escoufier, 1987). In this framework, PCA is defined in
terms of a triple (X, Q, D), the three elements of which are:
• the matrix X is the (n × p) data matrix, which is usually but not
necessarily column-centred;
• the (p × p) matrix Q defines a metric on the p variables, so that the
distance between two observations $x_j$ and $x_k$ is $(x_j - x_k)' Q (x_j - x_k)$;
• the (n × n) matrix D is usually diagonal, and its diagonal elements
consist of a set of weights for the n observations. It can, however, be
more general, for example when the observations are not independent,
as in time series (Caussinus, 1986; Escoufier, 1987).
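The role of Q in the list above can be illustrated with a minimal sketch in Python (NumPy). The function name `q_metric` and the example vectors and weights are hypothetical, chosen only for illustration; the code simply evaluates the quadratic form $(x_j - x_k)' Q (x_j - x_k)$ given in the text.

```python
import numpy as np

def q_metric(x_j, x_k, Q):
    """Evaluate the quadratic form (x_j - x_k)' Q (x_j - x_k) for a (p x p) matrix Q."""
    diff = np.asarray(x_j, dtype=float) - np.asarray(x_k, dtype=float)
    return float(diff @ Q @ diff)

# Two illustrative observations on p = 3 variables (made-up values).
x_j = np.array([1.0, 2.0, 3.0])
x_k = np.array([0.0, 0.0, 1.0])

# With Q = I_p the quadratic form is the squared Euclidean distance.
d_identity = q_metric(x_j, x_k, np.eye(3))

# A diagonal Q re-weights the variables (weights are an assumption, not from the text).
Q_diag = np.diag([1.0, 0.25, 4.0])
d_weighted = q_metric(x_j, x_k, Q_diag)
```

With Q equal to the identity the value reduces to the ordinary squared Euclidean distance between the two observations; any other positive definite Q stretches or shrinks directions in the space of the p variables.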
The usual definition of covariance-based PCA has $Q = I_p$, the identity
matrix, and $D = \frac{1}{n} I_n$, though to get the sample covariance matrix with
divisor (n − 1) it is necessary to replace n by (n − 1) in the definition of
D, leading to a set of ‘weights’ which do not sum to unity. Correlation-
based PCA is achieved either by standardizing X, or by taking Q to be
the diagonal matrix whose jth diagonal element is the reciprocal of the
standard deviation of the jth variable, j = 1, 2, ..., p.
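As a rough numerical check of the covariance-based case, the sketch below builds the triple with Q equal to the (p × p) identity and D equal to the (n × n) identity divided by n, and compares X'DX with NumPy's covariance estimates. The data are simulated and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 4
X_raw = rng.normal(size=(n, p))      # simulated data, for illustration only
X = X_raw - X_raw.mean(axis=0)       # column-centred data matrix

Q = np.eye(p)                        # identity metric on the variables
D = np.eye(n) / n                    # equal weights 1/n, summing to one
D_n1 = np.eye(n) / (n - 1)           # weights 1/(n-1), which do not sum to one

S_n = X.T @ D @ X                    # covariance matrix with divisor n
S_n1 = X.T @ D_n1 @ X                # sample covariance matrix with divisor (n - 1)

# With Q = I the matrix X'DXQ is symmetric, so eigh applies directly.
eigvals, eigvecs = np.linalg.eigh(S_n @ Q)
order = np.argsort(eigvals)[::-1]    # sort components by decreasing variance
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
```

Here `S_n` should agree with `np.cov(X_raw, rowvar=False, bias=True)` and `S_n1` with the default `np.cov(X_raw, rowvar=False)`, which is the divisor distinction the paragraph above describes.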
Implementation of PCA with a general triple (X, Q, D) is readily
achieved by means of the generalized SVD, described in Section 14.2.1,
with Φ and Ω from that section equal to Q and D from this section. The
coefficients of the generalized PCs are given in the columns of the matrix
B defined by equation (14.2.2). Alternatively, they can be found from an
eigenanalysis of $X'DXQ$ or $XQX'D$ (Escoufier, 1987).
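The equivalence of the two eigenanalyses can be checked numerically. The sketch below is illustrative only: Q is an arbitrary symmetric positive definite matrix and D an arbitrary diagonal weight matrix, both generated at random rather than taken from the text. It compares the eigenvalues of the (p × p) matrix X'DXQ with those of the (n × n) matrix XQX'D. The generalized SVD route of Section 14.2.1 is not shown, since that section is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 8, 3
X = rng.normal(size=(n, p))          # simulated data, for illustration only
X = X - X.mean(axis=0)               # centred with the unweighted column means

A = rng.normal(size=(p, p))
Q = A @ A.T + p * np.eye(p)          # arbitrary symmetric positive definite metric
w = rng.uniform(0.5, 1.5, size=n)
D = np.diag(w / w.sum())             # arbitrary positive weights summing to one

M_p = X.T @ D @ X @ Q                # (p x p) matrix X'DXQ
M_n = X @ Q @ X.T @ D                # (n x n) matrix XQX'D

# Neither product is symmetric in general, so use the general eigenvalue routine.
# The eigenvalues are real in exact arithmetic; discard rounding-level imaginary parts.
ev_p = np.sort(np.linalg.eigvals(M_p).real)[::-1]
ev_n = np.sort(np.linalg.eigvals(M_n).real)[::-1]
```

The p eigenvalues of `M_p` should match the p largest eigenvalues of `M_n`, with the remaining n − p eigenvalues of `M_n` equal to zero up to rounding. In practice the smaller of the two matrices is the cheaper one to decompose.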
A number of particular generalizations of the standard form of PCA fit
within this framework. For example, Escoufier (1987) shows that, in
addition to the cases already noted, it can be used to transform variables; to
remove the effect of an observation by putting it at the origin; to look at
subspaces orthogonal to a subset of variables; to compare sample and
theoretical covariance matrices; and to derive correspondence and discriminant
analyses. Maurin (1987) examines how the eigenvalues and eigenvectors of
a generalized PCA change when the matrix Q in the triple is changed.
The framework also has connections with the fixed effects model of
Section 3.9. In that model, the observations $x_i$ are such that $x_i = z_i + e_i$,
where $z_i$ lies in a q-dimensional subspace and $e_i$ is an error term with zero
mean and covariance matrix $\frac{\sigma^2}{w_i}\Gamma$. Maximum likelihood estimation of the
model, assuming a multivariate normal distribution for e, leads to a
generalized PCA, where D is diagonal with elements $w_i$ and Q (which is denoted

