Page 421 - Jolliffe I. Principal Component Analysis

14. Generalizations and Adaptations of Principal Component Analysis
                              14.2.2 Metrics
                              The idea of defining PCA with respect to a metric or an inner-product dates
                              back at least to Dempster (1969, Section 7.6). Following the publication of
                              Cailliez and Pagès (1976) it became, together with an associated ‘duality
                              diagram,’ a popular view of PCA in France in the 1980s (see, for example,
                              Caussinus, 1986; Escoufier, 1987). In this framework, PCA is defined in
                              terms of a triple (X, Q, D), the three elements of which are:
                                 • the matrix X is the (n × p) data matrix, which is usually but not
                                   necessarily column-centred;
                                 • the (p × p) matrix Q defines a metric on the p variables, so that the
                                   distance between two observations $x_j$ and $x_k$ is $(x_j - x_k)' Q (x_j - x_k)$;

                                 • the (n × n) matrix D is usually diagonal, and its diagonal elements
                                   consist of a set of weights for the n observations. It can, however, be
                                   more general, for example when the observations are not independent,
                                   as in time series (Caussinus, 1986; Escoufier, 1987).
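
As a small illustration of the role of Q, the quadratic form in the second item can be evaluated directly. The sketch below uses NumPy; the function name `q_distance` and the example vectors are hypothetical, chosen only for illustration. With Q equal to the identity, the expression reduces to the squared Euclidean distance, and a diagonal Q simply reweights the variables.

```python
import numpy as np

def q_distance(xj, xk, Q):
    """Evaluate the quadratic form (xj - xk)' Q (xj - xk)."""
    d = np.asarray(xj, dtype=float) - np.asarray(xk, dtype=float)
    return float(d @ Q @ d)

# Two hypothetical observations on p = 3 variables.
xj = np.array([1.0, 2.0, 3.0])
xk = np.array([0.0, 0.0, 1.0])

# Q = I: the form is the squared Euclidean distance, 1 + 4 + 4 = 9.
d_identity = q_distance(xj, xk, np.eye(3))

# Diagonal Q: each squared difference is multiplied by its own weight,
# here 1*1 + 4*0.5 + 4*0.25 = 4.
d_weighted = q_distance(xj, xk, np.diag([1.0, 0.5, 0.25]))
```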

                                The usual definition of covariance-based PCA has $Q = I_p$, the identity
                              matrix, and $D = \frac{1}{n} I_n$, though to get the sample covariance matrix with
                              divisor (n − 1) it is necessary to replace n by (n − 1) in the definition of
                              D, leading to a set of ‘weights’ which do not sum to unity. Correlation-
                              based PCA is achieved either by standardizing X, or by taking Q to be
                              the diagonal matrix whose jth diagonal element is the reciprocal of the
                              standard deviation of the jth variable, j = 1, 2, ..., p.
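
The statement about D and the divisor can be checked numerically. The following is a minimal NumPy sketch, assuming a column-centred X and simulated data; the helper name `weighted_cross_product` is hypothetical. It forms X'DX for the two choices of D and compares the results with `np.cov`.

```python
import numpy as np

def weighted_cross_product(X, D):
    """Return X' D X for a column-centred X and an (n x n) weight matrix D."""
    return X.T @ D @ X

rng = np.random.default_rng(0)
X_raw = rng.normal(size=(6, 3))      # simulated data, n = 6, p = 3
X = X_raw - X_raw.mean(axis=0)       # column-centre
n = X.shape[0]

D_n = np.eye(n) / n                  # weights sum to 1
D_n1 = np.eye(n) / (n - 1)           # weights sum to n / (n - 1), not 1

S_n = weighted_cross_product(X, D_n)     # covariance matrix with divisor n
S_n1 = weighted_cross_product(X, D_n1)   # covariance matrix with divisor n - 1
```

Here `S_n` should agree with `np.cov(X_raw, rowvar=False, bias=True)` and `S_n1` with `np.cov(X_raw, rowvar=False)`, up to rounding.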
                                Implementation of PCA with a general triple (X, Q, D) is readily
                              achieved by means of the generalized SVD, described in Section 14.2.1,
                              with Φ and Ω from that section equal to Q and D from this section. The
                              coefficients of the generalized PCs are given in the columns of the matrix
                              B defined by equation (14.2.2). Alternatively, they can be found from an
                              eigenanalysis of $X'DXQ$ or $XQX'D$ (Escoufier, 1987).
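
The two eigenanalyses can be compared in a short numerical sketch. The code below is an illustration under stated assumptions, not Escoufier's procedure: it uses NumPy, simulated data, a hypothetical diagonal Q and hypothetical observation weights, and the helper `sorted_real_eigenvalues` is introduced only for this example. It checks that the p eigenvalues of the (p × p) matrix coincide with the p largest eigenvalues of the (n × n) matrix, the remaining n − p being zero up to rounding, and that with $Q = I_p$ and $D = \frac{1}{n} I_n$ they are the eigenvalues of the covariance matrix with divisor n.

```python
import numpy as np

def sorted_real_eigenvalues(A):
    """Eigenvalues of a square matrix, real parts sorted in decreasing order.

    For symmetric positive definite Q and diagonal non-negative D the
    eigenvalues of X'DXQ and XQX'D are real, so taking the real part only
    discards rounding-level imaginary components.
    """
    return np.sort(np.linalg.eigvals(A).real)[::-1]

rng = np.random.default_rng(1)
n, p = 8, 3
X = rng.normal(size=(n, p))               # simulated data
X = X - X.mean(axis=0)                    # column-centre

Q = np.diag([1.0, 0.5, 2.0])              # hypothetical metric on the variables
w = rng.uniform(0.5, 1.5, size=n)
D = np.diag(w / w.sum())                  # hypothetical weights summing to 1

ev_p = sorted_real_eigenvalues(X.T @ D @ X @ Q)   # (p x p) eigenproblem
ev_n = sorted_real_eigenvalues(X @ Q @ X.T @ D)   # (n x n) eigenproblem

# Special case Q = I_p, D = (1/n) I_n: ordinary covariance-based PCA.
ev_cov = sorted_real_eigenvalues(X.T @ (np.eye(n) / n) @ X @ np.eye(p))
```

Only eigenvalues are compared here; the scaling of the corresponding eigenvectors depends on the normalization constraints of the generalized SVD and is not addressed by this sketch.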


                                A number of particular generalizations of the standard form of PCA fit
                              within this framework. For example, Escoufier (1987) shows that, in addition
                              to the cases already noted, it can be used to transform variables; to
                              remove the effect of an observation by putting it at the origin; to look at
                              subspaces orthogonal to a subset of variables; to compare sample and
                              theoretical covariance matrices; and to derive correspondence and discriminant
                              analyses. Maurin (1987) examines how the eigenvalues and eigenvectors of
                              a generalized PCA change when the matrix Q in the triple is changed.
                                The framework also has connections with the fixed effects model of
                              Section 3.9. In that model, the observations $x_i$ are such that $x_i = z_i + e_i$,
                              where $z_i$ lies in a q-dimensional subspace and $e_i$ is an error term with zero
                              mean and covariance matrix $\frac{\sigma^2}{w_i} \Gamma$. Maximum likelihood estimation of the
                              model, assuming a multivariate normal distribution for e, leads to a
                              generalized PCA, where D is diagonal with elements $w_i$ and Q (which is denoted