Page 434 - Jolliffe I. Principal Component Analysis

                                           14.5. Three-Mode, Multiway and Multiple Group PCA
                              procedures can be extended to more than two groups. For example, Casin
                              (2001) reviews a number of techniques for dealing with K sets of variables,
                              most of which involve a PCA of the data arranged in one way or another.
                              He briefly compares these various methods with his own ‘generalization’ of
                              PCA, which is now described.
  Suppose that $\mathbf{X}_k$ is an $(n \times p_k)$ data matrix consisting of
measurements of $p_k$ variables on $n$ individuals, $k = 1, 2, \ldots, K$. The
same individuals are observed for each $k$. The first step in Casin’s (2001)
procedure is a PCA based on the correlation matrix obtained from the
$(n \times p)$ supermatrix
$$\mathbf{X} = (\mathbf{X}_1 \;\; \mathbf{X}_2 \;\; \ldots \;\; \mathbf{X}_K),$$
where $p = \sum_{k=1}^{K} p_k$. The first PC, $\mathbf{z}^{(1)}$, thus derived
is then projected onto the subspaces spanned by the columns of $\mathbf{X}_k$,
$k = 1, 2, \ldots, K$, to give a ‘first component’ $\mathbf{z}_k^{(1)}$ for each
$\mathbf{X}_k$. To obtain a second component, residual matrices
$\mathbf{X}_k^{(2)}$ are calculated. The $j$th column of $\mathbf{X}_k^{(2)}$
consists of residuals from a regression of the $j$th column of $\mathbf{X}_k$
on $\mathbf{z}_k^{(1)}$. A covariance matrix PCA is then performed for the
supermatrix
$$\mathbf{X}^{(2)} = (\mathbf{X}_1^{(2)} \;\; \mathbf{X}_2^{(2)} \;\; \ldots \;\; \mathbf{X}_K^{(2)}).$$
The first PC from this analysis is next projected onto the subspaces spanned
by the columns of $\mathbf{X}_k^{(2)}$, $k = 1, 2, \ldots, K$, to give a second
component $\mathbf{z}_k^{(2)}$ for $\mathbf{X}_k$. This is called a ‘second
auxiliary’ by Casin (2001). Residuals from regressions of the columns of
$\mathbf{X}_k^{(2)}$ on $\mathbf{z}_k^{(2)}$ give matrices $\mathbf{X}_k^{(3)}$,
and a covariance matrix PCA is carried out on the supermatrix formed from
these matrices. From this, third auxiliaries $\mathbf{z}_k^{(3)}$ are
calculated, and so on. Unlike an ordinary PCA of $\mathbf{X}$, which produces
$p$ PCs, the number of auxiliaries for the $k$th group of variables is only
$p_k$. Casin (2001) claims that this procedure is a sensible compromise
between separate PCAs for each $\mathbf{X}_k$, which concentrate on
within-group relationships, and extensions of canonical correlation
analysis, which emphasize relationships between groups.
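  The successive steps of the procedure just described can be sketched in code.
The Python/NumPy fragment below is an illustration only, not Casin’s (2001)
own implementation. The function names (first_pc_scores, project_onto_columns,
casin_auxiliaries) are hypothetical. It is assumed that each block is
standardized once at the outset, so that the first-step covariance PCA is a
correlation matrix PCA, and that the regressions on the auxiliaries are fitted
without an intercept, which is harmless because all columns are centred.

```python
import numpy as np


def first_pc_scores(X):
    """Scores on the first PC of X, using a covariance-matrix PCA."""
    Xc = X - X.mean(axis=0)
    # The leading right singular vector of the centred matrix is the
    # leading eigenvector of its covariance matrix.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]


def project_onto_columns(X, z):
    """Orthogonal projection of z onto the subspace spanned by the columns of X."""
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    return X @ coef


def casin_auxiliaries(blocks, n_steps):
    """Return aux, where aux[s][k] is the (s+1)th auxiliary for block k.

    blocks  : list of K arrays, each of shape (n, p_k), same n rows in each.
    n_steps : number of auxiliaries per block; at most min_k p_k here
              (a simplification of this sketch: every block is run for the
              same number of steps).
    """
    blocks = [np.asarray(X, dtype=float) for X in blocks]
    if n_steps > min(X.shape[1] for X in blocks):
        raise ValueError("n_steps must not exceed the smallest p_k")

    # Assumption: standardize once, so that step 1 is a correlation-matrix PCA.
    current = [(X - X.mean(axis=0)) / X.std(axis=0, ddof=1) for X in blocks]

    auxiliaries = []
    for _ in range(n_steps):
        # First PC of the current supermatrix (X_1 X_2 ... X_K).
        z = first_pc_scores(np.hstack(current))
        # Project that PC onto the column space of each block.
        z_k = [project_onto_columns(Xk, z) for Xk in current]
        auxiliaries.append(z_k)
        # Replace each column of each block by its residual from a
        # regression (no intercept) on that block's auxiliary.
        current = [Xk - np.outer(zk, zk @ Xk) / (zk @ zk)
                   for Xk, zk in zip(current, z_k)]
    return auxiliaries
```

In this sketch the columns of each residual matrix are orthogonal to the
auxiliary on which they were regressed, so successive auxiliaries for the same
group are uncorrelated. Each step also lowers the rank of a residual matrix by
one, which is consistent with there being only $p_k$ auxiliaries for the $k$th
group.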
                                Van de Geer (1984) reviews the possible ways in which linear relation-
                              ships between two groups of variables can be quantified, and then discusses
                              how each might be generalized to more than two groups (see also van de
                              Geer (1986)). One of the properties considered by van de Geer (1984) in
                              his review is the extent to which within-group, as well as between-group,
                              structure is considered. When within-group variability is taken into account
                              there are links to PCA, and one of van de Geer’s (1984) generalizations is
                              equivalent to a PCA of all the variables in the K groups, as in extended
                              EOF analysis. Lafosse and Hanafi (1987) extend Tucker’s inter-battery
                              model, which was discussed in Section 9.3.3, to more than two groups.