Page 432 - Jolliffe I. Principal Component Analysis

14.5. Three-Mode, Multiway and Multiple Group PCA
                              derlying factors will very often be correlated, and that it is too restrictive
                              to force them to be uncorrelated, let alone independent (see, for example
                              Cattell (1978, p. 128); Richman (1986)).
                              14.5 Three-Mode, Multiway and Multiple Group
                                      Principal Component Analysis

                              Principal component analysis is usually done on a single (n×p) data matrix
                              X, but there are extensions to many other data types. In this section we
                              discuss briefly the case where there are additional ‘modes’ in the data. As
                              well as rows (individuals) and columns (variables) there are other layers,
                              such as different time periods.
                                The ideas for three-mode methods were first published by Tucker in the
                              mid-1960s (see, for example, Tucker, 1966) and by the early 1980s the topic
                              of three-mode principal component analysis was, on its own, the subject of
                              a 398-page book (Kroonenberg, 1983a). A 33-page annotated bibliography
                              (Kroonenberg, 1983b) gave a comprehensive list of references for the
                              slightly wider topic of three-mode factor analysis. The term ‘three-mode’
                              refers to data sets that have three modes by which the data may be classified.
                              For example, when PCs are obtained for several groups of individuals
                              as in Section 13.5, there are three modes corresponding to variables, groups
                              and individuals. Alternatively, we might have n individuals, p variables and
                              t time points, so that ‘individuals,’ ‘variables’ and ‘time points’ define the
                              three modes. In this particular case we have effectively n time series of p
                              variables, or a single time series of np variables. However, the time points
                              need not be equally spaced, nor is the time-order of the t repetitions necessarily
                              relevant in the sort of data for which three-mode PCA is used, in the
                              same way that neither individuals nor variables usually have any particular
                              a priori ordering.
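
The two readings of an individuals × variables × time points data set mentioned above can be illustrated with a small array. This is only a sketch: the sizes, the random data and the names `X`, `per_individual` and `single_series` are assumptions made for illustration, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n individuals, p variables, t time points.
n, p, t = 4, 3, 5

# X[i, j, k] holds variable j for individual i at time point k.
X = rng.normal(size=(n, p, t))

# Reading 1: n time series of p variables,
# i.e. one (t x p) matrix per individual.
per_individual = X.transpose(0, 2, 1)          # shape (n, t, p)

# Reading 2: a single time series of n*p variables,
# i.e. one (t x np) matrix whose column i*p + j is (individual i, variable j).
single_series = X.transpose(2, 0, 1).reshape(t, n * p)
```

Both arrays contain exactly the same values as `X`; only the way the three modes are arranged into rows and columns differs.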
                                Let $x_{ijk}$ be the observed value of the $j$th variable for the $i$th individual
                              measured on the $k$th occasion. The basic idea in three-mode analysis is to
                              approximate $x_{ijk}$ by the model

                              \[
                                \tilde{x}_{ijk} = \sum_{h=1}^{m} \sum_{l=1}^{q} \sum_{r=1}^{s} a_{ih} \, b_{jl} \, c_{kr} \, g_{hlr}.
                              \]
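
As a minimal sketch of what the model computes, the triple sum can be evaluated directly with NumPy's `einsum`. The array names `A`, `B`, `C`, `G`, the helper `three_mode_approx`, the sizes and the random values are all assumptions made for illustration. The code only evaluates the approximation for given parameters; it does not fit them to data.

```python
import numpy as np


def three_mode_approx(A, B, C, G):
    """Evaluate x~[i,j,k] = sum over h, l, r of A[i,h] B[j,l] C[k,r] G[h,l,r].

    A: (n, m), B: (p, q), C: (t, s) component matrices (hypothetical names),
    G: (m, q, s) core array linking the components of the three modes.
    """
    return np.einsum("ih,jl,kr,hlr->ijk", A, B, C, G)


rng = np.random.default_rng(1)

# Hypothetical sizes, with m, q, s smaller than n, p, t.
n, p, t = 6, 5, 4
m, q, s = 2, 3, 2

A = rng.normal(size=(n, m))
B = rng.normal(size=(p, q))
C = rng.normal(size=(t, s))
G = rng.normal(size=(m, q, s))

X_tilde = three_mode_approx(A, B, C, G)   # shape (n, p, t)
```

With these sizes the model uses n·m + p·q + t·s + m·q·s parameters in place of the n·p·t observed values, which is where the reduction in dimension comes from when the numbers of components are small.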
                              The values $m$, $q$, $s$ are less, and if possible very much less, than $n$, $p$,
                              $t$, respectively, and the parameters $a_{ih}$, $b_{jl}$, $c_{kr}$, $g_{hlr}$, $i = 1, 2, \ldots, n$,
                              $h = 1, 2, \ldots, m$, $j = 1, 2, \ldots, p$, $l = 1, 2, \ldots, q$, $k = 1, 2, \ldots, t$, $r = 1, 2, \ldots, s$,
                              are chosen to give a good fit of $\tilde{x}_{ijk}$ to $x_{ijk}$ for all $i$, $j$, $k$. There are a
                              number of methods for solving this problem and, like ordinary PCA, they
                              involve finding eigenvalues and eigenvectors of cross-product or covariance
                              matrices, in this case by combining two of the modes (for example, combine