Page 417 - Jolliffe I. Principal Component Analysis

14. Generalizations and Adaptations of Principal Component Analysis
                              ter 6), and a shorter description is given by Krzanowski and Marriott (1994,
                              Chapter 8). The link between non-linear biplots and PCA is somewhat
                              tenuous, so we introduce them only briefly. Classical biplots are based
                              on the singular value decomposition of the data matrix X, and provide
                              a best possible rank 2 approximation to X in a least squares sense
                              (Section 3.5). The distances between observations in the 2-dimensional space
                              of the biplot with α = 1 (see Section 5.3) give optimal approximations to
                              the corresponding Euclidean distances in p-dimensional space (Krzanowski
                              and Marriott, 1994). Non-linear biplots replace Euclidean distance by
                              other distance functions. In plots thus produced the straight lines or
                              arrows representing variables in the classical biplot are replaced by curved
                              trajectories. Different trajectories are used to interpolate positions of
                              observations on the plots and to predict values of the variables given the
                              plotting position of an observation. Gower and Hand (1996) give examples
                              of interpolation biplot trajectories but state that they ‘do not yet have an
                              example of prediction nonlinear biplots.’
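
As a rough illustration of the classical (linear) biplot described above, the following Python sketch builds observation and variable markers from the SVD of a column-centred data matrix and forms the rank 2 fit. NumPy is assumed; the function name `classical_biplot`, the random data, and the marker convention (observations scaled by singular values raised to α, variables by singular values raised to 1 − α) are illustrative assumptions rather than anything given in the text.

```python
import numpy as np

def classical_biplot(X, alpha=1.0, rank=2):
    """Hypothetical helper: biplot markers from the SVD of column-centred X."""
    Xc = X - X.mean(axis=0)                        # centre each variable
    U, s, At = np.linalg.svd(Xc, full_matrices=False)
    G = U[:, :rank] * s[:rank] ** alpha            # observation markers (n x rank)
    H = At[:rank].T * s[:rank] ** (1.0 - alpha)    # variable markers (p x rank)
    return Xc, G, H, s

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))                       # made-up data, n = 20, p = 5
Xc, G, H, s = classical_biplot(X, alpha=1.0)

X2 = G @ H.T                                       # rank 2 approximation to centred X
err = np.sum((Xc - X2) ** 2)                       # least squares error of the fit
discarded = np.sum(s[2:] ** 2)                     # squared singular values left out
```

Here `err` and `discarded` agree up to rounding, which is the least squares property of the rank 2 fit. With α = 1 the rows of `G` are projections of the centred observations onto the first two principal axes, so distances between them cannot exceed the corresponding Euclidean distances in p dimensions.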
                                Tenenbaum et al. (2000) describe an algorithm in which, as with
                              non-linear biplots, distances between observations other than Euclidean
                              distance are used in a PCA-related procedure. Here so-called geodesic
                              distances are approximated by finding the shortest paths in a graph connecting
                              the observations to be analysed. These distances are then used as input to
                              what seems to be principal coordinate analysis, a technique which is related
                              to PCA (see Section 5.2).
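
The two steps just described (shortest-path distances in a graph, followed by principal coordinate analysis) can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the authors' implementation: NumPy is assumed, the graph is built from each point's k nearest neighbours with k chosen arbitrarily, shortest paths use a plain Floyd-Warshall update, and the helper names `geodesic_distances` and `principal_coordinates` and the semicircle data are hypothetical.

```python
import numpy as np

def geodesic_distances(X, k=2):
    """Approximate geodesic distances by shortest paths in a k-nearest-neighbour graph."""
    n = X.shape[0]
    D = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))  # Euclidean
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]       # skip column 0 (the point itself)
    for i in range(n):
        G[i, nbrs[i]] = D[i, nbrs[i]]
        G[nbrs[i], i] = D[i, nbrs[i]]              # keep the graph symmetric
    for m in range(n):                             # Floyd-Warshall shortest paths
        G = np.minimum(G, G[:, m:m + 1] + G[m:m + 1, :])
    return D, G

def principal_coordinates(D, q=2):
    """Principal coordinate analysis (classical scaling) of a distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n            # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                    # doubly centred squared distances
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:q]             # q largest eigenvalues
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

# Made-up example: 40 points on a unit semicircle (a curved 1-D set in 2-D).
t = np.linspace(0.0, np.pi, 40)
X = np.column_stack([np.cos(t), np.sin(t)])
D, G = geodesic_distances(X, k=2)
Y = principal_coordinates(G, q=1)[:, 0]            # one coordinate per observation
```

For the two end points the Euclidean distance `D[0, -1]` is the chord length 2, whereas the graph distance `G[0, -1]` is close to the arc length π, so the single coordinate `Y` orders the points along the curve rather than across it.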



                              14.2 Weights, Metrics, Transformations and Centerings


                              Various authors have suggested ‘generalizations’ of PCA. We have met
                              examples of this in the direction of non-linearity in the previous section. A
                              number of generalizations introduce weights or metrics on either
                              observations or variables or both. The related topics of weights and metrics make
                              up two of the three parts of the present section; the third is concerned with
                              different ways of transforming or centering the data.



                              14.2.1 Weights
                              We start with a definition of generalized PCA which was given by Greenacre
                              (1984, Appendix A). It can be viewed as introducing either weights or metrics
                              into the definition of PCA. Recall the singular value decomposition (SVD)
                              of the (n × p) data matrix X defined in equation (3.5.1), namely

                                                         X = ULA′.                     (14.2.1)
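
A small numerical check of this decomposition may help fix the notation. The sketch below assumes NumPy, a made-up column-centred data matrix, and the usual reading in which U and A have orthonormal columns and L is diagonal with the singular values; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 4))                 # made-up (n x p) data, n = 10, p = 4
X = X - X.mean(axis=0)                       # column-centre, as is usual for PCA

U, sv, At = np.linalg.svd(X, full_matrices=False)
L = np.diag(sv)                              # diagonal matrix of singular values
A = At.T                                     # columns are right singular vectors

reconstructed = U @ L @ A.T                  # U L A' should reproduce X
eigvals = np.linalg.eigvalsh(X.T @ X)[::-1]  # eigenvalues of X'X, largest first
```

Up to rounding, `reconstructed` equals `X`, and the squared singular values `sv ** 2` match the eigenvalues of X′X, which is the link between the SVD and PCA.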