Page 420 - Jolliffe I. Principal Component Analysis

                                          14.2. Weights, Metrics, Transformations and Centerings
                              PC framework above unless the $w_{ij}$ can be written as products $w_{ij} = \omega_i \phi_j$,
                              $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, p$, although this method involves similar
                              ideas. The examples given by Gabriel and Zamir (1979) can be expressed as
                              contingency tables, so that correspondence analysis rather than PCA may
                              be more appropriate, and Greenacre (1984), too, develops generalized PCA
                              as an offshoot of correspondence analysis (he shows that another special
                              case of the generalized SVD (14.2.2) produces correspondence analysis, a
                              result which was discussed further in Section 13.1). The idea of weighting
                              could, however, be used in PCA for any type of data, provided that suitable
                              weights can be defined.
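
When the weights factorize as $w_{ij} = \omega_i \phi_j$ with all $\omega_i, \phi_j > 0$, the weighted least squares rank-m fit can be obtained from an ordinary SVD of the row- and column-scaled matrix. The following is a minimal sketch of that special case, not code from any of the cited papers; the function name `weighted_rank_m_product` and its arguments are hypothetical, and strictly positive weights are assumed.

```python
import numpy as np

def weighted_rank_m_product(X, omega, phi, m):
    """Rank-m fit minimizing sum_ij omega_i * phi_j * (x_ij - xhat_ij)**2.

    Assumes all row weights omega and column weights phi are > 0.
    """
    r = np.sqrt(np.asarray(omega, dtype=float))[:, None]  # row scaling, shape (n, 1)
    c = np.sqrt(np.asarray(phi, dtype=float))[None, :]    # column scaling, shape (1, p)
    # Ordinary SVD of the scaled matrix D_omega^(1/2) X D_phi^(1/2)
    U, s, Vt = np.linalg.svd(r * X * c, full_matrices=False)
    Y = (U[:, :m] * s[:m]) @ Vt[:m]                       # best rank-m fit of the scaled matrix
    return Y / r / c                                      # undo the scaling
```

With all weights equal, the result coincides with the usual truncated SVD of X; with unequal weights, rows and columns carrying larger weights are fitted more closely.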
                                Gabriel and Zamir (1979) suggest a number of ways in which special cases
                              of their weighted analysis may be used. As noted in Section 13.6, it can
                              accommodate missing data by giving zero weight to missing elements of X.
                              Alternatively, the analysis can be used to look for ‘outlying cells’ in a data
                              matrix. This can be achieved by using similar ideas to those introduced
                              in Section 6.1.5 in the context of choosing how many PCs to retain. Any
                              particular element $x_{ij}$ of X is estimated by least squares based on a subset
                              of the data that does not include $x_{ij}$. This (rank m) estimate ${}_m\hat{x}_{ij}$ is
                              readily found by equating to zero a subset of weights in (14.2.5), including
                              $w_{ij}$. The difference between $x_{ij}$ and ${}_m\hat{x}_{ij}$ provides a better measure of the
                              ‘outlyingness’ of $x_{ij}$ compared to the remaining elements of X than does
                              the difference between $x_{ij}$ and a rank m estimate, ${}_m\tilde{x}_{ij}$, based on the SVD
                              for the entire matrix X. This result follows because ${}_m\hat{x}_{ij}$ is not affected by
                              $x_{ij}$, whereas $x_{ij}$ contributes to the estimate ${}_m\tilde{x}_{ij}$.
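
The zero-weight idea for missing data and outlying cells can be sketched with a general weighted rank-m fit. The code below uses a simple alternating weighted least squares scheme, which is an assumption made for illustration and not necessarily the algorithm of Gabriel and Zamir (1979); the function names, the small ridge term added for numerical stability, and the fixed iteration count are likewise hypothetical choices.

```python
import numpy as np

def weighted_low_rank(X, W, m, n_iter=200, ridge=1e-8, seed=0):
    """Rank-m fit A @ B.T approximately minimizing sum_ij W_ij * (x_ij - fit_ij)**2.

    Cells with W_ij == 0 (e.g. missing values) do not influence the fit.
    """
    n, p = X.shape
    rng = np.random.default_rng(seed)
    A = np.zeros((n, m))
    B = rng.normal(size=(p, m))            # random starting column factors
    X0 = np.where(W > 0, X, 0.0)           # zero-weight cells may contain NaN
    I = ridge * np.eye(m)
    for _ in range(n_iter):
        for i in range(n):                 # weighted regression for each row factor
            G = B.T * W[i]                 # shape (m, p)
            A[i] = np.linalg.solve(G @ B + I, G @ X0[i])
        for j in range(p):                 # weighted regression for each column factor
            G = A.T * W[:, j]              # shape (m, n)
            B[j] = np.linalg.solve(G @ A + I, G @ X0[:, j])
    return A @ B.T

def left_out_residual(X, i, j, m):
    """x_ij minus its rank-m estimate from a fit that gives cell (i, j) zero weight."""
    W = np.ones(X.shape)
    W[i, j] = 0.0
    return X[i, j] - weighted_low_rank(X, W, m)[i, j]
```

For a data matrix that is close to rank m apart from one contaminated cell, `left_out_residual` evaluated at that cell should be close to the size of the contamination, whereas the residual from an SVD of the whole matrix tends to be smaller because the contaminated value pulls the fit towards itself.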
                                Commandeur et al. (1999) describe how to introduce weights for both
                              variables and observations into Meulman’s (1986) distance approach to
                              nonlinear multivariate data analysis (see Section 14.1.1).
                                In the standard atmospheric science set-up, in which variables correspond
                              to spatial locations, weights may be introduced to take account of uneven
                              spacing between the locations where measurements are taken. The weights
                              reflect the size of the area for which a particular location (variable) is
                              the closest point. This type of weighting may also be necessary when the
                              locations are regularly spaced on a latitude/longitude grid. The areas of the
                              corresponding grid cells decrease towards the poles, and allowance should
                              be made for this if the latitudinal spread of the data is moderate or large. An
                              obvious strategy is to assign to the grid cells weights that are proportional
                              to their areas. However, if there is a strong positive correlation within cells,
                              it can be argued that doubling the area, for example, does not double the
                              amount of independent information and that weights should reflect this.
                              Folland (1988) implies that weights should be proportional to $(\text{Area})^c$,
                              where $c$ is between $\frac{1}{2}$ and 1. Hannachi and O’Neill (2001) weight their data
                              by the cosine of latitude.
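
As a rough illustration of area weighting on a regular latitude/longitude grid, the sketch below takes cell area to be proportional to the cosine of latitude and raises it to a power c. It assumes that the weights enter the sum of squares, so that each centred column is multiplied by the square root of its weight before the SVD; the cited authors may apply their weights differently. The names `grid_weights` and `weighted_pca` are hypothetical, and grid points exactly at the poles (zero weight) are assumed to be absent.

```python
import numpy as np

def grid_weights(lat_deg, c=1.0):
    """Weights proportional to (cell area)**c, taking area proportional to cos(latitude)."""
    area = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))
    w = area ** c
    return w / w.sum()                      # normalized to sum to 1

def weighted_pca(X, w, m):
    """PCA of column-centred X (n times x p locations) with column weights w > 0."""
    Xc = X - X.mean(axis=0)
    Z = Xc * np.sqrt(w)                     # variance at location j is weighted by w_j
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :m] * s[:m]               # PC time series
    patterns = Vt[:m] / np.sqrt(w)          # spatial patterns in the original units
    variances = s[:m] ** 2 / (X.shape[0] - 1)
    return scores, patterns, variances
```

With `c=1` the weights are proportional to area; with `c=0.5` high-latitude cells are down-weighted less severely, in the spirit of the argument that correlated cells carry less independent information than their area suggests. The product `scores @ patterns` gives the rank-m reconstruction of the centred data.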
                                Buell (1978) and North et al. (1982) derive weights for irregularly spaced
                              atmospheric data by approximating a continuous version of PCA, based on
                              an equation similar to (12.3.1).