Page 437 - Jolliffe I. Principal Component Analysis

14. Generalizations and Adaptations of Principal Component Analysis
generality, that $x_1 \le x_2 \le \dots \le x_n$, and define the sample distribution
function as $F_n(x) = i/n$ for $x_i \le x < x_{i+1}$, $i = 0, 1, \dots, n$, where $x_0$,
$x_{n+1}$ are defined as 0, 1 respectively. Then a well-known test statistic is the
Cramér-von Mises statistic:

$$W_n^2 = n \int_0^1 (F_n(x) - x)^2 \, dx.$$

Like most all-purpose goodness-of-fit statistics, $W_n^2$ can detect many different
types of discrepancy between the observations and $G(y)$; a large value
of $W_n^2$ on its own gives no information about what type has occurred. For
this reason a number of authors, for example Durbin and Knott (1972) and
Durbin et al. (1975), have looked at decompositions of $W_n^2$ into a number
of separate ‘components,’ each of which measures the degree to which a
different type of discrepancy is present.
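
Because $F_n$ is a step function, the integral defining $W_n^2$ can be evaluated exactly. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes the observations have already been transformed to lie in [0, 1]. It integrates piece by piece over the intervals $[x_i, x_{i+1})$ and compares the result with the standard computational form $1/(12n) + \sum_{i=1}^{n} (x_i - (2i-1)/(2n))^2$, which is not stated in the text above.

```python
import math


def cramer_von_mises(sample):
    """W_n^2 = n * integral_0^1 (F_n(x) - x)^2 dx, integrated piece by piece.

    ASSUMPTION: the sample has already been transformed to lie in [0, 1].
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0 or xs[0] < 0.0 or xs[-1] > 1.0:
        raise ValueError("sample must be non-empty and lie in [0, 1]")
    knots = [0.0] + xs + [1.0]  # x_0 = 0 and x_{n+1} = 1
    total = 0.0
    for i in range(n + 1):
        a, b = knots[i], knots[i + 1]
        c = i / n  # F_n(x) = i/n on [x_i, x_{i+1})
        # integral of (c - x)^2 over [a, b] is ((c - a)^3 - (c - b)^3) / 3
        total += ((c - a) ** 3 - (c - b) ** 3) / 3.0
    return n * total


def cramer_von_mises_shortcut(sample):
    """Standard computational form (not given in the text, used as a check)."""
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (x - (2 * i - 1) / (2 * n)) ** 2 for i, x in enumerate(xs, start=1)
    )
```

For a single observation at 0.5 both functions give 1/12, the smallest value $W_1^2$ can take.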

It turns out that a ‘natural’ way of partitioning $W_n^2$ is (Durbin and Knott,
1972)

$$W_n^2 = \sum_{k=1}^{\infty} z_{nk}^2,$$

where

$$z_{nk} = (2n)^{1/2} \int_0^1 (F_n(x) - x) \sin(k\pi x) \, dx, \quad k = 1, 2, \dots,$$

are the PCs of $\sqrt{n}(F_n(x) - x)$. The phrase ‘PCs of $\sqrt{n}(F_n(x) - x)$’ needs
further explanation, since $\sqrt{n}(F_n(x) - x)$ is not, as is usual when defining
PCs, a $p$-variable vector. Instead, it is an infinite-dimensional random variable
corresponding to the continuum of values for $x$ between zero and one.
Durbin and Knott (1972) solve an equation of the form (12.3.1) to obtain
eigenfunctions $a_k(x)$, and hence corresponding PCs

$$z_{nk} = \sqrt{n} \int_0^1 a_k(x)(F_n(x) - x) \, dx,$$

where $a_k(x) = \sqrt{2} \sin(k\pi x)$.
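
The eigenfunction $a_1(x) = \sqrt{2}\sin(\pi x)$ can be checked numerically. The sketch below rests on two assumptions that are not spelled out on this page: that the relevant eigenequation has the integral form $\int_0^1 K(s,t)\,a(t)\,dt = l\,a(s)$, and that under the hypothesised distribution the covariance function of $\sqrt{n}(F_n(x) - x)$ is $K(s,t) = \min(s,t) - st$. It discretises the kernel with a midpoint rule and applies power iteration; the function name and grid size are illustrative choices.

```python
import math


def leading_eigenpair(m=200, iterations=40):
    """Power iteration on a midpoint-rule discretisation of K(s,t) = min(s,t) - s*t.

    ASSUMPTION: K is the covariance function of sqrt(n) * (F_n(x) - x) when the
    data are uniform on [0, 1]; this kernel is not quoted from the text.
    """
    t = [(j + 0.5) / m for j in range(m)]
    kernel = [[min(s, u) - s * u for u in t] for s in t]
    v = [1.0] * m  # starting function, already of unit L2 norm on [0, 1]
    eigenvalue = 0.0
    for _ in range(iterations):
        # approximate integral_0^1 K(s, u) v(u) du by a midpoint sum
        w = [sum(row[j] * v[j] for j in range(m)) / m for row in kernel]
        # rescale so that integral_0^1 v(u)^2 du is approximately 1
        norm = math.sqrt(sum(x * x for x in w) / m)
        eigenvalue = norm
        v = [x / norm for x in w]
    return eigenvalue, t, v
```

Under these assumptions the returned eigenvalue should be close to $1/\pi^2$ and the returned vector close to $\sqrt{2}\sin(\pi t)$ on the grid, matching $a_1(x)$.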

The components $z_{nk}$, $k = 1, 2, \dots$, are discussed in considerable detail,
from both theoretical and practical viewpoints, by Durbin and Knott
(1972) and Durbin et al. (1975), who also give several additional references
for the topic.
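
For a given sample the components are straightforward to compute. Integrating the step function $F_n$ against $\sin(k\pi x)$ term by term gives $z_{nk} = \sqrt{2/n}\,(k\pi)^{-1} \sum_{i=1}^{n} \cos(k\pi x_i)$; this closed form is derived here rather than quoted from the text, so treat it as an assumption that the final comparison is meant to support. The Python sketch below uses hypothetical function names and checks that the partial sums of $z_{nk}^2$ approach $W_n^2$.

```python
import math


def cvm_components(sample, num_components):
    """Components z_nk for k = 1, ..., num_components.

    ASSUMPTION: the sample lies in [0, 1], and z_nk is evaluated through the
    closed form sqrt(2/n) / (k*pi) * sum_i cos(k*pi*x_i), derived for this sketch.
    """
    n = len(sample)
    if n == 0 or min(sample) < 0.0 or max(sample) > 1.0:
        raise ValueError("sample must be non-empty and lie in [0, 1]")
    scale = math.sqrt(2.0 / n)
    return [
        scale / (k * math.pi) * sum(math.cos(k * math.pi * x) for x in sample)
        for k in range(1, num_components + 1)
    ]


def cramer_von_mises_shortcut(sample):
    """W_n^2 via the standard computational form, used here only as a check."""
    xs = sorted(sample)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (x - (2 * i - 1) / (2 * n)) ** 2 for i, x in enumerate(xs, start=1)
    )


if __name__ == "__main__":
    sample = [0.1, 0.4, 0.45, 0.9]  # made-up values for illustration
    z = cvm_components(sample, 20000)
    print("partial sum of z_nk^2:", sum(value * value for value in z))
    print("W_n^2:                ", cramer_von_mises_shortcut(sample))
```

Since each $z_{nk}^2$ is at most $2n/(k\pi)^2$, the partial sums converge slowly, so many terms are needed for close agreement; in practice only the first few components are usually of interest.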

Another use of PCA in goodness-of-fit testing is noted by Jackson (1991,
Section 14.3), namely using an extension to the multivariate case of the
Shapiro-Wilk test for normality, based on PCs rather than on the original
variables. Kaigh (1999) also discusses something described as ‘principal
components’ in the context of goodness-of-fit, but these appear to be related
to Legendre polynomials, rather than being the usual variance-maximizing
PCs.