Page 429 - Jolliffe I. Principal Component Analysis

14. Generalizations and Adaptations of Principal Component Analysis
                              be used to obtain improved estimates of the coefficients B in the equation
                              predicting y from x. Kloek and Mennes (1960) examine a number of ways
                              in which PCs of w or PCs of the residuals obtained from regressing w on
                              x or PCs of the combined vector containing all elements of w and x, can
                              be used as ‘instrumental variables’ in order to obtain improved estimates
                              of the coefficients B.
                              14.4 Alternatives to Principal Component Analysis for Non-Normal Distributions


                              We have noted several times that for many purposes it is not necessary to
                              assume any particular distribution for the variables x in a PCA, although
                              some of the properties of Chapters 2 and 3 rely on the assumption of
                              multivariate normality.
                                One way of handling possible non-normality, especially if the distribution
                              has heavy tails, is to use robust estimation of the covariance or correlation
                              matrix, or of the PCs themselves. The estimates may be designed
                              to allow for the presence of aberrant observations in general, or may be
                              based on a specific non-normal distribution with heavier tails, as in Baccini
                              et al. (1996) (see Section 10.4). In inference, confidence intervals or
                              tests of hypothesis may be constructed without any need for distributional
                              assumptions using the bootstrap or jackknife (Section 3.7.2). The paper
                              by Dudziński et al. (1995), which was discussed in Section 10.3, investigates
                              the effect of non-normality on repeatability of PCA, albeit in a small
                              simulation study.
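
The distribution-free inference mentioned above can be illustrated with a percentile bootstrap interval for the largest eigenvalue of the sample covariance matrix. This is only a minimal sketch: the function name `bootstrap_eigenvalue_ci`, the simulated multivariate t data, and the choice of the percentile method are illustrative assumptions, not the procedure of Section 3.7.2.

```python
import numpy as np

def bootstrap_eigenvalue_ci(X, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the largest eigenvalue of the
    sample covariance matrix (hypothetical helper, for illustration)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)       # resample rows with replacement
        S = np.cov(X[idx], rowvar=False)       # covariance of the resample
        stats[b] = np.linalg.eigvalsh(S)[-1]   # eigvalsh sorts ascending
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lower, upper

# Assumed example data: heavy-tailed multivariate t with 5 degrees of freedom.
rng = np.random.default_rng(1)
n, p, df = 200, 3, 5
Z = rng.standard_normal((n, p)) * np.array([3.0, 1.5, 1.0])
X = Z / np.sqrt(rng.chisquare(df, size=(n, 1)) / df)

lam1 = np.linalg.eigvalsh(np.cov(X, rowvar=False))[-1]
lo, hi = bootstrap_eigenvalue_ci(X)
print(f"largest eigenvalue {lam1:.2f}, 95% bootstrap interval ({lo:.2f}, {hi:.2f})")
```

No distributional form is assumed when the interval is built; only the rows of the data matrix are resampled. With very heavy tails the percentile interval can still be unreliable, so it should be read as a rough guide.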
                                Another possibility is to assume that the vector x of random variables
                              has a known distribution other than the multivariate normal. A number
                              of authors have investigated the case of elliptical distributions, of which
                              the multivariate normal is a special case. For example, Waternaux (1984)
                              considers the usual test statistic for the null hypothesis H_{0q}, as defined in
                              Section 6.1.4, of equality of the last (p−q) eigenvalues of the covariance matrix.
                              She shows that, with an adjustment for kurtosis, the same asymptotic
                              distribution for the test statistic is valid for all elliptical distributions with
                              finite fourth moments. Jensen (1986) takes this further by demonstrating
                              that for a range of hypotheses relevant to PCA, tests based on a multivariate
                              normal assumption have identical level and power for all distributions
                              with ellipsoidal contours, even those without second moments. Things get
                              more complicated outside the class of elliptical distributions, as shown by
                              Waternaux (1984) for H_{0q}.
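
A kurtosis adjustment of the kind described above can be sketched as follows. The text does not spell out the form of the adjustment, so the code rests on assumptions: the unadjusted statistic is taken as n{(p−q) ln(mean of the last p−q eigenvalues) − sum of their logarithms}, with no small-sample correction factor; the adjustment is taken to be division by 1 + κ, where κ is the kurtosis parameter of the elliptical distribution; and κ is estimated from Mardia's multivariate kurtosis. The function name and the simulated data are hypothetical.

```python
import numpy as np

def kurtosis_adjusted_statistic(X, q):
    """Statistic for equality of the last p - q covariance eigenvalues,
    with an assumed 1 / (1 + kappa) kurtosis adjustment (illustrative)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    eig = np.linalg.eigvalsh(S)[::-1]           # eigenvalues, descending
    tail = eig[q:]                              # the last p - q eigenvalues
    m = p - q
    # Unadjusted statistic: n * (m * log(arithmetic mean) - sum of logs).
    Q = n * (m * np.log(tail.mean()) - np.log(tail).sum())
    # Squared Mahalanobis distances give Mardia's multivariate kurtosis.
    d2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(S), Xc)
    kappa = np.mean(d2 ** 2) / (p * (p + 2)) - 1.0   # near 0 for normal data
    dof = (m + 2) * (m - 1) // 2                # chi-squared degrees of freedom
    return Q, Q / (1.0 + kappa), kappa, dof

# Assumed example: multivariate t with 8 degrees of freedom (finite fourth
# moments), whose last three population eigenvalues are equal.
rng = np.random.default_rng(0)
n, p, df = 500, 4, 8
Z = rng.standard_normal((n, p)) * np.array([3.0, 1.0, 1.0, 1.0])
X = Z / np.sqrt(rng.chisquare(df, size=(n, 1)) / df)

Q, Q_adj, kappa, dof = kurtosis_adjusted_statistic(X, q=1)
print(f"Q = {Q:.2f}, adjusted Q = {Q_adj:.2f}, kappa = {kappa:.2f}, dof = {dof}")
```

Under these assumptions the adjusted value, rather than Q itself, would be compared with a chi-squared distribution on `dof` degrees of freedom. For heavy-tailed elliptical data κ is positive, so the adjustment shrinks the statistic.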
                                Jensen (1987) calls the linear functions of x that successively maximize
                              ‘scatter’ of a conditional distribution, where conditioning is on previously
                              derived linear functions, principal variables. Unlike McCabe’s (1984) usage
                              of the same phrase, these ‘principal variables’ are not a subset of the original