
14.6. Miscellanea
• Linear Approximation Asymmetric PCA. This leads to an equation that is equivalent to (9.3.2). Hence the technique is the same as redundancy analysis, one form of reduced rank regression and PCA of instrumental variables (Sections 9.3.3, 9.3.4, 14.3).
• Cross-correlation Asymmetric PCA. This reduces to finding the SVD of the matrix of covariances between two sets of variables, and so is equivalent to maximum covariance analysis (Section 9.3.3); a sketch of this computation follows the list.
• Constrained PCA. This technique finds ‘principal components’ that are constrained to be orthogonal to a space defined by a set of constraint vectors (a sketch follows this list). It is therefore closely related to the idea of projecting orthogonally to the isometric vector for size and shape data (Section 13.2) and is similar to Rao’s (1964) PCA uncorrelated with instrumental variables (Section 14.3). A soft-constraint version of this technique, giving a compromise between constrained PCA and ordinary PCA, is discussed in Diamantaras and Kung (1996, Section 7.3).
• Oriented PCA. In general terms, the objective is to find $\mathbf{a}_1, \mathbf{a}_2, \ldots, \mathbf{a}_k, \ldots$ that successively maximize $\mathbf{a}_k'\mathbf{S}_1\mathbf{a}_k / \mathbf{a}_k'\mathbf{S}_2\mathbf{a}_k$, where $\mathbf{S}_1$, $\mathbf{S}_2$ are two covariance matrices. Diamantaras and Kung (1996, Section 7.2) note that special cases include canonical discriminant analysis (Section 9.1) and maximization of a signal to noise ratio (Sections 12.4.3, 14.2.2); a sketch follows the list.
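
To make the cross-correlation case concrete, here is a minimal sketch (in Python with NumPy; the function name and toy data are ours, not the book's) of maximum covariance analysis: centre both sets of variables, form their sample cross-covariance matrix, and take its SVD. Successive pairs of singular vectors maximize the covariance between linear combinations of the two sets.

    import numpy as np

    def maximum_covariance_analysis(X, Y):
        # Centre each variable so that cross-products are covariances.
        Xc = X - X.mean(axis=0)
        Yc = Y - Y.mean(axis=0)
        # (p, q) matrix of covariances between the two sets of variables.
        S_xy = Xc.T @ Yc / (len(X) - 1)
        # SVD: successive singular vector pairs (u_k, v_k) maximize
        # the covariance between u_k'x and v_k'y.
        U, s, Vt = np.linalg.svd(S_xy, full_matrices=False)
        return s, U, Vt.T

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 5))
    Y = X[:, :3] + 0.5 * rng.standard_normal((100, 3))
    cov_pairs, A, B = maximum_covariance_analysis(X, Y)
    print(cov_pairs)   # covariances achieved by successive pairs of vectors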
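
A similar sketch for constrained PCA, under the same assumptions: an orthonormal basis for the constraint space is projected out of the data, and ordinary PCA of what remains yields components orthogonal to every constraint vector. Taking the single constraint to be the isometric vector $(1, 1, \ldots, 1)'$ reproduces the size-and-shape projection of Section 13.2.

    import numpy as np

    def constrained_pca(X, C, k=2):
        # Orthonormal basis for the constraint space spanned by the columns of C.
        Q, _ = np.linalg.qr(C)
        # Projector onto the orthogonal complement of that space.
        P = np.eye(X.shape[1]) - Q @ Q.T
        # Ordinary PCA of the projected data; every retained loading a
        # then satisfies C'a = 0.
        S = np.cov(X @ P, rowvar=False)
        vals, vecs = np.linalg.eigh(S)
        order = np.argsort(vals)[::-1]   # eigenvalues in decreasing order
        return vals[order][:k], vecs[:, order][:, :k]

    # Example: constrain orthogonally to the isometric (size) vector.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 4))
    C = np.ones((4, 1))
    vals, A = constrained_pca(X, C)
    print(C.T @ A)   # approximately zero: the components ignore overall size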
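
Oriented PCA can be computed as a generalized symmetric eigenproblem: the ratio is maximized by the leading generalized eigenvector of $\mathbf{S}_1\mathbf{a} = \lambda\mathbf{S}_2\mathbf{a}$, and the generalized eigenvalues are the maximized ratios. The sketch below uses scipy.linalg.eigh and takes successive vectors to be $\mathbf{S}_2$-orthogonal, one common way of making ‘successively maximize’ precise; the example matrices are invented.

    import numpy as np
    from scipy.linalg import eigh

    def oriented_pca(S1, S2, k=2):
        # Generalized eigenproblem S1 a = lambda S2 a (S2 positive definite);
        # the eigenvalues lambda are the maximized ratios a'S1a / a'S2a.
        vals, vecs = eigh(S1, S2)        # SciPy returns ascending eigenvalues
        order = np.argsort(vals)[::-1]
        return vals[order][:k], vecs[:, order][:, :k]

    # Example with a 'signal' and a 'noise' covariance matrix (cf. Section 14.2.2).
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    S1 = A @ A.T                                     # 'signal' covariance
    S2 = np.eye(5) + 0.1 * np.diag(rng.random(5))    # positive definite 'noise'
    ratios, W = oriented_pca(S1, S2)
    print(ratios)    # successively maximized signal-to-noise ratios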
Xu and Yuille (1992) describe a neural network approach based on statistical physics that gives a robust version of PCA (see Section 10.4). Fancourt and Principe (1998) propose a network that is tailored to find PCs for locally stationary time series.

As well as being found by neural networks, PCs can also be used as inputs to networks designed for other purposes. Diamantaras and Kung (1996, Section 4.6) give examples in which PCs are used as inputs to discriminant analysis (Section 9.1) and image processing. McGinnis (2000) uses them in a neural network approach to predicting snowpack accumulation from 700 mb geopotential heights.


14.6.2 Principal Components for Goodness-of-Fit Statistics

The context of this application of PCA is testing whether or not a (univariate) set of data $y_1, y_2, \ldots, y_n$ could have arisen from a given probability distribution with cumulative distribution function $G(y)$; that is, we want a goodness-of-fit test. If the transformation
$$x_i = G(y_i), \qquad i = 1, 2, \ldots, n$$
is made, then we can equivalently test whether or not $x_1, x_2, \ldots, x_n$ are from a uniform distribution on the range (0, 1).
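
A minimal illustration of this transformation (ours, not the book's; it uses Python with SciPy, takes $G$ to be the standard normal cdf purely for the example, and uses a Kolmogorov-Smirnov statistic as the uniformity check):

    import numpy as np
    from scipy import stats

    # Hypothesized distribution G: standard normal, chosen only for illustration.
    rng = np.random.default_rng(0)
    y = rng.standard_normal(200)        # data y_1, ..., y_n to be tested
    x = stats.norm.cdf(y)               # the transformation x_i = G(y_i)

    # If the y_i really arose from G, the x_i are uniform on (0, 1), so any
    # test of uniformity of the x_i is a goodness-of-fit test of G for the y_i.
    print(stats.kstest(x, "uniform"))   # e.g. a Kolmogorov-Smirnov check

Assume, without loss of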