Page 405 - Jolliffe I. Principal Component Analysis

13. Principal Component Analysis for Special Types of Data
                              Hence both the PC scores and their vectors of loadings have real and imag-
                              inary parts that can be examined separately. Alternatively, they can be
                              expressed in polar coordinates, and displayed as arrows whose lengths and
                              directions are defined by the polar coordinates. Such displays for loadings
                              are particularly useful when the variables correspond to spatial locations,
                              as in the example of wind measurements noted above, so that a map of the
                              arrows can be constructed for the eigenvectors. For such data, the ‘obser-
                              vations’ usually correspond to different times, and a different kind of plot
                              is needed for the PC scores. For example, Klink and Willmott (1989) use
                              two-dimensional contour plots in which the horizontal axis corresponds to
                              time (different observations), the vertical axis gives the angular coordinate
                              of the complex score, and contours represent the amplitudes of the scores.
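The polar-coordinate view described above can be illustrated with a minimal sketch in Python (NumPy). The function name `complex_pca`, the synthetic data, and the choice of covariance convention are assumptions made for illustration; they are not taken from the text or from Klink and Willmott (1989).

```python
import numpy as np

def complex_pca(Z):
    """Sketch of complex PCA for an (n x p) complex data matrix Z.

    Rows are observations (e.g. times), columns are variables (e.g. locations).
    """
    Zc = Z - Z.mean(axis=0)                  # centre each variable
    n = Zc.shape[0]
    S = Zc.conj().T @ Zc / (n - 1)           # Hermitian (p x p) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)     # real eigenvalues, ascending order
    order = np.argsort(eigvals)[::-1]        # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Zc @ eigvecs                    # complex PC scores, (n x p)
    return eigvals, eigvecs, scores

# Hypothetical data: 200 times, 5 locations; real part = u, imaginary part = v.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 5)) + 1j * rng.normal(size=(200, 5))

eigvals, eigvecs, scores = complex_pca(Z)

# Polar coordinates of the first vector of loadings: one arrow per variable.
loading_length = np.abs(eigvecs[:, 0])
loading_angle = np.angle(eigvecs[:, 0])      # radians, in (-pi, pi]

# Polar coordinates of the first PC's scores: one amplitude and angle per time.
score_amplitude = np.abs(scores[:, 0])
score_angle = np.angle(scores[:, 0])
```

With spatial variables, `loading_length` and `loading_angle` would give the arrow drawn at each location, while `score_amplitude` and `score_angle` are the quantities that a time-versus-angle display of the scores would use.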
                                The use of complex PCA for wind data dates back to at least Walton
                              and Hardy (1978). An example is given by Klink and Willmott (1989) in
                              which two versions of complex PCA are compared. In one, the real and
                              imaginary parts of the complex data are the zonal (west-east) and meridional
                              (south-north) wind velocity components. In the other, wind speed is ignored
                              and the real and imaginary parts are the sines and cosines of
                              the wind direction. A third analysis performs separate PCAs on the zonal
                              and meridional wind components, and then recombines the results of these
                              scalar analyses into vector form. Some similarities are found between the
                              results of the three analyses, but there are non-trivial differences. Klink and
                              Willmott (1989) suggest that the velocity-based complex PCA is most ap-
                              propriate for their data. Von Storch and Zwiers (1999, Section 16.3.3) have
                              an example in which ocean currents, as well as wind stresses, are considered.
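As a rough illustration of how the two complex representations differ, the following Python (NumPy) sketch builds both from the same pair of real wind components. The data are synthetic, the variable names are hypothetical, and the convention that the angle is measured anticlockwise from the zonal axis is an assumption; the source does not state which convention Klink and Willmott (1989) use.

```python
import numpy as np

# Hypothetical wind data: 100 times at 4 stations.
rng = np.random.default_rng(1)
u = rng.normal(size=(100, 4))    # zonal (west-east) component
v = rng.normal(size=(100, 4))    # meridional (south-north) component

# Version 1: velocity-based complex data; speed and direction are both retained.
Z_velocity = u + 1j * v

# Version 2: direction-only complex data; every entry has unit modulus,
# so wind speed is discarded.
# Assumption: the angle is measured anticlockwise from the zonal axis.
theta = np.arctan2(v, u)
Z_direction = np.cos(theta) + 1j * np.sin(theta)

# Third analysis: the real matrices u and v would each be passed to an
# ordinary (real) PCA, with the results recombined afterwards.
```

Either `Z_velocity` or `Z_direction` could then be supplied to a complex PCA routine; the two differ only in whether each observation is scaled by its wind speed.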
                                One complication in complex PCA is that each of the resulting complex
                              eigenvectors can be arbitrarily rotated in the complex plane, that is,
                              multiplied by a complex constant of unit modulus. This is different in
                              nature from rotation of (real) PCs, as described in Section 11.1, because
                              the variance explained by each component is unchanged by the rotation.
                              Klink and Willmott (1989) discuss how to produce solutions whose mean
                              direction is not arbitrary, so as to aid interpretation.
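The arbitrariness of this rotation can be checked numerically. The Python (NumPy) sketch below uses synthetic data and an arbitrarily chosen angle `phi` (both assumptions for illustration) to show that a rotated eigenvector is still a unit-length eigenvector with the same eigenvalue, and that the score amplitudes are unchanged. It does not implement the procedure of Klink and Willmott (1989) for fixing the direction.

```python
import numpy as np

# Hypothetical complex data: 150 observations on 3 variables.
rng = np.random.default_rng(2)
Z = rng.normal(size=(150, 3)) + 1j * rng.normal(size=(150, 3))
Zc = Z - Z.mean(axis=0)
S = Zc.conj().T @ Zc / (Zc.shape[0] - 1)     # Hermitian covariance matrix

lam, A = np.linalg.eigh(S)                   # eigenvalues in ascending order
a, lam1 = A[:, -1], lam[-1]                  # leading eigenvector and eigenvalue

phi = 0.7                                    # arbitrary rotation angle (radians)
a_rot = np.exp(1j * phi) * a                 # eigenvector rotated in the complex plane

still_eigvec = np.allclose(S @ a_rot, lam1 * a_rot)   # same eigenvalue
same_norm = np.isclose(np.linalg.norm(a_rot), 1.0)    # still unit length

score, score_rot = Zc @ a, Zc @ a_rot
same_amplitude = np.allclose(np.abs(score), np.abs(score_rot))
```

Only the angular coordinate of every loading and score shifts by `phi`, which is why some convention is needed before the directions can be interpreted.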
                                Preisendorfer and Mobley (1988, Section 2c) discuss the theory of
                              complex-valued PCA in some detail, and extend the ideas to quaternion-
                              valued and matrix-valued data sets. In their Section 4e they suggest that
                              it may sometimes be appropriate with vector-valued data to take Fourier
                              transforms of each element in the vector, and conduct PCA in the fre-
                              quency domain. There are, in any case, connections between complex PCA
                              and PCA in the frequency domain (see Section 12.4.1 and Brillinger (1981,
                              Chapter 9)).
                              PCA for Data Given as Intervals
                              Sometimes, because the values of the measured variables are imprecise or
                              for other reasons, an interval of values is given for a variable rather
                              than a single number. An element of the (n × p) data matrix is then an
                              interval $(\underline{x}_{ij}, \overline{x}_{ij})$ instead of the single value $x_{ij}$. Chouakria et al. (2000)