Page 139 - Jolliffe I. Principal Component Analysis

5. Graphical Representation of Data Using Principal Components
                              remain limitations on how many dimensions can be effectively shown si-
                              multaneously. The less sophisticated ideas of Tukey and Tukey (1981) still
                              have a rôle to play in this respect.
                                Even when dimensionality cannot be reduced to two or three, a reduc-
                              tion to as few dimensions as possible, without throwing away too much
                              information, is still often worthwhile before attempting to graph the data.
                              Some techniques, such as Chernoff’s faces, impose a limit on the number
                              of variables that can be handled, although a modification due to Flury and
                              Riedwyl (1981) increases the limit, and for most other methods a reduction
                              in the number of variables leads to simpler and more easily interpretable
                              diagrams. An obvious way of reducing the dimensionality is to replace the
                              original variables by the first few PCs, and the use of PCs in this context
                              will be particularly successful if each PC has an obvious interpretation (see
                              Chapter 4). Andrews (1972) recommends transforming to PCs in any case,
                              because the PCs are uncorrelated, which means that tests of significance
                              for the plots may be more easily performed with PCs than with the origi-
                              nal variables. Jackson (1991, Section 18.6) suggests that Andrews’ curves
                              of the residuals after ‘removing’ the first q PCs, that is, the sum of the last
                              (r − q) terms in the SVD of X, may provide useful information about the
                              behaviour of residual variability.
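
The two reductions described above can be sketched as follows. This is a minimal illustration, assuming that X holds observations as rows, that it is column-centred before the SVD is taken, and that NumPy is available; the function names `pc_scores` and `residual_after_q` are hypothetical and do not come from the text.

```python
import numpy as np

def _centred_svd(X):
    """Column-centre X and return its thin SVD (U, singular values, V')."""
    Xc = np.asarray(X, dtype=float)
    Xc = Xc - Xc.mean(axis=0)
    return np.linalg.svd(Xc, full_matrices=False)

def pc_scores(X, q):
    """Scores of each observation on the first q PCs (covariance-based)."""
    U, s, Vt = _centred_svd(X)
    return U[:, :q] * s[:q]

def residual_after_q(X, q):
    """Sum of the terms of the SVD that remain after 'removing' the first q."""
    U, s, Vt = _centred_svd(X)
    return (U[:, q:] * s[q:]) @ Vt[q:, :]
```

The scores returned by `pc_scores` are uncorrelated by construction, and the rows of `residual_after_q(X, q)` are the residuals that could then be fed into whichever graphical method is being used.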


                              5.6.1 Example

                              In Jolliffe et al. (1986), 107 English local authorities are divided into groups
                              or clusters, using various methods of cluster analysis (see Section 9.2), on
                              the basis of measurements on 20 demographic variables.
                                The 20 variables can be reduced to seven PCs, which account for over
                              90% of the total variation in the 20 variables, and for each local authority
                              an Andrews’ curve is defined on the range −π ≤ t ≤ π by the function
                                f(t) = \frac{z_1}{\sqrt{2}} + z_2 \sin t + z_3 \cos t + z_4 \sin 2t + z_5 \cos 2t + z_6 \sin 3t + z_7 \cos 3t,
                              where z_1, z_2, ..., z_7 are the values of the first seven PCs for the local
                              authority. Andrews’ curves may be plotted separately for each cluster. These
                              curves are useful in assessing the homogeneity of the clusters. For example,
                              Figure 5.7 gives the Andrews’ curves for three of the clusters (Clusters 2,
                              11 and 12) in a 13-cluster solution, and it can be seen immediately that
                              the shape of the curves is different for different clusters.
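
The Andrews' curve defined above can be evaluated numerically as in the sketch below. It assumes NumPy and takes a vector z of PC scores for one local authority; the function name `andrews_curve` and the grid of 201 points are illustrative choices, not part of the original analysis.

```python
import numpy as np

def andrews_curve(z, t):
    """Evaluate f(t) = z1/sqrt(2) + z2 sin t + z3 cos t + z4 sin 2t + ...

    z : PC scores for one observation (seven in the example, but any length works)
    t : scalar or array of points in the range -pi <= t <= pi
    """
    z = np.asarray(z, dtype=float)
    t = np.asarray(t, dtype=float)
    f = np.full(t.shape, z[0] / np.sqrt(2.0))
    for j in range(1, z.size):
        k = (j + 1) // 2                          # harmonic: 1, 1, 2, 2, 3, 3, ...
        basis = np.sin if j % 2 == 1 else np.cos  # sine first, then cosine
        f = f + z[j] * basis(k * t)
    return f

# Common grid on which the curves for every observation could be evaluated.
t_grid = np.linspace(-np.pi, np.pi, 201)
```

Evaluating the function on `t_grid` for each local authority, and drawing the resulting curves cluster by cluster, would give plots of the kind described for Figure 5.7.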
                                Compared to the variation between clusters, the curves fall into fairly
                              narrow bands, with a few exceptions, for each cluster. Narrower bands for
                              the curves imply greater homogeneity in the cluster.
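
One informal way to put a number on the width of such a band is sketched below. The pointwise range (maximum minus minimum across the curves in a cluster) is an assumption made here for illustration; it is not a measure used in the text, and the function name `band_width` is hypothetical.

```python
import numpy as np

def band_width(Z, t):
    """Pointwise range of the Andrews' curves for the rows of Z.

    Z : (n, p) array of PC scores for the n members of one cluster
    t : 1-D array of points in the range -pi <= t <= pi
    Returns an array of length len(t): max minus min of the n curves at each t.
    """
    Z = np.asarray(Z, dtype=float)
    t = np.asarray(t, dtype=float)
    # Basis functions 1/sqrt(2), sin t, cos t, sin 2t, cos 2t, ...
    rows = [np.full(t.shape, 1.0 / np.sqrt(2.0))]
    for j in range(1, Z.shape[1]):
        k = (j + 1) // 2
        rows.append(np.sin(k * t) if j % 2 == 1 else np.cos(k * t))
    curves = Z @ np.stack(rows)      # one curve per row, shape (n, len(t))
    return curves.max(axis=0) - curves.min(axis=0)
```

Under this assumed measure, a cluster whose band width is small for all t would be judged more homogeneous than one with a wide band, although a few outlying curves can dominate the range.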
                                In Cluster 12 there are two curves that are somewhat different from
                              the remainder. These curves have three complete oscillations in the range
                              (−π, π), with maxima at 0 and ±2π/3. This implies that they are
                              dominated by cos 3t and hence z_7. Examination of the seventh PC shows that