Page 427 - Jolliffe I. Principal Component Analysis

                                    14. Generalizations and Adaptations of Principal Component Analysis
                              14.3 Principal Components in the Presence of Secondary or Instrumental Variables
                              Rao (1964) describes two modifications of PCA that involve what he calls
                              ‘instrumental variables.’ These are variables which are of secondary im-
                              portance, but which may be useful in various ways in examining the
                              variables that are of primary concern. The term ‘instrumental variable’ is
                              in widespread use in econometrics, but in a rather more restricted context
                              (see, for example, Darnell (1994, pp. 197–200)).
                                Suppose that x is, as usual, a p-element vector of primary variables,
                              and that w is a vector of s secondary, or instrumental, variables. Rao
                              (1964) considers the following two problems, described respectively as ‘prin-
                              cipal components of instrumental variables’ and ‘principal components
                              ... uncorrelated with instrumental variables’:
                              (i) Find linear functions $\gamma_1' w, \gamma_2' w, \ldots$ of w that best predict x.
                              (ii) Find linear functions $\alpha_1' x, \alpha_2' x, \ldots$ with maximum variances that,
                                 as well as being uncorrelated with each other, are also uncorrelated
                                 with w.
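
                                Problem (ii) can be illustrated numerically. The sketch below is one
                              possible construction, not Rao's (1964) own derivation: it assumes that
                              sample covariances stand in for population ones, restricts the coefficient
                              vectors to the null space of the sample cross-covariance between w and x
                              (so that every resulting function of x has zero sample covariance with each
                              element of w), and then carries out an ordinary PCA within that subspace.
                              The function name, the tolerance and the simulated data are all
                              hypothetical.

```python
import numpy as np

def pcs_uncorrelated_with(X, W, tol=1e-10):
    """Max-variance linear functions of x that are uncorrelated with w.

    X is n x p and W is n x s, with observations in rows.
    Hypothetical helper for illustration; not taken from Rao (1964).
    """
    Xc = X - X.mean(axis=0)
    Wc = W - W.mean(axis=0)
    n = X.shape[0]
    S_xx = Xc.T @ Xc / (n - 1)
    S_wx = Wc.T @ Xc / (n - 1)              # s x p cross-covariance

    # Orthonormal basis N of {alpha : S_wx @ alpha = 0}, from the SVD of S_wx.
    _, sing, Vt = np.linalg.svd(S_wx, full_matrices=True)
    rank = int(np.sum(sing > tol * sing.max()))
    N = Vt[rank:].T                          # p x (p - rank)

    # Ordinary PCA restricted to that subspace.
    evals, B = np.linalg.eigh(N.T @ S_xx @ N)
    order = np.argsort(evals)[::-1]
    return evals[order], N @ B[:, order]     # variances, coefficient vectors

# Simulated data (assumed purely for illustration): p = 5, s = 2.
rng = np.random.default_rng(0)
W = rng.normal(size=(200, 2))
X = W @ rng.normal(size=(2, 5)) + rng.normal(size=(200, 5))
variances, A = pcs_uncorrelated_with(X, W)
```

                              With p = 5 primary and s = 2 instrumental variables, at most p − s = 3 such
                              functions exist, which is why A has three columns here.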
                                For (i), Rao (1964) notes that w may contain some or all of the elements
                              of x, and gives two possible measures of predictive ability, corresponding to
                              the trace and Euclidean norm criteria discussed with respect to Property
                              A5 in Section 2.1. He also mentions the possibility of introducing weights
                              into the analysis. The two criteria lead to different solutions to (i), one
                              of which is more straightforward to derive than the other. There is a su-
                              perficial resemblance between the current problem and that of canonical
                              correlation analysis, where relationships between two sets of variables are
                              also investigated (see Section 9.3), but the two situations are easily seen to
                              be different. However, as noted in Sections 6.3 and 9.3.4, the methodology
                              of Rao’s (1964) PCA of instrumental variables has reappeared under other
                              names. In particular, it is equivalent to redundancy analysis (van den Wol-
                              lenberg, 1977) and to one way of fitting a reduced rank regression model
                              (Davies and Tso, 1982).
                                The same technique is derived by Esposito (1998). He projects the matrix
                              X onto the space spanned by W, where X, W are data matrices associated
                              with x, w, and then finds principal components of the projected data. This
                              leads to an eigenequation
                                                   S XW S −1  S WX a k = l k a k ,
                                                        WW
                              which is the same as equation (9.3.5). Solving that equation leads to re-
                              dundancy analysis. Kazi-Aoual et al. (1995) provide a permutation test,
                              using the test statistic $\mathrm{tr}(S_{WX} S_{XX}^{-1} S_{XW})$ to decide whether there is any
                              relationship between the x and w variables.
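
                                The equivalence between the projection approach and the eigenequation,
                              and the form of the trace statistic, can be checked numerically. The sketch
                              below assumes column-centred data matrices and sample covariance matrices
                              with divisor n − 1. The source does not say how Kazi-Aoual et al. (1995)
                              carry out their test, so the row-shuffling Monte Carlo scheme used here is
                              a generic permutation procedure assumed for illustration and may differ
                              from theirs. All function names and the simulated data are hypothetical.

```python
import numpy as np

def cov_blocks(X, W):
    """Centred data plus the sample covariance blocks S_XX, S_WW, S_XW."""
    Xc = X - X.mean(axis=0)
    Wc = W - W.mean(axis=0)
    n = X.shape[0]
    return Xc, Wc, Xc.T @ Xc / (n - 1), Wc.T @ Wc / (n - 1), Xc.T @ Wc / (n - 1)

def redundancy_eigen(X, W):
    """Eigenvalues l_k and eigenvectors a_k of S_XW S_WW^{-1} S_WX."""
    _, _, _, S_ww, S_xw = cov_blocks(X, W)
    M = S_xw @ np.linalg.solve(S_ww, S_xw.T)     # symmetric p x p matrix
    evals, vecs = np.linalg.eigh((M + M.T) / 2)  # symmetrize against rounding
    order = np.argsort(evals)[::-1]
    return evals[order], vecs[:, order]

def projected_pca_eigenvalues(X, W):
    """Eigenvalues from a PCA of X after projection onto the columns of W."""
    Xc, Wc, *_ = cov_blocks(X, W)
    coef, *_ = np.linalg.lstsq(Wc, Xc, rcond=None)   # least-squares projection
    X_hat = Wc @ coef
    S_hat = X_hat.T @ X_hat / (X.shape[0] - 1)
    return np.sort(np.linalg.eigvalsh(S_hat))[::-1]

def trace_statistic(X, W):
    """tr(S_WX S_XX^{-1} S_XW)."""
    _, _, S_xx, _, S_xw = cov_blocks(X, W)
    return np.trace(S_xw.T @ np.linalg.solve(S_xx, S_xw))

def permutation_pvalue(X, W, n_perm=199, seed=0):
    """Monte Carlo p-value: shuffle the rows of W to break any x-w link (assumed scheme)."""
    rng = np.random.default_rng(seed)
    observed = trace_statistic(X, W)
    exceed = sum(
        trace_statistic(X, W[rng.permutation(len(W))]) >= observed
        for _ in range(n_perm)
    )
    return (exceed + 1) / (n_perm + 1)

# Simulated data (assumed purely for illustration): p = 4, s = 3.
rng = np.random.default_rng(1)
W = rng.normal(size=(150, 3))
X = W @ rng.normal(size=(3, 4)) + rng.normal(size=(150, 4))

l, a = redundancy_eigen(X, W)
l_proj = projected_pca_eigenvalues(X, W)   # agrees with l up to rounding error
p_value = permutation_pvalue(X, W)
```

                              Because the projected data have rank at most s, no more than s of the
                              eigenvalues l_k are non-zero; in this example the fourth is zero up to
                              rounding error.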