Page 428 - Jolliffe I. Principal Component Analysis
14.3. PCs in the Presence of Secondary or Instrumental Variables
Inevitably, the technique has been generalized. For example, Sabatier
et al. (1989) do so using the generalization of PCA described in Sec-
tion 14.2.2, with triples (X, Q₁, D), (W, Q₂, D). They note that Rao's
(1964) unweighted version of PCA of instrumental variables results from
doing a generalized PCA on W, with D = (1/n)Iₙ, and Q₂ chosen to
minimize ‖XX′ − WQ₂W′‖, where ‖·‖ denotes Euclidean norm. Sabatier
et al. (1989) extend this to minimize ‖XQ₁X′D − WQ₂W′D‖ with respect
to Q₂. They show that for various choices of Q₁ and D, a number of other
statistical techniques arise as special cases. Another generalization is given
by Takane and Shibayama (1991). For an (n₁ × p₁) data matrix X, consider
the prediction of X not only from an (n₁ × p₂) matrix of additional vari-
ables measured on the same individuals, but also from an (n₂ × p₁) matrix
of observations on a different set of n₂ individuals for the same variables as
in X. PCA of instrumental variables occurs as a special case when only the
first predictor matrix is present. Takane et al. (1995) note that redundancy
analysis, and Takane and Shibayama’s (1991) extension of it, amount to
projecting the data matrix X onto a subspace that depends on the external
information W and then conducting a PCA on the projected data. This
projection is equivalent to putting constraints on the PCA, with the same
constraints imposed in all dimensions. Takane et al. (1995) propose a fur-
ther generalization in which different constraints are possible in different
dimensions. The principal response curves of van den Brink and ter Braak
(1999) (see Section 12.4.2) represent another extension.
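The "project, then PCA" reading of redundancy analysis, and its connection to the minimization of ‖XX′ − WQ₂W′‖ discussed above, can be sketched numerically. The following is my own illustration, not part of the text: the simulated data and variable names are assumptions, and it presumes a centred X and a full-column-rank W.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p1, p2 = 50, 4, 2

# Centred data matrix X and external information W (full column rank).
W = rng.standard_normal((n, p2))
X = W @ rng.standard_normal((p2, p1)) + 0.3 * rng.standard_normal((n, p1))
X -= X.mean(axis=0)
W -= W.mean(axis=0)

# Hat matrix projecting onto the subspace spanned by the columns of W.
P = W @ np.linalg.solve(W.T @ W, W.T)

# "Projection then PCA": project X onto span(W), then carry out an
# ordinary PCA of the projected data via the SVD.
X_proj = P @ X
U, s, Vt = np.linalg.svd(X_proj, full_matrices=False)
scores = U * s   # component scores; at most rank(W) = p2 are non-degenerate

# Link to the minimization: the Frobenius-norm (Euclidean-norm) minimizer
# of ||XX' - W Q2 W'|| satisfies W Q2_hat W' = P (XX') P, i.e. the same
# projection applied on both sides of XX'.
Q2_hat = np.linalg.pinv(W) @ (X @ X.T) @ np.linalg.pinv(W).T
assert np.allclose(W @ Q2_hat @ W.T, P @ (X @ X.T) @ P)

print(np.round(s, 3))
```

Because X_proj lies in the p₂-dimensional subspace spanned by W, only the first p₂ singular values are non-zero, which is why the constrained analysis can yield at most rank(W) informative components.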
One situation mentioned by Rao (1964) in which problem type (ii)
(principal components uncorrelated with instrumental variables) might be
relevant is when the data x₁, x₂, ..., xₙ form a multiple time series with p
variables and n time points, and it is required to identify linear functions
of x that have large variances, but which are uncorrelated with ‘trend’
in the time series (see Section 4.5 for an example where the first PC is
dominated by trend). Rao (1964) argues that such functions can be found
by defining instrumental variables which represent trend, and then solv-
ing the problem posed in (ii), but he gives no example to illustrate this
idea. A similar idea is employed in some of the techniques discussed in Sec-
tion 13.2 that attempt to find components that are uncorrelated with an
isometric component in the analysis of size and shape data. In the context
of neural networks, Diamantaras and Kung (1996, Section 7.1) describe a
form of ‘constrained PCA’ in which the requirement of uncorrelatedness in
Rao’s method is replaced by orthogonality of the vectors of coefficients in
the constrained PCs to the subspace spanned by a set of constraints (see
Section 14.6.1).
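One standard way to realize Rao's idea in practice is to regress each variable on the instrumental (trend) variables and apply PCA to the residuals; linear functions of the residuals then have zero sample correlation with the trend terms. The sketch below is my own minimal illustration under that approach; the simulated series and the choice of a constant-plus-linear trend are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 120, 5

# Multiple time series: p variables over n time points, with a common
# linear trend added so that an unadjusted first PC would be dominated
# by trend.
t = np.arange(n, dtype=float)
trend = np.column_stack([np.ones(n), t])  # instrumental variables
X = 0.05 * t[:, None] * rng.uniform(0.5, 1.5, p) + rng.standard_normal((n, p))

# Regress each column of X on the trend variables and keep the residuals.
beta, *_ = np.linalg.lstsq(trend, X, rcond=None)
resid = X - trend @ beta

# PCA of the residuals: large-variance components free of trend.
cov = np.cov(resid, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
pc1 = resid @ eigvecs[:, -1]  # scores on the leading trend-free component

# Sample correlation of the leading component with the trend is zero
# (up to floating-point error), by construction of the residuals.
print(abs(np.corrcoef(pc1, t)[0, 1]))
```

The same residualizing device underlies the size-and-shape techniques of Section 13.2 mentioned above, with the isometric component playing the role of the trend variable.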
Kloek and Mennes (1960) also discussed the use of PCs as ‘instrumental
variables,’ but in an econometric context. In their analysis, a number of
dependent variables y are to be predicted from a set of predictor variables
x. Information is also available concerning another set of variables w (the
instrumental variables) not used directly in predicting y, but which can

