14. Generalizations and Adaptations of Principal Component Analysis
14.6 Miscellanea
This penultimate section discusses briefly some topics involving PCA that
do not fit very naturally into any of the other sections of the book.
14.6.1 Principal Components and Neural Networks
This subject is sufficiently large to have a book devoted to it (Diamantaras and Kung, 1996). The use of neural networks to provide non-linear extensions of PCA is discussed in Section 14.1.3, and computational aspects are revisited in Appendix A1. A few other related topics are noted here, drawing mainly on Diamantaras and Kung (1996), to which the interested reader is referred for further details. Much of the work in this area is concerned with constructing efficient algorithms, based on neural networks, for deriving PCs. There are variations depending on whether a single PC or several PCs are required, whether the first or last PCs are of interest, and whether the chosen PCs are found simultaneously or sequentially. The advantage of neural network algorithms is greatest when data arrive sequentially, so that the PCs need to be continually updated.
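
As an illustration of the sequential case, the sketch below (not taken from the book) implements Oja's rule, one of the best-known neural network algorithms for extracting the first PC, in NumPy; the data, dimensions, and step size are arbitrary choices made for the demonstration.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic zero-mean observations, treated as arriving one at a time.
    p = 5
    A = rng.normal(size=(p, p))
    X = rng.multivariate_normal(np.zeros(p), A @ A.T, size=20000)

    # Oja's rule: for each observation x, with output y = w'x,
    #   w <- w + eta * y * (x - y * w).
    # For a suitable step size, w converges to the leading
    # eigenvector of the covariance matrix of x.
    w = rng.normal(size=p)
    w /= np.linalg.norm(w)
    eta = 1e-3
    for x in X:
        y = w @ x
        w += eta * y * (x - y * w)

    # Check against a batch eigendecomposition: the cosine of the angle
    # between w and the leading eigenvector should be close to 1.
    eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))[1]
    print(abs(w @ eigvecs[:, -1]) / np.linalg.norm(w))

Each observation triggers a cheap update costing O(p) operations, which is what makes such algorithms attractive when the PCs must be revised as every new observation arrives.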
In some algorithms the transformation to PCs is treated as deterministic; in others noise is introduced (Diamantaras and Kung, 1996, Chapter 5). In this latter case, the components are written as

    y = B′x + e,

and the original variables are approximated by

    x̂ = Cy = CB′x + Ce,

where B, C are (p × q) matrices and e is a noise term. When e = 0, minimizing E[(x̂ − x)′(x̂ − x)] with respect to B and C leads to PCA (this follows from Property A5 of Section 2.1), but the problem is complicated by the presence of the term Ce in the expression for x̂. Diamantaras and Kung (1996, Chapter 5) describe solutions to a number of formulations of the problem of finding optimal B and C. Some constraints on B and/or C are necessary to make the problem well-defined, and the different formulations correspond to different constraints. All solutions have the common feature that they involve combinations of the eigenvectors of the covariance matrix of x with the eigenvectors of the covariance matrix of e. As with other signal/noise problems noted in Sections 12.4.3 and 14.2.2, it is necessary either to know the covariance matrix of e or to be able to estimate it separately from that of x.
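
To make the e = 0 case concrete, the following sketch (again illustrative, with synthetic data and arbitrary dimensions) checks numerically that taking both B and C to be the matrix of the first q eigenvectors of the covariance matrix attains the minimum of E[(x̂ − x)′(x̂ − x)], namely the sum of the p − q smallest eigenvalues, and that an arbitrary rank-q reconstruction does worse.

    import numpy as np

    rng = np.random.default_rng(1)
    p, q = 6, 2
    M = rng.normal(size=(p, p))
    X = rng.multivariate_normal(np.zeros(p), M @ M.T, size=50000)

    # Eigendecomposition of the sample covariance matrix
    # (eigenvalues returned in ascending order).
    eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    Vq = eigvecs[:, -q:]  # eigenvectors of the first q PCs

    # PCA choice B = C = Vq, so y = Vq'x and xhat = Vq y.
    Xhat = (X @ Vq) @ Vq.T
    print(np.mean(np.sum((Xhat - X) ** 2, axis=1)))  # approximately equal to
    print(eigvals[:-q].sum())                        # the p - q smallest eigenvalues

    # An arbitrary rank-q linear reconstruction has larger error.
    B, C = rng.normal(size=(p, q)), rng.normal(size=(p, q))
    print(np.mean(np.sum(((X @ B) @ C.T - X) ** 2, axis=1)))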
Networks that implement extensions of PCA are described in Diamantaras and Kung (1996, Chapters 6 and 7). Most have links to techniques developed independently in other disciplines. As well as non-linear extensions, the following analysis methods are discussed: