Page 410 - Jolliffe I. Principal Component Analysis

14.1. Non-Linear Extensions of Principal Component Analysis
and Marriott (1994, Chapter 8), and Michailidis and de Leeuw (1998) give
a review.
Gifi’s (1990) form of non-linear PCA is based on a generalization of the
result that if, for an (n × p) data matrix X, we minimize

\[ \operatorname{tr}\{(X - YB')'(X - YB')\}, \tag{14.1.1} \]

with respect to the (n × q) matrix Y whose columns are linear functions of
columns of X, and with respect to the (q × p) matrix B′, where the columns
of B are orthogonal, then the optimal Y consists of the values (scores) of
the first q PCs for the n observations, and the optimal matrix B consists
of the coefficients of the first q PCs. The criterion (14.1.1) corresponds to
that used in the sample version of Property A5 (see Section 2.1), and can
be rewritten as

\[ \operatorname{tr}\Bigl\{ \sum_{j=1}^{p} (x_j - Yb_j)'(x_j - Yb_j) \Bigr\}, \tag{14.1.2} \]

where x_j, b_j are the jth columns of X, B′, respectively.
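
The result above can be checked numerically. The following is a minimal sketch, assuming NumPy and a synthetic column-centred data matrix; the variable names and the helper `criterion` are illustrative and do not come from the text. It computes the first q PCs through the singular value decomposition, evaluates (14.1.1) and its column-by-column form (14.1.2), and compares the result with an arbitrary competing choice of B.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 50, 5, 2

# Synthetic, column-centred (n x p) data matrix; for illustration only.
X = rng.normal(size=(n, p)) @ rng.normal(size=(p, p))
X -= X.mean(axis=0)

# SVD of X: the rows of Vt hold the PC coefficient vectors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
B = Vt[:q].T   # (p x q), orthonormal columns: coefficients of the first q PCs
Y = X @ B      # (n x q): scores on the first q PCs


def criterion(X, Y, B):
    """Evaluate tr{(X - YB')'(X - YB')}, criterion (14.1.1)."""
    R = X - Y @ B.T
    return float(np.trace(R.T @ R))


loss_matrix = criterion(X, Y, B)

# Column-by-column form (14.1.2): b_j is the jth column of B', i.e. row j of B.
loss_columns = sum(
    float((X[:, j] - Y @ B[j]) @ (X[:, j] - Y @ B[j])) for j in range(p)
)

# At the PCA solution the criterion equals the sum of the discarded
# squared singular values of X.
loss_expected = float(np.sum(s[q:] ** 2))

# An arbitrary competing B with orthonormal columns (from a QR factorization),
# paired with the scores Y = XQ, should not give a smaller criterion.
Q, _ = np.linalg.qr(rng.normal(size=(p, q)))
loss_random = criterion(X, X @ Q, Q)

print(loss_matrix, loss_columns, loss_expected, loss_random)
```

The first three printed values should agree up to rounding error, and the fourth should be no smaller than the others.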
Gifi’s (1990) version of non-linear PCA is designed for categorical
variables so that there are no immediate values of x_j to insert in (14.1.2). Any
variables that are continuous are first converted to categories; then values
need to be derived for each category of every variable. We can express this
algebraically as the process minimizing

\[ \operatorname{tr}\Bigl\{ \sum_{j=1}^{p} (G_j c_j - Yb_j)'(G_j c_j - Yb_j) \Bigr\}, \tag{14.1.3} \]

where G_j is an (n × g_j) indicator matrix whose (h, i)th value is unity if
the hth observation is in the ith category of the jth variable and is zero
otherwise, and c_j is a vector of length g_j containing the values assigned
to the g_j categories of the jth variable. The minimization takes place with
respect to both c_j and Yb_j, so that the difference from (linear) PCA is
that there is optimization over the values of the variables in addition to
optimization of the scores on the q components. The solution is found by
an alternating least squares (ALS) algorithm which alternately fixes the
c_j and minimizes with respect to the Yb_j, then fixes the Yb_j at the new
values and minimizes with respect to the c_j, fixes the c_j at the new values
and minimizes over Yb_j, and so on until convergence. This is implemented
by the Gifi-written PRINCALS computer program (Gifi, 1990, Section 4.6)
which is incorporated in the SPSS software.
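
The alternation described above can be illustrated with a short sketch. It is not PRINCALS, and it rests on assumptions that are not in the text: each quantified variable G_j c_j is centred and rescaled to sum of squares n after every update (some such normalization is needed to exclude the trivial solution in which every c_j is zero), the integer category codes serve as starting values, and the data are synthetic. The function names `indicator`, `standardise` and `als_nonlinear_pca` are hypothetical.

```python
import numpy as np


def indicator(codes):
    """Build the (n x g_j) indicator matrix G_j from one column of category codes."""
    levels, idx = np.unique(codes, return_inverse=True)
    G = np.zeros((len(codes), len(levels)))
    G[np.arange(len(codes)), idx] = 1.0
    return G


def standardise(z):
    """Centre z and rescale it to sum of squares n (an assumed normalization)."""
    z = z - z.mean()
    return z * np.sqrt(len(z) / (z @ z))


def als_nonlinear_pca(codes, q, n_iter=200, tol=1e-10):
    """Alternating least squares sketch for criterion (14.1.3)."""
    n, p = codes.shape
    Gs = [indicator(codes[:, j]) for j in range(p)]
    counts = [G.sum(axis=0) for G in Gs]
    # Column j of Z holds the quantified variable G_j c_j.
    # Assumed starting values: the numeric category codes themselves.
    Z = np.column_stack([standardise(codes[:, j].astype(float)) for j in range(p)])
    losses = []
    for _ in range(n_iter):
        # Step 1: fix the c_j and minimize over Y and B (linear PCA of Z).
        _, _, Vt = np.linalg.svd(Z, full_matrices=False)
        B = Vt[:q].T
        Y = Z @ B
        fit = Y @ B.T  # column j of fit is Y b_j
        losses.append(float(np.sum((Z - fit) ** 2)))
        if len(losses) > 1 and losses[-2] - losses[-1] < tol:
            break
        # Step 2: fix the Y b_j and minimize over each c_j. The least squares
        # solution is the vector of category means of Y b_j, then renormalized.
        for j, G in enumerate(Gs):
            c = (G.T @ fit[:, j]) / counts[j]
            Z[:, j] = standardise(G @ c)
    cs = [(G.T @ Z[:, j]) / counts[j] for j, G in enumerate(Gs)]
    return cs, Y, B, losses


# Synthetic example: four three-category variables driven by one latent variable.
rng = np.random.default_rng(1)
raw = rng.normal(size=(200, 1)) + 0.5 * rng.normal(size=(200, 4))
codes = np.column_stack([np.digitize(raw[:, j], [-0.5, 0.5]) for j in range(4)])

cs, Y, B, losses = als_nonlinear_pca(codes, q=1)
print(len(losses), losses[0], losses[-1])
```

Under the assumed normalization each of the two steps is an exact minimization with the other block held fixed, so the recorded values of the criterion cannot increase from one iteration to the next. Other normalizations and starting values are possible and would in general lead to different category values c_j.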
A version of non-linear PCA also appears in another guise within the
Gifi system. For two categorical variables we have a contingency table that
can be analysed by correspondence analysis (Section 13.1). For more than
two categorical variables there is an extension of correspondence analysis,
called multiple correspondence analysis (see Section 13.1 and Greenacre,
