Page 135 - Jolliffe I. Principal Component Analysis


5. Graphical Representation of Data Using Principal Components
and the jth value for the second (column) variable. Let N be the (r × c)
matrix whose (i, j)th element is $n_{ij}$.
There are a number of seemingly different approaches, all of which lead
to correspondence analysis; Greenacre (1984, Chapter 4) discusses these
various possibilities in some detail. Whichever approach is used, the final
product is a sequence of pairs of vectors $(f_1, g_1), (f_2, g_2), \ldots, (f_q, g_q)$, where
$f_k$, $k = 1, 2, \ldots$, are r-vectors of scores or coefficients for the rows of N,
and $g_k$, $k = 1, 2, \ldots$, are c-vectors of scores or coefficients for the columns
of N. These pairs of vectors are such that the first q such pairs give a ‘best-
fitting’ representation in q dimensions, in a sense defined in Section 13.1,
of the matrix N, and of its rows and columns. It is common to take q = 2.
The rows and columns can then be plotted on a two-dimensional diagram;
the coordinates of the ith row are the ith elements of $f_1$, $f_2$, $i = 1, 2, \ldots, r$,
and the coordinates of the jth column are the jth elements of $g_1$, $g_2$,
$j = 1, 2, \ldots, c$.
Such two-dimensional plots cannot in general be compared in any direct
way with plots made with respect to PCs or classical biplots, as N is
a different type of data matrix from that used for PCs or their biplots.
However, Greenacre (1984, Sections 9.6 and 9.10) gives examples where
correspondence analysis is done with an ordinary (n × p) data matrix,
X replacing N. This is only possible if all variables are measured in the
same units. In these circumstances, correspondence analysis produces a
simultaneous two-dimensional plot of the rows and columns of X, which is
precisely what is done in a biplot, but the two analyses are not the same.
Both the classical biplot and correspondence analysis determine the
plotting positions for rows and columns of X from the singular value
decomposition (SVD) of a matrix (see Section 3.5). For the classical biplot,
the SVD is calculated for the column-centred matrix X, but in correspondence
analysis, the SVD is found for a matrix of residuals, after subtracting
‘expected values assuming independence of rows and columns’ from X/n
(see Section 13.1). According to Greenacre (1984, p. 288), the effect of looking
at residual (or interaction) terms is that all the dimensions found by correspondence
analysis represent aspects of the ‘shape’ of the data, whereas in PCA the
first PC often simply represents ‘size’ (see Sections 4.1, 13.2). Correspondence
analysis provides one way in which a data matrix may be adjusted
in order to eliminate some uninteresting feature such as ‘size,’ before finding
an SVD and hence ‘PCs.’ Other possible adjustments are discussed in
Sections 13.2 and 14.2.3.
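
The contrast between the two SVDs can be sketched in a few lines of Python. This is a minimal illustration rather than the book's own algorithm. It assumes that n is the grand total of the entries of X, and that the residuals are standardised by the square roots of the row and column masses before the SVD, which is a common convention in correspondence analysis. The function names and the small matrix X are hypothetical.

```python
import numpy as np

def biplot_svd(X):
    """SVD of the column-centred data matrix, as used for the classical biplot."""
    Xc = X - X.mean(axis=0)
    return np.linalg.svd(Xc, full_matrices=False)

def independence_residuals(X):
    """Residuals of X/n from the values expected under row-column independence."""
    P = X / X.sum()            # X/n, with n assumed to be the grand total
    r = P.sum(axis=1)          # row masses
    c = P.sum(axis=0)          # column masses
    return P - np.outer(r, c)  # observed minus expected proportions

def ca_svd(X):
    """SVD of the standardised residuals (the weighting is an assumption here)."""
    P = X / X.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    return np.linalg.svd(S, full_matrices=False)

# Hypothetical non-negative data, all variables in the same units.
X = np.array([[10., 20., 30.],
              [20., 40., 60.],
              [ 5., 25., 15.],
              [30., 10., 20.]])

U_b, s_b, Vt_b = biplot_svd(X)
U_c, s_c, Vt_c = ca_svd(X)
print("biplot singular values:", s_b)
print("correspondence analysis singular values:", s_c)
```

Under these assumptions the residual matrix has zero row and column sums, and multiplying X by a constant leaves the correspondence analysis singular values unchanged, which is one way of seeing that an overall ‘size’ effect has been removed before the SVD is taken.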
As with the biplot and its choice of α, there are several different ways of
plotting the points corresponding to rows and columns in correspondence
analysis. Greenacre and Hastie (1987) give a good description of the geometry
associated with the most usual of these plots. Whereas the biplot may
approximate Euclidean or Mahalanobis distances between rows, in correspondence
analysis the points are often plotted to optimally approximate so-called
$\chi^2$ distances (see Greenacre (1984), Benzécri (1992)).
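
As a rough illustration of what approximating $\chi^2$ distances means, the sketch below computes the $\chi^2$ distances between row profiles of a small table and compares them with Euclidean distances between plotted row points. It assumes the usual definition, in which squared differences between row profiles are weighted by the reciprocals of the column masses, and it uses one particular scaling of the row scores (often called principal coordinates), which is only one of the several plotting conventions mentioned above. The function names and the table N are hypothetical.

```python
import numpy as np

def chi2_row_distances(N):
    """Pairwise chi-squared distances between the row profiles of N."""
    P = N / N.sum()
    r = P.sum(axis=1)                  # row masses
    c = P.sum(axis=0)                  # column masses
    profiles = P / r[:, None]          # each row rescaled to sum to 1
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2 / c).sum(axis=2))

def principal_row_coordinates(N):
    """Row scores scaled so that Euclidean distances mimic chi-squared ones
    (one assumed convention among several)."""
    P = N / N.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    return (U * s) / np.sqrt(r)[:, None]

# Hypothetical 4 x 4 contingency table.
N = np.array([[20., 10.,  5., 15.],
              [10., 30., 10.,  5.],
              [ 5., 10., 40., 10.],
              [15.,  5., 10., 25.]])

F = principal_row_coordinates(N)
D_chi2 = chi2_row_distances(N)
# Euclidean distances using all dimensions, and using only the first two.
D_full = np.linalg.norm(F[:, None, :] - F[None, :, :], axis=2)
D_2d = np.linalg.norm(F[:, None, :2] - F[None, :, :2], axis=2)
print(np.round(D_chi2, 3))
print(np.round(D_2d, 3))
```

With all dimensions retained, the Euclidean distances between the row points reproduce the $\chi^2$ distances; keeping only the first two dimensions gives distances that are no larger, which is the sense in which a two-dimensional plot approximates them.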
