Page 396 - Jolliffe I. Principal Component Analysis
13.5. Common Principal Components
draw distinctions between the properties of principal components found by
each. Although the different criteria lead to a number of different general-
izations, it is arguable just how great a distinction should be drawn between
the three ungeneralized analyses (see Cadima and Jolliffe (1997); ten Berge
and Kiers (1997)).
The first property considered by ten Berge and Kiers (1996) corresponds
to Property A1 of Section 2.1, in which $\operatorname{tr}(B'\Sigma B)$ is minimized.
For $G$ groups of individuals treated separately this leads to minimization
of $\sum_{g=1}^{G} \operatorname{tr}(B_g'\Sigma_g B_g)$, but taking $B_g$ the same for each group gives
simultaneous components that minimize

$$\sum_{g=1}^{G} \operatorname{tr}(B'\Sigma_g B) = \operatorname{tr}\Bigl[B'\Bigl(\sum_{g=1}^{G} \Sigma_g\Bigr)B\Bigr] = G\,\operatorname{tr}(B'\bar{\Sigma}B),$$

where $\bar{\Sigma}$ is the average of $\Sigma_1, \Sigma_2, \ldots, \Sigma_G$.
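The pooling identity above holds for any $B$, so it can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the group count, dimensions and random covariance matrices are hypothetical and serve only to illustrate that the sum of the group traces equals $G$ times the trace computed from the average covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
G, p, q = 3, 5, 2  # hypothetical: number of groups, variables, components

# Hypothetical group covariance matrices Sigma_1, ..., Sigma_G
# (random symmetric positive semidefinite matrices, for illustration only).
sigmas = []
for _ in range(G):
    A = rng.standard_normal((p, p))
    sigmas.append(A @ A.T)

# Average of the group covariance matrices.
sigma_bar = sum(sigmas) / G

# An arbitrary p x q matrix B with orthonormal columns (reduced QR factor).
B, _ = np.linalg.qr(rng.standard_normal((p, q)))

# Left-hand side: sum over groups of tr(B' Sigma_g B).
sum_of_traces = sum(np.trace(B.T @ S @ B) for S in sigmas)

# Right-hand side: G * tr(B' Sigma_bar B).
pooled_trace = G * np.trace(B.T @ sigma_bar @ B)

print(np.isclose(sum_of_traces, pooled_trace))
```

Because the two sides agree for every $B$, optimizing the pooled criterion over $B$ is the same problem as applying the single-group criterion to $\bar{\Sigma}$.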
Ten Berge and Kiers' (1996) second property is a sample version of
Property A5 in Section 2.1. They express this property as minimizing
$\|X - XBC'\|^2$. For $G$ groups treated separately, the quantity

$$\sum_{g=1}^{G} \|X_g - X_g B_g C_g'\|^2 \tag{13.5.3}$$

is minimized. Ten Berge and Kiers (1996) distinguish three different ways
of adapting this formulation to find simultaneous components.
• Minimize $\sum_{g=1}^{G} \|X_g - X_g B C_g'\|^2$.
• Minimize $\sum_{g=1}^{G} \|X_g - X_g B_g C'\|^2$.
• Minimize (13.5.3) subject to $\Sigma_g B_g = S D_g$, where $D_g$ is diagonal and
$S$ is a 'common component structure.'
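To make the least squares criteria concrete, the sketch below evaluates $\sum_g \|X_g - X_g B_g C_g'\|^2$ for separate analyses and $\sum_g \|X_g - X_g B C_g'\|^2$ for one common $B$. It is an illustration under stated assumptions, not ten Berge and Kiers' algorithm: the norm is taken to be the Frobenius norm, no constraints beyond those shown are placed on $B$ or $C$, the data are random, and the helper names `reconstruction_loss` and `leading_axes` are hypothetical. The common $B$ is taken from the stacked data as a simple choice that is not claimed to be optimal.

```python
import numpy as np

def reconstruction_loss(X, B):
    """Minimum over C of ||X - X B C'||^2 (squared Frobenius norm) for a fixed B."""
    scores = X @ B                                   # n x q component scores
    Ct, *_ = np.linalg.lstsq(scores, X, rcond=None)  # least squares estimate of C' (q x p)
    return float(np.sum((X - scores @ Ct) ** 2))

def leading_axes(X, q):
    """First q right singular vectors of X, returned as a p x q matrix."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:q].T

rng = np.random.default_rng(1)
q = 2
# Hypothetical data: three groups with different sample sizes and four variables.
groups = [rng.standard_normal((n, 4)) for n in (30, 40, 50)]
groups = [X - X.mean(axis=0) for X in groups]        # column-centre each group

# Separate analyses: each group has its own B_g and C_g.
separate = sum(reconstruction_loss(X, leading_axes(X, q)) for X in groups)

# One common B with group-specific C_g; B here comes from the stacked data.
B_common = leading_axes(np.vstack(groups), q)
common_B = sum(reconstruction_loss(X, B_common) for X in groups)

print(separate, common_B)
```

Since $X_g B C_g'$ has rank at most $q$, forcing a common $B$ cannot give a smaller total than the separate analyses, so `separate` is never larger than `common_B`.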
The third optimality criterion considered by ten Berge and Kiers (1996)
is that noted at the end of Section 2.1, and expressed by Rao (1964) as
minimizing $\|\Sigma - \Sigma B(B'\Sigma B)^{-1}B'\Sigma\|$. Ten Berge and Kiers (1996) write
this as minimizing $\|\Sigma - FF'\|^2$, which extends to $G$ groups by minimizing
$\sum_{g=1}^{G} \|\Sigma_g - F_g F_g'\|^2$. This can then be modified to give simultaneous
components by minimizing $\sum_{g=1}^{G} \|\Sigma_g - FF'\|^2$. They show that this criterion
and the criterion based on Property A1 are both equivalent to the second
of their generalizations derived from Property A5.
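One way to see why the simultaneous version of the third criterion is tied to the average covariance matrix is the decomposition $\sum_g \|\Sigma_g - FF'\|^2 = \sum_g \|\Sigma_g - \bar{\Sigma}\|^2 + G\,\|\bar{\Sigma} - FF'\|^2$, where $\bar{\Sigma}$ is the average of the $\Sigma_g$. The sketch below checks this numerically. It assumes the Frobenius norm and uses hypothetical random covariance matrices; it is an illustration and not necessarily the argument used by ten Berge and Kiers (1996).

```python
import numpy as np

rng = np.random.default_rng(2)
G, p, q = 4, 5, 2  # hypothetical: number of groups, variables, columns of F

# Hypothetical group covariance matrices (random, positive semidefinite).
sigmas = []
for _ in range(G):
    A = rng.standard_normal((p, p))
    sigmas.append(A @ A.T)
sigma_bar = sum(sigmas) / G

def pooled_loss(F):
    """Sum over groups of ||Sigma_g - F F'||^2 (squared Frobenius norm)."""
    return float(sum(np.sum((S - F @ F.T) ** 2) for S in sigmas))

# Check the decomposition for an arbitrary p x q matrix F.
F_any = rng.standard_normal((p, q))
lhs = pooled_loss(F_any)
rhs = float(sum(np.sum((S - sigma_bar) ** 2) for S in sigmas)
            + G * np.sum((sigma_bar - F_any @ F_any.T) ** 2))

# F built from the q leading eigenpairs of the average covariance matrix,
# with eigenvectors scaled by the square roots of their eigenvalues.
evals, evecs = np.linalg.eigh(sigma_bar)
top = np.argsort(evals)[::-1][:q]
F_avg = evecs[:, top] * np.sqrt(evals[top])

print(np.isclose(lhs, rhs), pooled_loss(F_avg) <= lhs)
```

The first term of the decomposition does not involve $F$, so minimizing the pooled criterion reduces to approximating $\bar{\Sigma}$ by $FF'$, which the truncated eigendecomposition in `F_avg` does in the Frobenius sense.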
Ten Berge and Kiers (1996) compare properties of the three generaliza-
tions of Property A5, but do not reach any firm conclusions as to which
is preferred. They are, however, somewhat dismissive of Flury’s (1988)
approach on the grounds that it has at its heart the simultaneous diag-
onalization of G covariance matrices and ‘it is by no means granted that

