Page 62 - Jolliffe I. Principal Component Analysis

3.1. Optimal Algebraic Properties of Sample Principal Components
where

$$\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} \tilde{x}_{ij}, \qquad j = 1, 2, \ldots, p.$$
The matrix $\mathbf{S}$ can therefore be written as

$$\mathbf{S} = \frac{1}{n-1}\mathbf{X}'\mathbf{X}, \tag{3.1.1}$$
where $\mathbf{X}$ is an $(n \times p)$ matrix with $(i, j)$th element $(\tilde{x}_{ij} - \bar{x}_j)$; the representation (3.1.1) will be very useful in this and subsequent chapters. The notation $x_{ij}$ will be used to denote the $(i, j)$th element of $\mathbf{X}$, so that $x_{ij}$ is the value of the $j$th variable measured about its mean $\bar{x}_j$ for the $i$th observation. A final notational point is that it will be convenient to define the matrix of PC scores as

$$\mathbf{Z} = \mathbf{X}\mathbf{A}, \tag{3.1.2}$$

rather than as it was in the earlier definition. These PC scores will have exactly the same variances and covariances as those given by $\tilde{\mathbf{Z}}$, but will have zero means, rather than means $\bar{z}_k$, $k = 1, 2, \ldots, p$.
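The notation above can be checked numerically. The following is a minimal sketch in Python with NumPy, not part of the original text: the simulated data, the seed, and all variable names are illustrative assumptions, and the "earlier definition" of the scores is assumed to be the uncentred product $\tilde{\mathbf{X}}\mathbf{A}$.

```python
import numpy as np

# Illustrative data only: n observations on p correlated variables with non-zero means.
rng = np.random.default_rng(0)
n, p = 50, 3
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
X_raw = rng.normal(size=(n, p)) @ mixing + 10.0   # plays the role of the x-tilde values

x_bar = X_raw.mean(axis=0)        # column means, x-bar_j
X = X_raw - x_bar                 # centred matrix with elements x_ij
S = X.T @ X / (n - 1)             # equation (3.1.1)

# Columns of A are the eigenvectors a_k of S, ordered by decreasing eigenvalue l_k.
l, A = np.linalg.eigh(S)          # eigh returns eigenvalues in ascending order
order = np.argsort(l)[::-1]
l, A = l[order], A[:, order]

Z = X @ A                         # equation (3.1.2): scores from centred data
Z_tilde = X_raw @ A               # assumed earlier definition: scores from uncentred data

s_matches_np_cov = np.allclose(S, np.cov(X_raw, rowvar=False))
z_has_zero_means = np.allclose(Z.mean(axis=0), 0.0)
same_covariances = np.allclose(np.cov(Z, rowvar=False),
                               np.cov(Z_tilde, rowvar=False))
print(s_matches_np_cov, z_has_zero_means, same_covariances)
```

Centring changes only the means of the scores, because subtracting a constant vector from every observation leaves all variances and covariances unchanged.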
Another point to note is that the eigenvectors of $\frac{1}{n-1}\mathbf{X}'\mathbf{X}$ and $\mathbf{X}'\mathbf{X}$ are identical, and the eigenvalues of $\frac{1}{n-1}\mathbf{X}'\mathbf{X}$ are simply $\frac{1}{n-1}$ (the eigenvalues of $\mathbf{X}'\mathbf{X}$). Because of these relationships it will be convenient in some places below to work in terms of eigenvalues and eigenvectors of $\mathbf{X}'\mathbf{X}$, rather than directly with those of $\mathbf{S}$.
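The relationship between the two eigenproblems can be illustrated with a short sketch, again in Python with NumPy on simulated data (the data, seed, and variable names are assumptions made for illustration). Eigenvectors are only defined up to sign, so the comparison uses the absolute inner product of matching columns.

```python
import numpy as np

# Illustrative centred data with clearly separated variances.
rng = np.random.default_rng(1)
n, p = 40, 4
X = rng.normal(size=(n, p)) * np.array([3.0, 2.0, 1.0, 0.5])
X = X - X.mean(axis=0)

XtX = X.T @ X
S = XtX / (n - 1)

vals_XtX, vecs_XtX = np.linalg.eigh(XtX)   # both calls return ascending eigenvalues
vals_S, vecs_S = np.linalg.eigh(S)

# Eigenvalues of S are those of X'X divided by (n - 1).
eigenvalues_scaled = np.allclose(vals_S, vals_XtX / (n - 1))

# Matching eigenvectors agree up to sign: |a_k(S) . a_k(X'X)| should equal 1.
alignment = np.abs(np.sum(vecs_S * vecs_XtX, axis=0))
eigenvectors_match = np.allclose(alignment, 1.0)
print(eigenvalues_scaled, eigenvectors_match)
```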
Turning to the algebraic properties A1–A5 listed in Section 2.1, define

$$\mathbf{y}_i = \mathbf{B}'\mathbf{x}_i \quad \text{for } i = 1, 2, \ldots, n, \tag{3.1.3}$$

where $\mathbf{B}$, as in Properties A1, A2, A4, A5, is a $(p \times q)$ matrix whose columns are orthonormal. Then Properties A1, A2, A4, A5, still hold, but with the sample covariance matrix of the observations $\mathbf{y}_i$, $i = 1, 2, \ldots, n$, replacing $\boldsymbol{\Sigma}_y$, and with the matrix $\mathbf{A}$ now defined as having $k$th column $\mathbf{a}_k$, with $\mathbf{A}_q$, $\mathbf{A}_q^*$, respectively, representing its first and last $q$ columns. Proofs in all cases are similar to those for populations, after making appropriate substitutions of sample quantities in place of population quantities, and will not be repeated. Property A5 reappears as Property G3 in the next section and a proof will be given there.
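As a numerical illustration of the sample versions of Properties A1 and A2, the sketch below assumes that those properties (stated in Section 2.1, not reproduced here) say that the trace of the covariance matrix of $\mathbf{y}$ is maximized by $\mathbf{B} = \mathbf{A}_q$ and minimized by $\mathbf{B} = \mathbf{A}_q^*$. It compares these two choices against randomly generated orthonormal matrices. The data, the helper `trace_cov_y`, and the use of a QR factorization to draw orthonormal columns are all illustrative assumptions; a random search can support the property but does not prove it.

```python
import numpy as np

# Illustrative centred data.
rng = np.random.default_rng(2)
n, p, q = 60, 5, 2
X = rng.normal(size=(n, p)) * np.array([3.0, 2.0, 1.5, 1.0, 0.5])
X = X - X.mean(axis=0)
S = X.T @ X / (n - 1)

l, A = np.linalg.eigh(S)
l, A = l[::-1], A[:, ::-1]        # reorder so that l_1 >= l_2 >= ... >= l_p
A_q = A[:, :q]                    # first q columns of A
A_q_star = A[:, -q:]              # last q columns of A


def trace_cov_y(B):
    """Trace of the sample covariance matrix of y_i = B' x_i (hypothetical helper)."""
    Y = X @ B                     # row i holds y_i'; Y has zero column means because X does
    S_y = Y.T @ Y / (n - 1)       # equals B' S B
    return np.trace(S_y)


trace_first = trace_cov_y(A_q)
trace_last = trace_cov_y(A_q_star)

# Random (p x q) matrices with orthonormal columns, taken from reduced QR factorizations.
random_traces = []
for _ in range(200):
    B = np.linalg.qr(rng.normal(size=(p, q)))[0]
    random_traces.append(trace_cov_y(B))

print(trace_first >= max(random_traces), trace_last <= min(random_traces))
```

With $\mathbf{B} = \mathbf{A}_q$ the trace equals $l_1 + \cdots + l_q$, the sum of the $q$ largest eigenvalues of $\mathbf{S}$.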
The spectral decomposition, Property A3, also holds for samples in the form

$$\mathbf{S} = l_1\mathbf{a}_1\mathbf{a}_1' + l_2\mathbf{a}_2\mathbf{a}_2' + \cdots + l_p\mathbf{a}_p\mathbf{a}_p'. \tag{3.1.4}$$
The statistical implications of this expression, and the other algebraic properties, A1, A2, A4, A5, are virtually the same as for the corresponding population properties in Section 2.1, except that they must now be viewed in a sample context.
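The sample spectral decomposition (3.1.4) can be verified directly. The sketch below, in Python with NumPy on simulated data (all names and values are illustrative assumptions), rebuilds $\mathbf{S}$ from its eigenvalues and eigenvectors and also records how the error of the truncated sum shrinks as terms are added.

```python
import numpy as np

# Illustrative centred data.
rng = np.random.default_rng(3)
n, p = 30, 4
X = rng.normal(size=(n, p))
X = X - X.mean(axis=0)
S = X.T @ X / (n - 1)

l, A = np.linalg.eigh(S)
order = np.argsort(l)[::-1]       # decreasing eigenvalues l_1 >= ... >= l_p
l, A = l[order], A[:, order]

# Full sum of the p rank-one terms l_k a_k a_k' in (3.1.4).
S_rebuilt = sum(l[k] * np.outer(A[:, k], A[:, k]) for k in range(p))
reconstruction_ok = np.allclose(S, S_rebuilt)

# Frobenius-norm error of the sum truncated after m terms, for m = 1, ..., p.
partial_errors = []
for m in range(1, p + 1):
    S_m = (A[:, :m] * l[:m]) @ A[:, :m].T
    partial_errors.append(np.linalg.norm(S - S_m))

print(reconstruction_ok, partial_errors)
```

Each truncated sum omits the terms with the smallest eigenvalues, so the error after $m$ terms is $\sqrt{l_{m+1}^2 + \cdots + l_p^2}$ and falls to zero (up to rounding) at $m = p$.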