Page 41 - Jolliffe I. Principal Component Analysis

2

                              Mathematical and Statistical Properties of Population Principal Components

                              In this chapter many of the mathematical and statistical properties of PCs
                              are discussed, based on a known population covariance (or correlation)
                              matrix Σ. Further properties are included in Chapter 3 but in the context
                              of sample, rather than population, PCs. As well as being derived from a
                              statistical viewpoint, PCs can be found using purely mathematical
                              arguments; they are given by an orthogonal linear transformation of a set of
                              variables optimizing a certain algebraic criterion. In fact, the PCs optimize
                              several different algebraic criteria and these optimization properties,
                              together with their statistical implications, are described in the first section
                              of the chapter.
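
As a small illustration of PCs as an orthogonal linear transformation derived from a known Σ, the sketch below works through a two-variable case in plain Python. The matrix SIGMA, the helper names eig_sym_2x2 and variance_of, and the closed-form 2x2 eigen-solution are assumptions made for this example only; they are not taken from the text, and a numerical library would normally be used for larger matrices.

```python
import math

def eig_sym_2x2(sigma):
    """Eigenvalues and unit eigenvectors of a symmetric 2x2 matrix,
    ordered from largest to smallest eigenvalue (closed-form solution)."""
    (a, b), (_, d) = sigma
    centre = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    eigenvalues = [centre + radius, centre - radius]
    if b == 0:
        # Already diagonal: the coordinate axes are eigenvectors.
        vectors = [(1.0, 0.0), (0.0, 1.0)] if a >= d else [(0.0, 1.0), (1.0, 0.0)]
    else:
        vectors = []
        for lam in eigenvalues:
            v = (b, lam - a)  # solves (a - lam) * v1 + b * v2 = 0
            norm = math.hypot(v[0], v[1])
            vectors.append((v[0] / norm, v[1] / norm))
    return eigenvalues, vectors

def variance_of(alpha, sigma):
    """Variance of the linear combination alpha'x when cov(x) = sigma."""
    return sum(alpha[i] * sigma[i][j] * alpha[j]
               for i in range(2) for j in range(2))

# Hypothetical population covariance matrix (illustrative values only).
SIGMA = [[4.0, 2.0], [2.0, 3.0]]

eigenvalues, (a1, a2) = eig_sym_2x2(SIGMA)
var_pc1 = variance_of(a1, SIGMA)   # equals the largest eigenvalue
var_pc2 = variance_of(a2, SIGMA)   # equals the smallest eigenvalue
dot = a1[0] * a2[0] + a1[1] * a2[1]  # coefficient vectors are orthogonal

print("PC variances:", var_pc1, var_pc2)
print("sum of PC variances:", var_pc1 + var_pc2, "trace:", SIGMA[0][0] + SIGMA[1][1])
print("a1 . a2 =", dot)
```

In this sketch the first coefficient vector gives a linear combination whose variance is at least that of either original variable, and the two PC variances add up to the trace of SIGMA.
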
                                In addition to the algebraic derivation given in Chapter 1, PCs can also be
                              looked at from a geometric viewpoint. The derivation given in the original
                              paper on PCA by Pearson (1901) is geometric but it is relevant to samples,
                              rather than populations, and will therefore be deferred until Section 3.2.
                              However, a number of other properties of population PCs are also geometric
                              in nature and these are discussed in the second section of this chapter.
                                The first two sections of the chapter concentrate on PCA based on a
                              covariance matrix but the third section describes how a correlation, rather
                              than a covariance, matrix may be used in the derivation of PCs. It also
                              discusses the problems associated with the choice between PCAs based on
                              covariance versus correlation matrices.
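
To make the covariance-versus-correlation choice concrete, the following sketch compares the first PC obtained from a two-variable covariance matrix with the first PC obtained from the corresponding correlation matrix. The values in SIGMA and the helpers first_pc and to_correlation are hypothetical, chosen so that one variable has a much larger variance than the other; the example is only meant to indicate why the two analyses can differ.

```python
import math

def first_pc(sigma):
    """Largest eigenvalue and its unit eigenvector for a symmetric 2x2 matrix."""
    (a, b), (_, d) = sigma
    lam = (a + d) / 2.0 + math.hypot((a - d) / 2.0, b)
    if b == 0:
        # Diagonal case: the axis with the larger variance is the first PC.
        return lam, ((1.0, 0.0) if a >= d else (0.0, 1.0))
    v = (b, lam - a)  # solves (a - lam) * v1 + b * v2 = 0
    norm = math.hypot(v[0], v[1])
    return lam, (v[0] / norm, v[1] / norm)

def to_correlation(sigma):
    """Rescale a covariance matrix so that every variable has unit variance."""
    sd = [math.sqrt(sigma[i][i]) for i in range(2)]
    return [[sigma[i][j] / (sd[i] * sd[j]) for j in range(2)] for i in range(2)]

# Hypothetical covariance matrix: x1 has variance 100, x2 has variance 1.
SIGMA = [[100.0, 6.0], [6.0, 1.0]]
RHO = to_correlation(SIGMA)  # off-diagonal correlation is 0.6

cov_var, cov_pc1 = first_pc(SIGMA)
corr_var, corr_pc1 = first_pc(RHO)

print("covariance-based first PC:", cov_pc1)    # dominated by x1
print("correlation-based first PC:", corr_pc1)  # equal weight on x1 and x2
```

With these illustrative numbers the covariance-based first PC is almost entirely the high-variance variable, whereas the correlation-based first PC weights the two standardized variables equally.
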
                                In most of this text it is assumed that none of the variances of the PCs are
                              equal; nor are they equal to zero. The final section of this chapter explains
                              briefly what happens in the case where there is equality between some of
                              the variances, or when some of the variances are zero.
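
The two special cases can be illustrated with a pair of deliberately simple two-variable matrices. This is only a sketch: SIGMA_EQUAL, SIGMA_SINGULAR and the helper variance_of are hypothetical and are not the treatment given in the final section of the chapter.

```python
import math

def variance_of(alpha, sigma):
    """Variance of the linear combination alpha'x when cov(x) = sigma."""
    return sum(alpha[i] * sigma[i][j] * alpha[j]
               for i in range(2) for j in range(2))

# Equal variances: with Sigma = 2I every unit-length combination has variance 2,
# so no direction is preferred and the PCs are not uniquely defined.
SIGMA_EQUAL = [[2.0, 0.0], [0.0, 2.0]]
angles = [0.0, 0.3, 1.0, 2.5]
variances_equal = [
    variance_of((math.cos(t), math.sin(t)), SIGMA_EQUAL) for t in angles
]

# Zero variance: x1 and x2 are perfectly correlated with equal variances,
# so the combination (x1 - x2) / sqrt(2) has variance zero, i.e. it is constant.
SIGMA_SINGULAR = [[1.0, 1.0], [1.0, 1.0]]
s = 1.0 / math.sqrt(2.0)
var_last = variance_of((s, -s), SIGMA_SINGULAR)
var_first = variance_of((s, s), SIGMA_SINGULAR)

print("variances along several directions (equal case):", variances_equal)
print("first and last PC variances (singular case):", var_first, var_last)
```
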