Page 183 - Jolliffe I. Principal Component Analysis

                                    7. Principal Component Analysis and Factor Analysis
                                Of these three assumptions, the first is a standard assumption for error
                              terms in most statistical models, and the second is convenient and loses no
                              generality. The third may not be true, but if it is not, (7.1.2) can be simply
                              adapted to become x = µ + Λf + e, where E[x]= µ. This modification
                              introduces only a slight amount of algebraic complication compared with
                              (7.1.2), but (7.1.2) loses no real generality and is usually adopted.

                              (ii)  $E[\mathbf{e}\mathbf{e}'] = \boldsymbol{\Psi}$  (diagonal)

                                    $E[\mathbf{f}\mathbf{e}'] = \mathbf{0}$  (a matrix of zeros)

                                    $E[\mathbf{f}\mathbf{f}'] = \mathbf{I}_m$  (an identity matrix)
                                The first of these three assumptions is merely stating that the error terms
                              are uncorrelated, which is a basic assumption of the factor model, namely
                              that all of x which is attributable to common influences is contained in
                              Λf, and e_j, e_k, j ≠ k, are therefore uncorrelated. The second assumption,
                              that the common factors are uncorrelated with the specific factors, is also
                              a fundamental one. However, the third assumption can be relaxed so that
                              the common factors may be correlated (oblique) rather than uncorrelated
                              (orthogonal). Many techniques in factor analysis have been developed for
                              finding orthogonal factors, but some authors, such as Cattell (1978, p.
                              128), argue that oblique factors are almost always necessary in order to
                              get a correct factor structure. Such details will not be explored here as
                              the present objective is to compare factor analysis with PCA, rather than
                              to give a full description of factor analysis, and for convenience all three
                              assumptions will be made.
                             (iii) For some purposes, such as hypothesis tests to decide on an appropriate
                                 value of m, it is necessary to make distributional assumptions. Usually
                                 the assumption of multivariate normality is made in such cases but,
                                 as with PCA, many of the results of factor analysis do not depend on
                                 specific distributional assumptions.
                             (iv) Some restrictions are generally necessary on Λ, because without any
                                 restrictions there will be a multiplicity of possible Λs that give equally
                                 good solutions. This problem will be discussed further in the next
                                 section.
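Taken together, the zero-mean assumption on x and the three assumptions in (ii) imply that the covariance matrix of x is ΛΛ' + Ψ, since E[xx'] = Λ E[ff'] Λ' + Λ E[fe'] + E[ef'] Λ' + E[ee']. The following is a minimal numerical sketch of that identity and of the multiplicity of Λ mentioned in (iv), not a method taken from the text. The sizes p, m, n, the loadings Lambda, the specific variances psi, the angle theta and the use of normally distributed f and e are all hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: p observed variables, m common factors, n observations.
p, m, n = 5, 2, 200_000

# Hypothetical loadings and specific variances, chosen only for illustration.
Lambda = rng.normal(size=(p, m))
psi = rng.uniform(0.5, 1.5, size=p)          # diagonal elements of Psi

# Simulate x = Lambda f + e with E[ff'] = I_m, E[ee'] = Psi, E[fe'] = 0.
# Normality is an assumption of this sketch, not of the model itself.
f = rng.normal(size=(n, m))                   # common factors
e = rng.normal(size=(n, p)) * np.sqrt(psi)    # specific factors (errors)
x = f @ Lambda.T + e

# Covariance implied by the assumptions, and its sample counterpart.
sigma_model = Lambda @ Lambda.T + np.diag(psi)
sigma_sample = np.cov(x, rowvar=False)
max_abs_diff = np.abs(sigma_sample - sigma_model).max()

# Multiplicity of Lambda: post-multiplying by any orthogonal matrix T
# changes the loadings but leaves Lambda Lambda' + Psi unchanged.
theta = 0.7
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Lambda_rot = Lambda @ T
sigma_rot = Lambda_rot @ Lambda_rot.T + np.diag(psi)

print("largest |sample - model| covariance entry:", max_abs_diff)
print("rotated loadings give same covariance:", np.allclose(sigma_model, sigma_rot))
```

In this sketch Lambda and Lambda_rot differ, yet both reproduce the same covariance matrix, which is one way to see why some restriction on Λ is needed before it can be estimated uniquely.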



                              7.2 Estimation of the Factor Model

                              At first sight, the factor model (7.1.2) looks like a standard regression model
                              such as that given in Property A7 of Section 3.1 (see also Chapter 8). However,
                              closer inspection reveals a substantial difference from the standard
                              regression framework, namely that neither Λ nor f in (7.1.2) is known,
                              whereas in regression Λ would be known and f would contain the only unknown
                              parameters. This means that different estimation techniques must