Page 402 - Jolliffe I. Principal Component Analysis
13.7. PCA in Statistical Process Control
• One- or two-dimensional plots of PC scores. It was noted in Section
10.1 that both the first few and the last few PCs may be useful
for detecting (different types of) outliers, and plots of both are used
in process control. In the published discussion of Roes and Does
(1995), Sullivan et al. (1995) argue that the last few PCs are perhaps
more useful in SPC than the first few, but in their reply to
the discussion Roes and Does disagree. If p is not too large, such
arguments can be overcome by using a scatterplot matrix to display
all two-dimensional plots of PC scores simultaneously. Plots can be
enhanced by including equal-probability contours, assuming approximate
multivariate normality, corresponding to warning and action
limits for those points that fall outside them (Jackson, 1991, Section
1.7; Martin et al., 1999).
• Hotelling’s $T^2$. It was seen in Section 10.1 that this is a special case,
for $q = p$, of the statistic $d_{2i}^2$ in equation (10.1.2). If multivariate
normality is assumed, the distribution of $T^2$ is known, and control limits
can be set based on that distribution (Jackson, 1991, Section 1.7).
• The squared prediction error (SPE). This is none other than the
statistic $d_{1i}^2$ in equation (10.1.1). It was proposed by Jackson and
Mudholkar (1979), who constructed control limits based on an
approximation to its distribution. They prefer $d_{1i}^2$ to $d_{2i}^2$ for
computational reasons and because of its intuitive appeal as a sum of
squared residuals from the $(p - q)$-dimensional space defined by the
first $(p - q)$ PCs. However, Jackson and Hearne (1979) indicate that
the complement of $d_{2i}^2$, in which the sum of squares of the first few
rather than the last few renormalized PCs is calculated, may be useful
in process control when the objective is to look for groups of
‘out-of-control’ or outlying observations, rather than single outliers.
Their basic statistic is decomposed to give separate information about
variation within the sample (group) of potentially outlying observations,
and about the difference between the sample mean and some
known standard value. In addition, they propose an alternative statistic
based on absolute, rather than squared, values of PCs. Jackson
and Mudholkar (1979) also extend their proposed control procedure,
based on $d_{1i}^2$, to the multiple-outlier case, and Jackson (1991, Figure
6.2) gives a sequence of significance tests for examining subgroups of
observations in which each test is based on PCs in some way.
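
The three displays listed above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the procedure of any of the cited authors. The function names, the simulated data and the levels 0.05 and 0.01 are hypothetical. The contour check treats the eigenvalues as known and uses a chi-square contour with 2 degrees of freedom. The $T^2$ limit uses the F-based form commonly quoted for a new observation when the mean and covariance matrix are estimated from n in-control observations. The SPE limit follows the form in which the Jackson and Mudholkar (1979) approximation is usually stated; neither limit formula is given in the text above, so both should be checked against the original sources before use.

```python
import numpy as np
from scipy import stats


def fit_pca(X_ref):
    """Mean, eigenvalues and eigenvectors of the covariance of in-control data."""
    mean = X_ref.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X_ref, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # largest variance first
    return mean, eigvals[order], eigvecs[:, order]


def monitoring_stats(X, mean, eigvals, eigvecs, q):
    """Return PC scores, T^2 (all p PCs) and SPE (last q PCs) per observation."""
    Z = (X - mean) @ eigvecs                     # PC scores
    t2 = np.sum(Z**2 / eigvals, axis=1)          # d_{2i}^2 with q = p
    spe = np.sum(Z[:, -q:] ** 2, axis=1)         # d_{1i}^2
    return Z, t2, spe


def outside_contour(Z, eigvals, j, k, alpha):
    """True where the (PC j, PC k) score pair lies outside the 1 - alpha ellipse.

    Assumption: approximate normality, eigenvalues treated as known.
    """
    d2 = Z[:, j] ** 2 / eigvals[j] + Z[:, k] ** 2 / eigvals[k]
    return d2 > stats.chi2.ppf(1 - alpha, df=2)


def t2_limit(n, p, alpha):
    """Assumed F-based limit for a new observation (parameters estimated)."""
    f = stats.f.ppf(1 - alpha, p, n - p)
    return p * (n + 1) * (n - 1) / (n * (n - p)) * f


def spe_limit(eigvals, q, alpha):
    """Assumed Jackson-Mudholkar-type limit from the last q eigenvalues."""
    lam = eigvals[-q:]
    th1, th2, th3 = (np.sum(lam**r) for r in (1, 2, 3))
    h0 = 1 - 2 * th1 * th3 / (3 * th2**2)
    c = stats.norm.ppf(1 - alpha)
    term = c * np.sqrt(2 * th2 * h0**2) / th1 + 1 + th2 * h0 * (h0 - 1) / th1**2
    return th1 * term ** (1 / h0)


# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))                      # hypothetical mixing matrix
X_ref = rng.normal(size=(200, 5)) @ A            # in-control reference sample
X_new = rng.normal(size=(10, 5)) @ A
X_new[-1] += 6.0                                 # one deliberately shifted point

n, p, q = X_ref.shape[0], X_ref.shape[1], 2
mean, eigvals, eigvecs = fit_pca(X_ref)
Z, t2, spe = monitoring_stats(X_new, mean, eigvals, eigvecs, q)

warning = outside_contour(Z, eigvals, 0, 1, alpha=0.05)   # first two PCs
action = outside_contour(Z, eigvals, 0, 1, alpha=0.01)
t2_flag = t2 > t2_limit(n, p, alpha=0.01)
spe_flag = spe > spe_limit(eigvals, q, alpha=0.01)
```

Computing $T^2$ from the full set of PC scores gives the same value as the Mahalanobis distance computed directly from the covariance matrix, while the SPE equals the squared length of the residual left after projecting onto the first $(p - q)$ PCs. The same score matrix can therefore feed all three displays.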
Eggett and Pulsipher (1989) compare $T^2$, SPE, and the complement of
$d_{2i}^2$ suggested by Jackson and Hearne (1979), in a simulation study and find
the third of these statistics to be inferior to the other two. On the basis of
their simulations, they recommend Hotelling’s $T^2$ for large samples, with
SPE or univariate control charts preferred for small samples. They also
discussed the possibility of constructing CUSUM charts based on the three
statistics.
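
To make the CUSUM idea concrete, the following is a minimal sketch of a generic one-sided CUSUM recursion applied to a sequence of values of a monitoring statistic. It is not necessarily the construction Eggett and Pulsipher considered. The function name `upper_cusum`, the reference value `k`, the decision interval `h` and the input values are all hypothetical.

```python
import numpy as np


def upper_cusum(stat, k, h):
    """One-sided CUSUM: C_t = max(0, C_{t-1} + stat_t - k); signal when C_t > h."""
    c = 0.0
    path = np.empty(len(stat))
    for t, s in enumerate(stat):
        c = max(0.0, c + s - k)      # accumulate only excesses over k
        path[t] = c
    return path, path > h


# Hypothetical successive values of a statistic such as T^2 or SPE.
stat = np.array([1.0, 0.5, 1.2, 4.0, 5.0, 6.0])
path, signal = upper_cusum(stat, k=2.0, h=4.0)
```

Because the recursion accumulates excesses over `k`, a run of moderately elevated values can trigger a signal even when no single value would exceed a limit applied to each observation separately.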

