Mpci Pca

QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL
Qual. Reliab. Engng. Int. 2009; 25:69–77

Published online 5 August 2008 in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/qre.954
Research
Multivariate Process Capability Using Principal Component Analysis
R. L. Shinde1, ∗, † and K. G. Khadse2
1 Department of Statistics, School of Mathematical Sciences, North Maharashtra University, Jalgaon-425 001, India
2 Department of Statistics, M. J. College, Jalgaon-425 001, India
Wang and Chen (Qual. Eng. 1998; 11:21–27) have defined process capability indices
(PCIs) for multivariate normal processes data using principal component analysis
(PCA). Veevers (Statistical Process Monitoring and Optimization. Marcel Dekker:
New York, NY, 1999; 241–256) has suggested a multivariate capability index based on
the first principal component (PC). In this paper we demonstrate the problem in the
definition of PCIs given by Wang and Chen (Qual. Eng. 1998; 11:21–27) and the non-
suitability of PCI given by Veevers (Statistical Process Monitoring and Optimization.
Marcel Dekker: New York, NY, 1999; 241–256) through some examples. We also
suggest an alternative method for assessing multivariate process capability based on
the empirical probability distribution of PCs. This method has been performed on
industrial and simulated data. Copyright © 2008 John Wiley & Sons, Ltd.
KEY WORDS: Multivariate Process Capability Index; specification region; principal component analysis;
multivariate normal distribution
1. INTRODUCTION
Some authors have recently proposed alternative definitions of Multivariate Process Capability Indices
(MPCIs) based on different approaches. In general, MPCIs are constructed using (a) the ratio of a tolerance
region to a process region, (b) the probability of non-conforming product, (c) principal component
analysis (PCA) and (d) other approaches based on a loss function. Carr1 has suggested that one
might just use the expected proportion of non-conforming product as the process capability index.
Wang et al.2 have compared three MPCIs3–5 through graphical and computational examples.
They have also discussed the usefulness of these indices. Recently, Shinde and Khadse6
have reviewed and compared six MPCIs3–5,7–9 based on the fraction conforming interpretation.
In this paper, we identify a major problem in the determination of the specification region of the principal
components (PCs) in the definition of PCIs given by Wang and Chen7. We also observe that Veevers'10
index based on the first PC gives a misleading picture of the real capability of multivariate processes. Section 2
discusses MPCIs based on PCA with some examples. In Section 3, we propose an
alternative way of assessing multivariate process capability using the empirical probability distribution of the PCs.
Section 4 summarizes the investigation with concluding remarks.
∗ Correspondence to: R. L. Shinde, Department of Statistics, School of Mathematical Sciences, North Maharashtra University, Jalgaon-425 001, India.
† E-mail: ramkrishnashinde@yahoo.co.in
2. MPCIs BASED ON PCA
In the literature we found three research papers on MPCIs based on PCA, namely Wang and Chen7 , Wang
and Du11 and Veevers10 .
2.1. MPCI-I [Wang and Chen7 , Wang and Du11 ]

Wang and Chen7 have applied the PCA to construct MPCIs based on the multivariate normal process data.
They have defined MPCIs MC p , MC pk , MC pm and MC pmk using the univariate PCIs of the PCs. These
indices were further studied by Wang and Du11 and they have proposed interval estimators of MC p and
MC pk for multivariate normal process data. They have also suggested a new index MCPC based on PCA
with its interval estimator for non-multivariate normal data. A small subset of the PCs typically accounts for
about 90% of the process variability, so working with this subset reduces the dimensionality of the
multivariate quality characteristic problem. The capability index for the multivariate process is defined by Wang and Chen7 as
\[ MC_p = \left( \prod_{i=1}^{v} C_{p;PC_i} \right)^{1/v} \tag{1} \]
where
\[ C_{p;PC_i} = \frac{USL_{PC_i} - LSL_{PC_i}}{6\sigma_{PC_i}} \]
with $\sigma_{PC_i}$ the standard deviation of the $i$th PC.
$C_{p;PC_i}$ represents the univariate measure of potential process capability for the $i$th PC and $v$ denotes the
number of PCs comprising around 90% of the process variability. The engineering specifications of the $PC_i$s
and their target values as used by Wang and Chen7 and Wang and Du11 are
\[ LSL_{PC_i} = u_i' LSL, \quad USL_{PC_i} = u_i' USL, \quad T_{PC_i} = u_i' T \tag{2} \]
where $u_1, u_2, \ldots, u_p$ are the eigenvectors of $\Sigma$.
Similarly, they have defined $MC_{pk}$, $MC_{pm}$ and $MC_{pmk}$ by replacing $C_{p;PC_i}$ with $C_{pk;PC_i}$, $C_{pm;PC_i}$ and
$C_{pmk;PC_i}$, respectively, for $i = 1, 2, \ldots, v$.
Remark 2.1. We observed that the formulae for USLPCi and LSLPCi obtained by Wang and Chen7 and
Wang and Du11 are incorrect because they have assumed that specification limits of different PCs are
independent of each other. In fact, only the distributions of PCs are independent but their specification limits
are interrelated. This can be observed through the following example.
Example 2.1 (Wang and Chen7). Suppose $X = (X_1, X_2) \sim N_2(\mu, \Sigma)$ and the specification region for
$(X_1, X_2)$ is $S = \{(x_1, x_2) \mid 112.7 \le x_1 \le 241.3 \text{ and } 32.7 \le x_2 \le 73.3\}$, where $X_1$ is the Brinell hardness and $X_2$
is the tensile strength of a process. On the basis of a random sample of size 25, Wang and Chen7 have
obtained the estimates of $\mu$ and $\Sigma$ as
\[ \hat{\mu} = (177.2, 52.32) \quad \text{and} \quad \hat{\Sigma} = \begin{bmatrix} 337.8000 & 85.3308 \\ 85.3308 & 33.6247 \end{bmatrix} \]
The two PCs of $\hat{\Sigma}$, $PC_1$ and $PC_2$, are
\[ PC_1 = Y_1 = u_1' X = 0.967499 X_1 + 0.252873 X_2 \sim N(184.6711, 360.103) \]
and
\[ PC_2 = Y_2 = u_2' X = -0.252873 X_1 + 0.967499 X_2 \sim N(5.8104, 11.322) \]
where $u_1$ and $u_2$ are the eigenvectors of $\hat{\Sigma}$.
Figure 1. Specification region (S) for original variables X1 and X2
Figure 2. Specification region (S1) for PCs Y1 and Y2 as obtained by Wang and Chen7
Figure 3. Correct specification region (S2) for PCs Y1 and Y2
Using the relation mentioned in Equation (2), Wang and Chen7 have obtained the specification region of
PCs as S1 = {(y1 , y2 )|117.3060 ≤ y1 ≤ 251.9930, 3.1384 ≤ y2 ≤ 9.8994}, which is not correct.
The correct specification region is
\[ S_2 = \left\{ (y_1, y_2) \;\middle|\; \begin{array}{l} 112.7 \le 0.967499 y_1 - 0.252873 y_2 \le 241.3 \\ 32.7 \le 0.252873 y_1 + 0.967499 y_2 \le 73.3 \end{array} \right\} \]
These regions are also shown in Figures 1–3.

We have evaluated the proportion of conforming products based on (i) the original specification region S
of X, (ii) the specification region S1 given by Wang and Chen7 and (iii) the correct specification region S2
of PCs as given above. These values are, respectively, as follows:
\[ \hat{p}_S = P(X \in S) = 0.9991 \]
\[ \hat{p}_{S_1} = P(Y \in S_1) = 0.6740 \ne 0.9991 \]
\[ \hat{p}_{S_2} = P(Y \in S_2) = 0.9991 \]
This example clearly indicates that there is a major problem in Wang and Chen’s approach of finding the
specification region for PCs. Subsequently we have also identified the same type of problem in obtaining
the specification region of PCs in another example of three variables discussed by Wang and Chen7 .
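The figures in Example 2.1 can be checked numerically. The following is a minimal sketch, not the authors' code: it takes the eigenvectors, means and variances quoted in the example, rebuilds the limits of S1 from Equation (2), evaluates P(Y ∈ S1) as a product of two univariate normal probabilities (using the independence of the PCs), and estimates P(Y ∈ S2) by simulation. The helper names (`normal_cdf`, `project`, `in_s2`), the seed and the sample size are illustrative assumptions.

```python
import math
import random

def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Values quoted in Example 2.1
u1 = (0.967499, 0.252873)
u2 = (-0.252873, 0.967499)
lsl = (112.7, 32.7)
usl = (241.3, 73.3)
mean_y = (184.6711, 5.8104)
sd_y = (math.sqrt(360.103), math.sqrt(11.322))

def project(u, x):
    """Inner product u'x for two-dimensional vectors."""
    return u[0] * x[0] + u[1] * x[1]

# Equation (2): limits of S1 obtained by projecting LSL and USL on each eigenvector
s1 = [(project(u, lsl), project(u, usl)) for u in (u1, u2)]

# P(Y in S1): S1 is rectangular and Y1, Y2 are independent, so probabilities multiply
p_s1 = 1.0
for (lo, hi), m, sd in zip(s1, mean_y, sd_y):
    p_s1 *= normal_cdf((hi - m) / sd) - normal_cdf((lo - m) / sd)

def in_s2(y1, y2):
    """Membership in S2: map (y1, y2) back to (x1, x2) and test the original limits."""
    x1 = u1[0] * y1 + u2[0] * y2
    x2 = u1[1] * y1 + u2[1] * y2
    return lsl[0] <= x1 <= usl[0] and lsl[1] <= x2 <= usl[1]

# P(Y in S2) estimated by simulation (seed and sample size are arbitrary choices)
rng = random.Random(0)
n = 20000
hits = sum(
    in_s2(rng.gauss(mean_y[0], sd_y[0]), rng.gauss(mean_y[1], sd_y[1]))
    for _ in range(n)
)
p_s2 = hits / n

print(s1, round(p_s1, 4), round(p_s2, 4))
```

The low value of P(Y ∈ S1) comes almost entirely from the second PC, whose projected limits cut well inside its distribution.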
Hence, there is a need for another methodology for assessing process capability for multivariate data
based on PCA. An alternative approach for this purpose is discussed in Section 3.
2.2. MPCI-II [Veevers10 ]

While developing an index based on PCA, Veevers10 considered a situation where the marginal specifications
have ranges $2d_i$, $i = 1, 2, \ldots, p$. Let $X \sim N_p(\mu_X, \Sigma_X)$ and transform $X$ to $Y$ with elements $Y_i = X_i/d_i$.
Then $Y$ follows a multivariate normal distribution with variance–covariance matrix $\Sigma_Y$. Veevers10 has
defined an MPCI based on PCA as
\[ MPPC = \frac{1 + \sqrt{2}}{6\sqrt{\lambda_1}} \tag{3} \]
where $\lambda_1$ denotes the eigenvalue associated with the first PC of $\Sigma_Y$, that is, the standard deviation of the
first PC of $\Sigma_Y$ is $\sqrt{\lambda_1}$.
Remark 2.2. Let us consider an example of on-target processes, given in Table I, to see
how MPPC can mislead the assessment of process capability.
Table I summarizes the values of the process capability index MPPC as well as the proportion conforming,
in parts per million (PPM), under the assumption of bivariate normality for each process
distribution. The specification region for $(X_1, X_2)$ is
\[ S = \{(x_1, x_2) \mid -3 \le x_1 \le 3,\ -3 \le x_2 \le 3\}, \quad \text{with targets } T_1 = 0,\ T_2 = 0 \]
On the basis of the values of MPPC in Table I we would conclude that process A is the best and process D
is the worst. In fact, the actual proportions conforming in PPM show that
process D is the best and process A is the worst.
Therefore, MPPC as defined by Veevers10 gives a misleading measure of process capability, since the PPM
conforming is itself one of the well-known measures of process capability. We also note that MPPC
given by Veevers10 is applicable only for a rectangular specification region.
Table I. MPPC and proportion conforming for different processes
Process   Process parameters                               MPPC     Actual proportion conforming (PPM)
A         μ1 = 0, μ2 = 0, σ1 = 1, σ2 = 1, ρ = 0            1.2071   994 695
B         μ1 = 0, μ2 = 0, σ1 = 1, σ2 = 1, ρ = 0.50         0.9856   994 848
C         μ1 = 0, μ2 = 0, σ1 = 1, σ2 = 1, ρ = 0.75         0.9124   995 262
D         μ1 = 0, μ2 = 0, σ1 = 1, σ2 = 1, ρ = 0.9999       0.8535   997 297
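The MPPC column of Table I can be recomputed from Equation (3). The sketch below is illustrative only and assumes the bivariate case with $d_1 = d_2 = 3$ (half the width of the specification range used in Remark 2.2); the function names `largest_eigenvalue_2x2` and `mppc` are hypothetical helpers, not taken from Veevers10.

```python
import math

def largest_eigenvalue_2x2(a, b, c):
    """Largest eigenvalue of the symmetric matrix [[a, b], [b, c]] (closed form)."""
    return 0.5 * (a + c) + math.sqrt((0.5 * (a - c)) ** 2 + b * b)

def mppc(sigma1, sigma2, rho, d1, d2):
    """Equation (3) for a bivariate process after scaling Y_i = X_i / d_i."""
    # Covariance matrix of Y: variances (sigma_i / d_i)^2 and the scaled covariance
    a = (sigma1 / d1) ** 2
    c = (sigma2 / d2) ** 2
    b = rho * sigma1 * sigma2 / (d1 * d2)
    lam1 = largest_eigenvalue_2x2(a, b, c)
    return (1.0 + math.sqrt(2.0)) / (6.0 * math.sqrt(lam1))

# Processes A to D of Table I: unit variances, specification limits -3 to 3, so d_i = 3
table_rho = {"A": 0.0, "B": 0.50, "C": 0.75, "D": 0.9999}
mppc_values = {name: mppc(1.0, 1.0, rho, 3.0, 3.0) for name, rho in table_rho.items()}

for name, value in mppc_values.items():
    print(name, value)
```

The computed values agree with the MPPC column of Table I up to the last printed digit. As the correlation increases, $\lambda_1$ grows and MPPC falls, while the proportion conforming in the table rises, which is the reversal described in Remark 2.2.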
3. AN ALTERNATIVE METHOD
In this section we provide an alternative method for assessing multivariate process capability based on the
empirical probability distribution of PCs. Let $X \sim N_p(\mu, \Sigma)$, where $\Sigma$ is a positive-definite matrix.
$T_X = (T_1, T_2, \ldots, T_p)$ : Target vector for $X$
$LSL_X = (LSL_1, LSL_2, \ldots, LSL_p)$ : Lower specification vector for $X$
$USL_X = (USL_1, USL_2, \ldots, USL_p)$ : Upper specification vector for $X$
$S = \{x \mid LSL_X \le x \le USL_X\}$ : Hyperrectangular specification region for $X$
$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$ : Eigenvalues of $\Sigma$
$u_1, u_2, \ldots, u_p$ : Eigenvectors of $\Sigma$
$Y_i = u_i' X$ : $i$th PC ($i = 1, 2, \ldots, p$)
i.e. $Y = U' X$, $Y$ : vector of PCs, where $U = (u_1, u_2, \ldots, u_p)$
$E(Y) = U' \mu$ and $V(Y) = \mathrm{Diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$
$T_Y = U' T$ : Target vector for $Y$
$V = \{y \mid LSL_X \le U y \le USL_X\}$ : Specification region for $Y$
Here we note that $V$ is not hyperrectangular and is complex, because it is a set of $2p$ linear inequalities
in $p$ variables. In addition,
\[ Y_i \sim N(u_i' \mu, \lambda_i), \quad i = 1, 2, \ldots, p, \quad \text{and } Y_1, Y_2, \ldots, Y_p \text{ are independent} \]
3.1. Probability-based Index using the first k PCs

Consider the first $k$ PCs $Y_1, Y_2, \ldots, Y_k$ (generally explaining approximately 90% of the process variation). In order
to carry out the process capability study based on the first $k$ PCs, we first obtain the specification region
$V$ for the first $k$ PCs $Y_1, \ldots, Y_k$ by taking $Y_i = E(Y_i)$ for $i = k+1, \ldots, p$. One possible justification for this is
that $Y_{k+1}, \ldots, Y_p$ have much less variation than $Y_1, \ldots, Y_k$.
Then we have
\[ V = \left\{ (y_1, y_2, \ldots, y_k) \;\middle|\; \begin{array}{l} LSL_X \le U y \le USL_X, \text{ where } y = (y_1, y_2, \ldots, y_p) \\ \text{such that } y_r = E(Y_r),\ r = k+1, \ldots, p \end{array} \right\} \]
We define the probability-based MPCIs as
\[ Mp_1 = P\{Y = (Y_1, Y_2, \ldots, Y_k) \in V \mid Y \sim N_k(\mu_Y = T_Y,\ \Sigma_Y = \mathrm{diag}(\lambda_1, \ldots, \lambda_k))\} \]
\[ Mp_2 = P\{Y = (Y_1, Y_2, \ldots, Y_k) \in V \mid Y \sim N_k(\mu_Y,\ \Sigma_Y = \mathrm{diag}(\lambda_1, \ldots, \lambda_k))\} \]
Note that $Mp_1$ is analogous to $MC_p$ and $Mp_2$ is analogous to $MC_{pk}$.
If $Mp_1 \ge 0.9973$, the process is potentially capable,
and if $Mp_2 \ge 0.9973$, the process is actually capable.
3.2. Computation of Mp1 and Mp2

As the specification region V for (Y1, Y2, ..., Yk) is complicated (it is a set of 2k inequalities),
it is difficult to compute Mp1 and Mp2 exactly, because this requires evaluating multiple integrals over
a complicated region.
3.3. Empirical approach

Generate two random samples of a large size $N$ ($\ge 20\,000$) from the distribution of the first $k$ PCs, with
mean vectors as follows:
Sample I: Mean vector at target, i.e. $T_Y = (T_{Y_1}, T_{Y_2}, \ldots, T_{Y_k})$
Sample II: Mean vector $\mu_Y = (\mu_{Y_1}, \mu_{Y_2}, \ldots, \mu_{Y_k})$
Then the estimates of $Mp_1$ and $Mp_2$ based on the empirical approach are
\[ \hat{M}p_1 = \frac{\text{Number of observations } y = (y_1, y_2, \ldots, y_k) \text{ from sample I such that } y \in V}{N} \tag{4} \]
\[ \hat{M}p_2 = \frac{\text{Number of observations } y = (y_1, y_2, \ldots, y_k) \text{ from sample II such that } y \in V}{N} \tag{5} \]
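The estimators (4) and (5) amount to counting simulated points that fall inside V. The following is a minimal sketch under stated assumptions: the first k PCs are simulated as independent normals with variances $\lambda_1, \ldots, \lambda_k$, and the region V is supplied as a user-written membership function. The function name `estimate_capability`, the `in_region` callback, the seed and the toy one-dimensional region are all illustrative and do not come from the paper.

```python
import math
import random

def estimate_capability(mean, eigenvalues, in_region, n=20000, seed=0):
    """Fraction of n simulated PC vectors that fall in the specification region.

    mean        : mean vector of the first k PCs (target for Mp1, process mean for Mp2)
    eigenvalues : variances lambda_1..lambda_k of the first k PCs
    in_region   : function taking a list y of length k and returning True if y is in V
    """
    rng = random.Random(seed)
    sds = [math.sqrt(lam) for lam in eigenvalues]
    hits = 0
    for _ in range(n):
        # PCs are independent, so each coordinate is drawn separately
        y = [rng.gauss(m, s) for m, s in zip(mean, sds)]
        if in_region(y):
            hits += 1
    return hits / n

# Toy check with k = 1: V is the interval [-3, 3] and the PC has unit variance
def toy_region(y):
    return -3.0 <= y[0] <= 3.0

mp1_toy = estimate_capability([0.0], [1.0], toy_region)  # mean at target (sample I)
mp2_toy = estimate_capability([1.0], [1.0], toy_region)  # mean shifted off target (sample II)
print(mp1_toy, mp2_toy)
```

Because the region enters only through `in_region`, the same routine applies to non-rectangular regions; only the membership test changes.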
3.4. Illustrative examples

Example 3.1 (Based on automobile industry data). We have collected a real-life trivariate data set from the
automobile industry, where $X_1$ is the bore diameter, $X_2$ the bore depth and $X_3$ the groove diameter of
automobile products, and we observed that $X = (X_1, X_2, X_3)$ follows a trivariate normal distribution. The
specification region $S$ for $(X_1, X_2, X_3)$ is
\[ S = \{x \mid 19.055 \le x_1 \le 19.102,\ 151.7 \le x_2 \le 152.3,\ 25.2 \le x_3 \le 25.41\} \]
and $T = (19.0785, 152, 25.305)$.
The estimates of process parameters based on a random sample of size 25 are
\[ \hat{\mu} = (19.079840, 152.137600, 25.233600) \]
and
\[ \hat{\Sigma} = \begin{bmatrix} 0.00004497 & 0.00000376 & 0.00000643 \\ 0.00000376 & 0.00509400 & -0.00057016 \\ 0.00000643 & -0.00057016 & 0.00044900 \end{bmatrix} \]
3.5. Computation of M̂ p1 and M̂ p2

The eigenvalue–eigenvector pairs of $\hat{\Sigma}$ are
\[ \lambda_1 = 0.00516296, \quad u_1 = (-0.00057970,\ -0.99276444,\ 0.12007672) \]
\[ \lambda_2 = 0.00038017, \quad u_2 = (0.02039843,\ 0.12004002,\ 0.99255946) \]
\[ \lambda_3 = 0.00004483, \quad u_3 = (-0.99979176,\ 0.00302476,\ 0.02018125) \]
Therefore, the PCs become
\[ Y_1 = u_1' X = -0.00057970 X_1 - 0.99276444 X_2 + 0.12007672 X_3 \]
\[ Y_2 = u_2' X = 0.02039843 X_1 + 0.12004002 X_2 + 0.99255946 X_3 \]
\[ Y_3 = u_3' X = -0.99979176 X_1 + 0.00302476 X_2 + 0.02018125 X_3 \]
Here, the first PC explains 92.39% of the total sample variance. Hence, we consider only the first PC
to study the process capability. In addition, we note that $Y_1$, $Y_2$ and $Y_3$ are independent normal variates with
parameters $(\mu, \sigma^2) = (-148.017892,\ 0.07185377^2)$, $(43.697648,\ 0.01949811^2)$ and
$(-18.106440,\ 0.00669567^2)$, respectively.
The target value and engineering specification region $V$ for $Y_1$ are given as follows:
\[ T_{Y_1} = -147.872714 \]
\[ V = \{y_1 \mid -148.181477 \le y_1 \le -147.577104\} \]
Mp1 is estimated by taking a random sample of size 20 000 from $Y_1$ with $\mu = T_{Y_1} = -147.872714$ and
$\sigma^2 = 0.07185377^2$ and counting the number of observations out of 20 000 that satisfy the constraints in V,
as in (4). In a similar manner, Mp2 is estimated using (5), with $\mu = -148.017892$. The estimated values of
Mp1 and Mp2 based on our sample are 0.999950 and 0.988650, respectively.
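The interval V of Example 3.1 and the two estimates can be reproduced approximately from the quoted eigenvectors. The sketch below is illustrative and is not the authors' program: it fixes $Y_2$ and $Y_3$ at their means, turns each marginal limit on $x_j$ into a bound on $y_1$, intersects the bounds, and then simulates $Y_1$. The helper names (`dot`, `empirical`) and the seed are assumptions, and the simulated values differ slightly from the reported ones because the random draws differ.

```python
import math
import random

# Eigenvectors, first eigenvalue, estimates and specifications quoted in Example 3.1
u = [(-0.00057970, -0.99276444, 0.12007672),
     (0.02039843, 0.12004002, 0.99255946),
     (-0.99979176, 0.00302476, 0.02018125)]
lam1 = 0.00516296
mu_hat = (19.079840, 152.137600, 25.233600)
target = (19.0785, 152.0, 25.305)
lsl = (19.055, 151.7, 25.2)
usl = (19.102, 152.3, 25.41)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

mean_y = [dot(ui, mu_hat) for ui in u]   # E(Y1), E(Y2), E(Y3)
target_y1 = dot(u[0], target)            # target value of Y1

# With Y2, Y3 fixed at their means, x_j = u1[j] * y1 + offset_j (row j of x = U y)
lo, hi = -math.inf, math.inf
for j in range(3):
    slope = u[0][j]
    offset = u[1][j] * mean_y[1] + u[2][j] * mean_y[2]
    a = (lsl[j] - offset) / slope
    b = (usl[j] - offset) / slope
    lo = max(lo, min(a, b))   # min/max handles negative slopes
    hi = min(hi, max(a, b))

def empirical(mean, sd, n=20000, seed=1):
    """Proportion of n simulated values of Y1 that fall in [lo, hi]."""
    rng = random.Random(seed)
    return sum(lo <= rng.gauss(mean, sd) <= hi for _ in range(n)) / n

sd1 = math.sqrt(lam1)
mp1_hat = empirical(target_y1, sd1)   # sample I: mean at target
mp2_hat = empirical(mean_y[0], sd1)   # sample II: mean at estimated process mean
print(lo, hi, mp1_hat, mp2_hat)
```

In this example the binding constraint is the one on $x_2$ (bore depth), since $Y_1$ is dominated by $X_2$; the limits on $x_1$ and $x_3$ give wider intervals for $y_1$.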
Example 3.2 (Wang and Chen7, Wang et al.2). Suppose $X \sim N_3(\mu, \Sigma)$ and the specification region ($S$), in
the shape of a frustum or lampshade, for $(X_1, X_2, X_3)$ is
\[ S = \left\{ x \;\middle|\; 0 \le x_2^2 + x_3^2 \le \left( \frac{11.5 - x_1}{2} \right)^2 \text{ for } 9 \le x_1 \le 11 \right\} \quad \text{and} \quad T = (10, 0, 0) \]
We consider the estimates of process parameters based on a simulated sample of size 100 from
$X \sim N_3(\mu, \Sigma)$ as
\[ \hat{\mu} = (10.016267,\ -0.009948,\ -0.009058) \]
and
\[ \hat{\Sigma} = \begin{bmatrix} 0.086895 & 0.003207 & 0.002536 \\ 0.003207 & 0.006363 & 0.002315 \\ 0.002536 & 0.002315 & 0.005846 \end{bmatrix} \]
3.6. Computation of M̂ p1 and M̂ p2

The eigenvalue–eigenvector pairs of $\hat{\Sigma}$ are
\[ \lambda_1 = 0.087107, \quad u_1 = (-0.998652,\ -0.040591,\ -0.032327) \]
\[ \lambda_2 = 0.008222, \quad u_2 = (0.051797,\ -0.742382,\ -0.667971) \]
\[ \lambda_3 = 0.003774, \quad u_3 = (0.003114,\ -0.668745,\ 0.743484) \]
Therefore, the PCs become
\[ Y_1 = u_1' X = -0.998652 X_1 - 0.040591 X_2 - 0.032327 X_3 \]
\[ Y_2 = u_2' X = 0.051797 X_1 - 0.742382 X_2 - 0.667971 X_3 \]
\[ Y_3 = u_3' X = 0.003114 X_1 - 0.668745 X_2 + 0.743484 X_3 \]
Here the first PC explains 87.89% of the total sample variance. Hence, we consider only the first PC to study
the process capability. In addition, we note that $Y_1$, $Y_2$ and $Y_3$ are independent normal variates with parameters
$(\mu, \sigma^2) = (-10.002076,\ 0.295139^2)$, $(0.532258,\ 0.090680^2)$ and $(0.031118,\ 0.061437^2)$, respectively.
The target value and engineering specification region $V$ for $Y_1$ are given as follows:
\[ T_{Y_1} = -9.986527 \]
and
\[ V = \left\{ y_1 \;\middle|\; 0 \le a^2 + b^2 \le \left( \frac{11.5 - (-0.998652 y_1 + 0.027666)}{2} \right)^2 \text{ for } -10.987144 \le y_1 \le -8.984445 \right\} \]
where $a = -0.040591 y_1 - 0.415948$ and $b = -0.032327 y_1 - 0.332397$.
Taking a random sample of size 20 000 from $Y_1$ with $\mu = T_{Y_1} = -9.986527$ and $\sigma = 0.295139$ and then
counting the number of observations out of 20 000 that satisfy the constraints in V, Mp1 is
estimated. In a similar manner, Mp2 is estimated using (5). The estimated values of Mp1 and Mp2 for our
sample are 0.999303 and 0.999271, respectively.
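For the non-rectangular region of Example 3.2, the same counting procedure applies once each simulated $y_1$ is mapped back to $(x_1, x_2, x_3)$. The following is a minimal sketch using the coefficients quoted above, with $Y_2$ and $Y_3$ held at their means; the function names (`in_frustum`, `x_from_y1`, `empirical`) and the seed are illustrative assumptions, and the simulated values will not match the reported ones exactly.

```python
import random

def in_frustum(x1, x2, x3):
    """Membership in the frustum-shaped region S of Example 3.2."""
    return 9.0 <= x1 <= 11.0 and x2 ** 2 + x3 ** 2 <= ((11.5 - x1) / 2.0) ** 2

def x_from_y1(y1):
    """Map y1 back to (x1, x2, x3) with Y2 and Y3 fixed at their means (quoted coefficients)."""
    x1 = -0.998652 * y1 + 0.027666
    x2 = -0.040591 * y1 - 0.415948   # the quantity a in the definition of V
    x3 = -0.032327 * y1 - 0.332397   # the quantity b in the definition of V
    return x1, x2, x3

def empirical(mean, sd, n=20000, seed=2):
    """Proportion of n simulated values of Y1 whose back-transformed point lies in S."""
    rng = random.Random(seed)
    return sum(in_frustum(*x_from_y1(rng.gauss(mean, sd))) for _ in range(n)) / n

sd1 = 0.295139
mp1_hat = empirical(-9.986527, sd1)    # sample I: mean at the target value of Y1
mp2_hat = empirical(-10.002076, sd1)   # sample II: mean at the estimated mean of Y1
print(mp1_hat, mp2_hat)
```

With only the first PC retained, $a^2 + b^2$ stays far below the radial bound over the admissible range of $y_1$, so in this sketch the constraint $9 \le x_1 \le 11$ is the one that determines the estimates.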
4. CONCLUDING REMARKS
In this paper, after obtaining the specification region (V) for the PCs, we have suggested a modified specification
region (V) for the first k PCs. On the basis of random samples simulated from the distribution of the
first k PCs of $\Sigma$ (or $\hat{\Sigma}$), we have obtained the empirical estimates of the probability-based indices Mp1 and
Mp2. We note that even though the empirical approach is old and commonly used in practice, here it also
enables us to consider specification regions other than hyperrectangular ones. In fact, non-hyperrectangular
specification regions are either not considered by other existing MPCIs or make their computation
considerably more complicated.
REFERENCES
1. Carr WE. A New Process Capability Index: Parts per million. Quality Progress 1991; 24:152.
2. Wang FK, Hubele NF, Lawrence FP, Miskulin JD, Shahriari H. Comparison of three multivariate process capability
indices. Journal of Quality Technology 2000; 32:263–275.
3. Taam W, Subbaiah P, Liddy JW. A note on multivariate capability indices. Journal of Applied Statistics 1993;
20:339–351.
4. Chen H. A multivariate process capability index over a rectangular solid tolerance zone. Statistica Sinica 1994;
4:749–758.
5. Shahriari H, Hubele NF, Lawrence FP. A multivariate process capability vector. Proceedings of the 4th Industrial
Engineering Research Conference, Institute of Industrial Engineers, 1995; 304–309.
6. Shinde RL, Khadse KG. A review and comparison of some multivariate process capability indices based on fraction
conforming interpretation. Statistical Methods 2005; 7:95–115.
7. Wang FK, Chen JC. Capability index using principal components analysis. Quality Engineering 1998; 11:21–27.
8. Veevers A. Viability and capability indices for multiresponse processes. Journal of Applied Statistics 1998; 25:545–558.
9. Braun L. New methods in multivariate statistical process control. Unpublished, 2001.
10. Veevers A. Capability indices for multiresponse processes. Statistical Process Monitoring and Optimization, Park SH,
Vining GG (eds.). Marcel Dekker: New York, NY, 1999; 241–256.
11. Wang FK, Du TCT. Using principal component analysis in process performance for multivariate data. Omega 2000;
28:185–194.
Authors’ biographies
R. L. Shinde received his BSc (1990), MSc (1992) and MPhil (1994) in Statistics from the University
of Pune (India) and received his PhD (2001) in Statistics from Sardar Patel University, Anand (India).
He is currently an Associate Professor and Head of the Department of Statistics at North Maharashtra
University, Jalgaon (India). He is a life member of Indian Statistical Association (ISA), Indian Association
for Productivity, Quality and Reliability (IAPQR) and Indian Society for Probability and Statistics (ISPS).
He has guided three candidates towards PhD. His research activity includes (i) distributions of run, scans
and patterns and their applications, (ii) reliability, (iii) statistical process control, (iv) actuarial statistics and
(v) clinical trials.
K. G. Khadse received his BSc (1986) and MSc (1988) in Statistics from the University of Pune (India).
He is currently working as a Lecturer in Statistics at M. J. College, Jalgaon (India). He has submitted his
PhD thesis entitled ‘On Multivariate Process Capability Indices’ to North Maharashtra University, Jalgaon
(India).