John J. Peterson
To cite this article: John J. Peterson (2004) A Posterior Predictive Approach to Multiple
Response Surface Optimization, Journal of Quality Technology, 36:2, 139-153, DOI:
10.1080/00224065.2004.11980261
In this paper we present an approach to multiple response surface optimization that not only provides
optimal operating conditions, but also measures the reliability of an acceptable quality result for any set of
operating conditions. The most utilized multiple response optimization approaches of "overlapping mean
responses" or the desirability function do not take into account the variance-covariance structure of the data
nor the model parameter uncertainty. Some of the quadratic loss function approaches take into account the
variance-covariance structure of the predicted means, but they do not take into account the model parameter
uncertainty associated with the variance-covariance matrix of the error terms. For the optimal conditions
obtained by these approaches, the probability that they provide a good multivariate response, as measured by
that optimization criterion, can be unacceptably low. Furthermore, it is shown that ignoring the model
parameter uncertainty can lead to reliability estimates that are too large. The proposed approach can be used
with most of the current multiresponse optimization procedures to assess the reliability of a good future
response. This approach takes into account the correlation structure of the data, the variability of the process
distribution, and the model parameter uncertainty. The utility of this method is illustrated with two examples.
KEY WORDS: Bayesian Analysis; Desirability Function; Mixture Experiments; Preposterior Analysis;
Proportion Conforming; Quadratic Loss Function; Reliability; Ridge Analysis; Simulation.
placed on certain response types. This approach allows an engineer or scientist to incorporate desirability information relating to economic or specific performance issues. Harrington (1965) also created an absolute scale to accompany his desirability function, while other desirability functions only provide relative measures.

A major drawback of the "overlapping mean response" and desirability function methods is that they ignore the correlations among the responses and the variability of the predictions. Understanding the variability of the predictions has been stressed by Myers (1999) as a critical issue for practitioners. It is certainly a critical issue for quality assessment. In addition, these approaches do not take into account the uncertainty of the estimates of the model parameters.

As an alternative to the product [0, 1]-metric approaches above, Khuri and Conlon (1981), Pignatiello (1993), Ames et al. (1997), and Vining (1998) proposed various types of quadratic loss function procedures. Except for that of Ames et al. (1997), the rest of these quadratic loss function approaches take into consideration the correlation structure of the response types. While this is an advantage over the overlapping-mean and desirability-function approaches, these quadratic loss function methods still have two serious drawbacks. First, they do not take into account the uncertainty of the variance-covariance matrix of the regression model errors. Second, a quadratic loss function, especially one based on a multivariate quadratic form, may be difficult for some investigators to intuitively grasp, although Vining (1998) has stated that squared-error loss is familiar to many engineers.

Del Castillo (1996) proposed a constrained confidence region approach to multiresponse optimization that addresses response uncertainty and model parameter uncertainty to some degree, and it appears easier to understand than a multivariate quadratic form. However, it requires the type of problem that can be clearly and appropriately reformulated as having a primary response variable and various secondary response variables. In addition, the correlations among the response types are not addressed.

Recently, Chiao and Hamada (2001) took an important step in addressing the assessment and optimization of the quality of a multiple response experiment. They provided a procedure to estimate the probability that a multivariate normal response will satisfy user-specified conditions. Their method is good in that it takes into account the variance-covariance structure of the data, can accommodate heteroscedastic and noise variable regression models, and is easy to interpret. A serious drawback of this approach, however, is that it does not take into account the uncertainty of the model parameters. Here, the model parameters are simply replaced by their point estimates. In some cases, this can cause the probability measure to be noticeably larger than it should be. A Bayesian approach to the analysis of the proportion of nonconforming units was proposed by Raghunathan (2000), but this approach does not address optimization over response surfaces.

In this paper, a Bayesian reliability approach is proposed that takes into account the correlation structure of the data, the variability of the process distribution, and the model parameter uncertainty. This Bayesian approach utilizes the posterior predictive distribution of the multivariate response to compute the probability that a future multivariate response will satisfy specified quality conditions. Bayesian predictive distributions are a natural tool for computing such reliabilities (Bayarri and Mayoral (2002)). These reliabilities can involve proportions of conforming units, desirability functions, or any loss functions. This approach also allows an investigator to perform a "preposterior analysis" to see how reduction of model parameter uncertainty through acquiring additional data can modify the reliability of a multivariate response. The utility of this method is illustrated with two examples. The first example illustrates how reducing process variation and increasing sample size can improve poor reliabilities. The second example shows that this approach is able to assert that additional experimentation to reduce process variation or increase sample size is not necessary. This example also demonstrates how to perform a ridge analysis with the Bayesian reliability procedure of this paper.

A Posterior Predictive Approach

The Concept

We let Y = (Y1, Y2, ..., Yp)' be the multivariate (p x 1) response vector and let x = (x1, x2, ..., xk)' be the (k x 1) vector of factor variables. If the experimenter is simply interested in Y being in some desirable subset of the response space, A, then he or she should consider the discrete "desirability" function, I(y ∈ A), where I(·) is the 0-1 indicator function. If one can sample from the posterior
predictive distribution of Y for a given x value, then one can construct the posterior predictive distribution of I(Y ∈ A) conditional on x. From a quality perspective, this gives the experimenter a measure of the reliability of Y being in A for a given x. This measure of course takes into account the variance-covariance structure of the data and the uncertainty of the model parameters through the posterior predictive distribution of Y. A search of the x-space then provides the experimenter with information on conditions for optimizing the reliability of Y being in A.

The standard regression model for multiple response surface modeling is the classical multivariate multiple regression model,

Y = Bz(x) + e,   (1)

where B is a p x q matrix of regression coefficients and z(x) is a q x 1 vector-valued function of x. The vector e has a multivariate normal distribution with mean vector 0 and variance-covariance matrix Σ. In response surface analysis, the model in Equation (1) is typically composed of a quadratic model for each mean response, but we are allowing a more general covariate structure here. The typical multivariate regression assumption, that z(x) is the same for each response type, is assumed in this paper.

To account for the uncertainty in the model parameters, B and Σ, the posterior predictive density f(y | x, data) can be used. By using the typical noninformative joint prior for B and Σ and the model in Equation (1), the Bayesian predictive density for Y given x and the data can be obtained in closed form. This joint prior for B and Σ is proportional to |Σ|^(-(p+1)/2). The associated Bayesian predictive density for a specified x-value has the multivariate-t form (Johnson (1987, Chapter 6)) with ν degrees of freedom (df), where ν = n - p - q + 1 and n is the sample size, and is

f(y | x, data) = c{1 + (1/ν)(y - B̂z(x))'H(y - B̂z(x))}^(-(p+ν)/2),   (2)

where B̂ is the estimated coefficient matrix and H is a scale matrix whose inverse appears in the simulation below.

Sampling from the Posterior Predictive Distribution

Since Equation (2) is a multivariate t-distribution, it is easy to simulate Y-values from this predictive density. Following Johnson (1987, Chapter 6), one can simulate a multivariate t random variable (r.v.), Y, by simulation of a multivariate normal r.v. and an independent chi-square r.v. For this particular problem the simulation is done as follows. We let W be a multivariate normal r.v. with zero mean vector and variance-covariance matrix equal to H^(-1). We let U be a chi-square r.v. with ν df that is independent of W. Next, define

Yj = (√ν Wj / √U) + μ̂j, for j = 1, ..., p,   (3)

where Yj is the jth element of Y, Wj is the jth element of W, and μ̂j is the jth element of μ̂ = B̂z(x). It follows that Y has a multivariate t-distribution with ν df.
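The construction in Equation (3) can be sketched in code. This is a minimal sketch, assuming B̂, the scale-matrix inverse H^(-1), and ν are already available from the fitted multivariate regression; the function names are illustrative and not from the paper. A small helper for the degrees of freedom under hypothetical design replication (used in the preposterior analysis discussed next) is also included.

```python
import numpy as np

def sample_posterior_predictive(B_hat, H_inv, z_x, nu, n_draws, seed=None):
    """Draw Y-values from the multivariate t posterior predictive density
    of Equation (2) via Equation (3):
    Y_j = sqrt(nu) * W_j / sqrt(U) + mu_j, with W ~ N(0, H^{-1}) and
    U ~ chi-square(nu) independent, where mu = B_hat z(x)."""
    rng = np.random.default_rng(seed)
    mu = B_hat @ z_x                       # mu-hat = B-hat z(x), length p
    p = mu.shape[0]
    W = rng.multivariate_normal(np.zeros(p), H_inv, size=n_draws)
    U = rng.chisquare(nu, size=n_draws)
    return np.sqrt(nu) * W / np.sqrt(U)[:, None] + mu

def preposterior_df(n, p, q, reps):
    """Degrees of freedom if the n-run design were (hypothetically)
    replicated `reps` times: nu = reps*n - p - q + 1."""
    return reps * n - p - q + 1
```

Each row of the returned array is one simulated future response vector Y at the given x, so reliabilities can be estimated by counting how many rows fall in the set A.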
It is also of interest to assess how much the reliability would change if additional data had been acquired. For the posterior sampling model in Equation (3) this is easily done as follows. Note that the multivariate t posterior predictive density depends upon the data only through the sufficient statistics, Σ̂ and B̂, the degrees of freedom, and the design matrix Z. By increasing the rows of the design matrix (to add new data points), and augmenting the df accordingly, one can simulate new data. This corresponds to adding artificial data in such a way that the sufficient statistics, Σ̂ and B̂, remain the same. This gives the experimenter an idea of how much the reliability can be increased by reducing model uncertainty. For example, the experimenter can forecast the effects of replicating the experiment a certain number of times. This idea is similar in spirit to the notion of a "preposterior" analysis as described by Raiffa and Schlaifer (2000).

Computing and Optimizing the Bayesian Reliabilities

Using Monte Carlo simulation from Equation (3), one can approximate the reliability p(x) for various x-values in the experimental region using

p̂(x) = (1/N) Σ_{i=1}^{N} I(Yi ∈ A),   (4)

where N is the number of simulations, and the Yi r.v. are simulated conditional on x and the data. For a small number of factors it is computationally reasonable to grid over the experimental region to compute values of p̂(x) for purposes of optimization. However, even for three or more factors, it may be preferable to have a more efficient approach to optimizing p̂(x). One approach is to maximize p̂(x) using general optimization methods, such as those discussed in Nelder and Mead (1965), Price (1977), or Chatterjee, Laudato, and Lynch (1996).

Another, possibly more revealing, approach is to compute p̂(x) for x-points in some response surface experimental design and then fit a closed-form response surface model to obtain an approximate reliability surface, p̄(x). Since values of p̄(x) can be computed much more quickly than p̂(x), approximate optimization of p̄(x) can be done. For example, a ridge analysis can be done on p̄(x) to explore, in an approximate fashion, how p̄(x) changes as x moves out from the center of the experimental region in an optimal way.

In a ridge analysis, an objective function (e.g., p̄(x)) is optimized over hyperspheres of radius r, {x: x'x = r²}, for various values of r. It is typically assumed that the x-points have been centered at zero and given a common scale, such as -1 ≤ xi ≤ 1, i = 1, ..., k. An optimal response plot of p̄(xr) vs. r is then constructed, where

p̄(xr) = max_{x'x = r²} p̄(x).

Also, an overlay plot of xir (i = 1, ..., k) vs. r is typically made to see how the xir's change as r varies. For an excellent overview of ridge analysis see Hoerl (1985).

By computing p̂(x) for x-values over a coarse-to-moderately dense grid over the experimental region, one can map out a "response surface" for this objective function. If there is a need to interpolate between grid points, a logistic regression model such as

p̄(x) = exp(u(x)'γ) / (1 + exp(u(x)'γ))   (5)

can be fit to the binary data produced by the above Monte Carlo simulations to produce an approximating function over the experimental region. Basic response surface methods can then be applied to u(x)'γ in Equation (5) to assess optimal conditions. For example, a canonical analysis or classical ridge analysis can be done if one can adequately model p̂(x) by a quadratic polynomial function for u(x)'γ. If the standard quadratic model does not give an adequate fit, a ridge analysis using higher order polynomials or even nonparametric regression methods can be used, although a general nonlinear optimization procedure is required for the ridge analysis.

Since there are many (simulated) replications for each x-point, an easy way to check the fit of the model in Equation (5) is to plot p̂(x) vs. p̄(x), where p̂(x) equals the right-hand side of Equation (4) and p̄(x) is the estimated value of p(x) from Equation (5). Across the various values of x in an experimental region, the p̂(x) vs. p̄(x) plot should draw out roughly a 45 degree line on the unit square. Of course, the final, optimized p̄(x0) value should be compared with p̂(x0) evaluated using Monte Carlo simulations and then with p(x0) obtained from actual process runs at x0.

If a logistic regression model does not give a good fit over the entire experimental region, it may still be possible to obtain a good fit over a sub-region of the experimental region. Here, the Monte Carlo process described above can be re-run with a finer grid over this smaller region. After generating a refined and
smaller response surface over the sub-region of x-values, another logistic regression model can be fit. This time we can expect a somewhat better fit, as the model is approximating a smaller response surface. This process can be continued to explore and improve estimates of optimal conditions using, for example, only the standard quadratic model for which most response surface procedures have been developed.

Since p̂(x) is a joint probability, it turns out that the marginal probabilities corresponding to marginal events about some of the Yi values can also be easily computed as a by-product of the Monte Carlo simulation process for p̂(x). If we have

pi(x) = Pr(Yi ∈ Ai | x, data),

where Ai is an interval (possibly one or two sided), then we can assess the pi(x) simultaneously along with p(x). Modification of A and the Ai can be used to allow an investigator to directly observe economic or performance issues through the p(x) and pi(x), respectively, which can be easily related to how often the process is invoked. Of course, other marginal (joint) probabilities concerning the Yi can be computed in a similar fashion.

For process ruggedness assessment, a Bayesian credible region can be computed. For example, this can be done by mapping out all of the x-values for which p̂(x) (p̄(x)) is at least 0.95.

Examples

A Mixture Experiment

This example involves a mixture experiment to study the surfactant and emulsification variables involved in pseudolatex formation for controlled-release, drug-containing beads (Frisbee and McGinity (1994)). An extreme vertices design is used to study the influence of surfactant blends on the size of the particles in the pseudolatex and the glass transition temperature of films cast from those pseudolatexes. The factors chosen are: x1 = "% of Pluronic F68;" x2 = "% of polyoxyethylene 40 monostearate;" and x3 = "% of polyoxyethylene sorbitan fatty acid ester NF." The experimental design used is a modified McLean-Anderson design (McLean and Anderson (1966)) with two centroid points, resulting in a sample size of eleven. The response variables measured were "particle size" and "glass transition temperature," which are denoted here as Y1 and Y2, respectively. The goal of the study was to find values of x1, x2, and x3 that minimize, as best as possible, both Y1 and Y2. In particular, it is assumed here that one desires Y1 to be at most 234 and Y2 to be at most 18.5. Anderson and Whitcomb (1998) also analyzed this data set to illustrate Design Expert's capability to map out overlapping mean response surfaces.

Following Frisbee and McGinity (1994) and Anderson and Whitcomb (1998), quadratic mixture models are fit to the bivariate response data. For this example, however, a severe outlier run is deleted. The resulting regression models obtained are

ŷ1 = 248x1 + 272x2 + 533x3 - 485x1x3 - 424x2x3
ŷ2 = 18.7x1 + 14.1x2 + 35.4x3 - 36.7x1x3 + 18.0x2x3.   (6)

The Shapiro-Wilk test for normality of the residuals for each regression model yields p-values greater than 0.05. Tests for multivariate normality via skewness and kurtosis (Mardia (1974)) are not significant at the 5% level. However, it should be recognized that due to the small sample size in this (and the next) example, the tests for normality are not very powerful.

In this example, the I(y ∈ A), D(y), and Q(y) functions are used to compare the results one obtains with each. For I(Y ∈ A), A = {(y1, y2): y1 ≤ 234 and y2 ≤ 18.5}. Here, D(y) corresponds to the Derringer-Suich (1980) desirability function and Q(y) corresponds to the quadratic loss function of Vining (1998).

The D(y) function for this example is defined as D(y) = [d1(y1)d2(y2)]^(1/2), where, for i = 1, 2,

di(yi) = 0, if yi > Ui,
di(yi) = (Ui - yi)/(Ui - Li), if Li < yi ≤ Ui,
di(yi) = 1, if yi ≤ Li,

and L1 = 227.27, U1 = 234, L2 = 14.09, and U2 = 18.5. Here, L1 and L2 are the minimum predicted response values over the mixture simplex corresponding to y1 and y2, respectively.

The Q(y) function is defined as Q(y) = (y - T)'Σ̂(ŷ(x))^(-1)(y - T), where T is a 2 x 1 vector of target values, and

Σ̂(ŷ(x)) = z(x)'(Z'Z)^(-1)z(x) Σ̂   (7)

is the estimate of the variance-covariance matrix of
ŷ(x), the predicted mean response at x. The value chosen for T is (L1, L2)'.

In this example the posterior predictive distribution of Y is used to assess optimization for this mixture experiment using the three multiresponse optimization approaches defined by I(y ∈ A), D(y), and Q(y), as specified above. Using these functions, two different strategies of assessment are employed. The first strategy is to use each of the optimization functions applied to the mean response surfaces to find an "optimal" x-point in the mixture simplex. Then the Bayesian reliability, p̂(x), is computed for that x-point. In this example, D* = (1/2)Dopt, where Dopt is the optimized value of D(y), and Q* = 2Qopt, where Qopt is the optimized value of Q(y). In addition, as discussed earlier, two modifications are made to see how reducing process variability and increasing sample size can improve these reliabilities.

Figure 1 shows the mean response surfaces for responses Y1 and Y2. In Figure 2 the region in gray is the set of x-points such that ŷ1 ≤ 234 and ŷ2 ≤ 18.5. From viewing Figures 1 and 2, it appears that the point x = (0.75, 0, 0.25)' is a good choice. In addition, the point x = (0.78, 0, 0.22)' maximizes D(y), while x = (0.74, 0, 0.26)' minimizes Q(y).

TABLE 1. Pr(Y ∈ A | x = (0.75, 0, 0.25)') for Increasing Replications and Decreasing Residual Error

TABLE 2. Pr(D(Y) ≥ (1/2)Dopt | x = (0.78, 0, 0.22)') for Increasing Replications and Decreasing Residual Error

For Tables 1-6, the "percent reduction" represents the percent reduction in residual variation as given by the data transformation y* = ŷ + (1 - λ)e. Here, 100λ represents a common percent reduction in residual variation. For columns 2-4 of Tables 1-6 a "preposterior" analysis as described earlier is displayed. Here, data is simulated from a posterior predictive t-distribution with the same Σ̂ and B̂ values as before, but using a design matrix, Z, that has been augmented by using 2, 3, or 4 design replications (for columns 2, 3, and 4, respectively). The degrees of freedom associated with the t-distribution are also increased accordingly. For the Q(y) function, this preposterior analysis also requires that the variance-covariance matrix in Equation (7) use the augmented design matrix, Z, since Q(y) is a function of Z. The table entries in the upper left hand corner (with 0% reduction and 1 replication) contain the results of the analysis of the original experimental data with no preposterior modifications.

As can be seen from Table 1, the entry with 0% reduction and 1 replication shows that p̂(x) = Pr(Y ∈ A | x) = 0.68 for x = (0.75, 0, 0.25)', which is disappointingly small, even though both response surface means satisfy the conditions described by the set A. However, as the variation of the data is reduced it is evident that the reliability increases. Also, as the number of replications of the experimental design increases, the reliability increases as the parameter uncertainty is decreased. Using the strategy of Chiao and Hamada (2001), this proportion-conforming reliability is computed assuming a multivariate normal distribution using the estimated prediction Equations in (6) along with the estimated variances (σ̂1² = 13.87 and σ̂2² = 2.97) of e1 and e2, respectively, and estimated correlation of the residuals (ρ̂ = -0.412). These estimates are obtained using the unbiased estimator of Σ, which is the sum-of-squares-cross-product matrix divided by (n - q). The maximum likelihood estimate (mle) is obtained by dividing by n instead of (n - q). This reliability measure, computed using 1000 simulations from this multivariate normal distribution, yields p(x) = 0.87. If the mle is used instead, then p(x) = 0.954. In any case, this shows that ignoring the model parameter uncertainty can cause the estimated reliability to be too large.

However, the decrease in parameter uncertainty due to increasing the sample size may not always drive the reliability to near one, because the natural variation in the process at hand may keep the reliability less than one, so that, as the parameter uncertainty decreases, the reliability converges to some value less than one. By decreasing the process variation, however, the reliability can be driven towards one, provided that the optimized mean response surfaces are satisfactory.

TABLE 3. Pr(Q(Y) ≤ 2Qopt | x = (0.76, 0, 0.24)') for Increasing Replications and Decreasing Residual Error

Percent reduction    Rep=1    Rep=2    Rep=3    Rep=4
0                    0.383    0.551    0.590    0.597
25                   0.505    0.659    0.696    0.706
50                   0.677    0.809    0.842    0.851

For the desirability function described earlier in this section, the maximized D(y) value is Dopt = 0.74 at x = (0.78, 0, 0.22)'. Assuming this is an adequate desirability value, one might ask, from a quality perspective, "what is the probability that, for x = (0.78, 0, 0.22)', any future D(y) value will be at least as large as some lower value, D*?" Table 2 shows values of Pr(D(Y) ≥ (1/2)Dopt | x = (0.78, 0, 0.22)'). These reliabilities show the same pattern of increasing values as the error variation and the parameter uncertainty are reduced. So, from Table 2, it appears from the data at hand that confirming a high degree of reliability for this process operating at x = (0.78, 0, 0.22)' requires a substantial reduction in
TABLE 4. max Pr(Y ∈ A | x) and Associated Optimal x-point, for Increasing Replications and Decreasing Residual Error

Percent reduction    Rep=1                Rep=2                Rep=3                Rep=4
0                    0.741                0.902                0.936                0.950
                     (0.78, 0, 0.22)      (0.78, 0, 0.22)      (0.76, 0, 0.24)      (0.78, 0, 0.22)
25                   0.840                0.963                0.987                0.989
                     (0.78, 0.01, 0.21)   (0.78, 0, 0.22)      (0.75, 0, 0.25)      (0.75, 0, 0.25)
50                   0.937                0.999                1                    1
                     (0.76, 0, 0.24)      (0.79, 0, 0.21)      (0.75, 0, 0.25)      (0.73, 0, 0.27)
process variation as well as additional replications of the experiment.

For the quadratic loss function, Q(y), described above, the minimized value of Q(y) is Qopt = 7.79 at x = (0.76, 0, 0.24)'. Table 3 shows values of Pr(Q(Y) ≤ 2Qopt | x = (0.76, 0, 0.24)'). As with Tables 1 and 2, Table 3 shows very similar behavior for the Q(y) loss function at x = (0.76, 0, 0.24)'.

A similar pattern is evident from Tables 4-6, where the Bayesian reliabilities are maximized. Here, the optimal x-points do not change much from their respective values obtained by way of the mean response surfaces.

TABLE 5. max Pr(D(Y) ≥ (1/2)Dopt | x) and Associated Optimal x-point, for Increasing Replications and Decreasing Residual Error

TABLE 6. max Pr(Q(Y) ≤ 2Qopt | x) and Associated Optimal x-point, for Increasing Replications and Decreasing Residual Error

While the more commonly used Derringer-Suich (1980) desirability function does not provide an absolute quality criterion, the Harrington (1965) desirability function does. Harrington provides an absolute quality scale: 0-0.37 ("very poor" to "poor"); 0.37-0.60 ("fair"); 0.60-0.80 ("good"); and 0.80-1 ("excellent") for his desirability function. Here, it is interesting to note that, for the example in this section, DH(y) is maximized to 0.96 at x = (0.76, 0, 0.24)', where DH is the Harrington desirability function. However, pH(x) = Pr(DH(Y) ≥ 0.60 | x) = 0.72. This shows that the probability of obtaining a future response that is at least borderline "good" is only 0.72, despite the fact that the optimal desirability was estimated to be 0.96, which is "excellent" on the Harrington scale.

As discussed previously, it is also interesting to consider "bottom-line" reliabilities such as Pr(D(Y) > 0 | x) and Pr(Q(Y) ≤ Q(l*) | x), where l* = (l1*, ..., lp*)' relates to the outer limits of departures from target specifications. For this example, D(y) = 0 corresponds to y1 ≥ l1* = 234 or y2 ≥ l2* = 18.5. At x = (0.78, 0, 0.22)', Pr(D(Y) > 0 | x) is only 0.696. This situation is better for Q(l*) at x = (0.76, 0, 0.24)', with Pr(Q(Y) ≤ Q(l*) | x) = 0.883.

A Ridge Analysis Example

This example illustrates the optimization of an event probability p(x) = Pr(Y ∈ A | x) for a high performance liquid chromatography assay. Here there are three factors (x1 = percent of isopropyl alcohol, x2 = temperature, and x3 = pH) and four responses (y1 = resolution, y2 = run time, y3 = signal-to-noise ratio (s/n), and y4 = tailing). For this assay, the chemist desires to have the event

A = {y: y1 ≥ 1.8, y2 ≤ 15, y3 ≥ 300, 0.75 ≤ y4 ≤ 0.85}   (8)

occur with high probability. As such, it is desirable to maximize Pr(Y ∈ A | x) as a function of x.

A Box-Behnken experimental design is run, with three center points, to gather data to fit four quadratic response surfaces. The data used are given in Table 7. Quadratic regression models with the covariate vector, z(x) = (1, x1, x2, x3, x1², x2², x3², x1x2)', are used.

All of the response surface models fit well, with R² values above 99%. As in example 1, the Shapiro-Wilk test for normality of the residuals for each regression model yields p-values greater than 0.05, and the Mardia tests for multivariate skewness and kurtosis are not significant at the 5% level. The factor levels are coded so that all values are between -1 and +1, with the center of the experimental region at the origin.

For this example, a ridge analysis is used to examine how the value of Pr(Y ∈ A | x) changes when maximized over spheres of varying radii centered at the origin of the experimental region. In order to create a computationally efficient model, which is needed to do the ridge analysis, a fourth-degree polynomial model is fit to a cubical grid of 729 points containing the Box-Behnken experimental design. This grid is formed by the Cartesian product of nine equally spaced points (increments of 0.25)
FIGURE 3. Goodness-of-Fit Plots: Observed Proportions from Simulations vs. Logistic Regression Model Predicted Probabilities, for p(x) and p1(x)-p4(x).
from -1 to 1 for each factor. At each grid point, 1000 binary responses are generated corresponding to whether or not the event A in Equation (8) occurred. The u(x)'γ-term used for the logistic regression model in Equation (5) is a complete fourth-degree polynomial in all three factors. Fourth-degree polynomial logistic regression models are also created to model each of the marginal probabilities, Pr(Yi ∈ Ai | x), i = 1, ..., 4, where A1 = {y1: y1 ≥ 1.8}, A2 = {y2: y2 ≤ 15}, A3 = {y3: y3 ≥ 300}, and A4 = {y4: 0.75 ≤ y4 ≤ 0.85}. This results in approximate probability functions p̄i(x), i = 1, ..., 4.
FIGURE 4. Optimal Response Plot for Ridge Analysis of p̄(x). Overlaid are the Values of p̄1(x)-p̄4(x) Corresponding to the Optimal Values of x for the Ridge Analysis of p̄(x). Here, the Ridge Analysis is Based Upon the p̄(x) and p̄1(x)-p̄4(x) Functions Approximated by Logistic Regression Models.
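The polar-coordinate search behind an optimal response plot like Figure 4 can be sketched as follows. This is an illustrative implementation for three factors, with `p_fun` standing in for any approximate reliability function such as the logistic model of Equation (5); the function name and the coarse default angle step are assumptions (the paper grids the angles in 1-degree increments).

```python
import numpy as np

def ridge_max(p_fun, radius, angle_step_deg=1.0):
    """Maximize p_fun over the sphere x'x = radius**2 (three factors)
    by gridding the two spherical angles, as in a ridge analysis."""
    thetas = np.deg2rad(np.arange(0.0, 180.0 + angle_step_deg, angle_step_deg))
    phis = np.deg2rad(np.arange(0.0, 360.0, angle_step_deg))
    best_val, best_x = -np.inf, None
    for th in thetas:
        for ph in phis:
            # Spherical-to-Cartesian point on the radius-r sphere
            x = radius * np.array([np.sin(th) * np.cos(ph),
                                   np.sin(th) * np.sin(ph),
                                   np.cos(th)])
            val = p_fun(x)
            if val > best_val:
                best_val, best_x = val, x
    return best_val, best_x
```

Running this for a sequence of radii and plotting the maxima against the radius yields the optimal response plot; recording the maximizing x at each radius yields the optimal coordinate plot.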
FIGURE 5. Optimal Coordinate Plot for the Ridge Analysis: Factor Levels x1, x2, and x3 vs. Radius.
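Fitting the logistic approximation of Equation (5) to the simulated binary outcomes can be sketched with a basic Newton-Raphson solver; this is a hedged stand-in for any standard logistic-regression routine, where `U` holds the rows u(x)' (e.g., a fourth-degree polynomial expansion of x) and `s` the 0/1 indicators I(Y ∈ A). The function name and the small ridge term for numerical stability are illustrative assumptions.

```python
import numpy as np

def fit_logistic(U, s, n_iter=25, ridge=1e-8):
    """Newton-Raphson fit of gamma in Equation (5):
    p(x) = exp(u(x)'gamma) / (1 + exp(u(x)'gamma))."""
    gamma = np.zeros(U.shape[1])
    for _ in range(n_iter):
        prob = 1.0 / (1.0 + np.exp(-(U @ gamma)))
        grad = U.T @ (s - prob)                 # score vector
        w = prob * (1.0 - prob)
        hess = U.T @ (U * w[:, None])           # Fisher information
        gamma += np.linalg.solve(hess + ridge * np.eye(len(gamma)), grad)
    return gamma
```

The fitted p̄(x) can then be plotted against the Monte Carlo p̂(x) to check for the roughly 45 degree line described earlier, as in Figure 3.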
The logistic regression models corresponding to p̄(x) and p̄i(x), i = 1, ..., 4, have reasonably good predictive ability, as demonstrated by agreement between the observed and predicted proportions. See Figure 3 for scatter plots of observed vs. logistic model estimated probabilities.

A ridge analysis for the logistic regression model is conducted by gridding over the polar coordinate angle space of each sphere at nine equally spaced radius values from 0.1 to 1. The angle increments used are 1 degree. A detailed discussion of how to do such a ridge analysis is given by Peterson (1993). Figure 4 shows the optimal probability plot of p̄(xr) vs. r. Figure 5 shows the optimal coordinate plot of the xir-factor levels vs. r. Using the ridge trace of optimal x-points for p̄(x), plots of p̄i(x), i = 1, ..., 4, are overlaid on the plot in Figure 4. As a further check on the accuracy of the plots made by logistic regression, the x-values used to make the optimal coordinate plot were used to compute the p(x), p1(x)-p4(x) probabilities using 1000 Monte Carlo simulations. The resulting graph is virtually identical to Figure 4.

From Figure 4 it is clear that the marginal probabilities p̄i(x), i = 1, 2, and 4, are close to one for all radius values between 0.1 and 1. However, a radius value of at least 0.3 is required for p̄3(xr) to get close to 1, and thereby allow the joint probability p̄(xr) to get close to 1. Here, no preposterior or error variability reduction analyses are needed, since excellent event probabilities are obtainable with the data at hand. For this example, the Bayesian posterior predictive approach assures us that, if the modeling is done well, then the uncertainty due to the unknown model parameters is not a concern. However, the experimenter should, of course, still do some confirmatory experiments at the chosen factor configuration.

Figure 6 is a plot of factor points where p̄(x) is at least 0.95. This forms a 95% Bayesian credible region for the factor combination that maximizes Pr(Y ∈ A | x). Here it can be seen that for higher levels of "% isopropyl alcohol" one has more latitude in varying pH and temperature while still keeping Pr(Y ∈ A | x) ≥ 0.95.

Discussion

We can see from the examples that the Bayesian reliability method described in this paper provides a complete way to assess the quality of an optimized multivariate response for the regression model of Equation (1). This approach is the only approach that takes into account the variance-covariance structure of the data as well as the uncertainty of all of the model parameters. In addition, this method easily allows the investigator to assess the effect of modifying the variance of the process being studied and the effect of additional data on the reliability
FIGURE A1. A Flowchart for Using Bayesian Reliability Assessment for Response Surface Optimization. The decision steps are: "Is C(ŷ) in S for at least some x?"; if so, "Is p̂(x0) large enough?"; if not, do one or more of the following: use preposterior analysis to ascertain how much p̂(x) would increase with additional data; change the process mean and/or variance to increase p̂(x); or relax the conformance criteria by modification of the set S or the C(y) function.
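The reliability computation at the heart of this flowchart is just Equation (4), applied to draws from the posterior predictive distribution. A minimal sketch (function names illustrative), which also shows how the marginal probabilities pi(x) fall out of the same draws as a by-product:

```python
import numpy as np

def reliability(draws, in_A):
    """Equation (4): p-hat(x) = (1/N) * sum_i I(Y_i in A), where the
    rows of `draws` are posterior predictive simulations of Y at a
    fixed x, and `in_A` tests membership of one draw in the set A."""
    return float(np.mean([1.0 if in_A(y) else 0.0 for y in draws]))

def marginal_reliability(draws, i, lo=-np.inf, hi=np.inf):
    """Marginal pi(x) = Pr(Y_i in A_i | x, data) for an interval A_i,
    reusing the same Monte Carlo draws."""
    col = draws[:, i]
    return float(np.mean((col >= lo) & (col <= hi)))
```

For the mixture example, A = {(y1, y2): y1 ≤ 234 and y2 ≤ 18.5}, so one would pass `in_A = lambda y: y[0] <= 234 and y[1] <= 18.5`.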
"Multiple Response Surface Methods in Computer Simulation". Simulation 29, pp. 113-121.
MYERS, R. H. (1999). "Response Surface Methodology: Current Status and Future Directions". Journal of Quality Technology 31, pp. 30-44.
MYERS, R. H. and MONTGOMERY, D. C. (2002). Response Surface Methodology, 2nd ed. John Wiley, New York, NY.
NELDER, J. A. and MEAD, R. (1965). "A Simplex Method for Function Minimization". The Computer Journal 7, pp. 308-313.
PETERSON, J. J. (1993). "A General Approach to Ridge Analysis with Confidence Intervals". Technometrics 35, pp. 204-214.
PIGNATIELLO, J. J., JR. (1993). "Strategies for Robust Multiresponse Quality Engineering". IIE Transactions 25, pp. 5-15.
PRESS, S. J. (2003). Subjective and Objective Bayesian Statistics: Principles, Models, and Applications, 2nd ed. John Wiley, New York, NY.
PRICE, W. L. (1977). "A Controlled Random Search Procedure for Global Optimization". The Computer Journal 20, pp. 367-370.
RAIFFA, H. and SCHLAIFER, R. (2000). Applied Statistical Decision Theory. John Wiley, New York, NY.
RAGHUNATHAN, T. E. (2000). "Bayesian Analysis of Quality Level Using Simulation Methods". Journal of Quality Technology 32, pp. 172-182.
VINING, G. G. (1998). "A Compromise Approach to Multiresponse Optimization". Journal of Quality Technology 30, pp. 309-313.
ZELLNER, A. (1971). An Introduction to Bayesian Inference in Econometrics. John Wiley, New York, NY.