
Optimization Methods & Software, 2015

Vol. 30, No. 6, 1255–1275, http://dx.doi.org/10.1080/10556788.2015.1043604

A novel sequential approximate optimization approach using data mining for engineering design optimization
Pengcheng Ye ∗ and Guang Pan
School of Marine Science and Technology, Northwestern Polytechnical University, Xi’ an 710072,
People’s Republic of China

(Received 20 July 2014; accepted 18 April 2015)

For most engineering design optimization problems, it is hard or even impossible to find the global opti-
mum owing to the unaffordable computational cost. To overcome this difficulty, a sequential approximate
optimization (SAO) approach that integrates the hybrid optimization algorithm with data mining and
surrogate models is proposed to find the global optimum in engineering design optimization. The surro-
gate model is used to replace expensive simulation analysis and the data mining is applied to obtain the
reduced search space. Thus, both the search efficiency and the quality of the resulting optimum are improved by reducing the search space using data mining. The validity and efficiency of the proposed SAO approach
are examined by studying typical numerical examples.

Keywords: engineering design optimization; sequential approximate optimization; data mining; surrogate model

1. Introduction

In engineering, design optimization for large-scale systems is generally complex and time-
consuming. In addition, design requirements are rigorous and stringent for such systems,
especially multidiscipline design optimization systems such as those in the field of aerospace.
For example, the design optimization problems often involve multiple disciplines, multiple
objectives, and computation-intensive processes for product simulation in aircraft. Just taking
the computation challenge as an example, it is reported that it takes Ford Motor Company
about 36–160 h to run one crash simulation [15], which is unacceptable in practice. If these
computationally intensive models are directly used for engineering design optimization, the com-
putational burden would be unaffordable. Nowadays, the time made available to develop new
products is continuously being shortened, making it preferable to reduce the computing-time
required for optimization. Though computers have become increasingly powerful, the complex-
ity of analysis software, for example, finite element analysis (FEA) and computational fluid
dynamics (CFD), seems to keep pace with computing advances [24]. This implies that one of
the most important aspects is to reduce the number of function evaluations (NFEs) in engineer-
ing design optimization. To meet the challenge of increasing model complexity, designers are
seeking new methods. As a result, the surrogate model, often called a metamodel or response

*Corresponding author. Email: ypc2008300718@163.com

© 2015 Taylor & Francis



surface, has been proposed and improved by researchers as a widely used approximate model to replace expensive simulation analysis. Therefore, a preferable strategy is to utilize surrogate
models (approximate models) instead of expensive high-fidelity models in the engineering optimization process, termed approximate optimization or surrogate-based optimization
[13,20–22,31,34,36,40].
The surrogate models can be constructed by different interpolation methods including radial
basis functions (RBFs) [3,39,45], kriging [12,28,36] and response surface methodology [27,30],
etc. RBFs are used in this paper. All surrogate models need an initial set of sample points in
order to create the first interpolation model. In general, surrogate-based optimization can be
categorized as one-stage and two-stage optimization approaches [13,21,22,36]. In a one-stage method, the interpolation of the response (surrogate) surface and the determination of the new sample points are carried out within the same calculation. A two-stage
method first finds the interpolation (or approximation) of the response surface, then in a second
step tries to find the new sample points by optimizing some auxiliary function. From a theoretical statistical point of view, a one-stage method is preferable. However, some one-stage methods lead to numerically difficult optimization problems, whereas two-stage methods are often easier to implement and have been shown to work well in practice. Our sequential approximate
optimization (SAO) method is of the two-stage type, and easy to implement.
In recent years, the SAO approach has been widely studied [14,23,41,45]. In the SAO approach,
the surrogate model is constructed repeatedly by adding new sampling points until the termination criterion determined by the designer is satisfied. It is evident that the efficiency of the SAO approach is mainly determined by the accuracy of the surrogate model and the sample infill strategy.
In order to obtain an accurate approximate global minimum efficiently, both the optimum of the response surface and a new point in the sparse region are added to the sample as new sampling points in this paper. Adding the optimum of the response surface
as the new sampling point will lead to a local approximation with high accuracy [26]. However,
only successive additions of the optimum of the response surface may lead to a local optimum. In order to avoid falling into a local optimum and to achieve a global approximation, a new sampling point in the sparse region is added to the sample [38]. Thus,
global and local approximation will be achieved simultaneously through the above sequential
sampling strategy. In this sequential sampling strategy, it is important to find the sparse region
in the design space. In this paper, an effective function called the density function [23] for determining the sparse region in the design space is adopted. The density function, constructed using the RBF, is used to discover a sparse region in the design space. The optimization algorithm simulated annealing (SA) [5] is employed to find the global optima of the response surface and the density function. Thus, a novel SAO approach is presented.
The requirements of computational efficiency and accuracy of surrogate model create a
dilemma for engineering design optimization problems. From a practical viewpoint, the global
optimum is hard to find for a large-scale engineering design optimization problem with the
affordable computational resources. Regardless of which approximate method is used for a spe-
cific optimization problem, it is observed that the computational efficiency and accuracy of
surrogate model are directly related to the scale of design space. Designers tend to give very
conservative upper and lower bounds for design variables at the initial stage of setting up a
design optimization problem. This is often due to the lack of sufficient knowledge of func-
tion behaviour and interactions between objective and constraint functions in the early stages
of problem definition. Given this observation, reducing the search space can greatly improve optimization efficiency and increase the chance of finding the global, or at least a better local, optimal solution, especially in engineering design optimization. In this article, data mining
is employed to reduce the search space. The detailed introduction of data mining will be pro-
vided in Section 2.3. In the field of engineering design optimization, data mining techniques

have been studied recently. Chen et al. used the data mining techniques to find a better global
optimal solution for structural optimization [8–10]. Two data mining techniques, analysis of variance and the self-organizing map, were used to obtain useful information about the design space for
multi-objective aerodynamic optimization design by Jeong and Obayashi [19]. Yadav et al. [44]
applied data mining software Weka to select the most relevant input parameters for artificial
neural network-based solar radiation prediction models. Shi et al. [37] employed a data min-
ing technique, namely, Classification and Regression Tree method to extract a set of reduced
feasible design domains from the original design space. Within the reduced feasible domains,
the first generation of designs can be selected for multi-objective optimization to identify the
Pareto set.
In this paper, a novel SAO approach using data mining for engineering design optimization
is proposed. Data mining learning was done prior to the construction of the surrogate model to
determine the reduced search space by classification, association and clustering activities with a
limited number of samples in the whole design space. Then initial sampling points are generated
in the reduced search space for constructing the surrogate model with RBFs. The surrogate model
is constructed repeatedly through the addition of new sampling points, namely the optima of the response surface and the density function. A more accurate surrogate model is thus obtained in the reduced search space by repeating the procedure of adding points to the
sample adaptively and sequentially. The optimization algorithm SA is used to locate the optimal
global solution. Thus, the global optimum can then be found based on the surrogate model in
the reduced search space. Several numerical examples are tested to demonstrate the validity and
efficiency of the proposed SAO approach.
The remainder of the paper is organized as follows. RBFs, the density function and data mining are introduced in Section 2, and the application of data mining in engineering design optimization is presented to show its feasibility in Section 3. Section 4 provides a summary of the novel
SAO approach. In Section 5, some numerical examples are tested to show the validity of the
proposed SAO approach. Eventually, conclusions are drawn in Section 6.

2. Background

In order to illuminate the SAO approach proposed in this paper in detail, the approximate method
RBF is introduced first, followed by the density function built with the RBF. Finally, a brief description of data mining techniques is given.

2.1 Radial basis functions

The approximate method RBF was originally developed by Hardy [18] in 1971 to fit irregular
topographic contours of geographical data. It has been tested and verified over several decades, and many positive properties have been identified. Krishnamurthy [25] added a polynomial to the definition of the RBF to improve its performance. Wu [43] provided criteria for the positive definiteness of basis functions with compact support, which produced a series of positive definite basis functions.
An RBF network is a three-layer feed-forward network, as shown in Figure 1. The output of the network f̂(x), which corresponds to the response surface, is typically given by

$$\hat{y} = \hat{f}(x) = \sum_{i=1}^{N} \lambda_i \phi(\|x - x_i\|) = \Phi \cdot \lambda, \qquad (1)$$

Figure 1. Three-layer feed-forward RBF network.

Table 1. Commonly used basis functions, where r = ‖x − x_i‖₂.

Name                 Radial function
Linear               φ(r) = cr
Cubic                φ(r) = (r + c)³
Thin-plate spline    φ(r) = r² log(cr²)
Gaussian             φ(r) = exp(−cr²)
Multiquadric         φ(r) = (r² + c²)^(1/2)

Note: c is a constant.

where N is the number of sampling points, x is the vector of design variables, x_i is the vector of design variable values at the ith sampling point, Φ = [φ_1, φ_2, ..., φ_N] with φ_i = φ(‖x − x_i‖), λ = [λ_1, λ_2, ..., λ_N]^T, ‖x − x_i‖ is the Euclidean norm, φ is a basis function, and λ_i is the coefficient of the ith basis function. The approximation function ŷ is thus a linear combination of RBFs with weight coefficients λ_i. The most commonly used basis functions are listed in Table 1.
The basis functions multiquadric and Gaussian are best known and most often applied. The
multiquadric is non-singular and simple to use [29]. Hence, the multiquadric RBF is used in this paper.
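As a concrete illustration of Equation (1), the weights λ can be obtained by solving the interpolation system Φλ = y at the sampling points. The sketch below is our own minimal implementation, not the authors' code; the function names and the shape constant c = 1 are assumptions for illustration only.

```python
import numpy as np

def multiquadric(r, c=1.0):
    """Multiquadric basis phi(r) = sqrt(r^2 + c^2); c is a shape constant."""
    return np.sqrt(r**2 + c**2)

def fit_rbf(X, y, c=1.0):
    """Solve Phi @ lam = y for the weight vector lam of Eq. (1)."""
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)  # pairwise distances
    Phi = multiquadric(r, c)
    return np.linalg.solve(Phi, y)

def predict_rbf(X, lam, x_new, c=1.0):
    """Evaluate y_hat = sum_i lam_i * phi(||x_new - x_i||)."""
    r = np.linalg.norm(x_new - X, axis=-1)
    return multiquadric(r, c) @ lam
```

By construction the model interpolates: at every sampling point the prediction reproduces the observed response exactly (up to round-off).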
Although the approximate method RBF is good for high-order nonlinear responses, it has
been verified to be unsuitable for linear responses [16,25]. To make the RBF appropriate for
linear responses, an RBF model can be augmented with a linear polynomial given as


$$\hat{f}(x) = \sum_{i=1}^{N} \lambda_i \phi(\|x - x_i\|) + \sum_{j=1}^{m} b_j p_j(x), \qquad (2)$$

where p_j(x) are the terms of the linear polynomial [1, x_1, ..., x_n], m = n + 1 is the total number of terms in the polynomial, n is the number of design variables, and b_j is the weight coefficient. Equation (2) consists of N equations with N + m unknowns. The additional m equations necessary to solve for the m additional unknowns can be obtained from the following m constraints:


$$\sum_{i=1}^{N} \lambda_i p_j(x_i) = 0 \quad \text{for } j = 1, 2, \ldots, m. \qquad (3)$$

Adding a linear term also improves the quality of the RBF approximation far away from the
sampling points. More details about RBFs can be found in the work by Powell [32] and the book by Buhmann [7].
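Equations (2) and (3) together form a symmetric block linear system of size N + m in the unknowns (λ, b). The following sketch, with function names of our own choosing and the multiquadric basis assumed, shows one way to assemble and solve it; it is an illustration, not the authors' implementation.

```python
import numpy as np

def fit_rbf_linear(X, y, c=1.0):
    """Fit f(x) = sum_i lam_i phi(||x - x_i||) + sum_j b_j p_j(x), Eqs. (2)-(3)."""
    N, n = X.shape
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    Phi = np.sqrt(r**2 + c**2)                   # multiquadric basis matrix
    P = np.hstack([np.ones((N, 1)), X])          # p_j(x) terms: [1, x_1, ..., x_n]
    m = n + 1
    # Block system: [Phi P; P^T 0] [lam; b] = [y; 0]
    A = np.block([[Phi, P], [P.T, np.zeros((m, m))]])
    rhs = np.concatenate([y, np.zeros(m)])
    sol = np.linalg.solve(A, rhs)
    return sol[:N], sol[N:]                      # lam, b

def predict_rbf_linear(X, lam, b, x_new, c=1.0):
    r = np.linalg.norm(x_new - X, axis=-1)
    return np.sqrt(r**2 + c**2) @ lam + b @ np.concatenate([[1.0], x_new])
```

The zero block in the lower-right corner encodes the m orthogonality constraints of Equation (3).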

2.2 Density function with the RBF

It is necessary to add new sampling points in the sparse region for global approximation. To
achieve this, a new function called the density function, proposed by Kitayama et al. [23], is constructed using the RBF network in this paper. The aim of the density function is to
discover a sparse region in the design space. This density function generates local minima in the
sparse region, so that the minimum of this function can be taken as a new sampling point. The
addition of new sampling points in the sparse region will improve the accuracy of approximation
model and help to find the global minimum of surrogate model.
To explore the sparse region, every output f of the RBF network is replaced with +1. Let N be
the number of sampling points. The procedure to construct the density function is summarized
as follows:

(1) The output vector f^D is prepared at the sampling points:

$$f^D = (1, 1, \ldots, 1)^{T}_{N \times 1}. \qquad (4)$$

(2) The weight vector λ^D of the density function D(x) is calculated as follows:

$$\lambda^D = (\Phi^{T}\Phi + \Lambda)^{-1} \Phi^{T} f^D, \qquad (5)$$


where

$$\Phi = \begin{bmatrix} \phi_1(x_1) & \phi_2(x_1) & \cdots & \phi_N(x_1) \\ \phi_1(x_2) & \phi_2(x_2) & \cdots & \phi_N(x_2) \\ \vdots & \vdots & \ddots & \vdots \\ \phi_1(x_N) & \phi_2(x_N) & \cdots & \phi_N(x_N) \end{bmatrix}, \qquad \Lambda = 10^{-3} \times \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}_{N \times N}.$$

The additional sampling point x^D in the sparse region is found as the global minimum of the density function with the RBF:

$$D(x^D) = \sum_{j=1}^{N} \lambda_j^D \phi_j(x^D) \rightarrow \min. \qquad (6)$$
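The construction in Equations (4)–(6) can be sketched as follows. This is a minimal illustration under our own naming, assuming the multiquadric basis and the 10⁻³ regularization of Equation (5); minimizing the returned D(x), e.g. with SA, yields the new sampling point x^D.

```python
import numpy as np

def density_function(X, c=1.0, eps=1e-3):
    """Build D(x) of Eqs. (4)-(6): every output is replaced by +1 (Eq. (4)),
    and the weights solve the regularized normal equations of Eq. (5)."""
    N = X.shape[0]
    r = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    Phi = np.sqrt(r**2 + c**2)                       # multiquadric basis matrix
    fD = np.ones(N)                                  # Eq. (4)
    lamD = np.linalg.solve(Phi.T @ Phi + eps * np.eye(N), Phi.T @ fD)  # Eq. (5)
    def D(x):
        phi = np.sqrt(np.linalg.norm(x - X, axis=-1)**2 + c**2)
        return phi @ lamD                            # Eq. (6)
    return D
```

At the existing sampling points D is close to 1, while it dips in regions that are sparsely sampled, which is what makes its minimizer a useful infill point.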

2.3 Brief description of data mining

Data mining was initially developed for helping businesses to dig out useful information stored
in their huge data warehouses. Because of its good performance in business applications, its
applications have been extended to other areas such as marketing, bank security, aerospace
engineering, mechanical engineering and many other industries.
Data mining is a technique developed to extract meaningful and useful patterns and rules
hidden in a huge data set [2,17]. This useful information usually cannot be noticed directly from
those items stored in the data file. The results of data mining may be some rules or patterns
that can be applied to serve some special purposes. The books by Adriaans and Zantinge [2]
and Berry and Linoff [4] mentioned many different activities that can be used to extract various
information of interest to the designers. Three major activities, classification, association and
clustering employed in this paper are widely used to help engineering designers to accomplish
tasks efficiently and intelligently. Many mathematical theories and algorithms can be applied to
implement the aforementioned activities. Their purpose and implementation in data mining are explained briefly as follows.

(1) Classification

The purpose of classification activity is to find some rules that can be used to determine
whether a previously unknown object belongs to a known class. For example, according to the
rules obtained by data mining, the classification activity can tell whether a person is rich or poor
based on his personal property data.
To implement the classification activity, a recursive partitioning process is used to construct
the tree structure. Initially, all samples are placed in the same set. Some algorithms are then used
to generate all possible binary splits to separate the samples into two parts and this process is
continued until no meaningful split can be found. A diversity index is computed at each node.
The best split is chosen by the largest decrease in diversity. All the best splits are collected to
form the decision tree.

(2) Association

The purpose of association activity is to find things that are related. It is not used to predict the
nature of a sample as the classification activity does. For example, when a certain person buys
his breakfast, he always orders milk and a sandwich. Then we can know that milk is associated
with sandwiches for that person through the association activity.
To implement the association activity, the approach to finding the association rules is similar to
that for classification. The difference is that the rules found by association are not dedicated to a
special purpose like class; they just indicate some relationship between attributes. As a result, a
lot of rules may be found and usually only a certain number of these rules are useful. The levels
of support and confidence can be used to select valuable rules.
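The support and confidence of a rule can be computed directly from the samples. The sketch below is our own illustration (the predicate-based interface is an assumption, not Weka's API): support is the fraction of all samples satisfying both sides of the rule, and confidence is the fraction of antecedent-satisfying samples that also satisfy the consequent.

```python
def support_confidence(samples, antecedent, consequent):
    """support = P(antecedent and consequent); confidence = P(consequent | antecedent).
    Both predicates are callables taking one sample and returning a bool."""
    n = len(samples)
    matched = [s for s in samples if antecedent(s)]
    both = [s for s in matched if consequent(s)]
    support = len(both) / n
    confidence = len(both) / len(matched) if matched else 0.0
    return support, confidence
```

For example, the rule "x1 ≤ 1 implies class A" evaluated on four samples of which three satisfy the antecedent and two also belong to class A has support 0.5 and confidence 2/3.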

(3) Clustering

The purpose of clustering activity is to divide a group into a certain number of subgroups with
similar behaviours. The difference between clustering and classification is that clustering has no predefined classes. The objects are grouped together on the basis of their similarity to one
another. The number of groups or classes cannot be known in advance. For example, optimization algorithms include SA, genetic algorithms (GA), particle swarm optimization (PSO), sequential quadratic programming (SQP), linear programming (LP), etc. Through clustering activities, SA,
GA and PSO which are all evolutionary algorithms form a cluster. Similarly, SQP and LP which
are both gradient-based methods form another cluster.
To implement the clustering activity, there are two widely used approaches. The first one
begins with a single cluster including all sampling points. This single cluster is then split into two
or more smaller clusters by some rules and this splitting process continues until some criteria are
satisfied. K-Means cluster is a typical approach of this kind. In the other approach, each sampling
point initially forms a basic cluster. Then, a combining process is executed repeatedly to combine
similar small clusters into several big clusters until a predetermined criterion is satisfied.
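The farthest-first idea behind the FarthestFirst algorithm used later in this paper can be sketched as follows. This is our own minimal illustration of the traversal, not Weka's implementation: the first point becomes the first cluster centre, each subsequent centre is the point farthest from the centres chosen so far, and every point is then assigned to its nearest centre.

```python
import numpy as np

def farthest_first(X, k):
    """Farthest-first traversal clustering: greedily pick k well-separated
    centres, then assign each point to its nearest centre."""
    centres = [0]                                    # start from the first point
    for _ in range(k - 1):
        d = np.min(np.linalg.norm(X[:, None] - X[centres][None], axis=-1), axis=1)
        centres.append(int(np.argmax(d)))            # farthest from current centres
    labels = np.argmin(np.linalg.norm(X[:, None] - X[centres][None], axis=-1), axis=1)
    return labels, X[centres]
```

Because each new centre is maximally distant from the previous ones, well-separated groups of sampling points end up in different clusters.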
The rules or patterns extracted by data mining must be obtained through machine learning.
For the sake of obtaining reliable rules or patterns hidden in a large amount of data through data
mining, the data must be well prepared and then fed into the computer. Many business and free
software packages have been developed in recent years. The program used in this research is a
free software application called Weka which was developed by Witten and Frank [42] at Waikato
University in New Zealand. Weka is a very powerful data mining software program which
has many choices for specific purposes. Designers can compare the different results obtained
by different approaches and select the most appropriate one for their requirements. Weka can be downloaded at www.cs.waikato.ac.nz/ml/weka.

3. Application of data mining in engineering design optimization

Searching for a globally optimal design plays an important role in engineering design. However, the global optimum is hard to find for a large-scale system with affordable computational resources. Therefore, an appropriate and practical idea is proposed in this paper: instead of seeking the global solution, the search effort is used to find an optimal solution that is better than most other local optima. In this way, the computational time is affordable and the solution is reliable and, most importantly, better than many other local optimal solutions that would usually be found.
To achieve this goal, a subspace which may include the global or better local optimal solutions
in the original design space needs to be identified first. It is obvious that the optimal solution
found in this reduced design space will have a very good chance of being better than that found
in the whole design space. Certainly, it is computationally efficient. Data mining is proposed in
this paper to find the reduced design space.

3.1 Ideas for reducing the design space

As mentioned previously, the main purpose of this research is to use data mining techniques
to reduce the design space. The data mining techniques used in this research are classifica-
tion, association, and clustering. After the design space is reduced, any evolutionary algorithm,
gradient-based method, or hybrid method can be used to search for the optimal solution in the
smaller design space. No matter which optimization method is used in this smaller search space,
the chance of finding the global or better local optimum solution will be greatly increased. Thus,
data mining is employed to determine a sub-region where the global optimum is most likely to
appear.

3.2 The input and output for the Weka

To use the data mining software Weka to obtain useful information, some sampling design points must be provided to the software for machine learning. These sampling points are distributed in the design space. The popular method for generating these points is to use a random
number generator or some methods from design of experiments. In this paper, the sampling
points generated by using the Latin hypercube design (LHD) approach are spread evenly. After
determining the design sampling points, the values of the objective functions are calculated for
each of these sampling points. The input and output for the data mining software Weka will be
described below.
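A basic LHD generator can be sketched as follows; this is an illustrative implementation of the standard stratified-sampling idea, with names and defaults of our own choosing (the paper does not specify its LHD code).

```python
import numpy as np

def latin_hypercube(n_samples, bounds, seed=0):
    """Basic Latin hypercube design: stratify each dimension into n_samples
    equal intervals, draw one value per interval, and permute each column
    independently so the strata are paired at random."""
    rng = np.random.default_rng(seed)
    n_dims = len(bounds)
    # one uniform draw inside each of the n_samples strata, per dimension
    u = (rng.uniform(size=(n_samples, n_dims)) + np.arange(n_samples)[:, None]) / n_samples
    for j in range(n_dims):
        u[:, j] = u[rng.permutation(n_samples), j]
    lo = np.array([b[0] for b in bounds], dtype=float)
    hi = np.array([b[1] for b in bounds], dtype=float)
    return lo + u * (hi - lo)
```

Each column then contains exactly one sample per stratum, which is what spreads the points evenly over each design-variable range.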
The inputs for the data mining software Weka to find the desired rules or patterns include
the values of design variables and the corresponding objective function value for each sam-
pling point. There are minor differences in the processes required to prepare the input data since
the expressions of the outputs for classification, association, and clustering are different. For
classification and association activities, the output is usually represented by symbols or literal
words. But the output of clustering is some clusters and the centroid values of the clusters. There-
fore, some type of transformation from numbers to symbols or words must be performed before
preparing the input for classification and association. In this paper, the objective function values in the input are replaced by symbols for the classification and association activities.
The sampling points are divided into different classes by equal or unequal intervals based on the
objective function values. The best class is represented by the symbol ‘A’ and followed by other
classes in the order ‘B’, ‘C’, ‘D’ and so forth according to the order of their objective function

values. For clustering activity, the original objective function values are used because the dis-
tances between sampling points have to be calculated. The members within a certain distance
form a cluster.
The output of the classification activity is usually a decision tree. Following the path of the
tree structure from the root to a leaf can determine the class of an unknown design point. At
each node, some rules are provided to decide whether to enter different branches. By applying
these rules, a design point can be assigned a class. The output of association activity includes
many rules. The use of these rules should be based on the number of sampling points covered
by the rule as well as the accuracy of the rule. The output of the clustering activity is a certain
number of clusters. The members in each cluster are given, and the centroid of the cluster is
also computed. The best cluster is chosen to obtain the useful information. In this article, the
algorithm RandomTree is selected for realizing classification activity. Meanwhile, the algorithms
PredictiveApriori and FarthestFirst are separately chosen for realizing association and clustering
activity.
An optimization problem will be demonstrated in the next section to show the detailed
procedure to reduce the design space by using Weka.

3.3 An example

This example shows how data mining activities are used to reduce the design space. The
mathematical formulation of the problem is:
$$\begin{aligned} \min F(x) &= -20\,e^{-0.2\sqrt{0.5(x_1^2 + x_2^2)}} - e^{0.5[\cos(2\pi x_1) + \cos(2\pi x_2)]} + 20 + e \\ \text{subject to} \quad & -30 \le x_1 \le 30 \\ & -30 \le x_2 \le 30 \end{aligned} \qquad (7)$$

This objective function is the Ackley function given in Branke and Schmidt's paper [6]. It is a highly noisy multi-modal function with many local minima distributed closely around the global minimum, as shown in Figure 2. There is only one global optimum, at (0, 0). It is hard

Figure 2. Ackley function.



to find the global optimum using any gradient-based method or evolutionary algorithm in the
whole design space. Thus, a smaller space that contains the global optimum is identified using the data mining software Weka.
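Equation (7) can be checked numerically; the following direct transcription (our own helper name) confirms that the function vanishes at the global optimum (0, 0) and is positive elsewhere.

```python
import numpy as np

def ackley(x1, x2):
    """Two-variable Ackley function of Eq. (7)."""
    return (-20.0 * np.exp(-0.2 * np.sqrt(0.5 * (x1**2 + x2**2)))
            - np.exp(0.5 * (np.cos(2 * np.pi * x1) + np.cos(2 * np.pi * x2)))
            + 20.0 + np.e)
```

At the origin the two exponentials equal 1 and e respectively, so the terms cancel exactly and F(0, 0) = 0.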
The ARFF file is the input file to Weka, which includes two parts. The first part of the ARFF
file defines the feature of each attribute that will appear later in each data line in the second
part of the input. The attribute must be identified as a special type which includes numeric data,
nominal data and binary data etc. Each line in the second part of the input file provides the data
related to each sampling point. Figure 3 shows an example of an input file provided with Weka for the contact-lenses data set. For the Ackley example, each line in the second part of the input contains three
items. The first two items are values of design variables x1 and x2 , respectively. For classification
and association activities, the third one is a class related to the objective function values which
are replaced by symbols ‘A, B, C or D’ in this paper. For clustering activity, the true objective
function values are given to the third item of a data line.
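A minimal writer for the two-part ARFF layout just described can be sketched as follows; this is our own helper, not part of Weka, and the attribute names mirror the Ackley example.

```python
def write_arff(path, relation, attributes, rows):
    """Write a minimal ARFF file: a @relation line and @attribute declarations
    (the first part), then the @data rows (the second part). `attributes` is a
    list of (name, type) pairs, e.g. ("x1", "numeric") or ("F", "{A,B,C,D}")
    for a nominal class attribute."""
    with open(path, "w") as f:
        f.write(f"@relation {relation}\n\n")
        for name, typ in attributes:
            f.write(f"@attribute {name} {typ}\n")
        f.write("\n@data\n")
        for row in rows:
            f.write(",".join(str(v) for v in row) + "\n")
```

For classification and association the third column holds the class symbol ('A'–'D'); for clustering it would hold the raw objective value instead.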

Figure 3. Input file ARFF to Weka.

Figure 4. Output file from the clustering activity.



Figure 5. Output file from the classification activity.

A total of 48 sampling points are generated in the whole design space using the LHD. Based on the distribution of the objective function values of these sampling points, one of the symbols 'A,
B, C and D’ is assigned to them according to their objective function values. The best level is
represented by symbol 'A' and the others follow in the order B, C, and D. Classes 'A', 'B', 'C' and 'D' contain 4, 16, 8 and 20 members, respectively, with the corresponding objective function values.
The outputs from Weka are depicted in Figures 4–6. Figure 4 shows the output of the clustering
activity. Four clusters are found. The sampling points in cluster 3 have the lowest mean objective
function value which is examined to find the boundary of this cluster. Since the largest value of
x1 for all points in cluster 3 is 6 and the smallest value of x1 for all points is − 6, 6 is chosen as
the upper bound for x1 and − 6 is the lower bound for x1 . A similar approach is applied to x2 to
determine the upper and lower bounds. Usually, a single reduced space will be identified by the
clustering activity. However, multiple best clusters may be found if the problem is multimodal and its optimal points are not located close together. In this case, the reduced space is the

PredictiveApriori
Best rules found:
1.  x1='[24-inf)'    x2='[18-inf)'    4 ==> F=D 4    conf:(1)
2.  x1='[24-inf)'    x2='[-6-6]'      2 ==> F=C 2    conf:(1)
3.  x1='[24-inf)'    x2='(-inf--18]'  4 ==> F=D 4    conf:(1)
4.  x1='[12-18]'     x2='[18-inf)'    1 ==> F=D 1    conf:(1)
5.  x1='[12-18]'     x2='(6-18)'      1 ==> F=B 1    conf:(1)
6.  x1='[12-18]'     x2='[-6-6]'      2 ==> F=B 2    conf:(1)
7.  x1='[12-18]'     x2='(-18--6)'    1 ==> F=B 1    conf:(1)
8.  x1='[12-18]'     x2='(-inf--18]'  1 ==> F=D 1    conf:(1)
9.  x1='(0-6]'       x2='[18-inf)'    1 ==> F=C 1    conf:(1)
10. x1='(0-6]'       x2='[18-inf)'    1 ==> F=B 1    conf:(1)
11. x1='(0-6]'       x2='[-6-6]'      2 ==> F=A 2    conf:(1)
12. x1='(0-6]'       x2='(-inf--18]'  1 ==> F=C 1    conf:(1)
13. x1='(0-6]'       x2='(-inf--18]'  1 ==> F=B 1    conf:(1)
14. x1='[-6-0)'      x2='[18-inf)'    1 ==> F=C 1    conf:(1)
15. x1='[-6-0)'      x2='[18-inf)'    1 ==> F=B 1    conf:(1)
16. x1='[-6-0)'      x2='[-6-6]'      2 ==> F=A 2    conf:(1)
17. x1='[-6-0)'      x2='(-inf--18]'  1 ==> F=C 1    conf:(1)
18. x1='[-6-0)'      x2='(-inf--18]'  1 ==> F=B 1    conf:(1)
19. x1='[-18--12]'   x2='[18-inf)'    1 ==> F=D 1    conf:(1)
20. x1='[-18--12]'   x2='[18-inf)'    1 ==> F=B 1    conf:(1)
21. x1='[-18--12]'   x2='(6-18)'      2 ==> F=B 2    conf:(1)
22. x1='[-18--12]'   x2='[-6-6]'      2 ==> F=B 2    conf:(1)
23. x1='[-18--12]'   x2='(-18--6)'    2 ==> F=B 2    conf:(1)
24. x1='[-18--12]'   x2='(-inf--18]'  1 ==> F=D 1    conf:(1)
25. x1='[-18--12]'   x2='(-inf--18]'  1 ==> F=B 1    conf:(1)
26. x1='(-inf--24]'  x2='[18-inf)'    4 ==> F=D 4    conf:(1)
27. x1='(-inf--24]'  x2='[-6-6]'      2 ==> F=C 2    conf:(1)
28. x1='(-inf--24]'  x2='(-inf--18]'  4 ==> F=D 4    conf:(1)

Figure 6. Output file from the association activity.

union of the spaces defined by the best clusters, that is, the smallest search space including all the spaces defined by the best clusters. Following the approach just mentioned gives the reduced design space as −6 ≤ x1 ≤ 6 and −6 ≤ x2 ≤ 6. The global solution indeed appears in this reduced space, which contains only 4% of the original design space.
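The bound-extraction step just described (take the componentwise minimum and maximum over the best cluster, or over the union of several best clusters) can be sketched as follows; the function name and dictionary interface are our own illustration.

```python
import numpy as np

def reduced_space(points_by_cluster, best):
    """Axis-aligned bounding box enclosing all points of the best cluster(s).
    With several best clusters, the box covers their union, i.e. the smallest
    search space containing all of them."""
    pts = np.vstack([points_by_cluster[c] for c in best])
    return pts.min(axis=0), pts.max(axis=0)   # lower and upper variable bounds
```

Applied to a cluster whose extreme x1 and x2 values are ±6, this reproduces the reduced space −6 ≤ x1 ≤ 6, −6 ≤ x2 ≤ 6 found above.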
Figure 5 shows a tree structure, which is the outcome obtained from the classification activity.
Each leaf bearing a class symbol is the end of a route. At each node on the tree, some rules
are given to direct the user to different routes. Since class ‘A’ is the best one of interest, the
routes leading to ‘A’ can be obtained from the root of the tree through nodes and branches to
the leaves marked ‘A’. In this way, we can simply traverse the tree from left to right and pass
through the nodes and branches to the leaf marked ‘A’. The ranges for the design variables are
−9 ≤ x1 ≤ 9 and − 12 ≤ x2 ≤ 12 based on the approach just mentioned. The global solution
also appears in this reduced space. This area covers only 12% of the original design space.
Figure 6 shows the output of the association activity, which consists of many rules leading to different classes. There are two rules related to class 'A'; that is, two different regions are identified by these rules. In each rule, 'F=A' means the rule leads to class 'A' points, and the number after 'A' is the number of sampling points covered by the rule. At the end of each rule,
conf: (1) gives the confidence fraction in parentheses. This confidence fraction is defined as the
number of sampling points that belong to class ‘A’ divided by the number of sampling points
that satisfy the rule; 1 represents 100% confidence in the rule. Therefore, rules 11 and 16 provide information that can be used to identify the class 'A' samples. The union of these two
design spaces is defined by −6 ≤ x1 ≤ 6 and − 6 ≤ x2 ≤ 6 which is only 4% of the original
design space.

For this example problem, the clustering and association activities yield the same reduced
search space −6 ≤ x1 ≤ 6 and −6 ≤ x2 ≤ 6. Although the classification activity yields a different reduced search space, both reduced spaces contain the global optimal solution to the
problem. The chance of finding the global optimal solution in these smaller search spaces is much
greater than that of finding it in the entire design space. The novel SAO approach combining data
mining techniques and surrogate model method is introduced in detail in the next section.

4. The novel SAO approach

Following the concepts proposed in the previous sections, the SAO approach using data mining
is proposed in this research to find the optimal solution for engineering design optimization
problems.
The approach can be roughly divided into four phases: (1) obtain the reduced search space
using data mining; (2) construct the surrogate model using RBFs; (3) add new sampling points
to the sample to update the surrogate model; (4) search with SA. The steps of the SAO approach
are as follows.
Step 1: Initialize the parameters a, t, LH_w, LH_s, ds , ε and mmax , and set the counters a = 0
and count = 0.
Step 2: Generate an initial number of sampling points (LH_w) in the whole design space using
the LHD.
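A minimal Latin hypercube design can be sketched with NumPy as below; the paper does not specify which LHD variant it uses, so this is a generic stratified construction in which each of the m equal-width strata of every variable receives exactly one point.

```python
import numpy as np

def latin_hypercube(m, lower, upper, rng=None):
    """Generate m points in the box [lower, upper] with a Latin hypercube:
    each of the m strata of every design variable holds exactly one point."""
    rng = np.random.default_rng(rng)
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    n = lower.size
    # One uniform draw inside each stratum, with the strata independently
    # shuffled per dimension, then mapped from [0, 1) to the design box.
    strata = rng.permuted(np.tile(np.arange(m), (n, 1)), axis=1).T
    u = (strata + rng.random((m, n))) / m
    return lower + u * (upper - lower)

pts = latin_hypercube(10, [-1, -1], [1, 1], rng=0)
print(pts.shape)  # (10, 2)
```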
Step 3: Analyse and calculate the exact objective function values through simulation analysis
programs such as FEA and CFD.
Step 4: Check whether Mod(a, 3) = 0. If it is true, go to the next step. Otherwise, go to Step 8.
Step 5: Obtain the reduced search space by data mining techniques, that is, classification,
association and clustering activities.
Step 6: Generate a certain number of sampling points (LH_s) in the reduced search space using
the LHD and add them to the sample.
Step 7: Analyse and calculate the exact objective function values as in Step 3.
Step 8: Construct the surrogate model using RBFs, based on all sampling points in the sample,
to replace the practical engineering problem.
Step 9: Use the optimization algorithm SA to find the optimal solution of the surrogate model in
the reduced search space. Add the optimal point to the sample. If the optimal solution f_opt^k
satisfies the convergence criterion formulated as Equation (8), exit. Otherwise, go to the
next step.

    (1/2) ( |(f_opt^k − f_opt^{k−1}) / f_opt^{k−1}| + |(f_opt^{k−1} − f_opt^{k−2}) / f_opt^{k−2}| ) ≤ ε,    (8)

where f_opt^k is the optimal solution in the kth iteration.
Step 10: Construct the density function with RBFs using all sampling points in the sample. Add
the minimum point of the density function to the sample.
Step 11: Check whether count ≤ t. If it is true, set count = count + 1 and go to Step 10. Otherwise,
calculate the total number of sampling points m in the sample and go to the next step.
Step 12: Check whether m < mmax . If it is true, set a = a + 1 and go to Step 3. Otherwise, exit.
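Steps 1–12 can be condensed into a driver loop. The sketch below is a control-flow illustration only: to stay self-contained, the data-mining step is replaced by a best-quartile bounding box, and the RBF surrogate, SA search and density function are replaced by crude random-sampling stand-ins. All of these stand-ins, and the LHD approximation by plain uniform sampling, are assumptions of this sketch rather than the paper's actual components.

```python
import numpy as np

def sao_skeleton(f, lower, upper, lh_w=10, lh_s=5, m_max=200, eps=0.01, seed=0):
    """Control-flow sketch of Steps 1-12 with simplified stand-in components."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    n = lower.size
    t = n // 2 + 1                                 # t = int(n/2) + 1
    lo, hi = lower.copy(), upper.copy()            # current (reduced) space

    X = rng.uniform(lower, upper, (lh_w, n))       # Step 2: initial design (stand-in for LHD)
    y = np.array([f(x) for x in X])                # Step 3: exact evaluations
    f_hist = []
    a = 0
    while len(y) < m_max:                          # Step 12: evaluation budget
        if a % 3 == 0:                             # Step 4: every third cycle
            q = np.quantile(y, 0.25)               # Step 5: data-mining stand-in
            good = X[y <= q]
            lo, hi = good.min(axis=0), good.max(axis=0)
            Xs = rng.uniform(lo, hi, (lh_s, n))    # Step 6: points in reduced space
            X = np.vstack([X, Xs])                 # Step 7: evaluate and add
            y = np.append(y, [f(x) for x in Xs])
        # Steps 8-9: surrogate + SA search, here a plain random search in [lo, hi]
        best = min(rng.uniform(lo, hi, (50, n)), key=f)
        X = np.vstack([X, best])
        y = np.append(y, f(best))
        f_hist.append(y.min())
        if len(f_hist) >= 3:                       # Equation (8) convergence check
            fk, f1, f2 = f_hist[-1], f_hist[-2], f_hist[-3]
            if f1 and f2 and 0.5 * (abs((fk - f1) / f1) + abs((f1 - f2) / f2)) <= eps:
                break
        for _ in range(t):                         # Steps 10-11: density-function stand-in
            X = np.vstack([X, rng.uniform(lower, upper)])
            y = np.append(y, f(X[-1]))
        a += 1
    i = y.argmin()
    return X[i], y[i]

x_best, f_best = sao_skeleton(lambda x: float(np.sum(x ** 2)), [-5, -5], [5, 5])
print(f_best)  # best sphere-function value found within the evaluation budget
```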
In the above approach, the data mining technique is employed to obtain the reduced search
space every three iterations. The parameter a controls the operating frequency of the data
mining technique. Thus, an increasingly accurate reduced search space is identified iteratively
during the optimization procedure. If Mod(a, 3) = 0, the data mining technique is used to obtain
the reduced search space with the help of all the sampling points in the sample, where Mod()
denotes the modulo (remainder) operation.
In this work, the density function is constructed to find the sparse region. The minimum point
of the density function is taken as a new sampling point. This step is repeated while the criterion
count ≤ t is satisfied. The parameter t controls the number of sampling points obtained from
the density function; in this paper, t = int(n/2) + 1, where int() denotes truncation to an integer
and n is the number of design variables. A new sampling point is added as long as the counter
count is less than t, and the counter is incremented as count = count + 1. In addition, the
optimal point of the surrogate model is also found and taken as a new sampling point. Thus,
an increasingly accurate surrogate model is constructed by repeatedly adding new sampling
points to the sample. In order to reach the global optimum of the response surface in the
reduced search space faster, a certain number of sampling points in the reduced search space
are generated using the LHD as training data for the RBFs.

Figure 7. Flowchart of the SAO approach.
Finding the global minimum with good accuracy may take too much time using the proposed
optimization method. Therefore, the parameter mmax is given to control the maximum number
of expensive function evaluations allowed, that is, the maximum number of sampling points
used in the sample. If m < mmax is satisfied, another iteration cycle of the optimization
procedure is computed; otherwise the optimization procedure is ended and the best solution
found is regarded as an approximation of the global minimum.
It has been found from a preliminary study that in some cases the new sampling points may be
close to one of the existing sampling points. Therefore, to avoid new sampling points clustering
around the existing sampling points, the distance between them is taken into consideration:
new sampling points that satisfy Equation (9) are discarded,

    d(x, xp) < ds ,    ds = ( Σ_{i=1}^{n} (x_i^U − x_i^L)² )^{0.5},    (9)

where xp is an existing sampling point, x is the new sampling point, d is the Euclidean distance,
n is the number of design variables, and x_i^U and x_i^L are the upper and lower bounds of the
ith design variable.
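A sketch of this proximity filter is shown below. Since the scale of ds relative to the box diagonal is not fully recoverable from this section, a fraction `alpha` of the diagonal is used as an assumed threshold.

```python
import numpy as np

def too_close(x_new, X, lower, upper, alpha=0.01):
    """Reject a candidate lying within alpha * (box diagonal) of any existing
    sample, in the spirit of Equation (9); the factor alpha is an assumption."""
    lower = np.asarray(lower, float)
    upper = np.asarray(upper, float)
    diag = np.sqrt(np.sum((upper - lower) ** 2))          # diagonal of the design box
    d = np.sqrt(((np.asarray(X, float) - np.asarray(x_new, float)) ** 2).sum(axis=1))
    return bool(d.min() < alpha * diag)                   # True -> discard the candidate

X = np.array([[0.0, 0.0], [1.0, 1.0]])
print(too_close([0.001, 0.0], X, [-1, -1], [1, 1]))  # True: nearly duplicates (0, 0)
print(too_close([0.5, -0.5], X, [-1, -1], [1, 1]))   # False: far from both samples
```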
Figure 7 shows the flowchart of the proposed SAO approach. In this flowchart, data mining
is used to obtain the reduced search space. The optimal solution is then explored in the reduced
search space, which is smaller than the original design space. When the optimization approach
is used to perform engineering optimization, the surrogate model constructed with RBFs
replaces the experiments or expensive simulation analyses to reduce the computational
time required for the SA search. In the approach, the surrogate model is constructed repeatedly
through the addition of new sampling points; a more accurate approximate model is thus
obtained by adding points to the sample adaptively and sequentially. Finally, the optimal
solution is found by the SA search.
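The RBF interpolation underlying the surrogate can be sketched as follows. A Gaussian basis is used here for concreteness; the paper's specific basis function and any polynomial tail are not restated in this section, so this is a generic sketch rather than the authors' exact model.

```python
import numpy as np

def fit_rbf(X, y, width=1.0):
    """Interpolating RBF surrogate: s(x) = sum_j w_j * phi(||x - x_j||),
    with phi(r) = exp(-(r / width)^2). Returns a callable predictor."""
    X = np.asarray(X, float)
    y = np.asarray(y, float)
    r = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))  # pairwise distances
    Phi = np.exp(-(r / width) ** 2)                              # Gaussian basis matrix
    w = np.linalg.solve(Phi, y)                                  # interpolation weights
    def predict(x):
        d = np.sqrt(((X - np.asarray(x, float)) ** 2).sum(-1))
        return float(np.exp(-(d / width) ** 2) @ w)
    return predict

# Usage: interpolate sin on five one-dimensional sampling points
X = np.array([[0.0], [0.5], [1.0], [1.5], [2.0]])
y = np.sin(X[:, 0])
s = fit_rbf(X, y)
print(abs(s([0.5]) - np.sin(0.5)) < 1e-6)  # True: the model interpolates its training data
```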

5. Results and discussions

In this section, the validity of the proposed SAO approach will be examined through some
well-known benchmark problems and one engineering design optimization problem. In addition,
the proposed SAO approach is compared with other SAO approaches through some numerical
examples.

5.1 Application to several benchmark problems

The validity of the proposed SAO approach is tested on five typical benchmark problems. These
problems and their global minima fˆmin are listed in Table 2. Ten initial sampling points (LH_w)
are generated by the LHD for data mining to obtain the reduced search space; setting LH_w
to 5n is advisable for new problems in most cases. Furthermore, five more sampling points
(LH_s) are generated in the reduced search space every three iterations for constructing the
surrogate models in the whole design space. The global optimum of the surrogate model,
however, is searched in the reduced search space. For each test problem, 10 trials are carried
out with different random seeds
Table 2. Benchmark problems and global minimum.

No.  Function                                                                          Design space               fˆmin
1    f(x) = x1² + x2² − cos(18x1) − cos(18x2)                                          −1 ≤ x ≤ 1                 −2
2    f(x) = (x2 − (5.1/(4π²))x1² + (5/π)x1 − 6)² + 10(1 − 1/(8π)) cos(x1) + 10         −5 ≤ x1 ≤ 10, 0 ≤ x2 ≤ 15  0.398
3    f(x) = x1 sin(x2) + x2 sin(x1)                                                    −2π ≤ x ≤ 2π               −9.629
4    f(x) = 2 + 0.01(x2 − x1²)² + (1 − x1)² + 2(2 − x2)² + 7 sin(0.5x1) sin(0.7x1x2)   0 ≤ x ≤ 5                  −1.4565
5    f(x) = Σ_{i=1}^{2} |xi sin(xi) + 0.1xi|                                           −10 ≤ x ≤ 10               0

Table 3. Results of benchmark problems.

                                    Test 1           Test 2        Test 3        Test 4           Test 5
Minimum of global optima           −1.9851           0.3983       −9.6290       −1.4543           0.0001
Maximum of global optima           −1.9827           0.3997       −9.6283       −1.4486           0.0004
Average of global optima           −1.9840           0.3989       −9.6287       −1.4521           0.0003
Standard deviation of              0.0153            0.0003        0.0003        0.0062           0.0002
  global optima
Reduced search space        −0.5 ≤ x1 ≤ 0.5   5 ≤ x1 ≤ 10   π ≤ x1 ≤ 2π   1.5 ≤ x1 ≤ 3.5   −2.5 ≤ x1 ≤ 3.5
                            −0.5 ≤ x2 ≤ 0.5   0 ≤ x2 ≤ 5    π ≤ x2 ≤ 2π   1.5 ≤ x2 ≤ 3.5   −3 ≤ x2 ≤ 3

to avoid unrepresentative numerical results. In addition, the maximal number of sampling points
mmax in the optimization procedure is set to 200, and the threshold ε is set to 0.01 in most
cases. The average results are shown in Table 3. It is clear from Table 3 that the proposed SAO
approach is valid for the benchmark problems considered here.

5.2 Comparison with other SAO approaches

In this section, the proposed SAO approach has been tested on 11 typical benchmark opti-
mization problems and compared with two other SAO approaches, mode-pursuing sampling
(MPS) [41] and hybrid and adaptive meta-modelling (HAM) [14], both of which are particularly
suited to optimization problems involving computation-intensive, black-box analyses and
simulations, which are often complex and high-dimensional. The test examples are listed as
follows; the coefficients in Equations (16)–(18) are defined in Table 4.
(1) Six-hump Camel-Back function (SC) with n = 2 [41]

f(x) = 4x1² − 2.1x1⁴ + x1⁶/3 + x1x2 − 4x2² + 4x2⁴,  x ∈ [−2, 2]ⁿ,  fmin = −1.0316. (10)
(2) Griewank function (GN) with n = 2 [41]

f(x) = Σ_{i=1}^{n} xi²/200 − Π_{i=1}^{n} cos(xi/√i) + 1,  x ∈ [−100, 100]ⁿ,  fmin = 0. (11)

(3) Generalized polynomial function (GF) with n = 2 [47]

f(x) = (1.5 − x1(1 − x2))² + (2.25 − x1(1 − x2²))² + (2.625 − x1(1 − x2³))²,
x ∈ [−2, 2]ⁿ,  fmin = 0.5233. (12)
(4) Goldstein and Price function (GP) with n = 2 [41]

f(x) = [1 + (x1 + x2 + 1)²(19 − 14x1 + 3x1² − 14x2 + 6x1x2 + 3x2²)] ×
[30 + (2x1 − 3x2)²(18 − 32x1 + 12x1² + 48x2 − 36x1x2 + 27x2²)],
x ∈ [−2, 2]ⁿ,  fmin = 3. (13)

(5) Leon function (LN) with n = 2 [1]

f(x) = 100(x2 − x1²)² + (x1 − 1)²,  x ∈ [−10, 10]ⁿ,  fmin = 0. (14)

(6) Himmelblau function (HM) with n = 2 [1]

f(x) = (x1² + x2 − 11)² + (x1 + x2² − 7)²,  x ∈ [−6, 6]ⁿ,  fmin = 0. (15)

(7) Shekel 5 function (S5) with n = 4 [46]

f(x) = −Σ_{i=1}^{5} ( Σ_{j=1}^{n} (xj − aij)² + ci )⁻¹,  x ∈ [0, 10]ⁿ,  fmin = −10.153. (16)

(8) Shekel 7 function (S7) with n = 4 [46]

f(x) = −Σ_{i=1}^{7} ( Σ_{j=1}^{n} (xj − aij)² + ci )⁻¹,  x ∈ [0, 10]ⁿ,  fmin = −10.403. (17)

(9) Shekel 10 function (S10) with n = 4 [46]

f(x) = −Σ_{i=1}^{10} ( Σ_{j=1}^{n} (xj − aij)² + ci )⁻¹,  x ∈ [0, 10]ⁿ,  fmin = −10.536. (18)

(10) Hartman function (HN) with n = 6 [41]

f(x) = −Σ_{i=1}^{4} ci exp( −Σ_{j=1}^{n} αij (xj − pij)² ),  x ∈ [0, 1]ⁿ,  fmin = −3.322, (19)

where

[αij] = ⎡ 10    3    17    3.5   1.7    8 ⎤
        ⎢ 0.05  10   17    0.1   8     14 ⎥
        ⎢ 3     3.5  1.7   10    17     8 ⎥
        ⎣ 17    8    0.05  10    0.1   14 ⎦,

[ci] = [1  1.2  3  3.2],

[pij] = ⎡ 1312  1696  5569   124  8283  5886 ⎤
        ⎢ 2329  4135  8307  3736  1004  9991 ⎥
        ⎢ 2348  1451  3522  2883  3047  6650 ⎥
        ⎣ 4047  8828  8732  5743  1091   381 ⎦ × 10⁻⁴;
Table 4. Coefficients in Shekel functions.

i aij , j = 1, . . . , 4 ci

1 4 4 4 4 0.1
2 1 1 1 1 0.2
3 8 8 8 8 0.2
4 6 6 6 6 0.4
5 3 7 3 7 0.4
6 2 9 2 9 0.6
7 5 5 3 3 0.3
8 8 1 8 1 0.7
9 6 2 6 2 0.5
10 7 3.6 7 3.6 0.5

(11) A function of 16 variables (F16) with n = 16 [41]

f(x) = Σ_{i=1}^{n} Σ_{j=1}^{n} aij (xi² + xi + 1)(xj² + xj + 1),  x ∈ [−1, 0]ⁿ,  fmin = 25.875, (20)

where

        ⎡ 1 0 0 1 0 0 1 1 0 0 0 0 0 0 0 1 ⎤
        ⎢ 0 1 1 0 0 0 1 0 0 1 0 0 0 0 0 0 ⎥
        ⎢ 0 0 1 0 0 0 1 0 1 1 0 0 0 1 0 0 ⎥
        ⎢ 0 0 0 1 0 0 1 0 0 0 1 0 0 0 1 0 ⎥
        ⎢ 0 0 0 0 1 1 0 0 0 1 0 1 0 0 0 1 ⎥
        ⎢ 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 ⎥
        ⎢ 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 ⎥
[aij] = ⎢ 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 ⎥
        ⎢ 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 ⎥
        ⎢ 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 ⎥
        ⎢ 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 ⎥
        ⎢ 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 ⎥
        ⎢ 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 ⎥
        ⎢ 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 ⎥
        ⎢ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 ⎥
        ⎣ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 ⎦.
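As a sanity check on the transcriptions above, the SC function (Equation (10)) and the Shekel functions (Equations (16)–(18), with the coefficients of Table 4) can be evaluated near their known minimizers. The SC minimizer (0.0898, −0.7126) is quoted from standard references rather than from this paper.

```python
def sc(x1, x2):
    # Six-hump Camel-Back function, Equation (10)
    return 4*x1**2 - 2.1*x1**4 + x1**6/3 + x1*x2 - 4*x2**2 + 4*x2**4

# Shekel coefficients a_ij and c_i from Table 4
A = [[4, 4, 4, 4], [1, 1, 1, 1], [8, 8, 8, 8], [6, 6, 6, 6], [3, 7, 3, 7],
     [2, 9, 2, 9], [5, 5, 3, 3], [8, 1, 8, 1], [6, 2, 6, 2], [7, 3.6, 7, 3.6]]
C = [0.1, 0.2, 0.2, 0.4, 0.4, 0.6, 0.3, 0.7, 0.5, 0.5]

def shekel(x, m):
    # Equations (16)-(18): m = 5, 7 or 10 summands
    return -sum(1.0 / (sum((xj - aij)**2 for xj, aij in zip(x, A[i])) + C[i])
                for i in range(m))

print(round(sc(0.0898, -0.7126), 4))                 # -1.0316, the listed fmin
x = (4.0, 4.0, 4.0, 4.0)
print([round(shekel(x, m), 3) for m in (5, 7, 10)])  # [-10.153, -10.403, -10.536]
```

The three Shekel values reproduce the minima listed under Equations (16)–(18), which confirms the Table 4 coefficients.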

To avoid unrepresentative numerical results, 10 runs are carried out for each test example.
Ten initial sampling points are generated by the LHD for data mining to obtain the reduced
search space for each test function. Furthermore, five more sampling points are generated in the
reduced search space every three iterations for constructing the surrogate models in the whole
design space, and the maximal number of sampling points mmax is again set to 200.
In engineering design optimization problems, the computation time is largely proportional to
the number of black-box function evaluations. Therefore, reducing the number of function
evaluations (NFE) is an important goal for SAO approaches. From this viewpoint, the mean
NFE is used to measure computational efficiency, and the global minimum obtained is reported
to illustrate the validity of the approaches. The proposed SAO algorithm is compared with the
other SAO approaches on the benchmark problems; the average NFE and global minimum
values are shown in Table 5.
As shown by the average results in Table 5, the proposed SAO approach performed well on
all test examples in terms of both NFE and global minimum. The MPS and HAM approaches require
Table 5. Comparison results with other optimization methods.

                                           MPS                 HAM             Proposed method
Fun.  fmin      Reduced search space   NFE     f˜min       NFE     f˜min       NFE     f˜min
SC   −1.0316    −1 ≤ x ≤ 1             37.8   −1.0308      49.6   −1.0310      35.7   −1.0315
GN    0         −20 ≤ x ≤ 20          371.0    0.6850      72.8    0.0537      54.3    0.0018
GF    0.5233     0 ≤ x ≤ 2             44.7    0.7012      88.4    0.5367      42.2    0.5237
GP    3         −1.5 ≤ x ≤ 0.5        138.0    3.1080     165.1    3.0006     117.4    3.0011
LN    0         −3 ≤ x ≤ 3            139.5    3.4341     160.6    0.0030      75.6    0.0025
HM    0         −3.5 ≤ x ≤ 3.5         57.0    0.0388      81.2    0.0053      43.5    0.0013
S5   −10         3 ≤ x ≤ 5            108.3   −7.8366      95.3   −8.4652      42.2   −10.063
S7   −10         2.5 ≤ x ≤ 5.5        103.9   −8.3987      86.2   −9.3684      45.5   −10.326
S10  −10         2.5 ≤ x ≤ 5.5        105.5   −9.0233      80.6   −9.3285      53.6   −10.463
HN   −3.3224     0.1 ≤ x ≤ 0.7        592.1   −3.2350     132.0   −3.3069     113.9   −3.3078
F16  25.8750    −0.7 ≤ x ≤ −0.2       254.8   25.9150     250.6   26.3363     166.8   25.8970

more sampling points, that is, more expensive function evaluations, which makes them less
suitable than the proposed SAO approach for solving engineering design optimization problems.

Figure 8. Diagram of tension/compression spring design.

5.3 An engineering design optimization problem

The validity of the SAO approach is tested on a typical mechanical design optimization problem
involving three design variables, that is, tension/compression spring design. Coello Coello [11]
and Ray and Saini [33] have used this as a benchmark problem in structural optimization.
The schematic of the tension/compression spring is shown in Figure 8. The problem consists of
minimizing the weight of a tension/compression spring subject to constraints on the minimum
deflection, shear stress, surge frequency, the outside diameter and the design variables. Three
variables are identified: the diameter d, the mean coil diameter D, and the number of active
coils N. In this case, the variable vector is given by

X = (d, D, N) = (x1 , x2 , x3 ). (21)

The mathematical model of the optimization problem is expressed as

min  f(X) = (2 + x3)x1²x2
s.t. g1(X) = 1 − x2³x3/(71785x1⁴) ≤ 0
     g2(X) = (4x2² − x1x2)/(12566(x2x1³ − x1⁴)) + 1/(5108x1²) − 1 ≤ 0          (22)
     g3(X) = 1 − 140.45x1/(x2²x3) ≤ 0
     g4(X) = (x1 + x2)/1.5 − 1 ≤ 0.

Table 6. Comparison of results on the optimum design of tension/compression spring.

                       x1       x2       x3        g1        g2        g3        g4       NFE       fˆmin
Coello Coello [11]    0.0515   0.3517   11.6322   −0.0021   −0.0001   −4.0263   −0.7312   900,000   0.0128
Ray and Saini [33]    0.0504   0.3215   13.9799   −0.0019   −0.0129   −3.8994   −0.7520   1291      0.0134
Kitayama et al. [23]  0.0500   0.3148   14.6500   −0.0188   −0.0066   −3.8378   −0.7568   66.0      0.0131
Proposed approach     0.0500   0.3152   14.3211   −0.0004   −0.0055   −3.9356   −0.7565   52.3      0.0128

The ranges of the design variables x1 ∈ [0.05, 2], x2 ∈ [0.25, 1.3], x3 ∈ [2, 15] are used.
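A direct transcription of the spring model of Equation (22) is shown below, evaluated at the average optimum reported for the proposed approach in Table 6. With the rounded decision variables, g1 sits within about 10⁻³ of its boundary, and the weight agrees with the reported 0.0128 to reporting precision.

```python
def spring(x1, x2, x3):
    """Objective and constraints of Equation (22):
    x1 = diameter d, x2 = mean coil diameter D, x3 = active coils N."""
    f = (2 + x3) * x1**2 * x2                                      # spring weight
    g1 = 1 - x2**3 * x3 / (71785 * x1**4)                          # deflection
    g2 = ((4*x2**2 - x1*x2) / (12566 * (x2*x1**3 - x1**4))
          + 1 / (5108 * x1**2) - 1)                                # shear stress
    g3 = 1 - 140.45 * x1 / (x2**2 * x3)                            # surge frequency
    g4 = (x1 + x2) / 1.5 - 1                                       # outside diameter
    return f, (g1, g2, g3, g4)

f, g = spring(0.0500, 0.3152, 14.3211)
print(f)  # weight of approximately 0.0129 at the rounded optimum from Table 6
print(g)  # g2, g3, g4 negative (feasible); g1 within ~1e-3 of its boundary
```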
The problem formulated above is a simple nonlinear constrained problem. The constraints
are handled following the work presented by Regis [35]. The objective and constraint functions
defined by Equation (22) are now assumed to be computation-intensive, so the reduction of the
NFE is considered; hence, surrogate models of the objective and constraint functions are
constructed by RBFs.
Ten initial sampling points are generated by the LHD for data mining to obtain the reduced
search space for each test function. Furthermore, five more sampling points are generated in the
reduced search space for constructing the surrogate models in the whole design space every three
iterations. And the maximal number of sampling points mmax is also set to 200. The final reduced
search space obtained by data mining is x1 ∈ [0.05, 0.25], x2 ∈ [0.25, 0.4], x3 ∈ [13, 15].
For a fair comparison, 11 trials are performed, following the past studies [11,33]. The
average values obtained by applying the proposed SAO approach are shown in Table 6. It is
clear from Table 6 that the number of function evaluations is drastically reduced in comparison
with those of the past studies, while the global minimum obtained by the proposed SAO approach
is the best.

6. Conclusion

It is difficult or even impossible to find the global optimum for most engineering design
optimization problems owing to the unaffordable computational cost. Generally, the expensive
simulation analysis is replaced by a surrogate model constructed with RBFs to alleviate this
difficulty. However, when exploring an unknown system using a surrogate model, designers
always face the trade-off between the number of expensive samples and the accuracy of the
surrogate model. If a complex system is to be accurately approximated across a relatively
large design space, the number of required sampling points (i.e. function evaluations) might be
prohibitive. Therefore, reducing the search space decreases the number of sampling points
while increasing the chance of finding the global optimum. In this paper, an SAO approach that
integrates a hybrid optimization algorithm with data mining and a surrogate model is proposed
to find the global optimum in engineering design optimization. It focuses on a small region in
which to build a surrogate model so that an accurate global optimum can be obtained. The
contributions of the proposed approach are summarized as follows:

(1) Data mining to obtain the reduced search space: the data mining activities of classification,
association and clustering in the free software Weka are used to reduce the large design space to
a smaller search space that includes the global optimal solution; the likelihood of finding the
global solution is thereby significantly increased.
(2) Sequential approximation process: initial sampling points are generated by the LHD in the
reduced search space to construct the surrogate model with RBFs. The optimal points of the
surrogate model are taken as new sampling points in order to improve the local accuracy. In
addition, new sampling points in the sparse region are required for better global approximation;
to determine the sparse region, the density function constructed by the RBF network is applied.
The surrogate model is constructed repeatedly through the addition of the new sampling points,
so an increasingly accurate surrogate model is obtained by adding points to the sample
adaptively and sequentially.
The validity and effectiveness of the proposed SAO approach are examined by studying typical
numerical examples.

Acknowledgements
Authors would like to thank everybody for their encouragement and support.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
The grant support from the National Science Foundation [CMMI-51375389 and 51279165] is gratefully acknowledged.

ORCID
Pengcheng Ye http://orcid.org/0000-0002-9898-441X

References

[1] E.P. Adorio, MVF-multivariate test functions library in c for unconstrained global optimization (2005). Available
at http://www.geocities.ws/eadorio/mvf.pdf.
[2] P. Adriaans and D. Zantinge, Data Mining, Addison-Wesley Professional, New York, 1996.
[3] G.S. Babu and S. Suresh, Sequential projection-based metacognitive learning in a radial basis function network for
classification problems, IEEE Trans. Neural Netw. Learn. Syst. 24(2) (2013), pp. 194–206.
[4] M.J.A. Berry and G.S. Linoff, Mastering Data Mining, John Wiley & Sons, New York, 2000.
[5] P. Borges, T. Eid, and E. Bergseng, Applying simulated annealing using different methods for the neighborhood
search in forest planning problems, European J. Oper. Res. 233(3) (2014), pp. 700–710.
[6] J. Branke and C. Schmidt, Faster convergence by means of fitness estimation, Soft Comput. 9 (2005), pp. 13–20.
[7] M.D. Buhmann, Radial Basis Functions, Cambridge University Press, Cambridge, 2003.
[8] T.Y. Chen and Y.L. Cheng, Data-mining assisted structural optimization using the evolutionary algorithm and
neural network, Eng. Optim. 12(3) (2010), pp. 205–222.
[9] T.Y. Chen and J.H. Huang, An efficient and practical approach to obtain a better optimum solution for structural
optimization, Eng. Optim. 45(8) (2013), pp. 1005–1026.
[10] T.Y. Chen and J.H. Huang, Application of data mining in a global optimization algorithm, Adv. Eng. Softw. 66
(2013), pp. 24–33.
[11] C.A. Coello Coello, Use of a self-adaptive penalty approach for engineering optimization problems, Comput. Ind.
41 (2000), pp. 113–127.
[12] I. Couckuyt, S. Koziel, and T. Dhaene, Surrogate modeling of microwave structures using kriging, co-kriging, and
space mapping, Int. J. Numer. Model., Electron. Netw. Devices Fields 26(1) (2013), pp. 64–73.
[13] A.I.J. Forrester and A.J. Keane, Recent advances in surrogate-based optimization, Prog. Aerosp. Sci. 45 (2009),
pp. 50–79.
[14] J. Gu, G.Y. Li, and Z. Dong, Hybrid and adaptive meta-model-based global optimization, Eng. Optim. 44(1) (2012),
pp. 87–104.
[15] L. Gu, A comparison of polynomial based regression models in vehicle safety analysis, in ASME Design Engi-
neering Technical Conferences—Design Automation Conference (DAC), A. Diaz, ed., Pittsburgh, PA, USA, 2001,
pp. 9–12.
Optimization Methods & Software 1275

[16] H.M. Gutmann, A radial basis function method for global optimization, J. Global Optim. 19 (2001), pp. 201–227.
[17] J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann Publishers, San Francisco, 2001.
[18] R.L. Hardy, Multiquadric equations of topography and other irregular surfaces, J. Geophys. Res. 76(8) (1971),
pp. 1905–1915.
[19] S. Jeong and S. Obayashi, Efficient global optimization (EGO) for multi-objective problem and data mining. The
2005 IEEE Congress on Evolutionary Computation, 3, Edinburgh, UK, 2005, pp. 2138–2145.
[20] R. Jin, W. Chen, and T.W. Simpson, Comparative studies of metamodeling techniques under multiple modeling
criteria, Struct. Optim. 23 (2001), pp. 1–13.
[21] D.R. Jones, A taxonomy of global optimization methods based on response surfaces, J. Global Optim. 21 (2001),
pp. 345–383.
[22] D.R. Jones, M. Schonlau, and W.J. Welch, Efficient global optimization of expensive black-box functions, J. Global
Optim. 13 (1998), pp. 455–492.
[23] S. Kitayama, M. Arakawa, and K. Yamazaki, Sequential approximate optimization using radial basis function
network for engineering optimization, Optim. Eng. 12(4) (2011), pp. 535–557.
[24] P.N. Koch, T.W. Simpson, J.K. Allen, and F. Mistree, Statistical approximations for multidisciplinary design
optimization: The problem of size, J. Aircr. 36(1) (1999), pp. 275–286.
[25] T. Krishnamurthy, Response surface approximation with augmented and compactly supported radial basis func-
tions, Proceedings of the 44th AIAA/ASME/ASCE/AHS/ASC Structures, Structural Dynamics, and Materials
Conference, vol. 1748, pp. 3210–3224, April 2003, AIAA Paper.
[26] H. Kurtaran, A. Eskandarian, D. Marzougui, and N.E. Bedewi, Crashworthiness design optimization using
successive response surface approximations, Comput. Mech. 29 (2002), pp. 409–421.
[27] A.J. Makadia and J.I. Nanavati, Optimisation of machining parameters for turning operations based on response
surface methodology, Measurement 46(4) (2013), pp. 1521–1529.
[28] J.D. Martin and T.W. Simpson, Use of kriging models to approximate deterministic computer models, AIAA J.
43(4) (2005), pp. 853–863.
[29] C.A. Micchelli, Interpolation of scattered data: Distance matrices and conditionally positive definite functions,
Constr. Approx. 2(1) (1986), pp. 11–22.
[30] R.H. Myers and D.C. Montgomery, Response Surface Methodology: Process and Product Optimization Using
Designed Experiments, John Wiley & Sons, New York, 1995.
[31] J. Park and I.W. Sandberg, Universal approximation using radial basis function networks, Neural Comput. 3 (1991),
pp. 246–257.
[32] M.J.D. Powell, The theory of radial basis function approximation in 1990, in Advances in Numerical Analysis,
Volume 2: Wavelets, Subdivision Algorithms and Radial Basis Functions, W. Light ed., Oxford University Press,
Oxford, UK, 1992, pp. 105–210.
[33] T. Ray and P. Saini, Engineering design optimization using swarm with an intelligent information sharing among
individuals, Eng. Optim. 33 (2001), pp. 735–748.
[34] L.M. Rios and N.V. Sahinidis, Derivative-free optimization: A review of algorithms and comparison of software
implementations, J. Global Optim. 56(3) (2013), pp. 1247–1293.
[35] R.G. Regis, Constrained optimization by radial basis function interpolation for high-dimensional expensive black-
box problems with infeasible initial point, Eng. Optim. 46(2) (2014), pp. 218–243.
[36] J. Sacks, W.J. Welch, T.J. Mitchell, and H.P. Wynn, Design and analysis of computer experiments, Statist. Sci. 4
(1989), pp. 409–443.
[37] L. Shi, Y. Fu, R.J. Yang, B.P. Wang, and P. Zhu, Selection of initial designs for multi-objective optimization using
classification and regression tree, Struct. Multidiscip. Optim. 48(6) (2013), pp. 1057–1073.
[38] A. Sobester, S.J. Leary, and A.J. Keane, On the design of optimization strategies based on global response surface
approximation models, J. Global Optim. 33 (2005), pp. 31–59.
[39] N. Vuković and Z. Miljković, A growing and pruning sequential learning algorithm of hyper basis function neural
network for function approximation, Neural Netw. 46 (2013), pp. 210–226.
[40] G.G. Wang and S. Shan, Review of metamodeling techniques in support of engineering design optimization, J. Mech.
Des. 129 (2007), pp. 370–380.
[41] L.Q. Wang, S. Shan, and G.G. Wang, Mode-pursuing sampling method for global optimization on expensive black-
box functions, Eng. Optim. 36(4) (2004), pp. 419–438.
[42] I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann,
Burlington, MA, 2005.
[43] Z. Wu, Compactly supported positive definite radial functions, J. Adv. Comput. Math. 4(1) (1995), pp. 283–292.
[44] A.K. Yadav, H. Malik, and S.S. Chandel, Selection of most relevant input parameters using WEKA for artificial
neural network based solar radiation prediction models, Renew. Sustain. Energy Rev. 31 (2014), pp. 509–519.
[45] W. Yao, X.Q. Chen, Y.Y. Huang, and M. van Tooren, A surrogate-based optimization method with RBF neu-
ral network enhanced by linear interpolation and hybrid infill strategy, Optim. Methods Softw. 29(2) (2014),
pp. 406–429.
[46] X. Yao, Y. Liu, and G. Lin, Evolutionary programming made faster, IEEE Trans. Evol. Comput. 3(2) (1999),
pp. 82–102.
[47] A. Younis, R. Xu, and Z. Dong, Approximated unimodal region elimination based global optimization method for
engineering design, Proceedings of the ASME 2007 International Design Engineering Technical Conferences and
Computers and Information in Engineering Conference, IDET/CIE 2007, 4–7 September, Las Vegas, NV, New
York, ASME.
