63-70 (2006) 63
Abstract
In this paper, a method based on the particle swarm optimization (PSO) is proposed for pattern
classification to select a fuzzy classification system with an appropriate number of fuzzy rules so that
the number of incorrectly classified patterns is minimized. In the PSO-based method, each individual
in the population is considered to automatically generate a fuzzy classification system for an M-class
classification problem. Subsequently, a fitness function is defined to guide the search procedure to
select an appropriate fuzzy classification system such that the number of fuzzy rules and the number of
incorrectly classified patterns are simultaneously minimized. Finally, two classification problems are
utilized to illustrate the effectiveness of the proposed PSO-based approach.
Key Words: Particle Swarm Optimization, Fuzzy Classification System, Pattern Classification
… accuracy with respect to the skill of the user to determine the appropriate learning parameter. Besides, the fixed learning parameter of the above approach to fuzzy min-max hyperbox classifier makes the same constraint on coverage resolution in the whole input space so that some small hyperboxes are generated even in the input space far from class boundaries. This reduces the generalization capability of the fuzzy min-max hyperbox classifier.

In this paper, a method based on the particle swarm optimization (PSO), called a PSO-based method, is implemented to select an appropriate fuzzy classification system such that a low number of training patterns are misclassified by the selected fuzzy classification system. In the PSO-based approach, each individual in the population is considered to represent a fuzzy classification system. Then, a fitness function is implemented to guide the search procedure to select an appropriate fuzzy classification system such that the number of fuzzy rules and the number of incorrectly classified patterns are simultaneously minimized.

The rest of this paper is organized as follows. Section 2 describes the structure of the fuzzy classification system. Section 3 proposes a PSO-based method to select an …

… membership function of the fuzzy set $A_{ji}$ is described by

\[
\mu_{A_{ji}}\left(m_{(ji,1)}, m_{(ji,2)}, m_{(ji,3)}; x_i\right) =
\begin{cases}
\exp\left(-\left(\dfrac{x_i - m_{(ji,1)}}{m_{(ji,2)}}\right)^{2}\right), & \text{if } x_i \le m_{(ji,1)},\\[6pt]
\exp\left(-\left(\dfrac{x_i - m_{(ji,1)}}{m_{(ji,3)}}\right)^{2}\right), & \text{if } x_i > m_{(ji,1)},
\end{cases}
\tag{2}
\]

where $m_{(ji,1)}$, $m_{(ji,2)}$ and $m_{(ji,3)}$ determine the center position, the left and right width values of the membership function, respectively. Hence, the shape of the membership function is determined by a parameter vector $\mathbf{m}_{ji} = [m_{(ji,1)}\ m_{(ji,2)}\ m_{(ji,3)}]$. The $j$-th fuzzy rule in the rule base is determined by a parameter vector $\mathbf{r}_j = [\mathbf{m}_{j1}\ \mathbf{m}_{j2}\ \ldots\ \mathbf{m}_{jM}]$. Consequently, the set of parameters in the premise part of the rule base is defined as $\mathbf{r} = [\mathbf{r}_1\ \mathbf{r}_2\ \ldots\ \mathbf{r}_R]$. According to (1), the set of parameters in the consequent part of the rule base is defined as $\mathbf{a} = [H_1\ CF_1\ H_2\ CF_2\ \ldots\ H_R\ CF_R]$. When the input $\mathbf{x} = (x_1, x_2, \ldots, x_m)$ is given, the firing strength of the premise of the $j$-th rule is calculated by
… proposed by Kennedy and Eberhart [2,6]. Its development was based on observations of the social behavior of animals such as bird flocking, fish schooling, and swarm theory. Like the GA, the PSO is initialized with a population of random solutions. It also requires only the information about the fitness values of the individuals in the population. This differs from many optimization methods, which require derivative information or complete knowledge of the problem structure and parameters. Compared with the GA, the PSO has memory so that the information of good solutions is retained by all individuals. Furthermore, it has constructive cooperation between individuals: individuals in the population share information among them.

In the PSO-based method, each individual determines a fuzzy classification system. The individual is used to partition the input space so that the rule number and the premise part of the generated fuzzy classification system are determined. Subsequently, the consequent parameters of the corresponding fuzzy system are obtained from the premise fuzzy sets of the generated fuzzy classification system.

A set of $L$ individuals, $P$, called the population, is expressed in the following:

\[
P = \begin{bmatrix} \mathbf{p}_1 \\ \vdots \\ \mathbf{p}_L \end{bmatrix}
  = \begin{bmatrix} \mathbf{r}_1 & \mathbf{g}_1 \\ \vdots & \vdots \\ \mathbf{r}_L & \mathbf{g}_L \end{bmatrix},
\]

where the individual $\mathbf{p}_h$ contains two parameter vectors: $\mathbf{r}_h$ and $\mathbf{g}_h$. That is, $\mathbf{p}_h = [\mathbf{r}_h\ \mathbf{g}_h]$. The parameter vector $\mathbf{r}_h = [\mathbf{r}_1^h\ \mathbf{r}_2^h\ \ldots\ \mathbf{r}_j^h\ \ldots\ \mathbf{r}_B^h]$ consists of the premise parameters of the candidate fuzzy rules, where $B$ is a user-defined positive integer to decide the maximum number of fuzzy rules in the rule base generated by the individual $\mathbf{p}_h$. Here, $\mathbf{r}_j^h = [\mathbf{m}_{j1}^h\ \mathbf{m}_{j2}^h\ \ldots\ \mathbf{m}_{ji}^h\ \ldots\ \mathbf{m}_{jM}^h]$ is the parameter vector to determine the membership functions of the $j$-th fuzzy rule, where $\mathbf{m}_{ji}^h = [m^h_{(ji,1)}\ m^h_{(ji,2)}\ m^h_{(ji,3)}]$ is the parameter vector to determine the membership function for the $i$-th input variable. The parameter vector $\mathbf{g}_h = [g_1^h\ g_2^h\ \ldots\ g_j^h\ \ldots\ g_B^h]$ is used to select the fuzzy rules from the candidate rules $\mathbf{r}_h = [\mathbf{r}_1^h\ \mathbf{r}_2^h\ \ldots\ \mathbf{r}_j^h\ \ldots\ \mathbf{r}_B^h]$ so that the fuzzy rule base is generated. $g_j^h \in [0,1]$ decides whether the $j$-th candidate rule $\mathbf{r}_j^h$ is added to the rule base of the generated fuzzy system or not. If $g_j^h \ge 0.5$, then the $j$-th candidate rule $\mathbf{r}_j^h$ is added to the rule base. Consequently, the total number of $g_j^h$ ($j = 1, 2, \ldots, B$) whose value is greater than or equal to 0.5 is the number of fuzzy rules in the generated rule base. In order to generate the rule base, the index $j$ of $g_j^h$ ($j = 1, 2, \ldots, B$) whose value is greater than or equal to 0.5 is defined as $I_r^h \in \{1, 2, \ldots, B\}$, $r = 1, 2, \ldots, r_h$, where $r_h$ represents the number of the fuzzy rules in the generated rule base. $\{\mathbf{r}^h_{I_1^h}, \mathbf{r}^h_{I_2^h}, \ldots, \mathbf{r}^h_{I_r^h}, \ldots, \mathbf{r}^h_{I_{r_h}^h}\}$ generates the premise part of the fuzzy rule base generated by the individual $\mathbf{p}_h = [\mathbf{r}_h\ \mathbf{g}_h]$. For example, assume that $\mathbf{r}_h$ and $\mathbf{g}_h$ are denoted as $[\mathbf{r}_1\ \mathbf{r}_2\ \mathbf{r}_3\ \mathbf{r}_4\ \mathbf{r}_5\ \mathbf{r}_6]$ and $[0.12\ 0.72\ 0.82\ 0.35\ 0.29\ 0.81]$, respectively. According to $\mathbf{g}_h$, the generated rule base has three fuzzy rules and $\{I_1^h, I_2^h, I_3^h\} = \{2, 3, 6\}$ so that $\{\mathbf{r}_2\ \mathbf{r}_3\ \mathbf{r}_6\}$ determines the premise part of the generated rule base. Consequently, the rule base of the generated fuzzy classification system is described as follows:

$r$-th rule: If $x_1$ is $A^h_{I_r^h 1}$ and $x_2$ is $A^h_{I_r^h 2}$ and … and $x_m$ is $A^h_{I_r^h m}$, …

with the membership function

\[
\mu_{A^h_{I_r^h i}}\left(m^h_{(I_r^h i,1)}, m^h_{(I_r^h i,2)}, m^h_{(I_r^h i,3)}; x_i\right) =
\begin{cases}
\exp\left(-\left(\dfrac{x_i - m^h_{(I_r^h i,1)}}{m^h_{(I_r^h i,2)}}\right)^{2}\right), & \text{if } x_i \le m^h_{(I_r^h i,1)},\\[6pt]
\exp\left(-\left(\dfrac{x_i - m^h_{(I_r^h i,1)}}{m^h_{(I_r^h i,3)}}\right)^{2}\right), & \text{if } x_i > m^h_{(I_r^h i,1)}.
\end{cases}
\tag{6}
\]

Consequently, the individual $\mathbf{p}_h$ determines the premise part of the generated fuzzy classification system. Assume that $N$ training patterns $(\mathbf{x}_n, y_n)$, $n = 1, 2, \ldots, N$, are gathered from the observation of the considered $M$-class classification problem, where $\mathbf{x}_n = (x_{n1}, x_{n2}, \ldots, x_{nm})$ is the input vector of the $n$-th training pattern
66 Chia-Chong Chen
and $y_n \in \{1, 2, \ldots, M\}$ is the corresponding class output. In order to determine the consequent parameters $H_r$ and $CF_r$ of the $r$-th fuzzy rule, a procedure is described as follows [3]:

Step 1. Calculate $q_t$, $t = 1, 2, \ldots, M$, for the $r$-th fuzzy rule as follows:

\[
q_t = \sum_{\mathbf{x}_p \in \text{Class } t} q_r(\mathbf{x}_p), \quad t = 1, 2, \ldots, M. \tag{7}
\]

Step 2. Determine $H_r$ for the $r$-th fuzzy rule by

\[
H_r = \arg\max_{1 \le t \le M} q_t. \tag{8}
\]

Step 3. Determine the grade of certainty $CF_r$ of the $r$-th fuzzy rule by

\[
CF_r = \frac{q_{H_r} - q}{\sum_{t=1}^{M} q_t}, \tag{9}
\]

where

\[
q = \sum_{\substack{t=1 \\ t \ne H_r}}^{M} \frac{q_t}{M - 1}. \tag{10}
\]

Thus, the consequent parameters of the generated fuzzy classification system are determined by the above procedure. According to the above description, each individual corresponds to a fuzzy classification system. In order to construct a fuzzy classification system which has an appropriate number of fuzzy rules and minimizes the number of incorrectly classified patterns simultaneously, the fitness function is defined as follows:

\[
f_h = \mathrm{fit}(\mathbf{p}_h) = g_1(\mathbf{p}_h)\, g_2(\mathbf{p}_h), \tag{11}
\]

where $f_h$ is the fitness value of the individual $\mathbf{p}_h$, and $g_1(\mathbf{p}_h)$ and $g_2(\mathbf{p}_h)$ are defined respectively as follows:

\[
g_1(\mathbf{p}_h) = \exp\left(-\frac{NICP(\mathbf{p}_h)}{s_e}\right) \tag{12}
\]

and

\[
g_2(\mathbf{p}_h) = \exp\left(-\frac{r_h}{s_r}\right). \tag{13}
\]

Here, $NICP(\mathbf{p}_h)$ is the number of incorrectly classified patterns, $r_h$ is the number of fuzzy rules in the rule base of the generated fuzzy classification system, and $s_e$ and $s_r$ are user-defined constants for the fitness function. Consequently, the fitness function is designed to deal with the tradeoff between the number of incorrectly classified patterns and the number of fuzzy rules. The larger the fitness value of an individual, the better the corresponding fuzzy classification system satisfies the desired objective. That is, the selected fuzzy system has a low number of rules and a low number of incorrectly classified patterns simultaneously. Subsequently, a PSO-based method is proposed to find an appropriate individual so that the corresponding fuzzy classification system has the desired performance. The procedure is described as follows:

Step 1. Initialize the PSO-based method.

(a) Set the number of individuals ($L$), the maximum number of rules ($B$), the number of generations ($K$), the constants for the fitness function ($s_e$ and $s_r$) and the constants for the PSO algorithm ($y$, $d_1$, $d_2$, $c_1$, and $c_2$).

(b) Randomly generate the initial population $P$. Each individual of the population is expressed as follows:

\[
\mathbf{p}_h = [\mathbf{r}_h\ \mathbf{g}_h], \tag{14}
\]

where $\mathbf{r}_h = [m^h_{(11,1)}\ m^h_{(11,2)}\ m^h_{(11,3)}\ \ldots\ m^h_{(1m,1)}\ m^h_{(1m,2)}\ m^h_{(1m,3)}\ \ldots\ m^h_{(B1,1)}\ m^h_{(B1,2)}\ m^h_{(B1,3)}\ \ldots\ m^h_{(Bm,1)}\ m^h_{(Bm,2)}\ m^h_{(Bm,3)}]$ and $\mathbf{g}_h = [g_1\ g_2\ \ldots\ g_B]$. $m^h_{(ji,k)}$, $j \in \{1, 2, \ldots, B\}$, $i \in \{1, 2, \ldots, M\}$, $k \in \{1, 2, 3\}$, is randomly generated as follows:

\[
m^h_{(ji,k)} = m^{\min}_{(ji,k)} + \left(m^{\max}_{(ji,k)} - m^{\min}_{(ji,k)}\right) \cdot \mathrm{rand}(), \tag{15}
\]

where the range of the parameter $m^h_{(ji,k)}$ is defined as $[m^{\min}_{(ji,k)}, m^{\max}_{(ji,k)}]$ and $\mathrm{rand}()$ is a uniformly distributed random number in $[0,1]$. $g_j^h \in [0,1]$, $j \in \{1, 2, \ldots, B\}$, is randomly generated as follows:

\[
g_j^h = \mathrm{rand}(). \tag{16}
\]

(c) Randomly generate the initial velocity vectors $\mathbf{v}_h$, $h = 1, 2, \ldots, L$. Each velocity vector is expressed as follows:
Design of PSO-based Fuzzy Classification Systems 67
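where $\mathbf{a}_h = [a^h_{(11,1)}\ a^h_{(11,2)}\ a^h_{(11,3)}\ \ldots\ a^h_{(1m,1)}\ a^h_{(1m,2)}\ a^h_{(1m,3)}\ \ldots\ a^h_{(B1,1)}\ a^h_{(B1,2)}\ a^h_{(B1,3)}\ \ldots\ a^h_{(Bm,1)}\ a^h_{(Bm,2)}\ a^h_{(Bm,3)}]$ and $\mathbf{b}_h = [b_1^h\ b_2^h\ \ldots\ b_B^h]$. $a^h_{(ji,k)}$, $j \in \{1, 2, \ldots, B\}$, $i \in \{1, 2, \ldots, M\}$, $k \in \{1, 2, 3\}$, is randomly generated as follows:

\[
a^h_{(ji,k)} = \left(m^{\max}_{(ji,k)} - m^{\min}_{(ji,k)}\right) \cdot \mathrm{rand}(). \tag{18}
\]

Step 6. Decrease $\mathbf{v}_h$ and $y$ by the constants $d_1 \in [0,1]$ and $d_2 \in [0,1]$, respectively.

\[
\mathbf{v}_h = \mathbf{v}_h \cdot d_1, \quad h = 1, 2, \ldots, L, \tag{22}
\]

\[
y = y \cdot d_2. \tag{23}
\]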
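The following Python sketch illustrates, under stated assumptions, several building blocks described in this section: the two-sided membership function of (2), the selection of candidate rules by $g_j^h \ge 0.5$, the consequent procedure (7)-(10), the fitness function (11)-(13) and the decay of Step 6 in (22)-(23). The function names are hypothetical, the firing strengths are taken as given inputs, and returning a certainty of zero for a rule that fires on no training pattern is an assumption that the text does not cover. It is a minimal sketch, not the implementation used for the reported results.

```python
import math


def membership(x, center, left_width, right_width):
    """Two-sided Gaussian membership grade of Eq. (2)."""
    width = left_width if x <= center else right_width
    return math.exp(-((x - center) / width) ** 2)


def select_rules(g, threshold=0.5):
    """1-based indices j of the candidate rules with g_j >= threshold."""
    return [j for j, g_j in enumerate(g, start=1) if g_j >= threshold]


def consequent(firing, labels, num_classes):
    """Class H_r and certainty CF_r of one rule, Eqs. (7)-(10).

    firing[p] is the rule's firing strength for training pattern p and
    labels[p] is that pattern's class in 1..num_classes (num_classes >= 2).
    """
    q = [0.0] * num_classes
    for strength, label in zip(firing, labels):
        q[label - 1] += strength                          # Eq. (7)
    total = sum(q)
    best = max(range(num_classes), key=lambda t: q[t])    # Eq. (8)
    if total == 0.0:
        return best + 1, 0.0  # assumption: the rule never fires, so CF = 0
    q_other = (total - q[best]) / (num_classes - 1)       # Eq. (10)
    return best + 1, (q[best] - q_other) / total          # Eq. (9)


def fitness(nicp, num_rules, s_e, s_r):
    """Eqs. (11)-(13): exp(-NICP / s_e) * exp(-r_h / s_r)."""
    return math.exp(-nicp / s_e) * math.exp(-num_rules / s_r)


def decay(velocity, y, d1, d2):
    """Step 6, Eqs. (22)-(23): scale the velocity vector by d1 and y by d2."""
    return [v * d1 for v in velocity], y * d2


if __name__ == "__main__":
    # The selection example from the text: three rules are kept.
    print(select_rules([0.12, 0.72, 0.82, 0.35, 0.29, 0.81]))  # [2, 3, 6]
    print(fitness(nicp=0, num_rules=4, s_e=5, s_r=10))
```

With the constants $\{s_e, s_r\} = \{5, 10\}$, a system with four rules and no incorrectly classified pattern has fitness $\exp(-0.4) \approx 0.67$, so every additional rule or misclassified pattern lowers the fitness multiplicatively.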
… [12] is utilized to examine the learning ability of the proposed PSO-based fuzzy classification system. The total number of patterns in this data set is 579. Figure 1 shows the data set with a mixture of spherical and ellipsoidal clusters. Following the proposed method, simulation results of the proposed approach to classifying this data set are shown in Figure 2, where the initial conditions for the proposed method in Example 1 are given in the following: The number of individuals: $L = 100$, the maximum number of rules: $B = 20$, the number of generations: $K = 50$, the range of $m^h_{(j1,1)}$, $j \in \{1, 2, \ldots, 20\}$: [0, 1], the range of $m^h_{(j1,2)}$, $j \in \{1, 2, \ldots, 20\}$: [0.05, 0.5], the range of $m^h_{(j1,3)}$, $j \in \{1, 2, \ldots, 20\}$: [0.05, 0.5], the range of $m^h_{(j2,1)}$, $j \in \{1, 2, \ldots, 20\}$: [0, 1], the range of $m^h_{(j2,2)}$, $j \in \{1, 2, \ldots, 20\}$: [0.05, 0.5], the range of $m^h_{(j2,3)}$, $j \in \{1, 2, \ldots, 20\}$: [0.05, 0.5], the constants for the fitness function: $\{s_e, s_r\} = \{5, 10\}$ and the constants for the PSO: $\{y, c_1, c_2, d_1, d_2\} = \{1, 1, 1, 0.75, 0.75\}$. The rule base of the selected fuzzy classification system has four rules such that the number of incorrectly classified patterns is zero. The parameters of the selected fuzzy classification system are shown in Table 1. From simulation results of this example, it is clear that the proposed PSO-based method can select significant fuzzy rules to construct a fuzzy classification system such that the number of incorrectly classified patterns is minimized.

Table 1. Parameters of the selected fuzzy classification system by the proposed PSO-based method in Example 1

j   m(j1,1)   m(j1,2)   m(j1,3)   m(j2,1)   m(j2,2)   m(j2,3)   Hj   CFj
1   0.7361    0.2883    0.4669    0.5187    0.2625    0.366     1    0.6762
2   0.1958    0.05      0.4141    0.8314    0.209     0.3121    3    0.9637
3   0         0.2181    0.2570    0.8126    0.455     0.5       3    0.9153
4   0.0154    0.3087    0.4406    0.1336    0.3009    0.3604    2    0.9343

Figure 1. A synthetic data set with a mixture of spherical and ellipsoidal clusters in Example 1.

Figure 2. Simulation results of the proposed PSO-based method for the synthetic data set in Example 1.

Example 2. Iris data set

The Iris data set [1] contains 150 patterns with four features that belong to three classes (Iris Setosa, Iris Versicolour and Iris Virginica). The four features are the sepal length in cm, the sepal width in cm, the petal length in cm and the petal width in cm. The data set contains three classes, each of 50 patterns; each class refers to a type of iris plant. One class is linearly separable from the other two; the latter are not linearly separable from each other. In this example, the two-fold cross validation (2CV) is employed to examine the generalization ability of the proposed approach to classifying the Iris data. In the 2CV procedure, the Iris data are separated into two subsets of the same size. That is, each subset consists of
75 patterns. One subset is used as training patterns to construct a fuzzy classification system by the proposed approach. The other subset is used as test patterns to evaluate the generated fuzzy classification system. The same training-and-testing procedure is also followed after the roles of the subsets are exchanged with each other. The above procedure is iterated 20 times using different partitions of the 150 patterns into two subsets for calculating the average classification rates on these data. The initial conditions for the proposed method in Example 2 are given in the following: The number of individuals: $L = 100$, the maximum number of rules: $B = 100$, the number of generations: $K = 50$, the range of $m^h_{(j1,1)}$, $j \in \{1, 2, \ldots, 20\}$: [4.4, 7.7], the range of $m^h_{(j1,2)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 1.65], the range of $m^h_{(j1,3)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 1.65], the range of $m^h_{(j2,1)}$, $j \in \{1, 2, \ldots, 20\}$: [2, 4.4], the range of $m^h_{(j2,2)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 1.2], the range of $m^h_{(j2,3)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 1.2], the range of $m^h_{(j3,1)}$, $j \in \{1, 2, \ldots, 20\}$: [1.2, 6.7], the range of $m^h_{(j3,2)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 2.75], the range of $m^h_{(j3,3)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 2.75], the range of $m^h_{(j4,1)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 2.5], the range of $m^h_{(j4,2)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 1.2], the range of $m^h_{(j4,3)}$, $j \in \{1, 2, \ldots, 20\}$: [0.1, 1.2], the constants for the fitness function: $\{s_e, s_r\} = \{5, 10\}$ and the constants for the PSO: $\{y, c_1, c_2, d_1, d_2\} = \{1, 1, 1, 0.75, 0.75\}$. The average number of fuzzy rules of the selected fuzzy classification systems and the average classification rates on training patterns and test patterns are summarized in Table 2. Table 3 compares our results with the results in [7] for the classification of the Iris data using the same 2CV method. From simulation results for the Iris data set, it is obvious that the proposed PSO-based fuzzy classification system has high generalization ability for the classification problem of the Iris data set.

Table 2. Simulation results of the proposed PSO-based method for the Iris data set using 2CV method in Example 2

Average classification rate on training patterns   Average classification rate on test patterns   Average number of fuzzy rules
98.87%                                             96.8%                                          4.725

Table 3. Generalization ability for test patterns in the Iris data classification using the same 2CV method

Method             Average classification rate on test patterns   Average number of fuzzy rules
Pruning            93.30%                                         28.00
Multi-rule-table   94.30%                                         597.75
GA-based           90.67%                                         10.10
Proposed method    96.80%                                         4.725

5. Conclusions

A PSO-based method is proposed to design an appropriate fuzzy classification system for pattern classification. In the proposed approach, each individual $\mathbf{p}_h$ consists of two parameter vectors: $\mathbf{r}_h$ and $\mathbf{g}_h$. The parameter vector $\mathbf{g}_h$ is updated so that the generated fuzzy classification system has an appropriate number of fuzzy rules. The parameter vector $\mathbf{r}_h$ is updated so that the premise part of the generated fuzzy classification system has appropriate membership functions for the considered classification problem. Then, the obtained premise fuzzy sets are used to determine the consequent parameters of the corresponding fuzzy classification system. Consequently, each individual corresponds to a fuzzy classification system. Subsequently, a fitness function is defined to guide the searching procedure to select a fuzzy classification system with the desired performance. The simulation results show that the selected fuzzy classification system not only has an appropriate number of rules for the considered classification problem but also has a low number of incorrectly classified patterns.

Acknowledgement

This research was supported in part by the National Science Council of the Republic of China under contract NSC 93-2218-274-001.

References

[1] Blake, C., Keogh, E. and Merz, C. J., UCI Repository of Machine Learning Database, Univ. California, Irvine (1998). http://www.ics.uci.edu/~mlearn/
[2] Eberhart, R. and Kennedy, J., “A New Optimizer Using Particle Swarm Theory,” Proc. Int. Sym. Micro Machine and Human Science, Nagoya Japan, Oct., pp. 39-43 (1995).
[3] Ishibuchi, H., Nozaki, K., Yamamoto, N. and Tanaka, H., “Selecting Fuzzy If-Then Rules for Classification Problems Using Genetic Algorithms,” IEEE Trans. Fuzzy Systems, Vol. 3, pp. 260-270 (1995).
[4] Jang, J. S., “ANFIS: Adaptive-Network-Based Fuzzy Inference systems,” IEEE Trans. on Systems, Man and Cybernetics, Vol. 23, pp. 665-685 (1993).
[5] Jang, J. S., Sun, C. T. and Mizutani, E., Neuro-Fuzzy and Soft Computing, Prentice Hall, New Jersey, U.S.A. (1997).
[6] Kennedy, J. and Eberhart, R., “Particle Swarm Optimization,” Proc. IEEE Int. Conf. Neural Networks, Perth, Australia, pp. 1942-1948 (1995).
[7] Nozaki, K., Ishibuchi, H. and Tanaka, H., “Adaptive Fuzzy Rule-Based Classification Systems,” IEEE Trans. on Fuzzy Systems, Vol. 4, No. 3, Aug., pp. 238-250 (1996).
[8] Simpson, P. K., “Fuzzy Min-Max Neural Networks-Part 1: Classification,” IEEE Trans. Neural Networks, Vol. 3, Sep., pp. 776-786 (1992).
[9] Wang, L. X. and Mendel, J. M., “Generating Fuzzy Rules by Learning from Examples,” IEEE Trans. on Systems, Man and Cybernetics, Vol. 22, pp. 1414-1427 (1992).
[10] Wong, C. C. and Chen, C. C., “A Hybrid Clustering and Gradient Descent Approach for Fuzzy Modeling,” IEEE Trans. on Systems, Man and Cybernetics-Part B: Cybernetics, Vol. 29, pp. 686-693 (1999).
[11] Wong, C. C. and Chen, C. C., “A GA-Based Method for Constructing Fuzzy Systems Directly from Numerical Data,” IEEE Trans. on Systems, Man and Cybernetics-Part B: Cybernetics, Vol. 30, pp. 904-911 (2000).
[12] Wong, C. C., Chen, C. C. and Su, M. C., “A Novel Algorithm for Data Clustering,” Pattern Recognition, Vol. 34, pp. 425-442 (2001).
[13] Yager, R. R. and Filev, D. P., Essentials of Fuzzy Modeling and Control, John Wiley, New York, U.S.A. (1994).

Manuscript Received: Apr. 27, 2005
Accepted: Jul. 13, 2005