Abstract
“Fuzzy Functions” are proposed to be determined by the least squares estimation (LSE) technique for the development of fuzzy system models.
These functions, “Fuzzy Functions with LSE”, are proposed as alternative representation and reasoning schemas to the fuzzy rule base approaches.
They can be more easily obtained and implemented by those who do not have an in-depth knowledge of fuzzy theory.
A working knowledge of a fuzzy clustering algorithm such as FCM or its variations would be sufficient to obtain membership values of input vectors.
The membership values, together with scalar input variables, are then used by the LSE technique to determine a “Fuzzy Function” for each cluster identified by FCM.
These functions are different from “Fuzzy Rule Base” approaches as well as “Fuzzy Regression” approaches.
Various transformations of the membership values are included as new variables in addition to the original selected scalar input variables; at times, a logistic transformation of the non-scalar original selected input variables may also be included as a new variable.
A comparison of “Fuzzy Functions-LSE” with the “Ordinary Least Squares Estimation” (OLSE) approach shows that “Fuzzy Functions-LSE” provides results that are better by 10% or more with respect to the RMSE measure for both the training and test data sets.
© 2008 Elsevier B.V. All rights reserved.
Keywords: Fuzzy functions; Rule bases; Membership values; Transformations; Input–output variables; Scalar and non-scalar; Reasoning; Least squares; Logistic
the linguistic “AND” and “OR” operators cannot be represented in a one-to-one correspondence with a t-norm and a t-conorm, respectively, as shown by Türkşen [14,15], then the FDCF and FCCF, the Fuzzy Disjunctive and Conjunctive Canonical Forms, are to be used for the representation of rules and for reasoning with them. However, such models fall into Interval-Valued Type 2 fuzzy system analyses, which are not dealt with in this paper. Finally, one has to carry out defuzzification computations in all fuzzy rule base models.

Furthermore, the above structure assumes non-interactivity between input variables [3]. In fact, this is the underlying assumption when the fuzzy subsets for the left- and right-hand sides are obtained from experts by interview techniques. In order to eliminate the non-interactivity assumption, Delgado et al. [16], Babuska and Verbruggen [17], and Uncu and Türkşen [18] used multi-dimensional Type 1 fuzzy subsets to represent the antecedent part of the rules. In such investigations, generally a multi-dimensional fuzzy clustering technique, e.g., FCM, is implemented to obtain multi-dimensional fuzzy subsets that capture the interactivity (or joint effect) of input variables. Hence, the Z-FRB structure can be expressed as follows:

$$R:\ \mathop{\mathrm{ALSO}}_{i=1}^{c}\ \big(\text{IF } x \in X \text{ isr } A_i \text{ THEN } y \in Y \text{ isr } B_i\big) \qquad (4)$$

where the multi-dimensional antecedent fuzzy subset of the ith rule is $A_i$. This multi-dimensional antecedent fuzzy subset determination eliminates the search for the appropriate t-norm for the combination of antecedent fuzzy subsets with “AND”. Thus, the degree of firing, say, for the ith rule, is determined directly from the corresponding ith multi-dimensional antecedent fuzzy subset $A_i$ and applied to the consequent fuzzy subset with the selection of an appropriate implication operator, “IMP”. This, however, again requires a search for and a selection of a t-norm and t-conorm based “IMP” operator. In particular, the Sugeno-Yasukawa structure, SY-FRB, would thus be expressed as Eq. (4) above, and the Takagi-Sugeno Fuzzy Rule Base structure, TS-FRB, would be expressed with Eq. (5) as follows:

$$R:\ \mathop{\mathrm{ALSO}}_{i=1}^{c}\ \big(\text{IF } \text{antecedent}_i \text{ THEN } y_i = a_i x^T + b_i\big) \qquad (5)$$

where $\text{antecedent}_i = x \in X \text{ isr } A_i$, $a_i = (a_{i,1}, \ldots, a_{i,nv})$ is the regression coefficient vector associated with the ith rule in Eq. (5), and the $b_i$ are the scalars associated with the ith rule of Eq. (5). For these special cases of Z-FRB, again each degree of firing, $d_i$, associated with the ith rule, is determined directly from the corresponding ith multi-dimensional antecedent fuzzy subset $A_i$ and applied to the consequent fuzzy subset for the SY-FRB or to the classical ordinary regression for the case of TS-FRB.

3. Fuzzy functions

A conceptual origin of our proposed FF-LSE may be found in Demirci [19,20] and in Demirci and Recasens [21], who discuss the general properties of “Fuzzy Functions” from the perspective of mathematical theory. In particular, Demirci suggests (2003) an application possibility as follows: an input/output system S can be comprehended as an input universe of discourse X and an output universe of discourse Y [20]. In general, X and Y stand for the set of all possible values of the input variable x and the set of all possible values of the output variable y, respectively. The relation between x and y plays a crucial role in the analysis of the system S. It is usually desirable to find a functional relation between x and y, which is conceived, in general, as an ordinary function f: X → Y. Let us consider an ordinary function f: X → Y proposed for the functional dependency of y on x. To investigate the appropriateness of f for the functional dependency of y on x, for a certain state of x and for the corresponding state of y, let us measure the values of x and y. From a theoretical point of view, for the measured values x-measured of x and y-measured of y, it is expected that the value y-measured should coincide with the theoretically expected value f(x-measured) of y. In practice, however, these two values are usually not equal, although they can be very close or very similar to each other. The classical assumption, asserting the functional dependency of y on x as an ordinary function f: X → Y, does not tell us anything about how the measured value x-measured of x relates to the measured value y-measured of y, or how the measured value y-measured of y and the hypothetically claimed value f(x-measured) of y relate to each other. Demirci [20] states that the main difficulty behind these two problems in the classical approach results from the assumption that each possible value of x is related to a unique possible value of y, and that both the indistinguishability of the input values and the indistinguishability of the output values are always omitted mathematically. Instead of the classical assumption that accepts the functional dependency of y on x as an ordinary function f: X → Y, Demirci [20] proposes a vaguely defined function from X to Y for the description of the functional dependency of y on x to solve these problems.

In other words, taking the M-equivalence relations E on X and F on Y into account, Demirci [20] suggests that a strong fuzzy function r in L[X × Y] from X to Y w.r.t. E and F can be taken as the mathematical representation of the functional dependency of y on x, where the M-equivalence relation on X is called an M-equivalent similarity relation [22] such that M = (L, ≤, *) denotes an integral, commutative cqm-lattice with L = [0,1] and * a t-norm. For the proposed ordinary function f: X → Y to represent the functional dependency of y on x, f can be thought of as a hypothetical or an ideal description of the functional dependency of y on x. For M-equivalence relations E on X and F on Y, a strong fuzzy function r in L[X × Y] from X to Y w.r.t. E and F with the property f in ORD(r) can also be conceived as a realistic and comprehensive description of the functional dependency of y on x, one which does not ignore the indistinguishability of input values and the indistinguishability of output values. For each x in X and y in Y, the element r(x, y) of L can be interpreted as the degree of truth of the statement “y takes the value y for a given value x of x,” where the top element 1 (the bottom element 0) of L denotes the completely true (false) case of this statement. For the sake of simplicity, if we denote the output variable y by y(x) whenever x takes the value x in X, then, for each x in X and y in Y, r(x, y) is nothing but the degree of truth of “y(x) = y” [20].
I.B. Türkşen / Applied Soft Computing 8 (2008) 1178–1188 1181
It is to be noted that while such theoretical analyses give us a base to start the formation of “Fuzzy Functions”, they do not specify how we are to obtain such “Fuzzy Functions”. In this regard, our proposed approach, to be discussed next, provides a novel way in which such fuzzy functions can be determined in practical engineering applications.

3.1. Proposed FF-LSE

The proposed FF-LSEs are structurally different from the Z-FRB, SY-FRB, TS-FRB, and “Fuzzy Regression” models of Tanaka et al. [7] and its variations, which are explained above, and from the Hathaway and Bezdek [13] model, because the proposed approach introduces membership values and their transformations, as well as a logistic transformation of non-scalar original input variables, as new input variables in addition to the original scalar input variables for “Fuzzy Function” estimation with LSE. With this new set of (augmented) input variables, one executes a fuzzy clustering algorithm such as FCM and first determines the (local) optimum number of fuzzy clusters and hence the associated membership values, and then identifies a fuzzy function to represent each fuzzy cluster separately. Thus there are as many fuzzy functions as there are fuzzy clusters, similar to the Hathaway and Bezdek [13] model, but they include membership functions as input arguments, and thus there is no need to use the cluster information for switching purposes. These fuzzy functions are estimated by the least squares method in this paper. Therefore, it is structurally a new and unique approach for the determination of fuzzy functions instead of fuzzy rule bases. They represent fuzzy rule bases indirectly. It is to be noted, for the sake of emphasis, that the coefficients of the inputs, whether they be membership values or their transformations or original input variables, are not fuzzy sets in our proposed approach. Instead, membership values and their transformations enter into an augmented input set as new and additional variables. In our experience, this approach is most suitable for those analysts who are familiar with a function estimation technology, e.g., the least squares technology. They only need to develop an understanding of fuzzy clustering algorithms without studying many aspects of fuzzy theory. That is, they generally do not need to know or to develop an in-depth understanding of essential concepts for the development and use of fuzzy rule bases, such as the estimation of membership functions, the selection of t-norms and co-norms for the combination of the left-hand side membership functions, and the selection of an implication operator to determine the effect of the left-hand side on the right-hand side membership function. Furthermore, they do not need to study fuzzy logic concepts such as GMP (Generalized Modus Ponens), fuzzification, de-fuzzification, and the various alternatives associated with all these essential concepts. Generally such concepts are new to most mathematicians and statisticians in industry who are not working within fuzzy theory, nor would they have time to study it. Nor would such researchers have to develop an understanding of the essential concepts of fuzzy numbers, their centers and widths, and how to use them in least squares estimation methods as required in the current fuzzy regression methodologies.

All they have to understand is the notion of membership values and how they can be obtained from a fuzzy clustering algorithm such as FCM, in addition to their usual background knowledge of a function estimation technique, e.g., LSE.

Thus we propose a novel approach in order to provide an easy entry into fuzzy system modeling for mathematicians and statisticians who are working in industry and for other novices. For this purpose, we first review the basic structure of the well-known “Least Squares” method of function estimation and then present our generalization of it, which includes membership values and their transformations, as well as a logistic transformation of non-scalar input variables, as new input variables in addition to the original scalar input variables. It is to be noted that, with a logistic analysis of non-scalar input variables, we obtain the frequency information with which non-scalar variables occur in the system behavior description. This valuable information is extracted via a logistic analysis because non-scalar variables cannot be used directly in FCM. It should be recalled that FCM requires that all its variables be scalar. In real life databases, there are generally many non-scalar variables, i.e., nominal, binary and ordinal variables, and they contain valuable information that affects output variables based on some performance measure.

3.2. OLSE method

In the OLSE method, the dependent variable, y, is assumed to be a linear function of one or more independent input variables, x, plus an error component as follows:

$$y = b_0 + b_1 x_1 + \ldots + b_{nv} x_{nv} + e \qquad (9)$$

where y is the dependent output, the $x_j$ are the input or explanatory variables for $j = 1, \ldots, nv$, nv is the number of selected inputs, and e is the independent error term, which is typically assumed to be normally distributed. The goal of the least squares method is to obtain estimates of the unknown parameters $b_j$, $j = 0, 1, \ldots, nv$, which indicate how a change in one of the independent variables affects the dependent variable. The usual method is known as “Ordinary Least Squares”, OLS. It should be emphasized that only one function is estimated for the whole training data set. In OLS exercises, generally, there is no prior clustering, in particular no fuzzy clustering, determined ahead of time. Only one function is fitted to the overall training data.

In matrix notation, the general linear model is expressed as

$$Y = Xb + e$$

where Y is an [nd, 1] vector of response values, X is an [nd, nv + 1] matrix of known constants which are inputs, nd represents the number of input–output vectors in the training data set, nv is the number of selected input variables, b is an [(nv + 1), 1] vector
of parameters and e is an [nd, 1] vector of errors such that:

$$e^T_{nd,1} = [e_1, e_2, \ldots, e_{nd}]$$

$$X_{nd,nv+1} = [1, x_{k,j} \mid k = 1, 2, \ldots, nd;\ j = 1, 2, \ldots, nv]$$

The objective is to minimize the total residual errors for the estimation of the parameters of the model, i.e.

$$\min Q = \sum_{k=1}^{nd} \big(y_k - (b_0 + b_1 x_{k1} + \ldots + b_{nv} x_{k\,nv})\big)^2$$

In matrix notation, we re-write it and then take the partial derivatives with respect to the b's:

$$\min Q = (y - Xb)^T (y - Xb),$$

$$\partial/\partial b\,\big[(y - Xb)^T (y - Xb)\big] = 0,$$

$$2 (X^T X)\, b = 2 X^T y,$$

$$b = (X^T X)^{-1} X^T y,$$

provided that $X^T X$ is not singular.

centers for m = m* and c = 1, …, c* as

From this, we identify the cluster centers of the “input space”, again for m = m* and c = 1, …, c*, as

$$v_{X,j} = (x^c_{1,j}, x^c_{2,j}, \ldots, x^c_{nv,j})_m$$

Next, one computes the normalized membership values of each vector of observations in the training data set with the use of the cluster center values determined in the previous step. There are generally two steps in these calculations.

First we determine the (local) optimum membership values $u_{ik}$ and then determine the $\mu_{ik}$ that are above an α-cut in order to eliminate harmonics generated by FCM as

$$u_{ik} = \left( \sum_{j=1}^{c} \left( \frac{\lVert x_k - v_{X,i} \rVert}{\lVert x_k - v_{X,j} \rVert} \right)^{\frac{2}{m-1}} \right)^{-1}, \qquad \mu_{ik} \ge \alpha \qquad (10)$$
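The membership computation of Eq. (10) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `fcm_memberships`, the Euclidean norm, the small distance floor, and the choice to set memberships below the α-cut to zero are all assumptions made for the example. The returned array is indexed as [observation k, cluster i].

```python
import numpy as np

def fcm_memberships(x, centers, m=2.0, alpha=0.0):
    """Memberships of each observation in each cluster, following Eq. (10).

    x:       (nd, nv) array of input vectors
    centers: (c, nv) array of input-space cluster centers
    m:       degree of fuzziness (must be > 1)
    alpha:   alpha-cut threshold (hypothetical handling: values below it become 0)
    Returns an (nd, c) array u with u[k, i] = membership of x_k in cluster i.
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Euclidean distance from every observation to every center: shape (nd, c).
    dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
    # Assumption: floor the distance so an observation sitting on a center
    # does not cause a division by zero.
    dist = np.maximum(dist, 1e-12)
    # ratio[k, i, j] = ||x_k - v_i|| / ||x_k - v_j||
    ratio = dist[:, :, None] / dist[:, None, :]
    u = 1.0 / np.sum(ratio ** (2.0 / (m - 1.0)), axis=2)
    # Keep only the memberships that reach the alpha-cut.
    return np.where(u >= alpha, u, 0.0)

# Three observations, two cluster centers, m = 1.5 as in the case study.
observations = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
cluster_centers = [[0.0, 0.0], [1.0, 1.0]]
memberships = fcm_memberships(observations, cluster_centers, m=1.5, alpha=0.1)
```

Before the α-cut is applied, each row of the result sums to one, which is the usual FCM constraint on memberships.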
single variable, $X_j = (x_{jk} \mid k = 1, \ldots, nd)$ for the jth input variable:

$$X'_{ij} = [1, \Gamma_i, X_{ij}] = \begin{bmatrix} 1 & \gamma_{i1} & x_{ij1} \\ \vdots & \vdots & \vdots \\ 1 & \gamma_{i\,nd} & x_{ij\,nd} \end{bmatrix}$$

Thus the function $Y_i = b_{i0} + b_{i1}\Gamma_i + b_{i2}X_{ij}$, which represents the ith rule corresponding to the ith interactive (joint) cluster in $(Y_i, \Gamma_i, X_j)$ space, would be estimated as follows:

$$b_i = (X'^{T}_{ij} X'_{ij})^{-1} (X'^{T}_{ij} Y_i),$$

where $X'_{ij} = [1, \Gamma_i, X_{ij}]$, provided the inverse exists, such that $b_i = (b_{i0}, b_{i1}, b_{i2})$, and the estimate of $Y_i$ would be obtained as $\hat{Y}_i = b_{i0} + b_{i1}\Gamma_i + b_{i2}X_{ij}$. Within the proposed framework, the general form of the shape of a cluster for the case of a single input variable $X_j$ and for the ith cluster can be conceptually captured by a second order (cone) function when one introduces the square of the membership values into the augmented input matrix in the space of U × X × Y, which can be illustrated with the prototype shown in Fig. 1.

Fig. 1. A fuzzy cluster in U × X × Y space.

In a number of real life case studies, we have in fact found that generally some second order or exponential functions give a good approximation from amongst the following 20 alternatives we have experimented with:

Model 1: $X' = (1, \Gamma, X)$
Model 2: $X' = (1, \Gamma, \Gamma^2, X)$
Model 3: $X' = (1, \Gamma^m, X)$
Model 4: $X' = (1, e^{\Gamma}, X)$
⋮
Model 20: $X' = (1, \Gamma, e^{\Gamma^m}, \Gamma^m, X)$

4. Real life case study applications

The proposed FF-LSEs were developed for and applied to two real life case studies. We briefly review these two case studies here: (1) the desulfurization process of steel for a steel company, and (2) income prediction of customers for a bank. The results of the fuzzy functions developed in these investigations are compared with the “Ordinary Least Squares Estimation”, OLSE, functions on the basis of the R-square measure of performance. It is shown that the predictions made by “Fuzzy Functions with LSE” are better than the predictions made by OLSE by 10% or more with respect to R-square.

4.1. Desulfurization case study

A desulfurization facility in the large-scale steel industry is used to remove the sulfur from hot molten steel coming from blast furnaces before it is sent to the next operation in the processing sequence. Desulfurization is carried out by the injection of two different powdered reagents, Reagent1 and Reagent2, directly into the hot molten steel by means of a lance. The reagents react with the sulfur in the hot metal. The sulfur-rich slag is then separated from the steel.

4.1.1. Objective

The modeling activity is motivated by the fact that a reduction in reagent consumption would be possible if a more precise and reliable model could be developed to estimate the right amount of reagents to be used in the desulfurization process. This is because, in the desulfurization process, the target amount of sulfur, i.e., the aim sulfur, is often set much lower than the actual sulfur value that comes from the blast furnaces. The “aim sulfur” specifies the quality of steel demanded by customers. One of the key concerns in this case study is that when a model with poor predictive capability is used, many batches of hot metal have to be desulfurized again and again. It should be noted that the desulfurization process is a highly expensive process.

Therefore, the main objective in this exercise is to minimize the number of desulfurization processes by increasing the model's prediction ability. For this purpose, FF-LSEs were developed and applied to provide a more accurate and reliable determination of the reagent amounts needed to desulfurize each new batch of hot metal. In addition, the same dataset is used to determine OLSE, Ordinary LSE, in order to show that it is advantageous to use the proposed approach.

In summary, the objective of the desulfurization process is basically to adjust the level of sulfur in the steel to meet a quality specification. The sulfur is removed from the metal by adding two different reagents, Reagent1 and Reagent2. These reagents bind the sulfur and move it into the slag layer, which forms on top of the hot metal. After the sulfur is removed, the hot metal is used to produce the final steel product. As each batch of hot metal arrives at the station, a model is used to predict what amount of each of the reagents will be required to produce the final steel product quality.

4.1.2. Dataset

The input variables used in the dataset can be divided into two parts: scalar and non-scalar variables. Scalar variables have continuous values between −∞ and +∞, whereas non-scalar variables can be either binary or ordinal variables. Hence some of the variables of the desulfurization model are (1) scalar variables: start sulfur, KGS, Temp, FB, aim-sulfur, end-sulfur,
compounds 1–5 and (2) binary or ordinal variables: Car-Type, Pos, Practice 1–6, Injection #, Equipment Type.

Note that, since the goal of the model is to predict what amount of each of the reagents is required to produce the final steel product, the model has two dependent output variables, Y, one for Reagent1 and the other for Reagent2. It is found that these two variables are in general highly correlated.

4.1.3. Data pre-processing

At the beginning of the desulfurization project, various data pre-processing techniques were applied to construct a dataset to determine FF-LSEs as well as OLSEs in order to estimate the amounts of reagents needed in each process.

We were given a desulfurization dataset which contained approximately 13,000 observations, i.e., vectors, with 27 variables composed of binary, ordinal and scalar variables as well as a few categorical variables. Each observation represents a data vector for one batch of hot metal.

First, various outlier treatments were conducted by applying expert knowledge. After the outlier treatments, the remaining dataset contained only 9475 observations. Next, the dataset was partitioned into two separate datasets, namely the training and testing datasets, using a proposed sampling technique known as the Partial Iterative Sampling Method (PISM). PISM is an iterative method used to select the best chunk of data to be used for modeling. Different training and testing datasets are prepared in each iteration. The steps of this iterative process are omitted owing to page restrictions.

4.1.4. Variable selection

The process of selecting input variables of the system includes correlation analysis, descriptive statistics, and expert knowledge. Recall that FCM requires that explanatory and response variables be scalar. In addition, the effect of non-scalar variables can also be determined by a logistic regression analysis. With the application of logistic regression, one can capture the probability by which non-scalar input variables can affect an output variable. These probabilities are included as new additional variables in the augmented input dataset.

4.1.5. Logistic regression

Logistic regression is a variation of ordinary regression. Unlike ordinary linear regression, logistic regression does not assume that the relationship between the independent variables and the dependent variable is a linear one. Nor does it assume that the dependent variable or the error terms are distributed normally.

The form of the model in general is

$$\log\left(\frac{p}{1-p}\right) = b_0 + b_1 X_1 + b_2 X_2 + \cdots + b_k X_k$$

where p is the probability that Y = 1, $X_1, X_2, \ldots, X_k$ are the binary or ordinal independent variables (predictors), and $b_0, b_1, b_2, \ldots, b_k$ are known as the regression coefficients, which have to be estimated from the data. Logistic regression estimates the probability of a certain event occurring. From the equation shown above, the probability of Y = 1 is calculated in the general case as follows:

$$p(Y = 1) = \frac{e^{\,b_0 + \sum_{i=1}^{n} b_i x_i}}{1 + e^{\,b_0 + \sum_{i=1}^{n} b_i x_i}}$$

Logistic regression thus forms a predictor variable, log(p/(1 − p)), which is a linear combination of the binary or ordinal explanatory variables. The values of this predictor variable are then transformed into probabilities by a logistic function.

In this study, only the discrete explanatory variables in the dataset, i.e., binary and ordinal, are used to model the logistic regression. Because the original response variables are scalar, each response variable is sorted and then divided into two groups, each of which has an equal number of observations. Thereby, two new binary output variables are constructed. As a result, using the new binary response variables and the same discrete explanatory variables, two logistic regression models are constructed using the probabilities as the fitted values.

The correlations of the probabilities of the fitted logistic regression model with the binary response variable are calculated, and the probabilities having the highest correlation to the response variables are added as the new input variable to the original input variable dataset to form the augmented input variable dataset.

It should be recalled that the two response variables, Reagent1 and Reagent2, are highly correlated; hence the selected probabilities of these two response variables are more or less identical. Based on this, the probability from Reagent1's logistic regression model is used as the new input for the formation of FF-LSE.

4.1.6. Formation of FF-FRB-LSE

We have developed and searched through a good number of FF-LSE models. Amongst these, we very briefly discuss four specific models, which are labeled FF-LSE-M1, FF-LSE-M2, FF-LSE-M3, and FF-LSE-M4. The augmented matrices of these four models are specific selections from the 20 models partly shown above: Model1–Model20. In addition, Fuzzy Functions-LSE are constructed for each of the two output variables, Reagent1 and Reagent2, separately. These four FF-LSE models are built using our FSM tool, developed with SAS 9.1 in KIS Lab, Knowledge/Intelligence Laboratory, by the KIS Lab Software development group. (For more information about the software see http://www.mie.utoronto.ca/labs/fuzzy/Fsmdemo.html.) Again, for each of the four different model structures, a heuristic search has been applied to find the optimum model parameters. For comparative purposes, the results from each of the four sub-optimum FF-LSE models together with OLSE are shown in Table 1. (Recall that FCM identifies sub-optimal clusters.)

As stated before, the FF-LSE models are better predictors of the outputs by 10% or more, as shown in Table 1. The coefficients of the best FF-LSE, i.e., FF-LSE-M3, are shown in Table 2.
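The logistic step of Section 4.1.5 can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the SAS-based procedure used in the study: the helper names `median_split` and `logistic_probability` are hypothetical, the regression coefficients are assumed to have been fitted already by some logistic regression routine, and the handling of an odd number of observations (the extra observation goes to the upper group) is an arbitrary choice.

```python
import numpy as np

def median_split(y):
    """Turn a scalar response into a binary one.

    The values are sorted and divided into two groups of equal size;
    the upper group is labeled 1 and the lower group 0.
    """
    y = np.asarray(y, dtype=float)
    order = np.argsort(y, kind="stable")
    labels = np.zeros(len(y), dtype=int)
    labels[order[len(y) // 2:]] = 1
    return labels

def logistic_probability(x, b0, b):
    """p(Y = 1) = exp(z) / (1 + exp(z)) with z = b0 + sum_i b_i * x_i.

    Written as 1 / (1 + exp(-z)), which is algebraically the same expression.
    """
    z = b0 + np.asarray(x, dtype=float) @ np.asarray(b, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical data: four batches, two binary explanatory variables.
reagent = [12.0, 7.5, 9.1, 15.3]
discrete_inputs = [[1, 0], [0, 1], [0, 0], [1, 1]]
y_binary = median_split(reagent)
# Assumed (not fitted) coefficients, for illustration only.
p = logistic_probability(discrete_inputs, b0=-0.5, b=[1.2, 0.4])
# Correlation between the fitted probabilities and the binary response,
# the quantity used to decide which probability column to add as a new input.
correlation = np.corrcoef(p, y_binary)[0, 1]
```

The probability column `p` is what would be appended to the scalar inputs to form the augmented input dataset.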
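The least squares estimation of one cluster's fuzzy function, with the membership values entering the augmented input matrix as an extra column, can be sketched as below. This is a hedged sketch rather than the FSM tool's implementation: the function names `augmented_matrix` and `fit_fuzzy_function` are hypothetical, only the first two augmented-matrix alternatives (membership values, optionally with their squares) are covered, and `numpy.linalg.lstsq` is used in place of an explicit matrix inverse, which gives the same coefficients whenever that inverse exists.

```python
import numpy as np

def augmented_matrix(gamma_i, x, squared=False):
    """Build [1, Gamma_i, X], or [1, Gamma_i, Gamma_i^2, X] when squared=True.

    gamma_i: (nd,) membership values of the observations in cluster i
    x:       (nd,) or (nd, nv) original scalar input variables
    """
    gamma_i = np.asarray(gamma_i, dtype=float).reshape(-1, 1)
    x = np.asarray(x, dtype=float).reshape(len(gamma_i), -1)
    columns = [np.ones_like(gamma_i), gamma_i]
    if squared:
        columns.append(gamma_i ** 2)
    columns.append(x)
    return np.hstack(columns)

def fit_fuzzy_function(gamma_i, x, y, squared=False):
    """Least squares coefficients of the fuzzy function of one cluster."""
    xa = augmented_matrix(gamma_i, x, squared)
    # lstsq minimizes ||y - xa @ b||^2; when (xa^T xa) is invertible this is
    # the normal-equation solution (xa^T xa)^{-1} xa^T y.
    b_i, *_ = np.linalg.lstsq(xa, np.asarray(y, dtype=float), rcond=None)
    return b_i

# Synthetic check: data generated from known coefficients (2, 3, -1).
gamma = np.array([0.1, 0.4, 0.7, 0.9, 0.3])
x_single = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_obs = 2.0 + 3.0 * gamma - 1.0 * x_single
coefficients = fit_fuzzy_function(gamma, x_single, y_obs)
y_hat = augmented_matrix(gamma, x_single) @ coefficients
```

One such fit would be repeated for every cluster, each with its own membership column, so that there are as many fuzzy functions as clusters.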
Table 1
R-square values of the Reagent1 and Reagent2 estimations using the OLSE and FF-LSE approaches for the test dataset

Model name: desulfurization         R-square                         Parameters
                                    Reagent1   Reagent2   Average    m*     c*
Ordinary Least Square Estimation    0.82       0.76       0.79
FF models
  FF-LSE-M1 (12 input)              0.92       0.93       0.925      1.5    3
  FF-LSE-M2 (13 input)              0.925      0.934      0.93       1.5    3
  FF-LSE-M3 (16 input)              0.9243     0.936      0.931**    1.5    3
  FF-LSE-M4 (18 input)              0.9255     0.93       0.928      1.5    3

m*: optimum degree of fuzziness; c*: optimum cluster size. The “Average” column represents the performance obtained from the average of the R-square values for the Reagent1 and Reagent2 models, respectively. **: bold numbers indicate optimal results.
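The size of the improvement can be checked with a few lines of Python using the “Average” column of Table 1. Reading “better by 10% or more” as a relative gain in R-square over OLSE is an assumption of this sketch; the function name `relative_gain` is hypothetical.

```python
# Average test R-square values copied from Table 1.
olse_average = 0.79
ff_averages = {
    "FF-LSE-M1": 0.925,
    "FF-LSE-M2": 0.93,
    "FF-LSE-M3": 0.931,
    "FF-LSE-M4": 0.928,
}

def relative_gain(model_r2, baseline_r2):
    """Relative improvement of a model's R-square over the baseline."""
    return (model_r2 - baseline_r2) / baseline_r2

gains = {name: relative_gain(r2, olse_average) for name, r2 in ff_averages.items()}
best_model = max(gains, key=gains.get)

for name, gain in gains.items():
    print(f"{name}: {gain:.1%}")
```

Under this reading, all four FF-LSE models exceed the 10% threshold, with gains of roughly 17 to 18% over OLSE.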
only requires the substitution of membership values of a new input vector (observation), determined by an execution of a fuzzy clustering algorithm, together with the input variables of the new input vector, into a number of fuzzy functions, the number of fuzzy functions being equal to the number of rules, i.e., fuzzy clusters, determined by, say, FCM. In fuzzy rule bases, by contrast, membership functions of fuzzy sets associated with the original input variables as well as the output variable must be estimated either by experts or by a fuzzy clustering technique for representation. Next, for approximate reasoning, the system analyst needs to identify combination operators known as t-norms and t-conorms and the associated implication operator, and also needs to implement fuzzification, Generalized Modus Ponens and defuzzification. Thus, for representation and reasoning with fuzzy functions, application-oriented mathematicians and statisticians: (1) do not need to deal with the estimation of the membership functions of the fuzzy sets that make up a fuzzy rule base, once membership values are obtained either from experts or from a fuzzy clustering algorithm, or to fuzzify a new input with respect to the estimated membership functions; (2) do not have to know anything about t-norms and t-conorms or how to choose an appropriate t-norm, t-conorm and associated implication operator to identify a good approximate reasoning structure among the alternatives of Generalized Modus Ponens; and finally (3) do not need to choose a de-fuzzification technique. Instead, these application-oriented system modelers need to know just a fuzzy clustering technique together with the least squares estimation technique. Finally, it is important to recall that in fuzzy function structure identification, we include interactive membership values and their transformations as new input variables. Thus we do not need to make an independence assumption for the input variables.

In the second case, the structural difference is in terms of the schema with which the original variables enter into the formation of a function. First, generally, there is only one fuzzy regression generated with the use of all data, whereas there are a number of fuzzy functions, equal to the number of clusters, i.e., rules. Secondly, in the fuzzy regression approach, original input variables are the only ones that enter into the right-hand side of the function to be estimated by the least squares method [7]. Alternatively, fuzzy clustering results are used as switching functions in the applications of fuzzy regression suggested by Hathaway and Bezdek [13]. In the “Fuzzy function-LSE” approach, by contrast, membership values and their transformations, together with a logistic transformation of the original non-scalar input variables, enter into an augmented input matrix in addition to the original scalar variables for the application of the least squares method. Thirdly, the coefficients of the fuzzy regression equations are assumed to be, generally but not necessarily symmetric, triangular membership functions [7]. In “Fuzzy function-LSE”, membership values and their transformations enter into the right-hand side of the equations as new variables.

To demonstrate the advantages of the proposed fuzzy functions, we have presented two real-life case studies. In these applications, we have compared the results of the fuzzy function approach only to the ordinary least squares estimation. Comparisons with other approaches could be made, but this is left for our future studies. In these comparisons, we have shown several models of fuzzy functions together with the best amongst them. It should be noted that the best model result is generally obtained by a more complicated augmented matrix. In practice, the simplest fuzzy functions, which are obtained with the augmented matrix that contains only the membership variable in addition to the original variables, might be good enough because of their operational simplicity and their closeness to the best model with respect to a performance measure. However, one generally obtains a better estimate when transformations of membership values are included in the augmented matrix.

References

[1] J.C. Bezdek, “Fuzzy Mathematics in Pattern Classification”, PhD Thesis, Applied Mathematics Centre, Cornell University, Ithaca, 1973.
[2] L.A. Zadeh, Fuzzy sets, Inf. Control 8 (1965) 338–353.
[3] L.A. Zadeh, The concept of a linguistic variable and its application to approximate reasoning, Inf. Sci. 8 (1975) 199–249.
[4] E.H. Mamdani, S. Assilian, An experiment in linguistic synthesis with a fuzzy logic controller, in: E.H. Mamdani, B.R. Gaines (Eds.), Fuzzy Reasoning and Its Applications, Academic Press, New York, 1981, pp. 311–323.
[5] T. Takagi, M. Sugeno, Fuzzy identification of systems and its applications to modeling and control, IEEE Trans. Syst. Man Cybern. SMC-15 (1) (1985) 116–132.
[6] H. Tanaka, Fuzzy data analysis by possibilistic linear models, Fuzzy Sets Syst. 24 (1991) 363–375.
[7] H. Tanaka, S. Uejima, K. Asai, Linear regression analysis with fuzzy model, IEEE Trans. Syst. Man Cybern. SMC-2 (1982) 903–907.
[8] H. Tanaka, H. Ishibuchi, S. Yoshikawa, Exponential possibility regression analysis, Fuzzy Sets Syst. 69 (1995) 305–318.
[9] H. Tanaka, H. Ishibuchi, Identification of possibilistic linear systems by quadratic membership functions of fuzzy parameters, Fuzzy Sets Syst. 41 (1991) 145–160.
[10] A. Celmins, Least squares model fitting to fuzzy vector data, Fuzzy Sets Syst. 22 (1987) 245–269.
[11] A. Celmins, Multidimensional least squares model fitting of fuzzy models, Math. Model. 9 (1987) 669–690.
[12] D. Savic, W. Pedrycz, Evolution of fuzzy linear regression models, Fuzzy Sets Syst. 39 (1991) 51–63.
[13] R.J. Hathaway, J.C. Bezdek, Switching regression models and fuzzy clustering, IEEE Trans. Fuzzy Syst. 1 (3) (1993) 195–203.
[14] I.B. Türkşen, Interval valued fuzzy sets based on normal forms, Fuzzy Sets Syst. 20 (1986) 191–210.
[15] I.B. Türkşen, Type 2 representation and reasoning for CWW, Fuzzy Sets Syst. 127 (2002) 17–36.
[16] M.R. Delgado, A.F. Gomez-Skarmeta, F. Martin, Rapid prototyping of fuzzy models, in: H. Hellendoorn, D. Driankov (Eds.), Fuzzy Model Identification: Selected Approaches, Springer, Berlin, Germany, 1997, pp. 53–90.
[17] R. Babuska, H.B. Verbruggen, Constructing fuzzy models by product space clustering, in: H. Hellendoorn, D. Driankov (Eds.), Fuzzy Model Identification: Selected Approaches, Springer, Berlin, Germany, 1997, pp. 53–90.
[18] Ö. Uncu, I.B. Türkşen, A novel fuzzy system modeling approach: multi-dimensional structure identification and inference, in: Proceedings of the Tenth IEEE International Conference on Fuzzy Systems, Melbourne, Australia, December 2001, pp. 557–562.
[19] M. Demirci, Fuzzy functions and their fundamental properties, Fuzzy Sets Syst. 106 (1999) 239–246.
[20] M. Demirci, Foundations of fuzzy functions and vague algebra based on many-valued equivalence relations. Part I. Fuzzy functions and their applications I, J. Gen. Syst. 32 (2003) 123–155.
[21] M. Demirci, J. Recasens, Fuzzy groups, fuzzy functions and fuzzy equivalence relations, Fuzzy Sets Syst. 144 (2004) 441–458.
[22] U. Höhle, Quotients with respect to similarity relations, Fuzzy Sets Syst. 27 (1988) 31–44.