MODELLING INTELLIGENT SYSTEM FOR
MEDICAL DIAGNOSIS
A Thesis submitted
in partial fulfillment of the requirements for the award of the degree
DOCTOR OF PHILOSOPHY
by
Jyotirmoy Ghosh
Regn. no. – Comp. Sc./Sc/491
2012
ABSTRACT
Modelling an intelligent system in the field of medical diagnosis is still challenging
work. Intelligent systems in medical diagnosis can be used as a supporting tool for
the medical practitioner, mainly in a country like India with vast rural areas and an
absolute shortage of physicians. Intelligent systems in the field of medical diagnosis
can also reduce the cost and problems of diagnosis, such as dynamic perturbations,
shortage of physicians, etc.
This thesis is virtually divided into three parts. In the first part, we propose a
prototype model of an Interactive Intelligent System for Medical Diagnosis (IISMD) for
the diagnosis of diseases in a particular domain, say convulsion in infancy. IISMD
employs fuzzy logic for knowledge representation and uses Generalized Modus
Ponens (GMP) for inferencing. An explanation facility is also incorporated in
IISMD. The knowledge base of IISMD is constructed from a set of interconnected
rules. Each rule is associated with a certainty factor that expresses the degree of belief.
The proposed Interactive Intelligent System can directly assist physicians and other
health professionals in the diagnosis of diseases in a particular domain, say
convulsion in infancy.
Our proposed rough-fuzzy framework incorporates the advantages of fuzzy sets for the
linguistic representation of data and for handling uncertainty and vagueness in data,
as well as the advantages of rough sets for handling impreciseness in data and for
generating dependency rules from an experimental data set. Each rule is associated with
a certainty factor that represents its degree of correctness. The certainty factor of
each rule is constructed by combining the membership values of the fuzzy set and the
rough set.
ACKNOWLEDGEMENT
I express my sincere gratitude to Prof. Utpal Garain, CVPR unit, Indian Statistical
Institute, Kolkata, for his technical guidance and encouragement during the work of
this thesis.
I would also like to convey my sincere thanks to my friend, Dr. Subhabrata Ray,
MBBS, MD (Medicine), RMO cum Clinical Tutor, Department of General Medicine,
North Bengal Medical College & Hospital, for his constant help and support in
supplying the data and sharing the required knowledge of the medical domain
throughout my work on this thesis.
I want to thank my friend Dr. Dipendra Nath Ghosh, Associate Professor, Dr. B. C.
Roy Engineering College, Durgapur, for his constant motivation and encouragement.
I would also like to thank my colleague Mr. Anirban Kundu, Assistant Professor,
Dept. of Computer Application, Heritage Institute of Technology, Kolkata, for
offering me help whenever I needed it most.
I am thankful to all my colleagues in the Dept. of Computer Application, Heritage
Institute of Technology, Kolkata, for their cooperation.
My parents and my sister were my constant source of inspiration and motivation for
completion of this thesis.
Last but not least, I am thankful to all my friends for encouraging me throughout
this time.
Jyotirmoy Ghosh
Jyotirmoy Ghosh
Assistant Professor
Department of Computer Application
Heritage Institute of Technology
Kolkata-700107
Declaration
I do hereby declare that the work embodied in this thesis entitled, MODELLING
work unless otherwise specified. This research work is carried out under the guidance
Science, The University of Burdwan. This piece of work or any part of this thesis has
not been submitted to any other institution for the award of any other degree.
Dr. Sripati Mukhopadhyay
Professor of Computer Science & Registrar (Addl. Charge)
The University of Burdwan
Rajbati, Burdwan - 713 104
West Bengal, India
Tel: (0342)-2634975 (Extn. 296)
Tel-Fax: (0342)2634015 (O)
E-Mail: registrar@buruniv.ac.in, dr.sripatim@gmail.com
http://www.buruniv.ac.in
CERTIFICATE
This is to certify that the thesis entitled, MODELLING INTELLIGENT SYSTEM FOR MEDICAL
Application, Heritage Institute of Technology, Kolkata, for award of the degree of Doctor of
Philosophy under The University of Burdwan, is a record of bona fide research work carried out
by him under my supervision. Results included in this thesis or any part of this has not been
Table of Contents
Abstract
Declaration
1. Introduction
1.1 Introduction and Goals
1.2 Thesis Outline
Bibliography
2. Knowledge-Based Intelligent System
2.1 Introduction
2.2 Knowledge-Based System: An Overview
2.3 Components of Knowledge-Based System
2.3.1 Knowledge Representation
2.3.1.1 Traditional Data Models
2.3.1.2 Object-Based Data Models
2.3.2 Inference Mechanisms
2.3.2.1 Backward-Chaining Inference Mechanism
2.3.2.2 Forward-Chaining Inference Mechanism
2.3.2.3 Hybrid Inference Mechanism
2.3.3 Uncertainty in KBS and Reasoning Process
2.3.3.1 Certainty Factor
2.3.3.2 Fuzzy Logic
2.3.3.3 Belief Networks
2.3.3.4 Case-Based Reasoning
2.3.3.5 Dempster-Shafer Theory
2.4 Fuzzy Knowledge-Based Intelligent Systems and Related Literature
2.4.1 Architecture of FKBIS
2.4.2 Knowledge Base
2.4.3 Fuzzification
2.4.3.1 Linguistic Variables
2.4.3.2 Fuzzy Set Operations
2.4.3.3 Membership Functions
2.4.4 Inference System
2.4.4.1 Mamdani Inference Method
2.4.4.2 Takagi-Sugeno-Kang Inference Method
2.4.5 Defuzzification
2.4.6 Fuzzy Rule Determination
2.5 Conclusion
Bibliography
3. Interactive Intelligent System for Medical Diagnosis
3.1 Introduction
3.2 Knowledge Base and Inference Procedure
3.2.1 Knowledge Base of IISMD
3.2.2 Parameters Describing the Domain
3.2.3 Knowledge Representation
3.2.4 Inferencing
3.3 Observation
3.4 Conclusion
Bibliography
4. Rough-Fuzzy Hybridization
4.1 Introduction
4.2 Fuzzy Set Theory
4.3 Rough Set Theory
4.3.1 Introduction
4.3.2 Information System
4.3.3 Decision System
4.3.4 Indiscernibility Relation
4.3.5 Approximation Space
4.3.6 Approximation of Sets
4.3.6.1 Accuracy of Approximation
4.3.6.2 Approximation and Accuracy of Classification
4.3.6.3 Classification and Reduction of Attributes
4.3.7 Discernibility Matrix
4.3.8 Decision Rules
4.4 Combination of Rough and Fuzzy Sets
4.4.1 Fuzzy Equivalence Classes
4.4.2 Rough-Fuzzy Sets
4.4.3 Fuzzy-Rough Sets
4.4.4 Fuzzy-Rough Hybrids
4.4.5 Fuzzy-Rough Reduction Process
4.5 Conclusion
Bibliography
5. Rough-Fuzzy Intelligent System
5.1 Introduction
5.2 Rough-Fuzzy Rule Generation
5.3 Fuzzification of Data using Fuzzy Set Theory
5.4 Rule Generation using Rough Set Theory
5.4.1 Method 1
5.4.2 Method 2
5.4.2.1 For inconsistent information system
5.4.2.2 For consistent information system
5.5 Modified Framework to Generate Rough-Fuzzy Rule with CF
5.5.1 For inconsistent information system
5.5.2 For consistent information system
5.6 Description of Medical Domain
5.7 Application on Medical Data Set
5.8 Conclusion
Bibliography
6. Rough-Fuzzy-Neural Network Hybridization
6.1 Introduction
6.2 Neural Network System
6.2.1 McCulloch-Pitts Neuron
6.2.2 Hebb Nets or Rosenblatt’s Perceptron
6.2.3 Adaline and Madaline Models
6.2.4 Multilayer Perceptron Model
6.2.5 Radial Basis Function Neural Network
6.2.6 Other Neural Network Architectures
6.3 Neuro-Fuzzy Systems
6.3.1 Neural Fuzzy Systems
6.3.2 Fuzzy Neural Networks
6.3.3 Fuzzy-Neural Hybrid Systems
6.4 Neuro-Fuzzy Models
6.4.1 Multi-Layer Perceptron
6.4.2 Fuzzy Multi-Layer Perceptron
6.4.2.1 Desired Output Vector
6.4.2.2 Process of Updating the Weights
6.4.2.3 Learning Strategy
6.5 Combination of Rough, Fuzzy and Neural Network
6.6 Conclusion
Bibliography
7. Intelligent System Based on Rough-Fuzzy-Neural Network
7.1 Introduction
7.2 Rough-Fuzzy Rule Generation
7.2.1 Basic Concept of Rough Set
7.2.2 Fuzzification of Data using Fuzzy Set
7.2.3 Rule Generation using Rough Set
7.2.3.1 For consistent information system
7.2.3.2 For inconsistent information system
7.3 Mapping Rules into the Fuzzy-Neural Network
7.4 Application on Medical Data Set
7.5 Conclusion
Bibliography
Table of Figures
Table of Tables
Table 4.3: The DIABET data set with the reduced attribute set B = {c1, c2, c3}
Table 5.1: Comparison result of existing method and modified method
Introduction
Chapter 1
Introduction
Medical science has often been considered a test domain for newly introduced
learning and reasoning techniques in computation [1, 2, 3, 4, 5]. Medicine, an
important field of medical science, not only has many applications whose solutions
matter from a social perspective, but is also extremely difficult, with a wide range of
confounding factors and aspects that demand distinct attention, which makes it a
technically challenging area. This association between computational techniques and
medicine has led to significant research in the past, but major research effort is
still needed to establish a significant impact on the practice of medicine [6, 7].
The functioning of the human body is characterized by the complex and highly
interactive chemistry of its organs and the psyche. This concerted effort results in
homeostasis, the equilibrium of all physiological quantities. This balance is
maintained at a level within physiological bounds that vary from individual to
individual. Deviations from it, whether due to internal or external causes, are
indicative of some kind of perturbation. Identifying the cause of these perturbations
is the goal of medical diagnosis.
Reaching a foolproof diagnosis is never an easy job for a medical practitioner. Today in
medical diagnosis it is often impossible to look inside a patient to determine the
primary cause that led to the series of effects and reactions the patient complains
about. Thus the diagnosis is based on indirect evidence, symptoms and knowledge
of the medical mechanisms that relate presumed causes to observed effects. The
problems of diagnosis arise not only from the incompleteness of knowledge, but
also from the more immediate limitations of theoretical and practical knowledge of
the implications that lead from an initial cause to its observable effects.
Other difficulties also arise in medical diagnosis, such as the following:
All observations are subject to error. Error correction methods are stochastic
in nature and require strong assumptions that do not always hold in practice.
Although taken alone none of these problems is unique to the medical domain, taken
together they add up to an intricacy surpassing that of even the most sophisticated
man-made systems known today. It is therefore realistic to expect that medical
diagnosis will remain problematic for a long time.
As the world population ages, the patients-per-physician ratio keeps on
increasing. Much of this situation is owed to the fact that medical diagnosis requires
proficiency in coping with uncertainty, vagueness and impreciseness, which motivates
the modelling of intelligent systems for medical diagnosis. Intelligent systems in the
field of medical diagnosis can also reduce the cost and problems of diagnosing
dynamic perturbations.
Fuzzy set theory was introduced by Zadeh [9] as a powerful tool for the formalization
of uncertain and vague knowledge. Together with appropriate rules of inference and a
reasoning process, it provides a powerful framework for the combination of evidence
and deduction of consequences based on knowledge specified in syllogistic [10] form.
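As a rough illustration of how a fuzzy set formalizes a vague term, the Python sketch below builds triangular membership functions and applies the standard min, max and complement operators. The helper name `triangular`, the linguistic terms and the temperature breakpoints are assumptions made for this example only; they are not taken from the thesis.

```python
def triangular(a, b, c):
    """Return a triangular membership function with feet at a and c and peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu


# Hypothetical linguistic terms for body temperature in degrees Celsius.
normal = triangular(35.5, 36.8, 38.0)
fever = triangular(37.0, 39.0, 41.0)


def fuzzy_and(m1, m2):
    """Intersection of two membership degrees (min operator)."""
    return min(m1, m2)


def fuzzy_or(m1, m2):
    """Union of two membership degrees (max operator)."""
    return max(m1, m2)


def fuzzy_not(m):
    """Complement of a membership degree."""
    return 1.0 - m


temperature = 37.5
mu_normal = normal(temperature)  # partly "normal"
mu_fever = fever(temperature)    # partly "fever" at the same time
print(mu_normal, mu_fever, fuzzy_and(mu_normal, mu_fever))
```

A single reading can belong to both terms with different degrees, which is how vague clinical language can be represented without a hard threshold.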
A number of papers have been published in the recent past on modelling intelligent
systems in the medical domain. A fuzzy-classifier-based medical decision support
system has been developed by I-Jen Chiang et al. to deal with highly uncertain, noisy
data [11]. An automated model for medical diagnosis using fuzzy stochastic techniques
for monitoring chronic diseases in aged people has been developed by L. Jeanpierre
and F. Charpillet [12]. An expert system for respiratory diseases can be seen in the
work of Musbah Jumah Aqel [13]. Successful development of an expert system in the
domain of child growth and development can be seen in the work of Saha, Mondal
and Samanta [14, 15].
This thesis is virtually divided into three parts. In the first part, we propose a
prototype model of an Interactive Intelligent System for Medical Diagnosis (IISMD) [18]
for the diagnosis of diseases in a particular domain, say convulsion in infancy. IISMD
employs fuzzy logic for knowledge representation and uses Generalized Modus
Ponens (GMP) for inferencing. An explanation facility is also incorporated in
IISMD. The knowledge base of IISMD is constructed from a set of interconnected
rules. Each rule is associated with a certainty factor that expresses the degree of belief.
The proposed Interactive Intelligent System can directly assist physicians and other
health professionals in the diagnosis of diseases in any particular domain; convulsion
in infancy is considered as a case study in this thesis.
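To make the inference step concrete, here is a minimal Python sketch of Generalized Modus Ponens on discrete universes. It assumes sup-min composition with the min (Mamdani) implication, and it assumes that the certainty factor simply scales the inferred memberships; both choices are illustrative and may differ from the formulation actually used in IISMD. The function name `gmp`, the fuzzy sets and all numbers are hypothetical.

```python
def gmp(a_rule, b_rule, a_fact, cf=1.0):
    """Infer B' from the rule 'IF x is A THEN y is B' and the fact 'x is A''.

    a_rule, a_fact: dicts mapping each x to its membership in A and A'.
    b_rule: dict mapping each y to its membership in B.
    cf: certainty factor of the rule (assumed here to scale the conclusion).
    """
    inferred = {}
    for y, mu_b in b_rule.items():
        # sup-min composition: B'(y) = max over x of min(A'(x), min(A(x), B(y)))
        best = max(
            min(a_fact.get(x, 0.0), min(mu_a, mu_b))
            for x, mu_a in a_rule.items()
        )
        inferred[y] = cf * best
    return inferred


# Hypothetical rule: IF fever is high THEN convulsion risk is elevated (CF = 0.9).
high_fever = {37: 0.0, 38: 0.4, 39: 0.8, 40: 1.0}
convulsion_risk = {"low": 0.1, "medium": 0.5, "high": 1.0}

# Hypothetical observation: the fever is "fairly high", not exactly "high".
observed = {37: 0.2, 38: 0.7, 39: 1.0, 40: 0.6}

result = gmp(high_fever, convulsion_risk, observed, cf=0.9)
print(result)
```

Unlike classical modus ponens, the observed fact only has to overlap the rule antecedent; the degree of overlap limits how strongly the consequent is asserted.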
Rule generation techniques have been widely developed and used in data mining
to develop intelligent systems in many application areas [23], such as medical
diagnosis, decision-making, classification and prediction. Each technique has its own
advantages, so using only one technique is not enough to provide a more flexible and
robust information processing system. Hybridizations of rough sets and fuzzy
sets for rule generation have therefore been introduced [24-30].
Our proposed rough-fuzzy framework [22] incorporates the advantages of fuzzy sets
for the linguistic representation of data and for handling uncertainty and vagueness
in data, as well as the advantages of rough sets for handling impreciseness in data and
for generating dependency rules from a table of data. Each rule is associated with a
certainty factor that represents its degree of correctness. The certainty factor of each
rule is constructed by combining the membership values of the fuzzy set and the rough set.
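The rough-set side of this framework can be sketched in Python as follows: objects that share the same attribute values are indiscernible, and a target concept is bracketed by its lower and upper approximations. The toy decision table, the attribute names and the function names are hypothetical, and the way the rough membership value is combined with a fuzzy membership value to form the certainty factor is not shown, since this passage does not specify it.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group object ids that are indiscernible on the given attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attributes)
        classes[key].add(obj)
    return list(classes.values())


def approximations(table, attributes, target):
    """Return the (lower, upper) approximations of the target set of objects."""
    lower, upper = set(), set()
    for cls in equivalence_classes(table, attributes):
        if cls <= target:   # class lies entirely inside the target
            lower |= cls
        if cls & target:    # class overlaps the target
            upper |= cls
    return lower, upper


def rough_membership(table, attributes, obj, target):
    """Fraction of obj's equivalence class that belongs to the target."""
    for cls in equivalence_classes(table, attributes):
        if obj in cls:
            return len(cls & target) / len(cls)
    raise KeyError(obj)


# Hypothetical decision table: condition attributes and a decision attribute.
table = {
    "p1": {"fever": "high", "seizure": "yes", "disease": "yes"},
    "p2": {"fever": "high", "seizure": "yes", "disease": "no"},
    "p3": {"fever": "low", "seizure": "no", "disease": "no"},
    "p4": {"fever": "high", "seizure": "no", "disease": "yes"},
}
conditions = ["fever", "seizure"]
diseased = {obj for obj, row in table.items() if row["disease"] == "yes"}

lower, upper = approximations(table, conditions, diseased)
print(sorted(lower), sorted(upper))
print(rough_membership(table, conditions, "p1", diseased))
```

Here p1 and p2 have identical symptoms but different outcomes, so they fall in the boundary region: rules derived from them can only be asserted with partial certainty.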
In order to provide more efficient, reliable, accurate, flexible and robust information
processing systems, hybridizations of soft computing methodologies have been introduced
for modelling intelligent systems in different areas like rule generation [31, 32], prediction
This thesis concerns itself with tools for modelling intelligent systems in medical
diagnosis. As such, the context in which to place this work is an intersection of
several scientific areas, the two perhaps most relevant being the field of data mining
and knowledge discovery, and the field of medical informatics. This work focuses on
developing tools and techniques from a certain subfield of the former area
and applying them in the latter.
Bibliography
[2] Robert S. Ledley and Lee B. Lusted, 1959, Reasoning foundations of medical
diagnosis, Science, 130(3366), pp. 9–21.
[4] W. B. Schwartz, 1970, Medicine and the computer: The promise and
problems of change, New England Journal of Medicine, 283, pp. 1257–1264.
[9] L. A. Zadeh, 1965, Fuzzy Sets, Information and Control, 8, pp. 338-353
[11] I-Jen Chiang, Ming-Jium Shieh, Jane Yung-Jen Hsu, Jau-Min Wong, 2005,
Building a medical decision support system for colon polyp screening by using
fuzzy classification trees, Applied Intelligence, Vol. 22, pp. 61-75.
[16] Teach, R.L. and Shortliffe, E.H., 1981, An analysis of physician attitudes
regarding computer-based clinical consultation systems, Comput. Biomed.
Res. 14, pp. 542-558.
[17] Wallis, J.W. and Shortliffe, E.H., 1982, Explanatory power for medical expert
systems: studies in the representation of causal relationships for clinical
consultations, Meth. Info. Med. 21, pp. 127-136.
[23] S. Mitra, Sankar K. Pal, and P. Mitra, 2002, Data Mining in Soft Computing
Framework: a Survey, IEEE Trans.on neural networks, vol. 13, no.1, pp.3–14.
[24] Shusaku Tsumoto, 2004, Mining diagnostic rules from clinical databases
using rough sets and medical diagnostic model, Information Sciences 162,
pp. 65–80
[26] P. J. Lingras, 2002, Rough Set Clustering for Web Mining, Proceedings of
2002 IEEE International Conference on Fuzzy Systems, Hawaii, pp. 12-17.
[27] Wlodzislaw Duch, Rafal Adamczak and Krzysztof Grabczewski, 2000, A new
methodology of extraction, optimization and application of crisp and fuzzy
logical rules, IEEE Transactions on Neural Networks, vol. 11, no. 2, pp. 1-31.
[29] Gang Xie , Fang Wang , Keming Xie, 2004, RST-Based System Design of
Hybrid Intelligent Control, IEEE International Conference on Systems, Man
and Cybernetics.
[30] S. K. Pal and A. Skowron, Eds., 1999, Rough Fuzzy Hybridization: A New
Trend in Decision Making, Singapore: Springer-Verlag.
[31] Shi-tong Wang, Dong-jun Yu and Jing-yu Yang, 2003, Integrating rough
set theory and fuzzy neural network to discover fuzzy rules, Intelligent Data
Analysis 7, pp, 59–73.
[33] Jing Hong, 2011, An Improved Prediction Model based on Fuzzy-rough Set
Neural Network, International Journal of Computer Theory and Engineering,
Vol.3, No.1, pp.1793-8201.
[35] Junyang Zhao and Zhili Zhang, 2011, Fuzzy Rough Neural Network and Its
Application to Feature Selection, International Journal of Fuzzy Systems, Vol.
13, No. 4.
[36] M. Banerjee, S. Mitra, and S. K. Pal, 1998, Rough fuzzy MLP: Knowledge
encoding and classification, IEEE Transactions on Neural Networks, vol. 9,
no. 6, pp. 1203-1216.
[37] D. Zhang and Y. Wang, 2006, Fuzzy-rough Neural Network and Its
Application to Vowel Recognition, Control and Decision, vol. 21, no.2, pp.
221-224.
[39] W. C. Chen, N.-B. Chang, J.-C. Chen, 2003, Rough set-based hybrid fuzzy-neural
controller design for industrial wastewater treatment, Water Research
37, pp. 95–107.
Knowledge‐Based Intelligent System
Chapter 2
2.1 Introduction
A KBIS or expert system must ensure that it provides its users with accurate advice
or correct solutions to their problems. In a KBIS, efficiency, accuracy and
reliability may be verified by investigating whether the knowledge base contains all
necessary information and whether the interpretation and application of the stored
information are performed correctly. Knowledge-base debugging, the process of checking
that a knowledge base is correct and complete, is one component of the larger problem
of knowledge acquisition. This process involves testing and refining the knowledge
base to find the errors introduced when transferring expert knowledge from a human
expert to a KBIS, and correcting those errors.
The development of KBIS and ES started in the 1970s. INTERNIST-I was an expert
system designed in the early 1970s to diagnose multiple diseases in internal medicine
by modelling the behaviour of clinicians. By 1982, the INTERNIST-I project
represented fifteen years of work and, by some reports, covered 70-80% of all
possible diagnoses in internal medicine.
A few years later, in 1976, MYCIN was developed by Shortliffe [3]. The
knowledge base of MYCIN consisted of a set of rules in the area of bacterial
infection. Initially it contained 500 rules and used a backward-chaining inference
procedure. In the early 1980s, MYCIN was generalized to EMYCIN [4], a shell for
building other expert systems, and PUFF was then presented in 1982 [5] as a
medical expert system. PUFF interprets measurements from respiratory tests to
determine the presence and severity of lung disease in the patient.
Shortliffe, in 1986 [6], explained the history of expert systems and medical expert
systems. HERMES is another medical expert system, in the domain of hepatic diseases,
designed by Bonfa et al., in 1993 [7]. It was further developed in 1989 as an integrated
expert system that incorporates an RDBMS. Hudson and Cohan in 1994 developed
medical expert systems that incorporate fuzzy logic to handle the uncertainty that lies
behind imprecise or inaccurate information [8]. Schmidt et al., in 2001, explained
Case-based Reasoning (CBR) and its possible application as a reasoning technique in
medical ESs [9]. ESTDD is an Expert System for Thyroid Diseases Diagnosis
developed by Keleş and Keleş in 2006 [10]. This work explained the importance of early
diagnosis, since ignoring the symptoms of the disease can lead to death, whereas early
discovery can help to control it. Akbarzadeh-T et al., in 2007, proposed a general
method for classification and diagnosis in medical systems, in the form of a
diagnostic expert system for aphasia diagnosis [11]. Moreover, Kumar et
al., in 2009, presented a hybrid approach using CBR and Rule-based Reasoning for
domain-independent clinical decision support in Intensive Care Units (ICU) [12].
Many other KBIS and ES have been developed that incorporate different soft computing
techniques for greater efficiency, accuracy and reliability; these are discussed later
in this thesis.
2.2 Knowledge-Based System: An Overview
Databases are now widely used in most industrial areas. A database is a
shared collection of structured, interrelated data, designed to meet the needs of an
organization’s information system [16]. Normally, depending on the complexity of an
organization's information structure, a suitable data model, such as an E-R diagram,
is used in the design to represent the information in databases. A data model is a
complete architecture for describing data, relationships between objects and
constraints on the objects. It usually comprises three essential components [16]:
1. Structural part: a set of rules that are used to build the database.
2. Manipulative part: the operations that are permitted on the data.
3. Set of integrity rules: rules that ensure the accuracy of the data.
solve problems that require extensive human expertise in that domain; it must be
clever enough to solve problems directly [17]. Like a database system, an expert system
also contains a knowledge base, but its knowledge representation is different: it uses
forms such as production rules, which are a major component of an expert system.
Production rules can be used to infer new instances of objects, or new instances of a
relationship among objects, from existing objects.
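The way production rules derive new facts from existing ones can be sketched with a small forward-chaining loop in Python. The rule format (a tuple of premises and one conclusion), the function name `forward_chain` and the medical-sounding fact names are assumptions for illustration only and are not taken from any system described here.

```python
def forward_chain(facts, rules):
    """Apply production rules repeatedly until no new fact can be derived.

    facts: iterable of known fact names.
    rules: list of (premises, conclusion) pairs, where premises is a tuple of fact names.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire the rule only if all premises hold and it adds something new.
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical production rules.
rules = [
    (("fever", "age_under_5"), "febrile_risk"),
    (("febrile_risk", "seizure"), "suspect_febrile_convulsion"),
]

derived = forward_chain({"fever", "age_under_5", "seizure"}, rules)
print(sorted(derived))
```

The second rule fires only because the first one has already added `febrile_risk`, which is the chaining behaviour that a plain database query over stored facts would not provide by itself.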
Databases and ES were developed to represent different aspects of real-world
problems. Database systems have the ability to store huge amounts of structured,
interrelated data with sophisticated data management facilities; ES, on the other hand,
are designed to store production rules and have strong reasoning ability. Thus the
combination of these two technologies would benefit both kinds of system. A
hybrid of an ES and a database system should therefore have both strong reasoning
ability and the ability to access a huge volume of facts and rules [14].
[Figure: Decision Support System, Artificial Intelligence, Intelligent Database System and Expert System, Other System]
2.3 Components of Knowledge-Based System
KBSs are tools for developing applications that draw logical inferences from their
stored knowledge of a specific problem domain [13]. Both database systems and ES
are particular kinds of KBS and have more or less the same general architecture. The
basic structure of a KBS comprises a knowledge base and an inference
engine. The knowledge base contains facts, rules, heuristics, and other related
information that are used by the inference mechanism to produce expert opinion and
other valuable resources for the users.
2.3.1.1 Traditional Data Models
Traditional data models include the hierarchical data model [16], the network data
model [16] and the relational data model [16]. They are all record-based logical data models.
Some data are more naturally modelled with more than one parent per child, so the
network model permits the modelling of many-to-many relationships in data. In
1971, the Conference on Data Systems Languages (CODASYL) formally defined the
network model. The basic data modelling construct in the network model is the set
construct. A set consists of an owner record type, a set name, and a member record
type. A member record type can have that role in more than one set; hence the
multiple-parent concept is supported. An owner record type can also be a member
or owner in another set. The data model is a simple network, and link and intersection
record types may exist, as well as sets between them. Thus, the complete network of
relationships is represented by several pairwise sets; in each set one record
type is the owner (at the tail of the network arrow) and one or more record types are
members (at the head of the relationship arrow). Usually, a set defines a 1:N
relationship, although 1:1 is permitted. The CODASYL network model is based on
mathematical set theory.
The hierarchical data model organizes data in a tree structure. There is a hierarchy of
parent and child data segments. This structure implies that a record can have repeating
information, generally in the child data segments. Data are stored as a series of
records, each of which has a set of field values attached to it. The model collects all
the instances of a specific record together as a record type. These record types are the
equivalent of tables in the relational model, with the individual records being the
equivalent of rows. To create links between these record types, the hierarchical model
uses parent-child relationships, which are 1:N mappings between record types
organized as trees.
Relational model: The relational model is based on the mathematical theory of sets
and relations, in which data and relationships are both represented as tables [16].
Each row of a table describes a record (tuple) and each column describes one of the
attributes (data fields) of the record. The majority of commercial database systems
are now based on the relational data model, which provides data independence.
The main drawback of the record-based data models is that they have no ability to
specify constraints on the data. To represent real-world data more easily,
object-based data models are used in the design, such as the entity-relationship data
model and semantic data models such as the functional data model and the
object-oriented data model.
Functional model: The functional data model is based on entities and functions, and
aims to give a more natural representation of data [18]. Such database systems have
a data definition and manipulation language that provides a database system
interface which allows users to model directly how they think about the solution of
a specific problem [18]. Thus, the functional data model has acted as a suitable formal
and practical basis for object-oriented databases. Unlike the relational model, it does
not need extra tables to be created: multi-valued functions allow many-to-many or
one-to-many relations [18].
building of complex objects that are a more natural and realistic representation of
real-world objects. There are several advantages of object-oriented databases [16]:
Enhanced modeling capabilities: An object encapsulates both state and behavior,
and so naturally represents real-world objects. This provides higher-performance
management of objects and enables better management of the complex
interrelationships between objects.
Removal of impedance mismatch: It eliminates many problems that occur in
mapping a declarative language to an imperative language.
Support for long-duration transactions.
The production rules define a set of permissible transformations that move a problem
from its initial statement to its solution.
The inference engine in a KBS or KBIS matches the stored facts against the condition
(IF) part of the rules in the knowledge base, and then applies the action (THEN) part
of the matching rules to move towards the goal.
The inference mechanism of a KBS controls the reasoning process of the system. It
compiles the stored knowledge in the knowledge base of a KBS to infer decisions for
the solution of a specific problem. All the decisions inferred by the inference
mechanism, together with the input for a specific problem, form another part of a KBS
usually known as the context or working memory.
The two main inference mechanisms of a KBS are forward chaining (the data-driven
mechanism) and backward chaining (the goal-driven mechanism). Forward chaining
starts from an initial state of known facts and progresses through the problem,
utilizing the knowledge stored in the KB to reach a goal. Backward chaining assumes a
goal state or hypothesis and reasons backwards, utilizing the facts to support or drop
the assumed hypothesis.
A hybrid technique that mixes forward and backward chaining is also found. Here the
inference engine performs forward chaining first and backward chaining second. The
backward pass is used to confirm a diagnosis or hypothesis that was reached through
forward chaining.
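The two chaining strategies described above can be sketched in a few lines of Python. This is only an illustrative sketch: the rule base, the fact names and the function names are hypothetical and are not taken from any of the systems reviewed in this chapter.

```python
# Hypothetical rule base: each rule is (set of IF conditions, THEN conclusion).
RULES = [
    ({"fever", "seizure"}, "febrile_convulsion_suspected"),
    ({"febrile_convulsion_suspected", "age_under_5"}, "refer_to_paediatrician"),
]


def forward_chain(facts, rules):
    """Data-driven: keep firing rules until no new fact can be added."""
    known = set(facts)  # working memory (the "context")
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, _seen=None):
    """Goal-driven: try to prove the goal by recursing on rule conditions."""
    seen = _seen or set()
    if goal in facts:
        return True
    if goal in seen:  # guard against cyclic rules
        return False
    seen = seen | {goal}
    return any(
        all(backward_chain(c, facts, rules, seen) for c in conditions)
        for conditions, conclusion in rules
        if conclusion == goal
    )


facts = {"fever", "seizure", "age_under_5"}
derived = forward_chain(facts, RULES)
proved = backward_chain("refer_to_paediatrician", facts, RULES)
```

A hybrid engine of the kind mentioned above could call `forward_chain` first to propose candidate conclusions and then call `backward_chain` on one of them to confirm it.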
In the area of medicine, a medical KBS for neonates was presented [21]. The system,
VIE-PNN (Vienna Expert System for Parenteral Nutrition of Neonates), uses both
frame representation and a backward chaining inference mechanism to produce quick
calculated responses for the nutrition of neonates. It has the advantage of reducing
calculation errors, since the physician has to input dosages manually only in extreme
conditions. However, a physician who wants to stop the calculations and check the
patient cannot do so, because the program is not interactive; this can be a big
disadvantage of the system.
A Web-based Weather Expert System (WES) [22] was proposed to support the
important “go/no go” shuttle decisions in the NASA intelligent launch and range
operations program. WES considers many factors that affect the launch decision.
WES accepts the goal and works through backward chaining inference, firing only the
rules needed to prove the goal. However, the authors concluded that it is better to
combine forward and backward chaining for more accurate and efficient
performance. Another paper [23] presented an approach for describing the
complexity of proving hypotheses with a backward chaining inference mechanism.
The author suggested that the efficiency of the inference mechanism would increase
if matching techniques were used instead of unification of premises and conclusions,
and would increase even more if the system could access external data during
inference as well as accept user input.
Another study, on a rule-based fuzzy controller for a nuclear power plant [25], shows
the importance of the forward chaining inference mechanism in critical areas such as
nuclear power plants, where fast controller responses are important.
Another paper [26] presents a fuzzy expert system built using FuzzyCLIPS and
certainty factors. FuzzyCLIPS is a fuzzy extension of the well-known CLIPS. The
proposed system can identify causes of typical vibration problems in rotating
machinery. FuzzyCLIPS uses forward chaining because the facts are available
initially.
description of the current state of the system and the power of ordered search rules.
They also focus on a single weakness, namely that it does not have a clear vision of
the goal. Finally, they present two algorithms that give more information and
guidance towards the goal and also remove irrelevant information.
Another paper, on traffic monitoring, which also needs quick responses, likewise
supports the use of forward chaining because of its reliability [28].
ABIS, a language for intelligent systems [29] was developed considering the
relational data models and forward chaining. ABIS has passed all the stages of testing
and gives the right direction for future development.
DESPLATE is an expert system designed for the diagnosis of abnormal shapes in a
plate mill [31]. The expert system successfully incorporates both forward and
backward chaining: forward chaining is used to narrow the search for faults and
backward chaining is used to prove a certain fault. In cases where backward chaining
is not applicable, forward chaining is continued. This shows that DESPLATE does not
depend on a single inference mechanism.
An expert system was proposed to analyze the data gathered from modern digital
protective relays and report the data which relate to fault disturbances and the
expected protection operation [33]. The inference mechanism uses forward chaining
first for predicting the expected protection operation and backward chaining second to
validate and diagnose the actual protection operation. The authors focus on the
strength provided by combining both techniques.
2.3.3 Uncertainty in KBS and Reasoning Process
KBSs are required to deal with uncertainty in the knowledge base and the inference
mechanism. The reasoning mechanism is the part of a KBS that deals with the
reasoning process and uncertainty. Reasoning answers the question of why a
particular action was selected or completed based on the facts provided by the user,
i.e. how the system reached its decision. Uncertainty is the weight or accuracy of the
deduced result or action.
A KBS such as a medical diagnostic system deals with probabilities and, as in real
life, is never completely certain of each diagnosis; the diagnosis is the most probable
and accurate deduction on which to base a decision.
Uncertainty is itself a very uncertain area and is prone to the subjectivity of the
individual or group of individuals who assign it [34]. To assign a specific certainty to
an event or set of facts, a person must be fully confident about all the facts that can
affect that event. There must also be no bias towards the characteristics of the
problem, which is itself a very important property to measure. Knowledge acquisition
from experts or reference material is a comparatively simple task compared with
assigning a certainty value to a particular piece of knowledge.
KBSs are classified according to how they deal with uncertainty. Various methods for
dealing with uncertain or incomplete information in the knowledge base have been
discussed [35]. Manipulating uncertain and imprecise knowledge requires
appropriate models of inference [36, 37]. The following sections describe the most
popular and widely used reasoning techniques.
A certainty factor (CF) is a numerical measure between -1 and +1, defined in terms
of a measure of belief and a measure of disbelief [38, 39]. A negative CF indicates that
the hypothesis is disconfirmed by the evidence, and a positive CF indicates that it is
confirmed by the evidence. A CF of zero indicates that the evidence does not influence
the belief. The higher the value of the CF, the higher the certainty of the rule.
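As a small illustration of how certainty factors behave, the sketch below computes a CF from a measure of belief (MB) and a measure of disbelief (MD) using the original MYCIN definition CF = MB - MD, and merges two CFs for the same hypothesis with the combination rule commonly cited for MYCIN-style systems. The combination rule and the function names are assumptions brought in for illustration; they are not stated in the text above.

```python
def certainty_factor(mb, md):
    """CF from measure of belief and measure of disbelief (each in [0, 1])."""
    return mb - md


def combine_cf(cf1, cf2):
    """Combine two CFs for the same hypothesis (commonly cited MYCIN-style rule)."""
    if cf1 >= 0 and cf2 >= 0:
        # Two confirming pieces of evidence reinforce each other.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Two disconfirming pieces of evidence reinforce each other.
        return cf1 + cf2 * (1 + cf1)
    # Mixed evidence: partial cancellation.
    denominator = 1 - min(abs(cf1), abs(cf2))
    if denominator == 0:
        raise ValueError("cannot combine total confirmation with total disconfirmation")
    return (cf1 + cf2) / denominator


supporting = combine_cf(0.6, 0.5)    # both confirm the hypothesis
opposing = combine_cf(-0.6, -0.5)    # both disconfirm it
conflicting = combine_cf(0.6, -0.4)  # evidence pulls in opposite directions
```

In this sketch two confirming rules always yield a combined CF at least as large as either input, while the result never leaves the interval [-1, +1].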
Nowadays some researchers prefer to focus on newer methods for dealing with this
issue rather than on the CF method. Other researchers still use the CF, classifying it
as a simple and early method. In his paper [38], Lucas presents a certainty-factor
interpretation of Bayesian belief networks. He notes that the purpose of the paper is
not to renew research into such an old way of resolving uncertainty, but to learn from
early models.
The CF model was first introduced in the rule-based expert system MYCIN by
Shortliffe and Buchanan [3, 40] for the representation and manipulation of uncertain
knowledge. It was found that probabilities are of limited use as the knowledge of a
medical system grows [41]. In medical diagnosis the line and order of reasoning, as
well as the explanation, are essential; probabilities were therefore considered
unsuitable as the uncertainty management technique for MYCIN.
In 1976, however, Adams argued that the assumptions made with certainty factors,
and therefore the hypotheses, may not always be valid [42]. In 1986 Heckerman, who
worked to reform the CF formula, supported this view [43]. In 1992 a research paper
[44] claimed that both Adams and Heckerman were incorrect, because their
assumptions resulted from an incorrect interpretation of the CF that had nothing to
do with the elicitation and use of the numerical factor. It also concluded that the CF
should not be interpreted as a measure of belief updating.
The CF is still a useful method in different domains; for example, it has been used to
identify database distortions [45] and in fault diagnosis for the zinc leaching process
[46].
Classical set theory is governed by a two-valued logic, while fuzzy logic can be
regarded as a many-valued logic. Lotfi Zadeh first introduced fuzzy logic in 1965
[47]. Two-valued logics such as propositional logic and predicate logic cannot
capture the exact meaning of human belief and uncertain knowledge; in such
situations fuzzy logic works well. Sometimes knowledge is expressed with its exact
meaning suppressed, which is called a disposition [48, 50]. Using fuzzy quantifiers, it
is possible to restore the exact meaning of that knowledge in fuzzy logic, which is
called disposition restoration [48, 50].
The word “fuzzy” is used when dealing with ambiguous terms such as near and far, or
high and low.
Some important terms must be understood before fuzzy logic is explained:
“linguistic variables”, “terms”, and “universe of discourse”. For example, a linguistic
variable is “Blood Cholesterol”. Its terms are “high”, “low”, “very high”, and “very
low”, and each term is defined over a range of numerical values called the universe of
discourse. Each term is associated with a fuzzy membership function that assigns a
value between 0 and 1 to each value of the variable.
Fuzzy logic gives the freedom to think beyond true or false. Elastic fuzzy logic, which
is based on certain antecedents, has also been introduced [51]. It also handles missing
antecedents through multiple passes over the rules.
In medical diagnosis, the doctor deals with words supplied by the user such as
somewhat, little, very, normal, mild, high, low and so on; these are considered fuzzy.
Fuzzy logic is very useful for handling uncertainty [49] and has been used in many
popular ESs and studies. Sphinx [52, 53] is a medical fuzzy expert system in the field
of internal medicine which proved very efficient.
A fair comparison has been made between a fuzzy logic model and a crisp model in a
medical domain, for the analysis of fetal ECG and heart rate [54]. The initial results
showed that the fuzzy logic approach gives enhanced performance.
An H2 leak detection system was developed as part of DIGEST (Diagnostic Expert
System for Turbomachinery) [55], an analytics and diagnostics system for steam
turbine generators in power plants. The system was also tested with a crisp model,
which failed to provide the speed and accuracy of the fuzzy model.
A fuzzy expert system has been developed to monitor and control iron ore processing
in the mill and spiral circuits at Wabush Mines [56]. The developed system increased
the productivity of the mill line and gave a high return on investment.
In 1994 Zadeh stated that fuzzy logic has succeeded where traditional approaches
have failed [57]. The main reason is that fuzzy logic comes closest to modeling
real-world problems and takes the human mind as its role model.
2.3.3.3 Belief Networks
Belief networks are also known as Bayesian networks. The belief network is a causal
reasoning tool that has been used in a wide variety of applications and is now a
mainstay of reasoning under uncertainty [58, 59, 60]. Belief networks are based on
the laws of probability, in particular conditional probability and Bayesian probability
theory.
A belief network is a directed acyclic graph that encodes the causal relationships
among variables, which are represented in the graph as nodes. Nodes are connected
by causal links, represented in the graph as arrows, which point from parent nodes
(causes) to child nodes (effects). Each node is assigned a probability distribution. A
node with one or more parents requires a conditional probability distribution, which
is provided by a conditional probability table (CPT). The CPT depends on the
connected parent nodes; each probability is therefore calculated from the
probabilities of those nodes.
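To make the role of the CPT concrete, the following sketch evaluates the smallest possible belief network: one parent node (a disease) and one child node (a symptom). All probabilities and names are invented for illustration only; the posterior is obtained by enumerating the joint distribution and applying Bayes' rule.

```python
# Prior distribution of the parent node (hypothetical numbers).
P_DISEASE = {True: 0.01, False: 0.99}

# CPT of the child node: P(symptom | disease), indexed as [disease][symptom].
P_SYMPTOM_GIVEN_DISEASE = {
    True: {True: 0.90, False: 0.10},
    False: {True: 0.05, False: 0.95},
}


def posterior_disease(symptom_observed):
    """Return P(disease = True | symptom = symptom_observed)."""
    # Joint probability of each disease state with the observed symptom value.
    joint = {
        d: P_DISEASE[d] * P_SYMPTOM_GIVEN_DISEASE[d][symptom_observed]
        for d in (True, False)
    }
    evidence = sum(joint.values())  # P(symptom = symptom_observed)
    return joint[True] / evidence


with_symptom = posterior_disease(True)
without_symptom = posterior_disease(False)
```

Observing the symptom raises the probability of the disease above its prior, and its absence lowers it. In a larger network the same enumeration has to run over every unobserved node, which is one way to see why large knowledge bases become hard to handle.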
Moreover, Cooman and Zaffalon addressed the issue of updating beliefs with
incomplete observations in 2004 [61]. They introduce an algorithm that takes a more
logical approach to updating than doing so naively, and they derive a more coherent
rule for updating in realistic expert systems. Their algorithm can be implemented
easily without changing the knowledge base of the expert system.
Belief networks have typically been applied to problems when there is uncertainty in
the data or in the knowledge about the domain, and when reasoning with uncertainty
is important. Belief networks have been applied particularly to problems which
require diagnosis of problems from a variety of input data.
Applying such a network in a medical field where the elements are not dependent will
only strain the system and would not help much with the uncertainty. For this reason,
most researchers have not used this technique to develop expert systems; only a few
applications are found, as part of uncomplicated problems, because the larger the
knowledge base, the harder the network is to implement [62].
better understanding of CBR [63, 64, 65, 66]. Researchers have proposed mainly two
types of model, namely R4 and R5 [67].
R5 has one more step than R4, which deals with knowledge acquisition.
The main problem of CBR is handling the growing case library. One suggested
solution is to store only the most important cases, but in some domains, such as the
medical domain, it is very hard to decide which cases are important.
Uncertainty in Case-Based Reasoning (CBR) can arise for three main reasons. The
first is missing information. The second arises for different problems, and the third
arises because perfect prediction is impossible.
The uncertainty in CBR is calculated from the occurrence of cases, and the numbers
are updated on information retrieval and correct diagnosis.
A framework has been proposed for case features, which are considered the
bottleneck of CBR [69], in the hope that it will increase the usability, flexibility, and
expandability of CBR.
CBR is used in KBSs in very few cases in comparison with other techniques.
However, it seems that researchers are slowly moving towards considering it as a
technique for handling uncertainty.
The Dempster-Shafer theory (DST), also known as the theory of belief functions, is a
generalization of the Bayesian theory of subjective probability. It was first introduced
by A. P. Dempster in 1968 [70], and extended by Glenn Shafer in 1976 [71]. This
theory was motivated by two difficulties faced in Bayesian probability theory: the
representation of ignorance, and the requirement that the subjective beliefs assigned
to an event and its negation must sum to one.
Dempster's rule begins with the assumption that the questions for which we have
probabilities are independent with respect to subjective probability judgments, but
this independence is only a priori; it disappears when conflict is detected between the
different items of evidence [72]. DST is known for the computational complexity of
combining beliefs.
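Dempster's rule of combination can be illustrated with a short sketch. Mass functions are represented here as dictionaries from subsets of the frame of discernment (as `frozenset` objects) to masses; the frame, the masses and the function name are hypothetical values chosen only to show the mechanics, including how mass assigned to the whole frame represents ignorance.

```python
from itertools import product


def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule of combination."""
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass that falls on the empty set measures the conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("the two bodies of evidence are in total conflict")
    # Renormalise so that the combined masses sum to one.
    return {s: mass / (1.0 - conflict) for s, mass in combined.items()}


# Hypothetical frame of discernment with two candidate diagnoses.
FRAME = frozenset({"epilepsy", "febrile_convulsion"})
m1 = {frozenset({"epilepsy"}): 0.6, FRAME: 0.4}            # 0.4 left as ignorance
m2 = {frozenset({"febrile_convulsion"}): 0.5, FRAME: 0.5}  # 0.5 left as ignorance
m12 = combine_dempster(m1, m2)
```

The double loop over all pairs of focal sets also hints at the computational cost noted above: the number of focal sets can grow exponentially with the size of the frame.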
DST has also been applied to develop an expert system with uncertain rules based on
spatial remote sensing data [76]. DST has also been used in valuation-based systems,
which serve as a framework for managing uncertainty in expert systems [77].
to model uncertainty, the way of human thinking, reasoning, and the perception
process. Fuzzy systems were first introduced by Zadeh in 1965 [47].
Nowadays, one of the most important areas of application of fuzzy logic is Fuzzy
Knowledge-Based Intelligent Systems (FKBIS). These kinds of systems extend the
rule-based systems. Usually these kinds of systems deal with "IF-THEN" rules whose
antecedents and consequents are composed of fuzzy logic statements.
In general, an FKBIS is a rule-based intelligent system where fuzzy logic (FL) is used
as a tool for representing different forms of knowledge about a specific problem. FL
is also used in FKBIS for modelling the interactions and relationships that exist
between its variables. For this property, they have been successfully applied to a wide
range of problem domains for which uncertainty and vagueness emerge in different
ways [78].
Additionally, FKBIS were most commonly applied for fuzzy modelling [79], fuzzy
control [80] and fuzzy classification [81].
Inference methods are more robust and flexible with the help of FL
approximate reasoning methods.
The general architecture of an FKBIS, given below, shows the flow of data through the
system. Generally an FKBIS has two main parts: the knowledge base and the
inference engine. The inference engine of an FKBIS involves the following three
components:
Fuzzification module: the fuzzification module transforms the crisp input data
into fuzzy values that serve as the input of the fuzzy reasoning process.
Inference system: the inference system infers from the fuzzy input several
resulting output fuzzy values, depending on the information stored in the KB.
Defuzzification module: the defuzzification module converts the output fuzzy
values into a crisp output value.
[Figure: general architecture of an FKBIS, showing Inputs, Fuzzification, Inference System, Defuzzification, and Outputs, together with the Knowledge Base]
Generally the KB contains the linguistic rules representing the expert knowledge
about the problem domain. Sometimes, however, the KB contains two different levels
of information, namely the fuzzy rule semantics and the linguistic rules representing
the expert knowledge. This conceptual distinction is reflected in a KB composed of
two separate entities.
Database (DB): The database contains the linguistic term sets considered in the
linguistic rules and the membership functions that define the semantics of the
linguistic labels. Each linguistic variable in the problem has an associated fuzzy
partition of its domain, which specifies the fuzzy set related to each of its
linguistic terms.
The DB also includes the scaling factors or scaling functions that are used for
transformation between the universe of discourse in which the fuzzy sets are
defined and the domain of the system input and output variables.
It is worth noting that the RB can be represented with several structures. The
typical one is a list of rules, although a decision table (sometimes called a rule
matrix) is an equivalent and more compact representation of the same set of
linguistic rules.
2.4.3 Fuzzification
Crisp input data are fuzzified using fuzzy set theory [47, 83]. Fuzzification is the
process of formulating the mapping from a given input to an output using fuzzy set
theory. This mapping provides a basis from which patterns can be discerned.
Generally, the fuzzification process involves three basic concepts: association of
linguistic variables, fuzzy set operations, and membership functions.
Let x be a linguistic variable labeled ‘Age’. Its term set T could be defined as
T(age) = {young, very young, not very young, old, more or less old}
Each term is defined on the universe, for example the integers from 0 to 100 years.
Assume a discrete universe U = {0, 20, 40, 60, 80} of ages. We can assign u = [0 20
40 60 80] and
‘young’ = [1 .6 .1 0 0]
The discrete membership function for the set ‘very young’ is (young)^2, that is,
‘very young’ = [1 .36 .01 0 0]
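The term set for ‘Age’ can be reproduced numerically. The sketch below uses the discrete universe and the membership vector of ‘young’ given above, squares it to obtain ‘very young’, and takes the additive complement to obtain ‘not very young’. The function names are illustrative assumptions.

```python
UNIVERSE = [0, 20, 40, 60, 80]     # discrete universe of ages
YOUNG = [1.0, 0.6, 0.1, 0.0, 0.0]  # membership of each age in 'young'


def very(membership):
    """Linguistic hedge 'very': square every membership degree."""
    return [degree ** 2 for degree in membership]


def negate(membership):
    """Linguistic modifier 'not': additive complement of every degree."""
    return [1.0 - degree for degree in membership]


very_young = very(YOUNG)
not_very_young = negate(very_young)

# Pair every age with its degree of membership in 'very young'.
very_young_by_age = dict(zip(UNIVERSE, very_young))
```

Squaring leaves degrees of 0 and 1 unchanged and reduces every intermediate degree, so ‘very young’ is a more restrictive set than ‘young’.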
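The standard fuzzy set operations are union, intersection, and additive complement.
Using the standard max-min system proposed by Zadeh [47], the fuzzy-set
complement, intersection, and union are defined by
\[ \mu_{\bar{A}}(x) = 1 - \mu_A(x), \]
\[ \mu_{A \cap B}(x) = \min\big(\mu_A(x), \mu_B(x)\big), \qquad (2.1) \]
\[ \mu_{A \cup B}(x) = \max\big(\mu_A(x), \mu_B(x)\big), \qquad (2.3) \]
where \mu_A(x) and \mu_B(x) denote the membership degrees of x in the fuzzy sets A and B.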
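The max-min operations can be checked on discrete membership vectors. In the sketch below the vector for ‘young’ is the one used earlier for the ‘Age’ example, while the vector for ‘old’ is a hypothetical mirror image introduced only for illustration.

```python
YOUNG = [1.0, 0.6, 0.1, 0.0, 0.0]  # from the 'Age' example
OLD = [0.0, 0.0, 0.1, 0.6, 1.0]    # hypothetical values, for illustration only


def fuzzy_complement(a):
    """Additive complement: 1 - mu_A(x) for every element of the universe."""
    return [1.0 - x for x in a]


def fuzzy_intersection(a, b):
    """Intersection under the max-min system: element-wise minimum."""
    return [min(x, y) for x, y in zip(a, b)]


def fuzzy_union(a, b):
    """Union under the max-min system: element-wise maximum."""
    return [max(x, y) for x, y in zip(a, b)]


young_and_old = fuzzy_intersection(YOUNG, OLD)
young_or_old = fuzzy_union(YOUNG, OLD)
not_young = fuzzy_complement(YOUNG)
```

Unlike crisp sets, the intersection of a fuzzy set with its own complement need not be empty, which can be verified with `fuzzy_intersection(YOUNG, not_young)`.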
2.4.3.3 Membership functions
Each linguistic pattern has a membership degree in the range [0, 1]. The type of
membership function used depends on the base set of the patterns. If the base set
contains many values, or if it is continuous, then a parametric representation,
which can be adapted by changing the parameters, is appropriate. The most common
membership functions of this type are the triangular and trapezoidal functions, which
are defined by three and four parameters respectively.
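\[ f(x; a, b, c) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ \dfrac{c - x}{c - b}, & b \le x \le c \\ 0, & x \ge c \end{cases} \qquad (2.2) \]
\[ f(x; a, b, c, d) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a \le x \le b \\ 1, & b \le x \le c \\ \dfrac{d - x}{d - c}, & c \le x \le d \\ 0, & x \ge d \end{cases} \qquad (2.3) \]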
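The triangular and trapezoidal membership functions translate directly into code. The sketch below assumes strictly increasing parameters (a < b < c, and a < b <= c < d for the trapezoid); the function names and the sample parameter values are illustrative.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)  # rising edge
    return (c - x) / (c - b)      # falling edge


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership function with feet at a and d and plateau [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    if x <= c:
        return 1.0                # plateau
    return (d - x) / (d - c)      # falling edge


# Hypothetical term on a 0-10 scale.
peak = triangular(5.0, 0.0, 5.0, 10.0)
halfway = triangular(2.5, 0.0, 5.0, 10.0)
plateau = trapezoidal(3.0, 0.0, 2.0, 4.0, 6.0)
```

Both functions are piecewise linear, which makes them cheap to evaluate but not differentiable at the break points.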
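For some applications, modeling requires continuously differentiable curves and
therefore smooth transitions, which is not possible with triangular or trapezoidal
functions. Three such smooth functions are mentioned here. The first is built from two
Gaussian curves:
\[ f(x; \sigma_1, c_1, \sigma_2, c_2) = \begin{cases} \exp\left( -\dfrac{(x - c_1)^2}{2 \sigma_1^2} \right), & x \le c_1 \\ 1, & c_1 \le x \le c_2 \\ \exp\left( -\dfrac{(x - c_2)^2}{2 \sigma_2^2} \right), & x \ge c_2 \end{cases} \qquad (2.4) \]
where c_1 and \sigma_1, and c_2 and \sigma_2, are the mathematical expectation and the
standard deviation of the first and the second Gaussian distributions.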
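The difference of two sigmoidal functions [85]:
\[ f(x; a_1, c_1, a_2, c_2) = \frac{1}{1 + e^{-a_1 (x - c_1)}} - \frac{1}{1 + e^{-a_2 (x - c_2)}} \qquad (2.5) \]
The generalized bell function:
\[ f(x; a, b, c) = \frac{1}{1 + \left| \dfrac{x - c}{a} \right|^{2b}} \qquad (2.6) \]
[Equation (2.7): piecewise definition whose final branch is "0 otherwise"; the remaining terms are not recoverable from the source]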
The inference system infers from the fuzzy input several resulting output fuzzy
values, depending on the information stored in the knowledge base. The two most
popular inference methods are the Mamdani inference method [86] and the
Takagi-Sugeno-Kang (TSK) method [87, 88]. The main difference between the two
methods is that the result of Mamdani inference is one or more fuzzy sets, which must
(almost always) then be defuzzified into one or more real numbers, whereas the
result of TSK inference is one or more real functions, which may be evaluated
directly.
In Mamdani inference, rules are of the following form:
\[ R_i : \text{IF } x_1 \text{ is } A_{i1} \text{ AND } \dots \text{ AND } x_n \text{ is } A_{in} \text{ THEN } y \text{ is } C_i, \qquad i = 1, 2, \dots, m \]
where m is the number of rules, x_j (j = 1, 2, ..., n) are the n input variables, y is the
output variable, and A_{ij} and C_i are the linguistic terms corresponding to the fuzzy
sets that are characterized by the membership functions \mu_{A_{ij}}(x_j) and
\mu_{C_i}(y) respectively. The most important point here is that the consequent part of
each rule is characterized by a fuzzy set C_i. Thus the final output of a Mamdani
inference method is an arbitrarily complex fuzzy set, which usually needs to be
defuzzified.
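In Takagi-Sugeno-Kang (TSK) inference [87, 88], rules are of the following form:
\[ R_i : \text{IF } x_1 \text{ is } A_{i1} \text{ AND } \dots \text{ AND } x_n \text{ is } A_{in} \text{ THEN } y_i = c_{i0} + c_{i1} x_1 + \dots + c_{in} x_n, \qquad i = 1, 2, \dots, m \]
where m is the number of rules, x_j (j = 1, 2, ..., n) are the n input variables, y_i is the
output function of the input variables, and c_{i0} and c_{ij} (j = 1, 2, ..., n) are
real-valued linear parameters. The most important point here is that the consequent
part of each rule is now described by a linear function of the original input variables.
The final output of the inference process is calculated by:
\[ y = \frac{\sum_{i=1}^{m} \alpha_i \, y_i}{\sum_{i=1}^{m} \alpha_i} \qquad (2.8) \]
where \alpha_i = T(\mu_{A_{i1}}(x_1), \dots, \mu_{A_{in}}(x_n)) is the matching degree of rule i,
obtained with a conjunctive operator T.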
Therefore, to design the inference system of a TSK FKBIS, the designer only has to
select this conjunctive operator T, the most common choices being the minimum and
the algebraic product.
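A first-order TSK inference step can be sketched as follows. Each rule is stored as a pair of antecedent membership functions and linear coefficients, and the output is the average of the rule outputs weighted by their matching degrees, as in equation (2.8). The two rules, the membership functions `low` and `high`, and all coefficient values are hypothetical and serve only to show the computation.

```python
import math


def tsk_infer(inputs, rules, t_norm=min):
    """Weighted-average output of a first-order TSK rule base.

    rules: list of (membership_functions, coefficients), where coefficients
    are (c0, c1, ..., cn) of the linear consequent c0 + c1*x1 + ... + cn*xn.
    """
    numerator = 0.0
    denominator = 0.0
    for membership_functions, coefficients in rules:
        degrees = [mf(x) for mf, x in zip(membership_functions, inputs)]
        alpha = t_norm(degrees)  # matching degree of the rule
        y_i = coefficients[0] + sum(c * x for c, x in zip(coefficients[1:], inputs))
        numerator += alpha * y_i
        denominator += alpha
    if denominator == 0.0:
        raise ValueError("no rule fires for this input")
    return numerator / denominator


def low(x):
    """Hypothetical 'low' term on a 0-10 scale."""
    return max(0.0, min(1.0, 1.0 - x / 10.0))


def high(x):
    """Hypothetical 'high' term on a 0-10 scale."""
    return max(0.0, min(1.0, x / 10.0))


RULES = [
    ((low,), (1.0, 0.0)),   # IF x is low  THEN y = 1.0
    ((high,), (2.0, 0.5)),  # IF x is high THEN y = 2.0 + 0.5 * x
]

output_min = tsk_infer([5.0], RULES)                      # minimum as T
output_prod = tsk_infer([5.0], RULES, t_norm=math.prod)   # algebraic product as T
```

With a single input per rule the two choices of T coincide; they differ only when a rule has several antecedents.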
2.4.5 Defuzzification
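Given a distribution of truth μA(y) for the output variable y, the role of
defuzzification is to select one crisp value, ŷ, for it. There are several possibilities,
the most common of which are [85]:
Centroid (centre of gravity):
\[ \hat{y} = \frac{\int y \, \mu_A(y) \, dy}{\int \mu_A(y) \, dy} \qquad (2.9) \]
Bisector of area:
\[ \hat{y} = \left\{ y' : \int_{y_{\min}}^{y'} \mu_A(y) \, dy = \int_{y'}^{y_{\max}} \mu_A(y) \, dy \right\} \qquad (2.13) \]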
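The sketch below shows, under simplifying assumptions, how a Mamdani output fuzzy set can be built and then defuzzified. The output universe is sampled on a uniform grid, each rule's consequent is clipped at its firing strength and the clipped sets are aggregated with max, and the centroid and bisector are approximated by sums over the grid. The grid, the consequent set and the function names are hypothetical.

```python
def mamdani_aggregate(ys, fired_rules):
    """Clip each consequent at its firing strength and aggregate with max.

    fired_rules: list of (firing_strength, consequent_membership_function).
    """
    return [max(min(alpha, mf(y)) for alpha, mf in fired_rules) for y in ys]


def centroid(ys, mu):
    """Centre of gravity of a sampled membership function (uniform grid)."""
    total = sum(mu)
    if total == 0.0:
        raise ValueError("empty output fuzzy set")
    return sum(y * m for y, m in zip(ys, mu)) / total


def bisector(ys, mu):
    """First grid point at which the accumulated area reaches half the total."""
    total = sum(mu)
    if total == 0.0:
        raise ValueError("empty output fuzzy set")
    running = 0.0
    for y, m in zip(ys, mu):
        running += m
        if running >= total / 2.0:
            return y
    return ys[-1]


def medium(y):
    """Hypothetical symmetric triangular consequent centred at 5."""
    return max(0.0, 1.0 - abs(y - 5.0) / 2.0)


ys = [i / 10 for i in range(101)]  # output universe 0.0 .. 10.0
mu = mamdani_aggregate(ys, [(1.0, medium)])
y_centroid = centroid(ys, mu)
y_bisector = bisector(ys, mu)
```

For a symmetric output set the two methods agree up to the grid resolution; they differ when the aggregated set is skewed, for example after several clipped consequents are combined.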
The domain knowledge of the specific problem plays an important role in deriving the
fuzzy rule set. Careful attention to rule set generation may significantly improve the
performance of the FKBIS. In the usual applications of FKBIS, such as modelling,
control, and classification, two types of data, numerical and linguistic, are available
to the FKBIS designer. The former is generally obtained by observing the system,
while the latter is obtained from a domain expert. Thus there are two main ways to
derive the fuzzy rule set for an FKBIS [89].
Derivation of rules from an expert: In this method, the KB of an FKBIS is built from
the domain knowledge of a human expert. The human expert specifies the linguistic
terms associated with each linguistic variable, the structure of the rules in the KB,
and the meaning of each term. This is the simplest method when the human expert is
able to express his or her knowledge in the form of linguistic rules.
Derivation of rules using automated learning methods: It is often very difficult to
derive the rule set from human experts, for various reasons. Researchers have
therefore developed numerous automated learning methods over the last few years
for the different types of FKBIS. Some of those are
2.5 Conclusion
These functionalities impose several requirements upon the system. For example, the
explanations must adequately represent the reasoning processes, and they should
allow the user to check the steps of the reasoning process or the underlying knowledge
at different levels of detail. In addition, the approach of a KBIS to a problem need not be
identical to the approach of any human expert, but the overall procedures and
reasoning steps must be understandable and logical. It follows that the system must
be able to adapt its explanations to the different requirements and characteristics of
users.
The motivation for designing an intelligent system remains the goal of building a
more efficient, accurate, and reliable system that can play the role of a domain expert
when human experts are unavailable, as in many real-life domains, especially the
medical domain in the rural areas of our country.
Bibliography
[7] Bonfa, I, Maioli, C, Sarti, F, Milandri, G L & Monte, D P R 1993, HERMES:
an Expert System for the Prognosis of Hepatic Disease, First New Zealand
International Two-Stream Conference on Artificial Neural Networks and
Expert Systems, Dunedin, New Zealand, pp. 240-246.
[8] Hudson, D L & Cohen, M E 1994, Fuzzy Logic in Medical Expert Systems,
IEEE Engineering in Medicine and Biology, vol. 13, no. 5, pp. 693-698.
[10] Keleş, A & Keleş, A 2006, ESTDD: Expert System for Thyroid Disease
Diagnosis, Expert Systems with Applications, vol. 34, no. 1, pp. 242-246.
[12] Kumar, K A, Singh, Y & Sanyal, S 2009, Hybrid Approach Using Case-based
Reasoning and Rule-based Reasoning for Independent Clinical Decision
Support in ICU, Expert Systems with Applications, vol. 36, no. 1, pp. 65-71.
[16] Thomas Connolly and Carolyn Begg 1998, Database Systems: A Practical
Approach to Design, Implementation, and Management, Addison Wesley,
ISBN 0-201-34287-1
[17] J L Alty and M J Coombs 1984, Expert systems: Concepts and examples,
NCC publications, ISBN 0-85012-399-2
[18] Shipman 1981, The Functional Data Model and the Data Language
DAPLEX, Computer Corporation of America, ACM Transactions on Database
Systems, 6 (1981), pp. 140-173
[19] Peter Jackson 1999, Introduction to expert systems, Addison Wesley Longman
Limited, ISBN 0-201-87686-8
[20] A. S. Belward, and C. R. Valenzuela, 1991, Remote Sensing and Geographical
Information Systems for Resource Management in Developing Countries, 1st
edition, Kluwer Academic Publishers, Netherlands
[22] T. Rajkumar, and Jorge E. Bardina, 2003, Web-based Weather Expert System
(WES) for Space Shuttle Launch, IEEE International Conference on Systems,
Man and Cybernetics, vol. 5, no. 5, pp. 5040-5045.
[25] H.L. Akin, V. Altin 1991, Rule-based Fuzzy Logic Controller for a PWR-type
Nuclear Powerplant, IEEE Transactions on Nuclear Science, vol. 38, no. 2, pp.
883-890.
[26] Chun Siu, Qiang Shen, R. Milne, 1997, A Fuzzy Expert System for Vibration
Cause Identification in Rotating Machines, Proceedings of the Sixth IEEE
International Conference on Fuzzy Systems, Barcelona, Spain, vol. 1, no. 1,
pp. 555-560
[27] F. Bacchus, and Y. Whye Teh, 1998, Making Forward Chaining Relevant,
AAAI Conference paper.
[28] R. Cucchiara, M. Piccardi & P. Mello, 2000, Image Analysis and Rule-based
Reasoning for a Traffic Monitoring System, IEEE Transactions on Intelligent
Transportation Systems, vol. 1, no. 2, pp. 119-130.
[32] Erik T. H. Fung, 2000, Abductive Approach to Prototyping Data Flow
Diagrams, Asia-Pacific Conference on Quality Software-APAQS: pp. 306-314
[33] X. Luo, M. Kezunovic, 2005, An expert system for diagnosis of digital relay
operation, Proceedings of the 13th International Conference on Intelligent
Systems Application to Power Systems, pp. 175 – 180
[34] J. Miles and C. Moore, 1994, Practical knowledge based systems for
conceptual design, pp. 35-53, Springer-Verlag, Berlin, Germany.
[35] Hojjat Adeli (ed.) 1988, Expert Systems in Construction and Structural
Engineering, Chapman and Hall, London, England.
[36] P. W. Mullarkey, 1987, Languages and Tools for Building Expert Systems, in
Expert Systems for Civil Engineers, (ed. Maher M.L.), American Society of
Civil Engineers, New York, U.S.A., pp 15-34.
[39] Chen, I & Lin, M T 1988, Resolution of incomplete conditions in rule- based
expert systems, System Theory, Proceedings of the Twentieth Southeastern
Symposium, March 1988, pp. 514-518.
[42] Adams, J B 1976, A probability model of medical reasoning and the MYCIN
model, Mathematical Bioscience, Vol.32, no. pp.177-186
[44] Dan, Q & Dudeck, J 1992, Some Problems Related with Probabilistic
Interpretations for Certainty Factors, Proceedings of the Fifth Annual IEEE
Symposium on Computer-Based Medical Systems, pp. 538- 545
[46] Wu M, She J H, Nakano M & Gui W 2000, Expert control and fault diagnosis
of the leaching process in a zinc hydrometallurgy plant, Control Engineering
Practice, Vol. 10,no. 4, pp. 433-442
[47] L. A. Zadeh, 1965, Fuzzy Sets, Information and Control, 8, pp. 338-353
[49] L.A. Zadeh, 1983, The Role of Fuzzy logic in the Management of Uncertainty
in Expert Systems, Fuzzy Sets and Systems, 11,3
[51] Tseng, H C & Teo, D W 1994, Medical Expert System with Elastic Fuzzy
Logic, paper presented in IEEE World Congress on Computational
Intelligence, Florida, USA, 26-29
[55] Muller, H, Rehbold, R and Emshoff, H 1993, A fuzzy logic expert system for
detecting generator H2 leaks, Third International Conference on Industrial
Fuzzy Control and Intelligent Systems, pp. 185-190
[56] Hall, M B & Harris, C A 1993, Fuzzy logic expert system for iron ore
processing, Conference Record of the 1993 IEEE Industry Applications
Society Annual Meeting, Vol. 3, pp. 2190 – 2199
[59] Stuart Russell and Peter Norvig, 1995, Artificial Intelligence a Modern
Approach Prentice Hall, second edition,
[60] Judea Pearl, 1988, Probabilistic Reasoning in Intelligent Systems, Morgan
Kaufmann, San Mateo, Ca.
[65] J. Hunt, 1995, Evolutionary case based design, in: I.D. Watson (Ed.), Progress
in Case-based Reasoning, LNAI 1020, Springer, Berlin, pp. 17-31.
[66] D.B. Leake, 1996, Case-Based Reasoning: Experiences, Lessons and Future
Direction, AAAI Press/MIT Press, Menlo Park, CA.
[73] J.M. Merigó, M. Casanovas, 2007. Using fuzzy OWA operators in decision
making with Dempster-Shafer belief structure, In Proceedings of the 16th
AEDEM International Conference, Krakow, Poland, pp. 475-486.
[74] L. A. Zadeh, 1986, A Simple View of the Dempster-Shafer Theory of
Evidence and its Implication for the Rule of Combination, The AI Magazine,
pp. 85-90
[76] Ahmadzadeh, M.R., 2001, Petrou, M., International Geoscience and Remote
Sensing Symposium, IGARSS, IEEE, vol.2 , pp. 861 - 863
[79] W. Pedrycz, (Ed.), 1996, Fuzzy Modelling: Paradigms and Practice, Kluwer
Academic Press.
[81] Z. Chi, H. Yan, and T. Pham, 1996, Fuzzy Algorithms: With Applications to
Image Processing and Pattern Recognition, World Scientific.
[84] H. J. Zimmermann, 1993, Fuzzy set theory - and its applications, Kluwer,
Boston, second edition.
[85] S.T. Wang, Fuzzy system and Fuzzy Neural Networks, Shanghai Science and
Technology Press, 1998, Edition 1.
[87] M. Sugeno and G.T. Kang, 1988, Structure identification of fuzzy model.
Fuzzy Sets and Systems, 28:15–33.
[88] T. Takagi and M. Sugeno, 1985, Fuzzy identification of systems and its
applications to modeling and control. IEEE Transactions on Systems, Man,
and Cybernetics, 15(1):116–132.
[89] L. X. Wang, 1994, Adaptive Fuzzy Systems and Control: Design and
Analysis. Prentice-Hall.
[91] Shann, J. J. and H. C. Fu, 1995, A fuzzy neural network for rule acquiring on
fuzzy control systems. Fuzzy Sets and Systems 71, 345-357.
[93] Takagi, H., N. Suzuki, T. Koda, and Y. Kojima, 1992, Neural networks
designed on approximate reasoning architecture and their applications. IEEE
Transactions on Neural Networks 5(5), 752-760.
[95] Yoshinari, Y., W. Pedrycz, and K. Hirota, 1993, Construction of fuzzy models
through clustering techniques, Fuzzy Sets and Systems 54, 157-165.
[96] Cordon, O. and F. Herrera, 1995, A general study on genetic fuzzy systems.
In: J. Periaux, G. Winter, M. Galan, and P. Cuesta (Eds.), Genetic Algorithms
in Engineering and Computer Science, pp. 33-57. John Wiley and Sons.
[97] Cordon O., F. Herrera and M. Lozano, 1997, A classified review on the
combination fuzzy logic-genetic algorithms bibliography: 1989-1995. In: E.
Sanchez, T. Shibata, and L. Zadeh (Eds.), Genetic Algorithms and Fuzzy
Logic Systems. Soft Computing Perspectives, pp. 209-241. World Scientific.
[99] L.-X.Wang and J.M. Mendel, 1992, Generating fuzzy rules by learning from
examples, IEEE Transactions on Systems, Man, and Cybernetics, 22(6):1414–
1427.
[100] S. Mitra and Y. Hayashi, 2000, Neuro-fuzzy Rule Generation: Survey in Soft
Computing Framework, IEEE Trans. On Neural Network, vol. 11, no. 3, pp.
748–768
[102] S. K. Pal and A. Skowron, (Eds), 1999, Rough Fuzzy Hybridization: A New
Trend in Decision Making, Singapore: Springer-Verlag.
[104] Gang Xie , Fang Wang , Keming Xie, 2004, RST-Based System Design of
Hybrid Intelligent Control, IEEE International Conference on Systems, Man
and Cybernetics.
[105] Teach, R.L. and Shortliffe, E.H. 1981, An analysis of physician attitudes
regarding computer-based clinical consultation systems. Comput. Biomed.
Res. 14, pp. 542-558.
[106] Wallis, J.W. and Shortliffe, E.H. 1982, Explanatory power for medical expert
systems: studies in the representation of causal relationships for clinical
consultations. Meth. Info. Med. 21, pp. 127-136.
Chapter 3
Interactive Intelligent System for Medical Diagnosis
The development of intelligent systems is a major area of current research. This
chapter presents the design of an Interactive Intelligent System for Medical
Diagnosis (IISMD) that employs fuzzy logic for knowledge representation and
Generalized Modus Ponens (GMP) for inferencing. The system may be used as an
assistant to the medical practitioner for the diagnosis of diseases in a particular
domain, here convulsion in infancy.¹
3.1 Introduction
IISMD helps doctors diagnose diseases. It is an interactive intelligent system in
which the human reasoning process is simulated by a program. The elements of
reasoning are the knowledge base (KB) and the inference engine.
The KB of a medical diagnosis system may contain facts and rules that are
uncertain, incomplete or fuzzy, because the knowledge is gained from experience and
acquired from human experts [9, 18]. Moreover, the user of the system, a medical
practitioner, may not always be completely certain that the value he provides for a
variable, as acquired from the patient, is one hundred percent correct. Naturally, a
conclusion based on the KB of such a system must carry some uncertainty. We have
stored the facts and rules in the KB of IISMD as production rules, as in many
rule-based expert systems [6, 15, 16].
¹ This chapter is based on the published paper: S. Mukhopadhyay, J. Ghosh,
D. Ghosh Dastidar, IISMD: An Interactive Intelligent System for Medical Diagnosis,
AMSE Journal of Modeling, Measurement and Control, France, Series C, Vol. 68,
No. 3, pp. 1-12, 2007.
Appropriate inference strategies [9, 15, 16] are used in the diagnosis process.
We use GMP [10] for inferencing. GMP is an extension of standard modus ponens in
which statements (propositions and predicates) are characterized by fuzzy sets,
i.e., premises need not be matched fully.
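To make the idea of partial matching concrete, the following is a minimal Python sketch of GMP over discretized fuzzy sets. It assumes the max-min compositional rule of inference with a min-based (Mamdani-style) implication; the text does not state which implication operator IISMD uses, so this choice, the function name and the sample membership values are illustrative assumptions only.

```python
def generalized_modus_ponens(a, b, a_prime):
    """Infer B' from the rule 'if X is A then Y is B' and the fact 'X is A''.

    a, a_prime: membership degrees over the same discretized domain of X.
    b: membership degrees over the discretized domain of Y.
    """
    # Fuzzy relation R(x, y) = min(A(x), B(y)); an assumed implication operator.
    relation = [[min(ax, by) for by in b] for ax in a]
    # Max-min composition: B'(y) = max over x of min(A'(x), R(x, y)).
    return [
        max(min(apx, relation[i][j]) for i, apx in enumerate(a_prime))
        for j in range(len(b))
    ]


# Hypothetical membership vectors, chosen only for illustration.
A = [0.0, 0.5, 1.0]
B = [0.2, 1.0]
exact = generalized_modus_ponens(A, B, A)                # fact matches the premise
partial = generalized_modus_ponens(A, B, [0.0, 0.5, 0.5])  # fact matches only partly
print(exact, partial)
```

When the fact equals the premise the conclusion B is recovered unchanged; a weaker match yields a correspondingly weaker conclusion.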
The chapter is organized as follows. Section 3.2 describes the knowledge base,
i.e., the domain of the intelligent system (diseases causing convulsion (fit) in
infancy), and the inference strategy used for interpretation of diseases. Section 3.3
presents the observations from a particular consultation session, and Section 3.4
concludes with the important features of the intelligent system.
The KB of the IISMD considered here covers a set of diseases [13] whose primary
symptom is convulsion, or fit. A fit is defined as an attack of involuntary tonic or
clonic movements of the limbs, trunk and face, with or without loss of
consciousness. The diseases covered by the system are
1) Symptomatic convulsion
(a) Meningitis
(b) Encephalitis
2) Idiopathic convulsion
(i) Epilepsy
The parameters describing the domain of the intelligent system fall into two
categories: user obtained parameters and rule deduced parameters. User obtained
parameters are supplied by the user during the consultation session. The user may be
a doctor or any person with some knowledge of the domain of IISMD. Examples of
user obtained parameters are current-complaint, past-illness, family-history etc.
Rule deduced parameters are supplied by the system itself at execution time.
Examples of rule deduced parameters are look-for, eeg-for, invest etc.
The parameters and their legal values used in the domain of IISMD are stored in the
variable LEGALVALS in the file RULEBASE1.LIB (see appendix). The CAAR of
LEGALVALS returns a parameter and the CDAR returns the legal value(s) of that
parameter. The user is free to give any other value for a parameter, but in that case
he has to type the value instead of selecting an option (see sample session). The
parameters pat-name (patient name), pat-age (patient age), current-complaint,
past-illness and family-history are supplied in the first stage of consultation. Some
clinical examinations may be required for the diagnosis; they are described by the
parameters skull-exam, skin-exam, limb-exam, neurological-findings,
convulsion-nature, post-convulsive-disorder, blood-pressure etc. Their legal values
are displayed on the console, and the user selects options to supply the values of
the parameters.
The diagnosis process can be described as the establishment of values for the
various parameters that describe the result of the consultation. These values are
established in different stages of consultation. The investigation of convulsion
should include a careful case history, with attention paid to each system of the body
in turn, followed by a detailed physical examination. Special investigations include
lumbar puncture (csf test) and X-ray examination of the skull and of the limbs.
Electroencephalography (eeg) is used to search for focal disturbances around organic
lesions. If hypertension is present, renal function must be investigated to determine
whether the convulsions are due to the blood pressure itself
(hypertensive-encephalopathy). The parameters blood-pressure and limb-exam are
used to diagnose hypertensive-encephalopathy in acute-nephritis as the cause of
convulsion.
To detect idiopathic (epileptic type) convulsion [13], current-complaint may not
provide any illuminating evidence, but the following clinical parameters are very
important: convulsion-nature, conv-durn-secs (convulsion duration in seconds) and
post-conv-disorder (post convulsion disorder). The values of these parameters are
used to conclude that the investigation eeg is required, along with the probable type
of epilepsy. This is implemented by inferring the type of epilepsy as the value of the
parameter eeg-for, from which the values of the parameters clinically-asc-disease
and invest (investigation required) are established. Once the value of invest is
established, the value of invest-asc-disease may be obtained.
Knowledge is represented in the form of fuzzy production rules. The concept of a
fuzzy production rule is derived from the production rules popularly used in
rule-based systems. The format of the rule is
If X is A then Y is B with CF = c
where the premise 'X is A' and the conclusion 'Y is B' are fuzzy propositions and c
is a numerical value known as the confidence factor (CF). The CF values vary from 0
to 100 (for convenience in LISP), whereas Zadeh took this value from 0 to 1 [10].
The premise of a rule in the rule base may contain more than one clause, connected
by the operator AND.
Note that no OR connective is provided in the rule structure, since the same effect
can be achieved by adding another rule with the alternative preconditions.
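The rule format described above can be sketched as a small data structure. The following Python fragment is only an illustration, not the LISP representation used in IISMD: the `Rule` class, the `split_or` helper, the rule contents and the CF value 70 are all hypothetical, although the parameter names are taken from the text. It shows how an OR between two premise sets is expressed as two AND-only rules with the same conclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    premises: tuple    # (parameter, value) clauses, implicitly joined by AND
    conclusion: tuple  # (parameter, value) established when the rule fires
    cf: int            # confidence factor on the 0..100 scale


def split_or(alternatives, conclusion, cf):
    """Express 'P1 OR P2 -> C' as one AND-only rule per alternative."""
    return [Rule(tuple(premises), conclusion, cf) for premises in alternatives]


# Hypothetical example: either premise set supports the same conclusion.
rules = split_or(
    alternatives=[
        [("convulsion-nature", "tonic-clonic-movt-of-musc"),
         ("conv-durn-secs", "high")],
        [("past-illness", "grandmal-epilepsy")],
    ],
    conclusion=("eeg-for", "grandmal-epilepsy"),
    cf=70,
)
for rule in rules:
    print(rule)
```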
The parameters and their legal values are stored in the variable LEGALVALS. The
structure of LEGALVALS in LISP is shown partially as follows:
………………………………………………
( pred-csf-celltype ( lymphocyte monocyte) )
……………………………………………...)
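As a rough illustration of how such a table of legal values supports the "select an option or type a value" behaviour described earlier, the lookup can be sketched in Python. Only the pred-csf-celltype entry comes from the fragment above; the dictionary layout and the `resolve_answer` helper are assumptions made for this sketch and are not part of RULEBASE1.LIB.

```python
# Parameter -> list of legal values (only this entry appears in the fragment above).
LEGALVALS = {
    "pred-csf-celltype": ["lymphocyte", "monocyte"],
}


def resolve_answer(parameter, answer):
    """Map a console answer to a parameter value.

    A valid option number selects the corresponding legal value;
    anything else is taken as a value typed in by the user.
    """
    options = LEGALVALS.get(parameter, [])
    if answer.isdigit() and 1 <= int(answer) <= len(options):
        return options[int(answer) - 1]
    return answer.strip()


selected = resolve_answer("pred-csf-celltype", "2")
typed = resolve_answer("pred-csf-celltype", "neutrophil")
print(selected, typed)
```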
The meanings of the parameter values are stored in MEANING-PARA-VAL. The
structure of MEANING-PARA-VAL in LISP is shown partially as follows:
[Example 3.2]: (conv-durn-secs((high ( 110 130 150 170 190 210 230 250 270 290)
where the members of the first list in the fuzzy term correspond to high of the variable
conv-durn-secs meaning convulsion duration in seconds, and the members in the
second list correspond to the respective degrees of membership of being high.
Two rules in the rule base are not of the IF-THEN form: initial data and goal.
They are used to initialize the data and to set the parameters relating to the goal.
In IISMD the goal is the diagnosis of diseases, which are either clinically
ascertained or ascertained by investigation. The two rules relating to initialization
of data and goal are as follows:
past-illness, family-history)))
conv-durn-secs---->high conv-durn-secs---->med
conv-durn-secs---->low
3.2.4 Inferencing
The inference engine of IISMD uses both forward chaining and backward chaining.
It works in forward mode up to a certain point and then switches to backward
chaining.
A temporary storage called CACHE holds the result of the consultation. The values
of the various parameters are established there using the inference process of a
production (rule-based) system, match-select-execute, as shown in Figure 3.2.
[Figure 3.2: match-select-execute cycle over KB and CACHE: 1 Match (producing the
Conflict Set), 2 Select, 3 Execute]
For inferencing, a MYCIN-like procedure [2, 9, 16] is adopted. The expert assigns a
confidence factor (CF) to the parameter in each THEN statement on the basis of
experience alone. During the consultation session, the user may select the CF for a
parameter value or enter it from the console. For an IF statement to be true, its CF
must exceed a certain limit (20 in our case). Premises may be combined by means of
the AND function, and the CF of the IF statement is the minimum of the CFs of the
premises (as per fuzzy logic). When the conditions of the IF statement are found to
be met, the action in the statement is taken into consideration and a new CF is
computed for the parameter in the THEN statement. Conceptually, if we are 50%
certain of the evidence and 50% certain that a rule applies, we are 25% certain of the
inference.
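The CF handling described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the IISMD code: the product rule on the 0 to 100 scale is inferred from the 50% and 50% giving 25% example, the function name is hypothetical, and the rule CF of 70 in the second call is an invented value.

```python
THRESHOLD = 20  # an IF statement is true only if its CF exceeds this limit


def conclusion_cf(premise_cfs, rule_cf, threshold=THRESHOLD):
    """Return the CF of the THEN parameter, or None if the rule does not fire.

    premise_cfs: CFs (0..100) of the premises joined by AND.
    rule_cf: CF (0..100) the expert attached to the THEN statement.
    """
    cf_if = min(premise_cfs)         # AND of premises: minimum CF
    if cf_if <= threshold:           # premise too weak, rule is not applied
        return None
    return cf_if * rule_cf / 100     # assumed product rule, rescaled to 0..100


print(conclusion_cf([50], 50))          # the 50% and 50% example from the text
print(conclusion_cf([90, 80, 90], 70))  # hypothetical rule CF of 70
print(conclusion_cf([15, 90], 80))      # weakest premise is below the limit
```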
If the same conclusion can be derived from different sources then the CF of THEN
statement is computed by using the formula
3.3 Observation
The KB of IISMD consists of 90 rules, but we show here, in an English-like form,
only the rules that were used in the consultation process of this particular session,
as shown in Appendix-I.
Suppose the user gives the following inputs, as described in the sample consultation
session: current-complaint is visual-disturbance with CF 60, past-illness is
grandmal-epilepsy with CF 70, family-history is grandmal-epilepsy with CF 70,
convulsion-nature is tonic-clonic-movt-of-musc with CF 90,
post-convulsive-disorder is sleep with CF 80 and conv-durn-secs is high with CF 90.
Finally, the following conclusion is drawn on the basis of the facts supplied by the
user:
3.4 Conclusion
IISMD has been developed in a medical domain for the diagnosis of diseases causing
convulsion in infancy, but it may be applied to any diagnostic domain with a similar
rule-based structure. The system works satisfactorily. Our ultimate aim is to extend
the system with neural network based classification so that parallel computation
becomes possible.
The user can supply input by selecting options, which makes the system interactive.
As in other expert systems, the user can obtain an explanation of the reasoning
process by selecting the WH (why) option.
The user can abort the consultation session using the AB (abort) option and, if
required, resume it afterwards at the point where it was aborted.
Bibliography
5. I-Jen Chang, Ming-Jium Shieh, Jane Yung-Jen Hsu, Jau-Min Wong, Building a
medical decision support system for colon polyp screening by using fuzzy
classification trees, Applied Intelligence, Vol. 22, pp. 61-75, (2005)
6. J. J. Buckley, W. Siler and Douglas Tucker, A fuzzy expert system, Fuzzy sets
and systems, Vol. 20, pp 1-16, (1986).
7. Krzysztof J. Cios, Inho Shin, Lucy S. Goodenday, Using fuzzy sets to diagnose
coronary artery stenosis, IEEE Computer, Vol. 24, No. 3, pp. 57-63, (1991)
8. K. S. Leung, M. H. Wong and W. Lam, A fuzzy expert data base system, Data and
Knowledge Engineering, Vol. 4, pp 287-304, (1989).
11. Laurent Jeanpierre, Francois Charpillet, Automated medical diagnosis with fuzzy
stochastic models: monitoring chronic diseases, Acta Biotheoretica, Vol. 52, pp.
291-311, (2004)
13. Nicholas P. Mann and Angus Nicoll, handbook of Paediatric, Blackwell Scientific
Publishers, (1989).
14. P. Mondal, A. K. Saha and R. K. Samanta; On an expert system with child growth
and development, Advances in Modelling & Analysis (B), AMSE Press, France,
Vol. 26 No.1, pp 13-28, (1993).
Chapter 4
Rough-Fuzzy Hybridization
Rough set theory and its integration with fuzzy set theory form an efficient soft
computing strategy for machine learning. Rough-fuzzy hybridization provides a
flexible way of processing information in real-life decision-making problems that
involve ambiguity.
Rough set theory was introduced by Zdzislaw Pawlak [1, 2] for the classificatory
analysis of data tables. It provides a systematic framework for studying imprecise
and insufficient knowledge. The main goal of rough set theoretic analysis is to
synthesize approximations (upper and lower) of concepts from the acquired data.
While fuzzy set theory [3] assigns to each object a grade of membership
(belongingness) to represent an imprecise set, rough set theory focuses on the
ambiguity caused by the limited discernibility of objects in the domain of discourse.
4.1 Introduction
Fuzzy set and rough set theories are extensions of classical set theory that describe
uncertainty, imprecision and vagueness in data. The characteristic function of a
fuzzy set is a degree of membership in [0, 1]; a rough set, on the other hand, is
characterized by its lower and upper approximations in an approximation space.
There has been extensive theoretical work on the relationships between rough sets
and fuzzy sets [4, 5, 6], and many approaches have been proposed for combining
them: rough-fuzzy sets, fuzzy-rough sets [7, 8, 9], and rough-fuzzy hybridization
[10-14]. Because fuzzy concepts are defined, analyzed and operated on as sets, it is
simpler to use a set-based method when working with a fuzzy set. One example of a
set-based method for combining rough and fuzzy sets is the more general framework
suggested by Klir and Yuan [15].
representation of patterns and rough set theory is used to obtain dependency rules to
construct the knowledge-base of the intelligent system.
The main objective of this chapter is to review the theoretical and implementation
approaches on combination of rough and fuzzy sets.
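A fuzzy set generalizes a classical set by letting its membership function take
values in the real interval [0, 1]:
$F = \{(x, \mu_F(x)) \mid x \in U,\ 0 \le \mu_F(x) \le 1\}$ (4.1)
U: a universe
x: an element in U
F: a fuzzy set on U
$\mu_F$: a membership function
Utilizing the standard max-min system proposed by Zadeh [3], the fuzzy-set
complement, intersection, and union are defined by
$\mu_{\bar{F}}(x) = 1 - \mu_F(x)$ (4.2)
$\mu_{F \cap G}(x) = \min(\mu_F(x), \mu_G(x))$
$\mu_{F \cup G}(x) = \max(\mu_F(x), \mu_G(x))$
where F and G are fuzzy sets on U.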
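The standard max-min operations can be illustrated with a short Python sketch in which a fuzzy set on a finite universe is stored as a dictionary from elements to membership degrees. The universe {a, b, c}, the membership values and the function names are hypothetical and serve only to show the three operations.

```python
def complement(f):
    """Membership of the complement: 1 - mu_F(x)."""
    return {x: 1 - m for x, m in f.items()}


def intersection(f, g):
    """Membership of the intersection: min(mu_F(x), mu_G(x))."""
    return {x: min(f[x], g[x]) for x in f}


def union(f, g):
    """Membership of the union: max(mu_F(x), mu_G(x))."""
    return {x: max(f[x], g[x]) for x in f}


# Two hypothetical fuzzy sets on the same universe {a, b, c}.
F = {"a": 0.25, "b": 1.0, "c": 0.5}
G = {"a": 0.75, "b": 0.5, "c": 0.0}

print(complement(F))
print(intersection(F, G))
print(union(F, G))
```

Each result depends only on the membership degrees of the operands, which is the truth-functional property of these operations.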
An important feature of fuzzy set operations is that they are truth-functional [16]:
the membership functions of the complement, intersection, and union of fuzzy sets
are obtained solely from the membership functions of the fuzzy sets involved.
Methods of this kind are used to solve different types of problems, such as
sampling, feature selection, clustering or classification, transformation or
projection, dimensionality reduction, rule extraction, and the construction of
different physical models, by developing useful algorithms. These algorithms
generate assumptions about the extracted data or information, and these assumptions
are regarded as newly extracted knowledge. The methods of knowledge discovery in
databases (KDD), i.e., extracting knowledge about a problem domain from acquired
data sets, are basic steps of information processing. In real-world problems,
handling implicit, imprecise, and insufficient knowledge in databases or constructed
data sets is an important concern in developing intelligent systems.
Typically, knowledge is presented in the form of rules. However, knowledge
discovery systems often generate a huge number of rules. Another fundamental issue
is therefore how to automatically discover interesting and meaningful knowledge
among the discovered rules, since it is infeasible for human beings to select the
important and interesting rules manually.
4.3.1 Introduction
Rough set theory has been developed for KDD and for extracting knowledge from
experimental data sets [18]. It provides a powerful foundation for describing the
behavior of concepts, properties and data, and in general of objects that may present
intrinsic, vague, ambiguous or unsharp features.
Rough set theory was introduced by Zdzislaw Pawlak [1]. The rough set approach
provides efficient algorithms for finding hidden patterns in data, finding minimal
sets of data (data reduction), evaluating the significance of data, and generating sets
of decision rules from data; it also reduces the computational complexity of learning
processes. The approach is easy to understand and offers a straightforward
interpretation of the results obtained, and most of its algorithms are particularly
suited to parallel processing.
Rough set theory has been applied in different domains such as medical database
analysis [19-21], medical image analysis [22-24], decision support systems [25, 26],
pattern recognition [27-29], and machine learning [30, 31]. Rough sets have been
used very effectively to find relationships within imprecise data and dependencies
among objects and attributes, to evaluate the classificatory importance of attributes,
to remove data redundancies and to generate dependency rules.
Rough set theory deals with information presented in a table. The table consists of
objects and their attributes. The entries in the table may be the categorical values of
the attributes or features, possibly together with the associated categories or classes.
Data processing problems can normally be converted easily into a data table
representation for analysis. Information processing using rough sets also classifies
the objects.
Where
[Example 4.1]
Object Attributes
where INS stands for use of insulin (1 = yes, 0 = no), PPB stands for post prandial
blood glucose, ALB stands for albuminuria (0 = no albuminuria, 1 =
microalbuminuria, 2 = proteinuria) and LJM is the disease limited joint mobility
(1 = yes, 0 = no).
In the information system S described in Table 4.1, the universe U consists of ten
objects, U = {p1, p2, …, p10}, each representing one patient. Each patient is
described by the set of four attributes A = {a1, a2, a3, a4} = {c1, c2, c3, c4}, with
discrete values (numerical and symbolic) representing the results of medical
observation and diagnoses. The first attribute c1 takes two discrete numerical values,
$V_{c1}$ = {0, 1}; the second attribute c2 takes three discrete non-numerical values,
$V_{c2}$ = {N, H, V}; the third attribute c3 takes three discrete numerical values,
$V_{c3}$ = {0, 1, 2}; and the attribute c4 takes two binary values, $V_{c4}$ = {0, 1}.
The values of the information function f(x, a) are listed in Table 4.1. For example,
for the patient (object) p1 and the attribute c1, the information function value is
f(p1, c1) = 0. The set of objects {p2, p3, p5, p6} is an example of a concept in the
considered information system.
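An information system of this kind maps naturally onto a nested dictionary. The following Python sketch is illustrative only: the rows are taken from the DIABET data as printed in Table 4.2 (with the decision d standing in for c4, as the text states), and the names `DIABET`, `f` and `value_set` are assumptions introduced for the example.

```python
# Object -> attribute -> value, following the DIABET rows printed in Table 4.2
# (c4 is the attribute that Table 4.2 relabels as the decision d).
DIABET = {
    "p1": {"c1": 0, "c2": "N", "c3": 2, "c4": 0},
    "p2": {"c1": 0, "c2": "N", "c3": 2, "c4": 0},
    "p3": {"c1": 1, "c2": "N", "c3": 0, "c4": 1},
    "p4": {"c1": 1, "c2": "N", "c3": 0, "c4": 0},
    "p5": {"c1": 0, "c2": "V", "c3": 0, "c4": 0},
    "p6": {"c1": 0, "c2": "V", "c3": 0, "c4": 0},
    "p7": {"c1": 1, "c2": "V", "c3": 2, "c4": 1},
    "p8": {"c1": 1, "c2": "V", "c3": 2, "c4": 1},
    "p9": {"c1": 0, "c2": "H", "c3": 1, "c4": 1},
    "p10": {"c1": 0, "c2": "H", "c3": 1, "c4": 0},
}


def f(x, a):
    """Information function: value of attribute a for object x."""
    return DIABET[x][a]


def value_set(a):
    """Set of values that attribute a takes over the universe."""
    return {row[a] for row in DIABET.values()}


print(f("p1", "c1"))
print(value_set("c2"), value_set("c3"))
```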
Where
[Example 4.2]
The DIABET data set from Example 4.1 can be interpreted as a decision system as
shown in Table 4.2.
Object   Attributes
         C (medical diagnoses)   D (disease class)
U        c1    c2    c3          d
p1       0     N     2           0
p2       0     N     2           0
p3       1     N     0           1
p4       1     N     0           0
p5       0     V     0           0
p6       0     V     0           0
p7       1     V     2           1
p8       1     V     2           1
p9       0     H     1           1
p10      0     H     1           0
In Table 4.2 the attribute c4 of Table 4.1 is redefined as d, which represents an
expert's (doctor's) decision about the disease, taken on the basis of observation and
test results. The decision attribute value d = 0 denotes the diagnosis that a patient
does not have the disease, and d = 1 that the patient suffers from the disease limited
joint mobility.
If the pair of objects (x, y) belongs to the relation IND(B), i.e., (x, y) ∈ IND(B),
then the objects x and y are called indiscernible with respect to the set B in S, or
B-indiscernible. In other words, one cannot distinguish object x from object y in
terms of the attributes in B alone.
The indiscernibility relation IND(B), being an equivalence relation, splits the given
universe U into a family of equivalence classes {E1, E2, …, Er}. This family,
defined by the relation IND(B) on U, forms a partition of U, denoted by U / IND(B)
or simply U/B. The equivalence class containing an object x ∈ U is denoted by [x]B
and defined by
$[x]_B = \{y \in U : (x, y) \in IND(B)\}$ (4.6)
Any finite union of elementary sets in AS is called a definable set or a composed set
in AS.
partition U / IND(B) in AS, denoted by DEF(AS), is a Boolean algebra [1]. An
arbitrary set X ⊆ U may not be representable as a union of equivalence classes in
U. In other words, a subset X cannot always be described precisely in AS. Such a
subset X may be characterized by a pair of approximations, called the lower and
upper approximations. It is here that the notion of rough set emerges.
[Example 4.3]
Consider the decision system DIABET from Table 4.2, taking into account only the
conditional attributes, i.e., the observations and test results, represented by the
attribute set B = {c1, c2, c3} and contained in the reduced Table 4.3.
Object   Attributes
U        c1    c2    c3
p1       0     N     2
p2       0     N     2
p3       1     N     0
p4       1     N     0
p5       0     V     0
p6       0     V     0
p7       1     V     2
p8       1     V     2
p9       0     H     1
p10      0     H     1
Table 4.3: The DIABET data set with the reduced attribute set B = {c1, c2, c3}
Equivalence classes: E1 = {p1, p2}, E2 = {p3, p4}, E3 = {p5, p6}, E4 = {p7, p8},
E5 = {p9, p10}
Thus from Table 4.3 the objects are divided into five disjoint groups according to
equal values of the attributes c1, c2 and c3 from the subset B, giving the equivalence
classes E1, E2, E3, E4, E5. Objects in the same group have the same values for all
attributes in B. For example, the first group contains the two objects p1 and p2,
since no other objects have the values c1 = 0, c2 = N and c3 = 2. The object p1
belongs to the equivalence class E1 = [p1]B = [p2]B = {p1, p2}. Objects p3 and p4,
with equal values c1 = 1, c2 = N, c3 = 0, form the second group. The objects in this
group cannot be distinguished on the basis of the attributes c1, c2 and c3 alone.
They belong to the equivalence class E2 = [p3]B = [p4]B = {p3, p4}. Similarly, the
other equivalence classes in S for the set B are E3 = [p5]B = [p6]B = {p5, p6},
E4 = [p7]B = [p8]B = {p7, p8} and E5 = [p9]B = [p10]B = {p9, p10}.
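The grouping carried out by hand above can be reproduced with a short Python sketch that partitions the objects of Table 4.3 by their attribute values. The function name `partition` and the dictionary layout are assumptions made for illustration; the data are the rows of Table 4.3.

```python
# Rows of Table 4.3: object -> values of the attributes in B = {c1, c2, c3}.
TABLE_4_3 = {
    "p1": {"c1": 0, "c2": "N", "c3": 2},
    "p2": {"c1": 0, "c2": "N", "c3": 2},
    "p3": {"c1": 1, "c2": "N", "c3": 0},
    "p4": {"c1": 1, "c2": "N", "c3": 0},
    "p5": {"c1": 0, "c2": "V", "c3": 0},
    "p6": {"c1": 0, "c2": "V", "c3": 0},
    "p7": {"c1": 1, "c2": "V", "c3": 2},
    "p8": {"c1": 1, "c2": "V", "c3": 2},
    "p9": {"c1": 0, "c2": "H", "c3": 1},
    "p10": {"c1": 0, "c2": "H", "c3": 1},
}


def partition(table, attrs):
    """Return U/IND(B): objects grouped by equal values on every attribute in attrs."""
    classes = {}
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)   # the values that make objects indiscernible
        classes.setdefault(key, set()).add(obj)
    return list(classes.values())


equivalence_classes = partition(TABLE_4_3, ["c1", "c2", "c3"])
for i, e in enumerate(equivalence_classes, start=1):
    print(f"E{i} =", sorted(e))
```

Restricting `attrs` to fewer attributes gives a coarser partition, for example two classes when only c1 is used.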
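$\underline{B}X = \{x \in U : [x]_B \subseteq X\} = \bigcup \{Y \in U/IND(B) : Y \subseteq X\}$ (4.8)
$\overline{B}X = \{x \in U : [x]_B \cap X \neq \varphi\} = \bigcup \{Y \in U/IND(B) : Y \cap X \neq \varphi\}$ (4.9)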
$BN_B(X) = \overline{B}X - \underline{B}X$ (4.10)
[Figure: the universe U, an arbitrary set X, its lower approximation, its upper
approximation and the negative region]
AS. In this case the B-boundary region is empty, $BN_B(X) = \varphi$. On the other hand, if $\underline{B}X \neq \overline{B}X$
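1. $\underline{B}X \subseteq X \subseteq \overline{B}X$
2. $\underline{B}\varphi = \overline{B}\varphi = \varphi$
3. $\underline{B}U = \overline{B}U = U$
4. $\underline{B}(X \cup Y) \supseteq \underline{B}X \cup \underline{B}Y$
5. $\overline{B}(X \cup Y) = \overline{B}X \cup \overline{B}Y$
6. $\underline{B}(X \cap Y) = \underline{B}X \cap \underline{B}Y$
7. $\overline{B}(X \cap Y) \subseteq \overline{B}X \cap \overline{B}Y$
8. $\underline{B}(-X) = -\overline{B}X$
9. $\overline{B}(-X) = -\underline{B}X$
10. $\underline{B}(\underline{B}X) = \overline{B}(\underline{B}X) = \underline{B}X$
11. $\overline{B}(\overline{B}X) = \underline{B}(\overline{B}X) = \overline{B}X$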
[Example 4.4]
According to the definition of the lower and upper approximations of a set X1,
based on a subset of attributes B ⊆ A, B = {c1, c2, c3}, the lower approximation is
the largest composed set of B-elementary sets in S that is contained in X1. It
contains all B-elementary sets such that every element of the elementary set is also
an element of X1. The lower approximation thus consists of the patients that surely
have the disease. For X1 = {p3, p7, p8, p9} (the set used again in Example 4.6),
$\underline{B}X_1 = \{p7, p8\}$
The upper approximation of the set X1 is the smallest composed set of B-elementary
sets in S that contains X1. The upper approximation consists of the patients that
possibly have the disease.
$\overline{B}X_1 = \{p3, p4\} \cup \{p7, p8\} \cup \{p9, p10\} = \{p3, p4, p7, p8, p9, p10\}$
The B-boundary region (B-doubtful region of IND(B)) of the set X1 in S, based on
B, is
$BN_B(X_1) = \{p3, p4, p9, p10\}$
This boundary region is the composed set of B-elementary sets from S whose
elements cannot be classified, on the basis of the subset of attributes B, as either
belonging or not belonging to X1.
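The approximations in this example can be checked with a minimal Python sketch. It takes the equivalence classes of Example 4.3 and the set X1 = {p3, p7, p8, p9} quoted in Example 4.6 as given; the function names are assumptions introduced for the illustration.

```python
# Equivalence classes U/IND(B) from Example 4.3.
CLASSES = [{"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"}]


def lower_approximation(classes, concept):
    """Union of the equivalence classes entirely contained in the concept."""
    return set().union(*(e for e in classes if e <= concept))


def upper_approximation(classes, concept):
    """Union of the equivalence classes that intersect the concept."""
    return set().union(*(e for e in classes if e & concept))


X1 = {"p3", "p7", "p8", "p9"}
lower = lower_approximation(CLASSES, X1)
upper = upper_approximation(CLASSES, X1)
boundary = upper - lower   # objects that cannot be classified with certainty

print(sorted(lower), sorted(upper), sorted(boundary))
```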
[Example 4.5]
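From Example 4.3, the equivalence classes of the DIABET data set with B ⊆ A and
B = {c1, c2, c3} are as follows.
E1 = {p1, p2}, E2 = {p3, p4}, E3 = {p5, p6}, E4 = {p7, p8}, E5 = {p9, p10}
Consider a set (concept) X1 ⊆ U, X1 = {p3, p4, p5, p6}. The set X1 is an example of
a B-definable set, where
$\underline{B}X_1 = \overline{B}X_1 = E_2 \cup E_3 = \{p3, p4, p5, p6\}$
Consider another set (concept) X2 ⊆ U, X2 = {p2, p4, p5, p6}. The set X2 is an
example of a roughly B-definable set, where
$\underline{B}X_2 = E_3 = \{p5, p6\} \neq \varphi$
$\overline{B}X_2 = E_1 \cup E_2 \cup E_3 = \{p1, p2, p3, p4, p5, p6\} \neq U$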
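Consider another set (concept) X3 ⊆ U, X3 = {p2, p4, p5, p6, p7, p9}. The set X3 is
an example of an externally B-nondefinable set, where
$\underline{B}X_3 = E_3 = \{p5, p6\} \neq \varphi$
$\overline{B}X_3 = E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5 = \{p1, p2, p3, p4, p5, p6, p7, p8, p9, p10\} = U$
Consider another set (concept) X4 ⊆ U, X4 = {p2, p4, p5}. The set X4 is an example
of an internally B-nondefinable set, where
$\underline{B}X_4 = \varphi$
$\overline{B}X_4 = E_1 \cup E_2 \cup E_3 = \{p1, p2, p3, p4, p5, p6\} \neq U$
Consider another set (concept) X5 ⊆ U, X5 = {p2, p4, p5, p7, p9}. The set X5 is an
example of a totally B-nondefinable set, where
$\underline{B}X_5 = \varphi$
$\overline{B}X_5 = E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5 = \{p1, p2, p3, p4, p5, p6, p7, p8, p9, p10\} = U$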
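The five cases of this example can be told apart mechanically by testing whether the lower approximation is empty and whether the upper approximation equals U. The Python sketch below follows the conditions shown in the example; the function names and the decision order are assumptions made for illustration, not a procedure given in the text.

```python
# Equivalence classes U/IND(B) from Example 4.3.
CLASSES = [{"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"}]
UNIVERSE = set().union(*CLASSES)


def approximations(concept):
    """Return the (lower, upper) approximations of the concept."""
    lower = set().union(*(e for e in CLASSES if e <= concept))
    upper = set().union(*(e for e in CLASSES if e & concept))
    return lower, upper


def definability(concept):
    """Classify a concept by the emptiness of its lower and fullness of its upper approximation."""
    lower, upper = approximations(concept)
    if lower == upper:
        return "B-definable"
    if lower and upper != UNIVERSE:
        return "roughly B-definable"
    if lower:
        return "externally B-nondefinable"
    if upper != UNIVERSE:
        return "internally B-nondefinable"
    return "totally B-nondefinable"


CONCEPTS = {
    "X1": {"p3", "p4", "p5", "p6"},
    "X2": {"p2", "p4", "p5", "p6"},
    "X3": {"p2", "p4", "p5", "p6", "p7", "p9"},
    "X4": {"p2", "p4", "p5"},
    "X5": {"p2", "p4", "p5", "p7", "p9"},
}
labels = {name: definability(x) for name, x in CONCEPTS.items()}
for name, label in labels.items():
    print(name, label)
```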
$\alpha_B(X) = \frac{|\underline{B}X|}{|\overline{B}X|}$ (4.14)
$\rho_B(X) = 1 - \alpha_B(X)$ (4.15)
A vague concept description may involve boundary-line objects from the universe U,
which cannot be classified with absolute certainty as satisfying the description of
the concept. Uncertainty is related to the idea of membership of an element in a set.
From the rough set perspective, a set membership function related to the rough set
concept can be defined. It may be considered as another numerical measure of
imprecision (uncertainty). The rough (B-rough) membership function of an object x
in a set X ⊆ U with respect to the subset of attributes B ⊆ A is defined by
$\mu_X^B(x) = \frac{|[x]_B \cap X|}{|[x]_B|}$
where
$0 \le \mu_X^B(x) \le 1$ (4.16)
In rough set theory it is possible to establish a firm connection between vagueness
and uncertainty. Vagueness is related to sets of objects (concepts), whereas
uncertainty is related to elements of sets. Rough sets show that vagueness can be
defined in terms of uncertainty.
The lower and upper approximation of a set and also the boundary region may be
defined by using the rough set membership function as follows:
∈ ∶ 1 (4.18)
∈ ∶ 0
∈ ∶ 0 1
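As an illustration, the rough membership function and the three regions derived from it can be computed as follows. This is a sketch only: it assumes the usual definition of rough membership as the share of an object's equivalence class that lies inside the concept, uses the partition of the DIABET examples, and the function and variable names are hypothetical.

```python
from fractions import Fraction

# Equivalence classes U/IND(B) for B = {c1, c2, c3}, as listed in the examples.
CLASSES = [{"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"}]
UNIVERSE = set().union(*CLASSES)


def rough_membership(obj, concept):
    """Share of obj's equivalence class that lies inside the concept."""
    eq_class = next(c for c in CLASSES if obj in c)
    return Fraction(len(eq_class & concept), len(eq_class))


X = {"p3", "p7", "p8", "p9"}
mu = {obj: rough_membership(obj, X) for obj in UNIVERSE}

# Regions recovered from the membership values alone.
lower = {obj for obj, m in mu.items() if m == 1}
upper = {obj for obj, m in mu.items() if m > 0}
boundary = {obj for obj, m in mu.items() if 0 < m < 1}
```

Exact fractions are used so that the tests for membership 1 and 0 are not affected by floating-point rounding.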
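[Example 4.6]
From example 4.4, for the subset of attributes B ⊆ A with B = {c1, c2, c3} and the set of
objects X1 ⊆ U, X1 = {p3, p7, p8, p9}, the accuracy of approximation of the set X1 with
respect to B is:
\alpha_B(X_1) = \frac{|\underline{B}X_1|}{|\overline{B}X_1|} = \frac{|\{p_7, p_8\}|}{|\{p_3, p_4, p_7, p_8, p_9, p_{10}\}|} = \frac{2}{6} = 0.333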
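Let S = < U, A, V, f > be an information system, let B ⊆ A, and let Γ = {X1, X2, …,
Xn}, with Xi ⊆ U (1 ≤ i ≤ n), be a classification (or a partition; a family of
subsets) of U. The B-lower and B-upper approximations of the classification Γ are
\underline{B}\Gamma = \{\underline{B}X_1, \underline{B}X_2, \ldots, \underline{B}X_n\} (4.19)
\overline{B}\Gamma = \{\overline{B}X_1, \overline{B}X_2, \ldots, \overline{B}X_n\} (4.20)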
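POS_B(\Gamma) = \bigcup_{X_i \in \Gamma} \underline{B}X_i (4.21)
BN_B(\Gamma) = \bigcup_{X_i \in \Gamma} BN_B(X_i) (4.22)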
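\alpha_B(\Gamma) = \frac{\sum_{i=1}^{n} |\underline{B}X_i|}{\sum_{i=1}^{n} |\overline{B}X_i|} (4.23)
\gamma_B(\Gamma) = \frac{\sum_{i=1}^{n} |\underline{B}X_i|}{|U|} (4.24)
This represents the ratio of all B-correctly classified objects to all objects in the
system S.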
The idea of accuracy of a classification allows us to define how closely a partition
(classification U/IND(B)) generated by a set of attributes B ⊆ A approximates
another partition U/IND(C) generated by a set of attributes C ⊆ A. The accuracy
of approximation of classification U/IND(C) by U/IND(B) may be defined as follows:
/
/ (4.25)
Where
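[Example 4.7]
From the information system DIABET defined in example 4.2, consider a
classification Γ = {X1, X2, X3, X4} where X1 = {p2, p4, p5}, X2 = {p1, p7, p8}, X3 = {p5,
p7, p9}, X4 = {p8, p10}.
\underline{B}X_1 = \emptyset, \quad \underline{B}X_2 = \{p_7, p_8\}, \quad \underline{B}X_3 = \emptyset, \quad \underline{B}X_4 = \emptyset
\overline{B}X_1 = \{p_1, p_2, p_3, p_4, p_5, p_6\}, \quad \overline{B}X_2 = \{p_1, p_2, p_7, p_8\}
\overline{B}X_3 = \{p_5, p_6, p_7, p_8, p_9, p_{10}\}, \quad \overline{B}X_4 = \{p_7, p_8, p_9, p_{10}\}
\alpha_B(\Gamma) = \frac{0 + 2 + 0 + 0}{6 + 4 + 6 + 4} = \frac{2}{20} = 0.1
\gamma_B(\Gamma) = \frac{2}{11} = 0.18
(the denominator 11 is the total number of objects listed in X1, …, X4).
Therefore, it can be said that the accuracy of this classification is very poor and the
classification process has to be improved towards higher accuracy.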
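The figures of Examples 4.6 and 4.7 can be reproduced with a few lines of Python. The sketch below assumes the partition U/IND(B) used throughout the DIABET examples; the helper names are illustrative. The last ratio follows the worked example, which divides by the 11 objects listed in X1, …, X4 rather than by |U| = 10.

```python
from fractions import Fraction

# Equivalence classes U/IND(B) for B = {c1, c2, c3}, as listed in the examples.
CLASSES = [{"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"}]


def lower(concept):
    return set().union(*[c for c in CLASSES if c <= concept])


def upper(concept):
    return set().union(*[c for c in CLASSES if c & concept])


def set_accuracy(concept):
    """|lower| / |upper| for a single non-empty concept."""
    return Fraction(len(lower(concept)), len(upper(concept)))


def classification_accuracy(family):
    """Sum of lower-approximation sizes over sum of upper-approximation sizes."""
    return Fraction(
        sum(len(lower(x)) for x in family),
        sum(len(upper(x)) for x in family),
    )


GAMMA = [{"p2", "p4", "p5"}, {"p1", "p7", "p8"}, {"p5", "p7", "p9"}, {"p8", "p10"}]

accuracy_x1 = set_accuracy({"p3", "p7", "p8", "p9"})      # Example 4.6
accuracy_gamma = classification_accuracy(GAMMA)            # Example 4.7
# Ratio as computed in Example 4.7: correctly classified objects over the
# number of objects listed in the family (with |U| = 10 it would be 2/10).
quality_as_in_example = Fraction(
    sum(len(lower(x)) for x in GAMMA), sum(len(x) for x in GAMMA)
)
```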
classificatory power as the original set is called attribute reduction. In other words, a
reduct is a minimal set of attributes from A that preserves the partitioning of the
universe and hence the ability to perform classifications as the whole attribute set A
does. As a result the original larger information system may be reduced to a smaller
system containing fewer attributes.
Rough sets allow one to determine the most important attributes from a classificatory
point of view for a given information system. A reduct is the essential part of an
information system: a subset of attributes that can discern all objects
discernible by the original information system. A core is the common part of all reducts.
Core and reduct are fundamental rough set concepts that may be used for knowledge
reduction. Some attributes may depend on each other. A change in a given attribute
may cause changes in other attributes in some non-linear way. Rough sets determine
the degree of dependency between attributes and their significance. The dependency of
attributes, expressed through the indiscernibility relation, is one of the important
features of information systems.
POS_B(D) = \bigcup \{\underline{B}X \mid X \in U/IND(D)\} (4.27)
The positive region POSB(D) contains all objects in U which can be classified
perfectly without error into distinct classes defined by IND(D), based only on
information in relation IND(B).
The definition of the positive region can be formulated for any two subsets of attributes
B1, B2 ⊆ A in the information system S. The subset of attributes B2 ⊆ A
defines the indiscernibility relation IND(B2) and thus the classification U/IND(B2)
with respect to that subset. The B1-positive region of B2, defined below, contains all
objects that, by using the attributes B1, can be certainly
classified to one of the distinct classes of the classification U/IND(B2).
POS_{B_1}(B_2) = \bigcup_{X \in U/IND(B_2)} \underline{B_1}X (4.28)
Rough sets define a degree of dependency for sets of attributes. The cardinality of
the B1-positive region of B2 is used to define a measure called the degree of
dependency of the set of attributes B2 on B1 in (4.29). It can be said that the set of
attributes B2 depends on the set of attributes B1 to the degree \gamma(B_1, B_2).
\gamma(B_1, B_2) = \frac{|POS_{B_1}(B_2)|}{|U|} (4.29)
Suppose an information system S = < U, A, V, f > and two sets of attributes B1, B2 ⊆ A.
The set B2 depends on B1 in S, denoted B1→B2, iff the indiscernibility
relation satisfies IND(B1) ⊆ IND(B2). The sets B1 and B2 are independent in S iff
neither B1→B2 nor B2→B1 holds. A set B2 is dependent to a degree k on the set B1 in
S, denoted
B_1 \rightarrow_k B_2, \quad 0 \le k \le 1, (4.30)
where
k = \gamma(B_1, B_2) (4.31)
The significance of the attribute a in the set B1∈ A computed with respect to the
original classification U/IND(A) generated by the entire set of attributes A from the
information system S is denoted as , .
1. A set B1 ⊂ A is dependent in S iff there exists B2 ⊂ B1 such that IND(B2) = IND(B1).
2. A set B1 ⊂ A is independent in S iff for every B2 ⊂ B1, IND(B2) ⊃ IND(B1).
A given information system may have many different reducts. If, for a given
information system S = < U, A, V, f >, a subset B ⊂ A is a reduct, then the
corresponding information system S1 = < U, B, V, f1 >, with the attribute set equal to the
reduct B, is called a reduced system (where f1 is the restriction of the function f to the set
U×B). In other words, a reduced system S1 is constructed from the original system S
by removing the columns related to attributes not included in the reduct B.
[Example 4.8]
Consider the information system DIABET defined in Table 4.2, and suppose there are two
subsets of attributes, B1 = {c1, c2, c3} and B2 = {c3}.
U/IND(B1) = {X1, X2, X3, X4, X5} = {{p1, p2}, {p3, p4}, {p5, p6}, {p7, p8}, {p9, p10}}.
U/IND(B2) = {Y1, Y2, Y3} = {{p1, p2, p7, p8}, {p3, p4, p5, p6}, {p9, p10}}.
POS_{B_1}(B_2) = \bigcup_{Y \in U/IND(B_2)} \underline{B_1}Y = \{p_1, p_2, p_7, p_8\} \cup \{p_3, p_4, p_5, p_6\} \cup \{p_9, p_{10}\} = U
\gamma(B_1, B_2) = \frac{|POS_{B_1}(B_2)|}{|U|} = \frac{10}{10} = 1.0
Therefore it can be said that the set of attributes B2 is totally dependent on the set B1.
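The positive region and the degree of dependency in Example 4.8 can be sketched in Python as below. The two partitions are taken directly from the example; the function names `positive_region` and `dependency` are assumptions made for illustration.

```python
from fractions import Fraction

# Partitions stated in Example 4.8.
B1_CLASSES = [{"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"}]
B2_CLASSES = [{"p1", "p2", "p7", "p8"}, {"p3", "p4", "p5", "p6"}, {"p9", "p10"}]
UNIVERSE = set().union(*B1_CLASSES)


def positive_region(cond_classes, target_classes):
    """Union of the lower approximations of every target class."""
    pos = set()
    for target in target_classes:
        for cond in cond_classes:
            if cond <= target:  # the whole condition class is certainly in target
                pos |= cond
    return pos


def dependency(cond_classes, target_classes):
    """Degree to which the target attributes depend on the condition attributes."""
    pos = positive_region(cond_classes, target_classes)
    return Fraction(len(pos), len(UNIVERSE))


k_b2_on_b1 = dependency(B1_CLASSES, B2_CLASSES)   # B2 depends on B1
k_b1_on_b2 = dependency(B2_CLASSES, B1_CLASSES)   # B1 depends on B2
```

Swapping the roles of the two partitions shows that dependency is not symmetric: B2 depends totally on B1, while only the class {p9, p10} of U/IND(B2) fits inside a single class of U/IND(B1).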
For an information system, some attributes may be redundant with respect to a specific
classification U/IND(B) generated by a set of attributes B ⊆ A. This implies that an
information system may be overburdened by redundant information. The
classifiers defined for overburdened information systems may exhibit poor
generalization for new unseen objects. By virtue of the dependency properties of
attributes, one can find a reduced set of attributes, by removing superfluous
attributes, without a loss of classification power in the reduced information system.
This can lead to a substantial reduction of an information system, yielding the
optimal set of attributes that is sufficient for a robust classification with a higher degree
of generalization.
The set of all indispensable attributes in a set B ⊂ A is called a core of a set B in the
information system S and it is denoted by CORE(B). The core contains all attributes
that cannot be removed from the set B without losing the original classification.
CORE(B) = \{a \in B : IND(B - \{a\}) \neq IND(B)\} (4.32)
A set B ⊂ A is called orthogonal if all its attributes are indispensable. A proper subset
E ⊂ B is defined as a reduct set of B in S if E is orthogonal and preserves the
classification generated by B. The family of all reduct sets of B is denoted by RED(B),
and the core is the intersection of all reducts:
CORE(B) = \bigcap RED(B) (4.34)
[Example 4.9]
Consider the decision system DIABET and suppose there are two reducts B1 and B2 of the
set of condition attributes C = {c1, c2, c3} with respect to the decision attribute D =
{d}, as follows: B1 = {c1, c2}, B2 = {c2, c3}.
Thus the attribute c2, which belongs to both reducts, is the most significant attribute,
and B1 and B2 are the two subsets of attributes that discriminate the decision attribute.
By choosing the reduct B1, for example, the reduced decision table can be obtained by
simply removing the superfluous attribute c3, as shown in the following table. The
reduced decision table carries the same information as the original with respect to
classificatory power.
Object   Attributes
         C (medical diagnoses)    D (disease class)
U        c1        c2             d
p1       0         N              0
p2       0         N              0
p3       1         N              1
p4       1         N              0
p5       0         V              0
p6       0         V              0
p7       1         V              1
p8       1         V              1
p9       0         H              1
p10      0         H              0
Let S = < U, A, V, f > be an information system with n
objects, so that U = {x1, x2, …, xn}.
A discernibility matrix M(A) for an information system S with the set of
attributes A is an n × n square matrix, with rows and columns labeled by the
objects xi (i = 1, 2, …, n). Each entry mij of the discernibility matrix (for a given row i and
column j, representing two objects xi and xj from U) is the subset of attributes which
discerns these objects. Therefore, a discernibility matrix can be defined by
m_{ij} = \begin{cases} \emptyset, & \text{if } f(x_i, a) = f(x_j, a) \text{ for all } a \in A \\ \{a \in A : f(x_i, a) \neq f(x_j, a)\}, & \text{otherwise} \end{cases}
The entry mij contains all those attributes whose values are not identical for xi
and xj, which means that xi and xj belong to different classes of the partition generated by
IND(A). The discernibility matrix M(A) is symmetric and mii = φ, thus it is sufficient
to compute only the entries in the lower triangle of M(A), i.e., the mij with 1 ≤ j < i ≤ n.
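f_A(a_1^*, a_2^*, \ldots, a_m^*) = \bigwedge \{ \bigvee m_{ij}^* \mid 1 \le j < i \le n, \; m_{ij} \neq \emptyset \} (4.36)
where m_{ij}^* = \{a^* : a \in m_{ij}\} and each a^* is the Boolean variable corresponding to the attribute a.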
[Example 4.10]
Consider the information system DIABET shown in Table 4.2. The discernibility
matrix M(B) is obtained as follows (mii = φ, mij = mji for i, j = 1, …, 10):
p1 p2 p3 p4 p5 p6 p7 p8 p9 p10
p1 φ
p2 φ φ
p3 c1c3 c1c3 φ
p4 c1c3 c1c3 φ φ
p5 c2c3 c2c3 c1c2 c1c2 φ
p6 c2c3 c2c3 c1c2 c1c2 φ φ
p7 c1c2 c1c2 c2c3 c2c3 c1c3 c1c3 φ
p8 c1c2 c1c2 c2c3 c2c3 c1c3 c1c3 φ φ
p9 c2c3 c2c3 c1c2c3 c1c2c3 c2c3 c2c3 c1c2c3 c1c2c3 φ
p10 c2c3 c2c3 c1c2c3 c1c2c3 c2c3 c2c3 c1c2c3 c1c2c3 φ φ
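The lower triangle shown in Example 4.10 can be regenerated with the following Python sketch. One assumption is needed: the values of c3 are not visible in this part of the text, so the labels "a", "b" and "c" below are placeholders chosen only to reproduce the partition U/IND({c3}) = {{p1, p2, p7, p8}, {p3, p4, p5, p6}, {p9, p10}} given in Example 4.8. Since a discernibility matrix depends only on which values are equal, the placeholder labels do not affect the result. The values of c1 and c2 are those of the reduced decision table.

```python
ATTRS = ("c1", "c2", "c3")

# (c1, c2, c3) per object; the c3 labels are placeholders (see the lead-in).
ROWS = {
    "p1": (0, "N", "a"), "p2": (0, "N", "a"),
    "p3": (1, "N", "b"), "p4": (1, "N", "b"),
    "p5": (0, "V", "b"), "p6": (0, "V", "b"),
    "p7": (1, "V", "a"), "p8": (1, "V", "a"),
    "p9": (0, "H", "c"), "p10": (0, "H", "c"),
}


def discernibility_matrix(rows):
    """Lower triangle only: entry (xi, xj) lists the attributes that differ."""
    names = list(rows)
    matrix = {}
    for i, xi in enumerate(names):
        for xj in names[:i]:
            matrix[(xi, xj)] = {
                attr
                for attr, u, v in zip(ATTRS, rows[xi], rows[xj])
                if u != v
            }
    return matrix


M = discernibility_matrix(ROWS)
```

Only the 45 entries below the diagonal are stored, since the matrix is symmetric and the diagonal entries are empty.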
One of the important applications of rough sets is the generation of decision rules for a
given information system, for the classification of known objects or the prediction of
classes for new objects unseen during design. Using an original or a reduced decision
table, one can find rules that classify objects by determining the decision attribute
value based on the values of the condition attributes.
Let DS =<U,C ∪ D,V, f > be a decision table (decision system) with C as a set of
condition attributes and D as a set of decision attributes. A decision table DS can be
classified as follows:
For a deterministic decision table, unique decisions can be determined when some
conditions are satisfied (attributes taking certain values). Conversely, for a roughly
deterministic decision table, decisions are not uniquely determined by the conditions.
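Decision rules can be derived from a decision table DS. Let U/IND(C) = \{X_1, X_2, \ldots, X_k\}
and U/IND(D) = \{Y_1, Y_2, \ldots, Y_l\} be a C-definable and a D-definable classification of U.
A class Yj from the classification U/IND(D) can be identified with the decision j
(j = 1, 2, …, l). The set of decision rules rij for all D-definable sets Yj is
defined below:
\{r_{ij}\} = \{Des_C(X_i) \rightarrow Des_D(Y_j) : X_i \cap Y_j \neq \emptyset, \; X_i \in U/IND(C), \; Y_j \in U/IND(D)\} (4.37)
where Des_C(X_i) and Des_D(Y_j) are the descriptions of X_i and Y_j in terms of the values of
the condition and decision attributes.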
The decision rules rij are logically described as follows: IF (a set of conditions)
THEN (a set of decisions).
If DesC (Xi) uniquely implies DesD (Yj), then the rule rij is deterministic; otherwise rij
is non-deterministic. The set of decision rules for all classes Yj generated by a set of
decision attributes D (D-definable classes in S) is called a decision algorithm resulting
from the information system S.
[Example 4.11]
Consider the decision table in Table 4.2 with the decision attribute D = {d}, Vd = {0,
1}. The resulting partition is U/IND(D) = {Y1, Y2} = {{p3, p7, p8, p9}, {p1, p2, p4, p5,
p6, p10}}, with DesD (Y1) = (d = 1) and DesD (Y2) = (d = 0). If a reduct B = {c1, c2} of the
condition attributes is considered, the partition of the universe U corresponding to the
equivalence relation IND(B) can be determined as below.
U/IND(B) = {X1, X2, X3, X4, X5} = {{p1, p2}, {p3, p4}, {p5, p6}, {p7, p8}, {p9, p10}}.
DesB(X1) = (c1 = 0, c2 = N)
DesB(X2) = (c1 = 1, c2 = N)
DesB(X3) = (c1 = 0, c2 = V)
DesB(X4) = (c1 = 1, c2 = V)
DesB(X5) = (c1 = 0, c2 = H)
Firstly, decision rules for the class Y1 (d=1) can be designed as follows.
Next, the decision rules for the class Y2 (no disease; d=0) can be obtained as below
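The rules for both decision classes follow from the reduced decision table by applying the definition of a decision rule: a rule is produced for every pair of a condition class and a decision class that intersect, and it is deterministic when the condition class meets only one decision class. A minimal Python sketch, with illustrative names and the (c1, c2, d) values copied from the reduced table:

```python
from collections import defaultdict

# Reduced decision table: object -> (c1, c2, d).
TABLE = {
    "p1": (0, "N", 0), "p2": (0, "N", 0), "p3": (1, "N", 1), "p4": (1, "N", 0),
    "p5": (0, "V", 0), "p6": (0, "V", 0), "p7": (1, "V", 1), "p8": (1, "V", 1),
    "p9": (0, "H", 1), "p10": (0, "H", 0),
}


def decision_rules(table):
    """Return (condition, decision, is_deterministic) triples."""
    decisions_seen = defaultdict(set)
    for c1, c2, d in table.values():
        decisions_seen[(c1, c2)].add(d)
    rules = []
    for condition, decisions in decisions_seen.items():
        for d in sorted(decisions):
            rules.append((condition, d, len(decisions) == 1))
    return rules


RULES = decision_rules(TABLE)
for (c1, c2), d, deterministic in RULES:
    kind = "deterministic" if deterministic else "non-deterministic"
    print(f"IF c1 = {c1} AND c2 = {c2} THEN d = {d}  ({kind})")
```

On this table the condition classes {p3, p4} and {p9, p10} each contain both decisions, so they yield non-deterministic rules for d = 1 and for d = 0.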
Combinations of rough and fuzzy sets mainly fall into three categories:
fuzzy-rough sets, rough-fuzzy sets and rough-fuzzy hybridization. A rough-fuzzy set is
defined as an approximation of a fuzzy set in a crisp approximation space, while a
fuzzy-rough set is defined as an approximation of a crisp set in a fuzzy approximation
space. In general, an approximation can be interpreted in
three different settings: a family of rough sets, a family of rough-fuzzy sets, and a family
of fuzzy-rough sets. The approximation of a fuzzy set in a fuzzy approximation space
leads to a more general framework. Rough-fuzzy hybridization is an approach that
combines rough and fuzzy sets without using the concepts of fuzzy-rough set and
rough-fuzzy set. Rough-fuzzy hybridization is used in different areas such as feature
selection, dependency rule generation, etc.
Fuzzy equivalence relations are the generalization of crisp equivalence relations in
the fuzzy framework. Fuzzy equivalence relations have been widely studied as a way to
measure the degree of indistinguishability or similarity between the objects of a given
universe of discourse, and have been used in different contexts such as fuzzy control,
approximate reasoning, cluster analysis, etc. [35]. Other names are also used in
different contexts, such as similarity relations [36-38] or
indistinguishability operators [39-41].
Fuzzy equivalence classes are central to the fuzzy-rough set approach [9,42,43], just as
crisp equivalence classes are central to rough sets. An introduction to fuzzy set
theory can be found in appendix A. For typical Rough Set Attribute Reduction
applications, this means that the decision values and the conditional values may all be
fuzzy. The concept of crisp equivalence classes can be extended by the inclusion of a
fuzzy similarity relation R on the universe U, which determines the degree to which
two elements are similar in R. For example, if \mu_R(x, y) = 0.9, then objects x and y are
considered to be quite similar. The usual properties of reflexivity, \mu_R(x, x) = 1,
symmetry, \mu_R(x, y) = \mu_R(y, x), and transitivity, \mu_R(x, z) \ge \mu_R(x, y) \wedge \mu_R(y, z), hold.
Using the fuzzy similarity relation, the fuzzy equivalence class [x]R for objects close
to x can be defined:
\mu_{[x]_R}(y) = \mu_R(x, y) (4.38)
A fuzzy equivalence class F should satisfy the following axioms [44, 45]:
1. \exists x, \mu_F(x) = 1 (\mu_F is normalised)
2. \mu_F(x) \wedge \mu_R(x, y) \le \mu_F(y)
3. \mu_F(x) \wedge \mu_F(y) \le \mu_R(x, y)
The first axiom states that an equivalence class is nonempty. The second axiom
says that objects in the neighbourhood of y belong to the equivalence
class of y. The third axiom states that any two elements in F are related through R.
Thus this definition degenerates to the normal definition of equivalence classes when
R is not fuzzy.
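A small numerical check can make these properties concrete. The sketch below assumes that transitivity is meant in the max-min sense (the membership of (x, z) is at least the minimum of the memberships of (x, y) and (y, z)) and that the class of x is read off as row x of the relation. The matrices `R` and `NOT_TRANSITIVE` are made-up illustrations, not data from the thesis, and all function names are hypothetical.

```python
from itertools import product


def is_fuzzy_equivalence(r):
    """Check reflexivity, symmetry and max-min transitivity of a square matrix."""
    idx = range(len(r))
    reflexive = all(r[x][x] == 1.0 for x in idx)
    symmetric = all(r[x][y] == r[y][x] for x, y in product(idx, repeat=2))
    transitive = all(
        r[x][z] >= min(r[x][y], r[y][z]) for x, y, z in product(idx, repeat=3)
    )
    return reflexive and symmetric and transitive


def equivalence_class(r, x):
    """Membership of every object y in the fuzzy class of x: row x of r."""
    return list(r[x])


def satisfies_class_axioms(f, r):
    """The three axioms: normalised, closed under R, members related through R."""
    idx = range(len(r))
    normalised = any(f[x] == 1.0 for x in idx)
    closed = all(min(f[x], r[x][y]) <= f[y] for x, y in product(idx, repeat=2))
    related = all(min(f[x], f[y]) <= r[x][y] for x, y in product(idx, repeat=2))
    return normalised and closed and related


# Illustrative relations on three objects.
R = [[1.0, 0.9, 0.3], [0.9, 1.0, 0.3], [0.3, 0.3, 1.0]]
NOT_TRANSITIVE = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.8], [0.1, 0.8, 1.0]]
```

For the second matrix the pair (0, 2) has membership 0.1 although both (0, 1) and (1, 2) are at least 0.8, so the transitivity test fails.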
A fuzzy partitioning of the universe of discourse forms a family of normal fuzzy sets
that can play the role of fuzzy equivalence classes [9]. Consider the crisp partitioning
of a universe of discourse U by the attributes in A: U/A = {{1, 4, 6}, {2, 3, 5}}. This
contains two equivalence classes ({1, 4, 6} and {2, 3, 5}) that can be thought of as
degenerate fuzzy sets, in which each element belongs to a class with a membership
of one or zero. For the first class, for instance, the objects 2, 3 and 5 have a
membership of zero. Extending this concept to fuzzy equivalence classes
allows objects to assume membership values in the interval [0, 1] for any given
class. U/A is not restricted to crisp partitions only; fuzzy partitions are
equally acceptable [46, 47].
Fuzzy-rough sets, introduced by Dubois and Prade [9], originate from the work of Willaeys
and Malvache [47] on defining a fuzzy set with respect to a family of fuzzy sets. They
deal with the approximation of fuzzy sets in a fuzzy approximation space defined by
a fuzzy similarity relation or by a fuzzy partition. The results for fuzzy-rough
sets reviewed here are based on a fuzzy similarity relation. A fuzzy similarity relation S
is a fuzzy subset of U × U. The pair (U, S) is called a fuzzy approximation space. A
fuzzy similarity relation may be used to define a fuzzy partition of the universe,
denoted by U/S.
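For a fuzzy set F, its approximation in the fuzzy approximation space (U, S), i.e., the
fuzzy lower and upper approximations, is defined over the classes X_i of the fuzzy partition U/S as [48]
\mu_{\underline{S}F}(X_i) = \inf_{y \in U} \max\{\mu_F(y), 1 - \mu_{X_i}(y)\} (4.39)
\mu_{\overline{S}F}(X_i) = \sup_{y \in U} \min\{\mu_F(y), \mu_{X_i}(y)\}
Since the universe of discourse is finite, the inf and sup reduce to min and max. These
definitions deviate slightly from the crisp lower and upper approximations, because the
memberships of individual objects in the approximations are not explicitly available.
The pair can therefore be extended to a pair of fuzzy sets on the universe U, defined by
\mu_{\underline{S}F}(x) = \inf\{\max[\mu_F(y), 1 - \mu_S(x, y)] \mid y \in U\} (4.40)
\mu_{\overline{S}F}(x) = \sup\{\min[\mu_F(y), \mu_S(x, y)] \mid y \in U\}
The tuple \langle \underline{S}F, \overline{S}F \rangle is called a fuzzy-rough set, which is a pair of fuzzy sets on U/S.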
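The object-level approximations can be illustrated numerically. The following Python sketch assumes the common inf-max / sup-min form of the definition, in which the lower membership of x is the minimum over y of max(F(y), 1 - S(x, y)) and the upper membership is the maximum over y of min(F(y), S(x, y)). The relation `S` and the fuzzy set `F` are invented for the illustration, and the function names are hypothetical.

```python
def fuzzy_lower(s, f):
    """Lower approximation: min over y of max(F(y), 1 - S(x, y)), for each x."""
    n = len(f)
    return [min(max(f[y], 1 - s[x][y]) for y in range(n)) for x in range(n)]


def fuzzy_upper(s, f):
    """Upper approximation: max over y of min(F(y), S(x, y)), for each x."""
    n = len(f)
    return [max(min(f[y], s[x][y]) for y in range(n)) for x in range(n)]


# Illustrative fuzzy similarity relation on three objects and a fuzzy set F.
S = [[1.0, 0.9, 0.3], [0.9, 1.0, 0.3], [0.3, 0.3, 1.0]]
F = [0.8, 0.6, 0.1]

LOWER = fuzzy_lower(S, F)
UPPER = fuzzy_upper(S, F)
```

Because the universe is finite, `min` and `max` stand in for inf and sup. Under a reflexive relation the result brackets the original set: each lower membership is at most F(x), which is at most the upper membership.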
Many different definitions have been proposed by different researchers, using fuzzy sets
with different measures. Fuzzy-rough sets have been defined using a family of equivalence
relations induced by the different level sets of a fuzzy similarity relation [49]. Another
definition of fuzzy-rough sets is based on a fuzzification of the lower and upper
bounds of Iwinski rough sets [50, 51]. A similar definition was also used to model the
approximation of a fuzzy set based on a weak fuzzy partition, using measures of
fuzzy set inclusion [52, 53].
The review shows that the notions of rough-fuzzy set and fuzzy-rough set are
used with different meanings by different authors. The functional approaches define
the various notions clearly in mathematical terms. However, the physical meanings of these
notions are not clearly interpreted. In the rest of this chapter, this issue will be
addressed.
Let F1 and F2 be fuzzy sets on U. The properties of fuzzy-rough sets, based on the
properties of rough sets, are given below:
i. (4.41)
ii.
iii. ∩ ∩ ∪ ∪
∪ ⊇ ∩ ⊆ ∩
iv. ⊆ ⊇
Fuzzy-rough sets are monotonic with respect to set inclusion, as given below:
F_1 \subseteq F_2 \Rightarrow \underline{S}F_1 \subseteq \underline{S}F_2, \; \overline{S}F_1 \subseteq \overline{S}F_2 (4.42)
Fuzzy-rough sets are also monotonic with respect to the refinement of fuzzy similarity
relations. A fuzzy similarity relation S1 is a refinement of another fuzzy similarity
relation S2 if S1 ⊆ S2. This is a generalization of the refinement of
crisp relations. The monotonicity of fuzzy-rough sets with respect to the refinement of
the fuzzy similarity relation is given below:
S_1 \subseteq S_2 \Rightarrow \underline{S_2}F \subseteq \underline{S_1}F, \; \overline{S_2}F \supseteq \overline{S_1}F (4.43)
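Rough-fuzzy sets, as defined by Dubois and Prade, deal with the approximation of fuzzy
sets in a crisp approximation space [54]. Let S = <U, A, V, f> be an information system
in which a subset of attributes B ⊆ A determines the approximation space (U, IND (B)).
A fuzzy set F can be approximated in (U, IND (B)) by constructing the B-lower and
B-upper approximations of F, denoted by \underline{B}F and \overline{B}F respectively and defined as
follows:
\mu_{\underline{B}F}([x]_B) = \inf\{\mu_F(y) \mid y \in [x]_B\} (4.44)
\mu_{\overline{B}F}([x]_B) = \sup\{\mu_F(y) \mid y \in [x]_B\}
where [x]_B denotes the equivalence class of IND(B) containing x.
Since the universe of discourse is finite, the inf and sup reduce to min and max. Using the
extension principle, the pair can be extended to a pair of fuzzy sets on the universe U,
as defined below:
\mu_{\underline{B}F}(x) = \inf\{\mu_F(y) \mid y \in [x]_B\} (4.45)
\mu_{\overline{B}F}(x) = \sup\{\mu_F(y) \mid y \in [x]_B\}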
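This pair can be represented in another way by using the characteristic function
\mu_{IND(B)} of the equivalence relation in the lower and upper approximations:
\mu_{\underline{B}F}(x) = \inf\{\max[\mu_F(y), 1 - \mu_{IND(B)}(x, y)] \mid y \in U\} (4.46)
\mu_{\overline{B}F}(x) = \sup\{\min[\mu_F(y), \mu_{IND(B)}(x, y)] \mid y \in U\}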
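For a crisp partition these approximations amount to taking the smallest and the largest membership inside each equivalence class. The Python sketch below assumes that reading, reuses the crisp partition {{1, 4, 6}, {2, 3, 5}} mentioned earlier in the chapter, and uses an invented fuzzy set `F`; all names are illustrative.

```python
# Crisp partition used earlier in the chapter as an example.
PARTITION = [{1, 4, 6}, {2, 3, 5}]

# Illustrative membership values of a fuzzy set F on U = {1, ..., 6}.
F = {1: 0.9, 2: 0.2, 3: 0.4, 4: 0.7, 5: 0.0, 6: 1.0}


def rough_fuzzy_approximations(partition, f):
    """Per object: min (lower) and max (upper) of F over the object's class."""
    lower, upper = {}, {}
    for block in partition:
        smallest = min(f[y] for y in block)
        largest = max(f[y] for y in block)
        for x in block:
            lower[x] = smallest
            upper[x] = largest
    return lower, upper


LOWER, UPPER = rough_fuzzy_approximations(PARTITION, F)
```

Every object of a class receives the same pair of values, which reflects the fact that the crisp relation cannot distinguish objects inside a class.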
Let F and G be two fuzzy sets. The properties of rough-fuzzy sets are given below:
i. (4.47)
ii.
iii. ∩ ∩ ∪ ∪
∪ ⊇ ∩ ⊆ ∩
iv. ⊆ ⊇
v. ⊆ ⊇
vi. ⊆ ⊇
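Rough-fuzzy sets are monotonic with respect to fuzzy set inclusion, as given below:
F \subseteq G \Rightarrow \underline{B}F \subseteq \underline{B}G, \; \overline{B}F \subseteq \overline{B}G (4.48)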
⊆ ⇒ ⊆ ⊇ (4.49)
By comparing Equations (4.40) and (4.46), it can be concluded that rough-fuzzy sets
are special cases of fuzzy-rough sets as defined by Dubois and Prade [9]. The
approximation of a fuzzy set in a crisp approximation space is called a rough-fuzzy
set, to be consistent with the naming of rough set as the approximation of a crisp set in
a crisp approximation space. The approximation of a crisp set in a fuzzy
approximation space is called a fuzzy-rough set. Such a naming scheme has been used
by Klir and Yuan [55], and Yao [56]. Under this scheme, these two models are
complementary to each other, in the same way that rough sets and fuzzy sets are
complementary to each other.
Besides fuzzy-rough sets and rough-fuzzy sets, other interpretations are possible [57].
An alternative definition of fuzzy-rough set was given by utilizing the rough
membership function [58, 59]. One attempt at rough-fuzzy hybridization was
proposed in [60], where rough sets are expressed by a fuzzy membership function to
represent the negative, boundary and positive regions. Objects belonging to the
positive region have a membership value of one and those belonging to the boundary
region have a membership of 0.5. Objects with zero membership value belong to the
negative region, i.e., they do not belong to the rough set. Thus, by modifying the rough set
operators such as union and intersection, a rough set may be expressed as a fuzzy set.
The motivation for introducing fuzziness into rough sets is to handle the levels of roughness
in the boundary region by using fuzzy membership values. The membership
value of an object belonging to the boundary region may then lie anywhere in the range 0 to 1
instead of taking the crisp value 0.5. Therefore, for a set X in rough-fuzzy hybridization, a
rough set R and a crisp equivalence relation E, the membership function may be defined
as follows:
0 1
In another approach, objects belong to the rough set lower
approximation with certainty, whereas the boundary region is fuzzified and the
membership values of its objects are expressed in terms of a fuzzy membership function
[61].
A further approach addresses the problem that the fuzzy set representation of a
rough set may be too precise, in that a concept is described exactly once its
membership function has been defined [62]. The proposed solution employs an
approximation of a family of fuzzy sets, termed a shadowed set, that uses basic truth
values and a zone of uncertainty instead of exact membership values.
For a fuzzy set, a shadowed set may be induced by raising the membership
values that are close to 1 and lowering the membership values that are close to 0 until a
certain threshold value is attained. Any objects that do not belong to the set with a
membership of 1 or 0 are assigned the unit interval [0,1], considered to be a nonnumeric
model of membership grade. In fuzzy set theory, vagueness is distributed across the entire
universe of discourse, but in shadowed sets this vagueness is localized in the shadow
regions. As with fuzzy sets, the basic set operations (union, intersection and
complement) can be defined for shadowed sets, as well as shadowed relations.
The crisp positive region in traditional rough set theory is defined as the union of the
lower approximations. By the extension principle [63], the membership of an object x
∈ U in the fuzzy positive region can be defined by
\mu_{POS_R(Q)}(x) = \sup_{X \in U/Q} \mu_{\underline{R}X}(x) (4.50)
Object x will not belong to the positive region only if the equivalence class it belongs
to is not a constituent of the positive region. This is equivalent to the crisp version
where objects belong to the positive region only if their underlying equivalence class
does so. Similarly, the negative and boundary regions can be defined. Using the
definition of the fuzzy positive region, the new dependency function can be defined as
follows:
\gamma'_R(Q) = \frac{|\mu_{POS_R(Q)}(x)|}{|U|} = \frac{\sum_{x \in U} \mu_{POS_R(Q)}(x)}{|U|} (4.51)
As with crisp rough sets, the dependency of Q on R is the proportion of objects that
are discernible out of the entire dataset. In the present approach, this corresponds to
determining the fuzzy cardinality of \mu_{POS_R(Q)}(x) divided by the total number of
objects in the universe.
if
R = {a, b}, U/IND({a}) = {Na, Za} and U/IND({b}) = {Nb, Zb},
then
U/R ={Na ∩ Nb, Na ∩ Zb, Za ∩ Nb, Za ∩ Zb} (4.53)
The extent to which an object belongs to such an equivalence class is therefore
calculated by using the conjunction of constituent fuzzy equivalence classes, say Fi,
i = 1, 2, …, n
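The construction of U/R from the partitions induced by single attributes can be sketched as follows. The conjunction is taken here to be the minimum, which is a common choice but an assumption as far as this text is concerned; the membership values of Na, Za, Nb and Zb over a two-object universe are invented for the illustration, and the function name is hypothetical.

```python
from itertools import product

# Illustrative fuzzy equivalence classes over a universe of two objects.
U_IND_A = {"Na": [0.8, 0.3], "Za": [0.2, 0.7]}
U_IND_B = {"Nb": [0.6, 0.1], "Zb": [0.4, 0.9]}


def combine(*partitions):
    """Cartesian product of the partitions; memberships are combined with min."""
    combined = {}
    for combo in product(*[p.items() for p in partitions]):
        names = tuple(name for name, _ in combo)
        memberships = [m for _, m in combo]
        combined[names] = [min(values) for values in zip(*memberships)]
    return combined


U_R = combine(U_IND_A, U_IND_B)
for names, membership in U_R.items():
    print(" ∩ ".join(names), membership)
```

With two classes per attribute the product has four classes, matching the four intersections listed in (4.53). The number of classes grows multiplicatively with the number of attributes in R.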
4.5 Conclusion
The interest in the hybridization of fuzzy sets with rough sets is evidenced by the
number of publications in this area. The hybridization of these approaches has resulted
in methods which take advantage of the ability of rough sets to model vagueness and
that of fuzzy sets to model uncertainty. In this sense both approaches are
complementary. Furthermore, when hybridized as described in this section, no tunable
parameters are required and only the data is used.
In this chapter a number of theoretical and real-world application areas of rough set
theory, rough set extensions, and the combination of fuzzy and rough set theory have been
examined. Note that these examples are representative only and do not
demonstrate the whole spectrum of possible application areas. A large number of
applications and a considerable body of work have been published in the area.
There is much scope for further research in relation to the development of
fuzzy-rough sets. In particular there is much interest at present in the area of type-2
fuzzy sets [63]. However, their hybridization with rough sets has not been proposed as
yet. Additionally, there are a number of aspects of fuzzy measures with
application to fuzzy-rough sets which remain unexplored, and these may offer some
new and interesting future research areas.
Bibliography
[3] L. A. Zadeh, Fuzzy Sets, Information and Control, pp. 338-353, 1965.
[4] S. Chanas, D. Kuchta, Further remarks on the relation between rough and
fuzzy sets, Fuzzy sets & Systems, vol. 47, pp.391-394, 1992.
[5] Z. Pawlak, Rough sets and fuzzy sets, Fuzzy Sets & Systems, vol. 17, pp. 99-
102, 1985.
[6] M. Wygralak, Rough sets and fuzzy sets – some remarks on interrelations,
Fuzzy Sets & Systems, vol. 29, pp. 241-243, 1989.
[7] R. Biswas, On rough sets and fuzzy rough sets, Bulletin of the Polish
Academy of Sciences, Mathematics, vol. 42, pp. 345-349, 1994.
[8] D. Dubois and H. Prade, Rough fuzzy sets and fuzzy rough sets, International
Journal of General systems, vol. 17, pp.191-209, 1990.
[9] D. Dubois and H. Prade, Putting rough sets and fuzzy sets together, In:
Intelligent decision Support: Handbook of Applications and Advances of the
rough Sets Theory, R. Slowinski, Kluwer Academic Publishers, Boston, pp.
203- 222, 1992.
[14] Sankar K. Pal, Pabitra Mitra, Case Generation: A Rough-fuzzy Approach, In:
Proc. Intl. Conf. Case Based Reasoning (ICCBR2001), Vancouver, Canada,
2001.
[15] G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic, Theory and Applications,
Prentice Hall, 1995.
[16] Cited In H.-J. Zimmermann, Fuzzy Set Theory - And its Applications, 3rd Ed,
Kluwer Academic Publishers, 1997.
[18] T. Y. Lin and N. Cercone, Rough Sets and Data Mining – Analysis of
Imprecise Data, Kluwer Academic Publishers, 1997.
[25] J.W. Grzymała-Busse, Applications of the rule induction system LERS, In:
Polkowski and Skowron [43], pp. 366-375, 1998.
[28] A. Czajewski, Rough sets in optical character recognition, In: Polkowski and
Skowron, pp. 601-604, In: L. Polkowski, A. Skowron (Eds.), Rough Sets in
Knowledge Discovery 1: Methodology and Applications, Physica-Verlag,
Heidelberg, 1998.
[30] A. An, N. Shan, C. Chan, N. Cercone, W. Ziarko, Discovering rules from data
for water demand prediction, Proceedings of the Workshop on Machine
Learning in Engineering (IJCAI'95), Montreal, pp. 187-202, 1995; see also,
Journal of Intelligent Real-Time Automation, Engineering Applications of
Artificial Intelligence 9/6, pp. 645-654, 1995.
[34] Skowron A. and Rauszer C. 1992, The discernibility matrixes and functions in
information systems, In: R. Slowinski, (Ed) Intelligent Decision Support
Boston: Kluwer, pp.331-362.
[42] H. Thiele. Fuzzy rough sets versus rough fuzzy sets - an interpretation and a
comparative study using concepts of modal logics. Technical report no. CI-
30/98, University of Dortmund. 1998.
[43] Y.Y. Yao. A Comparative Study of Fuzzy Sets and Rough Sets. Information
Sciences, Vol. 109, No. 1-4, pp. 21–47. 1998.
[44] U. Höhle. Quotients with respect to similarity relations. Fuzzy Sets and
Systems, Vol. 27, No. 1, pp. 31–44. 1988.
[45] Jensen, R., Shen, Q., Aiding Fuzzy Rule Induction with Fuzzy Rough
Attribute Reduction,In Proceedings of the UK Workshop on Computational
Intelligence (UKCI-02) Birmingham, UK, 81-88, September 2002.
[47] D. Willaeys and N. Malvache, The use of fuzzy sets for the treatment of
fuzzy information by computer, Fuzzy Sets & Systems, vol. 5, pp.323-328,
1981.
[48] Y.Y. Yao, Combination of rough and fuzzy sets based on α-level sets, in:
Rough Sets and Data Mining: Analysis for Imprecise Data, Lin, T.Y. and
Cercone, N. (Eds.), Kluwer Academic Publishers, Boston, pp. 301-321, 1997.
[50] S. Nanda and S. Majumdar, Fuzzy rough sets, Fuzzy Sets & Systems, vol. 45,
pp. 157-160, 1992.
[52] R. Biswas, On rough sets and fuzzy rough sets, Bulletin of the Polish
Academy of Sciences, Mathematics, vol. 42, pp. 345-349, 1994.
[53] L. I. Kuncheva, Fuzzy rough sets: application to feature selection, Fuzzy Sets
and Systems, vol. 51, pp.147-153, 1992.
[54] D. Dubois and H. Prade, Rough fuzzy sets and fuzzy rough sets, International
Journal of General systems, vol. 17, pp.191-209, 1990.
[55] G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic, Theory and Applications,
Prentice Hall, 1995.
[56] Y. Y. Yao, On combining rough and fuzzy sets, Proceedings of the CSC’95
Workshop on Rough Sets and Database Mining, T. Y. Lin (Ed.), San Jose
State University, vol. 9, 9 pages, 1995.
[57] W. Pedrycz. Shadowed sets: bridging fuzzy and rough sets. In [123], pp. 179–
199. 1999.
[58] T. Beaubouef, F.E. Petry and G. Arora. Information Measures for Rough and
Fuzzy Sets and Application to Uncertainty in Relational Databases. IN: S.K.
Pal and A. Skowron (Eds.). Rough-Fuzzy Hybridization: A New Trend in
Decision Making. Springer Verlag, Singapore. 1999.
[59] Z. Pawlak. Rough Sets: Theoretical Aspects of Reasoning About Data. Kluwer
Academic Publishing, Dordrecht. 1991.
[60] M. Wygralak, Rough sets and fuzzy sets - some remarks on interrelations,
Fuzzy Sets and Systems, vol. 29, no. 2, pp. 241–243, 1989.
1, pp.329–334, 2006
[62] W. Pedrycz. Shadowed sets: bridging fuzzy and rough sets. In S.K. Pal and A.
Skowron (eds.) Rough-Fuzzy Hybridization, Springer Verlag, Singapore, pp.
179–199, 1999.
[63] L.A. Zadeh. The Concept of a Linguistic Variable and Its Application to
Approximate Reasoning-1, Information Sciences vol.8 pp. 199–249, 1975.
Rough‐Fuzzy Intelligent System
Chapter 5
5.1 Introduction
Rule generation techniques have been widely developed and used in data mining
to develop intelligent systems in many application areas [5], such as medical
diagnosis, decision-making, classification and prediction. Many inductive learning
methods, such as generation of decision trees [6], rule generation methods [7], and soft
computing tools for rule generation, namely neural networks [8], fuzzy systems [9], rough
set theory [13], genetic algorithms, etc., have been introduced and applied to extract
knowledge from databases. Every approach has some advantages and disadvantages. In order to
provide a more flexible and robust information processing system, using only one
approach is not enough. Hybridizations of soft computing methodologies for rule
generation have therefore been introduced [22-34].
1
This chapter is based on the published paper: J. Ghosh, S. Mukhopadhyay, 2011, Role of Certainty Factor
in Rough-Fuzzy Rule Generation, International Journal of Computer Science, Engineering and
Applications (IJCSEA) Vol.1, No.6, pp. 49-61.
The remaining sections of this chapter are arranged as follows. Section 5.2 discusses
Rough-Fuzzy rule generation. Section 5.3 presents the procedure for fuzzification of data,
i.e., the method of linguistic representation of patterns in fuzzy set theory. Section 5.4
describes the procedure of dependency rule generation using rough set theory. Section
5.5 presents the steps involved in the modified framework to compute dependency rules
with CF using rough-fuzzy hybridization. Section 5.6 describes the
medical domain, namely diabetes mellitus. Section 5.7 shows some results and a comparison
between existing algorithms and the new modified rough-fuzzy framework, and section
5.8 concludes the chapter.
Rule generation using rough set theory may be done by different approaches. One
approach is to first compute reducts [13], the minimal sets of attributes that preserve
the indiscernibility relation and, consequently, the set approximation of the
information system. An information system is a table in which each row represents an
event and each column represents an attribute that can be measured for each event and
acquired from the domain expert. In an information system, the indiscernibility
relation for a subset of the attribute set relates those events that have the same
value for every attribute of that subset. Rules are then generated by calculating
discernibility matrices [13]. If A is the attribute set of an information system with
n events, then the discernibility matrix of that information system is an n×n
symmetric matrix in which each entry is a set of attributes defined as
cij = {a ∈ A | a(xi) ≠ a(xj)} ∀ i, j = 1, 2, …, n, where the x's are events (5.1)
The other way is to generate decision rules directly using a decision matrix, which is
a generalized form of rough set theory [14]. The decision matrix may also be used to
compute reducts of the information system.
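As a minimal sketch of equation (5.1), the discernibility matrix of a small information system can be computed as follows. The function name, the dictionary representation of events and the toy attribute values are assumptions made for illustration and do not come from the thesis data.

```python
from typing import Dict, FrozenSet, List


def discernibility_matrix(events: List[Dict[str, str]]) -> List[List[FrozenSet[str]]]:
    """Entry [i][j] is the set of attributes on which events i and j differ (eq. 5.1)."""
    attributes = list(events[0].keys()) if events else []
    n = len(events)
    return [
        [frozenset(a for a in attributes if events[i][a] != events[j][a])
         for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy information system: three events, two attributes.
toy_events = [
    {"age": "high", "bmi": "low"},
    {"age": "high", "bmi": "high"},
    {"age": "low", "bmi": "high"},
]
toy_matrix = discernibility_matrix(toy_events)
```

The matrix is symmetric and its diagonal entries are empty, so in practice only the entries with i < j need to be stored.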
deleting the equal records and store all these MVs for future use in calculating the
CF. In step three we use rough set theory to generate dependency rules. Finally, in
step four we calculate the CF of each rule by considering the stored MVs for certain
rules, according to rough set theory, and both the stored MVs and the generated
possibility value for possible rules, according to rough set theory, and present the
CF in percentage form. This rule set may be used as the knowledge base of a rule-based
intelligent system as described in chapter 3. At the time of knowledge inferencing,
the CF of each rule plays an important role in finding the appropriate rule to be
fired. To generate rules and to test the system we use diabetes patients' data.
The table of data is fuzzyfied using fuzzy set theory [9, 11]. Fuzzy set theory
introduced the concept of the degree of membership of elements in a set. Previously
an element could belong to a set either fully (membership 1) or not at all
(membership 0). The degree of membership allows an element to lie in a set with a
membership value anywhere in the range [0, 1]. A fuzzy set can be defined as a set of
ordered pairs Ã = {(x, Ã(x)) | x ∈ X}. The function Ã(x) is called the membership
function for Ã, mapping each element of the base set X (universe) to a membership
degree in the range [0, 1]. The base set may be discrete or continuous.
For some applications, modeling requires continuously differentiable curves and
therefore smooth transitions, which is not possible using a triangular or trapezoidal
function. In those cases the normalized Gaussian function, the difference of two
sigmoidal functions, the generalized bell function, etc., are used [8].
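Two of the smooth membership functions named above can be sketched as follows. The parameter names (center, sigma, a, b) follow the common textbook forms of the Gaussian and generalized bell functions and are assumptions, since the thesis does not give the formulas in this section.

```python
import math


def gaussian_mf(x: float, center: float, sigma: float) -> float:
    """Normalized Gaussian membership: 1 at the center, decaying smoothly."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def generalized_bell_mf(x: float, a: float, b: float, center: float) -> float:
    """Generalized bell membership: a sets the width, b the steepness of the shoulders."""
    return 1.0 / (1.0 + abs((x - center) / a) ** (2.0 * b))
```

Both functions return values in (0, 1] and are differentiable everywhere, unlike the triangular function, which has kinks at its break points.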
Rough set theory [12, 13] is used to generate dependency rules from a table of data.
Let us discuss some basic concepts of rough sets which are used in this chapter.
$\underline{B}X = \{x \in U : [x]_B \subseteq X\}$ (5.4)
$\overline{B}X = \{x \in U : [x]_B \cap X \neq \varphi\}$ (5.5)
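The B-lower and B-upper approximations of equations (5.4) and (5.5) can be illustrated with a short sketch. The function name, the dictionary layout and the attribute "fbs" with its values are hypothetical and chosen only for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def approximations(
    universe: Dict[str, Dict[str, str]], attributes: List[str], target: Set[str]
) -> Tuple[Set[str], Set[str]]:
    """Return the (lower, upper) approximation of `target` with respect to `attributes`."""
    # Objects with equal values on all chosen attributes form one equivalence class [x]_B.
    classes: Dict[tuple, Set[str]] = defaultdict(set)
    for name, row in universe.items():
        classes[tuple(row[a] for a in attributes)].add(name)
    lower: Set[str] = set()
    upper: Set[str] = set()
    for block in classes.values():
        if block <= target:   # [x]_B is contained in X
            lower |= block
        if block & target:    # [x]_B intersects X
            upper |= block
    return lower, upper


# Hypothetical toy universe with a single condition attribute.
toy_universe = {
    "x1": {"fbs": "high"},
    "x2": {"fbs": "high"},
    "x3": {"fbs": "low"},
    "x4": {"fbs": "medium"},
}
toy_lower, toy_upper = approximations(toy_universe, ["fbs"], {"x1", "x4"})
```

Here x1 cannot be separated from x2, so x1 is only possibly in the target set: it appears in the upper approximation but not in the lower one.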
Rule generation using rough sets can be done by the following two methods:
5.4.1 Method 1
The main task in this method of rule generation, used in [34], is to find the reducts
relative to the information system S. Let there be k decision attributes in D, i.e.,
D = {d1, d2, …, dk}. Then divide the decision table into k tables Si = <Ui, Ai>, ∀
i = 1, 2, …, k, where U = ⋃ Ui (i = 1, 2, …, k) and Ai = C ∪ {di}.
Let us assume that there are n objects, i.e., U = {x1, x2, …, xn}, and m condition
attributes, i.e., C = {a1, a2, …, am}, in the information system S. Also assume that
Ui = {xi1, xi2, …, xip} is the set of objects that occur in Si, i = 1, 2, …, k. Now
construct the discernibility matrix Mdi(B) for each di-reduct B = {b1, b2, …, bl}
(say) from the di-discernibility matrix, defined as follows.
For each object xj ∈ {xi1, xi2, …, xip} the discernibility function, denoted by
fdi(xj), is defined as
Thus the rule ri: Ri → di is obtained and the dependency factor dfi is calculated as
df (5.9)
5.4.2 Method 2
This method uses the decision matrix [14] for rule generation. The decision matrix is
a generalization of rough set theory from which reducts and decision rules can be
calculated. In this method it is first checked whether the information system is
consistent. An information system is said to be consistent if there are no two
objects whose condition attributes are the same but whose decision attributes are
different. Conversely, an information system is inconsistent if there exist two
objects whose condition attributes are the same and whose decision attributes are
different. That means for any two objects i and j
Let the domain of discourse U of the information system S be divided into k classes
(c1, c2, …, ck) depending on the equivalence relation defined on D. For any class
cp ∈ (c1, c2, …, ck), the objects of U that belong to cp are numbered by subscripts i
(i = 1, 2, …, m) and those that do not belong to cp by subscripts j (j = 1, 2, …, n).
The decision matrix M of the information system S for the class cp is defined as an
m×n matrix whose elements are sets of {attribute-name, attribute-value} pairs.
$|B_i^p| = \bigwedge_{j} \bigvee M_{ij}$ (5.12)
The decision rule for the class cp ∈ (c1, c2, …, ck) is calculated as
$R_p = \bigvee |B_i^p|$ ∀ i = 1, 2, …, m (5.13)
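The construction of a decision matrix and of the per-object rule in (5.12) can be sketched as below. The sketch assumes the usual decision-matrix definition, in which entry M[i][j] holds the (attribute-name, attribute-value) pairs of object i that differ from object j; the function names and the toy attribute values are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

Pair = Tuple[str, str]
Rule = FrozenSet[FrozenSet[Pair]]  # conjunction (outer set) of disjunctions (inner sets)


def decision_matrix(
    inside: List[Dict[str, str]], outside: List[Dict[str, str]]
) -> List[List[FrozenSet[Pair]]]:
    """M[i][j]: pairs (a, a(x_i)) on which object i of the class differs from object j outside it."""
    return [
        [frozenset((a, xi[a]) for a in xi if xi[a] != xj[a]) for xj in outside]
        for xi in inside
    ]


def object_rule(row: List[FrozenSet[Pair]]) -> Rule:
    """Rule for one object: AND over all j of the OR of the pairs in M[i][j]."""
    return frozenset(row)


def satisfies(obj: Dict[str, str], rule: Rule) -> bool:
    """True if every clause of the rule has at least one pair matched by the object."""
    return all(any(obj.get(a) == v for a, v in clause) for clause in rule)


# Hypothetical toy data: one object in the class, two objects outside it.
toy_inside = [{"fbs": "high", "age": "old"}]
toy_outside = [{"fbs": "low", "age": "old"}, {"fbs": "high", "age": "young"}]
toy_rule = object_rule(decision_matrix(toy_inside, toy_outside)[0])
```

The class rule of (5.13) is then the disjunction of these object rules, so an unseen object is assigned to the class if it satisfies at least one of them.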
First find the B-lower and B-upper approximations for each class cp ∈ (c1, c2, …, ck).
Rules generated from the B-lower approximation are certain rules, and rules generated
from the B-upper approximation are possible rules.
Thus the certain decision rule for any object i (i = 1, 2, …, m) belonging to class
cp ∈ (c1, c2, …, ck) can be obtained as
$|B_i^p|_{certain} = \bigwedge_{j} \bigvee M_{ij}$ (5.14)
The certain decision rule for the class cp ∈ (c1, c2, …, ck) is calculated as
The possible decision rule for any object i (i = 1, 2, …, m) belonging to class
cp ∈ (c1, c2, …, ck) can be obtained as
$|B_i^p|_{possible} = \bigwedge_{j} \bigvee M_{ij}$ (5.16)
The possible decision rule for the class cp ∈ (c1, c2, …, ck) is calculated as
1 (5.18)
Where c ∈ (c1, c2… ck) and card(.) define the cardinality of the set.
Step-1: Read the data set from the database and find the attributes that have
continuous values. Then perform the fuzzyfication operation over the continuous
attributes by introducing linguistic variables such as low, medium and high, and
calculate the MV of each linguistic variable. The MV is calculated using the
triangular membership function described in section 5.3. According to the definition
of MV in section 5.3, the MV must lie in [0, 1]. We also assign MV 1, i.e. certain
membership, to the other attribute values. We discard those parameters which have an
MV of less than 0.25. Here we use a seven-parameter triangular function instead of
the regular three-parameter triangular function, defined as follows:
0 1, 7
1 2
2 3
, 1, 2, 3, 4, 5, 6, 7 3 4 (5.19)
4 5
5 6
6 7
Step-2: Find those records in the data set which have the same attribute values but
may have different MVs for those attribute values. Then calculate the total
membership value of each such record by summing the MVs of the attribute values in
the record. Keep only the record with the maximum total membership value and delete
the other records. In this way we obtain a modified data set with linguistic
variables in which each attribute value has some membership degree. In this modified
data set all attributes have discrete values. At this point we store all the MVs
corresponding to the attribute values in the data set.
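Step-2 can be sketched as follows, assuming each fuzzyfied record is held as a pair of tuples: the linguistic attribute values and their MVs. This representation, the function name and the sample values are illustrative assumptions.

```python
from typing import Dict, List, Tuple

Record = Tuple[Tuple[str, ...], Tuple[float, ...]]  # (attribute values, their MVs)


def keep_max_membership(records: List[Record]) -> Dict[Tuple[str, ...], Tuple[float, ...]]:
    """Among records with identical attribute values, keep the one with the largest total MV."""
    best: Dict[Tuple[str, ...], Tuple[float, ...]] = {}
    for values, mvs in records:
        if values not in best or sum(mvs) > sum(best[values]):
            best[values] = mvs
    return best


# Hypothetical fuzzyfied records: two share the same linguistic values.
toy_records = [
    (("high", "low"), (0.8, 0.6)),
    (("high", "low"), (0.9, 0.7)),
    (("low", "low"), (1.0, 0.5)),
]
toy_reduced = keep_max_membership(toy_records)
```

The returned dictionary doubles as the MV store mentioned at the end of Step-2, since it maps every surviving record to its membership values.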
Step-3: Now we apply rough set theory, as described in section 5.4, with some
modification to the modified data set constructed in step-2. Here we have used
Method-2 as described in section 5.4.2. We take the fuzzyfied data set as the
information system S. Next we check whether the information system S is consistent
using equation (5.10) of Method-2 in section 5.4.2. We then construct the decision
matrix as described in equation (5.11) of Method-2 in section 5.4.2, with a
modification: we add the MV of the attributes to the decision matrix as follows:
where a is the attribute name, a(i) is the attribute value and µ(i) is the MV of the
attribute value.
For rule generation, Method-2 described in section 5.4.2 generates one rule per
class. The resulting rule is very long and has the form
$|B_i^p| = \bigwedge_{j} \bigvee M_{ij}$ (5.21)
∑
5.22
where k is the number of attributes present in the ith rule and µij is the MV.
If, for any i, |Bip| already exists in the rule set R, then compare the CF values of
the existing rule and the new rule and store the rule with the maximum CF value in
rule set R.
First find the B-lower and B-upper approximations for each class cp ∈ (c1, c2, …, ck).
Rules generated from the B-lower approximation are certain rules, and rules generated
from the B-upper approximation are possible rules.
$|B_i^p|_{certain} = \bigwedge_{j} \bigvee M_{ij}$ (5.24)
∑
5.25
where k is the number of attributes present in the ith rule and µij is the MV.
If, for any i, |Bip|certain already exists in the rule set R, then compare the CF
values of the existing rule and the new rule and store the rule with the maximum CF
value in rule set R.
Then construct the possible minimum-length decision rule for any object i (i = 1, 2,
…, m) belonging to class cp ∈ (c1, c2, …, ck) by performing the operation described
in Method-2 of section 5.4.2 as
$|B_i^p|_{possible} = \bigwedge_{j} \bigvee M_{ij}$ (5.27)
For a possible rule, the belief function of the ith rule can be defined as follows
card Bc B
1 5.28
where cp ∈ (c1, c2, …, ck) and card(.) denotes the cardinality of the set, and then
∑
∗ 5.29
Where k is the number of attributes present in the ith rule and µij is the MV.
Rj= |Bip| possible ∀ i = 1, 2, …, m; cp ∈ (c1, c2… ck); j=number of distinct rule (5.30)
If, for any i, |Bip|possible already exists in the rule set R, then compare the CF
values of the existing rule and the new rule and store the rule with the maximum CF
value in rule set R.
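The bookkeeping of the rule set R can be sketched as below. The exact CF formulas (5.22), (5.25) and (5.29) are not legible in this copy, so the sketch assumes, purely for illustration, that the CF is the mean of the k stored MVs expressed as a percentage, scaled by the belief value for possible rules. The function names and rule strings are hypothetical.

```python
from typing import Dict, Sequence


def certainty_factor(mvs: Sequence[float], belief: float = 1.0) -> float:
    """Assumed CF: mean membership value of the rule's attributes, as a percentage.

    `belief` stays at 1.0 for certain rules and is below 1.0 for possible rules.
    """
    return 100.0 * belief * sum(mvs) / len(mvs)


def add_rule(rule_set: Dict[str, float], rule: str, cf: float) -> None:
    """Store the rule, keeping only the maximum CF when the same rule already exists."""
    if rule not in rule_set or cf > rule_set[rule]:
        rule_set[rule] = cf


# Hypothetical rule text; the same rule arrives three times with different CFs.
toy_rules: Dict[str, float] = {}
add_rule(toy_rules, "fbs=high AND age=old -> c1", certainty_factor([0.8, 0.6]))
add_rule(toy_rules, "fbs=high AND age=old -> c1", certainty_factor([0.5, 0.6]))
add_rule(toy_rules, "fbs=high AND age=old -> c1", certainty_factor([0.9, 0.9]))
```

Only the update policy (keep the rule with the maximum CF) is taken directly from the text; the CF formula itself should be replaced by the thesis equations where they are available.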
Hands are a target for several diabetes-related complications. Lundbaek K first
described hand stiffness in young diabetics in 1957, and this was subsequently
termed "diabetic hand syndrome", "limited joint mobility" (LJM) or
"cheiroarthropathy" (after the Greek word "cheiros" for hand). LJM was initially
described in type 1 patients with long-standing diabetes, but it is found in both
type 1 and type 2 diabetes mellitus and can be detected very early in the evolution
of the disease. The prevalence of LJM in type 1 diabetes quoted in several studies of
pediatric and adult diabetes clinics ranges from 9 to 58%.
non-diabetic subjects. It has been suggested that 25% of patients with Dupuytren’s
contractures have diabetes.
Patients with type 2 diabetes seem to make more bone or to develop hyperostosis. The
most common form of new bone formation in diabetes is diffuse idiopathic skeletal
hyperostosis (DISH), or ankylosing hyperostosis of the spine. DISH is more frequent
in males and its prevalence increases with age, affecting mainly subjects over the
age of 40 years.
Osteoarthritis is the most common rheumatic disease in the general population. It may
be asymptomatic, but severe involvement leads to pain, stiffness, and limitation of
motion in the affected joints, most commonly involving the knees, hips, and spine.
We have applied this framework to the above-mentioned medical data sets of diabetes
patients for rheumatological manifestations of Diabetes Mellitus such as
- Dupuytren's contracture (DUPY)
- Age (integer numbers)
We obtain the result by considering the best rule, i.e. the rule with the maximum CF
value, as well as by considering the maximum occurrence (vote) of a class among the
rules with the top three CF values and the top five CF values. We also compare the
result with the previously established rough-fuzzy framework, as shown in Table-5.1.
The dataset contains 100 instances and is presented in the form of two files, a .nam
file and a .dat file. The .dat file holds the data and the .nam file describes the
structure of the data. Here we used 5-fold data: only 20% of the data is used to
generate rules (cf. Appendix B) and the other 80% is used for testing.
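The three decision schemes described above (best rule, vote among the top three rules, vote among the top five rules) can be sketched as follows. The function name, the tie-breaking choice (the tied class whose rule has the highest CF wins) and the sample CF values are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Optional, Tuple


def classify(fired: List[Tuple[str, float]], top_k: int = 1) -> Optional[str]:
    """Pick the class with the most votes among the top_k fired rules ranked by CF.

    `fired` holds (class label, CF) pairs for the rules matched by one test record.
    """
    if not fired:
        return None
    ranked = sorted(fired, key=lambda rule: rule[1], reverse=True)[:top_k]
    votes = Counter(label for label, _ in ranked)
    best_count = max(votes.values())
    # Walk the ranking so that ties go to the class with the highest-CF rule.
    for label, _ in ranked:
        if votes[label] == best_count:
            return label
    return None


# Hypothetical fired rules for one test record.
toy_fired = [("c1", 92.0), ("c2", 88.0), ("c2", 85.0), ("c3", 60.0), ("c2", 55.0)]
```

With these sample values the best-rule scheme selects c1, while both voting schemes select c2, which shows how the schemes can disagree on the same record.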
5.8 Conclusion
Bibliography
[5] S. Mitra, Sankar K. Pal, and P. Mitra, “Data Mining in Soft Computing
Framework: a Survey”, IEEE Trans. on Neural Networks, vol. 13, no. 1, pp. 3–
14, 2002.
[8] S.T. Wang, “Fuzzy system and Fuzzy Neural Networks”, Shanghai Science
and Technology Press, 1998, Edition 1.
[14] Shi-tong Wang, Dong-jun Yu and Jing-yu Yang, “Integrating rough set theory
and fuzzy neural network to discover fuzzy rules”, Intelligent Data Analysis 7
(2003) 59–73
[16] Gang Xie , Fang Wang , Keming Xie, “RST-Based System Design of Hybrid
Intelligent Control”, IEEE International Conference on Systems, Man and
Cybernetics, 2004
[17] S. K. Pal and A. Skowron, Eds., “Rough Fuzzy Hybridization: A New Trend
in Decision Making” Singapore: Springer-Verlag, 1999
[18] Shusaku Tsumoto, “Mining diagnostic rules from clinical databases using
rough sets and medical diagnostic model”, Information Sciences 162 (2004)
65–80
[24] R. Rovatti and R. Guerrieri, “Fuzzy Sets of Rules for System Identification,”
IEEE Trans. Fuzzy Syst., vol. 4, pp. 89–102, 1996.
[26] L. Wang and J. Yen, “Extracting Fuzzy Rules for System Modeling Using a
Hybrid of Genetic Algorithms and Kalman Filter,” Fuzzy Sets Syst., vol. 101,
pp. 353–362, 1999.
[27] P. Mitra and S. K. Pal, “Staging of Cervical Cancer with Soft Computing”,
IEEE Trans. On Biomedical Engineering, vol. 47, no. 7, pp. 934–940, 2000.
[30] R. Yasdi, “Combining Rough Sets Learning and Neural Learning Method to
Deal with Uncertain and Imprecise Information”, Neurocomputing, vol. 7, pp.
61–84, 1995.
[32] R. R. Hashemi et al., “A Hybrid Intelligent System for Predicting Bank
Holding Structures”, European Journal of Operational Research, vol. 109, pp.
390–402, 1998.
[33] P. J. Lingras, “Rough Set Clustering for Web Mining,” Proceedings of 2002
IEEE International Conference on Fuzzy Systems, Hawaii, May 2002, pp. 12-
17.
Rough‐Fuzzy‐Neural Network Hybridization
Chapter 6
6.1 Introduction
The organization of this chapter is as follows. Section 6.2 is devoted to the
theoretical discussion of different types of neural network systems. Section 6.3
presents different types of neuro-fuzzy systems. Section 6.4 describes the
Neuro-Fuzzy Models, focusing on the process modules of the fuzzy multi-layer
perceptron. Section 6.5 presents the different proposed models that combine Rough,
Fuzzy and Neural Network techniques. Section 6.6 concludes the chapter by describing
the key findings and scope of rough-fuzzy neural network hybridization.
A neural network (NN), called an artificial neural network (ANN) in the domain of
artificial intelligence, is an interconnected collection of artificial neurons that
uses a mathematical or computational model for information processing based on a
connectionist approach to computation. The NN was first proposed by Warren McCulloch
and Walter Pitts in 1943 [3], based upon knowledge about the architecture of the human
brain, and they are generally regarded as the inventors of the NN model. This NN
model, named after its inventors, included a nonlinear activation function in the
neuron and a threshold value, so that the neuron only fires if the input value is
larger than the threshold value. The next NN model was the Hebb network, introduced
by Donald Hebb in 1949 [13].
The years between 1950 and 1960 are regarded as the golden age of NN. In this period,
with growing interest in NN development, a large number of models were introduced,
such as the Perceptron [14] and Adaline [15] architectures. Most of the NN models in
use today were actually invented in that time, mainly in the domain of psychology.
The development of NN was, however, not able to create any massive impact on the
understanding of the human brain or on real artificial intelligence. The main reasons
were the absence of optimization algorithms and the lack of real applications, which
generated disbelief in the capabilities of NN.
Widrow and Hoff [15] introduced the Adaline NN model in 1960 in the domain of
electrical engineering. A more mathematical approach to NN started in the 1970s, but
only gained success in the early 1980s. The early work came from Kohonen, Anderson
[17], Carpenter and Grossberg, who wrote many mathematical and biological papers on
the subject [18-25]. After 1980 most work was done on learning algorithms [26], new
network topologies [27] and the universal approximation properties of neural networks
[28, 29, 30, 31]. The Hopfield network [32] and the Boltzmann machine were invented.
A lot of work was done on combining neural networks with radial basis functions. The
number of applications is growing, as is the combination of fuzzy logic with NN, of
rough sets with NN, and hybridization with other soft computing techniques.
, , , ,…, ,
, , , ,…, ,
The inhibitory inputs prevent the neuron from firing, irrespective of the other inputs.
The transfer function of a single neuron , , with Wi the weight of
the neuron, can then be written as the logical relationship
1; , , 0
6.1
0;
with Ki the threshold value for the neuron and the output vector Y = [Y1, Y2, … Yn].
Initially all inputs have the same weight, so that the first term of the logical
relationship can be replaced by ∑ , 1 with / , thus all neurons use
the same nonlinear transfer function. Any inhibitive input can be given a large
negative weight -N with max 1 , such that the transfer function for a
neuron becomes
1; 1 1 1
6.2
0;
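The behaviour described for the McCulloch-Pitts neuron (equal excitatory weights, a threshold, and inhibitory inputs that veto firing) can be sketched as follows. The function signature is an assumption for illustration; it is not a transcription of equations (6.1) and (6.2).

```python
from typing import Sequence


def mcculloch_pitts(
    excitatory: Sequence[int], inhibitory: Sequence[int], threshold: int
) -> int:
    """Binary neuron: any active inhibitory input blocks firing; otherwise
    the neuron fires when the count of active excitatory inputs reaches the threshold."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0
```

With two excitatory inputs, a threshold of 2 realizes a logical AND and a threshold of 1 realizes a logical OR, which is the sense in which the transfer function is a logical relationship.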
The Hebb NN or Rosenblatt's Perceptron [13, 33, 34] model is a single layer of
neurons. The output Y = [Y1, Y2, …, YR] is modelled using R neurons. Every neuron
Yi = fH(I) has m+1 inputs, namely m inputs Ij plus a bias term Bi, and a single
output. The bias term is considered as an extra input that always equals one. The
inputs have different weights Wi,j. The input-output relationship is defined as
1; ,
6.3
0;
The output is a binary value, i.e., Yi ∈ {0, 1}. Not all the inputs need to be used.
In the equation given below, this can be enforced by setting some of the weights to
zero. In that situation, elimination of the threshold Ki is also possible. The
parameter vector
1 ; , , … , ,1 Θ
6.4
0 ;
The transfer function can be simplified further by putting the threshold value in the
bias vector, so that the inequality with respect to one becomes an inequality with
respect to zero. The parameter vector is normally optimized by the Hebbian
optimization function. Therefore, the parameters are updated with the parameter
update vector
ΔΘ , ,…, ,1 (6.5)
Hebb nets are basically used for the identification of digital values, as in pattern
recognition. To reduce drawbacks of the Hebb net such as its poor generalization
properties, the output is often allowed to become negative, so that the nonlinear
relationship becomes a sign(*) function instead of the larger-than relationship.
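A minimal sketch of a Hebb net with a sign output is given below. It assumes the classical Hebb rule, in which each weight grows by the product of its input and the target, together with bipolar (+1/-1) patterns; the garbled update vector (6.5) may differ in detail, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def train_hebb(samples: Sequence[Tuple[Sequence[int], int]]) -> Tuple[List[float], float]:
    """One pass of the classical Hebb rule: w_j += x_j * t and bias += t."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for inputs, target in samples:
        for j, x in enumerate(inputs):
            weights[j] += x * target
        bias += target
    return weights, bias


def predict(weights: Sequence[float], bias: float, inputs: Sequence[int]) -> int:
    """Sign-type output: +1 for a non-negative net input, otherwise -1."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if net >= 0 else -1


# Bipolar AND function used as the training set.
and_samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
hebb_weights, hebb_bias = train_hebb(and_samples)
```

A single pass over the four bipolar patterns is enough for the AND function; with binary 0/1 patterns the same rule fails, which is one reason the output is allowed to become negative.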
ADALINE (Adaptive Linear Neuron or Adaptive Linear Element) has only one layer of
neurons. The name Adaline was introduced by Professor Bernard Widrow and his graduate
student Ted Hoff at Stanford University in 1960 [15]. This neuron is a Hebb NN model
with a transfer function defined as
, 6.6
, 6.7
The main difference between the Hebb net and Adaline models is the transfer function
of the nonlinear part. The two are treated differently because of the way the
parameter vector is updated: Hebb nets are normally optimized by the Hebbian
optimization function, whereas the parameters of the Adaline model are optimized by
gradient methods with a least-squares (LS) cost function [35].
In the Adaline model the first derivative of the cost with respect to the number of
iterations is chosen as a stopping criterion. If it drops below a user-defined value,
the optimization is stopped and the cost function is assumed to be in a local minimum.
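The gradient optimization of an Adaline with an LS cost can be sketched with the Widrow-Hoff (LMS) update. The learning rate, the tolerance and the use of the change in cost between two epochs as an approximation of the derivative with respect to the iteration number are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple


def train_adaline(
    samples: Sequence[Tuple[Sequence[float], float]],
    rate: float = 0.05,
    tolerance: float = 1e-10,
    max_epochs: int = 10000,
) -> List[float]:
    """LMS training of a linear neuron; the last weight is the bias."""
    weights = [0.0] * (len(samples[0][0]) + 1)
    previous_cost = float("inf")
    for _ in range(max_epochs):
        cost = 0.0
        for inputs, target in samples:
            extended = list(inputs) + [1.0]  # bias input that always equals one
            error = target - sum(w * x for w, x in zip(weights, extended))
            weights = [w + rate * error * x for w, x in zip(weights, extended)]
            cost += 0.5 * error ** 2
        # Stop when the LS cost no longer changes noticeably between epochs.
        if abs(previous_cost - cost) < tolerance:
            break
        previous_cost = cost
    return weights


# Hypothetical noise-free data generated by y = 2*x1 - x2 + 0.5.
linear_samples = [((0.0, 0.0), 0.5), ((1.0, 0.0), 2.5), ((0.0, 1.0), -0.5),
                  ((1.0, 1.0), 1.5), ((2.0, 1.0), 3.5)]
adaline_weights = train_adaline(linear_samples)
```

Because the output is linear in the parameters, the LS cost has a single minimum here; the stopping rule matters more when the same scheme is applied to networks with nonlinear layers.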
The Madaline model is an extension of the Adaline model. When Adaline neurons are
placed in a multilayer architecture, the NN model is called a Madaline network.
The Perceptron NN model [14] is the most used and popular NN architecture nowadays.
Originally the Perceptron model employed three layers of neurons, namely the sensory
units, the associator units and a response unit. Later, Perceptron models with only
two layers were found to have universal approximation capabilities. Using only two
layers has the advantage of simpler mathematical calculation as well as an easier
optimization method.
The nonlinear transfer function may be binary, as in the McCulloch-Pitts and Hebb
networks. In gradient optimization methods, choosing a transfer function with finite
higher-order derivatives makes the optimization of the parameters easier. A
Perceptron employs a bias term and weights, and its transfer function is defined as
, , 6.8
With , ,…, the output vector, Wi,j the weights and Wi, m+1 the bias term.
The parameter vector involves all weights and biases. Many nonlinear activation
functions exist, such as the sigmoid function defined below
$\sigma(x) = \frac{1}{1 + e^{-x}}$ (6.9)
$\sigma(x) = \tanh(x)$ (6.10)
for the output when allowed to be negative. Other transfer functions that are also
used regularly are the hard-limit functions σ(x) = sign(x) and σ(x) = (x ≥ 0), and
RBF functions. When the Perceptron is used in a multilayer configuration, the outputs
of one layer are used as the inputs for the next layer. A three-layer Perceptron is
defined as
with
the Wi are weight matrices, n1 and n2 are the numbers of neurons in the first and
second layer respectively, and I = [I1, I2, …, Im] is the input vector. The bias
vectors are
Θ : ; : ; : ; : ; : ; : (6.13)
The simple one-layer Perceptron is also used for the identification of digital
values, e.g. for pattern recognition. Multilayer Perceptron (MLP) models have been
used to model a large number of nonlinear domains, typically with a soft nonlinear
relationship between inputs and outputs. Many researchers have used the MLP for its
strong modeling capabilities.
The Radial Basis Function (RBF) neural network is similar to the two-layer MLP. RBF
networks are currently widely used and studied by many researchers. The RBF network
is usually configured in a two-layer structure. Every node in the hidden layer has a
center ci, which is compared with the input I by a norm ||I – ci||, and a width ρi.
The output layer is a linear combination of all hidden neurons. The transfer function
of each neuron in the hidden layer may take different forms. Most commonly it is
similar to the Gaussian function
‖ ‖
, ‖ ‖, 6.14
2
, ‖ ‖, ‖ ‖ log‖ ‖ 6.15
, ‖ ‖, ‖ ‖ 6.16
‖ ‖, ‖ ‖ /
, 6.17
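A two-layer RBF network with Gaussian hidden neurons can be sketched as follows. The Gaussian form exp(-||I - ci||^2 / (2 ρi^2)) is the common textbook choice and is an assumption here, since equation (6.14) is not legible in this copy; the function names are hypothetical.

```python
import math
from typing import Sequence


def gaussian_rbf(inputs: Sequence[float], center: Sequence[float], width: float) -> float:
    """Hidden-neuron response based on the squared distance between input and center."""
    dist_sq = sum((x - c) ** 2 for x, c in zip(inputs, center))
    return math.exp(-dist_sq / (2.0 * width ** 2))


def rbf_output(
    inputs: Sequence[float],
    centers: Sequence[Sequence[float]],
    widths: Sequence[float],
    weights: Sequence[float],
) -> float:
    """Output layer: linear combination of all hidden-neuron responses."""
    return sum(
        w * gaussian_rbf(inputs, c, rho) for w, c, rho in zip(weights, centers, widths)
    )
```

An input that coincides with a center activates that hidden neuron fully, while distant centers contribute almost nothing, so each hidden neuron models a local region of the input space.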
The combination of the neurons is shown in FIGURE 6.5. RBF input-output relation
is as follows
(6.18)
with (x) = x and (x) = fRBF(x), has the same form as the MLP model (6.11). The
approximation capabilities of RBF networks are similar to those of more general MLP
networks, irrespective of the choice of the transfer function fRBF. This suggests
that the selection of the nonlinearity function is not critical for the performance.
It has a severe effect, however, on the number of local minima when the cost function
is optimized
All of the above NN models may be considered feedforward networks, since the neurons
are joined in a layered or grouped manner and no cyclic paths exist in the network
topology. In the Recurrent NN model, by contrast, each neuron is coupled to some or
all other neurons. The result is a recursive structure that may represent a form of
storage. The main drawback of this NN is the large number of interconnection weights.
The Maxnet is a fully connected network with each node connecting to every other
node, including itself. The basic idea is that the nodes compete against each other
by sending out inhibiting signals to each other. The Mexican Hat variant of Maxnet
only allows interaction with other neurons that lie in a neighborhood. Moreover,
neurons that lie in the symmetric region around a particular neuron get the same
weight with a different sign. Hamming nets incorporate both techniques, so that
neurons also base their output on the neurons that lie close by and have the largest
output. The optimization of the net is done with an LS cost function.
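The competition in a Maxnet can be sketched as below. The sketch assumes the standard formulation with a self-connection weight of 1, a mutual inhibition weight of -epsilon (with epsilon smaller than one over the number of nodes) and a ramp activation; these values and the function name are assumptions for illustration.

```python
from typing import List, Sequence


def maxnet(
    activations: Sequence[float], epsilon: float = 0.2, max_iterations: int = 100
) -> List[float]:
    """Iterate mutual inhibition until at most one node keeps a positive activation."""
    current = list(activations)
    for _ in range(max_iterations):
        if sum(1 for a in current if a > 0.0) <= 1:
            break
        total = sum(current)
        # Each node keeps its own activation and is inhibited by the sum of all others.
        current = [max(0.0, a - epsilon * (total - a)) for a in current]
    return current


maxnet_result = maxnet([0.2, 0.4, 0.6, 0.8])
maxnet_winner = maxnet_result.index(max(maxnet_result))
```

The node that starts with the largest activation is the only one left active, which is the winner-take-all behaviour the competing inhibitory signals are meant to produce.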
ART maps look like Kohonen networks, but here the user can control the degree of
similarity of the inputs that are associated with the same cluster of neurons. ART1
was introduced for digital data, whereas ART2 can accept continuous data. ART maps
were developed with their own optimization methods, which depend on a designated cost
function.
The neocognitron net was proposed for handwritten character recognition. It has nine
layers made up of a large number of neuron arrays. The interconnections between the
arrays are sparse; every array in one layer is connected to a specified number of
arrays of the previous layer only.
Many more NN models were introduced in the past. They differ mainly in a few
characteristics, defined below
Some NN models are distinguished only by the optimization method used (e.g. Backprop
NN, Boltzmann and Cauchy machines), or by the specific cost function used (e.g.
Hamming nets).
Neuro-fuzzy systems are soft computing methods for the design of intelligent systems
that combine, in various ways, concepts from neural networks [37, 38] and fuzzy logic
[1, 39, 40]. The combination of these two techniques into an integrated system leads
toward the development of intelligent systems capable of acquiring qualities that
characterize the human brain. However, fuzzy logic and neural networks usually
approach the design of intelligent systems from different angles.
Neuro-fuzzy systems make it possible to bring the low-level learning and
computational power of neural networks into fuzzy systems, and the high-level,
humanlike IF-THEN thinking and reasoning of fuzzy systems into neural networks. The
neural network thereby gains transparency, which is pursued and obtained either by
pre-structuring the neural network to improve its performance or by a possible
interpretation of the weight matrix in the learning stage. On the other hand, the
automatic tuning of the parameters that characterize a fuzzy system can largely draw
motivation from similar methods used in the connectionist community. In short, neural
networks may improve their transparency and so come closer to fuzzy systems, while
fuzzy systems can self-adapt, which brings them closer to neural networks [41].
Researchers in various scientific and engineering areas have shown growing interest
in neural fuzzy systems [41, 42]. Interest is increasing in particular in the area of
pattern recognition with hybrid neuro-fuzzy systems [43, 44, 45].
Several ways to combine neural networks and fuzzy logic have been proposed. Efforts
at combining these two techniques may be grouped into three main categories: neural
fuzzy systems, fuzzy neural networks and fuzzy-neural hybrid systems.
Neural fuzzy systems are characterized by the use of neural networks within fuzzy
systems to provide an automatic tuning method without disturbing their functionality.
One example of this type of system is the use of neural networks for membership
function elicitation and for the mapping between fuzzy sets that are utilized as
fuzzy rules. This kind of hybridization is mostly found in control applications. This
approach is used in [46-52].
Consider an example in which the neural network simulates the processing of a fuzzy
system: the neurons of the first layer are responsible for the fuzzification process,
and the neurons of the second layer represent the linguistic terms that are used in
the fuzzy rules in the third layer. Finally, the neurons of the third layer are
responsible for the defuzzification process. This is an example of the mapping of a
neural network to a fuzzy logic system. At the time of training, the neural network
adjusts its weights in order to minimize the mean square error between the output of
the network and the desired output. In this example, the weights of the neural
network represent the parameters of the fuzzification function, the linguistic terms
with their membership functions, the fuzzy rule certainty factors and the
defuzzification function respectively. Therefore, the
In this approach fuzzy logic is used to fuzzify some of the elements of neural
networks. In this way a crisp neuron may become a fuzzy neuron. Since fuzzy neural
networks are basically neural networks, they are mostly used in pattern recognition
applications. This system is used in [44, 53-59]. In [41] a neural network composed
of fuzzy neurons is presented.
In these fuzzy neurons the input values are crisp, but the weighting operations are
replaced by membership functions, so that the result of each weighting operation is
the membership value of the input in the fuzzy set. The aggregation operations may be
min, max or any other t-norms and t-conorms [41].
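A fuzzy neuron of the kind described above can be sketched as follows: each weighting operation is a membership function applied to a crisp input, and the membership values are aggregated by a t-norm (min) or a t-conorm (max). The triangular membership function and its break points are hypothetical choices for the example.

```python
from typing import Callable, Sequence


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Three-parameter triangular membership function."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_neuron(
    inputs: Sequence[float],
    memberships: Sequence[Callable[[float], float]],
    aggregate: Callable[[Sequence[float]], float] = min,
) -> float:
    """Map each crisp input through its membership function, then aggregate."""
    degrees = [mf(x) for x, mf in zip(inputs, memberships)]
    return aggregate(degrees)


# Hypothetical membership functions standing in for the two connection weights.
toy_memberships = [
    lambda x: triangular(x, 0.0, 5.0, 10.0),
    lambda x: triangular(x, 0.0, 2.0, 4.0),
]
```

Passing `aggregate=max` turns the same neuron from an AND-like unit into an OR-like unit without changing its membership functions.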
In this approach, fuzzy and neural network techniques are both used, independently of
each other, to construct a hybrid system. Each one performs its own task, serving a
different function in the system, and the two complement each other in order to
achieve a common goal. This kind of merging is application-oriented and suitable for
both control and pattern recognition applications [41].
Several neural network models have been discussed in section 6.2 of this thesis.
Among them the multi-layer perceptron (MLP) [37, 38] using the backpropagation
learning mechanism [26] is the most prevalent neural network discussed in the
literature [26]. The study in this section therefore focuses on the fuzzy MLP.
The MLP has been used in a wide range of applications such as pattern recognition
[60, 61, 62], medicine [63, 64], forecasting [65] and so on. Basically the MLP is a
feedforward multi-layer network that uses a supervised learning mechanism for error
correction. That is, it employs a mechanism to modify the weights of the network in
order to minimize the mean squared error between the actual and desired outputs of
the neural network [66].
The fuzzy multi-layer perceptron is an application of fuzzy sets in an MLP network
[54, 67]. In other words, fuzzy sets are used directly to perform the fuzzification
operation at the network level, the learning level or the network-learning level of
the MLP. Much research has been done on combining fuzzy sets and the MLP, such as in
[48, 49, 50, 54, 67, 68, 69].
Fuzzy desired output: Fuzzy concepts are used in calculating the desired output
of the input patterns that are presented to the MLP neural model during the learning
process and the recalling phase;
Degree of ambiguity: A parameter that handles the degree of ambiguity
of the input pattern has been added to the weight update equation.
This fuzzy MLP model [56] is a modified version of the model in [54].
The perceptron and other single-layer networks are limited in their capabilities
[70]. Feedforward multi-layer networks, such as the MLP with nonlinear activation
functions, can overcome these limitations and have been employed in a wide range of
applications [60, 61, 63, 64, 65]. The learning mechanism usually used in this
network is called backpropagation learning [26]. However, it was not much used by
researchers until the 1980s, when it was independently rediscovered by Parker [71],
LeCun [72] and, in its most popular version, by Rumelhart, Hinton and Williams [66].
[Figure: multi-layer perceptron with an input layer, Hidden Layer I, Hidden Layer II and an output layer]
Feedforward step: In this phase, the input pattern is fed to the input layer
neurons, which pass it on to the first hidden layer neurons. The hidden layer
neurons compute a weighted sum of their inputs, pass the sum through their
activation function and send the result to the next hidden layer neurons. This
process is repeated until the result is sent to the output layer neurons of the
network, where the actual output of the neural network is generated.
Backward step: Once the output of the network is generated, the mean squared
error between the desired and actual outputs of the network is calculated.
Subsequently, this error is back-propagated and is used as the basis for
modifying the weights of the network [37, 66].
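The two steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in the thesis: the sigmoid activation, the plain gradient-descent update without momentum, the learning rate `eta` and the helper names `forward`, `backward`, `make_layers` and `mse` are all assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, layers):
    """Feedforward step: return the activations of every layer, input first."""
    activations = [x]
    for weights, biases in layers:
        prev = activations[-1]
        activations.append([
            sigmoid(sum(w * a for w, a in zip(row, prev)) + b)
            for row, b in zip(weights, biases)
        ])
    return activations


def backward(activations, target, layers, eta=0.5):
    """Backward step: back-propagate the squared error and update the weights in place."""
    out = activations[-1]
    # Error term of the output layer for the error 0.5 * sum((o - t)^2).
    delta = [(o - t) * o * (1.0 - o) for o, t in zip(out, target)]
    for idx in range(len(layers) - 1, -1, -1):
        weights, biases = layers[idx]
        prev = activations[idx]
        below = None
        if idx > 0:
            # Error term of the layer below, using the weights before the update.
            below = [
                sum(weights[j][i] * delta[j] for j in range(len(delta)))
                * prev[i] * (1.0 - prev[i])
                for i in range(len(prev))
            ]
        for j, d in enumerate(delta):
            for i, a in enumerate(prev):
                weights[j][i] -= eta * d * a
            biases[j] -= eta * d
        delta = below


def make_layers(sizes, seed=0):
    """Random initial weights for a network with the given layer sizes."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def mse(layers, data):
    """Mean squared error of the network over a list of (input, target) pairs."""
    return sum(
        sum((o - t) ** 2 for o, t in zip(forward(x, layers)[-1], t_vec)) / len(t_vec)
        for x, t_vec in data
    ) / len(data)
```

Calling `backward(forward(x, layers), target, layers)` once per training pattern and repeating over the data set gives one simple training loop; the error returned by `mse` can be monitored to decide when to stop.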
Fundamentally, the combination of Fuzzy set and MLP can be achieved in three
different ways, which are network-level fuzzification, learning-level fuzzification and
network-learning-level fuzzification, as follows:
In this approach the fuzzy systems are optimized. Usually, fuzzy rules are
acquired from human experts or operators in the domain according to their
domain knowledge or experience. When modeling a fuzzy system, knowledge
acquisition is a very difficult task. Due to the ambiguity, uncertainty or
complexity of the system being identified, it is often too difficult, and
sometimes impossible, for a human expert to produce the desired fuzzy
rules or membership values. It therefore becomes necessary to generate fuzzy
rules by some learning technique. Fuzzy rules can be generated by building
an optimal fuzzy system model with the backpropagation algorithm. This
approach has been widely used and can be found in [46-52, 73].
Usually, in the conventional MLP, the number of neurons in the output layer
corresponds to the number of pattern classes occurring in the experimental data set.
In this type of neural network, the winner-takes-all method is used during the
learning and recalling phases. In the winner-takes-all method, 1 is assigned to
the winning element and 0 to the other elements. In the recalling phase, the
winner-takes-all method is used to define the winning neuron, that is, the neuron
that represents the network's prediction of the class to which the input pattern
belongs, by assigning 1 to this neuron and 0 to the other neurons.
In the learning process, the winner-takes-all method is applied to define the desired
output vector. In the desired output vector, the class to which the input pattern
belongs is assigned the value 1 and the other classes are assigned the value 0. This
is called a crisp desired output. The mean squared error is then calculated by
comparing the actual and desired outputs, and it is back-propagated to modify the
weights of the neural network.
In real world problems the data are usually ill-defined, with overlapping or fuzzy class
boundaries. Some patterns may have nonzero membership value in two or more
pattern classes. In the Pattern Recognition field, it is very common that an input
pattern has a degree of similarity to more than one class.
In the conventional MLP, this multiple similarity is not considered since crisp desired
output is assigned and used during training process and recalling phase. Therefore, to
consider the membership values of every class, it would be promising to include fuzzy
concepts in the calculation of the desired output [54, 56]. Unlike the conventional
MLP, the fuzzy MLP can employ desired membership values to the output neurons
during the training process instead of choosing crisp values as in a winner-takes-all
method [54]. Subsequently, the errors are back-propagated with respect to the exact
similarity which is found in the fuzzy desired output.
Thus the fuzzy MLP model with fuzzy desired output is able to classify fuzzy data
with overlapping or fuzzy class boundaries more efficiently. In addition, the fuzzy
MLP can be applied to any problem where the conventional MLP is used.
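The difference between a crisp and a fuzzy desired output can be illustrated with a short Python sketch. The crisp target follows the winner-takes-all description above. The fuzzy target uses an inverse-distance membership to hypothetical class means, in the style of fuzzy c-means; this membership rule, the parameter `fuzziness` and both function names are assumptions made for the example and are not the formulas of [54] or [56].

```python
import math


def crisp_desired_output(class_index, n_classes):
    """Winner-takes-all target: 1 for the pattern's class, 0 elsewhere."""
    return [1.0 if k == class_index else 0.0 for k in range(n_classes)]


def fuzzy_desired_output(pattern, class_means, fuzziness=2.0):
    """Membership of `pattern` in every class, from inverse distances to class means.

    Hypothetical membership rule chosen for illustration only.
    """
    distances = [math.dist(pattern, mean) for mean in class_means]
    zeros = distances.count(0.0)
    if zeros:
        # The pattern coincides with one or more class means.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in distances]
    power = 2.0 / (fuzziness - 1.0)
    inverse = [d ** -power for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]
```

A pattern that lies halfway between two class means receives the target (0.5, 0.5) instead of being forced into a single class, which is the situation the fuzzy desired output is meant to capture.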
The process of learning knowledge in the conventional MLP network is done through
the minimization of least mean square error in output vectors and it is known as a
gradient descent method. The minimization of least mean square error is achieved
through the update of the weights of the neural network. To calculate the new weights
of the network, a momentum updating equation is used, defined as follows [37]:
1 1 Δ (6.19)
Where
1 : The new weight of the connection from the i-th neuron to the j-th neuron.
: The old weight of the connection from the i-th neuron to the j-th neuron.
: The parameter that is calculated according to the LMS error and the layer in which
the neuron is found.
: Momentum parameter.
In the weight update process defined in equation 6.19, every training pattern has
the same importance in adjusting the weights of the neural network. However, the
patterns that lie in overlapping areas are mainly responsible for misclassification.
The momentum equation does not take this factor into account, so ambiguous and
unambiguous training patterns have the same influence in updating the weights.
One approach to improve the performance of the algorithm was proposed in [56] by
considering the amount of correction in the weight vector produced by the input
pattern. The amount of correction is defined by the degree of ambiguity of a pattern:
the more ambiguous an input pattern, the smaller the correction in the weights. With
this approach, ambiguous patterns have less influence on the weight updating process
than unambiguous patterns.
(6.20)
Where
To update the weights of the neural network, the degree of ambiguity is employed as a
parameter in the weight updating equation, along with the learning rate and the
output of the neuron, as defined below:
1 1 1 Δ (6.21)
Thus by this approach [56] the problem of ambiguous patterns with much influence in
the weight updating process was avoided. The more ambiguous a training pattern, the
less its influence on the weight updating equation.
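The idea that more ambiguous patterns should produce smaller corrections can be sketched as follows. This is not equation 6.20 or 6.21 of this chapter, nor the exact rule of [56]: the ambiguity measure (ratio of the two largest class memberships), the damping factor `(1 - amb)` and the function names are assumptions chosen only to reproduce the qualitative behaviour described in the text.

```python
def ambiguity(memberships):
    """Hypothetical ambiguity in [0, 1]: 1 when the two largest class
    memberships are equal, 0 when the pattern belongs to one class only."""
    top, second = sorted(memberships, reverse=True)[:2]
    return 0.0 if top == 0.0 else second / top


def update_weight(w, delta_j, o_i, prev_change, eta=0.1, alpha=0.9, amb=0.0):
    """One momentum update whose gradient term is damped by (1 - amb).

    Returns the new weight and the change applied, which becomes
    `prev_change` at the next step.
    """
    change = eta * (1.0 - amb) * delta_j * o_i + alpha * prev_change
    return w + change, change
```

With `amb = 0` the function reduces to an ordinary momentum update, while a fully ambiguous pattern (`amb = 1`) contributes nothing beyond the momentum term.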
Solving a real-world problem requires processing data, and in almost every case the
data involve vagueness, uncertainty and imprecision. Many researchers have developed
different kinds of approaches to deal with this, such as fuzzy systems [1, 76-79],
neural networks [80, 81] and rough set theory [2, 82-84]. Researchers have also
indicated that each approach has its own advantages and disadvantages. It can be
concluded that a single approach is not enough to provide a more flexible and robust
information processing system.
Many researchers have already tried to integrate different computing paradigms, such
as neural networks, fuzzy systems and rough set theory, to generate more efficient
hybrid systems, such as fuzzy-rough or rough-fuzzy systems [85, 86], neural-fuzzy or
fuzzy-neural systems [81-89] and rough-neural networks [4, 90-94].
Typically, a fuzzy neural network (FNN) embodies the advantages of both neural
networks and fuzzy systems. Fuzzy neural networks are designed to combine the
computational power of neural networks with the uncertainty-handling capabilities
of fuzzy logic. In other words, an FNN can be used to construct a knowledge-based
neural network, i.e. the knowledge of a human expert can be incorporated into the
neural network, so an FNN can be more suitable for real-life problems. Some
questions remain, however. For example, in some situations appropriate rules cannot
be derived for a particular system. Of course, every input dimension can be divided
into several fuzzy subsets, and the subsets for all input dimensions can then be
combined to construct the complete rule set. However, such an FNN contains no expert
knowledge, so it may not be suitable for constructing a particular system at the
very beginning.
The rough neural network (RNN) is an intelligent system [95]. RNNs are neural
networks based on rough sets. They combine the ability of rough sets to handle
uncertainty, namely reducing attributes without loss of information and then
extracting rules, with the strong fault tolerance, self-organization, massively
parallel processing and self-adaptation of neural networks. RNNs can therefore
process massive and uncertain information, and they are widely applied in real life.
However, these hybridizations still fail to give satisfactory results in many
situations. Therefore, to provide a more flexible and robust information processing
system, the hybridization of fuzzy sets, rough sets and neural networks is considered.
Wang et al. [4] proposed a new generalized incremental rule extraction algorithm
(GIREA). GIREA is based on rough set theory and is used to extract rough domain
knowledge in the form of certain and possible rules. An FNN is then used to refine
the obtained certain and possible rules to produce the fuzzy rule set. The authors
claimed that their experimental results demonstrate the superiority of the approach
in both the length of the rules and the number of fuzzy rules.
Junyang Zhao and Zhili Zhang [8] worked on fuzzy-rough neural network hybridization
for feature selection. They introduced a feature selection algorithm based on a
fuzzy-rough neural network (FRNN). They constructed a four-layer feedforward FRNN
based on a neural network implementation of the fuzzy-rough membership function of
fuzzy-rough sets. In their paper, the neural network is adopted to deal with noisy
data because of its strong approximation ability and good fault tolerance, and the
fuzzy-rough membership function is used to deal with the uncertainty of real-world data.
classification precision than Radial Basis Function Neural Network and it has also the
same merit of quick learning as Radial Basis Function Neural Network.
Jing Hong [6] proposed an improved prediction model based on Fuzzy-rough Set
Neural Network to predict the gas emission of a coal mine. In this work back-
propagation neural network and fuzzy-rough set are used to develop the model.
To provide a more flexible and robust information processing system, researchers
have also tried to hybridize additional soft computing tools, statistical models and
mathematical methods.
P. Mitra et al. [96] proposed a method for designing a hybrid system for
classification and rule generation in the soft computing paradigm, integrating
rough set theory with a fuzzy MLP using an evolutionary algorithm.
W.C. Chen et al. [12] presented an innovative hybrid control algorithm that
integrates the indiscernibility capability of rough set theory and the search
capability of genetic algorithms with a conventional neural-fuzzy controller for
industrial wastewater treatment.
A. Ganivada et al. [97, 98] introduced a novel fuzzy rough granular neural network
(NFRGNN) and a fuzzy rough granular neural network (FRGNN) model, both based
on the multilayer perceptron using a back-propagation algorithm for the fuzzy
classification of patterns.
6.6 Conclusion
Bibliography
[2] Z. Pawlak, 1982, Rough sets, International Journal of Computer and Information
Sciences 11, pp. 341–356.
[3] McCulloch, Warren; Pitts, Walter, 1943, A Logical Calculus of Ideas Immanent
in Nervous Activity, Bulletin of Mathematical Biophysics 5, pp.115-133.
[4] Shi-tong Wang, Dong-jun Yu and Jing-yu Yang, 2003, Integrating rough set
theory and fuzzy neural network to discover fuzzy rules, Intelligent Data
Analysis 7, pp. 59–73.
[6] Jing Hong, 2011, An Improved Prediction Model based on Fuzzy-rough Set
Neural Network, International Journal of Computer Theory and Engineering,
Vol.3, No.1, pp.1793-8201.
[8] Junyang Zhao and Zhili Zhang, 2011, Fuzzy Rough Neural Network and Its
Application to Feature Selection, International Journal of Fuzzy Systems, Vol.
13, No. 4.
[9] M. Banerjee, S. Mitra, and S. K. Pal, 1998, Rough fuzzy MLP: Knowledge
encoding and classification, IEEE Transactions on Neural Networks, vol. 9, no.
6, pp. 1203-1216.
[10] D. Zhang and Y. Wang, 2006, Fuzzy-rough Neural Network and Its Application
to Vowel Recognition, Control and Decision, vol. 21, no.2, pp. 221-224.
[12] W.C. Chen, N-B. Chang, J-C. Chen, 2003, Rough set-based hybrid fuzzy-
neural controller design for industrial wastewater treatment, Water Research 37,
pp. 95–107.
[13] Hebb, D.O., 1949, The organization of behavior, New York: Wiley & Sons.
[17] Anderson, J.A. 1970, Two Models for Memory Organization using Interacting
Traces. Mathematical Biosciences 8: 137–160.
[18] Carpenter, G.A. & Grossberg, S. 2003, Adaptive Resonance Theory, In Michael
A. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks, Second
Edition (pp. 87-90). Cambridge, MA: MIT Press
[21] Carpenter, G.A., Grossberg, S., & Rosen, D.B. 1991, ART 2-A: An adaptive
resonance algorithm for rapid category learning and recognition, Neural
Networks (Publication), 4, 493-504
[22] Carpenter, G.A. & Grossberg, S. 1990, ART 3: Hierarchical search using
chemical transmitters in self-organizing pattern recognition architectures, Neural
Networks (Publication), 3, 129-152
[23] Carpenter, G.A., Grossberg, S., & Rosen, D.B. 1991, Fuzzy ART: Fast stable
learning and categorization of analog patterns by an adaptive resonance system,
Neural Networks (Publication), 4, 759-771
[24] Carpenter, G.A., Grossberg, S., & Reynolds, J.H. 1991, ARTMAP: Supervised
real-time learning and classification of nonstationary data by a self-organizing
neural network, Neural Networks (Publication), 4, 565-588
[25] Carpenter, G.A., Grossberg, S., Markuzon, N., Reynolds, J.H., & Rosen, D.B.
1992, Fuzzy ARTMAP: A neural network architecture for incremental
supervised learning of analog multidimensional maps, IEEE Transactions on
Neural Networks, 3, 698-713
[26] Werbos, P.J. 1975, Beyond Regression: New Tools for Prediction and Analysis
in the Behavioral Science.
[27] Chen S., Billings S. A., 1992, Neural Networks for Nonlinear Dynamic System
Modeling and Identification, International Journal of Control, Vol. 56, No.2, pp.
319-346.
[29] Ge, S. S., 1996, Robust Adaptive Control of Robots Based on Static Neural
Networks, 13th Triennial World Congress of the IFAC, San Francisco, USA, pp.
139- 144
[31] Williamson R. C., Helmke U., 1995, Existence and Uniqueness Results for
Neural Network Approximations, IEEE Transactions on Neural Networks, Vol.
6, No. 1, pp. 2-13.
[32] J. J. Hopfield, 1982, Neural networks and physical systems with emergent
collective computational abilities, Proc. Natl. Acad. Sci. USA, Vol. 79, pp.
2554-2558, Biophysics.
[35] Baldi P.F., Hornik K., 1995, Learning in Linear Neural Networks: a Survey,
IEEE Transactions on Neural Networks, Vol. 6, No. 4, pp. 837-858.
[38] Mehrotra, K., Mohan, C. K., and Ranka, S. 1997, Elements of Artificial Neural
Networks. The MIT Press.
[39] Ruspini, E., Bonissone, P., and Pedrycz, W. 1998, Handbook of Fuzzy
Computation. Ed. Iop Pub/Inst of Physics.
[40] Cox, E. 1994, The Fuzzy Systems Handbook. AP Professional - New York.
[41] Lin, C.-T. and Lee, G. ,1996, Neural Fuzzy Systems: A Neuro-Fuzzy
Synergism to Intelligent Systems. Ed. Prentice Hall.
[42] Jang, J.-S. R., Sun, C.-T., and Mizutani 1997, Neuro-fuzzy and Soft Computing:
A Computational Approach to Learning and Machine Intelligence. Prentice
Hall.
[44] Baraldi, A. and Blonda, P. 1998, Fuzzy neural networks for pattern recognition,
Tech Report, IMGA-CNR, Italy.
[45] Meneganti, M., Saviello, F., and Tagliaferri, R. 1998, Fuzzy neural networks for
classification and detection of anomalies, IEEE Transactions on Neural
Networks, 9(2), pp. 848--861.
[47] Nomura, H., Hayashi, I., and Wakami, N. 1992, A self-tuning method of fuzzy
control by descent method, Proceedings of IEEE International Conference on
Fuzzy Systems, pages 203-210.
[49] Shi, Y., Mizumoto, M. 2000, A new approach of neuro- fuzzy learning
algorithm for tuning fuzzy rules, Fuzzy sets and systems, 112(1), pp.99-116.
[51] Yager, R., Filev, D. 1994, Generation of fuzzy rules by mountain clustering,
Journal of Intelligent Fuzzy Systems, 2(3), pp.209-219.
[53] Dagher, I., Georgiopoulos, M., Heileman, G., and Bebis, G. 1998, Fuzzy artvar:
An improved fuzzy artmap algorithm. International Joint Conference on Neural
Networks (IJCNN-98), 3, pp.1688-1693.
[54] S. K. Pal and S. Mitra, 1992, Multi-layer perceptron, fuzzy sets and
classification, IEEE Transactions on Neural Networks, vol. 3, pp. 683-697.
[55] Carpenter, G., Markuzon, N. 1998, Artmap-IC and medical diagnosis: instance
counting and inconsistent cases. Neural Networks, 11, pp. 323-336.
[56] Canuto, A., Howells, G., Fairhurst, M. 1999, Fuzzy multilayer perceptron for
binary pattern recognition, Seventh International Conference on Image
Processing and Its Application, 1, pp. 260--264.
[57] Canuto, A., Howells, G., Fairhurst, M. 1999, Repart: A modified fuzzy artmap
for pattern recognition, 6th Fuzzy Days, pp. 159-168.
[58] Carpenter, G., Grossberg, S., Markunzo, M., Reynolds, J. H., Rosen, D. B.
1991, Fuzzy art: Fast stable learning and categorization of analog patterns by an
adaptive resonance system, Neural Networks, 4, pp.759-771.
[59] Carpenter, G., Grossberg, S., Markunzo, M., Reynolds, J. H., Rosen, D. B. 1992,
Fuzzy artmap: A neural network architecture for incremental supervised
learning of analog multidimensional maps, IEEE Transactions on Neural
Networks, 3, pp.698-713.
[60] Dimla Sr., D. and Lister, P. 2000, On-line metal cutting tool condition
monitoring. ii: tool-state classification using multi-layer perceptron neural
networks, International Journal of Machine Tools and Manufacture, 40(5),
pp.769-781.
[61] Jeong, J.-H., Kim, H., Kim, D.-S., Lee, S.-Y. 2000, Speaker adaptation based on
judge neural networks for real world implementations of voice command
systems, Information Sciences, 123(1-2), pp. 13-24.
[62] Zhang, Z., Lyons, M., Schuster, M., Akrunatsu, S. 1998, Comparison between
geometry-based and gabor-wavelets-based facial expression recognition using
multi-layer-perceptron, Proceedings of the 3rd IEEE International Conference
on Automatic Face and Gesture Recognition, Japan, pp. 454-459.
[63] Guier, E., Sankur, B., Kabya, Y., Raudys, S. 1998, Visual classification of
medical data using mlp mapping, Computers in Biology and Medicine, 28(3),
pp. 275-287.
[64] Sheppard, D., McPhee, D., Darke, C., Shrethra, B., Moore, R., Jurewitz, A.,
Gray, A. 1999, Predicting cytomegalovirus disease after renal transplantation:
an artificial neural network approach. International Journal of Medical
Informatics, 54(1):55-76.
[65] Indro, D., Jiang, C., Patuwo, B., Zhang, G. 1999, Predicting mutual fund
performance using artificial neural networks. Omega, 27(3), pp.373-380.
[67] Keller, J. M., Hunt, D. J., 1985, Incorporating fuzzy membership functions into
perceptron algorithm, IEEE Transactions on Pattern Analysis and Machine
Intelligence, 7(6), pp.693-699.
[68] Stoeva, S., Nikov, A. 2000, A fuzzy backpropagation algorithm, Fuzzy Sets and
Systems, 112(1), pp.27-39.
[69] Sural, S., Das, P. 1999, An mlp using hough transform based fuzzy feature
extraction for bengali script recognition, Pattern Recognition Letters, 20(8),
pp.771-782.
[70] Minsky, M. L., Papert, S. A. 1969, Perceptrons, The MIT Press, Cambridge.
[71] Parker, D. B., 1985, Learning-logic: Casting the cortex of the human brain in
silicon, Technical report, tr-47, MIT.
[73] Cho, K., Wang, B. 1996, Radial basis function based adaptive fuzzy systems
and their application to system identification and prediction, Fuzzy sets and
systems, 83(3), pp.325--339.
[74] Hwang, R. C., Huang, H.-C., Chen, Y.-J., Hsier, J.-G., Chao, H. 1997, Adaptive
power signal prediction by non-fixed neural network model with modified fuzzy
back-propagation learning algorithm. Trends in Information Systems,
Engineering and Wireless Multimedia Communications; Proceedings of the
International Conference on Information, Communications and Signal
Processing, 2: pp. 689-692.
[77] L. X. Wang and J. M. Mendel, 1992, Generating Fuzzy Rules by Learning from
Examples, IEEE Trans. Systems,Man, and Cybernetics, vol. 22, pp. 1414–1427.
[78] R. Rovatti and R. Guerrieri, 1996, Fuzzy Sets of Rules for System Identification,
IEEE Trans. Fuzzy Syst., vol. 4, pp. 89–102.
[79] Wlodzislaw Duch, Rafal Adamczak and Krzysztof Grabczewski, 2000, A new
methodology of extraction, optimization and application of crisp and fuzzy
logical rules, IEEE Transactions on Neural Networks, vol. 11, no. 2, pp. 1-31.
[80] B. M. Happel and J. J. Murre, 1994, Design and Evolution of Modular Neural
Network Architectures, Neural Networks, vol. 7, pp. 985-1004.
[81] S.T. Wang, 1998, Fuzzy system and Fuzzy Neural Networks, Shanghai Science
and Technology Press, Edition 1.
[82] Z. Pawlak, Rough Sets, 1991, Theoretical Aspects of Reasoning About Data.
Dordrecht, Kluwer, The Netherlands.
[84] Shusaku Tsumoto, 2004, Mining diagnostic rules from clinical databases using
rough sets and medical diagnostic model, Information Sciences 162, pp. 65–80.
[85] S. K. Pal and A. Skowron, Eds., 1999, Rough Fuzzy Hybridization: A New
Trend in Decision Making, Singapore: Springer-Verlag.
[86] Sankar K. Pal, Pabitra Mitra, 2001, Case Generation: A Rough-fuzzy Approach,
In: Proc. Intl. Conf. Case Based Reasoning (ICCBR2001), Vancouver, Canada.
[87] Y. Hayashi, 1992, Neural expert system using fuzzy teaching input and its
application to medical diagnosis, in: Proceedings of 2nd International
Conference on Fuzzy Logic and Neural Networks (Iizuka), pp. 989-993.
[88] S. Mitra, 1994, Fuzzy MLP based expert system for medical diagnosis, Fuzzy
Sets and Systems 65, pp.285-296
[89] S. Mitra and Y. Hayashi, 2000, Neuro-fuzzy Rule Generation: Survey in Soft
Computing Framework, IEEE Trans. On Neural Network, vol. 11, no. 3, pp.
748–768.
[90] R. Yasdi, 1995, Combining Rough Sets Learning and Neural Learning Method
to Deal with Uncertain and Imprecise Information, Neurocomputing, vol. 7, pp.
61–84.
[91] M S Szczuka. 1998, Rough sets and artificial neural networks. In:Rough Sets in
Knowledge Discovery (2):Applications,Case Studies and Software Systems, pp.
449-470.
[93] R. W. Swiniarski and L. Hargis, 2001, Rough Sets as a Front End of Neural
Networks Texture Classifiers, Neurocomputing, vol. 36, pp. 85–102.
[94] Dongbo Zhang, 2007, Integrated methods of rough sets and neural network and
their applications in pattern recognition[D]. Hunan university.
[95] S. DING, J. CHEN, X XU, J. LI, 2011, Rough Neural Networks: A Review,
Journal of Computational Information Systems 7: 7, pp. 2338-2346.
[96] P Mitra, S Mitra, S. K. Pal, 1999, Modular Rough Fuzzy MLP: Evolutionary
Design, New Directions in Rough Sets, Data Mining, and Granular-Soft
Computing, Lecture Notes in Computer Science Volume 1711, pp 128-136.
[97] A. Ganivada, S. K. Pal, 2011, A Novel Fuzzy Rough Granular Neural Network
for Classification, International Journal of Computational Intelligence Systems,
Vol. 4, No. 5, pp. 1042-105.
[98] A. Ganivada, S. Dutta, S. K. Pal, 2011, Fuzzy rough granular neural networks,
fuzzy granules, and classification, Theoretical Computer Science, 412, pp.5834-
5853
Intelligent System Based on Rough‐Fuzzy‐Neural Network
Chapter 7
Intelligent System Based on Rough-
Fuzzy-Neural Network
An intelligent system based on a rough-fuzzy-neural network is a hybridization
of rough sets, fuzzy sets and neural networks, the main components of soft
computing. Soft computing may be considered a science directed towards
capturing the human ability to deal with uncertainty and imprecision in real time.
Departing from the conventional AI techniques, soft computing has been developed
not as one technique but as a synergistic collection of several techniques, with
evolutionary, neural, fuzzy and rough computing being the prime methodologies.
Rough-fuzzy-neural network hybridization captures the power of fuzzy sets to handle
uncertainty and vagueness in data and to reason in a human-like way through
If-Then rules, the power of rough sets to handle roughness in data and to generate
dependency rules, and the automated learning and connectionist structure
representation power of neural networks.
7.1 Introduction
Rough-fuzzy-neural network hybridization has been employed in the past few years in
many different combinations to solve different problems, such as rule generation
[1, 2], prediction [3], classification [4], feature selection [5], knowledge
encoding [6], pattern recognition [7, 8] and industrial wastewater treatment [9].
It is a tool in the soft computing area that utilizes the advantages of all the
individual methods as well as of the pairwise hybrid methods. By combining the
human-like reasoning process of fuzzy sets [10] and the rule generation techniques
of rough sets [11] with the automated learning concept of neural networks [12], it
has established itself as a powerful tool in the soft computing area. It is able to
handle uncertain, vague and imprecise data, to remove unimportant attributes from
the experimental data set, and to present the result in a connectionist structure.
This hybridization can be used efficiently for representing the knowledge base of
an intelligent system, for knowledge discovery from a database or experimental data
set, and for generating dependency rules that represent the knowledge base of a
rule-based intelligent system.
The organization of this chapter is as follows: Section 7.2 presents the procedure of
rough-fuzzy rule generation. Section 7.3 describes the process of mapping rules into a
fuzzy-neural network. Section 7.4 shows the application of the modified framework to
a medical dataset for the disease diabetes mellitus, and Section 7.5 summarizes the
key findings of this chapter and possible areas of future work.
$\underline{B}X = \{x \in U : [x]_B \subseteq X\}$ (7.3)
$\overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\}$ (7.4)
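The B-lower and B-upper approximations of equations (7.3) and (7.4) can be computed directly from an information table. The sketch below is a minimal illustration; the dictionary-based table layout and the function names are assumptions made for the example.

```python
from collections import defaultdict


def equivalence_classes(table, attributes):
    """Group object ids whose values agree on every attribute in `attributes`."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)
    return list(classes.values())


def lower_approximation(table, attributes, target):
    """Union of the equivalence classes that lie entirely inside `target`."""
    return set().union(
        *[c for c in equivalence_classes(table, attributes) if c <= target]
    )


def upper_approximation(table, attributes, target):
    """Union of the equivalence classes that intersect `target`."""
    return set().union(
        *[c for c in equivalence_classes(table, attributes) if c & target]
    )
```

For a table in which objects p1 and p2 are indiscernible and only p1 belongs to the target set, neither object is in the lower approximation while both are in the upper approximation.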
The data table is fuzzified using fuzzy set theory [10, 17]. Fuzzy set theory
introduces the concept of the degree of membership of an element in a set. In a
classical set, an element either belongs fully (membership 1) or does not belong at
all (membership 0) to the set. The degree of membership allows an element to lie in
a set with a membership value anywhere in the range [0, 1]. A fuzzy set Ã can be
defined as a set of ordered pairs (x, μÃ(x)), with x ranging over the base set
(universe). The function μÃ(x) is called the membership function of Ã; it maps each
element of the base set to a membership degree in the range [0, 1]. The base set
may be discrete or continuous.
Some applications require continuously differentiable curves, and therefore smooth
transitions, which triangular or trapezoidal functions cannot provide. In those
cases the normalized Gaussian function, the difference of two sigmoidal functions,
the generalized bell function and, in some applications, π functions are used
[18].
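The contrast between piecewise-linear and smooth membership functions can be seen in a short sketch. The parameterizations below are the commonly used textbook forms and are assumptions for the example; they are not taken from [18], and the parameter names are illustrative.

```python
import math


def triangular(x, a, b, c):
    """Triangular MF with feet a, c and peak b (a < b < c); it has kinks at a, b and c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def gaussian(x, center, width):
    """Normalized Gaussian MF: exp(-(x - center)^2 / (2 * width^2))."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def generalized_bell(x, a, b, c):
    """Generalized bell MF: 1 / (1 + |(x - c) / a|^(2b))."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2.0 * b))
```

All three functions return 1 at their centre, but only the Gaussian and generalized bell functions are smooth everywhere, which matters when the parameters are tuned by gradient descent.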
in [0, 1]. Also assign MV 1 or certain membership to the other attribute values.
Finally discard those parameters which have MV less than 0.25.
To generate rules, the decision matrix approach [1, 13] is used in this modified
method. The decision matrix is a generalization of rough set theory from which
reducts and decision rules can be calculated. In this approach, it is first checked
whether the information system is consistent. An information system is said to be
consistent if no two objects have the same condition attributes but different
decision attributes. Conversely, an information system is inconsistent if there
exist two objects whose condition attributes are the same but whose decision
attributes are different. That means for any two objects i and j
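The consistency condition described above can be checked in a single pass over the objects. The sketch below is illustrative; the record layout (one dictionary per object) and the function name are assumptions.

```python
def is_consistent(objects, condition_attrs, decision_attr):
    """Return True if no two objects share all condition values but differ in decision."""
    seen = {}
    for obj in objects:
        key = tuple(obj[a] for a in condition_attrs)
        decision = obj[decision_attr]
        # setdefault stores the first decision seen for this condition pattern.
        if seen.setdefault(key, decision) != decision:
            return False
    return True
```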
Let the domain of discourse U of the information system S be divided into k classes
(c1, c2, …, ck) by the equivalence relation defined on D. For any class cp ∈ (c1,
c2, …, ck), the objects of U that belong to cp are numbered by subscripts i (i = 1,
2, …, m) and those that do not belong to cp by subscripts j (j = 1, 2, …, n). The
decision matrix M of the information system S for the class cp is defined as an m×n
matrix whose elements are sets of {attribute-name, attribute-value, membership-value}.
Where a is the attribute name, a(i) is the attribute value and µ(i) is the MV of
the attribute value.
$|B_i^p| = \bigwedge_{j} \bigvee M_{ij}$ (7.7)
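The conjunction-of-disjunctions form of the decision rule can be made concrete with a small sketch. It builds the decision matrix for one class and finds, for one object i, the minimal sets of (attribute, value) pairs that intersect every entry M_ij of its row. The membership values carried by the matrix elements in this chapter are omitted, the search is a brute-force enumeration that is only practical for a handful of attributes, and the function names are assumptions.

```python
from itertools import combinations


def decision_matrix(positives, negatives, attributes):
    """M[i][j]: the (attribute, value) pairs of positive object i that differ
    from negative object j."""
    return [
        [frozenset((a, p[a]) for a in attributes if p[a] != n[a]) for n in negatives]
        for p in positives
    ]


def minimal_rules(row):
    """Minimal conjunctions of (attribute, value) pairs that satisfy
    AND over j of (OR over M_ij) for one row of the decision matrix."""
    if any(not entry for entry in row):
        return []  # object i is indiscernible from some object outside the class
    candidates = sorted(set().union(*row))
    found = []
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            chosen = set(combo)
            if any(prev <= chosen for prev in found):
                continue  # a shorter rule already covers this combination
            if all(chosen & entry for entry in row):
                found.append(chosen)
    return found
```

Each set returned by `minimal_rules` corresponds to the IF-part of one minimum-length rule for object i; collecting the distinct sets over all objects of the class gives the rule set for that class.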
∑
7.8
Where k is the number of attributes present in the ith rule and µij is the MV of attribute
value.
First, find the B-lower and B-upper approximations of each class cp ∈ (c1, c2, …, ck).
Rules generated from the B-lower approximation are certain rules, and rules generated
from the B-upper approximation are possible rules.
$|B_i^p| = \bigwedge_{j} \bigvee M_{ij}$ (7.10)
∑
7.11
Where k is the number of attributes present in the ith rule and µij is the MV of attribute
value.
Then the possible minimum-length decision rule for any object i (i = 1, 2, …, m)
belonging to class cp ∈ (c1, c2, …, ck) can be obtained as
$|B_i^p| = \bigwedge_{j} \bigvee M_{ij}$ (7.13)
For possible rule the belief function of ith rule can be defined as follows
card Bc B
1 7.14
Where cp ∈ (c1, c2… ck) and card(.) define the cardinality of the set. and then
∑
∗ 7.15
Where k is the number of attributes present in the ith rule and µij is the MV of attribute
value.
Rj= |Bip| possible ∀ i = 1, 2, …, m; cp ∈ (c1, c2… ck); j=number of distinct rule (7.16)
Once the rules have been extracted from the experimental data set, they are mapped
into a fuzzy-neural network [1, 2, 14].
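$R^k$: IF $x_1$ is $A_1^k$ AND … AND $x_n$ is $A_n^k$ THEN $y$ is $B^k$
Where $R^k$ is the kth rule, $1 \le k \le N$, $\{x_i\}_{i=1,\dots,n}$ are the input
variables, $y$ is the output variable, $A_i^k$ are fuzzy sets defined on the input
variables, and $B^k$ are the fuzzy singletons defined on the output variables.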
/
(7.17)
where and are the center and the width of the Gaussian function, respectively.
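$y = \dfrac{\sum_{k=1}^{N} B^k \prod_{i=1}^{n} \mu_{A_i^k}(x_i)}{\sum_{k=1}^{N} \prod_{i=1}^{n} \mu_{A_i^k}(x_i)}$ (7.18)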
The FNN consists of four layers: the input layer, the fuzzification layer, the rule
inference layer and the output layer, as shown in Figure 7.1.
[Figure 7.1: four-layer FNN with inputs x1, …, xn and a single output]
Input layer: Neurons in this layer receive the input values (x1, x2, …, xn). These
input values are then transferred to the fuzzification layer to be fuzzified.
Fuzzification layer: In this layer the neurons (nodes) are arranged into N groups,
each group representing the IF-part of a rough-fuzzy rule. Each neuron ik receives
the input variable xi and calculates the membership value that gives the degree to
which the input value xi belongs to the corresponding fuzzy set. Thus, the output of
neuron ik is in the range [0, 1] and is calculated by the following function.
/
(7.19)
Rule inference layer: The number of neurons (nodes) in this layer equals the number
of fuzzy rules, so each neuron in this layer represents one fuzzy rule. Each neuron
is connected by n fixed links to the input term neurons representing the IF-part of
the fuzzy rule. The kth neuron performs the ∧ operation for matching the kth rule
using the Larsen product operator; the output of this node is:
7.20
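Output layer: The neurons (nodes) in this layer represent the output variables of the system. Each node j acts as a defuzzifier and computes the output value for an input vector $\mathbf{x} = (x_1, \dots, x_n)$ according to equation (7.18), applied to each output:
$$ y_j = \frac{\sum_{k=1}^{N} \mu_k(\mathbf{x})\, B_{kj}}{\sum_{k=1}^{N} \mu_k(\mathbf{x})} \qquad (7.21) $$
where $B_{kj}$ is the consequent singleton of rule k for output j (simply $B_k$ when there is a single output).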
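The four layers described above can be sketched in a few lines of Python. This is only an illustrative sketch for a single output variable, assuming the Gaussian form exp(-(x - c)^2 / sigma^2) for the membership function; the function names `gaussian_mf` and `fnn_forward` and the sample parameters are hypothetical and do not come from the thesis implementation.

```python
import math


def gaussian_mf(x, center, width):
    """Fuzzification neuron: Gaussian membership value (assumed form of Eq. 7.19)."""
    return math.exp(-((x - center) ** 2) / (width ** 2))


def fnn_forward(x, centers, widths, consequents):
    """Forward pass for one input vector x.

    centers[k][i] and widths[k][i] are the premise weights of rule k for
    input i; consequents[k] is the singleton B_k of rule k.
    Returns the crisp output y and the firing strength of every rule.
    """
    # Fuzzification layer: one membership value per (rule, input) pair.
    memberships = [
        [gaussian_mf(xi, c, s) for xi, c, s in zip(x, rule_c, rule_s)]
        for rule_c, rule_s in zip(centers, widths)
    ]
    # Rule inference layer: Larsen product over the IF-part of each rule.
    strengths = [math.prod(row) for row in memberships]
    # Output layer: weighted average of the consequent singletons.
    total = sum(strengths)
    y = sum(mu * b for mu, b in zip(strengths, consequents)) / total
    return y, strengths


# Two hypothetical rules over two inputs.
centers = [[1.0, 2.0], [3.0, 4.0]]
widths = [[1.0, 1.0], [1.0, 1.0]]
consequents = [0.0, 1.0]
y, strengths = fnn_forward([1.0, 2.0], centers, widths, consequents)
```

With the input placed exactly on the centers of the first rule, that rule fires with strength 1 and the output stays close to its consequent, which is the behaviour expected from the weighted-average defuzzifier.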
This FNN encodes a set of fuzzy rules in its topology and processes information in a way that matches the fuzzy reasoning scheme. The weights of the network correspond to the parameters $\{c_{ik}\}$, $\{\sigma_{ik}\}$ of the Gaussian membership functions and to the consequent singletons $\{B_k\}$. In other words, each neuron k is associated with two premise weight vectors $\mathbf{c}_k = (c_{1k}, \dots, c_{nk})$, $\boldsymbol{\sigma}_k = (\sigma_{1k}, \dots, \sigma_{nk})$ and one consequent weight $B_k$.
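The weights are learned by minimizing the squared output error
$$ E = \frac{1}{2} \sum_{j} e_j^2 \qquad (7.22) $$
with
$$ e_j = y_j - t_j \qquad (7.23) $$
where $y_j$ is the jth output of the FNN for the current input sample and $t_j$ is the corresponding desired output. Gradient descent on $E$ gives the weight updates
$$ \Delta B_{kj} = -\eta\, \delta_j\, \mu_k(\mathbf{x}) $$
$$ \Delta c_{ik} = -\eta\, \epsilon_k\, \mu_k(\mathbf{x})\, \frac{2\,(x_i - c_{ik})}{\sigma_{ik}^2} $$
$$ \Delta \sigma_{ik} = -\eta\, \epsilon_k\, \mu_k(\mathbf{x})\, \frac{2\,(x_i - c_{ik})^2}{\sigma_{ik}^3} $$
where
$$ \delta_j = \frac{y_j - t_j}{\sum_{h=1}^{N} \mu_h(\mathbf{x})}; \quad \epsilon_k = \sum_{j} \delta_j\,(B_{kj} - y_j); \quad \mu_k(\mathbf{x}) = \prod_{i=1}^{n} \mu_{ik}(x_i) \qquad (7.24) $$
and $\eta$ is the learning rate.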
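A gradient-descent step of this kind can be sketched as follows. The code is a minimal illustration for a single output, assuming the squared error E = (y - t)^2 / 2, the Gaussian form exp(-(x - c)^2 / sigma^2) and a learning rate `eta`; the gradients are derived from those assumptions by the chain rule, and the function names are hypothetical rather than the thesis implementation.

```python
import math


def forward(x, c, s, b):
    """Return output y, rule firing strengths mu and their sum."""
    mu = [
        math.prod(math.exp(-((xi - cik) ** 2) / sik ** 2)
                  for xi, cik, sik in zip(x, ck, sk))
        for ck, sk in zip(c, s)
    ]
    total = sum(mu)
    y = sum(m * bk for m, bk in zip(mu, b)) / total
    return y, mu, total


def gradients(x, t, c, s, b):
    """Analytic gradients of E = 0.5 * (y - t)**2 w.r.t. b, c and s."""
    y, mu, total = forward(x, c, s, b)
    delta = (y - t) / total  # shared error term
    grad_b = [delta * m for m in mu]
    grad_c = [[delta * (bk - y) * m * 2 * (xi - cik) / sik ** 2
               for xi, cik, sik in zip(x, ck, sk)]
              for ck, sk, bk, m in zip(c, s, b, mu)]
    grad_s = [[delta * (bk - y) * m * 2 * (xi - cik) ** 2 / sik ** 3
               for xi, cik, sik in zip(x, ck, sk)]
              for ck, sk, bk, m in zip(c, s, b, mu)]
    return grad_b, grad_c, grad_s


def train_step(x, t, c, s, b, eta=0.1):
    """One gradient-descent update; returns new (c, s, b)."""
    gb, gc, gs = gradients(x, t, c, s, b)
    new_b = [bk - eta * g for bk, g in zip(b, gb)]
    new_c = [[v - eta * g for v, g in zip(row, grow)] for row, grow in zip(c, gc)]
    new_s = [[v - eta * g for v, g in zip(row, grow)] for row, grow in zip(s, gs)]
    return new_c, new_s, new_b
```

Calling `train_step` repeatedly over the training samples refines the centers, widths and consequents of the rules that were initially extracted by the rough-fuzzy stage.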
The final solution may be presented as a set of rules or as a network of nodes (neurons) performing the logical functions, with the hidden neurons realizing the rules.
Let is the certainty factor of ith Rough-Fuzzy rule from equation (). The final
certainty factor of the ith rule is calculated as
∗ (7.25)
- Dupuytren’s contracture (DUPY)
- Age (integer numbers)
The results are compared with those of the previous framework, the rough-fuzzy intelligent system, in Table 7.1, which is self-explanatory. The data set contains 100 instances and is provided as two files, a .nam file and a .dat file: the .dat file holds the data, and the .nam file describes the structure of the data. The data are divided into five folds. Only 20% of the data are used to generate the rules, and the remaining 80% are used for testing.
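The split described above can be illustrated with a short Python sketch. It assumes that each of the five folds is used in turn for rule generation while the other four folds are used for testing; the thesis does not spell out this detail, and the function name `inverted_k_fold` and the fixed random seed are hypothetical.

```python
import random


def inverted_k_fold(n_samples, k=5, seed=0):
    """Yield (rule_generation_indices, test_indices) pairs.

    Unlike ordinary k-fold cross-validation, a single fold (1/k of the data)
    is used to generate rules and the remaining k-1 folds are used for testing.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, rule_fold in enumerate(folds):
        test = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield rule_fold, test


# 100 instances, as in the data set used in this chapter.
splits = list(inverted_k_fold(100, k=5))
```

With 100 instances each split contains 20 instances for rule generation and 80 for testing, which matches the 20%/80% proportion stated in the text.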
7.5 Conclusion
In this chapter a hybrid intelligent system is proposed that generates efficient, reliable and more approximate fuzzy rules. The rules were also tested, and the results are reported. The results indicate that this hybridization performs better than the previous rough-fuzzy framework. Further hybridization with evolutionary computing or with other mathematical or statistical tools, in order to construct a more efficient method, may be considered as future work. Agent-based computing is another direction for extending this model in the future.
Bibliography
[1] Shi-tong Wang, Dong-jun Yu and Jing-yu Yang, 2003, Integrating rough set theory and fuzzy neural network to discover fuzzy rules, Intelligent Data Analysis, 7, pp. 59–73.
[3] Jing Hong, 2011, An Improved Prediction Model based on Fuzzy-rough Set Neural Network, International Journal of Computer Theory and Engineering, Vol. 3, No. 1, pp. 1793-8201.
[5] Junyang Zhao and Zhili Zhang, 2011, Fuzzy Rough Neural Network and Its
Application to Feature Selection, International Journal of Fuzzy Systems, Vol.
13, No. 4.
[6] M. Banerjee, S. Mitra, and S. K. Pal, 1998, Rough fuzzy MLP: Knowledge
encoding and classification, IEEE Transactions on Neural Networks, vol. 9,
no. 6, pp. 1203-1216.
[7] D. Zhang and Y. Wang, 2006, Fuzzy-rough Neural Network and Its
Application to Vowel Recognition, Control and Decision, vol. 21, no.2, pp.
221-224.
[9] W.C. Chen, N-B. Chang, J-C. Chen, 2003, Rough set-based hybrid fuzzy-neural controller design for industrial wastewater treatment, Water Research, 37, pp. 95–107.
[14] G. Castellano, A. M. Fanelli, 2000, Fuzzy inference and rule extraction using a neural network, Neural Network World, 3, pp. 361-371.
[18] S.T. Wang, 1998, Fuzzy System and Fuzzy Neural Networks, Shanghai Science and Technology Press, Edition 1.
Chapter 8
Conclusion and Discussion
This chapter concludes the thesis. It summarizes the research presented, with a focus on the main contribution: modelling intelligent systems and their application in medical diagnosis. Three frameworks for modelling intelligent systems in medical diagnosis are proposed in this thesis.
The survey of the existing literature consolidated in chapter 2 shows that many approaches have been proposed for modelling knowledge-based intelligent systems in medical diagnosis, but most of them have no explanation capability, which is regarded as the most important capability for a clinical decision tool to be accepted.
In this context, automated rule generation from observational and clinical test data sets stands as an important research area. However, such data sets are not noise free: they contain uncertainty, vagueness and imprecision. These facts demand methods that not only extract rules from data but can also handle the uncertainty, vagueness and imprecision that lie in the data itself.
Additional work includes a comprehensive study of rough sets and some of their applications, presented in chapter 4. This review gives a detailed view of existing methodologies as well as of the hybridization of rough sets with fuzzy sets, and it identifies areas for further exploration. The application of fuzzy and rough techniques to rule generation is a promising and powerful tool.
Another exhaustive study is the review of neural network systems, neuro-fuzzy hybridization and their applications, documented in chapter 6. This review suggests that a hybridization of rough sets, fuzzy sets and neural networks can also incorporate automated learning in addition to the criteria mentioned above.
Granular computing is another new area of research in which this hybridization may be utilized; this is left as future work.
A multi-agent-based design of the existing intelligent systems is also an area of future work.
Designing a web-based intelligent system for medical science, with all the necessary characteristics such as reliability, efficiency, accuracy, flexibility, robustness and user friendliness, for a large real-life domain that is by nature very complex, is a challenging task for the future.
Publications Contributing to the Thesis
S. Mukhopadhyay, Jyotirmoy Ghosh, 2011, Studies on Fuzzy Logic and Dispositions for
Medical Diagnosis, International Journal of Computer Technology and Applications, Vol 2 (5),
pp. 1235-1240.
Appendix-A
(A Consultation Session)
$(IISMD)
(Enter source drive to read RULBASEl .LIB)
C:
****Do you want to begin a knowledge acquisition session?
< YES/ NO> NO *
Do you want to begin a fresh consultation ?
< YES/ NO> YES *
What is (are) pat-name ?
(CONSULTATION-OVER)
$(SYSTEM)
Appendix-B
The following is a portion of the rule base generated by the modified method for diabetes mellitus. A decision of 1 means that diabetes mellitus is present, and 0 means that it is absent.
LJM
RULE-24: If TYPE is 1 and PPB is Extrm High Then Decision is 1 With CF 100
RULE-26: If INS is 1 and PPB is Extrm High Then Decision is 1 With CF 100
RULE-27: If PPB is Extrm High and ALB is 1 Then Decision is 1 With CF 100
RULE-33: If AGE is Very High and DUR is Extrm High Then Decision is 0 With CF 96
ADH
RULE-10: If SEX is 1 and PPB is Extrm Low and ALB is 1 Then Decision is 1 With CF 100
RULE-12: If AGE is Very High and SEX is 1 and ALB is 1 Then Decision is 1 With CF 95
CTS CL
RULE-14: If SEX is 1 and DUR is Medium and INS is 0 Then Decision is 1 With CF 94
RULE-15: If AGE is High and SEX is 1 and DUR is Medium Then Decision is 1 With CF 92
CNCV
RULE-21: If SEX is 0 and FB is High and PPB is High Then Decision is 1 With CF 84
DUPY
FTS
RULE-13: If SEX is 1 and PPB is Extrm Low Then Decision is 1 With CF 100
DISH
RULE-14: If SEX is 0 and PPB is High and ALB is 1 Then Decision is 1 With CF 97
RULE-16: If AGE is High and PPB is High and ALB is 1 Then Decision is 1 With CF 94
GOUT
RULE-13: If INS is 0 and FB is Medium and PPB is Very Low Then Decision is 1 With CF 93
RULE-14: If FB is Medium and PPB is Very Low and ALB is 0 Then Decision is 1 With CF 93
OAH
RULE-11: If SEX is 1 and TYPE is 2 and INS is 1 Then Decision is 1 With CF 100
RULE-12: If TYPE is 2 and INS is 1 and ALB is 1 Then Decision is 1 With CF 100
RULE-13: If AGE is High and INS is 0 and ALB is 1 Then Decision is 0 With CF 97
RULE-14: If AGE is High and SEX is 1 and INS is 0 Then Decision is 0 With CF 97
RULE-15: If AGE is High and SEX is 1 and INS is 1 Then Decision is 1 With CF 94
RULE-16: If AGE is High and INS is 1 and ALB is 1 Then Decision is 1 With CF 94
OAK
RULE-07: If AGE is Extrm High and SEX is 1 Then Decision is 1 With CF 100
RULE-09: If AGE is Extrm High and PPB is Very Low Then Decision is 1 With CF 99
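Rules in the textual form listed in this appendix can be read into a simple data structure and matched against a case. The sketch below is only an illustration: the `Rule` class, `parse_rule` and `matches` are hypothetical helpers, and it assumes that a case has already been mapped to the same linguistic labels (for example "Very High") that appear in the rules.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    number: int
    conditions: dict  # attribute name -> required value, e.g. {"AGE": "High"}
    decision: int
    cf: int           # certainty factor in percent


RULE_PATTERN = re.compile(
    r"RULE-(\d+):\s*If\s+(.+?)\s+Then\s+Decision\s+is\s+(\d+)\s+With\s+CF\s+(\d+)"
)


def parse_rule(text):
    """Parse one rule line; wrapped lines are joined before matching."""
    match = RULE_PATTERN.fullmatch(" ".join(text.split()))
    if match is None:
        raise ValueError(f"not a rule: {text!r}")
    number, antecedent, decision, cf = match.groups()
    conditions = {}
    for clause in re.split(r"\s+and\s+", antecedent):
        attribute, value = clause.split(" is ", 1)
        conditions[attribute] = value
    return Rule(int(number), conditions, int(decision), int(cf))


def matches(rule, case):
    """True if every condition of the rule holds for the given case."""
    return all(case.get(attr) == value for attr, value in rule.conditions.items())


rule = parse_rule(
    "RULE-33: If AGE is Very High and DUR is Extrm High Then Decision is 0 With CF\n96"
)
case = {"AGE": "Very High", "DUR": "Extrm High", "SEX": "1"}
fired = matches(rule, case)
```

Keeping the conditions in a dictionary makes it straightforward to collect all rules that fire for a case and to report their certainty factors alongside the decision.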