
MODELLING INTELLIGENT SYSTEM

FOR
MEDICAL DIAGNOSIS

A Thesis submitted
in partial fulfillment of the requirements for the award of the degree

DOCTOR OF PHILOSOPHY

by
Jyotirmoy Ghosh
Regn. no. – Comp. Sc./Sc/491

Under the guidance and supervision of

Dr. Sripati Mukhopadhyay


Professor, Department of Computer Science

THE UNIVERSITY OF BURDWAN


BURDWAN-713104, WEST BENGAL, INDIA

2012
ABSTRACT

Modelling an intelligent system in the field of medical diagnosis is still challenging
work. Intelligent systems in medical diagnosis can serve as a supporting tool for
the medical practitioner, particularly in a country like India, with vast rural areas and
an absolute shortage of physicians. Intelligent systems in the field of medical diagnosis
can also reduce the cost of diagnosis and ease problems such as the diagnosis of
dynamic perturbations and the shortage of physicians.

An intelligent system may be considered an information system that provides
answers to queries relating to the information stored in the Knowledge Base (KB),
which is a repository of human knowledge.

This thesis is broadly divided into three parts. In the first part, we propose a prototype
model of an Interactive Intelligent System for Medical Diagnosis (IISMD) for the
diagnosis of diseases in a particular domain, here convulsion in infancy. IISMD
employs fuzzy logic for knowledge representation and Generalized Modus
Ponens (GMP) for inferencing. An explanation facility is also incorporated in
IISMD. The knowledge base of IISMD is constructed from a set of interconnected
rules. Each rule is associated with a certainty factor that expresses the degree of belief.
The proposed Interactive Intelligent System can directly assist physicians
and other health professionals in the diagnosis of diseases in a particular domain,
such as convulsion in infancy.

When modelling an intelligent system for medical diagnosis, acquiring knowledge from
a domain expert in the form of a set of interconnected rules is less efficient and less
reliable than constructing rules from observations and clinically tested data sets.

Thus, in the second part of this thesis, we propose a framework for a rough-fuzzy
intelligent system, that is, a hybridization of rough sets and fuzzy sets, incorporating
rough-fuzzy rules in a disease domain, here diabetes mellitus.

The proposed rough-fuzzy framework incorporates the advantages of fuzzy sets for the
linguistic representation of data and the handling of uncertainty and vagueness in data,
as well as the advantages of rough sets for handling impreciseness in data and generating
dependency rules from an experimental data set. Each rule is associated with a certainty
factor that represents its degree of correctness. The certainty factor of each rule is
constructed by combining the membership values of the fuzzy set and the rough set.

In the third part of this thesis, we propose a rough-fuzzy-neural network intelligent
system, that is, a hybridization of rough sets, fuzzy sets and neural networks, in a
disease domain, here diabetes mellitus. The proposed framework combines the
advantages of fuzzy sets for the linguistic representation of data and the handling of
uncertainty and vagueness in data, the advantages of rough sets for handling impreciseness
in data and generating dependency rules from a table of data, and the connectionist
structure of a neural network. As in the rough-fuzzy intelligent system proposed earlier,
each rule in this framework is associated with a certainty factor that
represents its degree of correctness. The certainty factor of each rule is constructed by
combining the membership values of the fuzzy set and the rough set. Finally, all the
generated rules are mapped to a fuzzy-neural network for refinement.

ACKNOWLEDGEMENT

At the outset, it is my fundamental duty to express my deepest and heartiest gratitude
to Dr. Sripati Mukhopadhyay, Professor, Dept. of Computer Science, The University
of Burdwan, my respected supervisor and esteemed teacher, whose constant guidance
and encouragement made this thesis possible.

I must take the opportunity to render my regards to Professor B. B. Paira, Director,
Heritage Institute of Technology, Kolkata, for kindly allowing me to carry out the work
of this thesis.

I express my sincere gratitude to Prof. Utpal Garain, CVPR unit, Indian Statistical
Institute, Kolkata, for his technical guidance and encouragement during the work of
this thesis.

I want to thank my respected teachers, Prof. Abhoy Chand Mondal, Associate
Professor & Head of the Department, and Prof. Sunil Karforma, Associate Professor,
Dept. of Computer Science, The University of Burdwan, for their constant
encouragement throughout the entire duration of my work.

I would also like to express my deepest gratitude to Professor N. C. Maiti, Director of
CAC, Heritage Institute of Technology, for his support and encouragement to
complete this thesis.

I would also like to convey my sincere thanks to my friend, Dr. Subhabrata Ray,
MBBS, MD (Medicine), RMO cum Clinical Tutor, Department of General Medicine,
North Bengal Medical College & Hospital, for his constant help and support in
supplying the data and sharing the required knowledge of the medical domain
throughout the work of this thesis.

I want to thank my friend Dr. Dipendra Nath Ghosh, Associate Professor, Dr. B. C.
Roy Engineering College, Durgapur, for his constant motivation and encouragement.

I would also like to thank my colleague Mr. Anirban Kundu, Assistant Professor,
Dept. of Computer Application, Heritage Institute of Technology, Kolkata, for
offering me help whenever I needed it most.

I am thankful to all my colleagues in the Dept. of Computer Application, Heritage
Institute of Technology, Kolkata, for their cooperation.

My parents and my sister were my constant source of inspiration and motivation for
completion of this thesis.

Last but not least, I am thankful to all my friends for encouraging me throughout
this time.

Jyotirmoy Ghosh

Jyotirmoy Ghosh
Assistant Professor
Department of Computer Application
Heritage Institute of Technology
Kolkata-700107



Declaration

I do hereby declare that the work embodied in this thesis entitled MODELLING
INTELLIGENT SYSTEM FOR MEDICAL DIAGNOSIS is the outcome of my original research
work unless otherwise specified. This research work was carried out under the guidance
and supervision of Dr. Sripati Mukhopadhyay, Professor, Department of Computer
Science, The University of Burdwan. Neither this work nor any part of this thesis has
been submitted to any other institution for the award of any other degree.

Date: 24th November, 2012 Jyotirmoy Ghosh


 
Dr. Sripati Mukhopadhyay Tel: (0342)-2634975 (Extn. 296)
Professor of Computer science Tel-Fax: (0342)2634015 (O)
& Registrar (Addl. Charge) E-Mail: registrar@buruniv.ac.in
The University of Burdwan dr.sripatim@gmail.com
Rajbati , Burdwan -713 104 http://www.buruniv.ac.in
West Bengal, India

CERTIFICATE

This is to certify that the thesis entitled MODELLING INTELLIGENT SYSTEM FOR MEDICAL
DIAGNOSIS, submitted by Mr. Jyotirmoy Ghosh, Asst. Professor, Department of Computer
Application, Heritage Institute of Technology, Kolkata, for the award of the degree of Doctor of
Philosophy under The University of Burdwan, is a record of bona fide research work carried out
by him under my supervision. The results included in this thesis, or any part of it, have not been
submitted to any other institution for the award of any other degree.

Date: 24th November, 2012 (Sripati Mukhopadhyay)

_____________________________________________________________________________
Table of Contents
Abstract

Acknowledgements

Declaration

Certificate from Supervisor

Table of Contents

Table of Figures

Table of Tables

1. Introduction
    1.1 Introduction and Goals
    1.2 Thesis Outline
    Bibliography
2. Knowledge-Based Intelligent System
    2.1 Introduction
    2.2 Knowledge-Based System: An Overview
    2.3 Components of Knowledge-Based System
        2.3.1 Knowledge Representation
            2.3.1.1 Traditional Data Models
            2.3.1.2 Object-Based Data Models
        2.3.2 Inference Mechanisms
            2.3.2.1 Backward-Chaining Inference Mechanism
            2.3.2.2 Forward Chaining Inference Mechanism
            2.3.2.3 Hybrid Inference Mechanism
        2.3.3 Uncertainty in KBS and Reasoning Process
            2.3.3.1 Certainty Factor
            2.3.3.2 Fuzzy Logic
            2.3.3.3 Belief Networks
            2.3.3.4 Case-Based Reasoning
            2.3.3.5 Dempster-Shafer Theory
    2.4 Fuzzy Knowledge-Based Intelligent Systems and related literature
        2.4.1 Architecture of FKBIS
        2.4.2 Knowledge Base
        2.4.3 Fuzzification
            2.4.3.1 Linguistic Variables
            2.4.3.2 Fuzzy Set Operations
            2.4.3.3 Membership Functions
        2.4.4 Inference System
            2.4.4.1 Mamdani Inference Method
            2.4.4.2 Takagi-Sugeno-Kang Inference Method
        2.4.5 Defuzzification
        2.4.6 Fuzzy Rule Determination
    2.5 Conclusion
    Bibliography
3 Interactive Intelligent System for Medical Diagnosis
    3.1 Introduction
    3.2 Knowledge Base and Inference Procedure
        3.2.1 Knowledge Base of IISMD
        3.2.2 Parameters Describing the Domain
        3.2.3 Knowledge Representation
        3.2.4 Inferencing
    3.3 Observation
    3.4 Conclusion
    Bibliography
4 Rough-Fuzzy Hybridization
    4.1 Introduction
    4.2 Fuzzy Set Theory
    4.3 Rough Set Theory
        4.3.1 Introduction
        4.3.2 Information System
        4.3.3 Decision System
        4.3.4 Indiscernibility Relation
        4.3.5 Approximation Space
        4.3.6 Approximation of Sets
            4.3.6.1 Accuracy of Approximation
            4.3.6.2 Approximation and Accuracy of Classification
            4.3.6.3 Classification and Reduction of Attributes
        4.3.7 Discernibility Matrix
        4.3.8 Decision Rules
    4.4 Combination of Rough and Fuzzy Sets
        4.4.1 Fuzzy Equivalence Classes
        4.4.2 Rough-Fuzzy Sets
        4.4.3 Fuzzy-Rough Sets
        4.4.4 Fuzzy-Rough Hybrids
        4.4.5 Fuzzy-Rough Reduction Process
    4.5 Conclusion
    Bibliography
5 Rough-Fuzzy Intelligent System
    5.1 Introduction
    5.2 Rough-Fuzzy Rule Generation
    5.3 Fuzzyfication of Data using Fuzzy Set Theory
    5.4 Rule Generation using Rough Set Theory
        5.4.1 Method 1
        5.4.2 Method 2
            5.4.2.1 For inconsistent information system
            5.4.2.2 For consistent information system
    5.5 Modified Framework to Generate Rough-Fuzzy Rule with CF
        5.5.1 For inconsistent information system
        5.5.2 For consistent information system
    5.6 Description of Medical Domain
    5.7 Application on Medical Data Set
    5.8 Conclusion
    Bibliography
6 Rough-Fuzzy-Neural Network Hybridization
    6.1 Introduction
    6.2 Neural Network System
        6.2.1 McCulloch-Pitts Neuron
        6.2.2 Hebb Nets or Rosenblatt’s Perceptron
        6.2.3 Adaline and Madaline models
        6.2.4 Multilayer Perceptron model
        6.2.5 Radial Basis Function Neural Network
        6.2.6 Other Neural Network Architectures
    6.3 Neuro-Fuzzy Systems
        6.3.1 Neural Fuzzy Systems
        6.3.2 Fuzzy Neural Networks
        6.3.3 Fuzzy-Neural Hybrid Systems
    6.4 Neuro-Fuzzy Models
        6.4.1 Multi-Layer Perceptron
        6.4.2 Fuzzy Multi-Layer Perceptron
            6.4.2.1 Desired Output Vector
            6.4.2.2 Process of Updating the Weights
            6.4.2.3 Learning Strategy
    6.5 Combination of Rough, Fuzzy and Neural Network
    6.6 Conclusion
    Bibliography
7 Intelligent System Based on Rough-Fuzzy-Neural Network
    7.1 Introduction
    7.2 Rough-Fuzzy Rule generation
        7.2.1 Basic Concept of Rough Set
        7.2.2 Fuzzyfication of Data using Fuzzy Set
        7.2.3 Rule Generation using Rough Set
            7.2.3.1 For consistent information system
            7.2.3.2 For inconsistent information system
    7.3 Mapping rules into the Fuzzy-Neural Network
    7.4 Application on Medical Data Set
    7.5 Conclusion
    Bibliography

8 Conclusion and Discussion

Publications Contributes Thesis
Appendix-A A Consultation Session
Appendix-B Portion of Rule-Set Generated by Rough-Fuzzy Intelligent System

Table of Figures

Figure 2.1: Classifications of KBS

Figure 2.2: Architecture of a fuzzy knowledge-based intelligent system

Figure 3.1: Graphical representation of 10 point possibility distribution of fuzzy
variables high, med and low

Figure 3.2: The production system inference cycle

Figure 4.1: A set approximation of an arbitrary set X in U

Figure 6.1: McCulloch-Pitts neuron

Figure 6.2: Hebb neuron – Rosenblatt’s perceptron

Figure 6.3: Adaline neuron

Figure 6.4: Perceptron neuron

Figure 6.5: Radial Basis Function network

Figure 6.6: The general structure of multi-layer perceptron

Figure 7.1: The FNN implementation of Rough-fuzzy rules

Table of Tables

Table 4.1: A medical data set DIABET

Table 4.2: A decision system DIABET

Table 4.3: The DIABET data set with the reduced attribute set B = {c1, c2, c3}

Table 4.5: A reduced MEDICAL decision table

Table 4.6: Discernibility Matrix M(B)

Table 5.1: Comparison result of existing method and modified method

Table 7.1: Comparison Result with Rough-Fuzzy Hybridization

Introduction

Chapter 1
Introduction
Medical science has often been considered a test domain for newly introduced
learning and reasoning techniques in computation [1, 2, 3, 4, 5]. Not only does the domain
of medicine, an important field of medical science, have many applications whose
solutions are important from a social perspective, but this domain is also extremely
difficult, with a wide range of confusing factors and aspects that demand distinct
attention, which makes it a technically challenging area. This association between
computational techniques and medicine has led to significant research work
in the past, but major research effort is still needed to
make a significant impact on the practice of medicine [6, 7].

Modelling an intelligent system in the field of medical diagnosis therefore remains
challenging work. Intelligent systems in medical diagnosis can serve as a
supporting tool for the medical practitioner, particularly in a country like India, with
vast rural areas and an absolute shortage of physicians.

1.1 Introduction and Goals


Medical diagnosis is still considered an art, despite all standardization efforts.
It is the art of determining a patient’s pathological status from an
available set of symptoms (findings). It is regarded as an art because it is a complicated
problem with many and manifold factors, and its solution draws on literally all of a
human's abilities, including intuition and the subconscious [8].

The process of medical diagnosis consists of evaluating a given set of
symptoms (findings), performing relevant pathological tests (patient’s test data), and
ultimately identifying the diseases consistent with those findings.

The functioning of the human body is characterized by the complex and highly
interactive chemistry of its organs and the psyche. This concerted effort results in
homeostasis and the equilibrium of all physiological quantities. This balance is
maintained at a level within physiological bounds that vary from individual to
individual. Deviations from it, whether due to internal or external causes, indicate
some kind of perturbation. Identifying the cause of these perturbations is the
goal of medical diagnosis.

Reaching a foolproof diagnosis is never an easy job for a medical practitioner. Even today,
it is often impossible to look inside a patient to determine the
primary cause that led to the series of effects and reactions the patient complains
about. The diagnosis is therefore based on indirect evidence, symptoms and knowledge
of the medical mechanisms that relate presumed causes to observed effects. The
problems of diagnosis arise not only from the incompleteness of knowledge but
also, more immediately, from the limitations of theoretical and practical knowledge of the
implications that lead from an initial cause to its observable effects.

Other difficulties in medical diagnosis include the following:

• The relations between medical diagnoses and their symptoms, that is, the cause-effect
relationships, are hardly ever one-to-one. Differentiating diagnoses that share
an overlapping range of symptoms is therefore inherently difficult.

• All observations are subject to error. Error correction methods are stochastic
in nature and require strong assumptions that do not always hold in practice.

• The required observations often cannot be made on a continuous basis.
Sometimes many diagnostically meaningful observations can only be obtained
at rather high risk or cost. The diagnosis therefore often has to be made with
significantly less information than desired.

• This is particularly a problem for the diagnosis of dynamic perturbations that
evolve over an extended period of time: uninterrupted recording of the time
course of physiologically decisive parameters is still more a desideratum than
reality. Medical practitioners are left with a lot to speculate about.

Although none of these problems, taken alone, is unique to the medical domain, taken
together they add up to an intricacy surpassing that of even the most sophisticated
man-made systems known today. It is therefore realistic to expect that medical diagnosis
will remain problematic for a long time.

As the world population ages, the patient-to-physician ratio keeps
increasing. Much of this situation is owed to the fact that medical diagnosis requires
proficiency in coping with uncertainty, vagueness and impreciseness, which motivates
the modelling of intelligent systems for medical diagnosis. Intelligent systems in the field of
medical diagnosis can also reduce the cost and the problems of diagnosing
dynamic perturbations.

An intelligent system may be considered an information system that provides
answers to queries relating to the information stored in the Knowledge Base (KB),
which is a repository of human knowledge.

Fuzzy set theory was introduced by Zadeh [9] as a powerful tool for formalizing
uncertain and vague knowledge. Together with appropriate rules of inference, that is, a
reasoning process, it provides a powerful framework for combining evidence
and deducing consequences from knowledge specified in syllogistic form [10].
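As a rough illustration of how a fuzzy rule of inference can deduce a consequence from vague evidence, the sketch below applies Generalized Modus Ponens to a single rule "IF x is A THEN y is B" over small discrete universes. It assumes max-min composition with the min (Mamdani) implication, which is one common choice and not necessarily the one used in this thesis; the function name and all membership values are hypothetical.

```python
def generalized_modus_ponens(a, b, a_observed):
    """Infer B' from the rule "IF x is A THEN y is B" and an observation A'.

    Assumed operators (not taken from the thesis): max-min composition
    with the min (Mamdani) implication, i.e.
        B'(y) = max over x of min(A'(x), min(A(x), B(y)))
    All fuzzy sets are lists of membership degrees over discrete universes.
    """
    return [
        max(min(obs, min(ax, by)) for obs, ax in zip(a_observed, a))
        for by in b
    ]


# Hypothetical membership degrees, chosen only for illustration.
temperature_high = [0.0, 0.3, 0.7, 1.0]   # A, over four temperature levels
risk_high = [0.0, 0.5, 1.0]               # B, over three risk levels
observed = [0.0, 0.2, 0.9, 0.6]           # A', a patient's fuzzy observation

inferred_risk = generalized_modus_ponens(temperature_high, risk_high, observed)
print(inferred_risk)
```

When the observation equals the rule antecedent and the antecedent reaches full membership somewhere, this composition returns the consequent unchanged, which is the classical modus ponens case.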

A number of papers have been published in the recent past on modelling intelligent
systems in the medical domain. A fuzzy-classifier-based medical decision support
system was developed by I-Jen Chiang et al. to deal with highly uncertain, noisy
data [11]. An automated model for medical diagnosis with fuzzy stochastic techniques
for monitoring chronic diseases in aged people was developed by L. Jeanpierre
and F. Charpillet [12]. An expert system for respiratory diseases can be seen in the
work of Musbah Jumah Aqel [13]. The successful development of an expert system in the
domain of child growth and development can be seen in the work of Saha, Mondal
and Samanta [14, 15].

In modelling intelligent systems for the medical domain, the importance of the
system's explanation capabilities is increasingly recognized. One survey of potential
users of medical diagnostic systems suggested that an explanation facility is the most
important capability for the acceptance of a clinical decision tool [16, 17].

This thesis is broadly divided into three parts. In the first part, we propose a prototype
model of an Interactive Intelligent System for Medical Diagnosis (IISMD) [18] for the
diagnosis of diseases in a particular domain, here convulsion in infancy. IISMD
employs fuzzy logic for knowledge representation and Generalized Modus
Ponens (GMP) for inferencing. An explanation facility is also incorporated in
IISMD. The knowledge base of IISMD is constructed from a set of interconnected
rules. Each rule is associated with a certainty factor that expresses the degree of belief.
The proposed Interactive Intelligent System can directly assist physicians
and other health professionals in the diagnosis of diseases in any particular domain;
convulsion in infancy is considered as a case study in this thesis.

In medical diagnosis, the decisions made by physicians are in many instances arbitrary,
highly variable from one physician to another and often lacking explanation or “rationalization”
[19, 20]. The decisions involved in modern medicine are often very complex, but evidence for
the best choice to be made is often lacking. Clinical examples of this phenomenon in
diagnosis are abundant and easy to understand.

Thus, when modelling an intelligent system for medical diagnosis, acquiring knowledge
from a domain expert in the form of a set of interconnected rules is less efficient
and less reliable than constructing rules from observations and clinically tested data sets.

In the second part of this thesis, we therefore propose a framework for a rough-fuzzy
intelligent system, that is, a hybridization of rough sets [21] and fuzzy sets,
incorporating rough-fuzzy rules in a disease domain, here diabetes mellitus
[22].

Rule generation techniques have been widely developed and used in data mining
to develop intelligent systems in many application areas [23], such as medical
diagnosis, decision-making, classification and prediction. A single technique is not
enough to provide a more flexible and robust information processing system, since
each technique has its own advantages. Hybridizations of rough sets and fuzzy
sets for rule generation have therefore been introduced [24-30].

Our proposed rough-fuzzy framework [22] incorporates the advantages of fuzzy sets
for the linguistic representation of data and the handling of uncertainty and vagueness in
data, as well as the advantages of rough sets for handling impreciseness in data and
generating dependency rules from a table of data. Each rule is associated with a certainty
factor that represents its degree of correctness. The certainty factor of each rule is
constructed by combining the membership values of the fuzzy set and the rough set.
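To make the idea of combining the two membership values concrete, the sketch below computes the standard rough membership of each object in a decision class (the share of its indiscernibility class that belongs to the decision class) and multiplies it by a fuzzy membership degree. The product is only an assumed combination rule for illustration, since this chapter does not state the formula used in the thesis; the data, the function name and the fuzzy degrees are all hypothetical.

```python
from collections import defaultdict


def rough_membership(rows, decisions, target):
    """Rough membership of every object in the decision class `target`.

    For object x with indiscernibility class [x] (objects sharing the same
    condition-attribute values) the value is |[x] & X| / |[x]|, where X is
    the set of objects whose decision equals `target`.
    """
    classes = defaultdict(list)
    for index, row in enumerate(rows):
        classes[tuple(row)].append(index)

    result = [0.0] * len(rows)
    for members in classes.values():
        hits = sum(1 for i in members if decisions[i] == target)
        share = hits / len(members)
        for i in members:
            result[i] = share
    return result


# Hypothetical fuzzified records: (glucose level, age group) -> diabetic?
rows = [("high", "old"), ("high", "old"), ("low", "young"), ("high", "old")]
decisions = ["yes", "yes", "no", "no"]

rough = rough_membership(rows, decisions, "yes")

# Hypothetical fuzzy membership of each record in its linguistic terms.
fuzzy = [0.9, 0.6, 0.8, 0.3]

# Assumed combination rule (product); the thesis may use a different one.
certainty = [r * f for r, f in zip(rough, fuzzy)]
print(certainty)
```

In this toy table the first, second and fourth records are indiscernible but disagree on the decision, so their rough membership is 2/3 and any rule built from them carries a certainty factor below its fuzzy degree.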

To provide more efficient, reliable, accurate, flexible and robust information
processing systems, hybridizations of soft computing methodologies have been introduced for
modelling intelligent systems in areas such as rule generation [31, 32], prediction
[33], classification [34], feature selection [35], knowledge encoding [36], pattern
recognition [37, 38], industrial wastewater treatment [39] and so on.

In the third part of this thesis, we propose a rough-fuzzy-neural network intelligent
system, that is, a hybridization of rough sets, fuzzy sets and neural networks, in a
disease domain, here diabetes mellitus. The proposed framework combines the
advantages of fuzzy sets for the linguistic representation of data and the handling of
uncertainty and vagueness in data, the advantages of rough sets for handling impreciseness
in data and generating dependency rules from a table of data, and the connectionist
structure of a neural network. As in the rough-fuzzy intelligent system proposed earlier,
each rule in this framework is associated with a certainty factor that
represents its degree of correctness. The certainty factor of each rule is constructed by
combining the membership values of the fuzzy set and the rough set. Finally, all the
generated rules are mapped to a fuzzy-neural network for refinement.

This thesis concerns itself with tools for modeling intelligent systems in medical
diagnosis. As such, the context in which to place this work is an intersection of
several scientific areas, perhaps the two most relevant being the field of data
mining and knowledge discovery and the field of medical informatics. This work
focuses on developing tools and techniques from a certain subfield of the former
area and applying them in the latter.

1.2 Thesis Outline

The rest of this thesis is organized as follows:

• Chapter 2: Knowledge Based Intelligent System. This chapter gives a systematic
overview of current techniques for developing intelligent systems in different
domains, including the domain of medical diagnosis. The literature on and
architecture of knowledge-based systems, expert systems, decision support
systems, knowledge-based intelligent systems and fuzzy knowledge-based
intelligent systems are also reviewed. Part of this chapter draws on the
publication appearing in [10]. Limitations of the present techniques are also
discussed.


• Chapter 3: Interactive Intelligent System for Medical Diagnosis (IISMD). This
chapter describes all the modules of the proposed system IISMD. IISMD is
built employing fuzzy logic for knowledge representation and Generalized Modus
Ponens (GMP) for inferencing, for the diagnosis of diseases in a specific
domain, namely diseases that cause convulsion in infancy. It is an extended
version of the work appearing in [18].

• Chapter 4: Rough-Fuzzy Hybridization. This chapter reviews most of the
functionalities of rough set theory, describing them through examples to make
the theory accessible to readers. A theoretical investigation of fuzzy sets and
rough-fuzzy sets is also presented, with a focus on the combination of rough
sets and fuzzy sets. Many proposed frameworks that combine fuzzy and rough sets
are reviewed.

• Chapter 5: Rough-Fuzzy Intelligent System. This chapter describes the
hybrid rough-fuzzy intelligent system. The proposed intelligent system is
developed using rough-fuzzy rules generated from a clinical dataset of the
disease diabetes mellitus. These rough-fuzzy rules are associated with a
degree of certainty known as the certainty factor. Certainty factors are
generated using both fuzzy and rough membership values. It is an extended
version of the work appearing in [22].

• Chapter 6: Rough-Fuzzy Neural Network Hybridization. This chapter presents a
theoretical investigation of neural network systems, neuro-fuzzy systems and
hybrid fuzzy neural networks, with a focus on rough-fuzzy neural network
hybridizations. Many proposed frameworks that combine fuzzy sets, rough sets
and neural networks are reviewed.

• Chapter 7: Intelligent System Based on Rough-Fuzzy Neural Network. This
chapter describes all the modules of the proposed intelligent system based on
a rough-fuzzy neural network. The proposed intelligent system is developed
using rough-fuzzy rules that are generated from a clinical dataset of the
disease diabetes mellitus and then refined by mapping them to a fuzzy-neural
network. A comparison with the rough-fuzzy intelligent system described in
Chapter 5 is also presented.


• Chapter 8: Conclusion and Future Work. The thesis is concluded in this
chapter with a summary of the key findings of the research conducted here.
There is also a discussion of future work to be carried out in the area of
modeling intelligent systems in general and of hybridization of soft computing
techniques in particular.

Bibliography

[1] G. A. Gorry, 1973, Computer-assisted clinical decision making, Methods of
Information in Medicine, 12, pp. 45–51.

[2] Robert S. Ledley and Lee B. Lusted, 1959, Reasoning foundations of medical
diagnosis, Science, 130(3366), pp. 9–21.

[3] R. A. Nordyke, C. A. Kulikowski, and C. W. Kulikowski, 1971, A comparison
of methods for the automated diagnosis of thyroid dysfunction, Computers and
Biomedical Research, 4, pp. 374–389.

[4] W. B. Schwartz, 1970, Medicine and the computer: The promise and
problems of change, New England Journal of Medicine, 283, pp. 1257–1264.

[5] E. H. Shortliffe, R. Davis, S. G. Axline, B. G. Buchanan, C. C. Green, and
S. N. Cohen, 1975, Computer-based consultations in clinical therapeutics:
Explanation and rule acquisition capabilities of the MYCIN system,
Computers and Biomedical Research, 8(4), pp. 303–320.

[6] Enrico W. Coiera, 1996, Artificial intelligence in medicine: The challenges
ahead, Journal of the American Medical Informatics Association, 3(6), pp.
363–366.

[7] Peter Szolovits, Ramesh S. Patil, and William B. Schwartz, 1988, Artificial
intelligence in medical diagnosis, Annals of Internal Medicine, 108(1), pp.
80–87.

[8] F. Steimann, K. P. Adlassnig, 1998, Fuzzy medical diagnosis, In: E. Ruspini,
P. Bonissone, W. Pedrycz (Eds.), Handbook of Fuzzy Computation, IOP
Publishing and Oxford University Press, G13.1.


[9] L. A. Zadeh, 1965, Fuzzy Sets, Information and Control, 8, pp. 338–353.

[10] S. Mukhopadhyay, Jyotirmoy Ghosh, 2011, Studies on Fuzzy Logic and
Dispositions for Medical Diagnosis, International Journal of Computer
Technology and Applications, Vol. 2 (5), pp. 1235–1240.

[11] I-Jen Chang, Ming-Jium Shieh, Jane Yung-Jen Hsu, Jau-Min Wong, 2005,
Building a medical decision support system for colon polyp screening by using
fuzzy classification trees, Applied Intelligence, Vol. 22, pp. 61–75.

[12] Laurent Jeanpierre, Francois Charpillet, 2004, Automated medical diagnosis
with fuzzy stochastic models: monitoring chronic diseases, Acta Biotheoretica,
Vol. 52, pp. 291–311.

[13] Musbah Jumah Aqel, 2002, A cooperative decision-making multi perspective
expert system for respiratory system diseases, Advances in Modeling (B),
AMSE Press, France, Vol. 45, No. 4, pp. 21–28.

[14] A. K. Saha, P. Mondal and R. K. Samanta, 1992, A prototyped-object-oriented
expert system for child growth and development, Modelling, Measurement and
Control (C), AMSE Press, France, Vol. 31, No. 2, pp. 13–24.

[15] P. Mondal, A. K. Saha and R. K. Samanta, 1993, On an expert system with
child growth and development, Advances in Modelling & Analysis (B),
AMSE Press, France, Vol. 26, No. 1, pp. 13–28.

[16] R. L. Teach and E. H. Shortliffe, 1981, An analysis of physician attitudes
regarding computer-based clinical consultation systems, Comput. Biomed.
Res. 14, pp. 542–558.

[17] J. W. Wallis and E. H. Shortliffe, 1982, Explanatory power for medical expert
systems: studies in the representation of causal relationships for clinical
consultations, Meth. Info. Med. 21, pp. 127–136.

[18] S. Mukhopadhyay, J. Ghosh, D. Ghosh Dastidar, 2007, IISMD: An Interactive
Intelligent System for Medical Diagnosis, Modelling, Measurement and
Control (C), AMSE Press, France, Vol. 68, No. 3, pp. 1–12.


[19] D. M. Eddy, 1990, The challenge, Journal of American Medical Association,
263, pp. 287–290.

[20] M. Berg, 1997, Rationalizing Medical Work, Decision-support Techniques
and Medical Practices. MIT Press.

[21] Z. Pawlak, 1982, Rough sets, International Journal of Computer and
Information Sciences 11, pp. 341–356.

[22] J. Ghosh, S. Mukhopadhyay, 2011, Role of Certainty Factor in Rough-Fuzzy
Rule Generation, International Journal of Computer Science, Engineering and
Applications (IJCSEA), Vol. 1, No. 6, pp. 49–61.

[23] S. Mitra, Sankar K. Pal, and P. Mitra, 2002, Data Mining in Soft Computing
Framework: a Survey, IEEE Transactions on Neural Networks, vol. 13, no. 1,
pp. 3–14.

[24] Shusaku Tsumoto, 2004, Mining diagnostic rules from clinical databases
using rough sets and medical diagnostic model, Information Sciences 162,
pp. 65–80.

[25] S. Tsumoto, H. Tanaka, 1995, PRIMEROSE: Probabilistic rule induction
method based on rough sets and resampling methods, Computational
Intelligence 11, pp. 389–405.

[26] P. J. Lingras, 2002, Rough Set Clustering for Web Mining, Proceedings of
2002 IEEE International Conference on Fuzzy Systems, Hawaii, pp. 12–17.

[27] Wlodzislaw Duch, Rafal Adamczak and Krzysztof Grabczewski, 2000, A new
methodology of extraction, optimization and application of crisp and fuzzy
logical rules, IEEE Transactions on Neural Networks, vol. 11, no. 2, pp. 1–31.

[28] Sankar K. Pal, Pabitra Mitra, 2001, Case Generation: A Rough-fuzzy
Approach, In: Proc. Intl. Conf. Case Based Reasoning (ICCBR2001),
Vancouver, Canada.

[29] Gang Xie, Fang Wang, Keming Xie, 2004, RST-Based System Design of
Hybrid Intelligent Control, IEEE International Conference on Systems, Man
and Cybernetics.


[30] S. K. Pal and A. Skowron, Eds., 1999, Rough Fuzzy Hybridization: A New
Trend in Decision Making, Singapore: Springer-Verlag.

[31] Shi-tong Wang, Dong-jun Yu and Jing-yu Yang, 2003, Integrating rough
set theory and fuzzy neural network to discover fuzzy rules, Intelligent Data
Analysis 7, pp. 59–73.

[32] S. K. Pal, S. Mitra, P. Mitra, 2003, Rough-Fuzzy MLP: Modular
Evolution, Rule Generation, and Evaluation, IEEE Transactions on Knowledge
and Data Engineering, Vol. 15, No. 1, pp. 14–25.

[33] Jing Hong, 2011, An Improved Prediction Model based on Fuzzy-rough Set
Neural Network, International Journal of Computer Theory and Engineering,
Vol.3, No.1, pp.1793-8201.

[34] M. Sarkar, B. Yegnanarayana, 1998, Fuzzy-rough neural networks for vowel
classification, IEEE International Conference on Systems, Man, and
Cybernetics.

[35] Junyang Zhao and Zhili Zhang, 2011, Fuzzy Rough Neural Network and Its
Application to Feature Selection, International Journal of Fuzzy Systems, Vol.
13, No. 4.

[36] M. Banerjee, S. Mitra, and S. K. Pal, 1998, Rough fuzzy MLP: Knowledge
encoding and classification, IEEE Transactions on Neural Networks, vol. 9,
no. 6, pp. 1203–1216.

[37] D. Zhang and Y. Wang, 2006, Fuzzy-rough Neural Network and Its
Application to Vowel Recognition, Control and Decision, vol. 21, no. 2, pp.
221–224.

[38] D. Zhang, Y. Wang, H. Huang, 2007, Fuzzy-rough membership function
neural network and its application to pattern recognition, Proc. SPIE 6788,
MIPPR 2007: Pattern Recognition and Computer Vision, 67882N.

[39] W. C. Chen, N-B. Chang, J-C. Chen, 2003, Rough set-based hybrid fuzzy-
neural controller design for industrial wastewater treatment, Water Research
37, pp. 95–107.


Chapter 2

Knowledge-Based Intelligent System


A Knowledge-Based Intelligent System (KBIS) is an Expert System (ES), which can be
defined as “…an intelligent computer program that uses knowledge and inference
procedures to solve problems that are difficult enough to require significant human
expertise for their solution” (Giarratano and Riley, 2005) [1]. Generally, a KBIS
provides expert advice to users in response to their queries in a certain
knowledge domain. The knowledge base and the inference engine are the two main
components of a KBIS [1, 2]. The knowledge base is a repository of the knowledge,
expertise, experience or heuristics derived from human experts.
The inference engine comprises different types of reasoning processes that
intelligently and automatically draw conclusions from facts or other forms of
information delivered by users. Various soft computing techniques are usually
applied to the inference engine to produce a reasoning process suited to the
reasoning features of knowledge-intensive problems. In the last few decades, many
knowledge-based intelligent systems have been developed in several fields.

2.1 Introduction

A KBIS or expert system must ensure that it provides its users with accurate
advice or correct solutions to their problems. In a KBIS, efficiency, accuracy and
reliability may be verified by investigating whether the knowledge base contains all
the necessary information and whether the stored information is interpreted and
applied correctly. Knowledge-base debugging, the process of checking that a
knowledge base is correct and complete, is one component of the larger problem of
knowledge acquisition. This process involves testing and refining the knowledge
base to find the errors introduced when transferring expert knowledge from a human
expert to a KBIS, and correcting those errors.

The development of KBIS and ES started in the 1970s. INTERNIST-I was an expert
system designed in the early 1970s to diagnose multiple diseases in internal
medicine by modelling the behaviour of clinicians. By 1982, the INTERNIST-I project
represented fifteen years of work and, by some reports, covered 70-80% of all the
possible diagnoses in internal medicine.

A few years later, in 1976, MYCIN was developed by Shortliffe [3]. The
knowledge base of MYCIN was constructed from a set of rules in the area of bacterial
infection. Initially it comprised 500 rules and used the backward-chaining inference
procedure. In the early 1980s, MYCIN was generalized to EMYCIN [4], a shell for
building other expert systems, and on this basis PUFF was presented in 1982 [5] as a
medical expert system. PUFF interprets measurements from respiratory tests to
determine the presence and severity of lung disease in the patient.

Shortliffe, in 1986 [6], explained the history of expert systems and medical expert
systems. HERMES is another medical expert system, in the domain of hepatic diseases,
designed by Bonfa et al. in 1993 [7]. It was further developed in 1989 as an
integrated expert system that incorporates an RDBMS. Hudson and Cohan, in 1994,
developed medical expert systems that incorporate fuzzy logic to handle the
uncertainty that lies behind imprecise or inaccurate information [8]. Schmidt et
al., in 2001, explained Case-based Reasoning (CBR) and its possible application as a
reasoning technique in medical ESs [9]. ESTDD is an Expert System for Thyroid
Diseases Diagnosis developed by Keleş and Keleş in 2006 [10]. This work stressed the
importance of early diagnosis: ignoring the symptoms can lead to death, whereas
early discovery can help to control the disease. Akbarzadeh-T et al., in 2007,
proposed a general method for classification and diagnosis in medical systems,
realized as a diagnostic expert system for aphasia diagnosis [11]. Moreover, Kumar
et al., in 2009, presented a hybrid approach using CBR and Rule-based Reasoning for
domain-independent clinical decision support in Intensive Care Units (ICU) [12].

Many other KBIS and ES that incorporate different soft computing techniques for
greater efficiency, accuracy and reliability have been developed; these are
discussed later in this thesis.

The organization of this chapter is as follows: Section 2.2 is devoted to an
overview of knowledge-based systems. Section 2.3 deals with the components of a
knowledge-based system. A later section describes the history of the development of
fuzzy knowledge-based intelligent systems and the related literature.

2.2 Knowledge-based System: An overview

The information processing community accepts that “Knowledge-based systems
(KBS) are tools for building applications that draw logical inferences from their
stored knowledge of the problem domain” [13]. In this thesis, the terms KBS and
expert system are used synonymously. In the research and development of KBS over
the last few decades, several kinds of systems have been developed, such as
database systems, Artificial Intelligence (AI), Expert Systems (ES), Intelligent
Systems (IS), Decision Support Systems (DSS) and so on. All of them have more or
less the same framework but place different emphasis on its components. For
example, ES and KBS are strong in their inference mechanisms, while a database has
the advantage of storing huge volumes of data. However, database systems are less
often used to develop a KBS [14]. AI is the study of computing tasks that would
require intelligence if carried out by human beings. DSS is generally used in
decision making and problem solving, and is used extensively in business and
industry [15]. ES and IS are the best-known industrial applications of AI; they are
built on the knowledge of human experts, captured in a knowledge base, to solve
problems that normally require human expertise [15]. Figure 2.1 shows the general
classification of KBS.

Databases are now widely used in most industrial areas. A database is a
shared collection of structured, interrelated data designed to meet the needs of an
organization’s information system [16]. Because the information structure of an
organization is usually complex, a suitable data model, such as an E-R diagram, is
used at design time to represent the information in the database. A data model is a
complete architecture for describing data, relationships between objects and
constraints on the objects. It usually comprises three essential components [16]:

1. Structural part: A set of rules that are used to build the database.
2. Manipulative part: The operations that are permitted on the data.
3. Set of integrity rules: Rules that ensure the accuracy of the data.
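
The three components can be illustrated with a small relational example. The sketch
below uses Python's standard sqlite3 module; the table name, column names and sample
values are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Structural part: the schema states how the data is organized.
# Integrity rules: NOT NULL and CHECK constrain which values are acceptable.
conn.execute(
    """
    CREATE TABLE patient (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        age  INTEGER NOT NULL CHECK (age >= 0)
    )
    """
)

# Manipulative part: the operations permitted on the data (insert, query, ...).
conn.execute("INSERT INTO patient (name, age) VALUES (?, ?)", ("A. Example", 2))
rows = conn.execute("SELECT name, age FROM patient").fetchall()

# An insert that violates an integrity rule is rejected by the database.
try:
    conn.execute("INSERT INTO patient (name, age) VALUES (?, ?)", ("B. Example", -1))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```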

ES is another technology that is used to represent aspects of the real world. It is
an application of the development of AI. An ES is a set of computer programs,
built on knowledge derived from human experts in a particular domain, that
solves problems requiring extensive human expertise in that domain; it must be
capable of solving such problems directly [17]. Like a database system, an expert
system contains a store of knowledge, the knowledge base, but the knowledge is
represented differently, for example as production rules, which are a major
component of an expert system. Production rules can be used to infer new instances
of objects, or new instances of a relationship among objects, from existing objects.

Databases and ES were developed to represent different aspects of real-world
problems. Database systems can store huge amounts of structured, interrelated data
with sophisticated data management facilities, whereas ES are designed to store
production rules and have a strong reasoning ability. Combining these two
technologies would therefore benefit both kinds of system: a hybrid of an ES and a
database system has both a strong reasoning ability and the ability to access a
huge volume of facts and rules [14].

[Figure: diagram with the labels Knowledge Based System, Artificial Intelligence,
Decision Support System, Intelligent & Expert System, Database System and Other
System]

Figure 2.1 Classifications of KBS

2.3 Components of Knowledge-Based System

KBSs are tools for developing applications that draw logical inferences from their
stored knowledge of a specific problem domain [13]. Both database systems and ES
are particular kinds of KBS and have more or less the same general architecture. The
basic structure of a KBS comprises a knowledge base and an inference
engine. The knowledge base contains facts, rules, heuristics and other related
information that are used by the inference mechanism to produce expert opinion and
other valuable resources for the users.

Knowledge representation in the KB is an important step of knowledge organization.
A knowledge representation plays a role similar to that of a data model for data
representation in a database system. There is nevertheless a difference between
the two, because knowledge is more than data. Data and information together create
the knowledge that is necessary to make inferences and to reach conclusions [15],
including facts, theories, heuristics, relationships, attributes, observations,
definitions and so on. Today’s database management systems, such as Oracle and SQL
Server, have the capability to represent knowledge and use programming languages
for the inference process. All these database management systems use a traditional
data representation model, namely the relational data model.

2.3.1 Knowledge Representation

Knowledge representation in a KB requires a knowledge representation model.
Researchers have used different data models for differently structured knowledge
domains. At present there are several standard data models for data (knowledge)
representation, each with its own advantages and disadvantages. Data models for
knowledge representation are evaluated on two basic abilities: the ability to
represent knowledge naturally, and the ability to search and modify knowledge
easily. Knowledge representation models fall into the following two categories:

• Traditional data models and

• Object-based data models.

2.3.1.1 Traditional Data Models

Traditional data models include the hierarchical data model, the network data model
and the relational data model [16]. They are all record-based logical data models.

Network model: In the network model, data is represented as collections of records
and relationships are represented by sets [16].

Some data are more naturally modeled with more than one parent per child, so the
network model permits the modeling of many-to-many relationships in data. In
1971, the Conference on Data Systems Languages (CODASYL) formally defined the
network model. The basic data modeling construct in the network model is the set
construct. A set consists of an owner record type, a set name, and a member record
type. A member record type can have that role in more than one set; hence the
multiple-parent concept is supported. An owner record type can also be a member
or owner in another set. The data model is a simple network, and link and
intersection record types may exist, as well as sets between them. Thus, the
complete network of relationships is represented by several pairwise sets; in each
set, one record type is the owner (at the tail of the network arrow) and one or more
record types are members (at the head of the relationship arrow). Usually, a set
defines a 1:N relationship, although 1:1 is permitted. The CODASYL network model is
based on mathematical set theory.
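
The set construct can be sketched in a few lines of Python. This is only an
illustrative model of owner and member record types, not CODASYL syntax; the record
types DOCTOR, PATIENT and VISIT and the set names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SetType:
    name: str      # set name
    owner: str     # owner record type
    members: list  # member record types


@dataclass
class SetOccurrence:
    set_type: SetType
    owner_record: dict
    member_records: list = field(default_factory=list)


# Hypothetical schema: VISIT is a member record type in two different sets,
# so a single visit record can have two "parents".
treats = SetType("TREATS", owner="DOCTOR", members=["VISIT"])
has_visit = SetType("HAS_VISIT", owner="PATIENT", members=["VISIT"])

visit = {"type": "VISIT", "date": "2012-01-05"}
by_doctor = SetOccurrence(treats, {"type": "DOCTOR", "name": "Dr. X"}, [visit])
by_patient = SetOccurrence(has_visit, {"type": "PATIENT", "name": "P. Y"}, [visit])


def owners_of(record, occurrences):
    """Names of all owner records whose set occurrence contains this record."""
    return [
        occ.owner_record["name"]
        for occ in occurrences
        if any(member is record for member in occ.member_records)
    ]


parents = owners_of(visit, [by_doctor, by_patient])
```

Each set is a 1:N relationship from its owner to its members; the VISIT record acts
as a link record, so the two sets together express a many-to-many relationship
between doctors and patients.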

Hierarchical model: The hierarchical data model is a restricted network model, in
which each node is allowed to have only one parent [16].

The hierarchical data model organizes data in a tree structure. There is a hierarchy
of parent and child data segments. This structure implies that a record can have
repeating information, generally in the child data segments. Data is stored as a
series of records, each of which has a set of field values attached to it. All the
instances of a specific record are collected together as a record type. These record
types are the equivalent of tables in the relational model, with the individual
records being the equivalent of rows. To create links between these record types,
the hierarchical model uses parent-child relationships, which are 1:N mappings
between record types. These mappings are implemented using trees.
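
A minimal sketch of such a tree is given below. The class name Segment and the
record types PATIENT, VISIT and SYMPTOM are hypothetical; the point is only that
every segment has exactly one parent and any number of children.

```python
class Segment:
    """One record in a hierarchy: a record type, field values, one parent."""

    def __init__(self, record_type, fields, parent=None):
        self.record_type = record_type
        self.fields = fields
        self.parent = parent  # at most one parent per segment
        self.children = []    # 1:N parent-child relationship

    def add_child(self, record_type, fields):
        child = Segment(record_type, fields, parent=self)
        self.children.append(child)
        return child


def path_to_root(segment):
    """Record types met when walking from a segment up to the root."""
    path = []
    while segment is not None:
        path.append(segment.record_type)
        segment = segment.parent
    return path


root = Segment("PATIENT", {"name": "P. Y"})
first_visit = root.add_child("VISIT", {"date": "2012-01-05"})
second_visit = root.add_child("VISIT", {"date": "2012-02-10"})
symptom = first_visit.add_child("SYMPTOM", {"name": "fever"})

path = path_to_root(symptom)
```

Because each segment stores a single parent reference, a record that logically
belongs to two parents would have to be duplicated, which is the restriction that
the network model removes.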

Relational model: The relational model is based on mathematical relations, in which
data and relations are both represented as tables [16].

The relational data model is based on the mathematical theory of sets and relations.
In this model, data relationships are represented by tables: each horizontal row
describes a record (tuple) and each column describes one of the attributes (data
fields) of the record. Today the majority of commercial database systems are based
on the relational data model, which provides data independence.

2.3.1.2 Object-Based Data Models

The main drawback of the record-based data models is that they provide no means of
specifying constraints on the data. To represent real-world data more easily,
object-based data models are used in design, such as the entity-relationship data
model and semantic data models like the functional data model and the
object-oriented data model.

Entity-relationship model: The entity-relationship model is a high-level conceptual
data model that is commonly used as a foundation for the logical data model design
of a database system [16]. It includes entities, relationship types and
attributes.

Functional model: The functional data model is based on entities and functions, and
also provides natural languages to give a more natural representation [18].
Database systems built on it have a data definition and manipulation language that
provides an interface allowing users to model directly how they think about the
solution of a specific problem [18]. Thus, the functional data model has served as a
suitable formal and practical basis for object-oriented databases. Unlike the
relational model, it does not need extra tables to be created: multi-valued
functions allow many-to-many or one-to-many relations [18].

Object-oriented model: The use of the object-oriented data model has increased
rapidly since the development of object-oriented programming [16]. It is a
logical data model that captures the semantics of objects supported by
object-oriented programming. It models real-world problems more closely by
supporting the building of complex objects that are a more natural and realistic
representation of real-world objects. Object-oriented databases have several
advantages [16]:

• Enhanced modeling capabilities: An object encapsulates both state and
behavior, so it represents real-world objects naturally. This provides
higher-performance management of objects and enables better management of the
complex interrelationships between objects.
• Removal of impedance mismatch: It eliminates many problems that occur in
mapping a declarative language to an imperative language.
• Support for long-duration transactions.

The main drawback of object-oriented databases is that there is no universally
agreed data model for an object-oriented database management system, and most
models lack a theoretical foundation.

Production rules: Three data representation models are mostly used in conventional
ES and KBS to represent domain knowledge [19]: production rules, structured objects
and predicate logic. Production rules are certainly the most predominant of the
three. When a KBS or KBIS is designed using production rules, it is formally said to
be a production system.

Production rules, also known as “condition-action” rules or “if-then” rules, consist
of condition and action pairs of the form “IF condition THEN action”. The condition
part and the action part may each contain more than one condition or action,
combined by the logical “and” operator.

The production rules define a set of permissible transformations that move a problem
from its initial statement to its solution.

The inference engine in a KBS or KBIS matches the stored facts against the condition
(IF) part of the rules in the knowledge base, and then asserts the data stored in
the action (THEN) part of each matching rule in order to reach the goal.
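
A single match-and-fire cycle of this kind can be sketched as follows. The Rule
class, the function name fire_once and the two sample rules are hypothetical and
have no clinical validity; they only illustrate the IF/THEN structure with
conditions combined by logical “and”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    conditions: frozenset  # IF part: all conditions must hold (logical "and")
    actions: frozenset     # THEN part: facts asserted when the rule fires


def fire_once(rules, facts):
    """One match-fire cycle: fire every rule whose IF part is satisfied."""
    new_facts = set(facts)
    for rule in rules:
        if rule.conditions <= facts:  # every condition is among the known facts
            new_facts |= rule.actions
    return new_facts


# Illustrative rules only, invented for this sketch.
rules = [
    Rule(frozenset({"fever", "convulsion"}), frozenset({"suspect_febrile_seizure"})),
    Rule(frozenset({"rash"}), frozenset({"suspect_allergy"})),
]

facts = {"fever", "convulsion"}
result = fire_once(rules, facts)
```

Only the first rule fires, because the second rule's condition "rash" is not among
the stored facts.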

2.3.2 Inference Mechanisms

The inference mechanism of a KBS controls the reasoning process of the system. It
uses the knowledge stored in the knowledge base of a KBS to infer decisions for
the solution of a specific problem. All the decisions inferred by the inference
mechanism, together with the input for a specific problem, form another part of a
KBS usually known as the context or the working memory.

The two main inference mechanisms of a KBS are forward chaining, a data-driven
mechanism, and backward chaining, a goal-driven mechanism. Forward chaining
starts from an initial state of known facts and progresses through the selected
problem, using the knowledge stored in the KB to reach a goal. Backward chaining
starts from a goal state or hypothesis and reasons backwards, using the facts to
support or drop the assumed hypothesis.
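
The two mechanisms can be contrasted on a toy rule base. The sketch below is a
minimal propositional version, assuming each rule has a set of conditions and a
single conclusion; the rule contents and function names are hypothetical.

```python
# Each rule is (set of conditions, conclusion).
RULES = [
    ({"a", "b"}, "c"),
    ({"c"}, "d"),
    ({"e"}, "f"),
]


def forward_chain(rules, facts):
    """Data-driven: keep firing rules until no new fact can be derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(rules, facts, goal, _seen=None):
    """Goal-driven: prove the goal by recursively proving rule conditions."""
    seen = _seen or set()
    if goal in facts:
        return True
    if goal in seen:  # already being attempted on this path: avoid cycles
        return False
    seen = seen | {goal}
    return any(
        all(backward_chain(rules, facts, c, seen) for c in conditions)
        for conditions, conclusion in rules
        if conclusion == goal
    )


derived = forward_chain(RULES, {"a", "b"})
d_holds = backward_chain(RULES, {"a", "b"}, "d")
f_holds = backward_chain(RULES, {"a", "b"}, "f")
```

Forward chaining derives everything that follows from the facts, whereas backward
chaining examines only the rules relevant to the stated goal.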

A hybrid technique that mixes forward and backward chaining is also found. Here the
inference engine performs forward chaining first and backward chaining second.
Backward chaining is used to confirm a diagnosis or hypothesis that was reached
through forward chaining.

Another benefit of hybrid techniques in a medical diagnostic system is that
backward chaining benefits from the rule ordering provided by forward chaining and
then pursues the symptoms to isolate and confirm the disease.
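
A minimal sketch of this two-stage scheme follows, under simplifying assumptions:
the forward step is a single pass over screening rules, and the backward step
confirms each candidate by asking for the symptoms it still requires. The rule
tables, symptom names (s1 to s4) and function names are all hypothetical.

```python
# Forward (screening) rules: observed symptoms -> candidate hypothesis.
SCREENING_RULES = [
    ({"s1"}, "candidate_D1"),
    ({"s1", "s2"}, "candidate_D2"),
]

# Backward (confirmation) rules: hypothesis -> symptoms that must all be present.
CONFIRMATION_RULES = {
    "candidate_D1": {"s1", "s3"},
    "candidate_D2": {"s1", "s2", "s4"},
}


def forward_candidates(observed):
    """Forward step: propose hypotheses, in rule order, from observed facts."""
    return [h for conditions, h in SCREENING_RULES if conditions <= observed]


def confirm(hypothesis, observed, ask):
    """Backward step: start from the hypothesis and pursue its symptoms."""
    for symptom in sorted(CONFIRMATION_RULES[hypothesis]):
        if symptom not in observed and not ask(symptom):
            return False
    return True


def hybrid_diagnose(observed, ask):
    return [h for h in forward_candidates(observed) if confirm(h, observed, ask)]


# Simulated answers to follow-up questions about unobserved symptoms.
answers = {"s3": False, "s4": True}
confirmed = hybrid_diagnose({"s1", "s2"}, lambda s: answers.get(s, False))
```

Both hypotheses are proposed by the forward step, but only the second survives the
backward confirmation, since the follow-up question about s3 is answered negatively.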

2.3.2.1 Backward-Chaining Inference Mechanism

PROSPECTOR is one of the first KBS that used a backward-chaining inference
mechanism. Its knowledge domain is the evaluation of geological sites to determine
the possibility of finding rare minerals [20]. PROSPECTOR gained its reputation
after it suggested a search location in a geological area where a molybdenum
deposit was then discovered.

In the area of medicine, a medical KBS for neonates, VIE-PNN (Vienna Expert System
for Parenteral Nutrition of Neonates), was presented [21]. The system uses both
frame representation and a backward-chaining inference mechanism to quickly
calculate nutrient requirements for neonates. It has the advantage of reducing
calculation errors, since the physician has to enter dosages manually only in
extreme conditions. However, a physician who wants to stop the calculations and
check the patient cannot do so, because the program is not interactive; this can be
a significant disadvantage of the system.

A Web-based Weather Expert System (WES) [22] was proposed to support the important
“go/no go” shuttle decisions in the NASA intelligent launch and range
operations program. WES considers many factors that affect the launch decision.
It accepts a goal and works through backward-chaining inference, firing only the
rules needed to prove the goal, until the goal is reached. The authors nevertheless
concluded that it is better to combine forward and backward chaining for more
accurate and efficient performance. Another paper [23] presented an approach for
describing the complexity of proving hypotheses with a backward-chaining inference
mechanism. Its author suggested that the efficiency of the inference mechanism
would increase if matching techniques were used instead of unification of premises
and conclusions, and that it would increase even more if the system were able to
access external data and accept user input during inference.

2.3.2.2 Forward-Chaining Inference Mechanism

Research on distributed rule-based Applicative State Transition systems
(AST-systems) [24] was presented with a focus on parallel programming. The authors
integrate forward chaining through the use of the Official Production System (OPS),
considered the most successful inference mechanism for expert systems. According to
the authors, this is advantageous because rules fire in a certain order and because
it provides flexibility.

Another study, on a rule-based fuzzy controller for a nuclear power plant [25],
demonstrates the importance of the forward-chaining inference mechanism for
controllers in critical areas such as nuclear power plants, where fast responses are
important.

Another paper [26] presents a fuzzy expert system built using FuzzyCLIPS
and certainty factors. FuzzyCLIPS is a fuzzy extension of the well-known CLIPS. The
system proposed in that research can identify the causes of typical vibration
problems in rotating machinery. FuzzyCLIPS uses forward chaining because the facts
are available initially.

Another paper [27] proposes a model that uses the forward-chaining mechanism. The
authors describe the advantages of forward chaining, namely keeping a description of
the current state of the system and the power of ordered rule search. They also
point to a single weakness: forward chaining has no clear view of the goal.
Finally, they present two algorithms that give more information and guidance towards
the goal and also remove irrelevant information.

Another paper, on traffic monitoring, an application that also needs quick
responses, supports the use of forward chaining because of its reliability [28].

ABIS, a language for intelligent systems [29], was developed on the basis of
relational data models and forward chaining. ABIS has passed all the stages of
testing and gives the right direction for future development.

2.3.2.3 Hybrid Inference Mechanism

An intelligent sensor for monitoring power-system performance, stability, and
failure diagnosis was presented [30]. The authors successfully incorporate both
forward and backward chaining: forward chaining is used for monitoring the condition
of the power system and backward chaining for isolating the failure.

DESPLATE is an expert system designed for diagnosing abnormal shapes in the plate
mill [31]. This expert system also successfully incorporates both forward and
backward chaining: forward chaining is used to narrow the search for faults and
backward chaining to prove a certain fault; in cases where backward chaining is not
applicable, forward chaining is continued. This shows that DESPLATE does not depend
on a single inference mechanism.

A framework for developing prototypes of Data Flow Diagrams was proposed, in which
conventional backward chaining was adopted to represent and execute a prototype of
a Data Flow Diagram [32]. The authors faced a few problems, which they solved by
changing their approach to a combined forward- and backward-chaining mechanism.

An expert system was proposed to analyze the data gathered from modern digital
protective relays and report the data which relate to fault disturbances and the
expected protection operation [33]. The inference mechanism uses forward chaining
first for predicting the expected protection operation and backward chaining second to
validate and diagnose the actual protection operation. The authors focus on the
strength provided by combining both techniques.

2.3.3 Uncertainty in KBS and Reasoning Process

KBS are required to deal with uncertainty in the knowledge base and in the inference
mechanism. The reasoning mechanism is the part of a KBS which deals with the
reasoning process and uncertainty. Reasoning answers the question of why a particular
action was selected or completed based on the facts provided by the user, i.e. how
the system reached its decision. Uncertainty is the weight or accuracy of the deduced
result or action.

In KBS such as medical diagnostic systems, which deal with probabilities and, as in
real life, are never completely certain of each diagnosis, the diagnosis is the most
probable and accurate deduction on which to base the decision.

Uncertainty is itself a very uncertain area and prone to the subjectivity of the
individual or group of individuals who assign it [34]. To assign a specific certainty
to an event or set of facts, a person must be fully confident about all the facts that
can affect that event. There must also be no bias towards the characteristics of the
problem, which is itself a very important property to measure. Acquiring knowledge
from experts or reference material is a comparatively simple task compared with
assigning a certainty value to a particular piece of knowledge.

KBS are classified according to how they deal with uncertainty. Various methods that
have been used to deal with uncertain or incomplete information in the knowledge
base have been discussed [35]. Manipulating uncertain and imprecise knowledge
requires appropriate models of inference [36, 37]. The following sections describe
the most popular and widely used reasoning techniques.

2.3.3.1 Certainty Factor

A certainty factor (CF) is a numerical measure between -1 and +1, defined in terms
of a measure of belief and a measure of disbelief [38, 39]. A negative CF indicates
that the hypothesis is not confirmed by the evidence, and a positive CF indicates
that it is confirmed by the evidence. A CF of zero indicates that the evidence does
not influence the belief. The higher the value of the CF, the higher the certainty of
the rule.

Nowadays some researchers prefer to focus on newer methods for dealing with this
issue rather than on CF. Other researchers still use CF, classifying it as a simple
and early method. In his paper [38], Lucas presents a certainty-factor interpretation
of Bayesian belief networks. He concludes that the purpose of the paper is not to
renew research into such an old way of resolving uncertainty, but to learn from early
models.

The CF model was first introduced in the rule-based expert system MYCIN by
Shortliffe and Buchanan [3, 40] for the representation and manipulation of uncertain
knowledge. It was found that probabilities would be of limited use in a growing
medical knowledge base [41]. In medical diagnosis the line and order of reasoning,
as well as explanation, are essential; thus probabilities would fail to perform as an
uncertainty-management technique for MYCIN.

But in 1976, Adams argued that the assumptions made with certainty factors may not
always be valid, and therefore neither may the hypotheses [42]. In 1986 Heckerman,
who worked to reform the CF formula, supported that view [43]. In 1992, a research
paper [44] claimed that both Adams and Heckerman were incorrect in their
assumptions, because these resulted from an incorrect interpretation of the CF and
had nothing to do with elicitation and the use of the numerical factor. Finally, it
also concluded that CF should not be interpreted as a measure of belief updating.

CF is still a useful method in different domains: it has been used to identify
database distortions [45], as well as for fault diagnosis in the zinc leaching
process [46].

Finally, an important advantage of the CF model is that it propagates and combines
the effects of multiple evidence sources in terms of joint belief or disbelief in
each hypothesis.
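
A minimal sketch of how two certainty factors bearing on the same hypothesis can be combined is given below. It uses the combination rule commonly cited for MYCIN/EMYCIN-style systems; this rule and the function names are assumptions of the example and are not quoted from the thesis.

```python
def certainty_factor(mb, md):
    """CF as the difference between a measure of belief and a measure of disbelief."""
    return mb - md


def combine_cf(cf1, cf2):
    """Combine two CFs for the same hypothesis (commonly cited EMYCIN-style rule)."""
    if cf1 >= 0 and cf2 >= 0:
        return cf1 + cf2 * (1 - cf1)      # two confirming pieces of evidence
    if cf1 < 0 and cf2 < 0:
        return cf1 + cf2 * (1 + cf1)      # two disconfirming pieces of evidence
    # Mixed evidence; undefined when one CF is +1 and the other is -1.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))


both_confirm = combine_cf(0.6, 0.5)
mixed = combine_cf(0.6, -0.4)
```

Two confirming sources reinforce each other without the result ever exceeding +1, while conflicting sources partially cancel.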

2.3.3.2 Fuzzy Logic

Classical set theory is governed by a two-valued logic, while fuzzy logic can be
regarded as a many-valued logic. Lotfi Zadeh first introduced fuzzy logic in 1965
[47]. Two-valued logics, such as propositional logic and predicate logic, cannot deal
with the exact meaning of human belief and uncertain knowledge; in such situations
fuzzy logic works well. Sometimes knowledge is expressed with its exact meaning
suppressed (a disposition) [48, 50]. Using fuzzy quantifiers, it is possible to
represent the exact meaning of that knowledge in fuzzy logic (disposition
restoration) [48, 50].

The word “fuzzy” is used when dealing with ambiguous terms such as near and far, or
high and low.

Important terms to understand before explaining fuzzy logic are “linguistic
variables”, “terms”, and “universe of discourse”. For example, “Blood Cholesterol” is
a linguistic variable. Its terms are “high”, “low”, “very high” and “very low”; each
term is defined over a range of numerical values called the universe of discourse.
Each term is associated with a fuzzy membership function that assigns a degree
between 0 and 1 to each value of the variable.

Fuzzy logic gives the freedom to think beyond true or false. Elastic Fuzzy Logic,
which is based on certain antecedents, has also been introduced [51]. It also handles
missing antecedents through multiple passes over the rules.

In medical diagnosis, the doctor deals with words supplied by the user such as
somewhat, little, very, normal, mild, high, low and so on; these are considered
fuzzy.

Fuzzy logic is very useful for handling uncertainty [49] and has been used in many
popular ES and studies. Sphinx [52, 53] is a medical fuzzy expert system in the field
of internal medicine which proved very efficient.

A fair comparison has been made between a fuzzy logic model and a crisp model in a
medical domain, for fetal heart rate and ECG analysis [54]. The initial results
showed that the fuzzy logic approach gives enhanced performance.

An H2 leak detection system was developed as part of DIGEST (Diagnostic Expert
System for Turbomachinery) [55], an analytics and diagnostics system for steam
turbine generators in power plants. The system was also tested with a crisp model,
and the crisp model failed to provide the speed and accuracy of the fuzzy model.

A fuzzy expert system for monitoring and controlling iron ore processing in the mill
and spiral circuits at Wabush Mines has been developed [56]. The developed system
increased the productivity of the mill line and gave a high return on investment.

In 1994, Zadeh stated that fuzzy logic has succeeded where traditional approaches
have failed [57]. The main reason is that fuzzy logic comes closest to modelling
real-world problems and takes the human mind as its role model.

2.3.3.3 Belief Networks

Belief networks are also known as Bayesian networks. The belief network is a causal
reasoning tool that has been used in a wide variety of applications, and is now a
mainstay of reasoning under uncertainty [58, 59, 60]. Belief networks are based on
the laws of probability, and in particular on conditional and Bayesian probability
theory.

A belief network is a directed acyclic graph which encodes the causal relationships
between variables, represented in the graph as nodes. Nodes are connected by causal
links, represented in the graph as arrows, which point from parent nodes (causes) to
child nodes (effects). Each node is assigned a probability value. Nodes with one or
more parents require a conditional probability distribution, which is provided by a
conditional probability table (CPT). The CPT depends on the connected nodes;
therefore, each probability is calculated from the probabilities of those connected
nodes.
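
The role of the CPT can be illustrated with the smallest possible network, one parent node (a disease) and one child node (a symptom). All probabilities below are made-up illustrative numbers, and the function name is hypothetical.

```python
# Hypothetical two-node network: Disease -> Symptom.
p_disease = 0.01                          # prior of the parent node
cpt_symptom = {True: 0.90, False: 0.05}   # P(symptom present | disease = True/False)


def posterior_given_symptom(prior, cpt):
    """P(disease | symptom observed), obtained with Bayes' rule from the CPT."""
    joint_disease = prior * cpt[True]
    joint_no_disease = (1 - prior) * cpt[False]
    return joint_disease / (joint_disease + joint_no_disease)


posterior = posterior_given_symptom(p_disease, cpt_symptom)
```

Even with a sensitive symptom, the posterior stays modest here because the prior is small, which is the kind of dependency a belief network makes explicit.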

Moreover, in 2004 de Cooman and Zaffalon addressed the issue of updating beliefs
with incomplete observations [61]. They introduce an algorithm that takes a more
logical approach to updating than simply updating naively. They derived a more
coherent rule for updating in realistic expert systems. Their algorithm can be
implemented easily, without changes to the knowledge base of the expert system.

Belief networks have typically been applied to problems when there is uncertainty in
the data or in the knowledge about the domain, and when reasoning with uncertainty
is important. Belief networks have been applied particularly to problems which
require diagnosis of problems from a variety of input data.

Applying such a network in a medical field where elements are not dependent will
only strain the system, and would not help much with the uncertainty. For this
reason, most researchers have not used this technique to develop expert systems;
only a few applications are found, as part of uncomplicated problems, because the
larger the knowledge base, the harder it is to implement [62].

2.3.3.4 Case-Based Reasoning

Case-based reasoning (CBR) is defined as using old experiences to understand and
solve new problems. There have been many models of CBR that attempt to provide a
better understanding of CBR [63, 64, 65, 66]. Researchers have proposed mainly two
types of model, namely R4 and R5 [67].

R4 consists of four important steps:

1. Retrieve the most similar cases
2. Reuse the cases to attempt to solve the problem
3. Revise the proposed solution
4. Retain the new solution as a part of a new case

R5, in contrast, consists of one more step, which deals with knowledge acquisition:

1. Repartition the case to acquire knowledge
2. Retrieve the most similar cases
3. Reuse the cases to attempt to solve the problem
4. Revise the proposed solution
5. Retain the new solution as a part of a new case
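
The four steps of the R4 cycle can be sketched as follows. The similarity measure, the case structure and the function names are hypothetical choices made for this illustration; real CBR systems use domain-specific similarity and revision steps.

```python
def similarity(a, b):
    """Fraction of features on which two cases agree (a deliberately naive measure)."""
    keys = set(a) | set(b)
    return sum(a.get(k) == b.get(k) for k in keys) / len(keys)


def solve(problem, case_library, revise=lambda problem, solution: solution):
    # 1. Retrieve the most similar stored case.
    best = max(case_library, key=lambda case: similarity(problem, case["features"]))
    # 2. Reuse its solution as the proposed solution.
    proposed = best["solution"]
    # 3. Revise the proposal (identity by default; a real system would verify it).
    final = revise(problem, proposed)
    # 4. Retain the solved problem as a new case.
    case_library.append({"features": dict(problem), "solution": final})
    return final


library = [
    {"features": {"fever": True, "rash": False}, "solution": "diagnosis_A"},
    {"features": {"fever": False, "rash": True}, "solution": "diagnosis_B"},
]
answer = solve({"fever": True, "rash": False}, library)
```

Because every solved problem is retained, the library grows with each call, which is the case-library growth problem discussed for CBR.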

CBR is used as a reasoning technique in ChartD2, a heart disease diagnostician
system [68]. It attempted to solve problems using CBR techniques such as case
matching, indexing, and learning. It distinguishes between the knowledge represented
as previous cases and the knowledge represented as ‘Category Descriptors’.
‘Category Descriptors’ are indexes of cases categorized in a certain way.

The main problem of CBR is handling the growing case library. One suggested solution
is to store only the most important cases, but in some domains, such as the medical
domain, it is very hard to decide which cases are important.

Uncertainty in CBR can arise for three main reasons. The first is missing
information. The second occurs for different problems, and the third arises since
perfect prediction is impossible.

The uncertainty in CBR is calculated from the occurrence of cases, and the figures
are updated on information retrieval and on correct diagnosis.

A framework has been proposed for case features, which are considered the
bottleneck of CBR [69], in the hope that it will increase the usability, flexibility,
and expandability of CBR.

CBR is used in KBS in very few cases in comparison with other techniques. However,
it seems that research is slowly moving in the direction of considering it as an
uncertainty technique.

2.3.3.5 Dempster-Shafer Theory

The Dempster-Shafer theory (DST), also known as the theory of belief functions, is a
generalization of the Bayesian theory of subjective probability. It was first
introduced by A. P. Dempster in 1968 [70], and extended by Glenn Shafer in 1976
[71]. This theory was motivated by two difficulties faced in Bayesian probability
theory: the representation of ignorance, and the requirement that the subjective
beliefs assigned to an event and its negation must sum to one.

Dempster's rule begins with the assumption that the questions for which
probabilities are available are independent with respect to subjective probability
judgments, but this independence is only a priori; it disappears when conflict is
detected between the different items of evidence [72]. DST is known for the
computational complexity of combining beliefs.
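
Dempster's rule of combination can be sketched as below for two mass functions over a small frame of discernment. The frame, the mass values and the function name are illustrative assumptions; focal elements are represented as frozensets.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb            # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Renormalise by the non-conflicting mass.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


FLU, COLD = frozenset({"flu"}), frozenset({"cold"})
EITHER = FLU | COLD                        # mass on the whole frame models ignorance
m1 = {FLU: 0.6, EITHER: 0.4}
m2 = {COLD: 0.5, EITHER: 0.5}
m12 = dempster_combine(m1, m2)
```

The double loop over focal elements shows where the computational cost comes from: in general the number of subsets grows exponentially with the size of the frame.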

The DST provides a unifying framework for representing uncertainty because it
includes in the same formulation the cases of risk and ignorance. DST has been used
in many different applications. It was used to handle uncertain decision making
[73]. It was also studied to find its implications for the rule of combination [74].
DST was also applied to system safety and reliability modelling [75].

DST was also applied to develop an expert system with uncertain rules based on
spatial remote sensing data [76]. DST is also used in valuation-based systems, which
serve as a framework for managing uncertainty in expert systems [77].

2.4 Fuzzy Knowledge-Based Intelligent Systems and related literature

Information in the world is surrounded by uncertainty and imprecision. The human
reasoning process can handle such inexact, uncertain, and vague concepts in a proper
manner. Usually it is not possible to express human thinking, reasoning, and
perception precisely. Statistical measures or probability theory can also only
rarely express or measure this type of experience. Fuzzy logic provides a framework
to model uncertainty, the human way of thinking, reasoning, and the perception
process. Fuzzy systems were first introduced by Zadeh in 1965 [47].

Nowadays, one of the most important areas of application of fuzzy logic is Fuzzy
Knowledge-Based Intelligent Systems (FKBIS). These kinds of systems extend the
rule-based systems. Usually these kinds of systems deal with "IF-THEN" rules whose
antecedents and consequents are composed of fuzzy logic statements.

In general, an FKBIS is a rule-based intelligent system where fuzzy logic (FL) is
used as a tool for representing different forms of knowledge about a specific
problem. FL is also used in FKBIS for modelling the interactions and relationships
that exist between its variables. Because of this property, FKBIS have been
successfully applied to a wide range of problem domains in which uncertainty and
vagueness emerge in different ways [78].

Additionally, FKBIS have most commonly been applied to fuzzy modelling [79], fuzzy
control [80] and fuzzy classification [81].

FL may be viewed as an extension to classical logic systems, providing an effective
conceptual framework for dealing with the problem of knowledge representation in an
environment of uncertainty and imprecision. In FL, the underlying modes of reasoning
are approximate rather than exact. FL is important because most modes of human
reasoning, and particularly commonsense reasoning, are approximate in nature. FL is
most concerned with imprecision and approximate reasoning [82]. The area of fuzzy
set theory originated from the pioneering contributions of Zadeh [47, 82], who
established the foundations of FL systems.

2.4.1 Architecture of FKBIS

The application of FL to rule-based systems leads to the development of FKBIS, or
fuzzy expert systems. These systems use "IF-THEN" rules whose antecedents and
consequents are composed of fuzzy statements, defined as fuzzy rules, and possess
the following advantages over classical rule-based systems [78]:

• The key features of fuzzy system knowledge acquisition involve the handling of
uncertainty, and

• Inference methods are more robust and flexible with the help of FL approximate
reasoning methods.

The general architecture of an FKBIS, given below, shows the flow of data through
the system. Generally an FKBIS has two main parts: the knowledge base and the
inference engine. The inference engine of an FKBIS involves the following three
components:

• Fuzzification module: the fuzzification module transforms the crisp input data
into fuzzy values that serve as the input of the fuzzy reasoning process.

• Inference system: the inference system infers from the fuzzy input several
resulting output fuzzy values, depending on the information stored in the KB.

• Defuzzification module: the defuzzification module converts the fuzzy values
obtained from the inference process into crisp values that constitute the global
output of the FKBIS.

[Block diagram with blocks labelled Inputs, Fuzzification, Inference System,
Defuzzification, Outputs and Knowledge Base]

Figure 2.2 Architecture of a fuzzy knowledge-based intelligent system

2.4.2 Knowledge Base

The establishment of the KB is the fundamental part of an FKBIS. It serves as the
repository of the domain knowledge about a specific problem. It models the
relationship between the input and output of the underlying system, by which the
inference process converts an observed input into an associated output.

Generally the KB contains the linguistic rules representing the expert knowledge
about the problem domain. But sometimes the KB contains two different information
levels, namely the fuzzy rule semantics and the linguistic rules representing the
expert knowledge. This conceptual distinction is reflected in the KB being
constituted by two separate entities:

• Database (DB): The database contains the linguistic term sets considered in the
linguistic rules and the membership functions that define the semantics of the
linguistic labels. Each linguistic variable in the problem has an associated fuzzy
partition of its domain, representing the fuzzy set associated with each of its
linguistic terms.

The DB also includes the scaling factors or scaling functions that are used for
transformation between the universe of discourse in which the fuzzy sets are
defined and the domain of the system input and output variables.

• Rule base (RB): The rule base comprises a collection of linguistic “if-then”
rules. Multiple rules can fire simultaneously for the same input.

It is important to note that the RB can take several structures. The typical one is
a list of rules, although a decision table (sometimes called a rule matrix) is an
equivalent and more compact representation of the same set of linguistic rules.

2.4.3 Fuzzification

Crisp input data are fuzzified using fuzzy set theory [47, 83]. Fuzzification is the
process of formulating the mapping from a given input to an output using fuzzy set
theory. This mapping provides a basis from which patterns can be discerned.
Generally, the fuzzification process involves three basic concepts: linguistic
variables, fuzzy set operations and membership functions.

2.4.3.1 Linguistic Variables

Whereas an algebraic variable takes numbers as values, a linguistic variable takes
words or sentences as values [84]. The name of such a linguistic variable is its
label. The set of values that it can take is called its term set. Each value in the
term set is a linguistic value or term defined over the universe. In summary, a
linguistic variable takes a linguistic value, which is a fuzzy set defined on the
universe. For example:

Let x be a linguistic variable labeled ’Age’. Its term set T could be defined as

T (age) = {young, very young, not very young, old, more or less old}

Each term is defined on the universe, for example the integers from 0 to 100 years.

Assume a discrete universe U = {0, 20, 40, 60, 80} of ages. We can assign
u = [0 20 40 60 80] and

‘young’ = [1 0.6 0.1 0 0]

The discrete membership function for the set ‘very young’ is (young)^2,

‘very young’ = [1 0.36 0.01 0 0]

The set ‘very very young’ is, by repeated application, (young)^4,

‘very very young’ = [1 0.13 0 0 0]

The derived sets inherit the universe of the primary set.
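
The arithmetic of the ‘very’ hedge in this example can be reproduced with a few lines of code. The function name `concentrate` is a label chosen for this sketch; the membership values are those of the example above.

```python
universe = [0, 20, 40, 60, 80]        # ages, as in the example
young = [1.0, 0.6, 0.1, 0.0, 0.0]     # membership degrees of 'young'


def concentrate(mu, power=2):
    """Apply the hedge 'very' by raising each membership degree to a power."""
    return [m ** power for m in mu]


very_young = concentrate(young)               # (young)^2
very_very_young = concentrate(very_young)     # (young)^4

rounded = [round(m, 2) for m in very_very_young]
```

Squaring leaves full members (degree 1) and non-members (degree 0) unchanged and reduces every intermediate degree, which is why the hedge sharpens the set.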

2.4.3.2 Fuzzy set operations

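The standard fuzzy set operations are union, intersection and additive complement.
Utilizing the standard max-min system proposed by Zadeh [47], the fuzzy-set
complement, intersection, and union are defined by

\[
\begin{aligned}
\mu_{\bar{F}}(u) &= 1 - \mu_F(u), \\
\mu_{F \cap G}(u) &= \min\{\mu_F(u), \mu_G(u)\}, \\
\mu_{F \cup G}(u) &= \max\{\mu_F(u), \mu_G(u)\},
\end{aligned}
\tag{2.1}
\]

where

$F, G$: two fuzzy sets defined in a universe $U$

$\mu_{\bar{F}}$: the membership function of the complement of a fuzzy set $F$

$\mu_{F \cap G}, \mu_{F \cup G}$: the membership functions of the intersection and
union of $F$ and $G$, respectively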
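
On a discrete universe the standard max-min operations reduce to element-wise computations, as the following sketch shows. The set ‘old’ and its membership values are invented for the illustration.

```python
def complement(f):
    """Standard fuzzy complement: 1 - membership."""
    return [1 - m for m in f]


def intersection(f, g):
    """Standard fuzzy intersection: element-wise minimum."""
    return [min(a, b) for a, b in zip(f, g)]


def union(f, g):
    """Standard fuzzy union: element-wise maximum."""
    return [max(a, b) for a, b in zip(f, g)]


young = [1.0, 0.6, 0.1, 0.0, 0.0]
old = [0.0, 0.0, 0.1, 0.6, 1.0]   # hypothetical membership degrees

young_and_old = intersection(young, old)
young_or_old = union(young, old)
not_young = complement(young)
```

Unlike in classical set theory, the intersection of a fuzzy set with its own complement need not be empty, because intermediate degrees survive the minimum.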

2.4.3.3 Membership functions

Each linguistic pattern has a membership degree in the range [0, 1]. The type of
membership function used depends on the base set of patterns. If the base set
contains many values, or if it is continuous, then a parametric representation,
which can be adapted by changing the parameters, is appropriate. Most membership
functions of this type are triangular or trapezoidal functions, which are defined by
three and four parameters respectively.

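• The triangular function is formally described as

\[
\mu(x; a, b, c) =
\begin{cases}
0, & x \le a \text{ or } x \ge c, \\
\dfrac{x - a}{b - a}, & a < x \le b, \\
\dfrac{c - x}{c - b}, & b < x < c
\end{cases}
\tag{2.2}
\]

• The trapezoidal function is formally described as

\[
\mu(x; a, b, c, d) =
\begin{cases}
0, & x \le a \text{ or } x \ge d, \\
\dfrac{x - a}{b - a}, & a < x < b, \\
1, & b \le x \le c, \\
\dfrac{d - x}{d - c}, & c < x < d
\end{cases}
\tag{2.3}
\]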
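
A minimal sketch of the two piecewise-linear membership functions follows. The parameter names a, b, c, d are the conventional ones and are assumed here, with a < b < c (< d); the sample values are illustrative.

```python
def triangular(x, a, b, c):
    """Triangular membership: rises from a to the peak at b, falls to zero at c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


peak = triangular(5, 0, 5, 10)
plateau = trapezoidal(3, 0, 2, 6, 8)
```

Both functions are continuous but not differentiable at their corner points, which is the limitation that motivates the smooth alternatives.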

For some applications, continuously differentiable curves, and therefore smooth
transitions, are required for modelling, which is not possible with triangular or
trapezoidal functions. Here three of these functions are mentioned:

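• The normalized Gaussian function [85]

\[
\mu(x; \sigma_1, c_1, \sigma_2, c_2) =
\begin{cases}
\exp\left(-\dfrac{(x - c_1)^2}{2\sigma_1^2}\right), & x \le c_1, \\
1, & c_1 < x < c_2, \\
\exp\left(-\dfrac{(x - c_2)^2}{2\sigma_2^2}\right), & x \ge c_2
\end{cases}
\tag{2.4}
\]

where $c_1$ and $\sigma_1$, and $c_2$ and $\sigma_2$, are the mathematical
expectation and the standard deviation of the first and the second Gaussian
distributions

• The difference of two sigmoidal functions [85]

\[
\mu(x; a_1, c_1, a_2, c_2) = \frac{1}{1 + e^{-a_1 (x - c_1)}} - \frac{1}{1 + e^{-a_2 (x - c_2)}}
\tag{2.5}
\]

• The generalized bell function [85]

\[
\mu(x; a, b, c) = \frac{1}{1 + \left| \dfrac{x - c}{a} \right|^{2b}}
\tag{2.6}
\]

• In some applications the π function is also used [85]

\[
\pi(x; c_j, \lambda_j) =
\begin{cases}
2\left(1 - \dfrac{|x - c_j|}{\lambda_j}\right)^2, & \dfrac{\lambda_j}{2} \le |x - c_j| \le \lambda_j, \\
1 - 2\left(\dfrac{|x - c_j|}{\lambda_j}\right)^2, & 0 \le |x - c_j| \le \dfrac{\lambda_j}{2}, \\
0, & \text{otherwise}
\end{cases}
\tag{2.7}
\]

where j is the jth linguistic-π set.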
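
The smooth membership functions can be sketched in a few lines. The parameterisation below (centres c, widths sigma and a, slope b) follows the forms commonly used for these functions and is an assumption of this example, as are the function names.

```python
import math


def gauss2(x, sigma1, c1, sigma2, c2):
    """Two-sided Gaussian: Gaussian shoulders around a flat top on [c1, c2], c1 <= c2."""
    if x <= c1:
        return math.exp(-((x - c1) ** 2) / (2 * sigma1 ** 2))
    if x >= c2:
        return math.exp(-((x - c2) ** 2) / (2 * sigma2 ** 2))
    return 1.0


def dsigmoid(x, a1, c1, a2, c2):
    """Difference of two sigmoidal functions."""
    return 1 / (1 + math.exp(-a1 * (x - c1))) - 1 / (1 + math.exp(-a2 * (x - c2)))


def gbell(x, a, b, c):
    """Generalized bell function centred at c with width a and slope parameter b."""
    return 1 / (1 + abs((x - c) / a) ** (2 * b))


top = gauss2(5.0, 1.0, 4.0, 1.0, 6.0)
half = gbell(7.0, 2.0, 3, 5.0)     # evaluated one width a away from the centre
mid = dsigmoid(5.0, 2.0, 3.0, 2.0, 7.0)
```

All three are differentiable everywhere away from the flat-top boundaries, which makes them convenient when parameters are tuned by gradient-based learning.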

2.4.4 Inference system

The inference system infers from the fuzzy input several resulting output fuzzy
values, depending on the information stored in the knowledge base. There are two
popular inference methods: the Mamdani inference method [86] and the
Takagi-Sugeno-Kang (TSK) method [87, 88]. The main difference between these two
methods is that the result of Mamdani inference is one or more fuzzy sets which must
(almost always) then be defuzzified into one or more real numbers, whereas the
result of TSK inference is one or more real functions which may be evaluated
directly.

2.4.4.1 Mamdani inference method

In Mamdani inference [86], rules are of the following form:

\[
R_i:\ \text{IF } x_1 \text{ is } A_{i1} \text{ AND } \dots \text{ AND } x_n \text{ is } A_{in} \text{ THEN } y \text{ is } C_i,
\qquad i = 1, 2, \dots, m
\]

where m is the number of rules, $x_j$ (j = 1, 2, . . . , n) are the n input
variables, y is the output variable, and $A_{ij}$ and $C_i$ are the linguistic terms
corresponding to the fuzzy sets that are characterized by the membership functions
$\mu_{A_{ij}}(x_j)$ and $\mu_{C_i}(y)$ respectively. The most important thing here is
that the consequence part of each rule is characterized by a fuzzy set $C_i$. Thus
the final output of the Mamdani inference method is an arbitrarily complex fuzzy set
which usually needs to be defuzzified.
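
A minimal sketch of Mamdani inference on a discrete output universe is given below. It assumes the usual choices of minimum for AND and for implication (clipping) and maximum for aggregation; the rules, the variable name `temp` and all numbers are hypothetical.

```python
def tri(x, a, b, c):
    """Triangular membership function with a < b < c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)


def mamdani(inputs, rules, y_universe):
    """Return the aggregated output fuzzy set as a list of degrees over y_universe."""
    aggregated = [0.0] * len(y_universe)
    for antecedents, consequent_mf in rules:
        # Firing strength: minimum over the antecedent membership degrees.
        strength = min(mf(inputs[name]) for name, mf in antecedents)
        for k, y in enumerate(y_universe):
            # Clip the consequent set at the firing strength, aggregate with max.
            aggregated[k] = max(aggregated[k], min(strength, consequent_mf(y)))
    return aggregated


rules = [
    ([("temp", lambda t: tri(t, 36, 37, 38))], lambda y: tri(y, 0, 2, 4)),    # normal -> low
    ([("temp", lambda t: tri(t, 37, 39, 41))], lambda y: tri(y, 2, 6, 10)),   # high -> large
]
output_set = mamdani({"temp": 37.5}, rules, [0, 2, 4, 6, 8, 10])
```

The result is still a fuzzy set over the output universe, so a defuzzification step is needed to obtain a single crisp value.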

2.4.4.2 Takagi-Sugeno-Kang Inference method

In Takagi-Sugeno-Kang (TSK) inference [87, 88], rules are of the following form:

\[
R_i:\ \text{IF } x_1 \text{ is } A_{i1} \text{ AND } \dots \text{ AND } x_n \text{ is } A_{in}
\text{ THEN } y_i = p_{i0} + p_{i1} x_1 + \dots + p_{in} x_n,
\qquad i = 1, 2, \dots, m
\]

where m is the number of rules, $x_j$ (j = 1, 2, . . . , n) are the n input
variables, $y_i$ is the output function of the input variables, and $p_{i0}$ and
$p_{ij}$ (j = 1, 2, . . . , n) are real-valued linear parameters. The most important
thing here is that the consequence part of each rule is now described by a linear
function of the original input variables. The final output of the inference process
is calculated by:

\[
y = \frac{\sum_{i=1}^{m} h_i \, y_i}{\sum_{i=1}^{m} h_i}
\tag{2.8}
\]

where $h_i = T(\mu_{A_{i1}}(x_1), \dots, \mu_{A_{in}}(x_n))$ is the matching degree
between the antecedent part of the ith rule and the current inputs to the system
$x_0 = (x_1, \dots, x_n)$. T stands for a conjunctive operator modelled by a t-norm.

Therefore, to design the inference system of a TSK FKBIS, the designer only has to
select this conjunctive operator T, the most common choices being the minimum and
the algebraic product.
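
The weighted-average computation of a TSK system can be sketched as follows. The rule representation (a list of antecedent membership functions plus a coefficient tuple) and the example rules are assumptions of this illustration; the t-norm defaults to the minimum.

```python
def tsk_output(x, rules, t_norm=min):
    """Crisp TSK output: average of linear rule outputs weighted by matching degree."""
    numerator = denominator = 0.0
    for antecedent_mfs, coeffs in rules:
        degrees = [mf(xj) for mf, xj in zip(antecedent_mfs, x)]
        # Matching degree: fold the t-norm over the antecedent degrees.
        h = degrees[0]
        for d in degrees[1:]:
            h = t_norm(h, d)
        # Linear consequent: p0 + p1*x1 + ... + pn*xn.
        y_i = coeffs[0] + sum(p * xj for p, xj in zip(coeffs[1:], x))
        numerator += h * y_i
        denominator += h
    if denominator == 0:
        raise ValueError("no rule fires for this input")
    return numerator / denominator


rules = [
    ([lambda v: max(0.0, 1 - v / 4)], (1.0, 2.0)),   # "x is small": y = 1 + 2x
    ([lambda v: min(1.0, v / 4)], (0.0, 1.0)),       # "x is large": y = x
]
y = tsk_output((2.0,), rules)
```

Passing `t_norm=lambda a, b: a * b` selects the algebraic product instead of the minimum; no separate defuzzification step is required in either case.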

2.4.5 Defuzzification

In the Mamdani inference method, the defuzzification module aggregates the
information provided by the m output fuzzy sets into an overall fuzzy set A and
determines the crisp output value from it.

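Given a distribution of truth $\mu_A(y)$ for the output variable y, the role of
defuzzification is to select one crisp value, $\hat{y}$, for it. There are several
possibilities, the most common of which are [85]:

• Center of Gravity (CoG)

\[
\hat{y} = \frac{\int y \, \mu_A(y) \, dy}{\int \mu_A(y) \, dy}
\tag{2.9}
\]

• Middle of Maxima (MoM)

\[
\hat{y} = \frac{\min\{ z \mid \mu_A(z) = \max_y \mu_A(y) \}}{2}
        + \frac{\max\{ z \mid \mu_A(z) = \max_y \mu_A(y) \}}{2}
\tag{2.10}
\]

• Smallest of Maxima (SoM)

\[
\hat{y} = \min\{ z \mid \mu_A(z) = \max_y \mu_A(y) \}
\tag{2.11}
\]

• Largest of Maxima (LoM)

\[
\hat{y} = \max\{ z \mid \mu_A(z) = \max_y \mu_A(y) \}
\tag{2.12}
\]

• Bisection or Center of Area (CoA), denoted by $y'$

\[
\hat{y} = y' : \int_{\min y}^{y'} \mu_A(y) \, dy = \int_{y'}^{\max y} \mu_A(y) \, dy
\tag{2.13}
\]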
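
On a sampled output universe the integrals become sums, and the first four defuzzification methods can be sketched as below. The sampled fuzzy set is invented for the example, and the discrete sum is only an approximation of the continuous definitions.

```python
def defuzzify(ys, mu):
    """Return CoG, SoM, LoM and MoM for a fuzzy set sampled at the points ys."""
    total = sum(mu)
    if total == 0:
        raise ValueError("empty fuzzy set: nothing to defuzzify")
    peak = max(mu)
    maxima = [y for y, m in zip(ys, mu) if m == peak]
    som, lom = min(maxima), max(maxima)
    return {
        "CoG": sum(y * m for y, m in zip(ys, mu)) / total,  # discrete centre of gravity
        "SoM": som,
        "LoM": lom,
        "MoM": (som + lom) / 2,
    }


ys = [0, 1, 2, 3, 4, 5]
mu = [0.0, 0.5, 1.0, 1.0, 0.5, 0.0]   # hypothetical aggregated output set
crisp = defuzzify(ys, mu)
```

For a symmetric set such as this one, CoG and MoM coincide; they differ when the aggregated set is skewed, since CoG takes the whole shape into account while the maxima methods only look at the highest plateau.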

2.4.6 Fuzzy Rule Determination

The domain knowledge of a specific problem plays an important role in deriving the
fuzzy rule set. Careful attention to rule-set generation may significantly improve
the performance of the FKBIS. In usual applications of FKBIS such as modelling,
control and classification, two types of data, numerical and linguistic, are
available to the FKBIS designer. The former is generally obtained from observing
the system, while the latter is obtained from a domain expert. Thus there are two
main ways to derive the fuzzy rule set for an FKBIS [89].

Deriving rules from an expert: In this method, the KB of an FKBIS is built from the
domain knowledge of a human expert. The human expert specifies the linguistic terms
associated with each linguistic variable, the structure of the rules in the KB and
the meaning of each term. This is the simplest method when the human expert is able
to express his or her knowledge in the form of linguistic rules.

Deriving rules using automated learning methods: It is often very difficult to
derive a rule set from human experts, for various reasons. For this reason
researchers have developed numerous automated learning methods over the last few
years for the different types of FKBIS. Some of those are:

• Neural networks [90, 91, 92, 93]
• Clustering techniques [94, 95]
• Evolutionary algorithms [96, 97]
• Neuro-Fuzzy approaches [98, 99]
• Rough-Fuzzy approach [100, 101, 102, 103, 104]

2.5 Conclusion

The development of KBIS and ES in the medical domain has increasingly recognized
the importance of the explanation capabilities of the system. One survey of
potential users of medical diagnostic systems suggested that an explanation facility
is the most important capability for the acceptance of a clinical decision tool
[105]. Explanation capabilities provide the following four functionalities in a
diagnostic KBIS [106]:

1. At the time of building the system, they provide a facility for investigating the
reasoning process in order to handle errors.
2. They increase the acceptance of KBIS by convincing users of the logical
reasoning process.
3. They may persuade users that unexpected advice is appropriate.
4. They may educate users about some area of domain knowledge.

These functionalities impose several requirements upon the system. For example, the
explanations must adequately represent the reasoning processes, and they should
allow the user to check the steps of the reasoning process or the underlying
knowledge at different levels of detail. In addition, the approach of a KBIS to a
problem need not be identical to the approach of any human expert, but the overall
procedures and reasoning steps must be understandable and logical. It follows that
the system must be able to adapt its explanations to the different requirements and
characteristics of users.

Although many diagnostic KBIS have incorporated explanation capabilities, they still
need a great deal of refinement and accuracy. To develop an efficient and accurate
KBIS, knowledge representation, causal reasoning, user modelling, deep structures,
and the human-computer interface, as well as good explanation capabilities, are
needed.

The interest in designing intelligent systems remains the aim of building more
efficient, accurate and reliable systems that can play the role of a domain expert
when human experts are unavailable, as in many real-life domains, especially the
medical domain in the rural areas of our country.

Bibliography

[1] Giarratano, J. C. and G. D. Riley, 2005, Expert Systems: Principles and
Programming, Fourth Edition. Canada, Thomson Course Technology.

[2] Awad, E. M., 1996, Building Expert Systems: Principles, Procedures and
Applications. St. Paul, West Publishing Company.

[3] E. H. Shortliffe, 1976, Computer-based Medical Consultations: MYCIN,
Elsevier Science Ltd, New York.

[4] Melle, W. V., Shortliffe, E. H., Buchanan, B. G., 1984, EMYCIN: A Knowledge
Engineer’s Tool for Constructing Rule-Based Expert Systems, In: Buchanan, B. G.,
Shortliffe, E. H. (eds), Rule-Based Expert Systems: The MYCIN Experiments of the
Stanford Heuristic Programming Project, Addison Wesley, Reading, MA.

[5] Aikins, J. S., Kunz, J. C., Shortliffe, E. H., Fallat, R. J., 1982, PUFF: an
expert system for interpretation of pulmonary function data, Comput Biomed Res.
1983 Jun;16(3):199-208.

[6] Shortliffe, E H, 1986, Medical Expert Systems—Knowledge Tools for
Physicians, The Western Journal of Medicine, vol. 145, no. 6, pp. 830-839

[7] Bonfa, I, Maioli, C, Sarti, F, Milandri, G L & Monte, D P R, 1993, HERMES:
an Expert System for the Prognosis of Hepatic Diseases, First New Zealand
International Two-Stream Conference on Artificial Neural Networks and
Expert Systems, Dunedin, New Zealand, pp. 240-246.

[8] Hudson, D L & Cohen, M E, 1994, Fuzzy Logic in Medical Expert Systems,
IEEE Engineering in Medicine and Biology, vol. 13, no. 5, pp. 693-698.

[9] Schmidt, R, Montani, S, Bellazzi, R, Portinale, L & Gierl, L, 2001, Case-based
Reasoning for Medical Knowledge-based Systems, International Journal of
Medical Informatics, vol. 64, no. 2-3, pp. 335-367.

[10] Keleş, A & Keleş, A, 2006, ESTDD: Expert System for Thyroid Disease
Diagnosis, Expert Systems with Applications, vol. 34, no. 1, pp. 242-246.

[11] Akbarzadeh-T, M R & Moshtag-Khorasani, M, 2007, A Hierarchical Fuzzy
Rule-based Approach to Aphasia Diagnosis, Journal of Biomedical Informatics,
vol. 40, no. 5, pp. 465-475.

[12] Kumar, K A, Singh, Y & Sanyal, S, 2009, Hybrid Approach Using Case-based
Reasoning and Rule-based Reasoning for Independent Clinical Decision
Support in ICU, Expert Systems with Applications, vol. 36(1), pp. 65-71.

[13] D. M. Hembry, 1990, Knowledge-based systems in the AD/Cycle environment,
IBM Systems Journal, 29(2), 274-286.

[14] Paul Beynon-Davies, 1991, Expert Database Systems: A Gentle Introduction,
McGraw-Hill Book Company (UK) Limited, ISBN 0-07-707240-5.

[15] D. Partridge and K. M. Hussain, 1995, Knowledge Based Information Systems,
McGraw-Hill Book Company Europe, ISBN 0-07-707624-9.

[16] Thomas Connolly and Carolyn Begg, 1998, Database Systems: A Practical
Approach to Design, Implementation, and Management, Addison Wesley,
ISBN 0-201-34287-1.

[17] J. L. Alty and M. J. Coombs, 1984, Expert Systems: Concepts and Examples,
NCC Publications, ISBN 0-85012-399-2.

[18] Shipman, 1981, The Functional Data Model and the Data Language
DAPLEX, Computer Corporation of America, ACM Transactions on Database
Systems, 6 (1981) 140-173.

[19] Peter Jackson, 1999, Introduction to Expert Systems, Addison Wesley Longman
Limited, ISBN 0-201-87686-8.

[20] A. S. Belward and C. R. Valenzuela, 1991, Remote Sensing and Geographical
Information Systems for Resource Management in Developing Countries, 1st
edition, Kluwer Academic Publishers, Netherlands.

[21] S. Miksch, M. Dobner, C. Popow and W. Horn, 1993, VIE-PNN: An
Expert System for Parenteral Nutrition of Neonates, Proceedings of the Ninth
Conference on Artificial Intelligence for Applications, Florida, USA,
pp. 285-291.

[22] T. Rajkumar and Jorge E. Bardina, 2003, Web-based Weather Expert System
(WES) for Space Shuttle Launch, IEEE International Conference on Systems,
Man and Cybernetics, vol. 5, no. 5, pp. 5040-5045.

[23] A. N. Karkishchenko, 2008, A Cost Function for Backward Chaining
Inference, Fuzzy Information Processing Society in North America, 2008,
New York, New York, pp. 1-5.

[24] A. B. Cremers, B. Igel, and G. Reichwein, 1988, Distributed Rule-based
ASTsystems, Proceedings of Workshop on the Future Trends of Distributed
Computing Systems in the 1990s, Hong Kong, pp. 311-317.

[25] H.L. Akin, V. Altin 1991, Rule-based Fuzzy Logic Controller for a PWR-type
Nuclear Power Plant, IEEE Transactions on Nuclear Science, vol. 38, no. 2,
pp. 883-890.

[26] Chun Siu, Qiang Shen, R. Milne, 1997, A Fuzzy Expert System for Vibration
Cause Identification in Rotating Machines, Proceedings of the Sixth IEEE
International Conference on Fuzzy Systems, Barcelona, Spain, vol. 1, no. 1,
pp. 555-560.

[27] F. Bacchus and Y. Whye Teh, 1998, Making Forward Chaining Relevant,
AAAI Conference paper.

[28] R. Cucchiara, M. Piccardi & P. Mello, 2000, Image Analysis and Rule-based
Reasoning for a Traffic Monitoring System, IEEE Transactions on Intelligent
Transportation Systems, vol. 1, no. 2, pp. 119-130.

[29] A. Poletykin, M. Byvaykov and A. Baybulatov 2008, ABIS: A Language of
Intelligent Systems, Conference on Cybernetics and Intelligent Systems,
pp. 226-229.

[30] R. Doraiswami, J. Jiang 1989, Performance monitoring in expert control
systems, Automatica-Journal of IFAC, Vol. 25, no. 6, pp. 799-811.

[31] T S Tung, L D Cung and J F Chicharo, 1990, DESPLATE: An Expert
System for Abnormal Shape Diagnosis in the Plate Mill, Transactions on
Industry Applications, Vol. 26, no. 6, pp. 1057-1062.

[32] Erik T. H. Fung, 2000, Abductive Approach to Prototyping Data Flow
Diagrams, Asia-Pacific Conference on Quality Software (APAQS), pp. 306-314.

[33] X. Luo, M. Kezunovic, 2005, An expert system for diagnosis of digital relay
operation, Proceedings of the 13th International Conference on Intelligent
Systems Application to Power Systems, pp. 175-180.

[34] J. Miles and C. Moore, 1994, Practical knowledge based systems for
conceptual design, pp. 35-53, Springer-Verlag, Berlin, Germany.

[35] Hojjat Adeli (ed.) 1988, Expert Systems in Construction and Structural
Engineering, Chapman and Hall, London, England.

[36] P. W. Mullarkey, 1987, Languages and Tools for Building Expert Systems, in
Expert Systems for Civil Engineers, (ed. Maher M.L.), American Society of
Civil Engineers, New York, U.S.A., pp. 15-34.

[37] G. Benchimol, P. Levine and J. C. Pomerol, 1987, Developing Expert Systems
for Business, North Oxford Academic Publishers Ltd, Oxford, England.

[38] Lucas, P J F 2001, Certainty-factor-like structures in Bayesian belief
networks, Knowledge-Based Systems, Vol. 14, pp. 327-335.

[39] Chen, I & Lin, M T 1988, Resolution of incomplete conditions in rule-based
expert systems, System Theory, Proceedings of the Twentieth Southeastern
Symposium, March 1988, pp. 514-518.

[40] B. G. Buchanan and E. H. Shortliffe, 1984, Rule-Based Expert Systems: The
MYCIN Experiments of the Stanford Heuristic Programming Project, Addison
Wesley, Reading, MA.

[41] Shortliffe, E H, Buchanan, B G & Feigenbaum, E A 1979, Knowledge
engineering for medical decision making: A review of computer-based clinical
decision aids, Proceedings of the IEEE, Vol. 67, No. 6, pp. 1207-1224.

[42] Adams, J B 1976, A probability model of medical reasoning and the MYCIN
model, Mathematical Biosciences, Vol. 32, pp. 177-186.

[43] Heckerman, D 1986, An axiomatic framework for belief updates, 11-22.

[44] Dan, Q & Dudeck, J 1992, Some Problems Related with Probabilistic
Interpretations for Certainty Factors, Proceedings of the Fifth Annual IEEE
Symposium on Computer-Based Medical Systems, pp. 538-545.

[45] Borcherts, R, Collier, C, Koch, E & Bennett, R 1991, Database accuracy
effects on Vehicle positioning as measured by the certainty factor, Vehicle
Navigation and Information Systems Conference, Vol. 2, pp. 291-294.

[46] Wu M, She J H, Nakano M & Gui W 2000, Expert control and fault diagnosis
of the leaching process in a zinc hydrometallurgy plant, Control Engineering
Practice, Vol. 10, no. 4, pp. 433-442.

[47] L. A. Zadeh, 1965, Fuzzy Sets, Information and Control, 8, pp. 338-353.

[48] L. A. Zadeh, 1984, A computational theory of dispositions, Proceedings of the
22nd annual meeting on Association for Computational Linguistics,
pp. 312-318.

[49] L. A. Zadeh, 1983, The Role of Fuzzy Logic in the Management of Uncertainty
in Expert Systems, Fuzzy Sets and Systems, 11, 3.

[50] L. A. Zadeh, 1983, Linguistic variables, approximate reasoning and
dispositions, Medical Informatics 8 (1983b), 173-186.

[51] Tseng, H C & Teo, D W 1994, Medical Expert System with Elastic Fuzzy
Logic, paper presented in IEEE World Congress on Computational
Intelligence, Florida, USA, 26-29.

[52] Fieschi M, Joubert M, Fieschi D, Soula G, Roux M, 1982, Sphinx: an
interactive system for medical diagnosis aids, In: M M Gupta and E Sanchez
(Eds), Approximate Reasoning in Decision Analysis, North-Holland.

[53] Soula G, Sanchez E, 1982, Soft deduction rules in medical diagnosis
processes, In: M M Gupta and E Sanchez (Eds), Approximate Reasoning in
Decision Analysis, North-Holland.

[54] Ifeachor E C, Curnow, J S K, Outram N J & Skinner J F 2001, Models for
handling uncertainty in fetal heart rate and ECG analysis, Proceedings of the
23rd Annual

[55] Muller, H, Rehbold, R and Emshoff, H 1993, A fuzzy logic expert system for
detecting generator H2 leaks, Third International Conference on Industrial
Fuzzy Control and Intelligent Systems, pp. 185-190.

[56] Hall, M B & Harris, C A 1993, Fuzzy logic expert system for iron ore
processing, Conference Record of the 1993 IEEE Industry Applications
Society Annual Meeting, Vol. 3, pp. 2190-2199.

[57] L. A. Zadeh, 1994, Fuzzy Logic: Issues, Contentions and Perspectives, IEEE
International Conference on Acoustics, Speech, and Signal Processing
(ICASSP-94), Vol. 6, pp. VI/183, New York, USA.

[58] Finn V. Jensen, An Introduction to Bayesian Networks, UCL Press, 1996.

[59] Stuart Russell and Peter Norvig, 1995, Artificial Intelligence: A Modern
Approach, Prentice Hall, second edition.

[60] Judea Pearl, 1988, Probabilistic Reasoning in Intelligent Systems, Morgan
Kaufmann, San Mateo, CA.

[61] Cooman, G, & Zaffalon, M 2004, Updating beliefs with incomplete
observations, vol. 159, no. 1-2, pp. 75-125.

[62] Ng, K C and Abramson, B 1990, Uncertainty Management in Expert Systems,
IEEE Expert: Intelligent Systems and Their Applications, Vol. 5, no. 2,
pp. 29-48.

[63] A. Aamodt, E. Plaza, 1994, Case-based reasoning: foundational issues,
methodological variations, and system approaches, Artificial Intelligence
Communications 7, pp. 39-59.

[64] B.P. Allen, 1994, Case-based reasoning: business applications,
Communications of the ACM, Vol. 37, No. 3, pp. 40-42.

[65] J. Hunt, 1995, Evolutionary case based design, in: I.D. Watson (Ed.), Progress
in Case-based Reasoning, LNAI 1020, Springer, Berlin, pp. 17-31.

[66] D.B. Leake, 1996, Case-Based Reasoning: Experiences, Lessons and Future
Direction, AAAI Press/MIT Press, Menlo Park, CA.

[67] G. Finnie, Z. Sun, 2003, R5 model for case-based reasoning,
Knowledge-Based Systems, Vol. 16, pp. 59-65.

[68] Reategui, E B, Campbell, J A & Leao, B F 1997, A Case-Based Model that
Integrates Specific and General Knowledge in Reasoning, Applied
Intelligence, Vol. 7, No. 1, pp. 79-90.

[69] Chen, W C, Tseng, S S, Chen, J H and Jiang, M F 2000, A Framework of
Features Selection for the Case-based Reasoning, IEEE International
Conference on Systems, Man, and Cybernetics, Vol. 1, pp. 1-5.

[70] Dempster, A.P., 1968. A generalization of Bayesian inference. Journal of the
Royal Statistical Society, Series B, 30, 205-247.

[71] Shafer, Glenn, 1976. A Mathematical Theory of Evidence. Princeton
University Press.

[72] Petrik, M 2004, Knowledge Representation for Expert Systems, Presented at
International Conference for Undergraduate and Graduate Students of Applied
Mathematics.

[73] J.M. Merigó, M. Casanovas, 2007. Using fuzzy OWA operators in decision
making with Dempster-Shafer belief structure, In Proceedings of the 16th
AEDEM International Conference, Krakow, Poland, pp. 475-486.

[74] L. A. Zadeh, 1986, A Simple View of the Dempster-Shafer Theory of
Evidence and its Implication for the Rule of Combination, The AI Magazine,
pp. 85-90.

[75] U. Rakowsky, 2007, Fundamentals of the Dempster-Shafer theory and its
applications to system safety and reliability modelling - RTA # 3-4, December
- Special Issue.

[76] Ahmadzadeh, M.R., Petrou, M., 2001, International Geoscience and Remote
Sensing Symposium, IGARSS, IEEE, vol. 2, pp. 861-863.

[77] Prakash P. Shenoy, 1994, Using Dempster-Shafer's Belief-function Theory in
Expert System, In: Yager, R. R., M. Federizzi, and J. Kacprzyk (Eds.),
Advances in the Dempster-Shafer Theory of Evidence, 1994, pp. 395-414,
John Wiley & Sons, New York, NY.

[78] O. Cordon, F. Herrera, F. Hoffmann, L. Magdalena, Genetic Fuzzy Systems:
Evolutionary Tuning and Learning of Fuzzy Knowledge Base, Advances in
Fuzzy Systems - Applications and Theory Vol. 19, World Scientific,
Singapore, ISBN 981-02-4017-1.

[79] W. Pedrycz (Ed.), 1996, Fuzzy Modelling: Paradigms and Practice, Kluwer
Academic Press.

[80] D. Driankov, H. Hellendoorn, and M. Reinfrank, 1993, An Introduction to
Fuzzy Control. Springer-Verlag.

[81] Z. Chi, H. Yan, and T. Pham, 1996, Fuzzy Algorithms: With Applications to
Image Processing and Pattern Recognition, World Scientific.

[82] L. A. Zadeh, 1973, Outline of a new approach to the analysis of complex
systems and decision processes, IEEE Transactions on Systems, Man and
Cybernetics, Vol. SMC-3, No. 1, pp. 28-44.

[83] L. A. Zadeh, 1983, A computational approach to fuzzy quantifiers in natural
languages, Computers and Mathematics with Applications, 9:149-184.

[84] H. J. Zimmermann, 1993, Fuzzy set theory - and its applications, Kluwer,
Boston, second edition.

[85] S.T. Wang, Fuzzy System and Fuzzy Neural Networks, Shanghai Science and
Technology Press, 1998, Edition 1.

[86] E.H. Mamdani and S. Assilian, 1975, An experiment in linguistic synthesis
with a fuzzy logic controller. International Journal of Man-Machine Studies,
7:1-13.

[87] M. Sugeno and G.T. Kang, 1988, Structure identification of fuzzy model.
Fuzzy Sets and Systems, 28:15-33.

[88] T. Takagi and M. Sugeno, 1985, Fuzzy identification of systems and its
applications to modeling and control. IEEE Transactions on Systems, Man,
and Cybernetics, 15(1):116-132.

[89] L. X. Wang, 1994, Adaptive Fuzzy Systems and Control: Design and
Analysis. Prentice-Hall.

[90] Nauck, D. and R. Kruse, 1997, A neuro-fuzzy method to learn fuzzy
classification rules from data. Fuzzy Sets and Systems 89, 377-388.

[91] Shann, J. J. and H. C. Fu, 1995, A fuzzy neural network for rule acquiring on
fuzzy control systems. Fuzzy Sets and Systems 71, 345-357.

[92] Takagi, H. and I. Hayashi, 1991, NN-driven fuzzy reasoning. International
Journal of Approximate Reasoning 5(3), 191-212.

[93] Takagi, H., N. Suzuki, T. Koda, and Y. Kojima, 1992, Neural networks
designed on approximate reasoning architecture and their applications. IEEE
Transactions on Neural Networks 5(5), 752-760.

[94] Delgado, M., A. F. Gomez-Skarmeta, and F. Martin, 1997, A fuzzy clustering
based rapid-prototyping for fuzzy rule-based modeling. IEEE Transactions on
Fuzzy Systems 5(2), 223-233.

[95] Yoshinari, Y., W. Pedrycz, and K. Hirota, 1993, Construction of fuzzy models
through clustering techniques, Fuzzy Sets and Systems 54, 157-165.

[96] Cordon, O. and F. Herrera, 1995, A general study on genetic fuzzy systems.
In: J. Periaux, G. Winter, M. Galan, and P. Cuesta (Eds.), Genetic Algorithms
in Engineering and Computer Science, pp. 33-57. John Wiley and Sons.

[97] Cordon, O., F. Herrera and M. Lozano, 1997, A classified review on the
combination fuzzy logic-genetic algorithms bibliography: 1989-1995. In: E.
Sanchez, T. Shibata, and L. Zadeh (Eds.), Genetic Algorithms and Fuzzy
Logic Systems. Soft Computing Perspectives, pp. 209-241. World Scientific.

[98] J. S. R. Jang, 1993, ANFIS: Adaptive-network-based fuzzy inference system.
IEEE Transactions on Systems, Man, and Cybernetics, 23(3):665-685.

[99] L.-X. Wang and J.M. Mendel, 1992, Generating fuzzy rules by learning from
examples, IEEE Transactions on Systems, Man, and Cybernetics,
22(6):1414-1427.

[100] S. Mitra and Y. Hayashi, 2000, Neuro-fuzzy Rule Generation: Survey in Soft
Computing Framework, IEEE Trans. on Neural Network, vol. 11, no. 3,
pp. 748-768.

[101] Sankar K. Pal, Pabitra Mitra, 2001, Case Generation: A Rough-fuzzy
Approach, In: Proc. Intl. Conf. Case Based Reasoning (ICCBR2001),
Vancouver, Canada.

[102] S. K. Pal and A. Skowron (Eds), 1999, Rough Fuzzy Hybridization: A New
Trend in Decision Making, Singapore: Springer-Verlag.

[103] S. K. Pal, S. Mitra, and P. Mitra, 2003, Rough-Fuzzy MLP: Modular
Evolution, Rule Generation, and Evaluation, IEEE Trans. on Knowledge and
Data Engineering, vol. 15, no. 1, pp. 14-25.

[104] Gang Xie, Fang Wang, Keming Xie, 2004, RST-Based System Design of
Hybrid Intelligent Control, IEEE International Conference on Systems, Man
and Cybernetics.

[105] Teach, R.L. and Shortliffe, E.H. 1981, An analysis of physician attitudes
regarding computer-based clinical consultation systems. Comput. Biomed.
Res. 14, pp. 542-558.

[106] Wallis, J.W. and Shortliffe, E.H. 1982, Explanatory power for medical expert
systems: studies in the representation of causal relationships for clinical
consultations. Meth. Info. Med. 21, pp. 127-136.


Chapter 3
Interactive Intelligent System for Medical Diagnosis
Development of intelligent systems is a major area of research nowadays. This chapter
presents the design of an Interactive Intelligent System for Medical Diagnosis (IISMD)
that employs fuzzy logic for knowledge representation and Generalized Modus Ponens
(GMP) for inferencing. The system developed here may be used as an assistant to the
medical practitioner for the diagnosis of diseases in a particular domain, here
convulsion in infancy.¹

3.1 Introduction

An intelligent system is an information system that provides answers to queries relating
to the information stored in the Knowledge Base (KB), which is a repository of human
knowledge. Development of intelligent systems is an important area of research, and a
number of papers have been published in the recent past. A fuzzy classifier based
medical decision support system has been developed by I-Jen Chiang et al. to deal with
highly uncertain, noisy data [5]. An automated model for medical diagnosis using fuzzy
stochastic techniques for monitoring chronic diseases in aged people has been developed
by L. Jeanpierre and F. Charpillet [11]. An expert system for respiratory diseases is
described in the work of Musbah Jumah Aqel [12]. Successful development of an expert
system in the domain of child growth and development is reported in the work of Saha,
Mondal and Samanta [1, 14].

IISMD helps doctors in the diagnosis of diseases. It is an interactive intelligent
system in which the human reasoning process is simulated in a program. The elements
of reasoning are the KB and the inference engine.

The KB for a medical diagnosis system may contain facts and rules that are sometimes
uncertain, incomplete or fuzzy, because the knowledge is gained from experience and
acquired from a human expert [9, 18]. Moreover, the user of the system, a medical
practitioner, may not always be completely certain that the value he provides for a
variable, acquired from the patient, is one hundred percent correct. Naturally, a
conclusion based on the KB of such a system must carry some uncertainty. We have
stored facts and rules in the KB of IISMD as production rules, as seen in many
rule-based expert systems [6, 15, 16].

¹ This chapter is based on the published paper: S. Mukhopadhyay, J. Ghosh, D. Ghosh
Dastidar, IISMD: An Interactive Intelligent System for Medical Diagnosis, AMSE Journal
of Modeling, Measurement and Control, France, Series C, Vol. 68, No. 3, pp. 1-12, 2007.

Appropriate inference strategies [9, 15, 16] have been used for the diagnosis process.
We have used GMP [10] for inferencing. GMP is an extension of standard modus ponens
in which statements (propositions and predicates) are characterized by fuzzy sets,
i.e., premises need not be fully matched.
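
To make the idea of partial matching concrete, the following Python sketch applies GMP
through the max-min compositional rule of inference. It is an illustration only: the
membership values, the function name gmp and the use of min as the implication
operator are assumptions, and IISMD itself is written in LISP and uses the CF
calculus of Section 3.2.4.

```python
def gmp(a, b, a_prime):
    """Generalized Modus Ponens via max-min composition.

    a, b    : membership lists of the rule 'If X is A then Y is B'
    a_prime : membership list of the observed fact 'X is A-prime'
    Returns the membership list of the inferred 'Y is B-prime'.
    """
    # Degree to which the observed fact overlaps the premise.
    match = max(min(x, y) for x, y in zip(a_prime, a))
    # Min implication: the conclusion is clipped at the matching degree.
    return [min(match, y) for y in b]


A = [0.0, 0.5, 1.0, 0.5]        # premise fuzzy set (hypothetical values)
B = [0.2, 1.0, 0.6]             # conclusion fuzzy set (hypothetical values)
A_PRIME = [0.0, 0.3, 0.7, 1.0]  # observed fact, only partly matching A

print(gmp(A, B, A_PRIME))       # conclusion clipped at the matching degree 0.7
print(gmp(A, B, A))             # a full match returns B unchanged
```

With a fully matching fact the rule behaves like classical modus ponens; a weaker
match still yields a conclusion, only with reduced membership.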

The chapter is organized as follows. Section 3.2 describes the knowledge base, i.e.,
the domain of diseases causing convulsion (fit) in infancy, and the inference strategy
used for the interpretation of diseases. Section 3.3 presents the observations from a
particular consultation session, and Section 3.4 concludes with the important features
of the intelligent system.

3.2 Knowledge Base and Inference Procedure

The domain of IISMD is a set of diseases causing convulsion in infancy.

3.2.1 Knowledge Base of IISMD

The KB of IISMD covers a set of diseases [13] whose primary symptom is convulsion, or
fit. A fit is defined as an attack of involuntary tonic or clonic movements of the
limbs, trunk and face, with or without loss of consciousness. The diseases covered in
the system are:

1) Symptomatic convulsion

(i) Febrile convulsion

(ii) Neurological causes

(a) Meningitis

(b) Encephalitis


(c) Brain abscess

(d) Hypertensive encephalopathy

(iii) Miscellaneous: Breath holding spasm

2) Idiopathic convulsion

(i) Epilepsy

(ii) Infantile spasm

3.2.2 Parameters Describing the Domain

The parameters describing the domain of the intelligent system are divided into two
categories: user obtained parameters and rule deduced parameters. User obtained
parameters are those supplied by the user during a consultation session. The user may
be a doctor or any person having some knowledge of the domain of IISMD. Examples of
user obtained parameters are current-complaint, past-illness, family-history etc. Rule
deduced parameters are those supplied by the system itself at execution time. Examples
of rule deduced parameters are look-for, eeg-for, invest etc.

The parameters and their legal values used in the domain of IISMD are stored in the
variable LEGALVALS in the file RULEBASE1.LIB (see appendix). The CAAR of LEGALVALS
returns the parameter and the CDAR returns the legal value(s) of the parameter. The
user is free to give any other value for a parameter, but in that case he has to type
the value instead of selecting an option (see sample session). The parameters pat-name
(patient name), pat-age (patient age), current-complaint, past-illness and
family-history are supplied in the first stage of consultation. Some clinical
examinations may be required for the diagnosis; they are described by the parameters
skull-exam, skin-exam, limb-exam, neurological-findings, convulsion-nature,
post-convulsive-disorder, blood-pressure etc. Their legal values are displayed on the
console, and the user selects options to supply the values of the parameters.


The diagnosis process can be described as the establishment of values of the various
parameters that describe the result of consultation. The values of these parameters
are established in different stages of consultation. The investigation of convulsion
should include a careful case history, with attention paid to each system of the body
in turn, followed by a detailed physical examination. Special investigations include
lumbar puncture (csf test) and X-ray examination of the skull and of the limbs.
Electroencephalography (eeg) is used to search for focal disturbances around organic
lesions. If hypertension is present, renal function must be investigated to determine
whether the convulsions are due to the blood pressure itself
(hypertensive-encephalopathy). The parameters blood-pressure and limb-exam are used
for the diagnosis of hypertensive-encephalopathy in acute-nephritis as the cause of
convulsion.

Lumbar puncture (csf test) is required for the diagnosis of meningitis. The parameters
that characterize the csf test for meningitis are csf-pressure, csf-appear (csf
appearance), pred-csf-cell-type (predominating csf cell type), csf-protein-mgpercent
(csf protein in mg percent), csf-sug-mgpercent (csf sugar in mg percent),
csf-cell-count and pyogenic-organism-csf-smear-on-gramstain.

In order to detect idiopathic (epileptic type) convulsion [13], current-complaint may
not provide any illuminating evidence, but the following clinical parameters are very
important: convulsion-nature, conv-durn-secs (convulsion duration in seconds) and
post-conv-disorder (post convulsion disorder). The values of these parameters are used
to conclude that the investigation eeg is required, along with the probable type of
epilepsy. This is implemented by inferring the type of epilepsy as the value of the
parameter eeg-for, from which the values of the parameters clinically-asc-disease and
invest (investigation required) are established. Once the value of invest is
established, the value of invest-asc-disease may be obtained.

3.2.3 Knowledge Representation

Knowledge is represented in the form of fuzzy production rules. The concept of a fuzzy
production rule is derived from the production rules popularly used in rule-based
systems. The format of the rule is

If X is A then Y is B with CF = α

where the premise 'X is A' and the conclusion 'Y is B' are fuzzy propositions and α is
a numerical value known as the confidence factor, CF. The CF values vary from 0 to 100
(for convenience in dealing with LISP). Zadeh took this value from 0 to 1 [10].

The premise portion of a rule in a rule base may contain more than one clause
connected by the operator AND.

[Example 3.1]: If X and Y then Z with CF = α

Consider a rule as follows:

If ‘eeg-for’ is ‘grandmal-epilepsy’ and ‘past-illness’ is ‘grandmal-epilepsy’

Then ‘clinical-asc-disease’ is ‘grandmal-epilepsy’ with CF 100.

This rule is written as rule no. 45 of the rule base as follows:

(RULE-045 (conclusion (clinical-asc-disease grandmal-epilepsy 100))
          (premise ((SAME eeg-for grandmal-epilepsy)
                    (SAME past-illness grandmal-epilepsy))))

Note that no OR connective has been provided in the rule structure, since the same
effect can be achieved by including another rule with the alternative preconditions
and the same conclusion.
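
As a rough illustration of this convention, the Python sketch below stores RULE-045
and RULE-046 (both listed in Section 3.3) as data and shows that two rules with the
same conclusion act as a disjunction. The dictionary layout and the helper names
clause_holds and rule_applies are hypothetical, and CF handling is left out on
purpose; the actual rule base is the LISP structure shown above.

```python
RULES = {
    "RULE-045": {
        "conclusion": ("clinical-asc-disease", "grandmal-epilepsy", 100),
        "premise": [("SAME", "eeg-for", "grandmal-epilepsy"),
                    ("SAME", "past-illness", "grandmal-epilepsy")],
    },
    "RULE-046": {
        "conclusion": ("clinical-asc-disease", "grandmal-epilepsy", 100),
        "premise": [("SAME", "eeg-for", "grandmal-epilepsy"),
                    ("SAME", "family-history", "grandmal-epilepsy")],
    },
}


def clause_holds(clause, facts):
    op, param, value = clause
    same = facts.get(param) == value
    return same if op == "SAME" else not same   # NOTSAME negates the test


def rule_applies(rule, facts):
    # Clauses inside one premise are joined by AND.
    return all(clause_holds(c, facts) for c in rule["premise"])


# Only the family history points to epilepsy; past-illness is unknown.
facts = {"eeg-for": "grandmal-epilepsy", "family-history": "grandmal-epilepsy"}
fired = [name for name, rule in RULES.items() if rule_applies(rule, facts)]
print(fired)   # the conclusion is reached through the second rule alone
```

Either rule is enough to establish the conclusion, which is exactly what an OR between
the past-illness and family-history clauses would express.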

The parameters and their legal values are stored in the variable LEGALVALS. The
structure of LEGALVALS in LISP is shown partially as follows:

(SETQQ LEGALVALS
  ((consciousness-state (drowsy irritable normal))
   (past-illness (febrile-convulsion petitmal-epilepsy grandmal-epilepsy otitis
                  pneumonitis scabies mumps measles chicken-pox mastoiditis
                  endocarditis congenital-cyanotic-heart-disease
                  craniocerebral-wound paranasal-sinus-infection))
   (current-complaint (pyrexia vomiting nausea persistent-headache
                       feeding-refusal oliguria visual-disturbance))
   ………………………………………………
   (pred-csf-celltype (lymphocyte monocyte))
   ……………………………………………...)

The meanings of the parameters and values are stored in MEANING-PARA-VAL. The
structure of MEANING-PARA-VAL in LISP is shown partially as follows:

(SETQQ MEANING-PARA-VAL
  ((pyrexia (rising fever))
   (oedema (information of limb))
   (conv-durn-sec (convulsion of the limb))
   (grandmal-epilepsy (convulsion nature is rapid))
   …………………………………..
   (eeg (electro encephalography))
   ………………………………...)

The values of some variables, which are fuzzy subjective characterizations given by the
domain expert, are represented using 10-point possibility distributions [8], as shown
in Figure 3.1.

In LISP these distributions are implemented as in the following example:

[Example 3.2]:

(conv-durn-secs ((high (110 130 150 170 190 210 230 250 270 290)
                       (50 90 100 100 100 100 100 100 100 100))))

where the first list gives sample values of the variable conv-durn-secs (convulsion
duration in seconds) for the fuzzy term high, and the second list gives the
corresponding degrees of membership in high.

Similarly,

(conv-durn-secs ((med (40 50 60 70 80 90 100 110 120 130)
                      (20 50 90 100 100 100 100 100 90 30))))

and

(conv-durn-secs ((low (20 30 40 50 60 70 80 90 100 110)
                      (100 100 100 90 80 70 60 50 40 30))))
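
One way to read such a 10-point distribution is as a sampled membership function. The
Python sketch below looks up the degree of membership of a crisp duration in the term
high. Linear interpolation between neighbouring sample points is an assumption made
for illustration, as is the function name membership; the source does not say how
IISMD treats values between or outside the ten samples, so the sketch rejects values
outside the sampled range.

```python
# The 'high' term of conv-durn-secs, copied from Example 3.2.
HIGH = ([110, 130, 150, 170, 190, 210, 230, 250, 270, 290],
        [50, 90, 100, 100, 100, 100, 100, 100, 100, 100])


def membership(term, x):
    """Degree (0-100) to which the crisp value x belongs to the fuzzy term."""
    points, degrees = term
    if not points[0] <= x <= points[-1]:
        raise ValueError("value outside the sampled range")
    for i in range(1, len(points)):
        if x <= points[i]:
            x0, x1 = points[i - 1], points[i]
            d0, d1 = degrees[i - 1], degrees[i]
            # Linear interpolation between the two neighbouring samples.
            return d0 + (d1 - d0) * (x - x0) / (x1 - x0)


print(membership(HIGH, 120))   # halfway between the samples at 110 and 130
print(membership(HIGH, 200))   # well inside the plateau of 'high'
```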

The following is a sample rule in the KB employing the above idea:

(RULE-058 (conclusion (eeg-for grandmal-epilepsy 60))
          (premise ((NOTSAME signs-of mening-enceph)
                    (SAME conv-durn-secs high))))

where high is represented as a fuzzy set.

Two rules in the rule base are not of the IF-THEN form: the initial data rule and the
goal rule. They are used to initialize the data and to set the parameters relating to
the goal. In IISMD the goal is the diagnosis of diseases, which are either clinically
ascertained or ascertained by investigation. The two rules are as follows:

(RULE-000 (initialization (pat-name, age-in-months, current-complain,
                           past-illness, family-history)))

(RULE-026 (goal (invest clinically-asc-disease invest-asc-disease)))

[Figure 3.1: Graphical representation of the 10-point possibility distributions of the
fuzzy variables high, med and low (three panels: conv-durn-secs high, conv-durn-secs
med, conv-durn-secs low)]


3.2.4 Inferencing

The inference engine of IISMD uses both forward chaining and backward chaining
inference strategies. It works in forward mode up to a certain point and then switches
to backward chaining.

A temporary storage called CACHE is used to store the result of consultation. The
values of the various parameters are established here using the match-select-execute
inference cycle of a production (rule-based) system, as shown in Figure 3.2.

[Figure 3.2: The production system inference cycle. Diagram labels: KB, CACHE,
(1) Match, Conflict Set, (2) Select, (3) Execute]
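
The match-select-execute cycle can be sketched as a small forward-chaining loop in
Python. The sketch uses simplified, CF-free versions of RULE-044 and RULE-045 from
Section 3.3. The tuple layout of the rules, the helper names holds and run, the
"first candidate wins" selection and the "fire each rule once" policy are all
assumptions made for illustration; the conflict resolution actually used by IISMD is
not described here.

```python
RULES = [
    ("RULE-044",
     [("SAME", "post-convulsive-disorder", "sleep"),
      ("NOTSAME", "signs-of", "mening-enceph")],
     ("eeg-for", "grandmal-epilepsy")),
    ("RULE-045",
     [("SAME", "eeg-for", "grandmal-epilepsy"),
      ("SAME", "past-illness", "grandmal-epilepsy")],
     ("clinical-asc-disease", "grandmal-epilepsy")),
]


def holds(clause, cache):
    op, param, value = clause
    same = cache.get(param) == value
    return same if op == "SAME" else not same


def run(rules, cache):
    fired = []
    while True:
        # 1. Match: rules whose premises hold in the CACHE and have not fired.
        conflict_set = [r for r in rules
                        if r[0] not in fired and all(holds(c, cache) for c in r[1])]
        if not conflict_set:
            return fired
        # 2. Select: take the first candidate (assumed strategy).
        name, _, (param, value) = conflict_set[0]
        # 3. Execute: record the conclusion in the CACHE.
        cache[param] = value
        fired.append(name)


cache = {"post-convulsive-disorder": "sleep", "past-illness": "grandmal-epilepsy"}
print(run(RULES, cache))
print(cache["clinical-asc-disease"])
```

RULE-045 cannot match on the first pass because eeg-for is still unknown; it enters
the conflict set only after RULE-044 has written its conclusion to the CACHE.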

For inferencing, a MYCIN-like inference procedure [2, 9, 16] is adopted. The expert
assigns a confidence factor (CF) to the parameter in each THEN statement purely from
his experience. During a consultation session, the user may select the CF of a
parameter value or input it from the console. For an IF statement to be true, its CF
must exceed a certain limit (20 in our case). Premises may be combined by means of the
AND function, and the CF of the IF statement is the minimum of the CFs of the premises
(as per fuzzy logic). When the conditions of the IF statement are found to be met, the
action in the statement is taken into consideration and a new CF for the parameter in
the THEN statement is calculated as follows:

(CF of the parameter in the IF statement * CF of the parameter in the THEN statement)/100

Conceptually, if we are 50% certain of the evidence and 50% certain that a rule
applies, we are 25% certain of the inference.

If the same conclusion can be derived from different sources, then the CF of the THEN
statement is computed using the formula

CF1 + ((100 - CF1) * CF2)/100
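
Both formulas are small enough to check by hand. The Python sketch below restates them
on the 0 to 100 scale used in this chapter. The function names premise_cf,
conclusion_cf and combine are illustrative and do not come from the IISMD source.

```python
THRESHOLD = 20  # an IF statement counts as true only if its CF exceeds this


def premise_cf(clause_cfs):
    # AND of premises: the minimum CF, as in fuzzy logic.
    return min(clause_cfs)


def conclusion_cf(if_cf, then_cf):
    # CF attached to the THEN parameter when the rule fires, else None.
    if if_cf <= THRESHOLD:
        return None
    return if_cf * then_cf / 100


def combine(cf1, cf2):
    # The same conclusion reached from two different sources.
    return cf1 + (100 - cf1) * cf2 / 100


print(conclusion_cf(premise_cf([50, 80]), 50))  # 50% evidence, 50% rule
print(combine(60, 50))                          # two independent supports
```

The combination never exceeds 100 and does not depend on the order in which the
sources are combined, so supporting rules can be applied as they fire.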

3.3 Observation

The KB of IISMD consists of 90 rules. Only the rules used in the consultation process
of the particular session shown in Appendix-I are listed below, in an English-like
form.

RULE-030: IF current-complaint is visual-disturbance and clinical-findings-of is
raised-intracranial-tension
THEN look-for is space-occ-brain-lesion with CF 100

RULE-042: IF convulsion-nature is tonic-clonic-movt-of-musc and signs-of is not
mening-enceph
THEN eeg-for is grandmal-epilepsy with CF 100

RULE-043: IF post-convulsive-disorder is bladder-bowl-incontinence and signs-of is
not mening-enceph
THEN eeg-for is grandmal-epilepsy with CF 100

RULE-044: IF post-convulsive-disorder is sleep and signs-of is not mening-enceph
THEN eeg-for is grandmal-epilepsy with CF 100

RULE-045: IF eeg-for is grandmal-epilepsy and past-illness is grandmal-epilepsy
THEN clinical-asc-disease is grandmal-epilepsy with CF 100

RULE-046: IF eeg-for is grandmal-epilepsy and family-history is grandmal-epilepsy
THEN clinical-asc-disease is grandmal-epilepsy with CF 100

RULE-047: IF eeg-for is X THEN invest is eeg with CF 100

RULE-056: IF eeg-for is X THEN clinical-asc-disease is X with CF 60

RULE-058: IF conv-durn-secs is high and signs-of is not mening-enceph
THEN eeg-for is grandmal-epilepsy with CF 60

Suppose the user gives the following inputs, as described in the sample consultation
session: current-complaint is visual-disturbance with CF 60, past-illness is
grandmal-epilepsy with CF 70, family-history is grandmal-epilepsy with CF 70,
convulsion-nature is tonic-clonic-movt-of-musc with CF 90, post-convulsive-disorder
is sleep with CF 80 and conv-durn-secs is high with CF 90.

By Rule-042, Rule-043, Rule-044 and Rule-058 the value of the parameter eeg-for is
grandmal-epilepsy with CF 99. By Rule-047 the value of invest is eeg with CF 99,
which can be interpreted as: the investigation required is eeg, with confidence
factor 99.

By Rule-045, Rule-046 and Rule-056 the value of the parameter clinically-asc-disease
is grandmal-epilepsy with CF 96.

Finally, the following conclusion is drawn on the basis of facts supplied by the user:

(invest eeg 99) ( clinically-asc-disease grandmal-epilepsy 96)

i.e. investigation required is Electro encephalography with 99% certainty and


clinically ascertained disease is grandmal epilepsy with 96% certainty.
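The arithmetic behind this session can be sketched as follows. This section does not state the combination scheme, so the sketch assumes a MYCIN-style calculus on a 0-100 scale: a rule's conclusion takes the minimum premise CF scaled by the rule CF, and several rules supporting the same conclusion are combined incrementally. The helper names `fire` and `combine`, and the CF of 100 assumed for "signs-of is not mening-enceph", are illustrative assumptions rather than part of IISMD.

```python
from functools import reduce

def fire(premise_cfs, rule_cf):
    """CF of a rule's conclusion: weakest premise CF scaled by the rule CF."""
    return min(premise_cfs) * rule_cf / 100

def combine(cf_a, cf_b):
    """MYCIN-style combination of two positive CFs for the same conclusion."""
    return cf_a + cf_b * (100 - cf_a) / 100

NOT_MENING = 100  # assumed CF for "signs-of is not mening-enceph" (not given in the text)

# Rule-043 is skipped here: bladder-bowl-incontinence is not among the listed inputs.
eeg_for = reduce(combine, [
    fire([90, NOT_MENING], 100),  # Rule-042: convulsion-nature, CF 90
    fire([80, NOT_MENING], 100),  # Rule-044: post-convulsive-disorder is sleep, CF 80
    fire([90, NOT_MENING], 60),   # Rule-058: conv-durn-secs is high, CF 90
])
invest = fire([eeg_for], 100)     # Rule-047
disease = reduce(combine, [
    fire([eeg_for, 70], 100),     # Rule-045: past-illness, CF 70
    fire([eeg_for, 70], 100),     # Rule-046: family-history, CF 70
    fire([eeg_for], 60),          # Rule-056
])
print(round(invest), round(disease))  # 99 96
```

Under these assumptions the sketch reproduces the reported certainties of 99 and 96. The combined value for eeg-for comes out at about 99 rather than exactly 100, which is consistent with the CF of 99 reported for invest.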

3.4 Conclusion

IISMD has been developed in a medical domain for the diagnosis of diseases causing
convulsion in infancy, but it may be applied to any diagnostic domain with a similar
rule-based structure. The system is working satisfactorily. Our ultimate aim is to
extend the system with neural network classification so that parallel computation
becomes possible.

The special features of IISMD are:

• The user gives input by selecting options, which makes the system interactive.

• As in any expert system, the user can obtain an explanation of the reasoning
process by selecting the WH (why) option.

• The user can obtain the meaning of any parameter by selecting the EP (explanation)
option, so the user need not be an expert.

• The user can abort the consultation session using the AB (abort) option and, if
required, resume it later from the point where it was aborted.

Bibliography

1. A. K. Saha, P. Mondal and R. K. Samanta, A prototyped object-oriented expert
system for child growth and development, Modelling, Measurement and Control
(C), AMSE Press, France, Vol. 31, No. 2, pp. 13-24, (1992).

2. B. G. Buchanan, E. H. Shortliffe, Rule-Based Expert Systems: The MYCIN
Experiments of the Stanford Heuristic Programming Project, (1984).

3. Dwijis K. Dutta Majumder, Artificial Intelligence and expert systems framework,
Proceedings of Eastern Regional Convention of Computer Society of India, pp. 6-12,
(1995).

4. Dan W. Patterson, Introduction to Artificial Intelligence and Expert Systems,
Prentice-Hall of India Private Limited, New Delhi, (2002).

5. I-Jen Chang, Ming-Jium Shieh, Jane Yung-Jen Hsu, Jau-Min Wong, Building a
medical decision support system for colon polyp screening by using fuzzy
classification trees, Applied Intelligence, Vol. 22, pp. 61-75, (2005).

6. J. J. Buckley, W. Siler and Douglas Tucker, A fuzzy expert system, Fuzzy Sets
and Systems, Vol. 20, pp. 1-16, (1986).

7. Krzysztof J. Cios, Inho Shin, Lucy S. Goodenday, Using fuzzy sets to diagnose
coronary stenosis, IEEE Computer, Vol. 24, No. 3, pp. 57-63, (1991).


8. K. S. Leung, M. H. Wong and W. Lam, A fuzzy expert data base system, Data and
Knowledge Engineering, Vol. 4, pp. 287-304, (1989).

9. Keung-chi Ng and Bruce Abramson, Uncertainty management in expert systems,
IEEE Expert: Intelligent Systems and Their Applications, Vol. 5, No. 2, pp. 29-48,
(1990).

10. L. A. Zadeh, A computational theory of dispositions, International Journal of
Intelligent Systems, Vol. 11, pp. 39-63, (1987).

11. Laurent Jeanpierre, Francois Charpillet, Automated medical diagnosis with fuzzy
stochastic models: monitoring chronic diseases, Acta Biotheoretica, Vol. 52, pp.
291-311, (2004).

12. Musbah Jumah Aqel, A cooperative decision-making multi perspective expert
system for respiratory system diseases, Advances in Modeling (B), AMSE Press,
France, Vol. 45, No. 4, pp. 21-28, (2002).

13. Nicholas P. Mann and Angus Nicoll, Handbook of Paediatric, Blackwell Scientific
Publishers, (1989).

14. P. Mondal, A. K. Saha and R. K. Samanta, On an expert system with child growth
and development, Advances in Modelling & Analysis (B), AMSE Press, France,
Vol. 26, No. 1, pp. 13-28, (1993).

15. R. Morpurgo and S. Mussi, I-DSS: An intelligent diagnostic support system,
Expert Systems, Vol. 18, No. 1, pp. 43-58, (2001).

16. S. Mukhopadhyay, A decision support system for medical diagnosis, Journal of
Discrete Mathematical Sciences & Cryptography, Vol. 3, No. 1-3, pp. 179-192,
(2000).

17. V. K. Khanna, Artificial Intelligence, Knowledge Based Systems and Parallel
Computing, Asian Books Private Ltd, New Delhi, (2004).

18. Willam G. W. Magil and A. Stewart, Uncertainty techniques in expert system
software, Decision Support Systems, Vol. 7, pp. 55-65, (1991).


Chapter 4

Rough-Fuzzy Hybridization
The integration of rough sets with fuzzy sets has proved to be an efficient soft
computing strategy for machine learning. Rough-fuzzy hybridization provides a flexible
way of processing information for different real-life, ambiguous decision-making
problems.

Rough set theory was introduced by Zdzislaw Pawlak [1, 2] for the classificatory
analysis of data tables. It provides a systematic framework for studying imprecise and
insufficient knowledge. The main goal of rough set theoretic analysis is to synthesize
approximations (upper and lower) of concepts from the acquired data. While fuzzy set
theory [3] assigns to each object a grade of membership (belongingness) to represent
an imprecise set, rough set theory focuses on the ambiguity caused by the limited
discernibility of objects in the domain of discourse.

4.1 Introduction

Fuzzy set and rough set theories are extensions of classical set theory that describe
uncertainty, imprecision and vagueness in data. The characteristic function of a fuzzy
set assigns a degree of membership in [0, 1]; a rough set, on the other hand, is
characterized by its lower and upper approximations in an approximation space. There
has been extensive theoretical work on the relationships between rough sets and fuzzy
sets [4, 5, 6], and many approaches have been proposed for combining them: rough-fuzzy
sets, fuzzy-rough sets [7, 8, 9], and rough-fuzzy hybridization [10-14]. Because fuzzy
concepts are defined, analysed and operated on as sets, it is simpler to treat a fuzzy
set with set-based methods. One example of a set-based method for combining rough and
fuzzy sets is the more general framework suggested by Klir and Yuan [15].

Rough-fuzzy hybridization, on the other hand, is defined as a hybrid intelligent
system or soft computing method in which fuzzy set theory is used for the linguistic
representation of patterns and rough set theory is used to obtain dependency rules
that construct the knowledge base of the intelligent system.

The main objective of this chapter is to review the theoretical and implementation
approaches to combining rough and fuzzy sets.

The chapter is organized as follows: Section 4.2 presents an overview of the fuzzy
operators of fuzzy set theory. Section 4.3 is devoted to the theoretical discussion of
rough set theory. Section 4.4 describes the different combinations of rough sets and
fuzzy sets. Section 4.5 concludes by describing the interest and scope of rough-fuzzy
hybridization.

4.2 Fuzzy Set Theory

Fuzzy sets are a generalization of sets in which the membership functions take values
in the real interval [0, 1].

A fuzzy set F is defined as

$F = \{(x, \mu_F(x)) \mid x \in U,\ 0 \le \mu_F(x) \le 1\}$ (4.1)

where

U: a universe

x: an element in U

F: a fuzzy set on U

$\mu_F$: a membership function

Utilizing the standard max-min system proposed by Zadeh [3], the fuzzy-set
complement, intersection, and union are defined by

$\mu_{F^c}(x) = 1 - \mu_F(x)$ (4.2)

$\mu_{F \cap G}(x) = \min(\mu_F(x), \mu_G(x))$

$\mu_{F \cup G}(x) = \max(\mu_F(x), \mu_G(x))$

where

F, G: two fuzzy sets defined in a universe U

$\mu_{F^c}$: the membership function of the complement of a fuzzy set F

$\mu_{F \cap G}$, $\mu_{F \cup G}$: the membership functions of the intersection and union of
F and G, respectively

An important feature of fuzzy set operations is that they are truth-functional [16]:
the membership functions of the complement, intersection, and union of fuzzy sets are
obtained solely from the membership functions of the fuzzy sets involved.
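The max-min operations can be illustrated with a minimal Python sketch. It assumes a fuzzy set on a finite universe is stored as a dictionary from elements to membership grades; the function names and the sample sets F and G are hypothetical and serve only as illustration.

```python
def f_complement(mu_f):
    """Standard complement: 1 - membership grade."""
    return {x: 1 - m for x, m in mu_f.items()}

def f_intersection(mu_f, mu_g):
    """Standard intersection: pointwise minimum."""
    return {x: min(mu_f[x], mu_g[x]) for x in mu_f}

def f_union(mu_f, mu_g):
    """Standard union: pointwise maximum."""
    return {x: max(mu_f[x], mu_g[x]) for x in mu_f}

# Two illustrative fuzzy sets on the same universe {a, b, c}.
F = {"a": 0.25, "b": 0.5, "c": 1.0}
G = {"a": 0.75, "b": 0.5, "c": 0.0}

print(f_complement(F))       # {'a': 0.75, 'b': 0.5, 'c': 0.0}
print(f_intersection(F, G))  # {'a': 0.25, 'b': 0.5, 'c': 0.0}
print(f_union(F, G))         # {'a': 0.75, 'b': 0.5, 'c': 1.0}
```

Each result depends only on the grades of F and G at the same element, which is exactly the truth-functional property described above.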

4.3 Rough Set Theory

Knowledge discovery is an important process in data analysis, data mining and
machine learning for extracting knowledge, or information, from very large amounts of
data. Knowledge Discovery in Databases (KDD) is the nontrivial extraction of implicit,
previously unknown and potentially useful information from data [17]. Data
preprocessing is a step of KDD that reduces the complexity of the data, provides a
better environment for subsequent investigation, and presents the data in a more
readable and applicable form. Rough set theory is a tool for developing methods for
the knowledge discovery process. Knowledge discovery is used to support human
decision-making processes or to explain observed phenomena.

Methods of this kind are used to solve different types of problems, such as sampling,
feature selection, clustering or classification, transformation or projection,
dimensionality reduction, rule extraction, and the building of different physical
models, by developing useful algorithms. These algorithms generate assumptions about
the extracted data or information, and these assumptions are regarded as newly
extracted knowledge. The methods of KDD, i.e., of extracting knowledge about a problem
domain from acquired data sets, are basic steps of information processing. In
real-world problems, handling implicit, imprecise, and insufficient knowledge in
databases or constructed data sets is a major concern in developing intelligent
systems. Typically, knowledge is presented in the form of rules. However, knowledge
discovery systems often generate a huge number of rules, so another fundamental issue
is how to discover interesting and meaningful knowledge automatically from the
discovered rules. It is infeasible for human beings to select important and
interesting rules manually.


4.3.1 Introduction

Rough set theory has been developed for KDD and for extracting knowledge from
experimental data sets [18]. The theory provides a powerful foundation for describing
the behavior of concepts, properties and data, and in general of objects that may
present some intrinsic, vague, ambiguous or unsharp features.

Rough set theory was introduced by Zdzislaw Pawlak [1]. The rough set approach
provides efficient algorithms for finding hidden patterns in data, finding minimal
sets of data (data reduction), evaluating the significance of data, and generating
sets of decision rules from data; it also reduces the computational complexity of
learning processes. The approach is easy to understand, offers a straightforward
interpretation of the results obtained, and most of its algorithms are particularly
suited to parallel processing.

Rough set theory has been applied in different domains such as medical database
analysis [19-21], medical image analysis [22-24], decision support systems [25, 26],
pattern recognition [27-29], machine learning [30, 31] and so on. Rough sets have been
used very effectively to find relationships within imprecise data and dependencies
among objects and attributes, to evaluate the classificatory importance of attributes,
to remove data redundancies and to generate dependency rules.

Situations arise in which objects cannot be classified or categorized in terms of the
available attributes; they can only be roughly or approximately defined. Equivalence
relations play the key role in rough set theory: an equivalence relation divides a
data set into equivalence classes, and a set of objects is approximated through these
classes by a pair of sets, called its lower and upper approximations. The lower
approximation of a set of objects, known as a concept, contains all objects that can
be classified as certainly belonging to the concept on the basis of the knowledge of
the attribute set. Similarly, the upper approximation of a concept contains all
objects that cannot be classified categorically as not belonging to the concept on the
basis of the same knowledge. A rough set is defined as the pair of lower and upper
approximations of the set.

Rough set theory also deals with information presented in a table. The table
consists of objects and their attributes. The entries in the table may be the
categorical values of the attributes or features and possibly also the associated
categories or classes. Data processing problems can normally be converted easily into
a data table representation and analysed in that form. Information processing using
rough sets also classifies the objects.

4.3.2 Information System

An information system is a convenient tool for representing data gathered from
measurements of some physical domain, for example medical patient data, sequences of
medical images, industrial processes, and so on. In this context an information system
is a single flat table, either physical or logical, in the form of a view across
several underlying tables. A set-based information system S is a quadruple defined
as [32, 33]

S = <U, A, V, f> (4.3)

where

U: a nonempty set of objects, sometimes called the universe

A: a finite and nonempty set of attributes

$V = \bigcup_{a \in A} V_a$, where $V_a$ is a nonempty finite set of values for each attribute $a \in A$

f: U × A → V: a total function such that f(u, a) ∈ Va, ∀ (u, a) ∈ U × A, also
called the information or knowledge function.

In an information system S, any pair (a, v) with a ∈ A, v ∈ Va is called a descriptor.

An information system is also called a knowledge representation system, an
attribute-valued system or, intuitively, an information table. It is represented as a
finite data table in which each row represents a case, an event, a patient, or simply
an object, and each column represents an attribute. The entries of the table are the
values f(x, a), where x denotes the row and a denotes the column. Each row describes
the information about some object in S. Any nonempty set of objects X is called a
concept in S. A concept may have a certain meaning; for example, in a medical data set
with tests and diagnoses, a concept may be defined as a set of objects representing
patients. An example concerning diabetes mellitus is given below.


[Example 4.1]

Consider a simple data set called DIABET, representing medical information on
patients suffering from diabetes mellitus. For simplicity, only a few attributes are
chosen.

Object    Attributes
U         INS(c1)   PPB(c2)   ALB(c3)   LJM(c4)
p1        0         N         2         0
p2        0         N         2         0
p3        1         N         0         1
p4        1         N         0         0
p5        0         V         0         0
p6        0         V         0         0
p7        1         V         2         1
p8        1         V         2         1
p9        0         H         1         1
p10       0         H         1         0

Table 4.1: A medical data set DIABET

where INS stands for use of insulin (1 = yes, 0 = no), PPB stands for postprandial
blood glucose, ALB stands for albuminuria (0 = no albuminuria, 1 =
microalbuminuria, 2 = proteinuria) and LJM is the disease limited joint mobility
(1 = yes, 0 = no).

In the information system S described in Table 4.1, the universe U consists of ten
objects, U = {p1, p2, …, p10}, each representing one patient. Each patient is described
by the set of four attributes A = {a1, a2, a3, a4} = {c1, c2, c3, c4}, with discrete
values (numerical and symbolic) representing the results of medical observations and
diagnoses. The set of discrete numerical values of the attribute c1 is Vc1 = {0, 1};
the second attribute c2 takes three discrete non-numerical values, Vc2 = {N, H, V};
the third attribute c3 takes three discrete numerical values, Vc3 = {0, 1, 2}; and the
attribute c4 takes two binary values, Vc4 = {0, 1}. The values of the information
function f(x, a) are given in Table 4.1. For example, for the patient (object) p1 and
the attribute c1, the value of the information function is f(p1, c1) = 0. The set of
objects {p2, p3, p5, p6} is an example of a concept in the information system considered.
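As an illustration, the quadruple for the DIABET table can be written down directly in Python. This is a minimal sketch: the names `ROWS`, `A`, `U`, `V` and `f` are chosen here to mirror the notation of the text and are not part of any implementation described in the thesis.

```python
# Table 4.1 (DIABET): one row per patient, columns in the order c1..c4.
A = ("c1", "c2", "c3", "c4")
ROWS = {
    "p1": (0, "N", 2, 0), "p2": (0, "N", 2, 0),
    "p3": (1, "N", 0, 1), "p4": (1, "N", 0, 0),
    "p5": (0, "V", 0, 0), "p6": (0, "V", 0, 0),
    "p7": (1, "V", 2, 1), "p8": (1, "V", 2, 1),
    "p9": (0, "H", 1, 1), "p10": (0, "H", 1, 0),
}
U = tuple(ROWS)

def f(x, a):
    """Information function f: U x A -> V, read off the table."""
    return ROWS[x][A.index(a)]

# Value set V_a of each attribute, collected from the table.
V = {a: {f(x, a) for x in U} for a in A}

print(f("p1", "c1"))    # 0
print(sorted(V["c2"]))  # ['H', 'N', 'V']
```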


4.3.3 Decision System

In a supervised learning process the outcome of the classification is known. This a
posteriori knowledge is expressed by one distinguished attribute called the decision
attribute. Information systems of this kind are known as decision systems. A decision
system can be derived from any information system by dividing the attribute set into
two disjoint sets, the condition attribute set C and the decision attribute set D,
i.e., A = C ∪ D. For example, for a data set gathered for a classification task, the
set C of condition attributes may represent the elements of a pattern x describing an
object, and the set D may represent a classification decision, for instance a
categorical class assigned to an object. The decision attribute may take several
values, though binary outcomes are rather frequent.

For a given information system S, a decision system DS may be defined as

DS = <U, C ∪ D, V, f> where C ∩ D = φ (4.4)

where

U: a nonempty set of objects, sometimes called the universe

C: a nonempty finite set of condition attributes (features of the input pattern)

D: a nonempty finite set of decision attributes (target classes)

$V = \bigcup_{a \in C \cup D} V_a$, where $V_a$ is a nonempty set of values for each attribute $a \in C \cup D$

f: a total decision function in DS

A decision system is expressed intuitively as a decision table. A decision table is
called deterministic (consistent) if each object's decision attribute values are
uniquely determined by that object's condition attribute values. If several decision
attribute values occur for the same values of the condition attributes, the table is
called nondeterministic (inconsistent). Some nondeterministic decision tables may be
decomposed into two subtables, one deterministic and one totally nondeterministic. A
totally nondeterministic decision table does not contain a deterministic subtable.


[Example 4.2]

The DIABET data set from Example 4.1 can be interpreted as a decision system, as
shown in Table 4.2.

Object    Attributes
          C (medical diagnoses)           D (disease class)
U         c1        c2        c3          d
p1        0         N         2           0
p2        0         N         2           0
p3        1         N         0           1
p4        1         N         0           0
p5        0         V         0           0
p6        0         V         0           0
p7        1         V         2           1
p8        1         V         2           1
p9        0         H         1           1
p10       0         H         1           0

Table 4.2: A decision system DIABET

In Table 4.2 the attribute c4 of Table 4.1 is renamed d and represents an expert's
(doctor's) decision about the disease, taken on the basis of observations and test
results. The decision attribute value d = 0 denotes the diagnosis that a patient does
not have the disease, and d = 1 that the patient suffers from the disease limited
joint mobility.
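Whether a decision table is deterministic can be checked mechanically by grouping objects on their condition values and looking for groups with more than one decision value. The following sketch applies this to Table 4.2; the function name `inconsistent_groups` is a hypothetical helper introduced for illustration.

```python
from collections import defaultdict

# Table 4.2: condition values (c1, c2, c3) and decision d for each patient.
TABLE = {
    "p1": ((0, "N", 2), 0), "p2": ((0, "N", 2), 0),
    "p3": ((1, "N", 0), 1), "p4": ((1, "N", 0), 0),
    "p5": ((0, "V", 0), 0), "p6": ((0, "V", 0), 0),
    "p7": ((1, "V", 2), 1), "p8": ((1, "V", 2), 1),
    "p9": ((0, "H", 1), 1), "p10": ((0, "H", 1), 0),
}

def inconsistent_groups(table):
    """Groups of objects with equal condition values but different decisions."""
    decisions = defaultdict(set)
    members = defaultdict(list)
    for obj, (cond, d) in table.items():
        decisions[cond].add(d)
        members[cond].append(obj)
    return [members[c] for c in members if len(decisions[c]) > 1]

conflicts = inconsistent_groups(TABLE)
print(conflicts)  # [['p3', 'p4'], ['p9', 'p10']]
```

Since two groups are in conflict, Table 4.2 is nondeterministic; the remaining six objects (p1, p2, p5, p6, p7, p8) form a deterministic subtable.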

4.3.4 Indiscernibility Relation

An information table (information system) expresses all the knowledge about a
specific problem. The table may be unnecessarily large, in part because it is
redundant in at least two ways: the same or indiscernible objects may be represented
several times, or some of the attributes may be superfluous [32, 33].

Let S = <U, A, V, f> be an information system and let B ⊆ A be a subset of the
attribute set A. Two objects x, y ∈ U are said to be indiscernible by the set of
attributes B in S, also called B-indiscernible, iff f(x, a) = f(y, a), ∀ a ∈ B. For any
given subset of attributes B ⊆ A, the relation denoted by INDS(B), or simply IND(B),
is an equivalence relation on the universe U and is called an indiscernibility
relation. The indiscernibility relation IND(B) is defined as

IND(B) = {(x, y) ∈ U × U : ∀ a ∈ B, f(x, a) = f(y, a)} (4.5)

If the pair of objects (x, y) belongs to the relation IND(B), i.e., (x, y) ∈ IND(B),
then the objects x and y are called indiscernible with respect to the set B in S, or
B-indiscernible. In other words, object x cannot be distinguished from object y in
terms of the attributes from the set B alone.

The indiscernibility relation IND(B), being an equivalence relation, splits the given
universe U into a family of equivalence classes {E1, E2, …, Er}. This family, defined
by the relation IND(B) on U, forms a partition of U and is denoted by U/IND(B) or
simply U/B. The equivalence class containing an object x ∈ U, denoted by [x]B, is
defined by

$[x]_B = \{y \in U : (x, y) \in IND(B)\}$ (4.6)

Objects belonging to the same equivalence class Ei are indiscernible; otherwise the
objects are discernible with respect to the attribute subset B. The equivalence
classes Ei, i = 1, 2, …, r, of the relation IND(B) are called B-elementary sets in the
information system S. DesB(X) denotes the description of a B-elementary set
X ∈ U/IND(B) (an equivalence class) and is defined by

DesB(X) = {(a = b) : f(x, a) = b, ∀ x ∈ X, a ∈ B} (4.7)

4.3.5 Approximation Space

Let S be an information system and let B ⊆ A be a subset of the attribute set,
generating an indiscernibility relation IND(B) (an equivalence relation). The ordered
pair AS = (U, IND(B)) is called an approximation space [1, 32].

Any finite union of elementary sets in AS is called a definable set or a composed set
in AS.

Let x ∈ U; the equivalence class of IND(B) containing x is denoted by [x]B. The family
of definable sets, i.e., finite unions of arbitrary equivalence classes of the
partition U/IND(B) in AS, is denoted by DEF(AS) and is a Boolean algebra [1]. An
arbitrary set X ⊆ U may not be representable as a union of equivalence classes of
U/IND(B); in other words, the subset X cannot be described precisely in AS. Such a
subset X may instead be characterized by a pair of approximations, called its lower
and upper approximations. It is here that the notion of a rough set emerges.

[Example 4.3]

Consider the decision system DIABET from Table 4.2, taking into account only the
condition attributes, i.e., the observations and test results, represented by the
attribute set B = {c1, c2, c3} and contained in the reduced Table 4.3.

Object    Attributes
U         c1        c2        c3
p1        0         N         2
p2        0         N         2
p3        1         N         0
p4        1         N         0
p5        0         V         0
p6        0         V         0
p7        1         V         2
p8        1         V         2
p9        0         H         1
p10       0         H         1

Table 4.3: The DIABET data set with the reduced attribute set B = {c1, c2, c3}

Equivalence classes:

E1 = [p1]B = [p2]B = {p1, p2}

E2 = [p3]B = [p4]B = {p3, p4}

E3 = [p5]B = [p6]B = {p5, p6 }

E4 = [p7]B = [p8]B = {p7, p8}

E5 = [p9]B = [p10]B = {p9, p10}


Thus, from Table 4.3, the objects are divided into five disjoint groups according to
equal values of the attributes c1, c2 and c3 of the subset B; these groups are the
equivalence classes E1, E2, E3, E4, E5 defined above. Objects in the same group have
the same values for all attributes in B. For example, the first group contains the two
objects p1 and p2, since no other objects have the values c1 = 0, c2 = N and c3 = 2
for the attributes from B. The object p1 belongs to the equivalence class
E1 = [p1]B = [p2]B = {p1, p2}. Objects p3 and p4, with equal values c1 = 1, c2 = N,
c3 = 0 for all attributes, form the second group. Objects in this group cannot be
distinguished on the basis of the attributes c1, c2 and c3 from the set B alone. They
belong to the equivalence class E2 = [p3]B = [p4]B = {p3, p4}. The other equivalence
classes in S for the set B are found similarly: E3 = [p5]B = [p6]B = {p5, p6},
E4 = [p7]B = [p8]B = {p7, p8}, E5 = [p9]B = [p10]B = {p9, p10}.

As discussed, a subset of attributes B ⊆ A induces an indiscernibility relation IND(B)
on the whole set of objects of the universe U. The relation IND(B) thus consists of
all pairs of objects (pi, pj) in S for which the values of all attributes from B
are equal.
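The partition induced by the indiscernibility relation can be computed by grouping objects on their values for the attributes in B. The sketch below does this for the reduced DIABET table; the function name `partition` and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict

# Table 4.3: condition attribute values of each patient.
TABLE = {
    "p1": {"c1": 0, "c2": "N", "c3": 2}, "p2": {"c1": 0, "c2": "N", "c3": 2},
    "p3": {"c1": 1, "c2": "N", "c3": 0}, "p4": {"c1": 1, "c2": "N", "c3": 0},
    "p5": {"c1": 0, "c2": "V", "c3": 0}, "p6": {"c1": 0, "c2": "V", "c3": 0},
    "p7": {"c1": 1, "c2": "V", "c3": 2}, "p8": {"c1": 1, "c2": "V", "c3": 2},
    "p9": {"c1": 0, "c2": "H", "c3": 1}, "p10": {"c1": 0, "c2": "H", "c3": 1},
}

def partition(table, attrs):
    """Equivalence classes of IND(B): objects with equal values on all attrs."""
    classes = defaultdict(list)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].append(obj)
    return list(classes.values())

B = ("c1", "c2", "c3")
print(partition(TABLE, B))
# [['p1', 'p2'], ['p3', 'p4'], ['p5', 'p6'], ['p7', 'p8'], ['p9', 'p10']]
```

Choosing a smaller attribute subset gives a coarser partition; for instance, `partition(TABLE, ("c1",))` yields only two classes, one for each value of c1.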

4.3.6 Approximation of Sets

Some subsets of objects in an information system cannot be distinguished in terms of
the available attributes; they can only be approximately defined. The idea of rough
sets is to approximate a set by a pair of sets, called the lower and the upper
approximation of this set.

Let S be an information system and let B ⊆ A be a subset of the attribute set,
determining the approximation space AS = (U, IND(B)) in S. A set X ⊆ U can be
approximated in AS by constructing the B-lower and B-upper approximations of X,
denoted by $\underline{B}(X)$ and $\overline{B}(X)$ respectively and defined as follows

$\underline{B}(X) = \{x \in U : [x]_B \subseteq X\} = \bigcup \{Y \in U/IND(B) : Y \subseteq X\}$ (4.8)

$\overline{B}(X) = \{x \in U : [x]_B \cap X \neq \emptyset\} = \bigcup \{Y \in U/IND(B) : Y \cap X \neq \emptyset\}$ (4.9)

where [x]B is the equivalence class of the relation IND(B) that contains x.


The lower approximation of a set X with respect to B is the union of all equivalence
classes that are subsets of X. Thus for any x ∈ $\underline{B}(X)$, it is certain that x
belongs to X with respect to B. In other words, the lower approximation of a set X
contains all objects that, on the basis of the knowledge in B, can be classified as
certainly belonging to the concept X.

The upper approximation of a set X with respect to B is the union of all equivalence
classes that have a nonempty intersection with X. Thus for any x ∈ $\overline{B}(X)$, x
possibly belongs to X with respect to B. In other words, the upper approximation of a
set X contains all objects that, on the basis of the knowledge in B, cannot be
classified as not belonging to the concept X.

The B-boundary region of a set X in AS is the doubtful region of IND(B), denoted by
BNB(X) and defined in (4.10). For any x ∈ U belonging to BNB(X), it is impossible to
determine whether x belongs to X on the basis of the descriptions of the elementary
sets of IND(B).

$BN_B(X) = \overline{B}(X) - \underline{B}(X)$ (4.10)

The B-lower approximation of a set X is the greatest definable set in B contained in X
(its members certainly belong to X), and the B-upper approximation of X is the
smallest definable set in B containing X (its members possibly belong to X). The
B-boundary is the doubtful region in B of the set X.

[Figure 4.1: A set approximation of an arbitrary set X in U. Labelled regions: universe U, arbitrary set X, lower approximation, upper approximation, negative region.]


Given an approximation space AS for B ⊆ A and a concept X ⊆ U, the universe can be
partitioned into the following three regions:

1. The B-positive region POSB(X) of X in S: $POS_B(X) = \underline{B}(X)$ (4.11)

2. The B-boundary region BNB(X) of X in S: $BN_B(X) = \overline{B}(X) - \underline{B}(X)$ (4.12)

3. The B-negative region NEGB(X) of X in S: $NEG_B(X) = U - \overline{B}(X)$ (4.13)

If $\underline{B}(X) = \overline{B}(X)$, the concept X ⊆ U is said to be B-exactly
approximated in AS; in this case the B-boundary region BNB(X) = φ. If, on the other
hand, $\underline{B}(X) \neq \overline{B}(X)$, the concept X ⊆ U is B-roughly approximated
in AS and the B-boundary region BNB(X) ≠ φ. The B-boundary of a B-exact set is the
empty set.

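Let φ be the empty set, let X, Y ⊆ U, and let $X^c$ be the complement of X in U. The
properties of the lower and the upper approximation are as follows:

1. $\underline{B}(X) \subseteq X \subseteq \overline{B}(X)$
2. $\underline{B}(\emptyset) = \overline{B}(\emptyset) = \emptyset$
3. $\underline{B}(U) = \overline{B}(U) = U$
4. $\underline{B}(X \cup Y) \supseteq \underline{B}(X) \cup \underline{B}(Y)$
5. $\overline{B}(X \cup Y) = \overline{B}(X) \cup \overline{B}(Y)$
6. $\underline{B}(X \cap Y) = \underline{B}(X) \cap \underline{B}(Y)$
7. $\overline{B}(X \cap Y) \subseteq \overline{B}(X) \cap \overline{B}(Y)$
8. $\underline{B}(X^c) = (\overline{B}(X))^c$
9. $\overline{B}(X^c) = (\underline{B}(X))^c$
10. $\underline{B}(\underline{B}(X)) = \overline{B}(\underline{B}(X)) = \underline{B}(X)$
11. $\overline{B}(\overline{B}(X)) = \underline{B}(\overline{B}(X)) = \overline{B}(X)$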

[Example 4.4]

Consider a concept X1 of objects from U, i.e., X1 ⊆ U, in the information system S of
the DIABET table, representing the patients with LJM (d = 1, i.e., the patient suffers
from the disease LJM): X1 = {p3, p7, p8, p9}.

According to the definition of the lower and the upper approximation of the set X1,
based on the subset of attributes B ⊆ A, B = {c1, c2, c3}, the lower approximation is
the largest composed set of B-elementary sets in S that is contained in X1.

The lower approximation contains all B-elementary sets such that every element of
the elementary set is also an element of X1. The lower approximation consists of the
patients that surely have the disease.

$\underline{B}(X_1) = E_4 = \{p_7, p_8\}$

The upper approximation of the set X1 is the smallest composed set of B-elementary
sets in S that contains X1. The upper approximation consists of the patients that
possibly have the disease.

$\overline{B}(X_1) = \{p_3, p_4\} \cup \{p_7, p_8\} \cup \{p_9, p_{10}\} = \{p_3, p_4, p_7, p_8, p_9, p_{10}\}$

The B-boundary region (B-doubtful region of IND(B)) of the set X1 in S based on B is

$BN_B(X_1) = \overline{B}(X_1) - \underline{B}(X_1) = \{p_3, p_4, p_9, p_{10}\}$

This boundary region consists of the B-elementary sets from S whose elements cannot,
on the basis of the subset of attributes B, be classified as either belonging or not
belonging to X1.
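The approximations in this example can be reproduced with a few set operations. The sketch below starts from the five equivalence classes of Example 4.3; the function names `lower` and `upper` are hypothetical helpers, not part of any implementation described in the thesis.

```python
# Equivalence classes E1..E5 of IND(B) for B = {c1, c2, c3} (Example 4.3).
CLASSES = [
    {"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"},
]

def lower(classes, X):
    """Union of the equivalence classes entirely contained in X."""
    return set().union(*(E for E in classes if E <= X))

def upper(classes, X):
    """Union of the equivalence classes that intersect X."""
    return set().union(*(E for E in classes if E & X))

X1 = {"p3", "p7", "p8", "p9"}  # patients with d = 1
lo, up = lower(CLASSES, X1), upper(CLASSES, X1)
boundary = up - lo

print(sorted(lo))        # ['p7', 'p8']
print(sorted(boundary))  # ['p10', 'p3', 'p4', 'p9']
```

The upper approximation computed this way is {p3, p4, p7, p8, p9, p10}, so the negative region, the complement of the upper approximation in U, is {p1, p2, p5, p6}.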

In rough set theory, a set X is either definable or nondefinable. A set X ⊆ U is
definable in B, denoted B-definable, iff $\underline{B}(X) = \overline{B}(X)$; otherwise X
is not definable, denoted B-nondefinable. In other words, a set X is definable if it
can be determined with certainty for every object x ∈ U whether x ∈ X or not. The
lower approximation of X is then equal to the upper approximation of X, and the
boundary of X is the empty set.

• A set X is roughly B-definable in S iff $\underline{B}(X) \neq \emptyset$ and $\overline{B}(X) \neq U$.
Both approximations are informative, so it is possible to determine for some objects
of U whether they belong to X or to its complement $X^c$.

• A set X is externally B-nondefinable in S iff $\underline{B}(X) \neq \emptyset$ and $\overline{B}(X) = U$.
It cannot be said of any x ∈ U that it is not an object of X. Thus it can be
determined for some objects of U that they belong to X, but for no object of U that
it belongs to $X^c$.

• A set X is internally B-nondefinable in S iff $\underline{B}(X) = \emptyset$ and $\overline{B}(X) \neq U$.
It cannot be said of any x ∈ U that it is certainly an object of X. Thus it can be
determined for some objects of U that they belong to $X^c$, but for no object of U
that it belongs to X.

• A set X is totally B-nondefinable in S iff $\underline{B}(X) = \emptyset$ and $\overline{B}(X) = U$.
The approximations carry no information: for no element x ∈ U can it be determined
whether it belongs to X or to $X^c$.

[Example 4.5]

From example 4.3, the equivalence classes of DIABET data set with B⊆A and B =
{c1, c2, c3} are as follows.

E1 = [p1]B = [p2]B = {p1, p2}

E2 = [p3]B = [p4]B = {p3, p4}

E3 = [p5]B = [p6]B = {p5, p6 }

E4 = [p7]B = [p8]B = {p7, p8}

E5 = [p9]B = [p10]B = {p9, p10}

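Consider a set (concept) X1 ⊆ U with X1 = {p3, p4, p5, p6}. The set X1 is an example of
a B-definable set with respect to B, where

$\underline{B}(X_1) = \overline{B}(X_1) = E_2 \cup E_3 = \{p_3, p_4, p_5, p_6\}$

Consider another set (concept) X2 ⊆ U with X2 = {p2, p4, p5, p6}. The set X2 is an
example of a roughly B-definable set with respect to B, where

$\underline{B}(X_2) = E_3 = \{p_5, p_6\} \neq \emptyset$
$\overline{B}(X_2) = E_1 \cup E_2 \cup E_3 = \{p_1, p_2, p_3, p_4, p_5, p_6\} \neq U$

Consider another set (concept) X3 ⊆ U with X3 = {p2, p4, p5, p6, p7, p9}. The set X3 is
an example of an externally B-nondefinable set with respect to B, where

$\underline{B}(X_3) = E_3 = \{p_5, p_6\} \neq \emptyset$
$\overline{B}(X_3) = E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5 = \{p_1, p_2, p_3, p_4, p_5, p_6, p_7, p_8, p_9, p_{10}\} = U$

Consider another set (concept) X4 ⊆ U with X4 = {p2, p4, p5}. The set X4 is an example
of an internally B-nondefinable set with respect to B, where

$\underline{B}(X_4) = \emptyset$
$\overline{B}(X_4) = E_1 \cup E_2 \cup E_3 = \{p_1, p_2, p_3, p_4, p_5, p_6\} \neq U$

Consider another set (concept) X5 ⊆ U with X5 = {p2, p4, p5, p7, p9}. The set X5 is an
example of a totally B-nondefinable set with respect to B, where

$\underline{B}(X_5) = \emptyset$
$\overline{B}(X_5) = E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5 = \{p_1, p_2, p_3, p_4, p_5, p_6, p_7, p_8, p_9, p_{10}\} = U$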
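The five cases of this example can be checked with a short sketch that classifies a set by comparing its lower approximation with the empty set and its upper approximation with U. The function `definability` and its return labels are hypothetical names chosen to match the terminology of the text.

```python
# Equivalence classes E1..E5 of IND(B) for B = {c1, c2, c3}.
CLASSES = [
    {"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"},
]
U = set().union(*CLASSES)

def definability(classes, X):
    """Classify X from its lower and upper approximations."""
    lo = set().union(*(E for E in classes if E <= X))
    up = set().union(*(E for E in classes if E & X))
    if lo == up:
        return "B-definable"
    if lo and up != U:
        return "roughly B-definable"
    if lo:
        return "externally B-nondefinable"
    if up != U:
        return "internally B-nondefinable"
    return "totally B-nondefinable"

EXAMPLES = {
    "X1": {"p3", "p4", "p5", "p6"},
    "X2": {"p2", "p4", "p5", "p6"},
    "X3": {"p2", "p4", "p5", "p6", "p7", "p9"},
    "X4": {"p2", "p4", "p5"},
    "X5": {"p2", "p4", "p5", "p7", "p9"},
}
for name, X in EXAMPLES.items():
    print(name, definability(CLASSES, X))
```

Run on the five sets of the example, the sketch reports them as B-definable, roughly B-definable, externally, internally and totally B-nondefinable, in that order.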

4.3.6.1 Accuracy of Approximation

Rough sets provide a quantitative, numerical evaluation of the quality of
approximation (an accuracy measure) of a set X ⊆ U with respect to a subset of
attributes B ⊆ A in the approximation space AS = (U, IND(B)), using all equivalence
classes of IND(B). Let S = <U, A, V, f> be an information system and let B ⊆ A and
X ⊆ U, with B determining the approximation space AS = (U, IND(B)). The accuracy of
approximation of the set X by the set of attributes B (in short, the accuracy of X) is
defined by [1, 32]

$\alpha_B(X) = \frac{|\underline{B}(X)|}{|\overline{B}(X)|}$ (4.14)

where |·| denotes the cardinality of a set and X ≠ φ. It can easily be seen that if a
set X is B-exactly approximated in the approximation space AS with respect to B, then
$\alpha_B(X) = 1$. If a set X is B-roughly approximated in AS, then $0 \le \alpha_B(X) < 1$.
A complementary measure of approximation is defined by

$\rho_B(X) = 1 - \alpha_B(X)$ (4.15)


This is called the roughness (B-roughness) of a set X with respect to B. Roughness, as
opposed to accuracy, represents the degree of inexactness of the approximation of a
set X ⊆ U in the approximation space AS = (U, IND(B)) defined by B ⊆ A.

The accuracy of approximation has the following properties:

1. For any B ⊆ A and X ⊆ U, $0 \le \alpha_B(X) \le 1$.

2. The B-boundary region of X is empty, BNB(X) = φ (and the set X is B-definable),
iff $\alpha_B(X) = 1$.

3. The B-boundary region of X is nonempty, BNB(X) ≠ φ (the set X is B-nondefinable),
iff $\alpha_B(X) < 1$.

A vague concept description may involve boundary-line objects from the universe U,
which cannot be classified with absolute certainty as satisfying the description of
the concept. Uncertainty is related to the idea of membership of an element in a set.
From the rough set perspective, a set membership function can be defined that is
related to the rough set concept; it may be considered as another numerical measure of
imprecision (uncertainty). The rough (B-rough) membership function of an object x in
a set X ⊆ U with respect to the subset of attributes B ⊆ A is defined by

$\mu_X^B(x) = \frac{|[x]_B \cap X|}{|[x]_B|}$, where $0 \le \mu_X^B(x) \le 1$ (4.16)

To measure the membership value of degree of uncertainty of an object x in universe


to the set X ⊆ U with respect to the possessed knowledge (in an information system)
is defined

(4.17)

In rough set theory it is possible to establish a firm connection between vagueness
and uncertainty. Vagueness is related to sets of objects (concepts), whereas
uncertainty is related to elements of sets. Rough sets show that vagueness can be
defined in terms of uncertainty.

The lower and upper approximations of a set, and also its boundary region, may be
defined using the rough membership function as follows:

$\underline{B}(X) = \{x \in U : \mu_X^B(x) = 1\}$ (4.18)

$\overline{B}(X) = \{x \in U : \mu_X^B(x) > 0\}$

$BN_B(X) = \{x \in U : 0 < \mu_X^B(x) < 1\}$
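A minimal sketch of the rough membership function follows, assuming the usual definition as the fraction of an object's equivalence class that lies inside X. It recovers the approximations of X1 = {p3, p7, p8, p9} from the membership values alone; the names `eq_class` and `mu` are illustrative.

```python
from fractions import Fraction

# Equivalence classes E1..E5 of IND(B) for B = {c1, c2, c3}.
CLASSES = [
    {"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"},
]
U = set().union(*CLASSES)

def eq_class(x):
    """The equivalence class [x]_B containing x."""
    return next(E for E in CLASSES if x in E)

def mu(x, X):
    """Rough membership: share of [x]_B that lies inside X (assumed definition)."""
    E = eq_class(x)
    return Fraction(len(E & X), len(E))

X1 = {"p3", "p7", "p8", "p9"}
lower = {x for x in U if mu(x, X1) == 1}
upper = {x for x in U if mu(x, X1) > 0}
boundary = {x for x in U if 0 < mu(x, X1) < 1}

print(mu("p3", X1), mu("p7", X1), mu("p1", X1))  # 1/2 1 0
print(sorted(lower))                             # ['p7', 'p8']
```

Objects with membership 1/2, here p3, p4, p9 and p10, are exactly the boundary objects, which matches the boundary region found in Example 4.4.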

[Example 4.6]

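From Example 4.4, for the subset of attributes B ⊆ A with B = {c1, c2, c3} and the set of objects X1 ⊆ U, X1 = {p3, p7, p8, p9}, the accuracy of approximation of the set X1 with respect to B is:

$\alpha_B(X_1) = \dfrac{|\underline{B}X_1|}{|\overline{B}X_1|} = \dfrac{|\{p_7, p_8\}|}{|\{p_3, p_4, p_7, p_8, p_9, p_{10}\}|} = \dfrac{2}{6} = 0.333$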
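
The computation in Example 4.6 can be sketched in Python as follows. This is only an illustrative sketch, not part of the proposed system: the helper names (`partition`, `lower_upper`, `accuracy`, `rough_membership`) are hypothetical, and the table lists only the c1 and c2 values of Table 4.5, which according to Examples 4.8 and 4.11 induce the same partition of U as B = {c1, c2, c3}.

```python
from collections import defaultdict

# Condition values (c1, c2) of the objects p1..p10, copied from Table 4.5.
TABLE = {
    "p1": ("0", "N"), "p2": ("0", "N"), "p3": ("1", "N"), "p4": ("1", "N"),
    "p5": ("0", "V"), "p6": ("0", "V"), "p7": ("1", "V"), "p8": ("1", "V"),
    "p9": ("0", "H"), "p10": ("0", "H"),
}


def partition(table):
    """Group objects with identical attribute values: the classes of U/IND(B)."""
    classes = defaultdict(set)
    for obj, values in table.items():
        classes[values].add(obj)
    return list(classes.values())


def lower_upper(classes, x):
    """Return the lower and upper approximations of the set x."""
    lower, upper = set(), set()
    for c in classes:
        if c <= x:      # class entirely inside x
            lower |= c
        if c & x:       # class overlaps x
            upper |= c
    return lower, upper


def accuracy(classes, x):
    """Accuracy of approximation: |lower| / |upper|."""
    lower, upper = lower_upper(classes, x)
    return len(lower) / len(upper) if upper else 1.0


def rough_membership(classes, x, obj):
    """Rough membership of obj in x: |[obj] ∩ x| / |[obj]|."""
    c = next(c for c in classes if obj in c)
    return len(c & x) / len(c)


CLASSES = partition(TABLE)
X1 = {"p3", "p7", "p8", "p9"}
LOWER, UPPER = lower_upper(CLASSES, X1)
print(sorted(LOWER), sorted(UPPER), round(accuracy(CLASSES, X1), 3))
```

Objects with rough membership 1 form the lower approximation, and objects with membership strictly between 0 and 1 (for instance p3 here) form the boundary region.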

4.3.6.2 Approximation and Accuracy of Classification

The concept of set approximations can be extended to approximations of a classification, that is, of a family Γ = {X1, X2, …, Xn} of subsets of U.

Let S = < U, A, V, f > be an information system, let B ⊆ A, and let Γ = {X1, X2, …, Xn}, with Xi ⊆ U (1 ≤ i ≤ n), be a family of subsets of U. The family Γ is a classification (or partition) in U of the information system S if $X_i \cap X_j = \emptyset$ for every i, j ≤ n with i ≠ j, and $\bigcup_{i=1}^{n} X_i = U$. The sets Xi are called classes of Γ.

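For B ⊆ A, the B-lower and B-upper approximations of a classification Γ in S, denoted by $\underline{B}\Gamma$ and $\overline{B}\Gamma$ respectively, are defined as follows:

$\underline{B}\Gamma = \{\underline{B}X_1, \underline{B}X_2, \ldots, \underline{B}X_n\}$ (4.19)

$\overline{B}\Gamma = \{\overline{B}X_1, \overline{B}X_2, \ldots, \overline{B}X_n\}$ (4.20)

The set $\underline{B}\Gamma$ is called the B-positive region of a classification Γ, and $BN_B(\Gamma) = \overline{B}\Gamma - \underline{B}\Gamma$ is called the B-boundary region of a classification Γ. The B-positive region of a classification Γ with respect to B is defined by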

$POS_B(\Gamma) = \bigcup_{X_i \in \Gamma} \underline{B}X_i$ (4.21)

The union of the boundary regions of the classes of Γ with respect to B ⊆ A is called the B-doubtful region of a classification Γ in S, defined by

$BN_B(\Gamma) = \bigcup_{X_i \in \Gamma} BN_B(X_i)$ (4.22)

There is no B-negative region of a classification Γ in S, because $\bigcup_{X_i \in \Gamma} \overline{B}X_i = U$.

A classification Γ is called B-definable iff every class $X_i \in \Gamma$ is B-definable; otherwise the classification is called B-nondefinable. The classification is called roughly B-definable iff $\exists X_i \in \Gamma$, $\underline{B}X_i \ne \emptyset$.

The accuracy of approximation of a classification Γ by the set of attributes B ⊆ A, or accuracy of a classification, is defined by

$\alpha_B(\Gamma) = \dfrac{\sum_{i=1}^{n} |\underline{B}X_i|}{\sum_{i=1}^{n} |\overline{B}X_i|}$ (4.23)

The quality of approximation of a classification Γ with respect to B ⊆ A, or quality of a classification, is defined by

$\gamma_B(\Gamma) = \dfrac{\sum_{i=1}^{n} |\underline{B}X_i|}{|U|}$ (4.24)

This represents the ratio of all B-correctly classified objects to all objects in the system S.

The idea of accuracy of a classification allows us to define how closely a partition (classification U/IND(C)) generated by a set of attributes C ⊆ A can be approximated by another partition U/IND(B) generated by a set of attributes B ⊆ A. The accuracy of approximation of classification U/IND(C) by U/IND(B) may be defined as follows:

$\alpha_B(U/IND(C)) = \dfrac{|POS_B(U/IND(C))|}{|U|}$ (4.25)

where

$POS_B(U/IND(C)) = \bigcup_{X_i \in U/IND(C)} \underline{B}X_i$: the B-positive region of the classification Γ = U/IND(C). (4.26)

The inequality $0 \le \alpha_B(U/IND(C)) \le 1$ holds for every B, C ⊆ A.


[Example 4.7]

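From the information system DIABET defined in Example 4.2, consider a classification Γ = {X1, X2, X3, X4} where X1 = {p2, p4, p5}, X2 = {p1, p7, p8}, X3 = {p5, p7, p9}, X4 = {p8, p10}.

The lower and upper approximations of Γ are

$\underline{B}\Gamma = \{\emptyset, \{p_7, p_8\}, \emptyset, \emptyset\}$

$\overline{B}\Gamma = \{\{p_1, p_2, p_3, p_4, p_5, p_6\}, \{p_1, p_2, p_7, p_8\}, \{p_5, p_6, p_7, p_8, p_9, p_{10}\}, \{p_7, p_8, p_9, p_{10}\}\}$

so the accuracy and the quality of the classification Γ are

$\alpha_B(\Gamma) = \dfrac{0 + 2 + 0 + 0}{6 + 4 + 6 + 4} = \dfrac{2}{20} = 0.1$

$\gamma_B(\Gamma) = \dfrac{2}{11} \approx 0.18$

Therefore, it can be said that the accuracy of this classification is very poor and that the classification process has to be improved towards higher accuracy.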
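
The accuracy computed in Example 4.7 can be checked with a short Python sketch. It is an illustration under stated assumptions: the function names are hypothetical, and the equivalence classes of IND(B) are taken as listed in Example 4.8 for B = {c1, c2, c3}.

```python
# Equivalence classes of IND(B), as listed in Example 4.8.
CLASSES = [{"p1", "p2"}, {"p3", "p4"}, {"p5", "p6"}, {"p7", "p8"}, {"p9", "p10"}]

# The family of sets Γ from Example 4.7.
GAMMA = [{"p2", "p4", "p5"}, {"p1", "p7", "p8"}, {"p5", "p7", "p9"}, {"p8", "p10"}]


def lower(classes, x):
    """B-lower approximation: union of the classes contained in x."""
    return set().union(*(c for c in classes if c <= x))


def upper(classes, x):
    """B-upper approximation: union of the classes that intersect x."""
    return set().union(*(c for c in classes if c & x))


def classification_accuracy(classes, family):
    """Sum of lower-approximation sizes over sum of upper-approximation sizes."""
    lower_total = sum(len(lower(classes, x)) for x in family)
    upper_total = sum(len(upper(classes, x)) for x in family)
    return lower_total / upper_total


LOWER_SIZES = [len(lower(CLASSES, x)) for x in GAMMA]
UPPER_SIZES = [len(upper(CLASSES, x)) for x in GAMMA]
print(LOWER_SIZES, UPPER_SIZES, classification_accuracy(CLASSES, GAMMA))
```

Only X2 has a nonempty lower approximation, which is why the accuracy is so low.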

4.3.6.3 Classification and Reduction of Attributes

In many applications, classification of objects is one of the most frequently performed tasks. Classification may be considered as the process of determining a unique class for a specified object. A specified set of objects, described by a set of condition and decision attributes, may be classified into a disjoint family of classes with respect to the values of the decision attributes. Each class may be described in terms of the values of the condition attributes of the objects belonging to it. If a given set of objects with a given set of attributes is classifiable, it may also be classifiable using only some subset of the attributes. Normally only a few important attributes are sufficient to classify the objects of a decision system. This corresponds to human perception and classification ability, which is based on intelligent attention and the selection of the most important features of objects.

Some attributes in an information system may be redundant and can be eliminated without losing essential classificatory information. The process of finding a smaller set of attributes than the original one, with the same or nearly the same classificatory power as the original set, is called attribute reduction. In other words, a reduct is a minimal set of attributes from A that preserves the partitioning of the universe, and hence the ability to perform classifications, as the whole attribute set A does. As a result, the original larger information system may be reduced to a smaller system containing fewer attributes.

Rough sets allow the most important attributes, from a classificatory point of view, to be determined for a given information system. A reduct is the essential part of an information system: a subset of attributes that can discern all objects discernible by the original information system. A core is the common part of all reducts. Core and reduct are fundamental rough set concepts that may be used for knowledge reduction. Some attributes may depend on each other: a change in a given attribute may cause changes in other attributes in some non-linear way. Rough sets determine the degree of dependency of attributes and their significance. The dependency of attributes, expressed through the indiscernibility relation, is one of the important features of information systems.

Consider an information system S = < U, A, V, f > with condition and decision attributes A = C ∪ D. For a given set of condition attributes B ⊆ C, the B-positive region POSB(D) in the relation IND(D) can be defined by

$POS_B(D) = \bigcup \{\underline{B}X \mid X \in U/IND(D)\}$ (4.27)

The positive region POSB(D) contains all objects in U which can be classified
perfectly without error into distinct classes defined by IND(D), based only on
information in relation IND(B).

The definition of the positive region can be stated for any two subsets of attributes B1, B2 ⊆ A in the information system S. The subset of attributes B2 ⊆ A defines the indiscernibility relation IND(B2) and thus the classification U/IND(B2). The B1-positive region of B2, defined below, contains all objects that, by using the attributes B1, can be classified with certainty into one of the distinct classes of the classification U/IND(B2).

$POS_{B_1}(B_2) = \bigcup_{X \in U/IND(B_2)} \underline{B_1}X$ (4.28)


Rough sets define a degree of dependency between sets of attributes. The cardinality of the B1-positive region of B2 is used to define a measure called the degree of dependency of the set of attributes B2 on B1, given in (4.29). The set of attributes B2 is said to depend on the set of attributes B1 to the degree $\gamma_{B_1}(B_2)$.

$\gamma_{B_1}(B_2) = \dfrac{|POS_{B_1}(B_2)|}{|U|}$ (4.29)

Consider an information system S = < U, A, V, f > and two sets of attributes B1, B2 ⊆ A. The set of attributes B2 depends on the set B1 in S, denoted by B1 → B2, iff IND(B1) ⊆ IND(B2). The sets B1 and B2 are independent in S iff neither B1 → B2 nor B2 → B1 holds. The set B2 is dependent to a degree k on the set B1 in S, denoted

$B_1 \xrightarrow{k} B_2$, $0 \le k \le 1$, (4.30)

where

$k = \gamma_{B_1}(B_2)$: degree of dependency of the set of attributes B2 on B1.

If k = 1, the set B2 is totally dependent on B1 (B1 → B2); if k = 0, the set B2 is totally independent of B1; otherwise the set B2 is roughly dependent on B1.

The level of significance of attributes from a set A with respect to the classification U/IND(B2) generated by IND(B2) may differ from attribute to attribute. The measure of significance (coefficient of significance) of the attribute a ∈ B1 from the set B1 with respect to the classification U/IND(B2) generated by a set B2 is defined by

, (4.31)

The significance of the attribute a in the set B1 ⊆ A computed with respect to the
original classification U/IND(A) generated by the entire set of attributes A from the
information system S is denoted as , .

An attribute set B1 in an information system S has the following properties.

1. A set B1 ⊂ A is dependent in S iff ∃ B2 ⊂ B1 such that IND (B2) = IND (B1).

2. A set B1 ⊂ A is independent in S iff IND (B2) ⊃ IND (B1) for every B2 ⊂ B1.

3. A set B1 ⊂ A is superfluous in A iff IND (A − B1) = IND (A).

4. A set B1 ⊂ A is a reduct of A in S iff A − B1 is superfluous in A and B1 is independent in S.

A given information system may have many different reducts. If, for a given information system S = < U, A, V, f >, a subset B ⊂ A is a reduct, then the corresponding information system S1 = < U, B, V, f 1 >, with the attribute set equal to the reduct B, is called a reduced system (where f 1 is the restriction of the function f to the set U×B). In other words, a reduced system S1 is constructed from the original system S by removing the columns of attributes not included in the reduct B.

[Example 4.8]

Consider the information system DIABET defined in Table 4.2, and suppose there are two subsets of attributes, B1 = {c1, c2, c3} and B2 = {c3}.

The partition U / IND(B1) related to the equivalence relation IND (B1), is

{X1, X2, X3, X4, X5} = {{p1, p2}, {p3, p4}, {p5, p6}, {p7, p8}, {p9, p10}}.

Consider another partition U / IND(B2) corresponding to the equivalence relation


IND(B2), is

{Y1, Y2, Y3} = {{p1, p2, p7, p8}, {p3, p4, p5, p6}, {p9, p10}}.

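The B1-positive region of B2 is

$POS_{B_1}(B_2) = \bigcup_{Y_i \in U/IND(B_2)} \underline{B_1}Y_i = \{p_1, p_2, p_7, p_8\} \cup \{p_3, p_4, p_5, p_6\} \cup \{p_9, p_{10}\} = U$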


The degree of dependency of the set of attributes B2 on B1 is

$\gamma_{B_1}(B_2) = \dfrac{|POS_{B_1}(B_2)|}{|U|} = \dfrac{10}{10} = 1.0$

Therefore it can be said that the set of attributes B2 is totally dependent on the set B1.

For an information system, some attributes may be redundant with respect to a specific classification U/IND(B) generated by a set of attributes B ⊆ A. An information system may thus be overburdened by redundant information, and classifiers built on overburdened information systems may generalize poorly to new, unseen objects. By virtue of the dependency properties of attributes, one can find a reduced set of attributes by removing superfluous attributes, without loss of classification power in the reduced information system. This can lead to a substantial reduction of the information system, yielding a set of attributes that is sufficient for robust classification with a higher degree of generalization.

For an information system S and a subset of attributes B ⊂ A, an attribute a ∈ B is called dispensable in the set B if IND (B) = IND (B-{a}), which means that the indiscernibility relations generated by the sets B and B-{a} are identical. Otherwise the attribute a is indispensable in B. A dispensable attribute does not improve the classification of the information system S: its absence neither affects the classificatory power of the information system nor changes the dependency relationships of the original system. The indispensable attributes, on the other hand, carry the essential information about the objects of an information system and cannot be removed without changing the classificatory power of the original system.

The set of all indispensable attributes in a set B ⊂ A is called a core of a set B in the
information system S and it is denoted by CORE(B). The core contains all attributes
that cannot be removed from the set B without losing the original classification.

Let B1, B2 ⊂ A be two subsets of attributes in the information system S. An attribute a ∈ B1 is called B2-dispensable if $POS_{B_1}(B_2) = POS_{B_1 - \{a\}}(B_2)$. Otherwise the attribute a is B2-indispensable. If every attribute of B1 is B2-indispensable, then B1 is indispensable with respect to B2. The set of all B2-indispensable attributes from the set B1 is called the B2-relative core (or B2-core) of B1, denoted by $CORE_{B_2}(B_1)$ and defined as

$CORE_{B_2}(B_1) = \{a \in B_1 : POS_{B_1}(B_2) \ne POS_{B_1 - \{a\}}(B_2)\}$ (4.32)

A set B ⊂ A is called orthogonal if all its attributes are indispensable. A proper subset E ⊂ B is a reduct of B in S if E is orthogonal and preserves the classification generated by B. Thus a reduct of B, denoted by RED(B), is defined by

$E = RED(B) \iff (E \subset B,\ IND(E) = IND(B),\ E \text{ is orthogonal})$ (4.33)

The family of all reducts of a set B is denoted by REDF(B). The intersection of all reducts of a set B is the core of B:

$CORE(B) = \bigcap REDF(B)$ (4.34)

[Example 4.9]

Consider the decision system DIABET, and suppose there are two reducts B1 and B2 of the set of condition attributes C = {c1, c2, c3} with respect to the decision attribute D = {d}, as follows: B1 = {c1, c2}, B2 = {c2, c3}.

Then the core of the set of attributes is obtained as the intersection of B1 and B2:

$CORE(C) = B_1 \cap B_2 = \{c_1, c_2\} \cap \{c_2, c_3\} = \{c_2\}$

Thus the attribute c2 is the most significant attribute, and B1 and B2 are the two subsets of attributes that discriminate the decision attribute.

By choosing a reduct B1, for example, the reduced decision table can be obtained by
simply removing the superfluous attribute c3 as shown in the following table. The
reduced decision table has the same information as the original with respect to the
classificatory power.

Object                      Attributes
             C (medical diagnoses)      D (disease class)
U            c1           c2            d
p1           0            N             0
p2           0            N             0
p3           1            N             1
p4           1            N             0
p5           0            V             0
p6           0            V             0
p7           1            V             1
p8           1            V             1
p9           0            H             1
p10          0            H             0

Table 4.5 A reduced MEDICAL decision table

4.3.7 Discernibility Matrix

Frequently the discernibility of objects is of more interest than the specific values of attributes. In these situations an information system may be represented as a discernibility matrix. The discernibility matrix and the discernibility function were introduced by Skowron and Rauszer [34]. They help in developing efficient algorithms for generating minimal subsets of attributes that are sufficient to describe all concepts in a given information system. A discernibility matrix stores, for each pair of objects, the attributes on which the two objects differ. It contains less data than the information system but holds all the essential information needed to check whether a set of attributes is a minimal one that describes the concepts.

Let S = < U, A, V, f > be an information system with n objects, so that U = {x1, x2, …, xn }.

The discernibility matrix M(A) of an information system S with the set of attributes A is an n × n square matrix with rows and columns labeled by the objects xi (i = 1, 2, …, n). Each entry mij of the discernibility matrix (for a row i and a column j representing two objects xi and xj from U) is the subset of attributes that discerns these objects. Therefore, the discernibility matrix can be defined by

$m_{ij} = \begin{cases} \emptyset, & \text{if } f(x_i, a) = f(x_j, a) \text{ for all } a \in A \\ \{a \in A : f(x_i, a) \ne f(x_j, a)\}, & \text{otherwise} \end{cases}$

where xi, xj ∈ U (4.35)

The entry mij contains all those attributes whose values are not identical for xi and xj; a nonempty entry means that xi and xj belong to different classes of the partition generated by IND (A). The discernibility matrix M(A) is symmetric and mii = ∅, so it is sufficient to compute only the entries in the lower triangle of M (A), i.e., the mij with 1 ≤ j < i ≤ n.

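A discernibility function fS for an information system S with attributes A = {a1 , a2 , … , am} is a Boolean function of m Boolean variables $a_1^*, a_2^*, \ldots, a_m^*$ corresponding to the attributes a1 , a2 , … , am respectively. It is defined by the following equation

$f_S(a_1^*, a_2^*, \ldots, a_m^*) = \bigwedge \{\bigvee m_{ij}^* \mid 1 \le j < i \le n,\ m_{ij} \ne \emptyset\}$ (4.36)

where $m_{ij}^* = \{a^* : a \in m_{ij}\}$ and

$\bigvee m_{ij}^*$: the disjunction of all variables $a^*$ such that $a \in m_{ij}$.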

[Example 4.10]

Consider the information system DIABET shown in Table 4.2. The discernibility matrix M(B) is obtained as follows (mii = φ, mij = mji for i, j = 1, …, 10; only the lower triangle is shown):

       p1      p2      p3      p4      p5      p6      p7      p8      p9      p10
p1     φ
p2     φ       φ
p3     c1c3    c1c3    φ
p4     c1c3    c1c3    φ       φ
p5     c2c3    c2c3    c1c2    c1c2    φ
p6     c2c3    c2c3    c1c2    c1c2    φ       φ
p7     c1c2    c1c2    c2c3    c2c3    c1c3    c1c3    φ
p8     c1c2    c1c2    c2c3    c2c3    c1c3    c1c3    φ       φ
p9     c2c3    c2c3    c1c2c3  c1c2c3  c2c3    c2c3    c1c2c3  c1c2c3  φ
p10    c2c3    c2c3    c1c2c3  c1c2c3  c2c3    c2c3    c1c2c3  c1c2c3  φ       φ

Table 4.6 Discernibility Matrix M(B)


The discernibility function is as follows.

fS (c1 , c2 , c3 ) = (c1 ∨ c2) ∧ (c2 ∨ c3) ∧ (c1 ∨ c3) ∧ (c1 ∨ c2 ∨ c3)
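
Each prime implicant of the discernibility function corresponds to a minimal set of attributes that intersects every nonempty entry of the discernibility matrix. The following Python sketch enumerates such sets by brute force for the four distinct entries of Table 4.6. It is an illustration only: the function name `minimal_hitting_sets` is hypothetical, and exhaustive search is practical only for a small number of attributes.

```python
from itertools import combinations

ATTRS = ("c1", "c2", "c3")

# Distinct nonempty entries of the discernibility matrix in Table 4.6.
ENTRIES = [{"c1", "c3"}, {"c2", "c3"}, {"c1", "c2"}, {"c1", "c2", "c3"}]


def minimal_hitting_sets(attrs, entries):
    """Return the minimal attribute sets that intersect every matrix entry."""
    found = []
    # Candidates are tried by increasing size, so supersets of an already
    # accepted set can be skipped as non-minimal.
    for size in range(1, len(attrs) + 1):
        for candidate in combinations(attrs, size):
            s = set(candidate)
            hits_all = all(s & e for e in entries)
            is_minimal = not any(r <= s for r in found)
            if hits_all and is_minimal:
                found.append(s)
    return found


REDUCTS = minimal_hitting_sets(ATTRS, ENTRIES)
print([sorted(r) for r in REDUCTS])
```

No single attribute intersects all four entries, so every minimal set returned for this matrix contains two attributes.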

4.3.8 Decision Rules

One of the important applications of rough sets is the generation of decision rules for a given information system, either for the classification of known objects or for the prediction of classes of new objects unseen during design. Using an original or a reduced decision table, one can find rules that classify objects by determining the decision attribute value from the values of the condition attributes.

Let DS = < U, C ∪ D, V, f > be a decision table (decision system) with C as the set of condition attributes and D as the set of decision attributes. A decision table DS can be classified as follows:

• DS is deterministic iff D depends totally on C, C → D; $\gamma_C(D) = 1$.
• DS is roughly deterministic iff D depends roughly on C, $0 < \gamma_C(D) < 1$.
• DS is totally nondeterministic iff D does not depend on C, $\gamma_C(D) = 0$.

If DS is deterministic, the set of condition attributes C discriminates the set of decision attributes D. If DS is roughly deterministic, D depends on C, but C cannot discriminate D. If DS is totally non-deterministic, C is not related to D.

For a deterministic decision table, unique decisions can be determined when some
conditions are satisfied (attributes taking certain values). Conversely, for a roughly
deterministic decision table, decisions are not uniquely determined by the conditions.

For a non-deterministic decision table, a subset of decisions is defined, which can be


taken for specific conditions. This kind of situation is interpreted as inconsistency or
uncertainty in the decision table, and thus decisions determined by the decision table
are not well-defined. The properties characterizing dependency of attributes can be
applied to test whether a given decision table is deterministic or non-deterministic.
The notion of a reduct can be used to reduce the original decision table while
preserving its classificatory power. This may lead to a design of robust classifiers with
better generalization ability.


Decision rules can be derived from a decision table DS. Let U/IND(C) = {X1, X2, …, Xr} and U/IND(D) = {Y1, Y2, …, Yl} be a C-definable and a D-definable classification of U. A class Yj from the classification U/IND(D) can be identified with the decision j (j = 1, 2, …, l). The set of decision rules rij for all D-definable sets Yj is defined below:

$\{r_{ij} : Des_C(X_i) \to Des_D(Y_j) : X_i \cap Y_j \ne \emptyset,\ X_i \in U/IND(C),\ Y_j \in U/IND(D)\}$ (4.37)

where

$Des_C(X_i)$, $Des_D(Y_j)$: unique descriptions of the classes Xi and Yj respectively.

The decision rules rij are logically described as follows: IF (a set of conditions) THEN (a set of decisions).

A rule rij is said to be deterministic iff Xi ⊆ Yj (Xi ∩ Yj = Xi, i = 1, 2,..., r) in a decision

table DS, which means C → D, otherwise a rule is non-deterministic. In other words,

if DesC (Xi) uniquely implies DesD (Yj), then the rule rij is deterministic; otherwise rij
is non-deterministic. The set of decision rules for all classes Yj generated by a set of
decision attributes D (D-definable classes in S) is called a decision algorithm resulting
from the information system S.

[Example 4.11]

Consider the decision table in Table 4.2 with the decision attribute D = {d}, Vd = {0, 1}. The resulting partition is U/IND(D) = {Y1, Y2} = {{p3, p7, p8, p9}, {p1, p2, p4, p5, p6, p10}}, with DesD (Y1) = (d = 1) and DesD (Y2) = (d = 0). If a reduct B = {c1, c2} of the condition attributes is considered, the partition of the universe U corresponding to the equivalence relation IND(B) can be determined as below.

U/IND(B) = {X1, X2, X3, X4, X5} = {{p1, p2}, {p3, p4}, {p5, p6}, {p7, p8}, {p9, p10}}.

The unique descriptions of the classes Xis on the set B are:

DesB(X1) = (c1 = 0, c2 = N)

DesB(X2) = (c1 = 1, c2 = N)


DesB(X3) = (c1 = 0, c2 = V)

DesB(X4) = (c1 = 1, c2 = V)

DesB(X5) = (c1 = 0, c2 = H).

First, the decision rules for the class Y1 (d = 1) can be obtained as follows.

Calculation             Decision Rule       Rule in language form
X1 ∩ Y1 = φ
X2 ∩ Y1 = {p3}          r21 → DesD(Y1)      IF INC is 1 and PPB is N THEN LJM=1
X3 ∩ Y1 = φ
X4 ∩ Y1 = {p7, p8}      r41 → DesD(Y1)      IF INC is 1 and PPB is V THEN LJM=1
X5 ∩ Y1 = {p9}          r51 → DesD(Y1)      IF INC is 0 and PPB is H THEN LJM=1

Next, the decision rules for the class Y2 (no disease; d = 0) can be obtained as below.

Calculation             Decision Rule       Rule in language form
X1 ∩ Y2 = {p1, p2}      r12 → DesD(Y2)      IF INC is 0 and PPB is N THEN LJM=0
X2 ∩ Y2 = {p4}          r22 → DesD(Y2)      IF INC is 1 and PPB is N THEN LJM=0
X3 ∩ Y2 = {p5, p6}      r32 → DesD(Y2)      IF INC is 0 and PPB is V THEN LJM=0
X4 ∩ Y2 = φ
X5 ∩ Y2 = {p10}         r52 → DesD(Y2)      IF INC is 0 and PPB is H THEN LJM=0
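
The rule derivation of Example 4.11 can be sketched in Python as below. The sketch is illustrative rather than the procedure used in this work: the function name `decision_rules` is hypothetical, the rows are copied from Table 4.5, and the labels INC, PPB and LJM are the names used for c1, c2 and d in the rule tables above.

```python
from collections import defaultdict

# Rows (c1, c2, d) of the reduced decision table, Table 4.5.
ROWS = {
    "p1": ("0", "N", "0"), "p2": ("0", "N", "0"), "p3": ("1", "N", "1"),
    "p4": ("1", "N", "0"), "p5": ("0", "V", "0"), "p6": ("0", "V", "0"),
    "p7": ("1", "V", "1"), "p8": ("1", "V", "1"), "p9": ("0", "H", "1"),
    "p10": ("0", "H", "0"),
}


def decision_rules(rows):
    """Return (rule text, supporting objects, deterministic?) for every
    nonempty intersection of a condition class Xi with a decision class Yj."""
    by_condition = defaultdict(lambda: defaultdict(list))
    for obj, (c1, c2, d) in rows.items():
        by_condition[(c1, c2)][d].append(obj)

    rules = []
    for (c1, c2), by_decision in by_condition.items():
        # The rule is deterministic iff Xi falls inside a single decision class.
        deterministic = len(by_decision) == 1
        for d, support in by_decision.items():
            text = f"IF INC is {c1} and PPB is {c2} THEN LJM={d}"
            rules.append((text, support, deterministic))
    return rules


RULES = decision_rules(ROWS)
for text, support, deterministic in RULES:
    kind = "deterministic" if deterministic else "non-deterministic"
    print(f"{text}  {support}  ({kind})")
```

The condition classes X2 and X5 each intersect both decision classes, so the four rules derived from them are non-deterministic, while the rules for X1, X3 and X4 are deterministic.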

4.4 Combination of Rough and Fuzzy Sets

Combinations of rough and fuzzy sets fall mainly into three categories: fuzzy-rough sets, rough-fuzzy sets and rough-fuzzy hybridization. A rough-fuzzy set is defined as an approximation of a fuzzy set in a crisp approximation space, while a fuzzy-rough set is defined as an approximation of a crisp set in a fuzzy approximation space. More generally, approximations can be interpreted in three different families: a family of rough sets, a family of rough-fuzzy sets, and a family of fuzzy-rough sets. The approximation of a fuzzy set in a fuzzy approximation space leads to a more general framework. Rough-fuzzy hybridization is an approach that combines rough and fuzzy sets outside the concepts of fuzzy-rough set and rough-fuzzy set. Rough-fuzzy hybridization is used in different areas such as feature selection and dependency rule generation.

4.4.1 Fuzzy Equivalence Classes

Fuzzy equivalence relations are the generalization of crisp equivalence relations in the fuzzy framework. Fuzzy equivalence relations have been widely studied as a way to measure the degree of indistinguishability or similarity between the objects of a given universe of discourse, and have been used in different contexts such as fuzzy control, approximate reasoning and cluster analysis [35]. Different researchers use other names for them in different contexts, such as similarity relations [36-38] or indistinguishability operators [39-41].

Fuzzy equivalence classes are central to the fuzzy-rough set approach [9,42,43], just as crisp equivalence classes are central to rough sets. An introduction to fuzzy set theory can be found in Appendix A. For typical Rough Set Attribute Reduction applications, this means that the decision values and the conditional values may all be fuzzy. The concept of crisp equivalence classes can be extended by the inclusion of a fuzzy similarity relation R on the universe U, which determines the degree to which two elements are similar in R. For example, if $\mu_R(x, y) = 0.9$, then objects x and y are considered to be quite similar. The usual properties of reflexivity, $\mu_R(x, x) = 1$, symmetry, $\mu_R(x, y) = \mu_R(y, x)$, and transitivity, $\mu_R(x, z) \ge \mu_R(x, y) \wedge \mu_R(y, z)$, hold.

Using the fuzzy similarity relation, the fuzzy equivalence class [x]R for objects close to x can be defined:

$\mu_{[x]_R}(y) = \mu_R(x, y)$ (4.38)

A fuzzy equivalence class F should satisfy the following axioms [44, 45]:

1. $\exists x,\ \mu_F(x) = 1$ (F is normalised)
2. $\mu_F(x) \wedge \mu_R(x, y) \le \mu_F(y)$
3. $\mu_F(x) \wedge \mu_F(y) \le \mu_R(x, y)$

The first axiom states that an equivalence class is nonempty. The second axiom states that objects in the neighborhood of y belong to the equivalence class of y. The third axiom states that any two elements in F are related through R. This definition degenerates to the normal definition of equivalence classes when R is not fuzzy.

A fuzzy partitioning of the universe of discourse forms a family of normal fuzzy sets that can play the role of fuzzy equivalence classes [9]. Consider the crisp partitioning of a universe of discourse U by the attributes in A: U/A = {{1, 4, 6}, {2, 3, 5}}. This contains two equivalence classes ({1, 4, 6} and {2, 3, 5}) that can be thought of as degenerate fuzzy sets, in which each element belongs to a class with a membership of either one or zero. For the first class, for instance, the objects 2, 3 and 5 have a membership of zero. Extending this concept to fuzzy equivalence classes, objects are allowed to take membership values in the interval [0, 1] for any given class. U/A is thus not restricted to crisp partitions only; fuzzy partitions are equally acceptable [46, 47].

4.4.2 Fuzzy-Rough Sets

Fuzzy-rough sets were introduced by Dubois and Prade [9], building on the work of Willaeys and Malvache [47] on defining a fuzzy set with respect to a family of fuzzy sets. They deal with the approximation of fuzzy sets in a fuzzy approximation space defined by a fuzzy similarity relation or by a fuzzy partition. The results for fuzzy-rough sets reviewed here are based on a fuzzy similarity relation. A fuzzy similarity relation S is a fuzzy subset of U × U. The pair (U, S) is called a fuzzy approximation space. A fuzzy similarity relation may be used to define a fuzzy partition of the universe, denoted by U/S.

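For a fuzzy set F, its approximation in the fuzzy approximation space (U, S), i.e., the fuzzy lower and upper approximations, are defined on the classes $F_i \in U/S$ as [48]

$\mu_{\underline{S}F}(F_i) = \inf_{x \in U} \max\{\mu_F(x),\ 1 - \mu_{F_i}(x)\}$ (4.39)

$\mu_{\overline{S}F}(F_i) = \sup_{x \in U} \min\{\mu_{F_i}(x),\ \mu_F(x)\}$

Since the universe of discourse is finite, the inf and sup are attained and coincide with min and max. These definitions differ slightly from the crisp upper and lower approximations, because the memberships of individual objects in the approximations are not explicitly available.

The pair can therefore be extended to a pair of fuzzy sets on the universe U, defined by

$\mu_{\underline{S}F}(x) = \inf\{\max[\mu_F(y),\ 1 - \mu_S(x, y)] \mid y \in U\}$ (4.40)

$\mu_{\overline{S}F}(x) = \sup\{\min[\mu_F(y),\ \mu_S(x, y)] \mid y \in U\}$

The tuple $(\underline{S}F, \overline{S}F)$ is called a fuzzy-rough set; in the form (4.39) it is a pair of fuzzy sets on U/S.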
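
As an illustration of fuzzy lower and upper approximations on a finite universe, the following Python sketch assumes the usual Dubois-Prade form, in which the lower approximation of F at x is the minimum over y of max(μF(y), 1 − μS(x, y)) and the upper approximation is the maximum over y of min(μF(y), μS(x, y)). The similarity matrix `S`, the fuzzy set `F` and the function names are hypothetical values chosen only for this example.

```python
# Hypothetical fuzzy similarity relation on U = {x1, x2, x3}:
# reflexive, symmetric and max-min transitive.
S = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.2],
    [0.2, 0.2, 1.0],
]

# Hypothetical fuzzy set F given by its membership values on x1, x2, x3.
F = [0.9, 0.4, 0.1]


def fuzzy_lower(s, f):
    """Lower approximation: min over y of max(F(y), 1 - S(x, y))."""
    n = len(f)
    return [min(max(f[y], 1.0 - s[x][y]) for y in range(n)) for x in range(n)]


def fuzzy_upper(s, f):
    """Upper approximation: max over y of min(F(y), S(x, y))."""
    n = len(f)
    return [max(min(f[y], s[x][y]) for y in range(n)) for x in range(n)]


LOWER = fuzzy_lower(S, F)
UPPER = fuzzy_upper(S, F)
print(LOWER, UPPER)
```

Because S is reflexive, the lower approximation never exceeds F and the upper approximation never falls below it, mirroring the crisp case.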

Different researchers have proposed many different definitions that combine fuzzy sets with different measures. Fuzzy-rough sets have been defined by using a family of equivalence relations induced by the different level sets of a fuzzy similarity relation [49]. Another definition of fuzzy-rough sets is based on a fuzzification of the lower and upper bounds of Iwinski rough sets [50, 51]. A similar definition was also used to model the approximation of a fuzzy set based on a weak fuzzy partition, using measures of fuzzy set inclusion [52, 53].

The review shows that the same notions of rough-fuzzy sets and fuzzy-rough sets are used with different meanings by different authors. The functional approaches define the various notions clearly in mathematical terms. However, the physical meanings of these notions are not clearly interpreted. This issue is addressed in the rest of this chapter.

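Let F1 and F2 be fuzzy sets on U. The properties of fuzzy-rough sets, which parallel the properties of rough sets, are given below:

i. $\underline{S}F_1 = \neg\overline{S}(\neg F_1)$, $\overline{S}F_1 = \neg\underline{S}(\neg F_1)$ (4.41)

ii. $\underline{S}U = U$, $\overline{S}\emptyset = \emptyset$

iii. $\underline{S}(F_1 \cap F_2) = \underline{S}F_1 \cap \underline{S}F_2$, $\overline{S}(F_1 \cup F_2) = \overline{S}F_1 \cup \overline{S}F_2$

$\underline{S}(F_1 \cup F_2) \supseteq \underline{S}F_1 \cup \underline{S}F_2$, $\overline{S}(F_1 \cap F_2) \subseteq \overline{S}F_1 \cap \overline{S}F_2$

iv. $\underline{S}F_1 \subseteq F_1$, $\overline{S}F_1 \supseteq F_1$

Fuzzy-rough sets are monotonic with respect to set inclusion, as given below:

$F_1 \subseteq F_2 \Rightarrow \underline{S}F_1 \subseteq \underline{S}F_2,\ \overline{S}F_1 \subseteq \overline{S}F_2$ (4.42)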

Fuzzy-rough sets are also monotonic with respect to the refinement of fuzzy similarity relations. A fuzzy similarity relation S1 is a refinement of another fuzzy similarity relation S2 if S1 ⊆ S2, which generalizes the refinement of crisp relations. The monotonicity of fuzzy-rough sets with respect to the refinement of the fuzzy similarity relation is given below:

$S_1 \subseteq S_2 \Rightarrow \underline{S_2}F \subseteq \underline{S_1}F,\ \overline{S_2}F \supseteq \overline{S_1}F$ (4.43)

4.4.3 Rough-Fuzzy Sets

Rough-fuzzy sets, as defined by Dubois and Prade, deal with the approximation of fuzzy sets in a crisp approximation space [54]. Let S = <U, A, V, f> be an information system, and let a subset of attributes B ⊆ A determine the approximation space (U, IND (B)) in S. A fuzzy set F can be approximated in (U, IND (B)) by constructing the B-lower and B-upper approximations of F, denoted by $\underline{B}F$ and $\overline{B}F$ respectively and defined as follows:

$\mu_{\underline{B}F}([x]_B) = \inf\{\mu_F(y) \mid y \in [x]_B\}$ (4.44)

$\mu_{\overline{B}F}([x]_B) = \sup\{\mu_F(y) \mid y \in [x]_B\}$

where

[x]B: the equivalence class of the equivalence relation IND(B) that contains x.

Since the universe of discourse is finite, the inf and sup are attained and coincide with min and max. Using the extension principle, the pair can be extended to a pair of fuzzy sets on the universe U, as defined below:

$\mu_{\underline{B}F}(x) = \inf\{\mu_F(y) \mid y \in [x]_B\}$ (4.45)

$\mu_{\overline{B}F}(x) = \sup\{\mu_F(y) \mid y \in [x]_B\}$

This pair can be represented in another way by using the characteristic function $\mu_{IND(B)}$ of the equivalence relation:

$\mu_{\underline{B}F}(x) = \inf\{\max[\mu_F(y),\ 1 - \mu_{IND(B)}(x, y)] \mid y \in U\}$ (4.46)

$\mu_{\overline{B}F}(x) = \sup\{\min[\mu_F(y),\ \mu_{IND(B)}(x, y)] \mid y \in U\}$

The pair $(\underline{B}F, \overline{B}F)$ is called a rough-fuzzy set on U with reference fuzzy set F.
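
For a crisp partition, the rough-fuzzy approximation of a fuzzy set reduces to taking the smallest and largest membership value inside each equivalence class. The sketch below illustrates this with the partition {{1, 4, 6}, {2, 3, 5}} used in Section 4.4.1; the membership values of `F` and the function name `rough_fuzzy` are hypothetical.

```python
# Crisp partition of U = {1, ..., 6}, as in Section 4.4.1.
CLASSES = [{1, 4, 6}, {2, 3, 5}]

# Hypothetical fuzzy set F on U.
F = {1: 0.9, 2: 0.1, 3: 0.4, 4: 0.6, 5: 0.0, 6: 0.7}


def rough_fuzzy(classes, f):
    """Return the lower and upper memberships of every object: the min and
    max of the memberships of F over the object's equivalence class."""
    lower, upper = {}, {}
    for c in classes:
        lo = min(f[y] for y in c)
        hi = max(f[y] for y in c)
        for x in c:
            lower[x] = lo
            upper[x] = hi
    return lower, upper


LOWER, UPPER = rough_fuzzy(CLASSES, F)
print(LOWER, UPPER)
```

Both approximations are constant on each equivalence class, which reflects the fact that objects in the same class cannot be told apart by the attributes in B.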

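Let F and G be two fuzzy sets. The properties of rough-fuzzy sets are given below:

i. $\underline{B}F = \neg\overline{B}(\neg F)$, $\overline{B}F = \neg\underline{B}(\neg F)$ (4.47)
ii. $\underline{B}U = U$, $\overline{B}\emptyset = \emptyset$
iii. $\underline{B}(F \cap G) = \underline{B}F \cap \underline{B}G$, $\overline{B}(F \cup G) = \overline{B}F \cup \overline{B}G$
$\underline{B}(F \cup G) \supseteq \underline{B}F \cup \underline{B}G$, $\overline{B}(F \cap G) \subseteq \overline{B}F \cap \overline{B}G$
iv. $\underline{B}F \subseteq F$, $\overline{B}F \supseteq F$

v. $F \subseteq \underline{B}(\overline{B}F)$, $F \supseteq \overline{B}(\underline{B}F)$

vi. $\underline{B}F \subseteq \underline{B}(\underline{B}F)$, $\overline{B}F \supseteq \overline{B}(\overline{B}F)$

Rough-fuzzy sets are monotonic with respect to fuzzy set inclusion, as given below:

$F \subseteq G \Rightarrow \underline{B}F \subseteq \underline{B}G,\ \overline{B}F \subseteq \overline{B}G$ (4.48)

Rough-fuzzy sets are also monotonic with respect to the refinement of equivalence relations. For two equivalence relations B, C and a fuzzy set F:

$B \subseteq C \Rightarrow \underline{C}F \subseteq \underline{B}F,\ \overline{C}F \supseteq \overline{B}F$ (4.49)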

By comparing Equations (4.40) and (4.46), it can be concluded that rough-fuzzy sets are special cases of fuzzy-rough sets as defined by Dubois and Prade [9]. The approximation of a fuzzy set in a crisp approximation space is called a rough-fuzzy set, to be consistent with the naming of a rough set as the approximation of a crisp set in a crisp approximation space. The approximation of a crisp set in a fuzzy approximation space is called a fuzzy-rough set. Such a naming scheme has been used by Klir and Yuan [55] and by Yao [56]. Under this scheme, the two models are complementary to each other, in the same way that rough sets and fuzzy sets are complementary to each other.

4.4.4 Fuzzy-Rough Hybrids

Besides fuzzy-rough sets and rough-fuzzy sets, other interpretations are possible [57]. An alternative definition of fuzzy-rough sets was given by utilizing the rough membership function [58, 59]. One attempt at rough-fuzzy hybridization was proposed in [60], where rough sets are expressed by a fuzzy membership function that represents the negative, boundary and positive regions. Objects belonging to the positive region have a membership value of one, and those belonging to the boundary region have a membership of 0.5. Objects with zero membership belong to the negative region, i.e., they do not belong to the rough set. Thus, by modifying the rough set operators such as union and intersection, a rough set may be expressed as a fuzzy set.

The motivation for introducing fuzziness into rough sets is to handle the levels of roughness in the boundary region by using fuzzy membership values. The membership value of an object belonging to the boundary region may then lie anywhere between 0 and 1 instead of being the crisp value 0.5. Therefore, for a set X in rough-fuzzy hybridization, a rough set R and a crisp equivalence relation E, the membership function may be defined as follows:

$\mu_X(\underline{E}R) = 1$, $\mu_X(U - \overline{E}R) = 0$, $0 < \mu_X(\overline{E}R - \underline{E}R) < 1$

Another approach is presented in [61], where objects belong to the rough set lower approximation with certainty, while the boundary region is fuzzified and the membership values of its objects are expressed in terms of a fuzzy membership function.

Another approach addresses the problem that the fuzzy set representation of a rough set may be too precise, in the sense that a concept is described exactly once its membership function has been defined [62]. The proposed solution employs an approximation of a family of fuzzy sets, termed a shadowed set, which uses basic truth values and a zone of uncertainty instead of exact membership values.

For a fuzzy set, a shadowed set may be induced by raising the membership values around 1 and lowering the membership values around 0 until a certain threshold value is attained. Any object that does not belong to the set with a membership of 1 or 0 is assigned the unit interval [0, 1], considered to be a nonnumeric model of membership grade. In fuzzy set theory, vagueness is distributed across the entire universe of discourse, but in shadowed sets this vagueness is localized in the shadow regions. As with fuzzy sets, the basic set operations (union, intersection and complement) can be defined for shadowed sets, as can shadowed relations.


4.4.5 Fuzzy-Rough Reduction Process

Fuzzy-rough hybridization builds on the notion of fuzzy lower approximation to


enable reduction of datasets containing real valued features. As will be shown, the
process becomes identical to the crisp approach when dealing with nominal well-
defined features.

The crisp positive region in traditional rough set theory is defined as the union of the
lower approximations. By the extension principle [63], the membership of an object x
∈ U, belonging to the fuzzy positive region can be defined by

\[ \mu_{POS_R(Q)}(x) = \sup_{X \in U/Q} \mu_{\underline{R}X}(x) \tag{4.50} \]

Object x will not belong to the positive region only if the equivalence class it belongs
to is not a constituent of the positive region. This is equivalent to the crisp version
where objects belong to the positive region only if their underlying equivalence class
does so. Similarly, the negative and boundary regions can be defined. Using the
definition of the fuzzy positive region, the new dependency function can be defined as
follows:

\[ \gamma'_R(Q) = \frac{|\mu_{POS_R(Q)}(x)|}{|U|} = \frac{\sum_{x \in U} \mu_{POS_R(Q)}(x)}{|U|} \tag{4.51} \]

As with crisp rough sets, the dependency of Q on R is the proportion of objects that
are discernible out of the entire dataset. In the present approach, this corresponds to
determining the fuzzy cardinality of the fuzzy positive region, μ_{POS_R(Q)}(x),
divided by the total number of objects in the universe.
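The computation of the fuzzy positive region and of the dependency degree can be
sketched as below. The sketch assumes that the membership of every object in the fuzzy
lower approximation of each decision class has already been computed; the function
names and the sample memberships are hypothetical.

```python
def fuzzy_positive_region(lower_memberships):
    """Membership of each object in the fuzzy positive region.

    lower_memberships maps each decision class to a dict
    {object: membership in the fuzzy lower approximation of that class}.
    The supremum over the decision classes is taken for every object.
    """
    objects = set().union(*(m.keys() for m in lower_memberships.values()))
    return {
        x: max(m.get(x, 0.0) for m in lower_memberships.values())
        for x in objects
    }


def fuzzy_dependency(lower_memberships, universe_size):
    """Fuzzy cardinality of the positive region divided by |U|."""
    positive = fuzzy_positive_region(lower_memberships)
    return sum(positive.values()) / universe_size


# Hypothetical lower-approximation memberships for two decision classes.
lower = {
    "yes": {"x1": 0.8, "x2": 0.2, "x3": 0.0},
    "no": {"x1": 0.1, "x2": 0.6, "x3": 0.5},
}
positive = fuzzy_positive_region(lower)
gamma = fuzzy_dependency(lower, universe_size=3)
```

With crisp memberships (0 or 1 only) the same code returns the usual crisp dependency
degree.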

If the fuzzy-rough reduction process is to be useful, it must be able to deal with
multiple features, finding the dependency between various subsets of the original
feature set. For example, it may be necessary to be able to determine the degree of
dependency of the decision feature(s) with respect to R = {a, b}. In the crisp case,
U/R contains sets of objects grouped together that are indiscernible according to both
features a and b. In the fuzzy case, objects may belong to many equivalence classes,
so the Cartesian product of U/IND({a}) and U/IND({b}) must be considered in
determining U/R. In general,


\[ U/R = \bigoplus \{a \in R : U/IND(\{a\})\} \tag{4.52} \]

where

\[ A \oplus B = \{X \cap Y : \forall X \in A,\ \forall Y \in B,\ X \cap Y \neq \emptyset\}. \]

Each set in U/R denotes an equivalence class. For example, if R = {a, b},
$U/IND(\{a\}) = \{N_a, Z_a\}$ and $U/IND(\{b\}) = \{N_b, Z_b\}$, then

\[ U/R = \{N_a \cap N_b,\ N_a \cap Z_b,\ Z_a \cap N_b,\ Z_a \cap Z_b\} \tag{4.53} \]

The extent to which an object belongs to such an equivalence class is therefore
calculated by using the conjunction of the constituent fuzzy equivalence classes, say
$F_i$, i = 1, 2, …, n:

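\[ \mu_{F_1 \cap F_2 \cap \dots \cap F_n}(x) = \min\left(\mu_{F_1}(x), \mu_{F_2}(x), \dots, \mu_{F_n}(x)\right) \tag{4.54} \]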
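The combination of fuzzy equivalence classes over several features can be sketched as
follows. The data layout (one dictionary of classes per feature), the function name
and the membership values are assumptions made for illustration; the minimum is used
as the conjunction, as in equation (4.54).

```python
from itertools import product


def combine_partitions(partitions):
    """Combine the fuzzy partitions induced by several features.

    partitions is a list with one entry per feature; each entry maps a
    class name to a dict {object: membership}. Every combination of one
    class per feature is intersected using the minimum, and intersections
    that are empty (all memberships zero) are dropped.
    """
    combined = {}
    for combo in product(*(p.items() for p in partitions)):
        names = tuple(name for name, _ in combo)
        member_maps = [members for _, members in combo]
        objects = set().union(*(m.keys() for m in member_maps))
        degrees = {
            x: min(m.get(x, 0.0) for m in member_maps) for x in objects
        }
        if any(d > 0.0 for d in degrees.values()):
            combined[names] = degrees
    return combined


# Hypothetical fuzzy partitions for features a and b over two objects.
partition_a = {"Na": {"x1": 0.8, "x2": 0.3}, "Za": {"x1": 0.2, "x2": 0.7}}
partition_b = {"Nb": {"x1": 0.6, "x2": 0.0}, "Zb": {"x1": 0.4, "x2": 1.0}}
classes = combine_partitions([partition_a, partition_b])
```

The number of combined classes grows with the product of the partition sizes, so in
practice only small feature subsets are combined in this direct way.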

4.5 Conclusion

The interest in the hybridization of fuzzy sets with rough sets is evidenced by the
number of publications in this area. The hybridization of these approaches has resulted
in methods which take advantage of the ability of rough sets to model uncertainty and
that of fuzzy sets to model vagueness. In this sense both approaches are
complementary. Furthermore, when hybridized as described in this chapter, no tunable
parameters are required and only the data is used.

In this chapter a number of theoretical and real-world application areas of rough set
theory, rough set extensions, and combinations of fuzzy and rough set theory are
examined. Note that these examples are representative only and do not cover the
whole spectrum of possible application areas, given the sheer number of applications
and the amount of work that has been published in the area.

There is much scope for further research in relation to the development of
fuzzy-rough sets. In particular there is much interest in the area of type-2 fuzzy
sets [63] at the present moment. However, hybridization with rough sets has not yet
been proposed. Additionally, there are a number of aspects of fuzzy measures with
application to fuzzy-rough sets which remain unexplored, and these may offer some
new and interesting future research areas.


Bibliography

[1] Z. Pawlak, Rough sets, International Journal of Computer and Information
Sciences 11, pp. 341–356, 1982.

[2] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer,
Dordrecht, The Netherlands, 1991.

[3] L. A. Zadeh, Fuzzy Sets, Information and Control, pp. 338-353, 1965.

[4] S. Chanas, D. Kuchta, Further remarks on the relation between rough and
fuzzy sets, Fuzzy sets & Systems, vol. 47, pp.391-394, 1992.

[5] Z. Pawlak, Rough sets and fuzzy sets, Fuzzy Sets & Systems, vol. 17, pp. 99-
102, 1985.

[6] M. Wygralak, Rough sets and fuzzy sets – some remarks on interrelations,
Fuzzy Sets & Systems, vol. 29, pp. 241-243, 1989.

[7] R. Biswas, On rough sets and fuzzy rough sets, Bulletin of the Polish
Academy of Sciences, Mathematics, vol. 42, pp. 345-349, 1994.

[8] D. Dubois and H. Prade, Rough fuzzy sets and fuzzy rough sets, International
Journal of General systems, vol. 17, pp.191-209, 1990.

[9] D. Dubois and H. Prade, Putting rough sets and fuzzy sets together, In:
Intelligent decision Support: Handbook of Applications and Advances of the
rough Sets Theory, R. Slowinski, Kluwer Academic Publishers, Boston, pp.
203- 222, 1992.

[10] S. K. Pal, A. Skowron (Eds.), Rough-Fuzzy Hybridization: A New Trend in
Decision Making, Springer-Verlag New York, Inc., Secaucus, NJ, USA, ISBN:
9814021008, 1999.

[11] R. Jensen, C. Cornelis, Q. Shen, Hybrid Fuzzy-Rough Rule Induction and
Feature Selection, FUZZ-IEEE, Korea, pp. 1151-1156, 2009.


[12] Nan-Chen Hsieh, Rule Extraction with Rough-Fuzzy Hybridization Method,
Advances in Knowledge Discovery and Data Mining, LNCS, Vol. 5012, pp.
890-895, 2008.

[13] P. Lingras, R. Jensen, Survey of Rough and Fuzzy Hybridization, Fuzzy
Systems Conference, FUZZ-IEEE 2007, IEEE International, pp. 1-6, 2007.

[14] Sankar K. Pal, Pabitra Mitra, Case Generation: A Rough-fuzzy Approach, In:
Proc. Intl. Conf. Case Based Reasoning (ICCBR2001), Vancouver, Canada,
2001.

[15] G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic, Theory and Applications,
Prentice Hall, 1995.

[16] Cited in H.-J. Zimmermann, Fuzzy Set Theory - And its Applications, 3rd Ed,
Kluwer Academic Publishers, 1997.

[17] K. Cios, W. Pedrycz, R. Swiniarski, Data Mining - Methods for
Knowledge Discovery, Kluwer Academic Publishers, 1998.

[18] T. Y. Lin and N. Cercone, Rough Sets and Data Mining – Analysis of
Imprecise Data, Kluwer Academic Publishers, 1997.

[19] S. Tsumoto, H. Tanaka, Induction of medical expert system rules based on
rough sets re-sampling methods, Proceedings of the 18th Annual Symposium
on Computer Applications in Medical Care, Journal of the AMIA 1
(supplement), pp. 1066-1070, 1994.

[20] S. Tsumoto, H. Tanaka, PRIMEROSE, Probabilistic rule induction method
based on rough set re-sampling methods, In: Computational Intelligence: An
International Journal 11/2, pp. 389-405, 1995.

[21] S. Tsumoto, W. Ziarko, N. Shan, H. Tanaka, Knowledge discovery in clinical
databases based on variable precision rough sets model, In: Proceedings of the
19th Annual Symposium on Computer Applications in Medical Care, New
Orleans, Journal of American Medical Informatics Association Supplement,
pp. 270-274, 1995.


[22] Z. Pawlak, K. Słowiński, R. Słowiński, Rough classification of patients after
highly selective vagotomy for duodenal ulcer, International Journal of
Man-Machine Studies 24, pp. 413-433, 1986.

[23] J. Fibak, Z. Pawlak, K. Słowiński, R. Słowiński, Rough sets based decision
algorithm for treatment of duodenal ulcer by HSV, Bulletin of Polish
Academic Science - Biological Science 34/10-12, pp. 227-246, 1986.

[24] J. Fibak, K. Słowiński, R. Słowiński, The application of rough sets theory to
the verification of treatment of duodenal ulcer by HSV, In: Proceedings of the
Sixth International Workshop on Expert Systems their Applications, Agence
de l'Informatique, Paris, pp. 587-599, 1986.

[25] J. W. Grzymała-Busse, Applications of the rule induction system LERS, In:
Polkowski and Skowron [43], pp. 366-375, 1998.

[26] L. Polkowski, A. Skowron (Eds.), Rough Sets in Knowledge Discovery 1:
Methodology and Applications, Physica-Verlag, Heidelberg, 1998.

[27] A. Czyżewski, Speaker-independent recognition of digits - Experiments with
neural networks, fuzzy logic rough sets, Journal of the Intelligent Automation
and Soft Computing 2/2, pp. 133-146, 1996.

[28] A. Czajewski, Rough sets in optical character recognition, In: L. Polkowski,
A. Skowron (Eds.), Rough Sets in Knowledge Discovery 1: Methodology and
Applications, Physica-Verlag, Heidelberg, pp. 601-604, 1998.

[29] L. Polkowski, A. Skowron (Eds.), The 1st International Conference on Rough
Sets and Soft Computing (RSCTC'98), Warszawa, Poland, June 22-27,
Springer-Verlag, LNAI 1424, 1998.

[30] A. An, N. Shan, C. Chan, N. Cercone, W. Ziarko, Discovering rules from data
for water demand prediction, Proceedings of the Workshop on Machine
Learning in Engineering (IJCAI'95), Montreal, pp. 187-202, 1995; see also,
Journal of Intelligent Real-Time Automation, Engineering Applications of
Artificial Intelligence 9/6, pp. 645-654, 1995.


[31] J. Catlett, On changing continuous attributes into ordered discrete attributes,
In: Y. Kodratoff (Ed.), Machine Learning-EWSL-91, In: Proc. of the European
Working Session on Learning, Porto, Portugal, March 1991, LNAI, pp. 164-178,
1991.

[32] J. Komorowski, L. Polkowski, and A. Skowron, Rough sets: A Tutorial,
In: S. K. Pal and A. Skowron (Eds.), Rough-Fuzzy Hybridization: A New
Method for Decision Making, Springer-Verlag, Singapore, 1998.

[33] Chang Su Lee, A Framework of Adaptive T-S type Rough-Fuzzy Inference
Systems (ARFIS), PhD Thesis, The University of Western Australia, July
2009.

[34] A. Skowron and C. Rauszer, The discernibility matrices and functions in
information systems, In: R. Slowinski (Ed.), Intelligent Decision Support,
Kluwer, Boston, pp. 331-362, 1992.

[35] Miroslav Ćirić, Jelena Ignjatović, Stojan Bogdanović, Fuzzy equivalence
relations and their equivalence classes, Fuzzy Sets and Systems 158:12,
1295-1313, 2007.

[36] L. A. Zadeh, Similarity relations and fuzzy orderings, Information Sciences 3,
177–200, 1971.

[37] S. Ovchinnikov, Similarity relations, fuzzy partitions, and fuzzy orderings,
Fuzzy Sets and Systems 40, 107–126, 1991.

[38] J. Jacas, Similarity Relations. The calculation of minimal generating families,
Fuzzy Sets and Systems 35, 151–162, 1990.

[39] L. Valverde, On the structure of F-indistinguishability operators, Fuzzy Sets
and Systems 17, 313–328, 1985.

[40] J. Jacas and J. Recasens, Maps and isometries between indistinguishability
operators, Soft Computing 6, 14–20, 2002.

[41] D. Boixader, J. Jacas and J. Recasens, Transitive closure and betweenness
relations, Fuzzy Sets and Systems 120, 415–422, 2001.


[42] H. Thiele. Fuzzy rough sets versus rough fuzzy sets - an interpretation and a
comparative study using concepts of modal logics. Technical report no. CI-
30/98, University of Dortmund. 1998.

[43] Y.Y. Yao. A Comparative Study of Fuzzy Sets and Rough Sets. Information
Sciences, Vol. 109, No. 1-4, pp. 21–47. 1998.

[44] U. Höhle. Quotients with respect to similarity relations. Fuzzy Sets and
Systems, Vol. 27, No. 1, pp. 31–44. 1988.

[45] R. Jensen, Q. Shen, Aiding Fuzzy Rule Induction with Fuzzy Rough
Attribute Reduction, In: Proceedings of the UK Workshop on Computational
Intelligence (UKCI-02), Birmingham, UK, pp. 81-88, September 2002.

[46] J. G. Marin-Blázquez and Q. Shen. From approximative to descriptive fuzzy
classifiers. IEEE Transactions on Fuzzy Systems, Vol. 10, No. 4, pp. 484–497,
2002.

[47] D. Willaeys and N. Malvache, The use of fuzzy sets for the treatment of
fuzzy information by computer, Fuzzy Sets & Systems, vol. 5, pp. 323-328,
1981.

[48] Y.Y. Yao, Combination of rough and fuzzy sets based on α-level sets, in:
Rough Sets and Data Mining: Analysis for Imprecise Data, Lin, T.Y. and
Cercone, N. (Eds.), Kluwer Academic Publishers, Boston, pp. 301-321, 1997.

[49] A. Nakamura, Fuzzy rough sets, Notes on Multiple-valued Logic in Japan,
vol. 9, pp. 1-8, 1988.

[50] S. Nanda and S. Majumdar, Fuzzy rough sets, Fuzzy Sets & Systems, vol. 45,
pp. 157-160, 1992.

[51] T. B. Iwinski, Algebraic approach to rough sets, Bulletin of the Polish
Academy of Sciences, Mathematics, vol. 35, pp. 673-683, 1987.

[52] R. Biswas, On rough sets and fuzzy rough sets, Bulletin of the Polish
Academy of Sciences, Mathematics, vol. 42, pp. 345-349, 1994.


[53] L. I. Kuncheva, Fuzzy rough sets: application to feature selection, Fuzzy Sets
and Systems, vol. 51, pp.147-153, 1992.

[54] D. Dubois and H. Prade, Rough fuzzy sets and fuzzy rough sets, International
Journal of General systems, vol. 17, pp.191-209, 1990.

[55] G. J. Klir and B. Yuan, Fuzzy Sets and Fuzzy Logic, Theory and Applications,
Prentice Hall, 1995.

[56] Y. Y. Yao, On combining rough and fuzzy sets, Proceedings of the CSC’95
Workshop on Rough Sets and Database Mining, T. Y. Lin (Ed.), San Jose
State University, vol. 9, 9 pages, 1995.

[57] W. Pedrycz. Shadowed sets: bridging fuzzy and rough sets. In [123], pp. 179–
199. 1999.

[58] T. Beaubouef, F. E. Petry and G. Arora. Information Measures for Rough and
Fuzzy Sets and Application to Uncertainty in Relational Databases. In: S. K.
Pal and A. Skowron (Eds.). Rough-Fuzzy Hybridization: A New Trend in
Decision Making. Springer Verlag, Singapore. 1999.

[59] Z. Pawlak. Rough Sets: Theoretical Aspects of Reasoning About Data. Kluwer
Academic Publishing, Dordrecht. 1991.

[60] M. Wygralak, Rough sets and fuzzy sets - some remarks on interrelations,
Fuzzy Sets and Systems, vol. 29, no. 2, pp. 241–243, 1989.

[61] W. Chimphlee, A. H. Abdullah, M. N. M. Sap, S. Srinoy, S. Chimphlee.
Anomaly-Based Intrusion Detection using Fuzzy Rough Clustering,
International Conference on Hybrid Information Technology (ICHIT'06), vol.
1, pp. 329–334, 2006.

[62] W. Pedrycz. Shadowed sets: bridging fuzzy and rough sets. In: S. K. Pal and A.
Skowron (Eds.), Rough-Fuzzy Hybridization, Springer Verlag, Singapore, pp.
179–199, 1999.

[63] L.A. Zadeh. The Concept of a Linguistic Variable and Its Application to
Approximate Reasoning-1, Information Sciences vol.8 pp. 199–249, 1975.


Chapter 5

Rough-Fuzzy Intelligent System


A rough-fuzzy intelligent system is a hybrid soft computing system that combines
fuzzy set theory and rough set theory to construct the intelligent system. Fuzzy sets
are used to represent linguistic patterns, and rough set theory is used to obtain
dependency rules. 1

5.1 Introduction

In developing intelligent systems, one major problem is knowledge acquisition from
human experts [1]. Extracting knowledge from data stored in database tables may
automate this task. Extracting knowledge from data, combining it with the available
symbolic knowledge and refining it into a rule base for a rule-based intelligent
system is a challenging problem in computational intelligence. Recommendations
given by black-box systems [2] may be good, but reasoning with logical rules is more
acceptable to users because it is comprehensible, provides explanations, allows the
system to be validated by inspection (which increases confidence in it), and may
reveal important relationships and features hidden in the data. Many methods have
been developed to find a logical description of data using statistical, pattern
recognition [3] and machine learning [4] approaches. Rule-based systems should be
preferred over other methods only when the set of logical rules is not too complex
and their predictive accuracy is sufficiently high.

Rule generation techniques have been widely developed and used in data mining to
build intelligent systems in many application areas [5], such as medical diagnosis,
decision-making, classification and prediction. Many inductive learning methods, such
as decision tree generation [6] and rule generation methods [7], as well as soft
computing tools for rule generation, such as neural networks [8], fuzzy systems [9],
rough set theory [13] and genetic algorithms, have been introduced and applied to
extract knowledge from databases. Every approach has some advantages and
disadvantages. To provide a more flexible and robust information processing system,
using only one approach is not enough. Hybridizations of soft computing
methodologies for rule generation have therefore been introduced [22-34].

1
This chapter is based on the published paper: J. Ghosh, S. Mukhopadhyay, 2011, Role of
Certainty Factor in Rough-Fuzzy Rule Generation, International Journal of Computer Science,
Engineering and Applications (IJCSEA), Vol. 1, No. 6, pp. 49-61.

The remaining sections of this chapter are arranged as follows. Section 5.2 discusses
rough-fuzzy rule generation. Section 5.3 presents the procedure for fuzzification of
data, i.e., the method of linguistic representation of patterns in fuzzy set theory.
Section 5.4 describes the procedure for dependency rule generation using rough set
theory. Section 5.5 presents the steps involved in the modified framework to compute
dependency rules with CF using rough-fuzzy hybridization. Section 5.6 describes the
medical domain considered, diabetes mellitus. Section 5.7 shows some results and a
comparison between existing algorithms and the new modified rough-fuzzy
framework, and Section 5.8 concludes the chapter.

5.2 Rough-Fuzzy Rule Generation

Rough-fuzzy hybridization for rule generation typically utilizes both the ability of
fuzzy set theory to deal with uncertainty and vagueness in data and the power of rough
set theory to generate dependency rules. It is a hybrid intelligent system method in
which fuzzy set theory [9] is used for the linguistic representation of patterns, known
as fuzzification of data, and rough set theory [12] is used to generate dependency
rules. Data in a data set may be continuous or discrete. Fuzzy sets represent
continuous data in a linguistic [11] form with some membership value (MV).
Knowledge may be incomplete or uncertain, and fuzzy sets provide a natural
framework for dealing with uncertainty [10]. In this process a large data set is
converted into a smaller one, which helps in generating dependency rules using rough
set theory. Knowledge may also be consistent or inconsistent; it is inconsistent when
two records in a data set have the same condition attribute values but different
decision attribute values. Rough set theory can also handle inconsistent knowledge
when generating dependency rules [14].

Rule generation using rough set theory may be done by different approaches. One
approach is to first compute reducts [13], i.e., minimal sets of attributes that
preserve the indiscernibility relation and, consequently, the set approximations of
the information system. An information system is a table in which each row represents
an event and each column represents an attribute that can be measured for each event
or acquired from the domain expert. In an information system, the indiscernibility
relation for a subset of the attribute set relates those events that have the same
value for every attribute in that subset. Rules are then generated by calculating
discernibility matrices [13]. If A is the attribute set of an information system with
n events, then the discernibility matrix of that information system is an n×n
symmetric matrix in which each entry is a set of attributes defined as

\[ c_{ij} = \{a \in A \mid a(x_i) \neq a(x_j)\}, \quad \forall\, i, j = 1, 2, \dots, n \tag{5.1} \]

where the $x_i$ are events.
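The construction of a discernibility matrix can be sketched as follows. The table
layout (a list of dictionaries), the function name and the attribute names are
hypothetical and serve only to illustrate the definition in equation (5.1).

```python
def discernibility_matrix(events, attributes):
    """Return the n x n matrix whose (i, j) entry is the set of attributes
    on which events i and j take different values."""
    n = len(events)
    return [
        [
            {a for a in attributes if events[i][a] != events[j][a]}
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical information system with three events and two attributes.
events = [
    {"fever": "high", "cough": "yes"},
    {"fever": "high", "cough": "no"},
    {"fever": "low", "cough": "no"},
]
matrix = discernibility_matrix(events, ["fever", "cough"])
```

The matrix is symmetric and its diagonal entries are empty sets, so an implementation
concerned with memory could store only the lower triangle.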

Another way is to generate decision rules directly using the decision matrix, which
is a generalized form of rough set theory [14]. The decision matrix may also be used
to compute the reducts of the information system.

However, in all previous work on rough-fuzzy hybridization for rule generation
[14,15,34], the MVs of the linguistic terms introduced during fuzzification of the
data were ignored during rule generation using rough set theory. In those works all
fuzzified data are taken as certain. Thus, when rules were finally generated, all
rules were defined as certain rules with respect to the fuzzified data, and very few
rules were generated with some possibility value due to inconsistency in the data;
these are defined as possible rules in rough set theory. When such a rule set is used
as the knowledge base of a rule-based intelligent system, it is difficult at inference
time to select one rule for execution from among a set of matching rules. The only
option is to execute the first matching rule. This will not always give the optimum
result, because the result depends on the physical arrangement of the rule set.

In this work we present a modified framework that introduces a certainty factor (CF)
for each rule by considering both the MV of each linguistic term introduced during
fuzzification of the data and the possibility values, due to inconsistent data,
generated by rough set theory during rule generation. In this modified framework of
rough-fuzzy rule generation, the CF solves the problem of selecting a rule from a set
of matching rules. One may select the rule with the highest CF value, or use a voting
method among the rules with the top three or top five CF values. The work proceeds
through four steps. In step one, the data are read from the database and fuzzified by
introducing linguistic terms with MVs between 0 and 1. We also assign an MV of 1,
i.e., certain membership, to all other data. In step two, we modify the data set by
deleting duplicate records and store all MVs for later use in calculating the CF. In
step three, we use rough set theory to generate dependency rules. Finally, in step
four, we calculate the CF of each rule by considering the stored MVs for certain
rules, and both the stored MVs and the generated possibility value for possible rules
(certain and possible in the sense of rough set theory), and present the CF as a
percentage. This rule set may be used as the knowledge base of a rule-based
intelligent system as described in Chapter 3. During inference, the CF of each rule
plays an important role in finding the appropriate rule to fire. To generate the rules
and test the system we use diabetes patients' data.
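The two selection strategies mentioned above (highest CF, or voting among the top
rules) can be sketched as follows. The rule representation (a dictionary with the keys
"decision" and "cf"), the function names, the decision labels and the CF values are
hypothetical; the sketch resolves ties in the vote in favour of the decision that
appears first among the top rules, which is an assumption.

```python
from collections import Counter


def select_by_highest_cf(matching_rules):
    """Return the matching rule with the highest certainty factor."""
    return max(matching_rules, key=lambda rule: rule["cf"])


def select_by_vote(matching_rules, top_n=3):
    """Return the decision that occurs most often among the top_n rules
    when the matching rules are ranked by certainty factor."""
    ranked = sorted(matching_rules, key=lambda rule: rule["cf"], reverse=True)
    votes = Counter(rule["decision"] for rule in ranked[:top_n])
    return votes.most_common(1)[0][0]


# Hypothetical set of rules that all match the current case (CF in percent).
matching = [
    {"decision": "normal", "cf": 80.0},
    {"decision": "diabetic", "cf": 90.0},
    {"decision": "normal", "cf": 70.0},
    {"decision": "diabetic", "cf": 40.0},
]
best_rule = select_by_highest_cf(matching)
voted_decision = select_by_vote(matching, top_n=3)
```

In this example the two strategies disagree: the single best rule suggests one
decision while the majority of the top three rules suggests the other, and neither
outcome depends on the physical order of the rules.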

5.3 Fuzzification of Data using Fuzzy Set Theory

Tables of data are fuzzified using fuzzy set theory [9, 11]. Fuzzy set theory
introduced the concept of the degree of membership of an element in a set. In
classical set theory an element can belong to a set either fully (membership 1) or not
at all (membership 0). The degree of membership allows an element to lie in a set with
a membership value anywhere in the range [0, 1]. A fuzzy set can be defined as a set of
ordered pairs à = {(x, Ã(x)) | x ∈ X}. The function Ã(x) is called the membership
function of Ã, mapping each element of the base set X (the universe) to a membership
degree in the range [0, 1]. The base set may be discrete or continuous.

In rough-fuzzy hybridization, fuzzification of data is performed to represent the
linguistic patterns of the continuous data. Each linguistic pattern has a membership
degree in the range [0, 1]. The type of membership function used depends on the base
set. If the base set contains many values, or if it is continuous, then a parametric
representation, which can be adapted by changing the parameters, is appropriate. Most
commonly these membership functions are triangular or trapezoidal functions, which
are defined by three and four parameters respectively.
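The standard three-parameter triangular and four-parameter trapezoidal membership
functions can be sketched as follows. The parameter names and the sample values are
chosen for illustration only; strictly increasing feet and peaks (a < b < c, and
a < b <= c < d) are assumed.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), equal to 1 on [b, c]."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Hypothetical linguistic term "medium" on a scale from 0 to 10.
medium_at_peak = triangular(5.0, 0.0, 5.0, 10.0)
medium_at_slope = triangular(2.5, 0.0, 5.0, 10.0)
plateau = trapezoidal(3.0, 0.0, 2.0, 4.0, 6.0)
falling_edge = trapezoidal(5.0, 0.0, 2.0, 4.0, 6.0)
```

Changing the parameters shifts or widens a linguistic term without changing the code,
which is the sense in which the representation is parametric.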

Some applications require continuously differentiable curves, and therefore smooth
transitions, for modeling, which is not possible using triangular or trapezoidal
functions. In those cases the normalized Gaussian function, the difference of two
sigmoidal functions, the generalized bell function, etc., and in some applications
π functions, are used [8].


5.4 Rule Generation using Rough Set Theory

Rough set theory [12, 13] is used to generate dependency rules from a table of data.
Let us discuss some basic concepts of rough sets that are used in this chapter.

An information system S is defined as S = <U, A>, where U denotes the domain of
discourse, formally the universe, and A is the non-empty, finite set of attributes. Let
A = C ∪ D, where C is the non-empty, finite set of condition attributes and D is
the non-empty, finite set of decision attributes. An attribute a ∈ A can
be regarded as a function $a: U \to V_a$, where $V_a$ is the value set of a.

An information system may be viewed as an attribute-value table, known as a decision
table, in which each row is labeled by an object ∈ U and each column by an
attribute ∈ A.

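For all B ⊆ C, the equivalence relation $I_B$ on U is defined as

\[ I_B = \{(x, y) \in U \times U : \forall\, a \in B,\ a(x) = a(y)\} \tag{5.2} \]

$[x]_B$ denotes the equivalence class of object x ∈ U relative to $I_B$ and is
defined as

\[ [x]_B = \{y \mid y \in U,\ y\, I_B\, x\} \tag{5.3} \]

The B-lower and B-upper approximations of X ⊆ U in S, where B ⊆ C, are denoted by
$\underline{B}X$ and $\overline{B}X$ and are defined as

\[ \underline{B}X = \{x \in U : [x]_B \subseteq X\} \tag{5.4} \]

\[ \overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\} \tag{5.5} \]

X ⊆ U is B-exact if $\underline{B}X = \overline{B}X$ and B-rough if
$\underline{B}X \neq \overline{B}X$.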
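The equivalence classes and the two approximations can be computed directly from
their definitions, as in the following sketch. The table layout (object name mapped
to a dictionary of attribute values), the function names and the sample data are
hypothetical.

```python
def equivalence_classes(table, attrs):
    """Group objects that agree on every attribute in attrs."""
    classes = {}
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        classes.setdefault(key, set()).add(obj)
    return list(classes.values())


def approximations(table, attrs, target):
    """Return the (lower, upper) approximations of the object set target."""
    lower, upper = set(), set()
    for cls in equivalence_classes(table, attrs):
        if cls <= target:      # class lies entirely inside the target set
            lower |= cls
        if cls & target:       # class overlaps the target set
            upper |= cls
    return lower, upper


# Hypothetical decision table with two condition attributes.
table = {
    "x1": {"a": 1, "b": 0},
    "x2": {"a": 1, "b": 0},
    "x3": {"a": 0, "b": 1},
    "x4": {"a": 0, "b": 0},
}
lower, upper = approximations(table, ["a", "b"], target={"x1", "x3"})
is_rough = lower != upper
```

Here x1 and x2 are indiscernible, so the target set {x1, x3} cannot be described
exactly and is rough with respect to the chosen attributes.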

Rule generation using rough set can be done by the following two methods:

5.4.1 Method 1

The main task in this method of rule generation, used in [34], is to find the reducts
relative to the information system S. Let there be k decision attributes in D, i.e.,
$D = \{d_1, d_2, \dots, d_k\}$. The decision table is then divided into k tables
$S_i = \langle U_i, A_i \rangle$, ∀ i = 1, 2, …, k, where $U = \bigcup_{i=1}^{k} U_i$
and $A_i = C \cup \{d_i\}$.


Let us assume that there are n objects, i.e., $U = \{x_1, x_2, \dots, x_n\}$, and m
condition attributes, i.e., $C = \{a_1, a_2, \dots, a_m\}$, in the information system
S. Also assume that $U_i = \{x_{i1}, x_{i2}, \dots, x_{ip}\}$ is the set of objects
that occur in $S_i$, i = 1, 2, …, k. Now construct the discernibility matrix
$M_{d_i}(B)$ for each $d_i$-reduct $B = \{b_1, b_2, \dots, b_l\}$ (say) from the
$d_i$-discernibility matrix, defined as follows:

\[ c_{ij} = \{a \in B : a(x_i) \neq a(x_j)\}, \quad \forall\, i, j = 1, 2, \dots, n \tag{5.6} \]

For each object $x_j \in \{x_{i1}, x_{i2}, \dots, x_{ip}\}$ the discernibility
function, denoted by $f_{d_i}(x_j)$, is defined as

\[ f_{d_i}(x_j) = \bigwedge \left\{ \bigvee (c_{ij}) : 1 \le i, j \le n,\ j < i,\ c_{ij} \neq \emptyset \right\} \tag{5.7} \]

where ⋀ and ⋁ are conjunction and disjunction operations respectively.

Now calculate $R_i$, defined as

\[ R_i = \bigvee \left( f_{d_i}(x_j) \right), \quad \forall\, j = i_1, i_2, \dots, i_p \tag{5.8} \]

Thus the rule $r_i: R_i \to d_i$ is obtained, and the dependency factor $df_i$ is
calculated as

\[ df_i = \frac{card(POS_i(d_i))}{card(U_i)} \tag{5.9} \]

where card(.) denotes the cardinality and $POS_i(d_i)$ is defined as

\[ POS_i(d_i) = \bigcup_{X \in I_{d_i}} k_i(X) \]

where $k_i(X)$ is the lower approximation of X with respect to $I_i$.
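A dependency factor of this kind can be computed without building the lower
approximations explicitly, because an object lies in the positive region exactly when
all objects that share its condition values also share its decision value. The sketch
below uses this observation; the table layout, the function name and the sample data
are hypothetical.

```python
from collections import defaultdict


def dependency_factor(rows, condition_attrs, decision_attr):
    """Fraction of objects whose condition-attribute equivalence class is
    contained in a single decision class (the positive region)."""
    decisions_by_class = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        decisions_by_class[key].append(row[decision_attr])
    positive = sum(
        len(decisions)
        for decisions in decisions_by_class.values()
        if len(set(decisions)) == 1
    )
    return positive / len(rows)


# Hypothetical decision table: the first two rows conflict with each other.
rows = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 0, "b": 0, "d": "no"},
]
df = dependency_factor(rows, ["a", "b"], "d")
```

A value of 1 means that the decision attribute is fully determined by the condition
attributes; here only two of the four objects can be classified with certainty.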

5.4.2 Method 2

This method uses the decision matrix [14] for rule generation. The decision matrix is
a generalization of rough set theory from which reducts and decision rules can be
calculated. In this method it is first checked whether the information system is
consistent. An information system is said to be consistent if there are no two objects
whose condition attribute values are the same but whose decision attribute values are
different. Conversely, an information system is inconsistent if there exist two
objects whose condition attribute values are the same and whose decision attribute
values are different. That is, for any two objects i and j, if

\[ a_i(C) = a_j(C) \ \text{ and } \ a_i(D) \neq a_j(D), \quad \text{where } i \neq j \tag{5.10} \]

then the information system is inconsistent.
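The consistency test of equation (5.10) can be sketched as follows. The table layout
and the function name are hypothetical; for brevity the sketch reports each
conflicting object only against the first object seen with the same condition values.

```python
def find_inconsistencies(rows, condition_attrs, decision_attrs):
    """Return pairs (i, j) of row indices that agree on all condition
    attributes but differ on the decision attributes."""
    first_seen = {}
    conflicts = []
    for index, row in enumerate(rows):
        condition = tuple(row[a] for a in condition_attrs)
        decision = tuple(row[a] for a in decision_attrs)
        if condition in first_seen:
            earlier_index, earlier_decision = first_seen[condition]
            if earlier_decision != decision:
                conflicts.append((earlier_index, index))
        else:
            first_seen[condition] = (index, decision)
    return conflicts


# Hypothetical decision table in which rows 0 and 1 conflict.
rows = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
]
conflicts = find_inconsistencies(rows, ["a", "b"], ["d"])
is_consistent = not conflicts
```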


Let the domain of discourse U of the information system S be divided into k classes
$(c_1, c_2, \dots, c_k)$ according to the equivalence relation defined on D. For any
class $c_p \in (c_1, c_2, \dots, c_k)$, the objects of U that belong to $c_p$ are
numbered by subscripts i (i = 1, 2, …, m) and those that do not belong to $c_p$ by
subscripts j (j = 1, 2, …, n). The decision matrix M of the information system S for
the class $c_p$ is defined as an m×n matrix whose elements are sets of
(attribute name, attribute value) pairs:

\[ M_{ij}^{p} = \{(a, a(i)) : a(i) \neq a(j)\}, \quad \forall\, i = 1, 2, \dots, m \text{ and } j = 1, 2, \dots, n \tag{5.11} \]

where a is the attribute name and a(i) is the attribute value.
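The construction of the decision matrix for one class can be sketched as follows. The
table layout, the function name, the attribute names and the class labels are
hypothetical.

```python
def decision_matrix(rows, condition_attrs, decision_attr, target_class):
    """Build the decision matrix for target_class.

    Rows of the matrix correspond to objects inside the class, columns to
    objects outside it. Each entry is the set of (attribute, value) pairs
    of the inside object that differ from the outside object.
    """
    inside = [r for r in rows if r[decision_attr] == target_class]
    outside = [r for r in rows if r[decision_attr] != target_class]
    return [
        [
            {(a, ri[a]) for a in condition_attrs if ri[a] != rj[a]}
            for rj in outside
        ]
        for ri in inside
    ]


# Hypothetical consistent decision table.
rows = [
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 0, "b": 0, "d": "no"},
]
matrix_yes = decision_matrix(rows, ["a", "b"], "d", "yes")
```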

5.4.2.1 For consistent information system:

The minimum-length decision rule for any object i (i = 1, 2, …, m) belonging to class
$c_p \in (c_1, c_2, \dots, c_k)$ can be obtained as

\[ |B_i^p| = \bigwedge_{j} \bigvee M_{ij}^{p} \tag{5.12} \]

where ⋀ and ⋁ are conjunction and disjunction operations respectively.

The decision rule for the class $c_p \in (c_1, c_2, \dots, c_k)$ is calculated as

\[ R_p = \bigvee |B_i^p|, \quad \forall\, i = 1, 2, \dots, m \tag{5.13} \]
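One way to evaluate the conjunction of disjunctions for a single object is to expand
it into a disjunction of conjunctions and then discard every conjunction that
contains a shorter one (absorption). The sketch below does this by brute force; the
function name and the sample entries are hypothetical, and the expansion grows
exponentially with the number of columns, so it is only suitable for small tables.

```python
from itertools import product


def minimal_conjunctions(matrix_row):
    """Expand AND-of-ORs over one decision-matrix row into its minimal
    conjunctions of (attribute, value) pairs.

    matrix_row is a list of sets, one per object outside the class.
    Empty entries are skipped.
    """
    clauses = [entry for entry in matrix_row if entry]
    # Pick one pair from every clause; a frozenset removes repeated picks.
    terms = {frozenset(choice) for choice in product(*clauses)}
    # Absorption: drop any term that strictly contains another term.
    return [t for t in terms if not any(other < t for other in terms)]


# Hypothetical row: (a=1 or b=0) and (a=1) simplifies to a=1.
simplified = minimal_conjunctions([{("a", 1), ("b", 0)}, {("a", 1)}])

# Hypothetical row: (b=0) and (a=1) cannot be shortened.
unchanged = minimal_conjunctions([{("b", 0)}, {("a", 1)}])
```

Each returned conjunction is a candidate antecedent for the object; the disjunction of
the antecedents of all objects in a class gives the rule for that class.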

5.4.2.2 For inconsistent information system:

First find the B-lower and B-upper approximations for each class
$c_p \in (c_1, c_2, \dots, c_k)$. Rules generated from the B-lower approximation are
certain rules, and rules generated from the B-upper approximation are possible rules.

Thus the certain decision rule for any object i (i = 1, 2, …, m) belonging to class
$c_p \in (c_1, c_2, \dots, c_k)$ can be obtained, with the decision matrix built from
the B-lower approximation of $c_p$, as

\[ |B_i^p|_{certain} = \bigwedge_{j} \bigvee M_{ij}^{p} \tag{5.14} \]

where ⋀ and ⋁ are conjunction and disjunction operations respectively.

The certain decision rule for the class $c_p \in (c_1, c_2, \dots, c_k)$ is
calculated as

\[ (R_p)_{certain} = \bigvee |B_i^p|_{certain}, \quad \forall\, i = 1, 2, \dots, m \tag{5.15} \]


The possible decision rule for any object i (i = 1, 2, …, m) belonging to class
$c_p \in (c_1, c_2, \dots, c_k)$ can be obtained, with the decision matrix built from
the B-upper approximation of $c_p$, as

\[ |B_i^p|_{possible} = \bigwedge_{j} \bigvee M_{ij}^{p} \tag{5.16} \]

where ⋀ and ⋁ are conjunction and disjunction operations respectively.

The possible decision rule for the class $c_p \in (c_1, c_2, \dots, c_k)$ is
calculated as

\[ (R_p)_{possible} = \bigvee |B_i^p|_{possible}, \quad \forall\, i = 1, 2, \dots, m \tag{5.17} \]

For a possible rule the belief function can be defined as follows

1 (5.18)

where $c \in (c_1, c_2, \dots, c_k)$ and card(.) denotes the cardinality of the set.

5.5 Modified Framework to Generate Rough-Fuzzy Rule with CF

In the proposed modified framework we associate a CF with each rough-fuzzy rule,
which is more logical than considering each rough-fuzzy rule to be fully certain. The
following steps describe the detailed procedure as well as the modifications made to
the rough-fuzzy rule generation framework.

Step-1: Read the data set from the database and find the attributes that have
continuous values. Then perform the fuzzification operation on the continuous
attributes by introducing linguistic variables such as low, medium and high, and
calculate the MV of each linguistic variable. The MV is calculated using a triangular
membership function as described in section 5.3. According to the definition of MV
given in section 5.3, the MV must be in [0, 1]. We also assign an MV of 1, i.e.,
certain membership, to the other attribute values. We discard those parameters which
have an MV less than 0.25. Here we use a seven-parameter triangular function instead
of the regular three-parameter triangular function, defined as follows:


0 1, 7
1 2
2 3

, 1, 2, 3, 4, 5, 6, 7 3 4 (5.19)
4 5
5 6
6 7

Step-2: Find those records in the data set that have the same attribute values but
may have different MVs for those values. Calculate the total membership value of each
such record by summing the MVs of all attribute values in the record. Keep only the
record with the maximum total membership value and delete the other records. This
yields a modified data set with linguistic variables in which each attribute value
has a membership degree and all attributes have discrete values. At this point we
store the MVs corresponding to all attribute values in the data set.

Step-3: We now apply rough set theory, as described in section 5.4, with some
modification, to the modified data set constructed in step-2. Here we use Method-2 as
described in section 5.4.2. We take the fuzzified data set as the information system
S. Next we check whether the information system S is consistent using equation (5.10)
of Method-2 in section 5.4.2. We then construct the decision matrix as described in
equation (5.11) of Method-2 in section 5.4.2, with a modification: the MVs of the
attribute values are added to the decision matrix as follows:

\[ M_{ij}^{p} = \{(a, a(i), \mu(i)) : a(i) \neq a(j)\}, \quad \forall\, i = 1, 2, \dots, m \text{ and } j = 1, 2, \dots, n \tag{5.20} \]

where a is the attribute name, a(i) is the attribute value and µ(i) is the MV of the
attribute value.
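Steps 2 and 3 can be sketched together as follows. The record layout (dictionaries
with the keys "values", "mv" and "decision"), the function names, the attribute names
and the numbers are hypothetical, and the sketch assumes that records are duplicates
only when their decision value also coincides.

```python
def deduplicate(records, attrs):
    """Step 2: among records with identical attribute values, keep the one
    with the largest total membership value."""
    best = {}
    for rec in records:
        key = tuple(rec["values"][a] for a in attrs) + (rec["decision"],)
        total = sum(rec["mv"][a] for a in attrs)
        if key not in best or total > best[key][0]:
            best[key] = (total, rec)
    return [rec for _, rec in best.values()]


def decision_matrix_with_mv(records, attrs, target_class):
    """Step 3: decision matrix whose entries also carry the stored MV."""
    inside = [r for r in records if r["decision"] == target_class]
    outside = [r for r in records if r["decision"] != target_class]
    return [
        [
            {
                (a, ri["values"][a], ri["mv"][a])
                for a in attrs
                if ri["values"][a] != rj["values"][a]
            }
            for rj in outside
        ]
        for ri in inside
    ]


# Hypothetical fuzzified records; the first two differ only in their MVs.
records = [
    {"values": {"glucose": "high", "bmi": "normal"},
     "mv": {"glucose": 0.6, "bmi": 1.0}, "decision": "diabetic"},
    {"values": {"glucose": "high", "bmi": "normal"},
     "mv": {"glucose": 0.9, "bmi": 1.0}, "decision": "diabetic"},
    {"values": {"glucose": "low", "bmi": "normal"},
     "mv": {"glucose": 0.7, "bmi": 1.0}, "decision": "healthy"},
]
reduced = deduplicate(records, ["glucose", "bmi"])
matrix = decision_matrix_with_mv(reduced, ["glucose", "bmi"], "diabetic")
```

Carrying the MV inside each matrix entry means that the memberships needed for the CF
are available as soon as a rule antecedent has been extracted.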

For rule generation, Method-2 described in section 5.4.2 generates one rule per class.
Such a rule is very long and has the form

If (X is a1 and Y is b1 and …) or (X is a2 and Y is b2 and …) or … Then Z is c


These generated rules may be used as the rule base of a rule-based intelligent system,
and so for easier comparison we consider here simple rules of the following
form:

If X is a1 and Y is b1 and …. Then Z is c with CF m.

where m is the CF of the rule.

5.5.1 For consistent information system

The minimum-length decision rule for any object i (i = 1, 2, …, m) belonging to class
$c_p \in (c_1, c_2, \dots, c_k)$ is constructed as described in Method-2 of section
5.4.2.1, i.e.,

\[ |B_i^p| = \bigwedge_{j} \bigvee M_{ij}^{p} \tag{5.21} \]

where ⋀ and ⋁ are conjunction and disjunction operations respectively.

Next calculate the CF of each rule by


5.22

Where k is the number of attributes present in the ith rule and µij is the MV.

The decision rule is calculated as

Rj = |Bip | ∀ i = 1, 2, …, m; cp ∈ (c1, c2… ck); j=number of distinct rule (5.23)

If for any i, |Bip| already exists in the rule set R, compare the CF values of the
existing rule and the new rule, and store the rule with the maximum CF value in R.
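
The rule-set update just described can be sketched in Python as follows. The CF is taken as an already computed number, and the representation of a rule (a set of (attribute, value) conditions plus a decision class) and the name add_rule are assumptions for illustration.

```python
def add_rule(rule_set, conditions, decision, cf):
    """Insert a rule into rule_set, keeping only the maximum CF per rule.

    rule_set maps (frozenset of (attribute, value) conditions, decision)
    to a CF value. This representation is an assumption for illustration.
    """
    key = (frozenset(conditions), decision)
    if key not in rule_set or cf > rule_set[key]:
        rule_set[key] = cf


rules = {}
add_rule(rules, [("age", "old"), ("fbs", "high")], "yes", 0.6)
add_rule(rules, [("fbs", "high"), ("age", "old")], "yes", 0.8)  # same rule, higher CF
add_rule(rules, [("age", "old"), ("fbs", "high")], "yes", 0.5)  # lower CF, ignored
add_rule(rules, [("age", "young")], "no", 0.7)
```

A frozenset is used for the conditions so that the same rule is recognised regardless of the order in which its conditions are listed.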

5.5.2 For inconsistent information system

First, find the B-lower and B-upper approximations for each class cp ∈ (c1, c2, …, ck).
Rules generated from the B-lower approximation are certain rules, and rules generated
from the B-upper approximation are possible rules.

The certain minimum-length decision rule for any object i (i = 1, 2, …, m) belonging
to class cp ∈ (c1, c2, …, ck) can be obtained by performing the operation described
in Method-2 of section 5.4.2.2 as

$|B_i^p| = \bigwedge_j \bigvee M_{ij}^p$ (5.24)

where ⋀ and ⋁ are the conjunction and disjunction operations respectively.

Next calculate the CF of each rule as


5.25

Where k is the number of attributes present in the ith rule and µij is the MV.

The decision rule is calculated as

Rj = |Bip | certain ∀ i = 1, 2, …, m; cp ∈ (c1, c2… ck); j=number of distinct rule (5.26)

If for any i, |Bip| certain already exists in the rule set R, compare the CF values of
the existing rule and the new rule, and store the rule with the maximum CF value in R.

Then construct the possible minimum-length decision rule for any object i (i = 1, 2,
…, m) belonging to class cp ∈ (c1, c2, …, ck) by performing the operation described
in Method-2 of section 5.4.2.2 as

$|B_i^p| = \bigwedge_j \bigvee M_{ij}^p$ (5.27)

where ⋀ and ⋁ are the conjunction and disjunction operations respectively.

For possible rule the belief function of ith rule can be defined as follows

card Bc B
1 5.28

Where cp ∈ (c1, c2… ck) and card(.) define the cardinality of the set. and then


∗ 5.29

Where k is the number of attributes present in the ith rule and µij is the MV.

The decision rule is calculated as

Rj= |Bip| possible ∀ i = 1, 2, …, m; cp ∈ (c1, c2… ck); j=number of distinct rule (5.30)

If for any i, |Bip| possible already exists in the rule set R, compare the CF values of
the existing rule and the new rule, and store the rule with the maximum CF value in R.

5.6 Description of Medical Domain

Medical data sets of diabetes patients, collected for a study of the rheumatological
manifestations of Diabetes Mellitus [35], are considered for the application of this
framework.

Diabetes mellitus is a multimetabolic disorder characterized by hyperglycaemia due
to relative or absolute deficiency of insulin. It affects carbohydrate, protein, fat,
vitamin, mineral, water and electrolyte metabolism, leading to micro- and
macrovascular complications.

Musculoskeletal disorders are common in type 1 and type 2 diabetic subjects, and
examination of the periarticular regions of the hands, the joints, shoulders and feet,
as well as the skeleton, should be included in the evaluation of patients with Diabetes
Mellitus. As modern therapeutics have helped decrease the mortality and morbidity of
diabetes mellitus, increased musculoskeletal symptoms may be discovered as these
patients lead longer and more active lives. The pathophysiology of these disorders in
diabetic patients is not obvious. It could be associated with connective tissue
disorders, such as the formation of abnormally glycosylated end products or the
impaired degradation of byproducts; it could be indirectly related to the vasculopathy
and neuropathy that commonly complicate the primary disease; or it could be
attributed to a combination of factors. A wide range of musculoskeletal syndromes
has been described in association with diabetes. Foremost among them are diabetic
cheiroarthropathy, adhesive capsulitis of the shoulder, carpal tunnel syndrome,
Dupuytren's contracture, hyperostosis, osteoarthritis, gout and hyperuricemia,
calcium pyrophosphate deposition arthropathy, osteopenia and osteolysis of the
forefoot.

Hands are a target for several diabetes-related complications. Lundbaek K first
described hand stiffness in young diabetics in 1957, and this was subsequently
termed “diabetic hand syndrome”, “limited joint mobility” (LJM) or
“cheiroarthropathy” (after the Greek word “cheiros” for hand). LJM was initially
described in type 1 patients with long-standing diabetes, but it is found in both type 1
and type 2 diabetes mellitus and can be detected very early in the evolution of the
disease. The prevalence of LJM in type 1 diabetes quoted in several studies of
pediatric and adult diabetes clinics ranges from 9 to 58%.

Adhesive Capsulitis of Shoulder (ADH), or frozen shoulder, is the most disabling of
the musculoskeletal problems in diabetes mellitus; it is also known as shoulder
periarthritis or obliterative bursitis. It is characterized by progressive, painful
restriction of shoulder movement in at least three planes, especially external rotation
and abduction. It is associated with a thickened joint capsule that is closely applied
and adherent to the humeral head, resulting in considerable reduction in the volume
of the glenohumeral joint. The exact origins of adhesive capsulitis are not known,
although it has been associated with several other conditions, including shoulder
trauma and cerebral, cardiac and respiratory conditions. Adhesive capsulitis appears
at a younger age in patients with diabetes and is usually less painful, although it
responds less well to treatment and lasts longer. The estimated prevalence is 11–30%
in diabetic patients and 2–10% in nondiabetics.

Carpal Tunnel Syndrome (CTS) is the most common entrapment neuropathy
encountered in both the diabetic and the general population and occurs as a result of
median nerve compression under the transverse carpal ligament. The increased
prevalence in diabetes may be related to repeated undetected trauma, metabolic
changes, accumulation of fluid or edema within the confined space of the carpal
tunnel, and diabetic cheiroarthropathy, rheumatoid arthritis, and hypothyroidism. It
is more common in females and in obese individuals (BMI 30 kg/m2) and affects the
dominant hand. It has been reported to occur in 2% of the general population, 14% of
diabetic subjects without diabetic polyneuropathy, and 30% of diabetic subjects with
diabetic polyneuropathy.

Dupuytren’s contracture causes a focal flexion contracture, with a thickened band of
palmar fascia causing flexion deformities along with tethering of the skin. Knuckle
pads, as well as heel-pad nodules, may be noted. Pain and loss of motion result.
Dupuytren’s contracture, which is thought to be a low-grade inflammatory reaction,
is often associated with diabetes. Contractures may involve the third, fourth, or fifth
flexor tendons. Diabetic patients with Dupuytren’s contracture are usually elderly and
present with a long duration of diabetes. It tends to be milder in diabetic than in
non-diabetic subjects. It has been suggested that 25% of patients with Dupuytren’s
contracture have diabetes.

Flexor tenosynovitis (FTS), also called trigger finger or stenosing tenovaginitis, is
caused by fibrous tissue proliferation in the tendon sheath, leading to limitation of the
normal movement of the tendon. Clinically this can present as triggering or snapping
of the nodule as it passes through the tight, constricting tendon sheath. The first
annular pulley is the most commonly affected site. Trigger finger has a reported
incidence ranging from 1.7 to 2.6% in the general population. However, the incidence
of trigger finger in diabetes is reported to be between 10 and 20%.

Patients with type 2 diabetes seem to make more bone or to develop hyperostosis. The
most common form of new bone formation in diabetes is diffuse idiopathic skeletal
hyperostosis (DISH), or ankylosing hyperostosis of the spine. DISH is more frequent
in males and its prevalence increases with age, affecting mainly subjects over the age
of 40 years.

Gout is a heterogeneous disorder characterized by hyperuricemia and arthritis induced
by the accumulation of urate crystals. Gout is a recognized but relatively uncommon
complication of diabetic ketoacidosis. There is also a possible correlation between
chronically stable diabetes and the onset of gout symptoms.

Osteoarthritis is the most common rheumatic disease in the general population. It may
be asymptomatic, but severe involvement leads to pain, stiffness, and limitation of
motion in the affected joints, most commonly involving the knees, hips, and spine.

5.7 Application on Medical Data Set

We have applied this framework to the above-mentioned medical data sets of diabetes
patients, covering the following rheumatological manifestations of Diabetes Mellitus:

- Diabetic cheiroarthropathy or Limited joint mobility (LJM)

- Adhesive capsulitis of shoulder(ADH)

- Clinical Carpal tunnel syndrome(CTS CL)

- NCV finding of Carpal tunnel syndrome (CNCV)

- Dupuytren’s contracture (DUPY)

- Flexor tenosynovitis (FTS)

- Diffuse idiopathic skeletal hyperostosis (DISH)

- Gout and Hyperuricaemia(GOUT)

- Hand Osteoarthritis (OAH)

- Knee Osteoarthritis (OAK)

Each data set contains nine attributes, with values as follows:

- Age(Integer Numbers)

- Sex( 0= MALE, 1 = FEMALE)

- Type of diabetes( 1= TYPE1 , 2 = TYPE2)

- Duration of Diabetes(Integer Number in year)

- Use of Insulin( 1= YES , 0 = NO)

- Fasting Blood Sugar(Integer Numbers)

- Post Prandial Blood Sugar(Integer Numbers)

- Albuminuria (1 = MICROALBUMINURIA , 2= PROTEINURIA )

- Uric acid(Floating Point Number)

We evaluate the result using the best rule, i.e. the rule with the maximum CF value,
and also using the majority vote of classes among the rules with the top three and the
top five CF values. We also compare the results with the previously established
rough-fuzzy framework; the comparison in Table 5.1 is self-explanatory. Each data
set contains 100 instances and is presented as two files, a .nam file and a .dat file: the
.dat file holds the data and the .nam file describes the structure of the data. Here we
used 5-fold data: only 20% of the data is used to generate rules (cf. Appendix B) and
the other 80% is used for testing.
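
The three decision strategies (best rule, vote of the top three rules, vote of the top five rules) can be sketched in Python as follows. The input format (a list of (class, CF) pairs for the rules that match a test record), the tie-breaking choice and the name classify are assumptions made here for illustration; the original implementation may differ.

```python
from collections import Counter


def classify(matching_rules, top_k=1):
    """Return the class chosen by a majority vote among the top_k rules
    with the highest CF; top_k=1 is the best-rule strategy.

    matching_rules is a list of (decision_class, cf) pairs. Ties are broken
    in favour of the class of the highest-CF rule (an assumption).
    """
    if not matching_rules:
        return None
    ranked = sorted(matching_rules, key=lambda rule: rule[1], reverse=True)[:top_k]
    votes = Counter(cls for cls, _ in ranked)
    most = max(votes.values())
    for cls, _ in ranked:          # ranked is in descending CF order
        if votes[cls] == most:
            return cls


# Hypothetical rules firing on one test record.
fired = [("yes", 0.9), ("no", 0.8), ("no", 0.7), ("yes", 0.4), ("no", 0.3)]
best_rule = classify(fired, top_k=1)
vote_of_3 = classify(fired, top_k=3)
vote_of_5 = classify(fired, top_k=5)
```

In this hypothetical example the best rule and the votes disagree, which is the kind of case in which the three columns of the comparison table can differ.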

Data Set   Existing     Modified Algorithm
           Algorithm    Best Rule   Vote of 3 Rules   Vote of 5 Rules
LJM        72.50%       75.00%      77.50%            73.75%
ADH        78.75%       83.75%      83.75%            81.25%
CTS CL     92.50%       93.75%      93.75%            93.75%
CNCV       95.00%       96.25%      96.25%            95.00%
DUPY       96.25%       98.75%      98.75%            97.50%
FTS        85.00%       93.75%      91.25%            90.00%
DISH       95.00%       98.75%      96.25%            98.75%
GOUT       95.00%       96.25%      96.25%            96.25%
OAH        82.50%       85.00%      87.50%            83.75%
OAK        72.50%       73.75%      73.75%            72.50%

Table 5.1: Comparison result of existing method and modified method

5.8 Conclusion

We have presented a rough-fuzzy intelligent system comprising a rule generation
methodology based on rough-fuzzy hybridization. Fuzzy sets are used to represent a
pattern in terms of its membership in linguistic variables. Rough sets are used to
generate diagnostic rules. The CF of each rule is calculated by combining the fuzzy
MV and rough possibility values. We obtained good results, but several issues remain
unexplored, such as the aggregation of a large number of input features, the
construction of hierarchical systems when a large number of features contain missing
data, and the automation of the whole process of logical data description and creation
of complete intelligent systems [36]. Hybridization with neural networks may
increase the performance of the system considerably.

Bibliography

[1] B.G. Buchanan, E.H. Shortliffe, “Rule-Based Expert Systems”, Addison-Wesley,
New York, 1984.

[2] R. Andrews, J. Diederich, A.B. Tickle, “A Survey and Critique of Techniques
for Extracting Rules from Trained Artificial Neural Networks”, Knowledge-Based
Systems, vol. 8, pp. 373–389, 1995.

[3] R. Schalkoff, “Pattern Recognition: Statistical, Structural and Neural
Approaches”, Wiley, 1992.

[4] T. Mitchell, “Machine Learning”, McGraw Hill, 1997.

[5] S. Mitra, S.K. Pal, P. Mitra, “Data Mining in Soft Computing Framework: A
Survey”, IEEE Trans. on Neural Networks, vol. 13, no. 1, pp. 3–14, 2002.

[6] J.R. Quinlan, “C4.5: Programs for Machine Learning”, Morgan Kaufmann,
Palo Alto, 1993.

[7] R.S. Michalski, I. Mozetic, J. Hong, N. Lavrac, “The multi-purpose
incremental learning system AQ15 and its testing application to three medical
domains”, in: Proceedings of the Fifth National Conference on Artificial
Intelligence, AAAI Press, Menlo Park, 1986, pp. 1041–1045.

[8] S.T. Wang, “Fuzzy System and Fuzzy Neural Networks”, Shanghai Science
and Technology Press, 1998, Edition 1.

[9] L.A. Zadeh, “Fuzzy sets”, Inform. Contr., vol. 8, pp. 338–353, 1965.

[10] L.A. Zadeh, “Fuzzy logic as a basis for the management of uncertainty in
expert systems”, Fuzzy Sets and Systems, vol. 11, pp. 199–227, 1983.

[11] L.A. Zadeh, “A computational approach to fuzzy quantifiers in natural
languages”, Computers and Mathematics with Applications, vol. 9, pp. 149–184, 1983.

[12] Z. Pawlak, “Rough sets”, International Journal of Computer and Information
Sciences, vol. 11, pp. 341–356, 1982.

[13] Z. Pawlak, “Rough Sets: Theoretical Aspects of Reasoning about Data”, System
Theory, Knowledge Engineering and Problem Solving, vol. 9, Kluwer
Academic Publishers, Dordrecht, The Netherlands, 1991.

[14] S.T. Wang, D.J. Yu, J.Y. Yang, “Integrating rough set theory and fuzzy neural
network to discover fuzzy rules”, Intelligent Data Analysis, vol. 7, pp. 59–73, 2003.

[15] S.K. Pal, P. Mitra, “Case Generation: A Rough-fuzzy Approach”, in: Proc. Intl.
Conf. Case Based Reasoning (ICCBR2001), Vancouver, Canada, 2001.

[16] G. Xie, F. Wang, K. Xie, “RST-Based System Design of Hybrid Intelligent
Control”, IEEE International Conference on Systems, Man and Cybernetics, 2004.

[17] S.K. Pal, A. Skowron, Eds., “Rough Fuzzy Hybridization: A New Trend in
Decision Making”, Springer-Verlag, Singapore, 1999.

[18] S. Tsumoto, “Mining diagnostic rules from clinical databases using rough sets
and medical diagnostic model”, Information Sciences, vol. 162, pp. 65–80, 2004.

[19] B. Mak, T. Munakata, “Rule extraction from heuristic: A comparative study of
rough sets with neural network and ID3”, European Journal of Operational
Research, vol. 136, pp. 212–229, 2002.

[20] S. Tsumoto, H. Tanaka, “PRIMEROSE: Probabilistic rule induction method
based on rough sets and resampling methods”, Computational Intelligence, vol. 11,
pp. 389–405, 1995.

[21] V. Dhar, A. Tuzhilin, “Abstract-driven pattern discovery in databases”, IEEE
Transactions on Knowledge and Data Engineering, vol. 5, no. 6, pp. 926–937, 1993.

[22] S. Mitra, Y. Hayashi, “Neuro-fuzzy Rule Generation: Survey in Soft
Computing Framework”, IEEE Trans. on Neural Networks, vol. 11, no. 3, pp.
748–768, 2000.

[23] L.X. Wang, J.M. Mendel, “Generating Fuzzy Rules by Learning from
Examples”, IEEE Trans. Systems, Man, and Cybernetics, vol. 22, pp. 1414–1427,
1992.

[24] R. Rovatti, R. Guerrieri, “Fuzzy Sets of Rules for System Identification”,
IEEE Trans. Fuzzy Syst., vol. 4, pp. 89–102, 1996.

[25] H. Ishibuchi, K. Nozaki, N. Yamamoto, H. Tanaka, “Selecting Fuzzy If–Then
Rules for Classification Problems Using Genetic Algorithms”, IEEE Trans. Fuzzy
Syst., vol. 3, pp. 260–270, 1995.

[26] L. Wang, J. Yen, “Extracting Fuzzy Rules for System Modeling Using a
Hybrid of Genetic Algorithms and Kalman Filter”, Fuzzy Sets Syst., vol. 101,
pp. 353–362, 1999.

[27] P. Mitra, S.K. Pal, “Staging of Cervical Cancer with Soft Computing”,
IEEE Trans. on Biomedical Engineering, vol. 47, no. 7, pp. 934–940, 2000.

[28] Y. Jung-heurn, Y. Seung-moo, J. Hong-Tae, “Structure Optimization of
Fuzzy-Neural Network Using Rough Set Theory”, Proceedings of the 1999
IEEE International Conference on Fuzzy Systems, Aug. 1999, pp. 1666–1670.

[29] B.S. Ahn, S.S. Cho, C.Y. Kim, “The Integrated Methodology of Rough
Set Theory and Artificial Neural Network for Business Failure Prediction”,
Expert Systems with Applications, vol. 18, pp. 65–74, 2000.

[30] R. Yasdi, “Combining Rough Sets Learning and Neural Learning Method to
Deal with Uncertain and Imprecise Information”, Neurocomputing, vol. 7, pp.
61–84, 1995.

[31] R.W. Swiniarski, L. Hargis, “Rough Sets as a Front End of Neural
Networks Texture Classifiers”, Neurocomputing, vol. 36, pp. 85–102, 2001.

[32] R.R. Hashemi et al., “A Hybrid Intelligent System for Predicting Bank
Holding Structures”, European Journal of Operational Research, vol. 109, pp.
390–402, 1998.

[33] P.J. Lingras, “Rough Set Clustering for Web Mining”, Proceedings of the 2002
IEEE International Conference on Fuzzy Systems, Hawaii, May 2002, pp. 12–17.

[34] S.K. Pal, S. Mitra, P. Mitra, “Rough-Fuzzy MLP: Modular Evolution,
Rule Generation, and Evaluation”, IEEE Trans. on Knowledge and Data
Engineering, vol. 15, no. 1, pp. 14–25, Jan/Feb 2003.

[35] S. Ray, “Study of Common Rheumatological Manifestations of Diabetes”,
Thesis of Doctor of Medicine, WBUHS, Kolkata, 2010.

[36] W. Duch, R. Adamczak, K. Grabczewski, “A new methodology of extraction,
optimization and application of crisp and fuzzy logical rules”, IEEE Transactions
on Neural Networks, vol. 11, no. 2, pp. 1–31, March 2000.
Rough‐Fuzzy‐Neural Network Hybridization

Chapter 6

Rough-Fuzzy-Neural Network Hybridization


With the development of science and technology, the complexity of real-life
problems is increasing, and the volume of data is growing exponentially. Processing
such huge amounts of data requires an intelligent system that reasons the way experts
do. Traditional systems normally use a neural network or an expert system to solve
problems, but these have several shortcomings: it is hard to obtain a training set for a
neural network, hard to understand the knowledge rules, and hard to learn new
knowledge through a fuzzy expert system. A system based on a rough-fuzzy neural
network can reduce such problems.

6.1 Introduction

Rough-fuzzy-neural network hybridization is a powerful tool in the soft computing
area that utilizes the advantages of the individual methods as well as of the pairwise
hybrid methods. Fuzzy sets [1] are used to handle uncertainty and vagueness in the
data and to provide a human-like reasoning process. Rough sets [2] can handle
imprecise data and are also used for attribute reduction and dependency rule
generation. Neural networks [3] provide a connectionist structure. This type of
hybridization can therefore handle uncertain, imprecise and vague data, can produce
fuzzy if-then rules refined by a neural network, and also allows a human-like
reasoning process. It can be used efficiently for representing the knowledge base of
an intelligent system, for knowledge discovery from databases or experimental data
sets, and for generating the dependency rules that represent the knowledge base of a
rule-based intelligent system. This hybridization has been employed in the past in
many different combinations to solve different problems, such as rule generation [4,
5], prediction [6], classification [7], feature selection [8], knowledge encoding [9],
pattern recognition [10, 11], industrial wastewater treatment [12] and so on.

The organization of this chapter is as follows. Section 6.2 is devoted to the theoretical
discussion of different types of neural network systems. Section 6.3 presents
different types of neuro-fuzzy systems. Section 6.4 describes the neuro-fuzzy models,
focusing on the process modules of the fuzzy multi-layer perceptron. Section 6.5
presents the different proposed models that combine rough sets, fuzzy sets and neural
networks. Section 6.6 concludes the chapter, describing the key findings and the
scope of rough-fuzzy neural network hybridization.

6.2 Neural Network Systems

A neural network (NN), called an artificial neural network (ANN) in the domain of
artificial intelligence, is an interconnected collection of artificial neurons that uses
a mathematical or computational model for information processing based on
a connectionist approach to computation. The NN was first proposed by Warren
McCulloch and Walter Pitts in 1943 [3], based on knowledge about the architecture of
the human brain, and they are generally regarded as the inventors of the NN model.
This model, named after its inventors, included a nonlinear activation function in the
neuron and a threshold value, so that the neuron fires only if the input value is larger
than the threshold value. The next NN model was the Hebb network, introduced by
Donald Hebb in 1949 [13].

The years between 1950 and 1960 are regarded as the golden age of NN. During this
time, with growing interest in NN development, a large number of models were
introduced, such as the Perceptron [14] and Adaline [15] architectures. Most of the
NN models in use today were actually invented in that period, mainly in the domain
of psychology. Nevertheless, the development of NN did not have a major impact on
the understanding of the human brain or on real artificial intelligence. The main
reason was that the absence of optimization algorithms and the lack of real
applications generated disbelief in the capabilities of NN.

Widrow and Hoff [15] introduced the Adaline NN model in 1960 in the domain of
electrical engineering. A more mathematical approach to NN started in the 1970s but
only gained success in the early 1980s. The early work came from Kohonen,
Anderson [17], Carpenter and Grossberg, who wrote many mathematical and
biological papers on the subject [18-25]. After 1980, most work was done on
learning algorithms [26], new network topologies [27] and the universal
approximation properties of neural networks [28, 29, 30, 31]. The Hopfield network
[32] and the Boltzmann machine were invented, and a lot of work was done on
combining neural networks with radial basis functions. The number of applications
is growing, as is the combination of NN with fuzzy logic, with rough sets and with
other soft computing techniques.

6.2.1 McCulloch-Pitts Neuron

The McCulloch-Pitts neuron model is the first NN model described [3]. It is based on
a binary nonlinear relationship, on weights that describe the importance of the
interconnections between neurons, and on a threshold value that determines whether
the neuron fires. An input $I_{i,j} \in \{0, 1\}$ of the i-th neuron is either excitatory or
inhibitory. The excitatory inputs are defined as

$I_i^E = [I_{i,1}^E, I_{i,2}^E, \ldots, I_{i,m_i}^E]$

for the number of inputs $m_i$, and the inhibitory inputs are defined as

$I_i^I = [I_{i,1}^I, I_{i,2}^I, \ldots, I_{i,m_i}^I]$

The inhibitory inputs prevent the neuron from firing, irrespective of the other inputs.
The transfer function of a single neuron, $Y_i = f_{MP}(I_i^E, I_i^I)$, with $W_i$ the weight of
the neuron, can then be written as the logical relationship

$Y_i = \begin{cases} 1, & W_i \sum_{j=1}^{m_i} I_{i,j}^E \ge K_i \text{ and } \sum_{j=1}^{m_i} I_{i,j}^I = 0 \\ 0, & \text{otherwise} \end{cases}$ (6.1)

with $K_i$ the threshold value for the neuron and the output vector Y = [Y1, Y2, …, Yn].

Initially all inputs have the same weight, so that the first term of the logical
relationship can be replaced by $\sum_{j=1}^{m_i} \gamma_i I_{i,j}^E \ge 1$ with $\gamma_i = W_i / K_i$; thus all neurons
use the same nonlinear transfer function. Any inhibitory input can be given a large
negative weight $-N$ with $N > \max_i(\gamma_i m_i) - 1$, such that the transfer function for a
neuron becomes

$Y_i = \begin{cases} 1, & \sum_{j=1}^{m_i} \gamma_i I_{i,j}^E - N \sum_{j=1}^{m_i} I_{i,j}^I \ge 1 \\ 0, & \text{otherwise} \end{cases}$ (6.2)

The output Yi ∈ {0, 1} may be used as an input to other neurons to construct a
connected network without any strict layered structure. The typical structure of a
McCulloch-Pitts neuron is given below.

Figure 6.1 McCulloch-Pitts neuron
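
The behaviour of a single McCulloch-Pitts neuron can be sketched in Python as follows. The sketch assumes the normalized form with a common excitatory weight gamma and a threshold of one, and it models absolute inhibition directly (any active inhibitory input forces the output to zero); the function name and the logic-gate examples are illustrative assumptions.

```python
def mcculloch_pitts(excitatory, inhibitory, gamma, threshold=1.0):
    """Binary McCulloch-Pitts neuron.

    excitatory and inhibitory are lists of 0/1 inputs, gamma is the common
    weight of the excitatory inputs. Any active inhibitory input prevents
    firing, irrespective of the excitatory inputs.
    """
    if any(inhibitory):
        return 0
    return 1 if gamma * sum(excitatory) >= threshold else 0


# Two-input logic gates built from one neuron (illustrative choices of gamma).
and_out = [mcculloch_pitts([a, b], [], gamma=0.5) for a in (0, 1) for b in (0, 1)]
or_out = [mcculloch_pitts([a, b], [], gamma=1.0) for a in (0, 1) for b in (0, 1)]
inhibited = mcculloch_pitts([1, 1], [1], gamma=1.0)
```

With gamma = 0.5 both inputs must be active to reach the threshold, which gives a logical AND; with gamma = 1 a single active input suffices, which gives a logical OR.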

6.2.2 Hebb Nets or Rosenblatt’s Perceptron

The Hebb NN or Rosenblatt’s Perceptron [13, 33, 34] model is a single layer of
neurons. The output $Y = [Y_1, Y_2, \ldots, Y_R]$ is modelled using R neurons. Every neuron
Yi = fH(I) has m+1 inputs, namely m inputs Ij plus a bias term Bi, and a single output.
The bias term is considered as an extra input that always equals one. The inputs have
different weights Wi,j. The input-output relationship is defined as

$Y_i = \begin{cases} 1, & \sum_{j=1}^{m} W_{i,j} I_j + B_i \ge K_i \\ 0, & \text{otherwise} \end{cases}$ (6.3)

The output is a binary value, i.e. $Y_i \in \{0, 1\}$. Not all inputs need to be used; in the
equation given below, this can be enforced by setting some of the weights to zero.
In that form the threshold Ki can also be eliminated. The parameter vector to update,
$\Theta_i = [\gamma_{i,1}, \gamma_{i,2}, \ldots, \gamma_{i,m}, \beta_i]$, is then formed by the resulting weights
$\gamma_{i,j} = W_{i,j} / K_i$ and biases $\beta_i = B_i / K_i$, and the transfer function is defined as

$Y_i = \begin{cases} 1, & [I_1, I_2, \ldots, I_m, 1]\,\Theta_i^T \ge 1 \\ 0, & \text{otherwise} \end{cases}$ (6.4)

The transfer function can be simplified further by moving the threshold value into the
bias, so that the inequality with respect to one becomes an inequality with respect to
zero. The parameter vector is normally optimized by the Hebbian optimization
function. The parameters are therefore updated with the parameter update vector


ΔΘ , ,…, ,1 (6.5)

Figure 6.2 Hebb neuron – Rosenblatt’s perceptron

Hebb nets are mainly used for the identification of digital values, as in pattern
recognition. To reduce drawbacks such as the poor generalization properties of the
Hebb net, the output is often allowed to become negative, so that the nonlinear
relationship becomes a sign(*) function instead of the larger-than relationship.
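
A Hebb net with a sign output can be sketched in Python as follows. The sketch assumes the common textbook form of the Hebb rule with bipolar (+1/−1) inputs and targets, in which each weight is increased by the product of its input and the target; this form and the function names are assumptions for illustration and are not taken from the equations above.

```python
def train_hebb(samples):
    """One pass of the textbook Hebb rule: w_j += x_j * t and b += t.

    samples is a list of (inputs, target) pairs with bipolar values.
    """
    n = len(samples[0][0])
    weights = [0] * n
    bias = 0
    for inputs, target in samples:
        weights = [w + x * target for w, x in zip(weights, inputs)]
        bias += target   # the bias acts as a weight on a constant input of one
    return weights, bias


def hebb_output(inputs, weights, bias):
    """Sign-type output: +1 if the weighted sum is positive, otherwise -1."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else -1


# Bipolar AND function as a small pattern-recognition example.
and_samples = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
weights, bias = train_hebb(and_samples)
predictions = [hebb_output(x, weights, bias) for x, _ in and_samples]
```

A single pass over the four patterns is enough for this example; with binary (0/1) values the same rule would not learn from patterns whose target is zero, which is one reason bipolar outputs are preferred.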

6.2.3 Adaline and Madaline models

ADALINE (Adaptive Linear Neuron or Adaptive Linear Element) has only one layer
of neurons. The Adaline was introduced by Professor Bernard Widrow and his
graduate student Ted Hoff at Stanford University in 1960 [15]. This neuron is a Hebb
NN model with a transfer function defined as

, 6.6

or, with an alternate bias,

, 6.7

The main difference between the Hebb net and the Adaline model lies in the transfer
function of the nonlinear part. The two are treated differently because of the way the
parameter vector is updated: Hebb nets are normally optimized by the Hebbian
optimization function, whereas the parameters of the Adaline model are optimized by
gradient methods with a least-squares (LS) cost function [35].

In the Adaline model, the first derivative of the cost with respect to the number of
iterations is chosen as a stopping criterion. If the cost drops below a user-defined
value, the optimization is stopped and the cost function is assumed to be in a local
minimum.
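
Gradient optimization of an Adaline with an LS cost can be sketched in Python as follows. The sketch assumes a purely linear output, the Widrow-Hoff (least mean squares) update and a stopping rule based on a user-defined cost threshold; the learning rate, tolerance and function name are illustrative assumptions.

```python
def train_adaline(samples, learning_rate=0.1, tolerance=1e-6, max_epochs=5000):
    """Train a single linear neuron by stochastic gradient descent on the
    least-squares cost 0.5 * (target - output)**2.

    Training stops when the summed cost of an epoch drops below tolerance.
    """
    n = len(samples[0][0])
    weights = [0.0] * n
    bias = 0.0
    cost = float("inf")
    for _ in range(max_epochs):
        cost = 0.0
        for inputs, target in samples:
            output = sum(w * x for w, x in zip(weights, inputs)) + bias
            error = target - output
            cost += 0.5 * error * error
            # Gradient step on the LS cost for this sample.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
        if cost < tolerance:
            break
    return weights, bias, cost


# Hypothetical data generated from y = 2x + 1.
line_samples = [([0.0], 1.0), ([1.0], 3.0), ([2.0], 5.0)]
weights, bias, final_cost = train_adaline(line_samples)
```

Because the data are exactly linear, the cost can be driven arbitrarily close to zero; on noisy data the threshold would have to be chosen above the achievable minimum, or the change in cost between epochs used instead.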

Figure 6.3 Adaline neuron

The Madaline model is an extension of the Adaline model: when Adaline neurons are
placed in a multilayer architecture, the NN model is called a Madaline network.

6.2.4 Multilayer Perceptron model

The Perceptron NN model [14] is the most popular and widely used NN architecture
today. Originally the Perceptron model employed three layers of neurons, namely
the sensory units, the associator units and a response unit. Later, Perceptron models
with only two layers were found to have universal approximation capabilities.
Using only two layers has the advantage of simpler mathematics and easier
optimization.

The nonlinear transfer function may be binary, as in the McCulloch-Pitts and Hebb
networks. With gradient optimization methods, choosing a transfer function with
finite higher-order derivatives makes the optimization of the parameters easier. A
Perceptron employs weights and a bias term, and its transfer function is defined as

$Y_i = \sigma\left(\sum_{j=1}^{m} W_{i,j} I_j + W_{i,m+1}\right)$ (6.8)

with $Y = [Y_1, Y_2, \ldots, Y_n]$ the output vector, Wi,j the weights and Wi,m+1 the bias term.
The parameter vector comprises all weights and biases. Many nonlinear activation
functions exist; one is the sigmoid function, defined as

$\sigma(x) = \dfrac{1}{1 + e^{-x}}$ (6.9)

or the hyperbolic tangent function

$\sigma(x) = \tanh(x)$ (6.10)

when the output is allowed to be negative. Other transfer functions in regular use
are the hard-limit functions σ(x) = sign(x) and σ(x) = (x ≥ 0), and RBF functions.
When Perceptrons are used in a multilayer configuration, the outputs of one layer are
used as the inputs of the next layer. A three-layer Perceptron is defined as

with

, ,…, , , , ,…, , , , ,…, , (6.11)

the Wi are weight matrices, n1 and n2 are the number of neurons in the first and
second layer respectively and , ,…, is the input vector. The bias vectors
are

, ,…, , , , ,…, , , , ,…, , (6.12)

The parameter vector that must be updated is define as

Θ : ; : ; : ; : ; : ; : (6.13)
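
The layered composition of a multilayer Perceptron can be sketched in Python as follows. The sketch assumes that each layer computes a weighted sum plus a bias followed by a transfer function, and that the output of one layer is the input of the next; the weight values, layer sizes and function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid transfer function."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_output(inputs, weights, biases, activation):
    """One layer: each row of weights, with its bias, feeds one neuron."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Feed the inputs through the layers in order.

    layers is a list of (weights, biases, activation) triples.
    """
    signal = inputs
    for weights, biases, activation in layers:
        signal = layer_output(signal, weights, biases, activation)
    return signal


# Hypothetical three-layer network with 2 inputs, 2 and 1 hidden neurons, 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], math.tanh),
    ([[1.0, 1.0]], [0.0], math.tanh),
    ([[2.0]], [0.0], sigmoid),
]
at_origin = mlp_forward([0.0, 0.0], layers)
elsewhere = mlp_forward([1.0, 0.5], layers)
```

The parameter vector of such a network is simply the collection of all entries of the weight matrices and bias vectors of the three layers.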

Figure 6.4 Perceptron neuron

The simple one-layer Perceptron is also used for the identification of digital values,
e.g. for pattern recognition. Multilayer Perceptron (MLP) models have been used to
model a large number of nonlinear domains, typically with a soft nonlinear
relationship between inputs and outputs. Many researchers have used the MLP for its
strong modeling capabilities.

6.2.5 Radial Basis Function Neural Network

The Radial Basis Function (RBF) neural network is similar to the two-layer MLP.
RBF networks are currently widely used and studied by many researchers.

The RBF network is usually configured in a two-layer structure. Every node in the
hidden layer has a center ci that is compared with the input I by a norm ||I – ci||, and a
width ρi. The output layer is the linear combination of all hidden neurons. The transfer
function , ‖ ‖, of each neuron in the hidden layer may employ different
forms. Most generally it is similar to the Gaussian function

‖ ‖
, ‖ ‖, 6.14
2

where equals . Some other choices of , are the thin-plate-spline function

, ‖ ‖, ‖ ‖ log‖ ‖ 6.15

With 1, the multiquadratic function

, ‖ ‖, ‖ ‖ 6.16

Or the inverse multiquadratic function

‖ ‖, ‖ ‖ /
, 6.17

The combination of the neurons is shown in FIGURE 6.5. RBF input-output relation
is as follows

(6.18)

with (x) = x and (x) = fRBF(x ) , has the same form as the MLP model (6.11). The
approximation capabilities of RBF networks are similar to those of more general MLP
networks, irrespective of the choice of the transfer function fRBF. This suggests that
the selection of the nonlinearity function σi is not critical for performance. It does,
however, strongly affect the number of local minima when the cost function is
optimized.

Figure 6.5 Radial Basis Function network
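
The basis functions named above and the linear output layer of an RBF network can be sketched in Python as follows. The sketch uses the standard textbook forms of the Gaussian, thin-plate-spline, multiquadric and inverse multiquadric functions of the distance r = ||I − ci|| and the width ρ; these forms, the Euclidean norm and all names are assumptions for illustration and may differ in detail from the equations of this section.

```python
import math


def gaussian(r, rho):
    """Gaussian basis function exp(-r^2 / (2 rho^2))."""
    return math.exp(-(r * r) / (2.0 * rho * rho))


def thin_plate_spline(r):
    """Thin-plate-spline basis function r^2 log(r), taken as 0 at r = 0."""
    return 0.0 if r == 0 else r * r * math.log(r)


def multiquadric(r, rho):
    """Multiquadric basis function sqrt(r^2 + rho^2)."""
    return math.sqrt(r * r + rho * rho)


def inverse_multiquadric(r, rho):
    """Inverse multiquadric basis function 1 / sqrt(r^2 + rho^2)."""
    return 1.0 / math.sqrt(r * r + rho * rho)


def rbf_network(inputs, centers, widths, weights, basis=gaussian):
    """Two-layer RBF network: the hidden nodes compare the input with their
    centers, and the output is a linear combination of the hidden outputs."""
    total = 0.0
    for center, rho, weight in zip(centers, widths, weights):
        r = math.dist(inputs, center)   # Euclidean norm ||I - c_i||
        total += weight * basis(r, rho)
    return total


# Hypothetical network with a single hidden node centered at the origin.
at_center = rbf_network([0.0, 0.0], [[0.0, 0.0]], [1.0], [2.0])
far_away = rbf_network([10.0, 0.0], [[0.0, 0.0]], [1.0], [2.0])
```

The Gaussian node responds most strongly when the input coincides with its center and its response decays with distance, which is what makes the hidden layer a local representation of the input space.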

6.2.6 Other Neural Network Architectures

All of the above NN models may be considered feed-forward networks, since the
neurons are joined in a layered or grouped manner and the network topology contains
no cyclic paths. In a recurrent NN model, by contrast, each neuron is coupled to some
or all of the other neurons. The result is a recursive structure that may represent a
form of storage. The main drawback of such an NN is the large number of
interconnection weights.

One of the simplest recurrent networks is the linear autoassociator. All neurons
within the NN are connected to each other. No exogenous inputs are applied, and one
or more of the neurons are observed as an output.

The Kohonen network is an example of a topology-preserving map. This means that
the behavior of a neuron is related to the arrangement of the neurons in the network.
This network is based on the organization of the human brain, where neurons lie close
to each other and have interconnection matrices. The neighborhood of a neuron is
chosen by a radius r ∈ {0, 1, 2, ...}, where the neurons are organized in a
two-dimensional matrix structure.

A less used NN architecture is the Bidirectional Associative Memory (BAM), invented
by Bart Kosko in 1988 [36]. Two layers of neurons are used and each layer has its
own exogenous inputs. All neurons of one layer are connected to all neurons of the
other layer, but no interconnections exist within a layer. Many variants of this
two-layered NN topology exist. Different nonlinear transfer functions may be used
for each neuron; depending on this choice, a BAM is either discrete or continuous.

The Maxnet is a fully connected network with each node connecting to every other
node, including itself. The basic idea is that the nodes compete against each other by
sending out inhibiting signals to each other. The Mexican Hat variant of Maxnet
only allows interaction with other neurons that lie in a neighborhood. Moreover,
neurons that lie in a symmetric region around a particular neuron get the same weight
with a different sign. Hamming nets incorporate both techniques, so that neurons also
base their output on the neurons that lie close by and on those with the largest output.
The net is optimized with a least-squares (LS) cost function.
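
The competition in a Maxnet can be sketched as follows. This is a minimal illustration of the commonly described textbook iteration, in which every node keeps its own activation and is inhibited by a small fraction `epsilon` of the sum of all other activations. The value of `epsilon`, the stopping rule and the function name are assumptions made for the example.

```python
import numpy as np

def maxnet(activations, epsilon=None, max_iter=1000):
    """Iterate mutual inhibition until at most one node stays active."""
    a = np.asarray(activations, dtype=float).copy()
    if epsilon is None:
        epsilon = 1.0 / (2 * len(a))  # assumed value; it should stay below 1 / len(a)
    for _ in range(max_iter):
        if np.count_nonzero(a) <= 1:
            break
        # Each node is inhibited by the summed activation of all other nodes;
        # negative results are clipped to zero.
        a = np.maximum(0.0, a - epsilon * (a.sum() - a))
    return a

result = maxnet([0.2, 0.4, 0.6, 0.8])
winner = int(np.argmax(result))
```

In this example the node that started with the largest activation is the only one left active after the competition.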

Counterpropagation networks are multilayer networks used for function
approximation. They have lookup tables that are optimized in two phases. First the
input layer is clustered and then the other layers are optimized.

ART maps look like Kohonen networks, but here the user can control the degree of
similarity of the inputs that are associated with the same cluster of neurons. ART1 was
introduced for digital data, whereas ART2 is able to accept
continuous data. ART maps were developed with their own optimization methods,
which depend on a designated cost function.

Many similar NN topologies have been proposed, based on either continuous or binary
transfer functions and on different optimization methods. For example, Boltzmann
machines have single-layer or multilayer architectures with binary outputs and
use the simulated annealing optimization scheme to optimize the model parameters.
Gaussian machines use a sigmoid activation function and Gaussian-distributed
noise when optimized using simulated annealing. Cauchy machines are
built on Cauchy noise and also allow other noise to be used in the optimization scheme.

The neocognitron net was proposed for handwritten character recognition. It has nine layers
made with a large number of neuron arrays. The interconnections between the arrays
are sparse; every array in one layer is connected to a specified number of arrays of the
previous layer only.

Many more NN models have been introduced in the past. They differ mainly in a few
characteristics, defined below:

• Topology: describes how the neurons are interconnected.
• Activation function: the choice of nonlinear activation function.
• Cost function: the choice of a fixed cost function.
• Optimization method: the choice of a fixed optimization method.

Some NN models are distinguished only by the optimization method used (e.g.
Backprop NN, Boltzmann and Cauchy machines), or by the specific cost function
used (e.g. Hamming nets).

6.3 Neuro-Fuzzy Systems

Neuro-fuzzy systems are soft computing methods for the design of intelligent systems
that combine, in various ways, concepts from neural networks [37, 38] and fuzzy logic
[1, 39, 40]. Combining these two techniques into an integrated system
leads toward the development of intelligent systems capable of acquiring qualities
that characterize the human brain. However, fuzzy logic and neural networks usually
approach the design of intelligent systems from different angles.

Every intelligent technique has particular computational properties, such as the ability
to learn or to explain its decisions, that make it suited to specific problems. Neural
networks are essentially low-level computational algorithms that offer good
performance in pattern recognition and control tasks, but they are not good at
explaining how they reach their decisions. Fuzzy logic systems, which can
reason with uncertain and imprecise information, are good at explaining their
decisions, but they cannot automatically acquire the rules they use to make those
decisions. Neuro-fuzzy hybridization therefore results in a hybrid intelligent
system that combines the human-like reasoning style and ease of incorporating expert
knowledge of fuzzy systems with the learning abilities, optimization abilities and
connectionist structure of neural networks, which makes it a powerful tool for
designing intelligent systems.

Neuro-fuzzy systems make it possible to bring the low-level learning and
computational power of neural networks into fuzzy systems, and the high-level,
humanlike IF-THEN thinking and reasoning of fuzzy systems into neural networks.
On the neural side, more and more transparency is pursued and
obtained either by pre-structuring a neural network to improve its performance or by
interpreting the weight matrix in the learning stage. On the fuzzy side,
the development of methods for automatically tuning the parameters that
characterize a fuzzy system can largely draw motivation from similar methods used
in the connectionist community. Thus neural networks may improve their
transparency, which brings them closer to fuzzy systems, while fuzzy systems can
self-adapt, which brings them closer to neural networks [41].

Neural fuzzy systems have attracted growing interest from researchers [41, 42] in
various scientific and engineering areas. In particular, interest in hybrid neuro-fuzzy
systems is increasing in the area of pattern recognition [43, 44, 45].

Several ways to combine neural networks and fuzzy logic have been proposed. Efforts
at combining these two techniques may be grouped into three main
categories: neural fuzzy systems, fuzzy neural networks and fuzzy-neural hybrid
systems.

6.3.1 Neural Fuzzy Systems

Neural fuzzy systems are characterized by the use of neural networks within fuzzy
systems to provide an automatic tuning method without altering their
functionality. One example of this type of system is the use of neural networks
for membership function elicitation and for the mapping between fuzzy sets that are
utilized as fuzzy rules. This kind of hybridization is mostly found in control
applications. This approach is used in [46-52].

Consider an example of the mapping of a neural network to a fuzzy logic system, in
which the neural network simulates the processing of a fuzzy system. The neurons of
the first layer are responsible for the fuzzification process. The neurons of the second
layer represent the linguistic terms that are used in the fuzzy rules of the third layer.
Finally, the neurons of the last layer are responsible for the
defuzzification process. During training, the neural network adjusts its weights in
order to minimize the mean square error between the output of the network and the
desired output. In this example, the weights of the neural network represent the
parameters of the fuzzification function, the linguistic terms with their membership
functions, the fuzzy rule certainty factors and the defuzzification function respectively.
Therefore, training the neural network automatically adjusts the parameters of the
fuzzy system toward their optimal values.

6.3.2 Fuzzy Neural Networks

In this approach, fuzzy logic is used to fuzzify some of the elements of neural networks.
In this way a crisp neuron may become a fuzzy neuron. Since fuzzy neural networks
are basically neural networks, they are mostly used in pattern recognition
applications. This approach is used in [44, 53-59]. In [41] a neural network composed
of fuzzy neurons is presented.

In these fuzzy neurons, the input values are crisp, but the weighting operations are
replaced by membership functions, so that the result of each weighting operation is the
membership value of the corresponding input in the fuzzy set. The aggregation
operation may be min, max or any other t-norm or t-conorm [41].
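
A fuzzy neuron of this kind can be sketched as below. The sketch assumes triangular membership functions, which the text does not prescribe, and uses `min` as the t-norm and `max` as the t-conorm; the helper names `triangular` and `fuzzy_neuron` are hypothetical.

```python
def triangular(a, b, c):
    """Return a triangular membership function with feet at a and c and peak at b.

    Assumption: the triangular shape is chosen only for illustration.
    """
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        if x <= b:
            return (x - a) / (b - a)
        return (c - x) / (c - b)
    return mu

def fuzzy_neuron(inputs, memberships, aggregate=min):
    """Replace each weighting by a membership evaluation, then aggregate.

    `aggregate` may be min (a t-norm), max (a t-conorm) or another such operator.
    """
    degrees = [mu(x) for mu, x in zip(memberships, inputs)]
    return aggregate(degrees)

# Two crisp inputs, each "weighted" by its own membership function.
memberships = [triangular(0, 5, 10), triangular(0, 2, 4)]
and_like = fuzzy_neuron([5, 1], memberships, aggregate=min)
or_like = fuzzy_neuron([5, 1], memberships, aggregate=max)
```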

6.3.3 Fuzzy-Neural Hybrid Systems

In this approach, fuzzy and neural network techniques are used independently
to construct a hybrid system. Each performs its own task, serving a different function
in the system, and the two complement each other in order to achieve a
common goal. This kind of merging is application-oriented and suitable for both
control and pattern recognition applications [41].

6.4 Neuro-Fuzzy Models

Several neural network models were discussed in Section 6.2 of this thesis.
Among them, the multi-layer perceptron (MLP) [37, 38] with the backpropagation
learning mechanism [26] is the most prevalent neural network discussed in the
literature [26]. The study in this section therefore focuses on the fuzzy MLP.

The MLP has been used in a wide range of applications, such as pattern recognition [60,
61, 62], medicine [63, 64] and forecasting [65]. Basically, the MLP is a
feedforward multi-layer network that uses a supervised learning mechanism based on
error correction. That is, it employs a mechanism that modifies the weights of the
network in order to minimize the mean squared error between the actual and desired
outputs of the neural network [66].

The fuzzy multi-layer perceptron is an application of fuzzy sets in an MLP network [54,
67]. In other words, fuzzy sets are used directly to perform the fuzzification operation
at the network level, the learning level or the network-learning level of the MLP.
Much research has been done on combining fuzzy sets and the MLP, for example in
[48, 49, 50, 54, 67, 68, 69].

A network-learning-level fuzzification of the MLP is proposed in [56], in which the
following mechanisms are introduced into the operation of the MLP.

• Fuzzy desired output: Fuzzy concepts are used in calculating the desired output
of the input patterns that are presented to the MLP neural model during the learning
process and recalling phase;
• Degree of ambiguity: A parameter that handles the degree of ambiguity
of the input pattern is added to the weight update equation.

This fuzzy MLP model [56] is a modification of the model in [54].

6.4.1 Multi-Layer Perceptron

The perceptron and other single-layer networks are limited in their capabilities
[70]. Feedforward multi-layer networks, such as the MLP with nonlinear activation
functions, can overcome these limitations and have been employed in a wide range of
applications [60, 61, 63, 64, 65]. The learning mechanism usually used in this network
is called backpropagation learning [26]. However, it was not much used by
researchers until the 1980s, when it was independently rediscovered by Parker [71],
LeCun [72] and, in its most popular version, by Rumelhart, Hinton and Williams [66].

The backpropagation learning algorithm assumes a feedforward neural network
architecture such as an MLP. In this network architecture, nodes are partitioned into
layers, numbered 0 to L, in which the initial layer is the input layer with number 0, the
last layer (number L) is the output layer and the other layers are called the hidden
layers. Figure 6.6 shows a typical example of an MLP architecture, which consists of
an input layer, two hidden layers and an output layer.

The backpropagation algorithm is a generalization of the least mean squares (LMS)
algorithm that modifies the weights of a neural network in order to minimize the mean
squared error between the desired and actual outputs of the network by a gradient
descent method. In a supervised learning algorithm, inputs as well as desired outputs
are known.

[Figure: Input Layer, Hidden Layer I, Hidden Layer II, Output Layer]


Figure 6.6: The general structure of multi-layer perceptron

Fundamentally, the backpropagation learning process consists of two steps, a
feedforward step and a backward step, defined below:

• Feedforward step: In this phase, the input pattern is fed to the input layer
neurons, which pass it on to the first hidden layer neurons. The hidden layer
neurons compute a weighted sum of their inputs, pass the sum through their
activation function and send the result to the next hidden layer neurons. This
process is repeated until the result is sent to the output layer neurons of the
network, where the actual output of the neural network is generated.

• Backward step: Once the output of the network is generated, the mean squared
error between the desired and actual outputs of the network is calculated.
Subsequently, this error is back-propagated and used as the basis for
modifying the weights of the network [37, 66].
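
The two steps can be sketched with NumPy as follows. This is a minimal illustration, not the implementation used in the thesis: it assumes sigmoid activations in every layer, updates the weights once per pass over all patterns, and replaces explicit bias terms by a constant input of 1. The toy data (XOR), the layer sizes and the learning rate are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights):
    """Feedforward step: return the activations of every layer, input first."""
    activations = [x]
    for w in weights:
        activations.append(sigmoid(activations[-1] @ w))
    return activations

def backward(activations, target, weights, lr):
    """Backward step: back-propagate the output error and update the weights in place."""
    out = activations[-1]
    # Error term of the output layer (constant factors are absorbed into lr).
    delta = (out - target) * out * (1.0 - out)
    for k in reversed(range(len(weights))):
        grad = activations[k].T @ delta / len(target)
        if k > 0:
            # Error term of the layer below, computed before weights[k] changes.
            below = activations[k]
            delta = (delta @ weights[k].T) * below * (1.0 - below)
        weights[k] -= lr * grad

def mse(output, target):
    return float(np.mean((output - target) ** 2))

# Hypothetical toy data: XOR with a constant 1 appended as a bias input.
x = np.array([[0, 0, 1], [0, 1, 1], [1, 0, 1], [1, 1, 1]], dtype=float)
t = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
# Input layer (3) -> hidden layer I (4) -> hidden layer II (4) -> output layer (1).
weights = [rng.normal(0, 0.5, (3, 4)),
           rng.normal(0, 0.5, (4, 4)),
           rng.normal(0, 0.5, (4, 1))]

initial_error = mse(forward(x, weights)[-1], t)
for _ in range(200):
    backward(forward(x, weights), t, weights, lr=0.5)
final_error = mse(forward(x, weights)[-1], t)
```

The layer layout follows Figure 6.6 (one input layer, two hidden layers, one output layer); comparing `final_error` with `initial_error` shows the effect of the repeated feedforward and backward steps.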


6.4.2 Fuzzy Multi-Layer Perceptron

Fundamentally, the combination of fuzzy sets and the MLP can be achieved in three
different ways, namely network-level fuzzification, learning-level fuzzification and
network-learning-level fuzzification, as follows:

• Network-level fuzzification: This process of fuzzification occurs in the structure
of the MLP model, such as in the neurons (and weights) and/or in the whole
architecture, as well as in the desired output.

o Neurons and whole architecture: In most cases the behavior of a fuzzy
system is simulated, with neurons representing fuzzy functions
with parameter values or fuzzy rules. When the layers of neurons represent
the set of fuzzy rules, the conventional gradient descent backpropagation
algorithm is used to update and refine the parameters of the fuzzy system.

In this approach the fuzzy systems are optimized. Usually fuzzy rules are
acquired from human experts or operators in the domain, according to their
domain knowledge or experience. In modelling a fuzzy system, knowledge
acquisition is a very difficult task. Due to the ambiguity,
uncertainty or complexity of the system being identified, it is often too difficult,
and sometimes impossible, for a human expert to produce the desired fuzzy
rules or membership values. It therefore becomes necessary to generate fuzzy
rules by some learning technique. Fuzzy rules can be generated by building
an optimal fuzzy system model with the backpropagation algorithm. This
approach has been widely used and is found in [46-52, 73].

o Desired output: In this approach, the desired output vector is fuzzified in
order to represent the degree of membership of the input pattern to each
output class [54, 56].

• Learning-level fuzzification: The process of fuzzification occurs in the learning
algorithm, preserving the multi-layer perceptron architecture. Learning-level
fuzzification tries to reduce the weak points of the backpropagation method by
increasing the convergence rate [68], improving accuracy [74] and
decreasing the influence of ambiguous training patterns [67, 56].

• Network-learning-level fuzzification: The process of fuzzification occurs in the
structure of the multi-layer perceptron model as well as in the backpropagation
learning method. One example of this approach is a structure that is
adjusted as a fuzzy control system and whose learning procedure is fuzzified in
order to better refine the fuzzy rules and parameters. This approach is found
in [50].

In the next subsections a network-learning-level fuzzification is considered for study,
in which fuzzy concepts are used to define the desired output vector (network-level
fuzzification) and a fuzzy parameter is added to the updating equation during the
learning process (learning-level fuzzification) [56].

6.4.2.1 Desired Output Vector

Usually, in the conventional MLP, the number of neurons in the output layer
corresponds to the number of pattern classes occurring in the experimental data set. In
this type of neural network, the winner-takes-all method is used during the learning
and recalling phases. In the winner-takes-all method, 1 is assigned to
the winning element and 0 to the other elements. In the recalling phase, the
winner-takes-all method is used to define the winning neuron, that is, the neuron that
represents the network's prediction of the class to which the input pattern belongs, by
assigning 1 to this neuron and 0 to the other neurons.

The winner-takes-all method is applied in the learning process to define the desired
output vector. In the desired output vector, the class to which the input pattern belongs
is assigned the value 1 and the other classes are assigned the value 0. This is called a
crisp desired output. The mean square error is then calculated by comparing the actual
and desired outputs, and is back-propagated to modify the weights of the neural
network.

In real world problems the data are usually ill-defined, with overlapping or fuzzy class
boundaries. Some patterns may have nonzero membership value in two or more
pattern classes. In the Pattern Recognition field, it is very common that an input
pattern has a degree of similarity to more than one class.

In the conventional MLP, this multiple similarity is not considered, since a crisp desired
output is assigned and used during the training process and the recalling phase. Therefore,
to take the membership values of every class into account, it is promising to include fuzzy
concepts in the calculation of the desired output [54, 56]. Unlike the conventional
MLP, the fuzzy MLP can assign desired membership values to the output neurons
during the training process instead of choosing crisp values as in the winner-takes-all
method [54]. Subsequently, the errors are back-propagated with respect to the exact
similarity found in the fuzzy desired output.

Thus a fuzzy MLP model with a fuzzy desired output can classify fuzzy data with
overlapping or fuzzy class boundaries more efficiently. Moreover, the fuzzy MLP can
be applied to any problem where the conventional MLP is used.
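
The difference between a crisp and a fuzzy desired output can be sketched as follows. The text does not state how the class memberships are obtained, so the sketch assumes, purely for illustration, a membership that decreases with the distance between the pattern and each class mean, 1 / (1 + d^2). The function names and the toy class means are hypothetical.

```python
import numpy as np

def crisp_desired_output(class_index, n_classes):
    """Winner-takes-all target: 1 for the pattern's class, 0 for all others."""
    target = np.zeros(n_classes)
    target[class_index] = 1.0
    return target

def fuzzy_desired_output(x, class_means):
    """Membership of pattern x in every class.

    Assumption: membership = 1 / (1 + d^2), with d the Euclidean distance to the
    class mean. This formula is a stand-in, not the one used in [54, 56].
    """
    dist = np.linalg.norm(class_means - x, axis=1)
    return 1.0 / (1.0 + dist ** 2)

class_means = np.array([[0.0, 0.0], [2.0, 0.0]])

crisp = crisp_desired_output(0, 2)
# A pattern on the class boundary belongs equally to both classes.
ambiguous = fuzzy_desired_output(np.array([1.0, 0.0]), class_means)
# A pattern at the first class mean belongs mostly to the first class.
clear = fuzzy_desired_output(np.array([0.0, 0.0]), class_means)
```

With the crisp target the boundary pattern would be forced to 1 for one class and 0 for the other, whereas the fuzzy target keeps its similarity to both classes.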

6.4.2.2 Process of Updating the Weights

Learning in the conventional MLP network is done by minimizing
the mean square error in the output vectors with a
gradient descent method. This minimization is achieved
by updating the weights of the neural network. To calculate the new weights
of the network, a momentum updating equation is used, defined as follows [37]:

$$w_{ij}(t+1) = w_{ij}(t) + \eta\,\delta_j\,o_i + \alpha\,\Delta w_{ij}(t) \qquad (6.19)$$

Where

$w_{ij}(t+1)$: The new weight of the connection from the i-th neuron to the j-th neuron.

$w_{ij}(t)$: The old weight of the connection from the i-th neuron to the j-th neuron.

$\eta$: The learning rate.

$\delta_j$: The parameter that is calculated according to the LMS error and the layer in which
the neuron is found.

$o_i$: The output of the neuron i.

$\alpha$: Momentum parameter.

$\Delta w_{ij}(t)$: The change in the weight during the last iteration.

The momentum updating equation computes the new weights of a neural network from
their previous values, a correction scaled by the learning rate and
the update value from the last iteration scaled by the momentum parameter.
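
A single momentum update can be sketched as below. The sketch assumes the common form in which the new weight is the old weight plus the learning rate times the error term and the neuron output, plus the momentum parameter times the previous weight change. The function name and the numeric values are hypothetical.

```python
def momentum_update(w, delta_j, o_i, prev_dw, eta=0.1, alpha=0.9):
    """Return the new weight and the weight change that produced it.

    Assumed form: dw = eta * delta_j * o_i + alpha * prev_dw, w_new = w + dw.
    """
    dw = eta * delta_j * o_i + alpha * prev_dw
    return w + dw, dw

# One update of a single connection weight with made-up values.
new_w, dw = momentum_update(w=0.5, delta_j=0.2, o_i=1.0, prev_dw=0.1, eta=0.5, alpha=0.5)
```

Returning `dw` alongside the new weight lets the caller feed it back as `prev_dw` in the next iteration.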

In the weight update process defined in equation 6.19, every training pattern has
the same importance in adjusting the weights of the neural network. However, the
patterns that lie in overlapping areas are mainly responsible for misclassification. The
momentum equation does not take this into account, so ambiguous and unambiguous
training patterns have the same influence on the weight update.

One approach to improving the performance of the algorithm was proposed in [56], by
controlling the amount of correction in the weight vector produced by each input
pattern. The amount of correction is defined by the degree of ambiguity of a pattern:
the more ambiguous an input pattern, the smaller the correction in the weights. With this
approach the ambiguous patterns have less influence on the weight updating process
than the unambiguous patterns.

The degree of ambiguity (A) can be defined as follows.

$$A = (\mu_i - \mu_j)^m \qquad (6.20)$$

Where

$\mu_i$: The membership value of the top class i.

$\mu_j$: The membership value of class j, the second highest one.

$m$: An enhancement/reduction fuzzy parameter.

The degree of ambiguity is calculated by taking the difference between the
membership values of the top class and the second highest class.
The enhancement/reduction fuzzy parameter m is < 1 for enhancement, = 1 to
maintain the value, or > 1 for reduction.

To update the weights of the neural network, the degree of ambiguity is employed as a
parameter in the weight updating equation, along with the learning rate ($\eta$) and the
output of the neuron, as defined below:

$$w_{ij}(t+1) = w_{ij}(t) + \eta\,A\,\delta_j\,o_i + \alpha\,\Delta w_{ij}(t) \qquad (6.21)$$

Thus this approach [56] avoids the problem of ambiguous patterns having too much
influence on the weight updating process. The more ambiguous a training pattern, the
less its influence on the weight updating equation.
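
The effect of the degree of ambiguity on the weight update can be sketched as follows. The sketch assumes that A is the difference between the two highest class memberships raised to the power m, and that A multiplies the learning-rate term of a standard momentum update. Both forms are reconstructed from the description in the text, and the function names and numeric values are hypothetical.

```python
def degree_of_ambiguity(memberships, m=1.0):
    """Difference between the two highest membership values, raised to the power m.

    Assumed form: A = (mu_top - mu_second) ** m. Small A means an ambiguous pattern.
    """
    top, second = sorted(memberships, reverse=True)[:2]
    return (top - second) ** m

def fuzzy_momentum_update(w, delta_j, o_i, prev_dw, ambiguity, eta=0.1, alpha=0.9):
    """Momentum update in which the correction term is scaled by the ambiguity A."""
    dw = eta * ambiguity * delta_j * o_i + alpha * prev_dw
    return w + dw, dw

clear_pattern = degree_of_ambiguity([0.9, 0.1, 0.3])          # top 0.9, second 0.3
enhanced = degree_of_ambiguity([0.9, 0.1, 0.3], m=0.5)        # m < 1 enlarges A
reduced = degree_of_ambiguity([0.9, 0.1, 0.3], m=2.0)         # m > 1 shrinks A
boundary_pattern = degree_of_ambiguity([0.5, 0.5, 0.0])       # fully ambiguous

# A fully ambiguous pattern contributes no correction of its own.
w_after, dw = fuzzy_momentum_update(1.0, 0.2, 1.0, prev_dw=0.0, ambiguity=boundary_pattern)
```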

6.4.2.3 Learning Strategy

In perceptron-based models, each iteration of the training process consists of
presenting the whole training set in a predefined order. Several learning strategies
have been applied to perceptron neural networks; the most used are learning by pattern,
learning by block and learning by epoch [75].

• Learning by pattern: The weights of the neural network are updated
after each training pattern has been presented. In other words, after every
training pattern, the weights are adjusted so that the network better represents
the training pattern most recently presented.

• Learning by block: The weights of the neural network are updated
after a subset of the training patterns has been presented.

• Learning by epoch: The weights of the neural network are updated
after all training patterns have been presented.
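
The three strategies differ only in how many patterns are presented before each weight update. A minimal sketch, assuming the training patterns are indexed 0 to n-1 and presented in order, groups the indices into consecutive blocks with one weight update per block; the function name is hypothetical.

```python
def update_blocks(n_patterns, block_size):
    """Split pattern indices into consecutive blocks; one weight update follows each block."""
    return [list(range(start, min(start + block_size, n_patterns)))
            for start in range(0, n_patterns, block_size)]

by_pattern = update_blocks(6, 1)  # learning by pattern: update after every pattern
by_block = update_blocks(6, 3)    # learning by block: update after each subset
by_epoch = update_blocks(6, 6)    # learning by epoch: update after the whole set
```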

6.5 Combination of Rough, Fuzzy and Neural Network

Solving a real-world problem requires processing data, and in almost every case
the data involve vagueness, uncertainty and
imprecision. Researchers have developed different kinds of approaches to deal with
this, such as fuzzy systems [1, 76-79], neural networks [80, 81] and rough set theory
[2, 82-84]. They have also indicated that each approach has its own advantages and
disadvantages. It can be concluded that a single approach is not enough to provide a
more flexible and robust information processing system.

Many researchers have already tried to integrate different computing paradigms, such as
neural networks, fuzzy systems and rough set theory, to generate more efficient
hybrid systems such as fuzzy-rough or rough-fuzzy systems [85, 86], neural-fuzzy or
fuzzy-neural systems [81-89] and rough-neural networks [4, 90-94].

Fuzzy-rough or rough-fuzzy hybridization combines the advantage of fuzzy set
theory in dealing with uncertainty and vagueness in data with the power of rough set
theory to generate dependency rules for designing the knowledge base of an intelligent
system. These types of hybridization are also used for feature selection and
classification problems.

Typically, a fuzzy neural network (FNN) embodies the advantages of both neural networks
and fuzzy systems. Fuzzy neural networks are designed to utilize a synthesis of the
computational power of neural networks and the uncertainty handling
capabilities of fuzzy logic. In other words, an FNN can be used to construct a
knowledge-based neural network, i.e. the knowledge of a human expert can be
incorporated into the neural network, so an FNN can be more suitable for the real-life
problem to be solved. But open questions remain. For example, in some situations
appropriate rules cannot be derived for a particular system. Of course, every input
dimension can be divided into several fuzzy subsets, and all the subsets for all input
dimensions can then be combined to construct the complete rule set. However, such an
FNN contains no expert knowledge, so it may not be suitable for constructing a
particular system at the very beginning.

The rough neural network (RNN) is an intelligent system [95]. RNNs are neural
networks based on rough sets. They combine the advantage of rough sets in processing
uncertain problems, namely attribute reduction without information loss followed by
rule extraction, with the strong fault tolerance, self-organization, massively
parallel processing and self-adaptation of neural networks. RNNs can therefore process
massive and uncertain information, and are widely applied in real life.

Still, none of these hybridizations gives satisfactory results in many situations.
Thus, to provide a more flexible and robust information processing system, the
hybridization of fuzzy sets, rough sets and neural networks is considered.

M. Banerjee et al. [9] presented a scheme of knowledge encoding using a
hybridization of rough sets, fuzzy sets and the multilayer perceptron (MLP). They
encoded knowledge in a fuzzy MLP using rough set-theoretic
concepts. This technique first extracts domain knowledge from the data set in the
form of rules. The syntax of these rules then automatically determines the appropriate
number of hidden nodes, and the dependency factors of the rules are used in the initial
weight encoding. After that, the network is refined during training. The proposed
knowledge encoding scheme was tested on the classification of speech
and synthetic data. The results showed the superiority of the system over the fuzzy
and conventional versions of the MLP.

Wang et al. [4] proposed a new generalized incremental rule extraction algorithm
(GIREA). GIREA is based on rough set theory and extracts rough
domain knowledge in terms of certain and possible rules. An FNN is then used to refine
the obtained certain and possible rules to produce the fuzzy rule set. The authors claimed
that their approach and experimental results demonstrate superiority in both rule
length and the number of fuzzy rules.

Junyang Zhao and Zhili Zhang [8] worked on fuzzy-rough neural network
hybridization for feature selection. They introduced a feature selection algorithm based on
a fuzzy-rough neural network (FRNN). They constructed a four-layer feedforward FRNN
based on a neural network implementation of the fuzzy-rough membership function of
fuzzy rough sets. In this paper the neural network is adopted to deal with noisy data for its
merits of strong approximation ability and good fault tolerance, and the fuzzy-rough
membership function is used to deal with the uncertainty of real-world data.

Sarkar and Yegnanarayana [7] proposed a hybrid method of fuzzy-rough neural
networks for the classification of data. They observed that fuzzy clusters generated from
real-world input features may have rough uncertainty, and proposed a fuzzy-rough set
based network that exploits fuzzy-rough membership functions to reduce this
problem. Similar research was performed by D. Zhang et al. [10, 11].
Their proposed four-layer Fuzzy-Rough Membership Function Neural Network was
tested on an infrared band combination image of the Norman Wells area in Canada and
on five vowel characters. Their classification results indicated that it has better
classification precision than the Radial Basis Function Neural Network and that it has
the same merit of quick learning.

Jing Hong [6] proposed an improved prediction model based on a Fuzzy-rough Set
Neural Network to predict the gas emission of a coal mine. In this work a
back-propagation neural network and fuzzy-rough sets are used to develop the model.

To provide more flexible and robust information processing systems, researchers have
also tried to hybridize more soft computing tools, statistical models and
mathematical methods.

P. Mitra et al. [96] proposed a method for designing a hybrid
system for classification and rule generation in the soft computing paradigm, integrating
rough set theory with a fuzzy MLP using an evolutionary algorithm.

W.C. Chen et al. [12] presented an innovative hybrid control algorithm that
integrates the indiscernibility capability of rough set theory and the
search capability of genetic algorithms with a conventional neural-fuzzy controller for
industrial wastewater treatment.

S. K. Pal et al. [5] proposed a methodology for evolving a rough-fuzzy multilayer
perceptron with modular concept using a genetic algorithm to obtain a structured
network suitable for both classification and rule extraction. The modular concept,
based on “divide and conquer” strategy, provides accelerated training and a compact
network suitable for generating a minimum number of rules with high certainty
values.

A. Ganivada et al. [97, 98] introduced a novel fuzzy rough granular neural network
(NFRGNN) and a fuzzy rough granular neural network (FRGNN) model, both based
on the multilayer perceptron using a back-propagation algorithm, for the fuzzy
classification of patterns.

6.6 Conclusion

In this chapter we studied different combinations of rough set, fuzzy
set and neural network hybridization for solving different types of problems. Comparisons
between the proposed hybridizations and the existing methods were also listed, and the
new or modified methods were found to perform better. The performance of this type of
hybridization is highly dependent on the knowledge domain. The result also
depends on the pre-assigned values of the parameters used in this type of hybridization.
In modelling an intelligent system, rule generation from the observed data set is the most
important task. In these studies fuzzy sets are used for the discretization of continuous
data values, rough sets are used for dependency rule generation, and finally those
rules are mapped to a fuzzy neural network for refinement. But the fuzzy membership
values generated at the time of discretization are ignored in these rough-fuzzy-neural
network hybridizations.

To provide a more flexible and robust information processing system in a specific
domain, a few modifications, or the addition of a genetic algorithm or other
mathematical tool, may produce better results.

Bibliography

[1] L.A. Zadeh, 1965, Fuzzy sets, Information and Control 8, pp. 338–353.

[2] Z. Pawlak, 1982, Rough sets, International Journal of Computer and Information
Sciences 11, pp. 341–356.

[3] McCulloch, Warren; Pitts, Walter, 1943, A Logical Calculus of Ideas Immanent
in Nervous Activity, Bulletin of Mathematical Biophysics 5, pp.115-133.

[4] Shi-tong Wang, Dong-jun Yu and Jing-yu Yang, 2003, Integrating rough set
theory and fuzzy neural network to discover fuzzy rules, Intelligent Data
Analysis 7, pp. 59–73.

[5] S. K. Pal, S. Mitra, P. Mitra, 2003, Rough-Fuzzy MLP: Modular Evolution, Rule
Generation, and Evaluation, IEEE Transactions on Knowledge and Data
Engineering, Vol. 15, No. 1, pp. 14-25.

[6] Jing Hong, 2011, An Improved Prediction Model based on Fuzzy-rough Set
Neural Network, International Journal of Computer Theory and Engineering,
Vol.3, No.1, pp.1793-8201.

[7] M. Sarkar, B. Yegnanarayana, 1998, Fuzzy-rough neural networks for vowel
classification, IEEE International Conference on Systems, Man, and
Cybernetics.


[8] Junyang Zhao and Zhili Zhang, 2011, Fuzzy Rough Neural Network and Its
Application to Feature Selection, International Journal of Fuzzy Systems, Vol.
13, No. 4.

[9] M. Banerjee, S. Mitra, and S. K. Pal, 1998, Rough fuzzy MLP: Knowledge
encoding and classification, IEEE Transactions on Neural Networks, vol. 9, no.
6, pp. 1203-1216.

[10] D. Zhang and Y. Wang, 2006, Fuzzy-rough Neural Network and Its Application
to Vowel Recognition, Control and Decision, vol. 21, no.2, pp. 221-224.

[11] D. Zhang, Y. Wang, H. Huang, 2007, Fuzzy-rough membership function neural
network and its application to pattern recognition, Proc. SPIE 6788, MIPPR
2007: Pattern Recognition and Computer Vision, 67882N.

[12] W.C. Chen, N-B. Chang, J-C. Chen, 2003, Rough set-based hybrid fuzzy-
neural controller design for industrial wastewater treatment, Water Research 37,
pp. 95–107.

[13] Hebb, D.O., 1949, The organization of behavior, New York: Wiley & Sons.

[14] Rosenblatt, Frank, 1957, The Perceptron--a perceiving and recognizing
automaton, Report 85-460-1, Cornell Aeronautical Laboratory.

[15] J. A. Anderson, E. Rosenfeld (Eds.), 2000, Talking Nets: An Oral History of
Neural Networks, MIT Press.

[16] Kohonen, Teuvo, 1982, Self-Organized Formation of Topologically Correct
Feature Maps, Biological Cybernetics 43 (1), pp. 59–69.

[17] Anderson, J.A. 1970, Two Models for Memory Organization using Interacting
Traces. Mathematical Biosciences 8: 137–160.

[18] Carpenter, G.A. & Grossberg, S. 2003, Adaptive Resonance Theory, In Michael
A. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks, Second
Edition (pp. 87-90). Cambridge, MA: MIT Press

[19] Grossberg, S. 1987, Competitive learning: From interactive activation to
adaptive resonance, Cognitive Science, 11, 23-63

[20] Carpenter, G.A. & Grossberg, S. 1987, ART 2: Self-organization of stable
category recognition codes for analog input patterns, Applied Optics, 26(23),
4919-4930

[21] Carpenter, G.A., Grossberg, S., & Rosen, D.B. 1991, ART 2-A: An adaptive
resonance algorithm for rapid category learning and recognition, Neural
Networks, 4, 493-504

[22] Carpenter, G.A. & Grossberg, S. 1990, ART 3: Hierarchical search using
chemical transmitters in self-organizing pattern recognition architectures, Neural
Networks, 3, 129-152

[23] Carpenter, G.A., Grossberg, S., & Rosen, D.B. 1991, Fuzzy ART: Fast stable
learning and categorization of analog patterns by an adaptive resonance system,
Neural Networks, 4, 759-771

[24] Carpenter, G.A., Grossberg, S., & Reynolds, J.H. 1991, ARTMAP: Supervised
real-time learning and classification of nonstationary data by a self-organizing
neural network, Neural Networks, 4, 565-588

[25] Carpenter, G.A., Grossberg, S., Markuzon, N., Reynolds, J.H., & Rosen, D.B.
1992, Fuzzy ARTMAP: A neural network architecture for incremental
supervised learning of analog multidimensional maps, IEEE Transactions on
Neural Networks, 3, 698-713

[26] Werbos, P.J. 1975, Beyond Regression: New Tools for Prediction and Analysis
in the Behavioral Science.

[27] Chen S., Billings S. A., 1992, Neural Networks for Nonlinear Dynamic System
Modeling and Identification, International Journal of Control, Vol. 56, No. 2, pp.
319-346.

[28] Funahashi K.-I., 1989, On the Approximate Realization of Continuous Mappings
by Neural Networks, Neural Networks, Vol. 2, pp. 183-192.

[29] Ge, S. S., 1996, Robust Adaptive Control of Robots Based on Static Neural
Networks, 13th Triennial World Congress of the IFAC, San Francisco, USA, pp.
139- 144

[30] Hornik K., 1991, Approximation Capabilities of Multilayer Feedforward
Networks, Neural Networks, Vol. 4, pp. 251-257.

[31] Williamson R. C., Helmke U., 1995, Existence and Uniqueness Results for
Neural Network Approximations, IEEE Transactions on Neural Networks, Vol.
6, No. 1, pp. 2-13.

[32] J. J. Hopfield, 1982, Neural networks and physical systems with emergent
collective computational abilities, Proc. Natl. Acad. Sci. USA, Vol. 79, pp.
2554-2558, Biophysics.

[33] Fausett L., 1994, Fundamentals of Neural Networks: Architectures, Algorithms
and Applications, Prentice Hall.

[34] Haykin S., 1999, Neural Networks: A Comprehensive Foundation, Prentice Hall.

[35] Baldi P.F., Hornik K., 1995, Learning in Linear Neural Networks: a Survey,
IEEE Transactions on Neural Networks, Vol. 6, No. 4, pp. 837-858.

[36] Kosko, B. 1988, Bidirectional Associative Memories, IEEE Transactions on
Systems, Man, and Cybernetics, Vol. 18, No. 1, pp. 49-60.

[37] Haykin, S. 1998, Neural Networks: A Comprehensive Foundation, Prentice
Hall, Second Edition.

[38] Mehrotra, K., Mohan, C. K., and Ranka, S. 1997, Elements of Artificial Neural
Networks. The MIT Press.

[39] Ruspini, E., Bonissone, P., and Pedrycz, W. 1998, Handbook of Fuzzy
Computation. Ed. Iop Pub/Inst of Physics.

[40] Cox, E. 1994, The Fuzzy Systems Handbook. AP Professional - New York.

[41] Lin, C.-T. and Lee, G. ,1996, Neural Fuzzy Systems: A Neuro-Fuzzy
Synergism to Intelligent Systems. Ed. Prentice Hall.

[42] Jang, J.-S. R., Sun, C.-T., and Mizutani 1997, Neuro-fuzzy and Soft Computing:
A Computational Approach to Learning and Machine Intelligence. Prentice
Hall.

[43] Alimi, I. 1997, A neuro-fuzzy approach to recognize Arabic handwritten
characters, International Conference on Neural Networks, pp. 1397-1400.

[44] Baraldi, A. and Blonda, P. 1998, Fuzzy neural networks for pattern recognition,
Tech Report, IMGA-CNR, Italy.


[45] Meneganti, M., Saviello, F., and Tagliaferri, R. 1998, Fuzzy neural networks for
classification and detection of anomalies, IEEE Transactions on Neural
Networks, 9(2), pp. 848-861.

[46] Wang, L., Mendel, J. 1992, Back-propagation fuzzy system as nonlinear
dynamic system identifiers, Proceedings of IEEE International Conference on
Fuzzy Systems, pp. 1409-1416.

[47] Nomura, H., Hayashi, I., and Wakami, N. 1992, A self-tuning method of fuzzy
control by descent method, Proceedings of IEEE International Conference on
Fuzzy Systems, pages 203-210.

[48] Nauck, D. 1994, A fuzzy perceptron as a generic model for neuro-fuzzy
approaches, Proceedings of Fuzzy-Systeme'94, 2nd GI-Workshop.

[49] Shi, Y., Mizumoto, M. 2000, A new approach of neuro-fuzzy learning
algorithm for tuning fuzzy rules, Fuzzy Sets and Systems, 112(1), pp. 99-116.

[50] Shi, Y., Mizumoto, M. 2000, Some considerations on conventional neuro-fuzzy
learning algorithms by gradient descent method, Fuzzy Sets and Systems, 112(1),
pp. 51-63.

[51] Yager, R., Filev, D. 1994, Generation of fuzzy rules by mountain clustering,
Journal of Intelligent Fuzzy Systems, 2(3), pp.209-219.

[52] Ichihashi, H., Türkşen, I. 1993, A neuro-fuzzy approach to data analysis of
pairwise comparisons, Int. Journal of Approximate Reasoning, Vol. 9(3), pp.
227-248.

[53] Dagher, I., Georgiopoulos, M., Heileman, G., and Bebis, G. 1998, Fuzzy artvar:
An improved fuzzy artmap algorithm. International Joint Conference on Neural
Networks (IJCNN-98), 3, pp.1688-1693.

[54] S. K. Pal and S. Mitra, 1992, Multi-layer perceptron, fuzzy sets and
classification, IEEE Transactions on Neural Networks, vol. 3, pp. 683-697.

[55] Carpenter, G., Markuzon, N. 1998, Artmap-IC and medical diagnosis: instance
counting and inconsistent cases. Neural Networks, 11, pp. 323-336.


[56] Canuto, A., Howells, G., Fairhurst, M. 1999, Fuzzy multilayer perceptron for
binary pattern recognition, Seventh International Conference on Image
Processing and Its Application, 1, pp. 260-264.

[57] Canuto, A., Howells, G., Fairhurst, M. 1999, Repart: A modified fuzzy artmap
for pattern recognition, 6th Fuzzy Days, pp. 159-168.

[58] Carpenter, G., Grossberg, S., Markunzo, M., Reynolds, J. H., Rosen, D. B.
1991, Fuzzy art: Fast stable learning and categorization of analog patterns by an
adaptive resonance system, Neural Networks, 4, pp. 759-771.

[59] Carpenter, G., Grossberg, S., Markunzo, M., Reynolds, J. H., Rosen, D. B. 1992,
Fuzzy artmap: A neural network architecture for incremental supervised
learning of analog multidimensional maps, IEEE Transactions on Neural
Networks, 3, pp.698-713.

[60] Dimla Sr., D. and Lister, P. 2000, On-line metal cutting tool condition
monitoring. II: tool-state classification using multi-layer perceptron neural
networks, International Journal of Machine Tools and Manufacture, 40(5),
pp. 769-781.

[61] Jeong, J.-H., Kim, H., Kim, D.-S., Lee, S.-Y. 2000, Speaker adaptation based on
judge neural networks for real world implementations of voice command
systems, Information Sciences, 123(1-2), pp. 13-24.

[62] Zhang, Z., Lyons, M., Schuster, M., Akamatsu, S. 1998, Comparison between
geometry-based and Gabor-wavelets-based facial expression recognition using
multi-layer perceptron, Proceedings of the 3rd IEEE International Conference
on Automatic Face and Gesture Recognition, Japan, pp. 454-459.

[63] Güler, E., Sankur, B., Kahya, Y., Raudys, S. 1998, Visual classification of
medical data using MLP mapping, Computers in Biology and Medicine, 28(3),
pp. 275-287.

[64] Sheppard, D., McPhee, D., Darke, C., Shrethra, B., Moore, R., Jurewitz, A.,
Gray, A. 1999, Predicting cytomegalovirus disease after renal transplantation:
an artificial neural network approach. International Journal of Medical
Informatics, 54(1):55-76.


[65] Indro, D., Jiang, C., Patuwo, B., Zhang, G. 1999, Predicting mutual fund
performance using artificial neural networks. Omega, 27(3), pp.373-380.

[66] Rumelhart, D., Hinton, G., Williams, R. 1986, Learning representations by
back-propagating errors, Nature, 323, pp. 533-536.

[67] Keller, J. M., Hunt, D. J., 1985, Incorporating fuzzy membership functions into
perceptron algorithm, IEEE Transactions on Pattern Analysis and Machine
Intelligence, 7(6), pp.693-699.

[68] Stoeva, S., Nikov, A. 2000, A fuzzy backpropagation algorithm, Fuzzy Sets and
Systems, 112(1), pp.27-39.

[69] Sural, S., Das, P. 1999, An mlp using hough transform based fuzzy feature
extraction for bengali script recognition, Pattern Recognition Letters, 20(8),
pp.771-782.

[70] Minsky, M. L., Papert, S. A. 1969, Perceptrons, The MIT Press, Cambridge.

[71] Parker, D. B., 1985, Learning-logic: Casting the cortex of the human brain in
silicon, Technical report, TR-47, MIT.

[72] Cun, Y. L. 1998, Learning Process in an Asymmetric Threshold Network, page
234. (Ed.) E Bienenstock, F Fogelman Soulie, G Weisbuch, NATO ASI Ser, F
20, Springer, Berlin, Heidelberg.

[73] Cho, K., Wang, B. 1996, Radial basis function based adaptive fuzzy systems
and their application to system identification and prediction, Fuzzy Sets and
Systems, 83(3), pp. 325-339.

[74] Hwang, R. C., Huang, H.-C., Chen, Y.-J., Hsier, J.-G., Chao, H. 1997, Adaptive
power signal prediction by non-fixed neural network model with modified fuzzy
back-propagation learning algorithm. Trends in Information Systems,
Engineering and Wireless Multimedia Communications; Proceedings of the
International Conference on Information, Communications and Signal
Processing, 2: pp. 689-692.

[75] Torresen, J. 1997, The convergence of back-propagation trained neural networks
for various weight update frequencies, Int. Journal of Neural Systems, 8(3),
pp. 263-277.

[76] L.A. Zadeh, 1983, A computational approach to fuzzy quantifiers in natural
languages, Computers and Mathematics with Applications, 9, pp. 149–184.

[77] L. X. Wang and J. M. Mendel, 1992, Generating Fuzzy Rules by Learning from
Examples, IEEE Trans. Systems, Man, and Cybernetics, vol. 22, pp. 1414–1427.

[78] R. Rovatti and R. Guerrieri, 1996, Fuzzy Sets of Rules for System Identification,
IEEE Trans. Fuzzy Syst., vol. 4, pp. 89–102.

[79] Wlodzislaw Duch, Rafal Adamczak and Krzysztof Grabczewski, 2000, A new
methodology of extraction, optimization and application of crisp and fuzzy
logical rules, IEEE Transactions on Neural Networks, vol. 11, no. 2, pp. 1-31.

[80] B. M. Happel and J. J. Murre, 1994, Design and Evolution of Modular Neural
Network Architectures, Neural Networks, vol. 7, pp. 985-1004.

[81] S.T. Wang, 1998, Fuzzy system and Fuzzy Neural Networks, Shanghai Science
and Technology Press, Edition 1.

[82] Z. Pawlak, 1991, Rough Sets: Theoretical Aspects of Reasoning About Data,
Dordrecht, Kluwer, The Netherlands.

[83] S. Tsumoto, H. Tanaka, 1995, PRIMEROSE: Probabilistic rule induction
method based on rough sets and resampling methods, Computational
Intelligence 11, pp. 389–405.

[84] Shusaku Tsumoto, 2004, Mining diagnostic rules from clinical databases using
rough sets and medical diagnostic model, Information Sciences 162, pp. 65–80.

[85] S. K. Pal and A. Skowron, Eds., 1999, Rough Fuzzy Hybridization: A New
Trend in Decision Making, Singapore: Springer-Verlag.

[86] Sankar K. Pal, Pabitra Mitra, 2001, Case Generation: A Rough-fuzzy Approach,
In: Proc. Intl. Conf. Case Based Reasoning (ICCBR2001), Vancouver, Canada.

[87] Y. Hayashi, 1992, Neural expert system using fuzzy teaching input and its
application to medical diagnosis, in: Proceedings of 2nd International
Conference on Fuzzy Logic and Neural Networks (Iizuka), pp. 989-993.

[88] S. Mitra, 1994, Fuzzy MLP based expert system for medical diagnosis, Fuzzy
Sets and Systems 65, pp.285-296


[89] S. Mitra and Y. Hayashi, 2000, Neuro-fuzzy Rule Generation: Survey in Soft
Computing Framework, IEEE Trans. On Neural Network, vol. 11, no. 3, pp.
748–768.

[90] R. Yasdi, 1995, Combining Rough Sets Learning and Neural Learning Method
to Deal with Uncertain and Imprecise Information, Neurocomputing, vol. 7, pp.
61–84.

[91] M. S. Szczuka, 1998, Rough sets and artificial neural networks, In: Rough Sets in
Knowledge Discovery (2): Applications, Case Studies and Software Systems, pp.
449-470.

[92] B. S. Ahn, S. S. Cho, and C. Y. Kim, 2000, The Integrated Methodology of
Rough Set Theory and Artificial Neural Network for Business Failure
Prediction, Expert Systems with Applications, vol. 18, pp. 65–74.

[93] R. W. Swiniarski and L. Hargis, 2001, Rough Sets as a Front End of Neural
Networks Texture Classifiers, Neurocomputing, vol. 36, pp. 85–102.

[94] Dongbo Zhang, 2007, Integrated methods of rough sets and neural network and
their applications in pattern recognition, Doctoral dissertation, Hunan University.

[95] S. Ding, J. Chen, X. Xu, J. Li, 2011, Rough Neural Networks: A Review,
Journal of Computational Information Systems 7: 7, pp. 2338-2346.

[96] P Mitra, S Mitra, S. K. Pal, 1999, Modular Rough Fuzzy MLP: Evolutionary
Design, New Directions in Rough Sets, Data Mining, and Granular-Soft
Computing, Lecture Notes in Computer Science Volume 1711, pp 128-136.

[97] A. Ganivada, S. K. Pal, 2011, A Novel Fuzzy Rough Granular Neural Network
for Classification, International Journal of Computational Intelligence Systems,
Vol. 4, No. 5, pp. 1042-105.

[98] A. Ganivada, S. Dutta, S. K. Pal, 2011, Fuzzy rough granular neural networks,
fuzzy granules, and classification, Theoretical Computer Science, 412, pp.5834-
5853

Intelligent System Based on Rough‐Fuzzy‐Neural Network

Chapter 7
Intelligent System Based on Rough-Fuzzy-Neural Network
An intelligent system based on a rough-fuzzy-neural network is a hybridization of
rough sets, fuzzy sets and neural networks, the main components of soft computing.
Soft computing may be considered a science directed towards capturing the human
ability to deal with uncertainty and imprecision in real time. Emerging from
conventional AI techniques, soft computing has been developed not as one technique
but as a synergistic collection of several techniques, with evolutionary, neural,
fuzzy and rough computing being the prime methodologies. Rough-fuzzy-neural network
hybridization combines three strengths: from fuzzy sets, the handling of uncertainty
and vagueness in data and human-like reasoning through If-Then rules; from rough
sets, the handling of roughness in data and the generation of dependency rules; and
from neural networks, automated learning and connectionist structure representation.

7.1 Introduction
Rough-fuzzy-neural network hybridization has been employed in the past few years, in
many different combinations, to solve a variety of problems such as rule generation
[1, 2], prediction [3], classification [4], feature selection [5], knowledge encoding
[6], pattern recognition [7, 8], industrial wastewater treatment [9] and so on. It is
a soft computing tool that utilizes the advantages of all the individual methods as
well as of the pairwise hybrid methods. By combining the human-like reasoning process
of fuzzy sets [10] and the rule generation techniques of rough sets [11] with the
automated learning of neural networks [12], it has established itself as a powerful
tool in the soft computing area. It is able to handle uncertain, vague and imprecise
data, to remove unimportant attributes from the experimental data set, and to present
the result in a connectionist structure. This hybridization can be used efficiently
for representing the knowledge base of an intelligent system, for knowledge discovery
from a database or experimental data set, and for generating the dependency rules
that represent the knowledge base of a rule-based intelligent system.

A modified method of rough-fuzzy-neural network hybridization, which generates fuzzy
rules for use as the knowledge base of an intelligent system, is proposed in this
chapter. The hybridization is constructed in two consecutive steps. In the first
step, rough-fuzzy hybridization [13] is performed to generate fuzzy rules: fuzzy sets
are used to fuzzify the continuous data by introducing linguistic patterns and
membership values, and rough sets are used for rule generation. Each rule is
associated with a certainty factor calculated from both the fuzzy membership value
and the rough membership value. In the second step, these rules are mapped to a
fuzzy-neural network (FNN) [14, 15] that refines the rules and produces the final
fuzzy rule set [1, 2].

The organization of this chapter is as follows: Section 7.2 presents the procedure of
rough-fuzzy rule generation. Section 7.3 describes the process of mapping the rules
into a fuzzy-neural network. Section 7.4 shows the application of the modified
framework to medical data sets on diabetes mellitus, and Section 7.5 summarizes the
key findings of this chapter and possible areas of future work.

7.2 Rough-Fuzzy Rule Generation

Rough set theory [11, 16] is used to generate dependency rules from a table of data.
The basic concepts of rough sets used in this chapter are outlined below.

7.2.1 Basic Concept of Rough Set


An information system S is defined as S = <U, A>, where U denotes the domain of
discourse, formally the universe, and A is a non-empty finite set of attributes. Let
A = C ∪ D, where C is the non-empty finite set of condition attributes and D is the
non-empty finite set of decision attributes. An attribute a ∈ A can be regarded as a
function $a: U \to V_a$, where $V_a$ is the value set of a.

An information system may be viewed as an attribute-value table, known as a decision
table, where each row is labeled by an object ∈ U and each column by an attribute ∈ A.

For all B ⊆ C, the equivalence relation $I_B$ on U is defined as

$$I_B = \{(x, y) \in U \times U : \forall a \in B,\ a(x) = a(y)\} \tag{7.1}$$

$[x]_B$ denotes the equivalence class of object x ∈ U relative to $I_B$ and is defined as

$$[x]_B = \{y \in U : y \, I_B \, x\} \tag{7.2}$$

$\underline{B}X$ and $\overline{B}X$ denote the B-lower and B-upper approximations of
X ⊆ U in S, where B ⊆ C, and are defined as

$$\underline{B}X = \{x \in U : [x]_B \subseteq X\} \tag{7.3}$$

$$\overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\} \tag{7.4}$$

A set X ⊆ U is B-exact if $\underline{B}X = \overline{B}X$ and B-rough if
$\underline{B}X \neq \overline{B}X$.
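
The lower and upper approximations can be illustrated with a minimal Python sketch. The toy decision table, the attribute names and the helper functions below are hypothetical and are not taken from the thesis data; they only show how the equivalence classes of the chosen attributes decide which objects certainly or possibly belong to a target set.

```python
from collections import defaultdict


def equivalence_classes(table, attrs):
    """Group object ids that share the same values on every attribute in attrs."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())


def approximations(table, attrs, target):
    """Return the (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for cls in equivalence_classes(table, attrs):
        if cls <= target:      # class lies entirely inside the target set
            lower |= cls
        if cls & target:       # class overlaps the target set
            upper |= cls
    return lower, upper


# Hypothetical toy table: object id -> condition attribute values.
table = {
    1: {"sugar": "high", "insulin": "yes"},
    2: {"sugar": "high", "insulin": "yes"},
    3: {"sugar": "low", "insulin": "no"},
    4: {"sugar": "low", "insulin": "yes"},
}
X = {1, 3}
lower, upper = approximations(table, ["sugar", "insulin"], X)
print(lower, upper)  # objects 1 and 2 are indiscernible, so X is rough
```

Here objects 1 and 2 cannot be told apart on the chosen attributes, so only object 3 is certainly in X, while objects 1, 2 and 3 are possibly in X.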

7.2.2 Fuzzification of Data using Fuzzy Set

The table of data is fuzzified using fuzzy set theory [10, 17]. Fuzzy set theory
introduced the concept of the degree of membership of an element in a set. In a
classical set, an element belongs either fully (membership 1) or not at all
(membership 0) to the set. The degree of membership allows an element to lie in a set
with a membership value anywhere in the range [0, 1]. A fuzzy set can be defined as a
set of ordered pairs $\tilde{A} = \{(x, \mu_{\tilde{A}}(x)) : x \in X\}$. The function
$\mu_{\tilde{A}}(x)$ is called the membership function of $\tilde{A}$, mapping each
element of the base set (universe) X to a membership degree in the range [0, 1]. The
base set may be discrete or continuous.

In rough-fuzzy hybridization, fuzzification of the data is performed to represent the
linguistic patterns of the continuous data. Each linguistic pattern has a membership
degree in the range [0, 1]. The type of membership function used depends on the base
set. If the base set contains many values, or if it is continuous, then a parametric
representation, which can be adapted by changing its parameters, is appropriate. Most
membership functions of this type are triangular or trapezoidal functions, which are
defined by three and four parameters respectively.

Some applications require continuously differentiable curves, and therefore smooth
transitions, which triangular or trapezoidal functions cannot provide. In those cases
the normalized Gaussian function, the difference of two sigmoidal functions, the
generalized bell function and, in some applications, π functions are used [18].

In this framework the fuzzification operation is performed over the continuous
attributes by introducing linguistic variables such as low, medium and high, and
constructing the membership value (MV) of each linguistic variable. The MV is
calculated using a triangular membership function. By definition, the MV must lie in
[0, 1]. The other attribute values are assigned an MV of 1, that is, certain
membership. Finally, those parameters whose MV is less than 0.25 are discarded.
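
The fuzzification step can be sketched as follows. This is a minimal illustration only: the linguistic terms and their break-points for a blood sugar attribute are hypothetical values chosen for the example, not the ones used in the thesis; the 0.25 cut-off is the threshold stated above.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical break-points (a, b, c) for each linguistic term.
TERMS = {
    "low": (40, 70, 100),
    "medium": (80, 110, 140),
    "high": (120, 180, 240),
}
THRESHOLD = 0.25  # memberships below this value are discarded


def fuzzify(x, terms=TERMS, threshold=THRESHOLD):
    """Map a continuous value to {linguistic term: MV}, dropping weak memberships."""
    memberships = {name: triangular(x, *abc) for name, abc in terms.items()}
    return {name: mv for name, mv in memberships.items() if mv >= threshold}


print(fuzzify(95))   # "low" has MV 1/6 and is dropped; "medium" is kept
print(fuzzify(130))  # only "medium" survives the threshold
```

A value can survive the threshold under more than one linguistic term when the triangles overlap strongly; in that case the object contributes one (term, MV) pair per surviving term.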

7.2.3 Rule Generation using Rough Set Theory

To generate rules, the decision matrix approach [1, 13] is used in this modified
method. The decision matrix is a generalization of rough set theory from which
reducts and decision rules can be calculated. In this approach, it is first checked
whether the information system is consistent. An information system is said to be
consistent if no two objects have the same condition attribute values but different
decision attribute values. Conversely, an information system is inconsistent if there
exist two objects whose condition attribute values are the same and whose decision
attribute values are different. That is, if for any two objects i and j

$$a_i(C) = a_j(C) \ \text{and} \ a_i(D) \neq a_j(D), \quad i \neq j \tag{7.5}$$

then the information system is inconsistent.
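
The consistency test of (7.5) can be sketched in a few lines of Python. The rows and attribute names are hypothetical; the function simply remembers the decision seen for each combination of condition values and reports a conflict as soon as a second, different decision appears.

```python
def is_consistent(rows, condition_attrs, decision_attr):
    """Return False if two rows agree on all conditions but differ in decision."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        decision = row[decision_attr]
        # setdefault stores the first decision seen for this key and returns it.
        if seen.setdefault(key, decision) != decision:
            return False
    return True


# Hypothetical rows: the first two share conditions but not the decision.
rows = [
    {"sugar": "high", "insulin": "yes", "LJM": "yes"},
    {"sugar": "high", "insulin": "yes", "LJM": "no"},
    {"sugar": "low", "insulin": "no", "LJM": "no"},
]
print(is_consistent(rows, ["sugar", "insulin"], "LJM"))      # inconsistent table
print(is_consistent(rows[1:], ["sugar", "insulin"], "LJM"))  # consistent subset
```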

Let the domain of discourse U of the information system S be divided into k classes
(c1, c2, …, ck) by the equivalence relation defined on D. For any class cp ∈ (c1,
c2, …, ck), the objects of U that belong to cp are numbered by subscripts i (i = 1,
2, …, m) and those that do not belong to cp by subscripts j (j = 1, 2, …, n). The
decision matrix $M^p$ of the information system S for the class cp is defined as an
m×n matrix whose elements are sets of triples (attribute-name, attribute-value,
membership-value):

$$M_{ij}^p = \{(a, a(i), \mu(i)) : a(i) \neq a(j)\} \quad \forall\, i = 1, 2, \dots, m \text{ and } j = 1, 2, \dots, n \tag{7.6}$$

where a is the attribute name, a(i) is the attribute value and µ(i) is the MV of the
attribute value.
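
A minimal sketch of how the decision matrix of (7.6) might be built is given below. The object layout (a dict holding `attrs` and `cls`) and the sample values are assumptions made for the illustration and are not the thesis's data structures.

```python
def decision_matrix(objects, target_class):
    """Build the decision matrix for `target_class`.

    Each object is a dict with
      "attrs": {attribute-name: (attribute-value, membership-value)}
      "cls":   decision class label
    Entry [i][j] holds the triples of object i (inside the class) whose
    attribute value differs from that of object j (outside the class).
    """
    inside = [o for o in objects if o["cls"] == target_class]
    outside = [o for o in objects if o["cls"] != target_class]
    return [
        [
            {
                (name, value, mv)
                for name, (value, mv) in oi["attrs"].items()
                if value != oj["attrs"][name][0]
            }
            for oj in outside
        ]
        for oi in inside
    ]


# Hypothetical fuzzified objects.
objects = [
    {"attrs": {"sugar": ("high", 0.8), "insulin": ("yes", 1.0)}, "cls": "LJM"},
    {"attrs": {"sugar": ("low", 0.6), "insulin": ("yes", 1.0)}, "cls": "none"},
    {"attrs": {"sugar": ("low", 0.9), "insulin": ("no", 1.0)}, "cls": "none"},
]
M = decision_matrix(objects, "LJM")
print(M)
```

With one object inside the class and two outside, the result is a 1×2 matrix; an empty cell would indicate two objects that are indiscernible on the condition attributes.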

7.2.3.1 For consistent information system

The minimum-length decision rule for any object i (i = 1, 2, …, m) belonging to
class cp ∈ (c1, c2, …, ck) can be obtained as

$$|B_i^p| = \bigwedge_{j} \bigvee M_{ij}^p \tag{7.7}$$

where ⋀ and ⋁ are the conjunction and disjunction operations respectively.
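
One way to read the conjunction of disjunctions is as a search for the smallest set of attribute-value pairs that appears in every cell of one row of the decision matrix. The brute-force sketch below follows that reading; it is an assumption about how the expression is simplified, not the thesis's own procedure, and the sample cells are hypothetical.

```python
from itertools import combinations


def minimal_rule(matrix_row):
    """Return a smallest set of (attribute, value) pairs hitting every cell.

    Returns None when no such set exists, for example when a cell is empty
    because two objects are indiscernible.
    """
    cells = list(matrix_row)
    candidates = sorted(set().union(*cells))
    for size in range(1, len(candidates) + 1):
        for combo in combinations(candidates, size):
            chosen = set(combo)
            if all(cell & chosen for cell in cells):
                return chosen
    return None


# Hypothetical row of a decision matrix for one object of the class.
row = [
    {("sugar", "high"), ("insulin", "yes")},
    {("sugar", "high")},
]
rule = minimal_rule(row)
print(rule)  # a single condition is enough to separate the object
```

The exhaustive search is exponential in the number of candidate conditions, which is acceptable for a table with nine attributes but would need a smarter reduction for wide tables.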

Next, the certainty factor (CF) of each rule is calculated by

[Equation (7.8) is illegible in the source.]

where k is the number of attributes present in the ith rule and µij is the MV of the
attribute value.

The decision rule is calculated as

$$R_j = |B_i^p| \quad \forall\, i = 1, 2, \dots, m;\ c_p \in (c_1, c_2, \dots, c_k) \tag{7.9}$$

where j indexes the distinct rules.

7.2.3.2 For inconsistent information system

First, the B-lower and B-upper approximations are found for each class cp ∈ (c1,
c2, …, ck). Rules generated from the B-lower approximation are certain rules, and
rules generated from the B-upper approximation are possible rules.

The certain minimum-length decision rule for any object i (i = 1, 2, …, m) belonging
to class cp ∈ (c1, c2, …, ck) can be obtained as

$$|B_i^p| = \bigwedge_{j} \bigvee M_{ij}^p \tag{7.10}$$

where ⋀ and ⋁ are the conjunction and disjunction operations respectively.

Next, the CF of each rule is calculated as

[Equation (7.11) is illegible in the source.]

where k is the number of attributes present in the ith rule and µij is the MV of the
attribute value.

The decision rule is calculated as

$$R_j = |B_i^p|_{\text{certain}} \quad \forall\, i = 1, 2, \dots, m;\ c_p \in (c_1, c_2, \dots, c_k) \tag{7.12}$$

where j indexes the distinct rules.

Then the possible minimum-length decision rule for any object i (i = 1, 2, …, m)
belonging to class cp ∈ (c1, c2, …, ck) can be obtained as

$$|B_i^p| = \bigwedge_{j} \bigvee M_{ij}^p \tag{7.13}$$

where ⋀ and ⋁ are the conjunction and disjunction operations respectively.

For a possible rule, the belief function of the ith rule is defined as follows:

[Equation (7.14) is illegible in the source.]

where cp ∈ (c1, c2, …, ck) and card(.) denotes the cardinality of a set. The CF is
then calculated as

[Equation (7.15) is illegible in the source.]

where k is the number of attributes present in the ith rule and µij is the MV of the
attribute value.

The decision rule is calculated as

$$R_j = |B_i^p|_{\text{possible}} \quad \forall\, i = 1, 2, \dots, m;\ c_p \in (c_1, c_2, \dots, c_k) \tag{7.16}$$

where j indexes the distinct rules.

7.3 Mapping rules into the FNN

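Once the rules have been extracted from the experimental data set, they are mapped
into a fuzzy-neural network [1, 2, 14].

Consider that N rules are generated in the following form:

$$R_k: \text{IF } x_1 \text{ is } A_1^k \text{ AND } \dots \text{ AND } x_n \text{ is } A_n^k \text{ THEN } y \text{ is } B_k$$

where $R_k$ is the kth rule, $1 \le k \le N$, $\{x_i\}_{i=1,\dots,n}$ are the input
variables, y is the output variable, $A_i^k$ are fuzzy sets defined on the input
variables, and $B_k$ is the fuzzy singleton defined on the output variable.

The fuzzy sets $A_i^k$ are defined by bell-shaped (Gaussian) membership functions
[14, 15]

$$\mu_{ik}(x_i) = \exp\left(-\frac{(x_i - c_{ik})^2}{\sigma_{ik}^2}\right) \tag{7.17}$$

where $c_{ik}$ and $\sigma_{ik}$ are the center and the width of the Gaussian function,
respectively.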

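By adopting singleton fuzzification, product rule inference and center average
defuzzification, the final output of this fuzzy system can be written as [14, 15]:

$$y = \frac{\sum_{k=1}^{N} B_k \prod_{i=1}^{n} \mu_{ik}(x_i)}{\sum_{k=1}^{N} \prod_{i=1}^{n} \mu_{ik}(x_i)} \tag{7.18}$$

where $\mu_{ik}(x_i)$ is the membership value of input $x_i$ in the ith fuzzy set of
the kth rule and $B_k$ is the consequent singleton of the kth rule.

The FNN consists of four layers: the input layer, the fuzzification layer, the rule
inference layer and the output layer, as shown in Figure 7.1.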

[Figure 7.1: a four-layer network in which the inputs x1, …, xn pass through the input
layer, the fuzzification layer and the rule inference layer to the output layer.]

Figure 7.1: The FNN implementation of Rough-fuzzy rules

Input layer: Neurons in this layer receive the input values (x1, x2, …, xn). These
input values are then transferred to the fuzzification layer to be fuzzified.

Fuzzification layer: In this layer the neurons (nodes) are arranged into N groups,
each group representing the IF-part of a rough-fuzzy rule. Each neuron ik receives
the input variable $x_i$ and calculates the membership value $\mu_{ik}(x_i)$, which
identifies the degree to which the input value $x_i$ belongs to the fuzzy set
$A_i^k$. Thus, the output of neuron ik is in the range [0, 1] and is calculated by
the following function:

$$\mu_{ik}(x_i) = \exp\left(-\frac{(x_i - c_{ik})^2}{\sigma_{ik}^2}\right) \tag{7.19}$$

where $c_{ik}$ and $\sigma_{ik}$ are the center and the width of the Gaussian function.

Rule inference layer: The number of neurons (nodes) in this layer equals the number
of fuzzy rules, so each neuron in this layer represents one fuzzy rule. Each neuron
is connected by n fixed links to the input term neurons representing the IF-part of
the fuzzy rule. The kth neuron performs the ∧ operation for matching the kth rule
using the Larsen product operator; the output of this node is:

$$\mu_k(\mathbf{x}) = \prod_{i=1}^{n} \mu_{ik}(x_i) \tag{7.20}$$

where $\mu_{ik}(x_i)$ is the output of fuzzification neuron ik.

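Output layer: In this layer the neurons (nodes) represent the output variables of the
system. Each node acts as a defuzzifier and computes the output value for an input
vector $\mathbf{x} = (x_1, \dots, x_n)$ according to equation (7.18):

$$y = \frac{\sum_{k=1}^{N} \mu_k(\mathbf{x})\, B_k}{\sum_{k=1}^{N} \mu_k(\mathbf{x})} \tag{7.21}$$

where $\mu_k(\mathbf{x})$ is the output of the kth rule inference neuron and $B_k$ is
the consequent singleton of the kth rule.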
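
The four layers can be traced with a short forward-pass sketch in plain Python. It assumes Gaussian memberships of the form exp(-(x - c)^2 / sigma^2), product inference and center-average defuzzification with a single output; the function names and the sample parameters are hypothetical and chosen only so that the result is easy to check by hand.

```python
import math


def gaussian(x, center, width):
    """Fuzzification neuron: Gaussian membership of x (assumed form)."""
    return math.exp(-((x - center) ** 2) / width ** 2)


def fnn_forward(x, centers, widths, consequents):
    """Forward pass of the four-layer network.

    x:           list of n input values
    centers:     N rules x n Gaussian centers
    widths:      N rules x n Gaussian widths
    consequents: N consequent singletons
    Returns (output, rule firing strengths).
    """
    # Rule inference layer: product of the memberships of each rule.
    strengths = [
        math.prod(gaussian(xi, c, s) for xi, c, s in zip(x, rule_c, rule_s))
        for rule_c, rule_s in zip(centers, widths)
    ]
    # Output layer: center-average defuzzification.
    total = sum(strengths)
    output = sum(w * b for w, b in zip(strengths, consequents)) / total
    return output, strengths


# Two hypothetical rules placed symmetrically around the input.
y, strengths = fnn_forward(
    x=[1.0],
    centers=[[0.0], [2.0]],
    widths=[[1.0], [1.0]],
    consequents=[0.0, 1.0],
)
print(y, strengths)  # both rules fire equally, so y is the mean of 0 and 1
```

Because the input lies midway between the two rule centers, both rules fire with the same strength and the output is the plain average of their consequents.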

This FNN refines a set of fuzzy rules in its topology, and processes information in a
way that matches the fuzzy reasoning scheme. The weights of the network correspond
to the Gaussian membership function parameters $\{c_{ik}\}$, $\{\sigma_{ik}\}$ and to
the consequent singletons $\{B_k\}$. In other words, each neuron k is associated with
two premise weight vectors $(c_{1k}, \dots, c_{nk})$ and $(\sigma_{1k}, \dots,
\sigma_{nk})$ and one consequent weight $B_k$.

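The learning process here is a supervised learning process. A standard gradient
descent method is used to minimize the overall error function [14].

[Equations (7.22) and (7.23), which define the overall error function and the term it
is built from, are illegible in the source.]

The error is defined in terms of the jth output of the FNN for the current input
sample and the corresponding desired output.

The general update formula for the generic fitness α is $\Delta\alpha = -\eta\,
\partial E / \partial \alpha$, where E is the error function and η is the learning
rate.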

Thus the weights are updated by the following update quantities [14, 15]:

[Equation (7.24), which gives the update quantities for the network weights together
with their auxiliary terms, is illegible in the source.]

where j, k and ik denote neurons of the output, rule inference and fuzzification
layers respectively.
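
A single gradient-descent update can be sketched as follows. The sketch makes several assumptions that may differ from the thesis's own formulas: a single output, a half squared error for one training sample, and Gaussian memberships of the form exp(-(x - c)^2 / sigma^2). The gradients are derived by the chain rule under those assumptions, and all names and sample values are hypothetical.

```python
import math


def forward(x, C, S, B):
    """Return (output, firing strengths, their sum) for one input vector."""
    mu = [
        math.prod(math.exp(-((xi - c) ** 2) / s ** 2) for xi, c, s in zip(x, ck, sk))
        for ck, sk in zip(C, S)
    ]
    total = sum(mu)
    y = sum(m * b for m, b in zip(mu, B)) / total
    return y, mu, total


def gradient_step(x, target, C, S, B, eta=0.1):
    """One update of centers C, widths S and consequents B.

    Assumed error: E = 0.5 * (y - target) ** 2 for a single sample.
    """
    y, mu, total = forward(x, C, S, B)
    err = y - target
    new_C = [row[:] for row in C]
    new_S = [row[:] for row in S]
    new_B = B[:]
    for k in range(len(B)):
        # dE/dB_k = err * mu_k / total
        new_B[k] = B[k] - eta * err * mu[k] / total
        # dE/dmu_k = err * (B_k - y) / total
        dE_dmu = err * (B[k] - y) / total
        for i, xi in enumerate(x):
            d = xi - C[k][i]
            # dmu_k/dc_ik = mu_k * 2 d / sigma^2 ; dmu_k/dsigma_ik = mu_k * 2 d^2 / sigma^3
            new_C[k][i] = C[k][i] - eta * dE_dmu * mu[k] * 2 * d / S[k][i] ** 2
            new_S[k][i] = S[k][i] - eta * dE_dmu * mu[k] * 2 * d ** 2 / S[k][i] ** 3
    return new_C, new_S, new_B


# Hypothetical two-rule network and one training sample.
x, target = [1.0, 2.0], 1.0
C = [[0.0, 1.0], [2.0, 3.0]]
S = [[1.0, 1.0], [1.0, 1.0]]
B = [0.0, 0.5]
error_before = 0.5 * (forward(x, C, S, B)[0] - target) ** 2
C, S, B = gradient_step(x, target, C, S, B)
error_after = 0.5 * (forward(x, C, S, B)[0] - target) ** 2
print(error_before, error_after)
```

In this example the rule with the larger consequent is pulled towards the sample and the other is pushed away, so the error for the sample decreases after the step.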

The final solution may be presented as a set of rules or as a network of nodes
(neurons) performing the logical functions, with the hidden neurons realizing the
rules.

Let $CF_i$ be the certainty factor of the ith rough-fuzzy rule, as calculated in
Section 7.2.3. The final certainty factor of the ith rule is calculated as

$$CF_i^{\text{final}} = CF_i \ast \alpha_i \tag{7.25}$$

where $\alpha_i$ is the generic fitness of the ith rule.

7.4 Application on Medical Data Set


We have applied this framework to medical data sets of diabetes patients, described
in detail in Chapter 5, Section 5.6, covering rheumatological manifestations of
diabetes mellitus such as:

- Diabetic cheiroarthropathy or Limited joint mobility (LJM)

- Adhesive capsulitis of shoulder (ADH)

- Clinical Carpal tunnel syndrome (CTS CL)

- NCV finding of Carpal tunnel syndrome (CNCV)

- Dupuytren’s contracture (DUPY)

- Flexor tenosynovitis (FTS)

- Diffuse idiopathic skeletal hyperostosis (DISH)

- Gout and Hyperuricaemia (GOUT)

- Hand Osteoarthritis (OAH)

- Knee Osteoarthritis (OAK)

Each data set contains nine attributes, with values as follows:

- Age (integer number)

- Sex (0 = MALE, 1 = FEMALE)

- Type of diabetes (1 = TYPE1, 2 = TYPE2)

- Duration of diabetes (integer number, in years)

- Use of insulin (1 = YES, 0 = NO)

- Fasting blood sugar (integer number)

- Post prandial blood sugar (integer number)

- Albuminuria (1 = MICROALBUMINURIA, 2 = PROTEINURIA)

- Uric acid (floating point number)

The results are compared with those of the previous framework, the rough-fuzzy
intelligent system, in Table 7.1. Each data set contains 100 instances and is
presented as two files, a .nam file and a .dat file: the .dat file holds the data
and the .nam file describes the structure of the data. Here we used 5-fold data.
Only 20% of the data is used to generate the rules, and the remaining 80% is used for
testing.
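
The split can be sketched as follows, assuming that one of five equal folds is used for rule generation and the other four for testing. The shuffling, the seed and the function name are assumptions made for the illustration; the thesis does not state how the folds were drawn.

```python
import random


def fold_split(n_instances, n_folds=5, rule_fold=0, seed=0):
    """Split instance indices into one rule-generation fold and a test set."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    folds = [indices[k::n_folds] for k in range(n_folds)]
    rule_set = folds[rule_fold]
    test_set = [i for k, fold in enumerate(folds) if k != rule_fold for i in fold]
    return rule_set, test_set


rule_set, test_set = fold_split(100)
print(len(rule_set), len(test_set))  # 20 instances for rules, 80 for testing
step = 100 / len(test_set)           # accuracy change per test instance, in percent
print(step)
```

With 80 test instances, one instance changes the accuracy by 1.25 percentage points, which matches the granularity of the accuracies reported in Table 7.1.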

Data Set   Rough-Fuzzy Hybridization                          Rough-Fuzzy-Neural
           Best Rule    Vote of 3 Rules    Vote of 5 Rules    Network Hybridization
LJM        75.00%       77.50%             73.75%             78.75%
ADH        83.75%       83.75%             81.25%             86.25%
CTS CL     93.75%       93.75%             93.75%             93.75%
CNCV       96.25%       96.25%             95.00%             96.25%
DUPY       98.75%       98.75%             97.50%             98.75%
FTS        93.75%       91.25%             90.00%             95.00%
DISH       98.75%       96.25%             98.75%             98.75%
GOUT       96.25%       96.25%             96.25%             98.75%
OAH        85.00%       87.50%             83.75%             90.00%
OAK        73.75%       73.75%             72.50%             75.00%

Table 7.1: Comparison Result with Rough-Fuzzy Hybridization
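
The comparison in Table 7.1 can be summarized with a few lines of Python. The sketch assumes that the first three accuracy columns belong to rough-fuzzy hybridization (best rule, vote of 3 rules, vote of 5 rules) and the last column to the rough-fuzzy-neural network hybridization; the column names are labels introduced here for the illustration.

```python
# Accuracies (percent) copied from Table 7.1, under the column reading above.
ACCURACY = {
    "LJM": (75.00, 77.50, 73.75, 78.75),
    "ADH": (83.75, 83.75, 81.25, 86.25),
    "CTS CL": (93.75, 93.75, 93.75, 93.75),
    "CNCV": (96.25, 96.25, 95.00, 96.25),
    "DUPY": (98.75, 98.75, 97.50, 98.75),
    "FTS": (93.75, 91.25, 90.00, 95.00),
    "DISH": (98.75, 96.25, 98.75, 98.75),
    "GOUT": (96.25, 96.25, 96.25, 98.75),
    "OAH": (85.00, 87.50, 83.75, 90.00),
    "OAK": (73.75, 73.75, 72.50, 75.00),
}
COLUMNS = ("best_rule", "vote_of_3", "vote_of_5", "rough_fuzzy_neural")

# Mean accuracy of each method over the ten data sets.
means = {
    name: sum(row[i] for row in ACCURACY.values()) / len(ACCURACY)
    for i, name in enumerate(COLUMNS)
}
# Does the last column match or beat the other three on every data set?
never_worse = all(row[3] >= max(row[:3]) for row in ACCURACY.values())

print(means)
print(never_worse)
```

Under this reading the last column is never below the other three on any data set, and its mean accuracy is higher than each of theirs.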

7.5 Conclusion
In this chapter a hybrid intelligent system is proposed that generates efficient,
reliable and more accurate fuzzy rules. The rules were tested and the results are
given in Table 7.1. On every data set, the rough-fuzzy-neural network hybridization
achieves an accuracy equal to or higher than that of rough-fuzzy hybridization. To
construct a more efficient method, further hybridization with evolutionary computing
or other mathematical or statistical tools may be considered as future work.
Agent-based computing is another area for extending this model in future.

Bibliography

[1] Shi-tong Wang, Dong-jun Yu and Jing-yu Yang, 2003, Integrating rough
set theory and fuzzy neural network to discover fuzzy rules, Intelligent Data
Analysis 7, pp. 59–73.

[2] S. K. Pal, S. Mitra, P. Mitra, 2003, Rough-Fuzzy MLP: Modular
Evolution, Rule Generation, and Evaluation, IEEE Transactions on Knowledge
and Data Engineering, Vol. 15, No. 1, pp. 14-25.

[3] Jing Hong, 2011, An Improved Prediction Model based on Fuzzy-rough Set
Neural Network, International Journal of Computer Theory and Engineering,
Vol.3, No.1, pp.1793-8201.


[4] M. Sarkar, B. Yegnanarayana, 1998, Fuzzy-rough neural networks for vowel
classification, IEEE International Conference on Systems, Man, and
Cybernetics.

[5] Junyang Zhao and Zhili Zhang, 2011, Fuzzy Rough Neural Network and Its
Application to Feature Selection, International Journal of Fuzzy Systems, Vol.
13, No. 4.

[6] M. Banerjee, S. Mitra, and S. K. Pal, 1998, Rough fuzzy MLP: Knowledge
encoding and classification, IEEE Transactions on Neural Networks, vol. 9,
no. 6, pp. 1203-1216.

[7] D. Zhang and Y. Wang, 2006, Fuzzy-rough Neural Network and Its
Application to Vowel Recognition, Control and Decision, vol. 21, no.2, pp.
221-224.

[8] D. Zhang, Y. Wang, H. Huang, 2007, Fuzzy-rough membership function
neural network and its application to pattern recognition, Proc. SPIE 6788,
MIPPR 2007: Pattern Recognition and Computer Vision, 67882N.

[9] W.C. Chen, N-B. Chang, J-C. Chen, 2003, Rough set-based hybrid fuzzy-neural
controller design for industrial wastewater treatment, Water Research
37, pp. 95–107.

[10] L.A. Zadeh, 1965, Fuzzy sets, Information and Control 8, pp. 338–353.

[11] Z. Pawlak, 1982, Rough sets, International Journal of Computer and
Information Sciences 11, pp. 341–356.

[12] McCulloch, Warren; Pitts, Walter, 1943, A Logical Calculus of Ideas
Immanent in Nervous Activity, Bulletin of Mathematical Biophysics 5,
pp. 115-133.

[13] J. Ghosh, S. Mukhopadhyay, 2011, Role of Certainty Factor in Rough-Fuzzy
Rule Generation, International Journal of Computer Science, Engineering and
Applications (IJCSEA) Vol.1, No.6, pp. 49-61.

[14] G. Castellano, A. M. Fanelli, 2000, Fuzzy inference and rule extraction using
a neural network, Neural Network World Journal 3, pp. 361-371.


[15] A. D. Kulkarni, C. D. Cavanaugh, 2000, Fuzzy Neural Network Models for
Classification, Applied Intelligence, 12, pp. 207-215.

[16] Z. Pawlak, “Rough Sets: Theoretical Aspects of Reasoning about Data”, System
Theory, Knowledge Engineering and Problem Solving, vol. 9, Kluwer
Academic Publishers, Dordrecht, The Netherlands (1991).

[17] L.A. Zadeh, “A computational approach to fuzzy quantifiers in natural
languages”, Computers and Mathematics with Applications, 9:149–184, 1983.

[18] S.T. Wang, “Fuzzy system and Fuzzy Neural Networks”, Shanghai Science
and Technology Press, 1998, Edition 1.


Chapter 8
Conclusion and Discussion
This chapter concludes the thesis. It summarizes the research presented, with a
focus on the main contribution: modelling intelligent systems and their application
in medical diagnosis. Three frameworks for modelling intelligent systems in medical
diagnosis are proposed in this thesis.

The survey of the existing literature consolidated in chapter 2 shows that many
approaches have been proposed for modelling knowledge-based intelligent systems in
medical diagnosis, but most of them have no explanation capability, which is regarded
as the most important capability for the acceptance of a clinical decision tool.

The first framework, proposed and discussed in chapter 3, is an Interactive
Intelligent System for Medical Diagnosis (IISMD). The knowledge base of IISMD is
designed using fuzzy If-Then rules, each associated with a certainty factor that
describes the degree of belief (accuracy) of that rule, and GMP is employed for the
reasoning process. IISMD can also answer “Why” and “How” queries about the process
of diagnosis. IISMD is tested over one disease domain, convulsion in infancy. The
test results, in the form of a consultation session, are given in
Appendix-A of this thesis.
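This chapter names Generalized Modus Ponens (GMP) as the reasoning process without restating its formulation. As a hedged illustration only, the sketch below uses Zadeh's compositional rule of inference with max-min composition and the min implication on small discrete universes. The choice of implication operator, the function name gmp and the example membership grades are assumptions, and certainty factors are left out.

```python
def gmp(a_prime, a, b):
    """Generalized Modus Ponens on discrete universes.

    Rule:  IF x is A THEN y is B
    Fact:  x is A'
    Infer: y is B', with B'(y) = max_x min(A'(x), R(x, y)),
    where R(x, y) = min(A(x), B(y)) is the assumed implication relation.
    All arguments are lists of membership grades in [0, 1].
    """
    if len(a_prime) != len(a):
        raise ValueError("A' and A must be defined on the same universe")
    return [max(min(ap, min(ax, by)) for ap, ax in zip(a_prime, a)) for by in b]


# Hypothetical grades, chosen only to show the behaviour.
A = [0.0, 0.5, 1.0]        # premise fuzzy set
B = [0.2, 1.0]             # conclusion fuzzy set
A_OBSERVED = [0.0, 1.0, 0.5]  # observed fact, close to but not equal to A

print(gmp(A, A, B))           # the fact equals the premise
print(gmp(A_OBSERVED, A, B))  # the fact only partially matches the premise
```

When the fact equals the premise the conclusion B is recovered; when the fact matches only partially, the inferred grades are capped by the degree of match.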

Researchers modelling intelligent systems for medical diagnosis face a major problem
in acquiring knowledge from medical practitioners in a country like India, where the
physician-patient ratio is remarkably low. Moreover, knowledge acquisition in the
form of if-then rules from an individual or a set of medical practitioners is
inefficient and may not be reliable. Questions of reliability and efficiency arise not
only from the nature of the domain but also from the domain expert.

In this context, automated rule generation from observation and clinical test data
stands out as an important research area. However, observation and clinical test data
are not noise free: they contain uncertainty, vagueness and impreciseness. This
demands methods that not only extract rules from data but also handle the
uncertainty, vagueness and impreciseness that lie in the data itself.

Additional work includes a comprehensive study of rough sets and some of their
applications, presented in chapter 4. This review gives a detailed view of existing
methodologies as well as of the hybridizations of rough sets with fuzzy sets, and
identifies areas for further exploration. The application of fuzzy and rough
techniques to rule generation is a promising and powerful tool.

The next approach, proposed and discussed in chapter 5, extracts rules from an
experimental data set (observation and clinical test data) using Rough-Fuzzy
hybridization, which acts as an intelligent system. This Rough-Fuzzy intelligent
system generates rules and is able to handle uncertain, vague and imprecise data. The
fuzzy set handles the uncertainty and vagueness of the data by introducing linguistic
patterns and their membership values, and the rough set generates the rules while
handling the imprecise data. Each generated rule is associated with a certainty
factor, constructed by combining the rough and fuzzy membership values. This
framework is tested over one disease domain, diabetes mellitus.
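To make the idea of a certainty factor built from rough and fuzzy membership concrete, the sketch below computes Pawlak's rough membership of a rule (the fraction of objects matching the rule's condition that also carry its decision) and combines it with a fuzzy membership grade. The combination by product, scaled to 0-100, is an assumption made for this illustration; the combination actually used is the one defined in chapter 5. The toy rows and the names rough_membership and certainty_factor are hypothetical.

```python
from collections import defaultdict


def rough_membership(rows, condition_attrs, decision_attr):
    """Rough membership |[x] ∩ X| / |[x]| for every (condition values, decision) pair.

    [x] is the set of rows sharing the same values on condition_attrs
    and X is the set of rows with a given decision.
    """
    class_size = defaultdict(int)
    joint = defaultdict(int)
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        class_size[key] += 1
        joint[(key, row[decision_attr])] += 1
    return {pair: count / class_size[pair[0]] for pair, count in joint.items()}


def certainty_factor(rough, fuzzy):
    """Assumed combination: product of the two memberships on a 0-100 scale."""
    return round(100 * rough * fuzzy)


# Toy decision table with already fuzzified (linguistic) values.
rows = (
    [{"FB": "Medium", "Decision": 0}] * 3
    + [{"FB": "Medium", "Decision": 1}]
    + [{"FB": "High", "Decision": 1}] * 2
)
mu = rough_membership(rows, ["FB"], "Decision")
print(mu[(("Medium",), 0)])                         # 3 of the 4 "Medium" rows have decision 0
print(certainty_factor(mu[(("Medium",), 0)], 0.8))  # 0.8 is a hypothetical fuzzy grade
```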

Rough-fuzzy hybridization has the capability of human-like reasoning, gained from
the fuzzy set, and the capability of rule generation, gained from the rough set, but
lacks an automated learning capability, which is another important aspect of modelling
intelligent systems.

Another exhaustive study, documented in chapter 6, reviews neural network systems,
neuro-fuzzy hybridization and their applications. This review suggests that a
hybridization of rough set, fuzzy set and neural network can also incorporate the
above-mentioned criterion, automated learning.

The third framework, proposed and discussed in chapter 7, is an intelligent system
based on rough set, fuzzy set and neural network, in one disease domain, diabetes
mellitus. The proposed framework combines the advantages of the fuzzy set for
linguistic representation of data and for handling uncertainty and vagueness in data,
the advantages of the rough set for handling impreciseness in data and for generating
dependency rules from a table of data, and the connectionist structure of the neural
network.


As in the earlier rough-fuzzy intelligent system, each rule in this framework is
associated with a certainty factor that represents its degree of correctness. The
certainty factor of each rule is constructed by combining the membership values of
the fuzzy set and the rough set. Finally, all the generated rules are
mapped to a fuzzy-neural network for refinement.

To adopt the power of evolutionary computing, hybridizing a genetic algorithm with
the rough-fuzzy-neural network for modelling intelligent systems may be considered as
future work.

Granular computing is another new area of research where this hybridization may be
utilized, and is a further area of future work.

A multi-agent based design of the existing intelligent systems is also an area of
future work.

Designing a web-based intelligent system for medical science over a large real-life
domain, which is by nature very complex, with all the necessary characteristics such
as reliability, efficiency, accuracy, flexibility, robustness and user friendliness,
is a challenging task for the future.

Publications Contributing to the Thesis

S. Mukhopadhyay, J. Ghosh, D. Ghosh Dastidar, 2007, IISMD: An Interactive Intelligent System
for Medical Diagnosis, Modelling, Measurement and Control (C), AMSE Press, France, Vol. 68
No.3, pp. 1-12.

S. Mukhopadhyay, Jyotirmoy Ghosh, 2011, Studies on Fuzzy Logic and Dispositions for
Medical Diagnosis, International Journal of Computer Technology and Applications, Vol 2 (5),
pp. 1235-1240.

J. Ghosh, S. Mukhopadhyay, 2011, Role of Certainty Factor in Rough-Fuzzy Rule Generation,
International Journal of Computer Science, Engineering and Applications (IJCSEA) Vol.1, No.6,
pp. 49-61.

Communicated: Jyotirmoy Ghosh, S. Mukhopadhyay, Rough-Fuzzy-Neural Network: for
classification and rule generation, Applied Soft Computing, Elsevier.


Appendix-A

(A Consultation Session)

$(IISMD)
(Enter source drive to read RULBASEl .LIB)
C:
****Do you want to begin a knowledge acquisition session?
< YES/ NO> NO *
Do you want to begin a fresh consultation ?
< YES/ NO> YES *
What is (are) pat-name ?

The options are


user dependent
Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


PT-170 100 *
What is (are) age-in-months ?

The options are


user dependent
Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


20 100 *
What is (are) current-complaint ?

The options are


0) pyrexia 1) vomiting 2) nausea 3) persistent-headache
4) feeding-refusal 5) oliguria 6) visual-disturbance
Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


6 60 *
What is (are) past-illness ?

The options are


0) febrile-convulsion 1) petitmal-epilepsy 2) grandmal-epilepsy
3) otitis 4) pneumonitis 5) scabies
6) mumps 7) measles 8) chicken-pox
9) mastoiditis 10) endocarditis 11) congenital-cyanotic-heart-disease
12) craniocerebral-wound 13) paranasal-sinus-infection


Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


2 70 *
What is (are) family-history?

The options are


0) febrile-convulsion 1) petitmal-epilepsy 2) grandmal-epilepsy

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


2 70 *
What is (are) consciousness-state ?

The options are


0) drowsy 1) irritable 2) normal

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


UN *
What is (are) skull-exam ?

The options are


0) microencephaly 1) hydrocephalus 2) bcraniotobes
3) compound-fracture 4) crackedpot-sign 5) tense-fontanelle

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


EP *
About which parameter value you want explanation?
Select an option /NONE> 1 *

The meaning is Water on the brain

What is (are) skull-exam ?

The options are


0) microencephaly 1) hydrocephalus 2) bcraniotobes
3) compound-fracture 4) crackedpot-sign 5) tense-fontanelle

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


UN *
What is (are) eye-exam ?


The options are


0) papilloedema 1) optic-atrophy 2) retinal-hoemorrage
3) vacant-stares 4) blinking-eyelid
Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


UN *
What is (are) neurological-finding ?

The options are


0) neck-rigidity 1) kernigs-sign 2) babinski-sign

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


UN *
What is (are) limb-exam ?

The options are


0) oedema

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


UN *
What is (are) convulsion-nature ?

The options are


0) generalized 1) focal 2) total-loss-of-conc
3) brief-lapse-of-conc 4) vacant-stares 5) blinking-eyelid
6) tonic-clonic-movt-of-musc 7) preconvulsive-cyanosis 8) assoc-with-crying

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


6 90 *
What is (are) post-convulsive-disorder ?

The options are


0) bladder-bowel-incontinence 1) sleep 2) blurred-vision
3) todds-palsy
Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


WH *
post-convulsive-disorder is needed to fire RULE-043 *
Higher goals use rules in order RULE-047 RULE-026 *
Which rule do you want to see<type NONE/Rule no> ?
<NONE or RULE-nnn> NONE *


What is (are) post-convulsive-disorder ?

The options are


0) bladder-bowel-incontinence 1) sleep 2) blurred-vision
3) todds-palsy

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


1 80 *
What is (are) conv-durn-secs ?

The options are


0) (low (20 30 40 50 60 70 80 90 100 110)
(100 100 100 90 80 70 60 50 40 30))
1) (med (40 50 60 70 80 90 100 110 120 130)
(20 50 90 100 100 100 100 100 90))
2) (high (110 130 150 170 190 210 230 250 270 290)
(50 90 100 100 100 100 100 100 100 100))

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


2 90 *
(((invest eeg 99) (RULE-047)))
Do you want to continue this consultation session ?
<YES or NO> YES *
What is (are) general-findings-of?

The options are


0) progressive-retardation 1) congenital-cyanotic-heart-disease

Other options are <AB/WH/UN/EP> for <ABort/WHy/UNknown/ExPlain-meaning>

Select options followed by cf at end press * and Enter


UN *
(((invest eeg 99) (RULE-047)) ((clinically-asc-disease grandmal-epilepsy 96)
RULE-045 RULE-046 RULE-056)))
Do you want to continue this consultation session ?
<YES or NO> NO *
Do you want to know about management of disease? <YES/NO> NO *

(CONSULTATION-OVER)

$(SYSTEM)
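In the session above, the linguistic values offered for conv-durn-secs are listed as sample points with membership grades on a 0-100 scale. A minimal sketch of how such a listing could be evaluated at an arbitrary duration is given below, assuming linear interpolation between listed points and clamping outside the listed range; both choices, and the names LOW, HIGH and membership, are assumptions for illustration rather than a description of IISMD. The "med" set is omitted because its listing shows nine grades for ten points.

```python
# Sample points (seconds) and grades (0-100) transcribed from the session listing.
LOW = (list(range(20, 120, 10)), [100, 100, 100, 90, 80, 70, 60, 50, 40, 30])
HIGH = (list(range(110, 310, 20)), [50, 90] + [100] * 8)


def membership(fuzzy_set, x):
    """Membership grade of x by piecewise-linear interpolation.

    Values outside the listed range take the grade of the nearest
    end point (an assumption; the listing does not say what happens there).
    """
    points, grades = fuzzy_set
    if len(points) != len(grades):
        raise ValueError("each sample point needs exactly one grade")
    if x <= points[0]:
        return float(grades[0])
    if x >= points[-1]:
        return float(grades[-1])
    for i in range(len(points) - 1):
        x0, x1 = points[i], points[i + 1]
        if x0 <= x <= x1:
            g0, g1 = grades[i], grades[i + 1]
            return g0 + (g1 - g0) * (x - x0) / (x1 - x0)


print(membership(LOW, 55))    # halfway between the grades at 50 s and 60 s
print(membership(HIGH, 120))  # halfway between the grades at 110 s and 130 s
```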


Appendix-B

The following is a portion of the rule base generated by the modified method for
diabetes mellitus. Here a decision of 1 means that diabetes mellitus is present and a
decision of 0 means that it is absent.

LJM

RULE-24: If TYPE is 1 and PPB is Extrm High Then Decision is 1 With CF 100

RULE-25: If TYPE is 1 and ALB is 1 Then Decision is 1 With CF 100

RULE-26: If INS is 1 and PPB is Extrm High Then Decision is 1 With CF 100

RULE-27: If PPB is Extrm High and ALB is 1 Then Decision is 1 With CF 100

RULE-28: If AGE is Very High and SEX is 0 Then Decision is 0 With CF 98

RULE-29: If AGE is Very High and INS is 1 Then Decision is 0 With CF 98

RULE-30: If SEX is 0 and FB is Medium Then Decision is 0 With CF 97

RULE-31: If DUR is Extrm Low and FB is Medium Then Decision is 0 With CF 97

RULE-32: If FB is Medium and ALB is 0 Then Decision is 0 With CF 97

RULE-33: If AGE is Very High and DUR is Extrm High Then Decision is 0 With CF 96

RULE-34: If AGE is Very High and ALB is 1 Then Decision is 0 With CF 96

ADH

RULE-07: If PPB is Extrm High Then Decision is 0 With CF 100

RULE-08: If AGE is Extrm Low Then Decision is 0 With CF 100

RULE-09: If DUR is Extrm High Then Decision is 0 With CF 100

RULE-10: If SEX is 1 and PPB is Extrm Low and ALB is 1 Then Decision is 1 With CF 100

RULE-11: If PPB is Low Then Decision is 0 With CF 98

RULE-12: If AGE is Very High and SEX is 1 and ALB is 1 Then Decision is 1 With CF 95


RULE-13: If FB is Low Then Decision is 0 With CF 93

RULE-14: If FB is Extrm Low Then Decision is 0 With CF 90

RULE-15: If AGE is High Then Decision is 0 With CF 87

RULE-16: If PPB is Very Low Then Decision is 0 With CF 86

RULE-17: If FB is Medium Then Decision is 0 With CF 84

CTS CL

RULE-05: If FB is Extrm Low Then Decision is 0 With CF 100

RULE-06: If TYPE is 1 Then Decision is 0 With CF 100

RULE-07: If DUR is Extrm Low Then Decision is 0 With CF 100

RULE-08: If INS is 1 Then Decision is 0 With CF 100

RULE-09: If FB is Extrm High and ALB is 2 Then Decision is 1 With CF 100

RULE-10: If PPB is Very Low Then Decision is 0 With CF 98

RULE-11: If FB is Extrm High and ALB is 0 Then Decision is 0 With CF 96

RULE-12: If AGE is Very High and SEX is 1 Then Decision is 0 With CF 96

RULE-13: If AGE is Very High and ALB is 0 Then Decision is 0 With CF 96

RULE-14: If SEX is 1 and DUR is Medium and INS is 0 Then Decision is 1 With CF 94

RULE-15: If AGE is High and SEX is 1 and DUR is Medium Then Decision is 1 With CF 92

CNCV

RULE-11: If AGE is Extrm High Then Decision is 0 With CF 100

RULE-12: If AGE is Extrm Low Then Decision is 0 With CF 100

RULE-13: If TYPE is 1 Then Decision is 0 With CF 100

RULE-14: If INS is 0 and ALB is 2 Then Decision is 1 With CF 100

RULE-15: If PPB is High and ALB is 2 Then Decision is 1 With CF 100

RULE-16: If PPB is Very High Then Decision is 0 With CF 98

RULE-17: If FB is Very Low Then Decision is 0 With CF 94


RULE-18: If AGE is Very High Then Decision is 0 With CF 90

RULE-19: If PPB is Medium Then Decision is 0 With CF 87

RULE-20: If PPB is Low Then Decision is 0 With CF 85

RULE-21: If SEX is 0 and FB is High and PPB is High Then Decision is 1 With CF 84

DUPY

RULE-10: If SEX is 0 and INS is 1 Then Decision is 1 With CF 100

RULE-11: If INS is 1 and ALB is 0 Then Decision is 1 With CF 100

RULE-12: If FB is Extrm High Then Decision is 1 With CF 100

RULE-13: If PPB is Low Then Decision is 0 With CF 95

RULE-14: If FB is Low Then Decision is 0 With CF 95

RULE-15: If SEX is 0 and DUR is High Then Decision is 1 With CF 94

RULE-16: If DUR is High and ALB is 0 Then Decision is 1 With CF 94

RULE-17: If DUR is Very High Then Decision is 0 With CF 94

RULE-18: If FB is Medium Then Decision is 0 With CF 93

RULE-19: If AGE is Very Low Then Decision is 0 With CF 92

FTS

RULE-08: If AGE is Extrm High Then Decision is 0 With CF 100

RULE-09: If ALB is 2 Then Decision is 0 With CF 100

RULE-10: If DUR is Extrm High Then Decision is 0 With CF 100

RULE-11: If FB is Extrm High Then Decision is 0 With CF 100

RULE-12: If PPB is Extrm High Then Decision is 0 With CF 100

RULE-13: If SEX is 1 and PPB is Extrm Low Then Decision is 1 With CF 100

RULE-14: If PPB is Medium Then Decision is 0 With CF 95

RULE-15: If AGE is High and SEX is 1 Then Decision is 1 With CF 93

RULE-16: If PPB is High Then Decision is 0 With CF 92

RULE-17: If AGE is Very Low Then Decision is 0 With CF 91


DISH

RULE-10: If PPB is Extrm High Then Decision is 0 With CF 100

RULE-11: If ALB is 2 Then Decision is 0 With CF 100

RULE-12: If AGE is Extrm High Then Decision is 1 With CF 100

RULE-13: If SEX is 1 and PPB is High Then Decision is 0 With CF 98

RULE-14: If SEX is 0 and PPB is High and ALB is 1 Then Decision is 1 With CF 97

RULE-15: If DUR is Medium and PPB is High Then Decision is 1 With CF 96

RULE-16: If AGE is High and PPB is High and ALB is 1 Then Decision is 1 With CF 94

RULE-17: If SEX is 0 and PPB is Low Then Decision is 0 With CF 92

RULE-18: If DUR is Medium and PPB is Low Then Decision is 0 With CF 92

RULE-19: If PPB is Low and ALB is 1 Then Decision is 0 With CF 92

GOUT

RULE-06: If FB is Extrm High Then Decision is 0 With CF 100

RULE-07: If DUR is Extrm High Then Decision is 0 With CF 100

RULE-08: If ALB is 2 Then Decision is 0 With CF 100

RULE-09: If AGE is Extrm High and FB is Medium Then Decision is 1 With CF 96

RULE-10: If AGE is High Then Decision is 0 With CF 95

RULE-11: If AGE is Very Low Then Decision is 0 With CF 94

RULE-12: If PPB is Low Then Decision is 0 With CF 94

RULE-13: If INS is 0 and FB is Medium and PPB is Very Low Then Decision is 1 With CF 93

RULE-14: If FB is Medium and PPB is Very Low and ALB is 0 Then Decision is 1 With CF 93

RULE-15: If PPB is Medium Then Decision is 0 With CF 93

OAH

RULE-09: If DUR is Extrm High Then Decision is 1 With CF 100

RULE-10: If AGE is Extrm High Then Decision is 1 With CF 100


RULE-11: If SEX is 1 and TYPE is 2 and INS is 1 Then Decision is 1 With CF 100

RULE-12: If TYPE is 2 and INS is 1 and ALB is 1 Then Decision is 1 With CF 100

RULE-13: If AGE is High and INS is 0 and ALB is 1 Then Decision is 0 With CF 97

RULE-14: If AGE is High and SEX is 1 and INS is 0 Then Decision is 0 With CF 97

RULE-15: If AGE is High and SEX is 1 and INS is 1 Then Decision is 1 With CF 94

RULE-16: If AGE is High and INS is 1 and ALB is 1 Then Decision is 1 With CF 94

RULE-17: If SEX is 0 and DUR is High Then Decision is 0 With CF 94

RULE-18: If DUR is High and ALB is 0 Then Decision is 0 With CF 94

OAK

RULE-06: If INS is 1 Then Decision is 0 With CF 100

RULE-07: If AGE is Extrm High and SEX is 1 Then Decision is 1 With CF 100

RULE-08: If DUR is Extrm Low Then Decision is 1 With CF 100

RULE-09: If AGE is Extrm High and PPB is Very Low Then Decision is 1 With CF 99

RULE-10: If SEX is 0 and FB is High Then Decision is 0 With CF 98

RULE-11: If PPB is High Then Decision is 1 With CF 97

RULE-12: If SEX is 0 and DUR is Very High Then Decision is 0 With CF 96

RULE-13: If FB is Extrm High Then Decision is 0 With CF 95

RULE-14: If AGE is Low Then Decision is 0 With CF 95

RULE-15: If AGE is High and SEX is 1 Then Decision is 1 With CF 95
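The rules listed in this appendix follow a regular textual pattern, so they can be read into a simple data structure and applied to a case. The sketch below parses a few of the OAK rules and classifies a hypothetical, already fuzzified case either by the single highest-CF rule that fires or by a CF-weighted vote among the k highest-CF rules that fire. The crisp matching of linguistic labels, the CF-weighted voting scheme, the example case and the names parse_rule, fired_rules and classify are all assumptions for illustration and may differ from the voting procedure used for Table 7.1.

```python
import re
from collections import defaultdict

RULE_RE = re.compile(r"RULE-(\d+): If (.+) Then Decision is (\d+) With CF (\d+)")


def parse_rule(text):
    """Parse one rule line (possibly wrapped) into a dictionary."""
    match = RULE_RE.fullmatch(" ".join(text.split()))
    if match is None:
        raise ValueError("not a rule: " + repr(text))
    number, premise, decision, cf = match.groups()
    conditions = {}
    for clause in premise.split(" and "):
        attribute, value = clause.split(" is ", 1)
        conditions[attribute] = value
    return {"number": int(number), "conditions": conditions,
            "decision": int(decision), "cf": int(cf)}


def fired_rules(rules, case):
    """Rules whose every condition equals the case's (linguistic) value."""
    return [r for r in rules
            if all(case.get(a) == v for a, v in r["conditions"].items())]


def classify(rules, case, k=1):
    """Decision by CF-weighted vote of the k highest-CF fired rules (k=1: best rule)."""
    top = sorted(fired_rules(rules, case), key=lambda r: r["cf"], reverse=True)[:k]
    if not top:
        return None  # no rule fires
    votes = defaultdict(int)
    for rule in top:
        votes[rule["decision"]] += rule["cf"]
    return max(votes, key=votes.get)


OAK_RULES = [
    "RULE-06: If INS is 1 Then Decision is 0 With CF 100",
    "RULE-09: If AGE is Extrm High and PPB is Very Low Then Decision is 1 With CF\n99",
    "RULE-11: If PPB is High Then Decision is 1 With CF 97",
    "RULE-15: If AGE is High and SEX is 1 Then Decision is 1 With CF 95",
]
rules = [parse_rule(text) for text in OAK_RULES]
case = {"INS": "1", "PPB": "High", "AGE": "High", "SEX": "1"}  # hypothetical case
print(classify(rules, case, k=1))  # best rule: RULE-06 alone decides
print(classify(rules, case, k=3))  # RULE-11 and RULE-15 together outvote RULE-06
```

The example case shows why the best-rule and voting columns of a comparison can differ: a single rule with CF 100 can be outvoted by two agreeing rules with slightly lower certainty.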
