DISEASE PREDICTOR
A PROJECT REPORT
Submitted to
Department of Computer Science and Information Technology
DWIT College
Submitted by
Anil Parajuli
August, 2016
DWIT College
DEERWALK INSTITUTE OF TECHNOLOGY
Tribhuvan University
SUPERVISOR’S RECOMMENDATION
…………………………………………
Ritu Raj Lamsal
Lecturer
Deerwalk Institute of Technology
DWIT College
LETTER OF APPROVAL
This is to certify that this project prepared by ANIL PARAJULI entitled “DISEASE
PREDICTOR” in partial fulfillment of the requirements for the degree of B.Sc. in
Computer Science and Information Technology has been well studied. In our opinion it is
satisfactory in the scope and quality as a project for the required degree.
……………………………………
Ritu Raj Lamsal [Supervisor]
Lecturer
DWIT College
…………………………………………
Hitesh Karki
Chief Academic Officer
DWIT College
…………………………………………..
Jagdish Bhatta [External Examiner]
IOST, Tribhuvan University
…………………………………………..
Sarbin Sayami [Internal Examiner]
Assistant Professor
IOST, Tribhuvan University
ACKNOWLEDGEMENT
First of all, I would like to thank DWIT College for providing me with the opportunity
and resources needed for this project. I am also very thankful to my respected and
esteemed guide, Mr. Ritu Raj Lamsal, who helped me complete this project.
Finally, I would like to express my sincere thanks to all my friends and others who
helped me directly or indirectly during this project work.
Anil Parajuli
TU Roll No.: 1789/069
Tribhuvan University
Institute of Science and Technology
STUDENT’S DECLARATION
I hereby declare that I am the only author of this work and that no sources other than
those listed here have been used in this work.
Anil Parajuli
ABSTRACT
The “Disease Predictor” system, based on predictive modeling, predicts the user's disease
from the symptoms that the user provides as input. The system analyzes these symptoms
and gives the probability of the disease as output.
Disease prediction is done by implementing the Naïve Bayes classifier, which calculates
the probability of the disease. An average prediction accuracy of 60% is obtained.
TABLE OF CONTENTS
ACKNOWLEDGEMENT
2.3 Feasibility Analysis
APPENDIX
REFERENCES
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS
Disease Predictor
CHAPTER 1 INTRODUCTION
1.1 Introduction
At present, a person who suffers from a disease has to visit a doctor, which is time
consuming and costly. If the person is out of reach of doctors and hospitals, the
disease may go unidentified. An automated program that completes this process could
save the patient both time and money. Existing heart disease prediction systems use
data mining techniques to analyze the risk level of the patient.
Disease Predictor is a web-based application that predicts the user's disease from the
symptoms the user provides. The system uses data sets collected from different
health-related sites. With Disease Predictor, the user can find out the probability of
a disease given the symptoms.
Internet use is growing every day, and people turn to the internet whenever a problem
arises. People have easier access to the internet than to hospitals and doctors, and
they have no immediate option when they suffer from a particular disease. This system
can help because it is available over the internet 24 hours a day.
There are many tools related to disease prediction, but they mostly analyze heart-related
diseases and generate a risk level. There are generally no such tools for the prediction
of general diseases. Disease Predictor therefore addresses the prediction of general
diseases.
1.3 Objective
- To implement a Naïve Bayes classifier that classifies the disease according to the
user's input.
- To develop a web interface for the prediction of the disease.
1.4.1 Scope
This project aims to provide a web platform that predicts the occurrence of disease on
the basis of various symptoms. The user can select symptoms and see the likely diseases
together with their probabilities.
1.4.2 Limitations
K.M. Al-Aidaroos, A.A. Bakar and Z. Othman conducted research to find the best medical
diagnosis mining technique. The authors compared Naïve Bayes (NB) with five other
classifiers: Logistic Regression (LR), KStar (K*), Decision Tree (DT), Neural Network
(NN) and a simple rule-based algorithm (ZeroR). Fifteen real-world medical problems
from the UCI machine learning repository (Asuncion and Newman, 2007) were selected to
evaluate the performance of all algorithms. In the experiment, NB outperformed the
other algorithms on 8 of the 15 data sets, so the authors concluded that the predictive
accuracy of Naïve Bayes is better than that of the other techniques.
Jyoti Soni, Ujma Ansari, Dipesh Sharma and Sunita Soni surveyed the current techniques
of knowledge discovery in databases using data mining that are in use in today’s
medical research, particularly in heart disease prediction. A number of experiments
were conducted to compare the performance of predictive data mining techniques on the
same dataset. The outcome reveals that Decision Tree outperforms the others and that
Bayesian classification sometimes has accuracy similar to that of Decision Tree, while
other predictive methods such as KNN, Neural Networks and classification based on
clustering do not perform well.
(Soni, Ansari, Sharma, & Soni, 2011)
Shadab Adam Pattekari and Asma Parveen conducted research using the Naïve Bayes
algorithm to predict heart disease, where the user provides data that is compared with
a trained set of values. In this research, patients provided their basic information,
which was compared with the data to predict heart disease.
(Adam & Parveen, 2012)
M.A. Nishara Banu and B. Gomathy used medical data mining techniques such as association
rule mining, classification and clustering to analyze different kinds of heart-related
problems. A decision tree is built to illustrate every possible outcome of a decision,
and different rules are made to get the best outcome. In this research, age, sex,
smoking, overweight, alcohol intake, blood sugar, heart rate and blood pressure are the
parameters used for making the decisions. Risk levels for the different parameters are
stored with IDs ranging from 1 to 8. An ID of 1 corresponds to the normal level, and
IDs higher than 1 correspond to higher risk levels. The K-means clustering technique is
used to study the pattern in the dataset. The algorithm clusters the information into k
groups. Each point in the dataset is assigned to the closest cluster, and each cluster
center is recomputed as the average of the points in that cluster.
(Nishara Banu & Gomathy, 2013)
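The K-means procedure summarized in this study (assign each point to the closest cluster
center, then recompute each center as the average of its points) can be sketched in
Python as follows. This is only an illustration of the general technique, not the cited
authors' implementation. The function name `k_means`, the two-dimensional points and the
random choice of starting centers are assumptions made for this example.

```python
import random

def k_means(points, k, iterations=100, seed=0):
    """Cluster 2-D points into k groups with plain k-means (hypothetical helper)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k of the data points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point goes to its closest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centers]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each center becomes the average of its points.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                new_centers.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centers.append(center)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # centers stopped moving, so the clustering has converged
        centers = new_centers
    # clusters holds the groups from the last assignment step
    return centers, clusters
```

Calling `k_means(points, 2)` on a list of `(x, y)` tuples returns the two centers and the
points grouped around them.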
a. Display the list of symptoms from which the user can select.
The project is technically feasible as it can be built using existing technologies. It
is a web-based application that uses the Grails framework. The technology required by
Disease Predictor is available, and hence the project is technically feasible.
The project is economically feasible as the only cost involved is the hosting of the
project. As the number of data samples increases, prediction consumes more time and
processing power. In that case, a better processor might be needed.
The project is operationally feasible as users need only basic knowledge of computers
and the Internet. Disease Predictor is based on a client-server architecture where the
clients are the users and the server is the machine where the data sets are stored.
3.1 Methodology
Disease prediction has already been implemented using different techniques such as
neural networks, decision trees and the Naïve Bayes algorithm. Heart-related disease in
particular has been analyzed the most. In the comparison reviewed above (Al-Aidaroos,
Bakar, & Othman, 2012), Naïve Bayes was found to be more accurate than the other
techniques on most data sets. So, Disease Predictor also uses Naïve Bayes for the
prediction of different diseases.
Data was collected from the internet. The real symptoms of each disease were collected
from different health-related websites, i.e. no dummy values were entered.
3.1.2 Algorithm implemented
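The value of $P(\text{symptom}_i \mid \text{Disease})$ can be calculated using multinomial
Naïve Bayes, which is given by:
$$P(\text{symptom}_i \mid \text{Disease}) = \frac{N_{yi} + \alpha}{N_y + \alpha n}$$
Where,
$N_{yi}$ = Number of times symptom $i$ appears in the records of disease $y$
$N_y$ = Total count of all symptoms in the records of disease $y$
$\alpha$ = Smoothing parameter ($\alpha = 1$ gives Laplace smoothing)
$n$ = Number of distinct symptoms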
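As an illustration, the smoothed estimate above could be computed along the following
lines. This minimal sketch assumes the standard reading of the symbols: N_yi is the
number of times symptom i occurs in the records of disease y, N_y is the total symptom
count for that disease, n is the number of distinct symptoms and α is the smoothing
parameter. The record layout, the function name `symptom_likelihoods` and the toy data
(including the disease names) are hypothetical and are not taken from the project's
data set.

```python
from collections import Counter

def symptom_likelihoods(records, disease, vocabulary, alpha=1.0):
    """Return {symptom: P(symptom | disease)} using additive smoothing.

    records    -- list of (disease, [symptoms]) pairs (assumed layout)
    vocabulary -- list of all distinct symptoms; its length is n
    """
    counts = Counter()
    for label, symptoms in records:
        if label == disease:
            counts.update(symptoms)        # accumulates N_yi per symptom
    total = sum(counts.values())           # N_y
    n = len(vocabulary)
    return {s: (counts[s] + alpha) / (total + alpha * n) for s in vocabulary}

# Toy data for illustration only.
records = [
    ("Flu", ["Fever", "Cough"]),
    ("Flu", ["Fever"]),
    ("Gastritis", ["Vomiting"]),
]
vocabulary = ["Fever", "Cough", "Vomiting"]
flu = symptom_likelihoods(records, "Flu", vocabulary)
print(flu["Fever"])  # (2 + 1) / (3 + 3) = 0.5
```

Because of the smoothing term, a symptom that never occurs with a disease in the data
(here "Vomiting" for "Flu") still receives a small non-zero probability, so a single
unseen symptom cannot force the whole product to zero.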
The value of $P(\text{Disease})$ can be calculated using Laplace's Law of Succession,
which is given by:
$$P(\text{Disease}) = \frac{N(\text{Disease}) + 1}{N + 2}$$
Where,
$N(\text{Disease})$ = Frequency of the disease in the dataset
$N$ = Total number of disease records in the dataset
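The prior estimate can be sketched in a few lines of Python. The sketch assumes that the
dataset is available as a plain list with one disease label per record. The function
name `disease_priors` and the labels are hypothetical.

```python
from collections import Counter

def disease_priors(labels):
    """Return {disease: (N(disease) + 1) / (N + 2)} for a list of disease labels."""
    total = len(labels)            # N
    counts = Counter(labels)       # N(Disease) for each disease
    return {d: (c + 1) / (total + 2) for d, c in counts.items()}

# Toy labels for illustration only.
priors = disease_priors(["Flu", "Flu", "Gastritis"])
print(priors["Flu"])  # (2 + 1) / (3 + 2) = 0.6
```

With more than two diseases these values no longer sum to exactly 1. Since every
disease shares the same denominator, the ranking of the diseases is not affected.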
This describes the classes used in Disease Predictor. Three classes are used in total:
SymptomsReader: Reads the user input and creates the list of symptoms.
SymptomsAnalyzer: Displays the result according to the given symptoms.
CalculateValues: Calculates the probabilistic model of the diseases.
This explains the sequence of operations in Disease Predictor. Initially, the system
shows the symptoms to be selected. The user selects the symptoms and submits them to
the system. Disease Predictor then predicts the disease and displays the result.
4.1 Implementation
Disease Predictor predicts the disease from the symptoms provided to the system.
For disease prediction, the Naïve Bayes classifier is implemented.
Figure 4- Workflow
As shown in the figure, the input data sets are classified using the Naïve Bayes
classifier. A sample of the input data set is shown below.
The Naïve Bayes classifier uses the following rule to classify the data sets:
$$\hat{Y} = \arg\max_{\text{Disease}} \; P(\text{Disease}) \prod_{i=1}^{n} P(\text{symptom}_i \mid \text{Disease})$$
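The classification rule can be sketched as follows, assuming the prior and the
per-symptom probabilities have already been estimated. The sketch sums logarithms
instead of multiplying probabilities. This selects the same disease, because the
logarithm is monotonic, and it avoids numerical underflow when many symptoms are
selected. The function name `predict`, the dictionary layout and all numbers are
hypothetical values chosen for illustration.

```python
import math

def predict(symptoms, priors, likelihoods):
    """Return the disease maximizing P(D) * product of P(symptom_i | D)."""
    best_disease, best_score = None, -math.inf
    for disease, prior in priors.items():
        # Log of the product; smoothed probabilities are never zero.
        score = math.log(prior)
        for s in symptoms:
            score += math.log(likelihoods[disease][s])
        if score > best_score:
            best_disease, best_score = disease, score
    return best_disease

# Toy probabilities for illustration only.
priors = {"Flu": 0.6, "Gastritis": 0.4}
likelihoods = {
    "Flu": {"Fever": 0.5, "Cough": 1 / 3, "Vomiting": 1 / 6},
    "Gastritis": {"Fever": 0.25, "Cough": 0.25, "Vomiting": 0.5},
}
print(predict(["Fever", "Cough"], priors, likelihoods))  # Flu
print(predict(["Vomiting"], priors, likelihoods))        # Gastritis
```

For the first call the scores compare 0.6 × 0.5 × 1/3 = 0.1 for "Flu" against
0.4 × 0.25 × 0.25 = 0.025 for "Gastritis", so "Flu" is selected.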
The user gives input to the system. The input consists of symptoms. The user marks the
symptoms that are making the user feel unwell.
1. Fever
2. Cough
3. Vomiting
The “Disease Predictor” system predicts the disease from the input symptoms and
calculates the probability of the disease.
The sample output is given as:
4.1.2 Description
SymptomsReader
This class runs first when the user requests a disease prediction.
Input: The user selects symptoms from the list.
Output: The selected symptoms are put in a list.
SymptomsAnalyzer
CalculateValues
4.3 Testing
Any bugs and issues left in the system will be fixed for smooth running of the
application. The accuracy of the system can be further improved with other algorithms
if needed.
Features can be added to the application, such as keeping the history of the disease in
a log. The list of available symptoms can also be extended to cover more diseases.
6.1 Conclusion
This project aims to predict the disease on the basis of the symptoms. The project is
designed so that the system takes symptoms from the user as input and produces the
predicted disease as output. An average prediction accuracy of 55% is obtained. Disease
Predictor was successfully implemented using the Grails framework.
6.2 Recommendations
This project has not implemented the recommendation of medications to the user, so
medication recommendation can be added to the project. The history of a user's diseases
can be kept as a log and used as the basis for recommending medications.
APPENDIX
REFERENCES
Adam, S., & Parveen, A. (2012). Prediction System For Heart Disease Using Naive
Bayes.
Al-Aidaroos, K. M., Bakar, A. A., & Othman, Z. (2012). Medical Data Classification With
Naive Bayes Approach. Information Technology Journal.
Davis, D. A., Chawla, N. V., Blumm, N., Christakis, N., & Barabási, A.-L. (2008).
Predicting Individual Disease Risk Based On Medical History.
Nishara Banu, M. A., & Gomathy, B. (2013). Disease Predicting System Using Data Mining
Techniques.
Soni, J., Ansari, U., Sharma, D., & Soni, S. (2011). Predictive Data Mining for Medical
Diagnosis: An Overview Of Heart Disease Prediction.