
Proceedings of the International Conference on Communication and Electronics Systems (ICCES 2018)

IEEE Xplore Part Number:CFP18AWO-ART; ISBN:978-1-5386-4765-3

An Efficient Decision Support Model Based on Ensemble Framework of Data Mining Features Assortment & Classification Process
Priyanka Sharma, M.Tech Scholar, Deptt. of CSE, RCEW, Jaipur, Rajasthan, India
Sonal Saxena, Asst. Prof., Deptt. of CSE, RCEW, Jaipur, Rajasthan, India
Dr. Yatendra Mohan Sharma, Assoc. Prof., Dept. of CSE, BBCET, Jaipur, Rajasthan, India

Abstract: Over the past decades, numerous researchers have proposed decision support models based on data mining classification techniques to improve the quality and success of decisions. Although these models have benefited users in different ways, each faces its own difficulties and fails to obtain the best results across different situations, because of an inadequate build procedure, reliance on a single classification technique, or the combination of arbitrarily picked methods into a single approach. Selecting the appropriate classification algorithm for a given data set is an important and complex issue, full of research challenges. On the other hand, building a different model for each data set increases cost and time without guaranteeing correctness. This paper addresses this dilemma of existing decision support systems by proposing a new dynamic ensemble framework of data mining classification methods with a reduced feature selection procedure. The experimental results show that the proposed approach produces more precise outcomes than classical approaches.

Keywords: Data Mining, Classification Techniques, Feature Reduction, Ensemble Framework.

I. INTRODUCTION

The notion of a decision support system (DSS) is very broad, and different investigators have discussed it in different ways. Over the past decades, many investigators have verified that decisions made by human beings can be far from optimal and that they deteriorate with difficulty and stress. Yet in many critical fields the quality of a decision is essential; in medicine, an incorrect or delayed decision may endanger the patient. To support human judgment and improve the quality of decisions, many schemes have been implemented as computer programs. These schemes serve as standalone or integrated tools that support difficult decision-making tasks quickly. Such environments are commonly known as decision support systems [1].

A DSS is an adaptable and interactive tool that examines data with a set of rules to help decision makers improve the effectiveness of their decisions [2]. A decision support system can be designed for different processes such as simulation, analysis, forecasting and optimization [3-5]. Fig. 1 depicts the categories of DSS.

Fig. 1. Different Categories in Decision Support System

With advanced technologies and faster, more exact computation, digital systems have built up the trust of their users, but the rapid growth in the amount of data and the emergence of new practices on a daily basis have increased doubts about the trustworthiness of digital information. Moreover, most of the studied models are not suitable for analyzing unstable data sets; they run into difficulty and fail to provide an efficient outcome that supports a sound decision [6]. The available approaches therefore fall short in terms of accuracy. Over the past decades, a good number of researchers have shown that the accuracy of data analysis can be raised with advanced data mining methods. Many researchers have also demonstrated that, compared with a single data mining scheme, ensemble algorithms are more capable of producing precise and efficient outcomes to support the decision-making process. Therefore, the work in this paper concentrates on designing and using an effective decision support method based on a hybrid scheme.

To overcome the deficiencies of existing decision support methods and improve reliability in terms of recognition accuracy, the proposed approach analyzes the performance of Naïve Bayes, Decision Tree, KStar, IBK and Hoeffding Tree, five of the most popular data mining classification methods, with 5 different feature selection methods. After analyzing the performance of each selected algorithm, and without any human intervention, the proposed approach automatically chooses the two best, distinct prediction methods together with a reduced feature set and incorporates the selected algorithms into a layered approach.
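The automatic selection step described above can be illustrated with a minimal sketch. It assumes that an accuracy value is already available for every pairing of classifier and feature selection method; the function names, the placeholder selector names (`fs_a`, `fs_b`) and the numbers are hypothetical and are not taken from the paper's experiments.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: accuracy[classifier][feature_selector] -> accuracy in [0, 1].
Accuracy = Dict[str, Dict[str, float]]


def best_per_classifier(accuracy: Accuracy) -> List[Tuple[str, str, float]]:
    """For each classifier, keep the feature selector that gave its highest accuracy."""
    best = []
    for clf, by_selector in accuracy.items():
        selector, score = max(by_selector.items(), key=lambda kv: kv[1])
        best.append((clf, selector, score))
    return best


def pick_top_two(accuracy: Accuracy) -> List[Tuple[str, str, float]]:
    """Rank classifiers by their best accuracy and return the two highest."""
    ranked = sorted(best_per_classifier(accuracy), key=lambda t: t[2], reverse=True)
    return ranked[:2]


if __name__ == "__main__":
    # Made-up numbers for illustration only; NOT results reported in the paper.
    demo = {
        "NaiveBayes": {"fs_a": 0.70, "fs_b": 0.76},
        "KStar": {"fs_a": 0.65, "fs_b": 0.60},
        "HoeffdingTree": {"fs_a": 0.74, "fs_b": 0.72},
    }
    for clf, selector, score in pick_top_two(demo):
        print(clf, selector, score)
```

How the two selected methods are then combined "into a layered approach" is not specified in enough detail to sketch here, so the example stops at the selection step.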

978-1-5386-4765-3/18/$31.00 ©2018 IEEE 487



II. RELATED WORK

Over the past few decades, a number of decision support models have been proposed to improve the accuracy and efficiency of decisions by integrating a particular data mining method. Each approach, however, has its own benefits and limitations.

Li et al. [7] discussed a decision support model for credit. They employ a sequential mixture of kernel procedures to improve the interpretability of credit classification models. Lalit Dole and Jayant Rajurkar [8] presented a decision support model based on a modified version of the Naïve Bayes (NB) classification algorithm to predict the grade point average of graduating students. They gathered their data through surveys. Veenita Kunwar et al. [9] employed Naïve Bayes and an Artificial Neural Network (ANN) to build a prediction model for Chronic Kidney Disease (CKD). They considered age, diabetes, blood pressure and RBC count as simulation parameters. Their experimental results showed that Naïve Bayes outperformed the ANN. Meryem Ouahilal et al. [10] introduced a decision support model for the stock market. They presented a comparative study of Decision Tree, Multiple Linear and Support Vector Regression on a L’Oréal financial dataset. Their analysis demonstrated that Support Vector Regression produced the best forecasts. Yicheng Jiang et al. [11] described a three-layer decision support system for diagnosing a patient's disease. The approach uses useful information to resolve imprecision in the knowledge by adding the "property" to the two-layer knowledge base. It iteratively computes disease probability with the Naïve Bayes classification algorithm, which significantly reduces the dependencies between attributes. The authors also state that the approach was used by two experienced doctors in clinical decision making, who found the proposed three-layer model more effective for recommendations.

Saeed Piri et al. [12] proposed a clinical decision support system (CDSS) for diabetic retinopathy. The authors analyzed data from more than 1.4 million diabetic patients to demonstrate that the ensemble technique they built produced more accurate outcomes than the available methods. To improve the quality of the model, the authors paid particular attention to data cleaning and preparation, analyzing the data distribution and the percentage of missing values and selecting key variables with sufficient and significant predictive power. Emrana Kabir Hashi et al. [13] proposed a new expert clinical decision support system to diagnose diabetes more accurately. They employed the Decision Tree and K-Nearest Neighbor (KNN) classification techniques for the diagnosis process, using 70% of the data for training and the remaining 30% for testing. During training the approach produced 100% accuracy, while in the testing phase Decision Tree and KNN achieved 90.43% and 76.96% accuracy respectively. The authors compared their approach with previously proposed approaches [14-17] that used the same data set for model training and testing. The experimental outcomes demonstrated that, among the evaluated approaches, the introduced model produced the best accuracy (90.43%). Yung-Fu Chen et al. [18] proposed a clinical decision support system for fracture prediction. To train and test the model, the authors took data from the National Health Insurance Research Database (NHIRD) on patients aged 20 years and older who visited clinics between 2002 and 2010. They chose a genetic algorithm (GA) and a support vector machine (SVM) to build the new clinical decision support system. The experimental results showed that the proposed approach achieves a sensitivity of 69.84–77.00% and an AUC of 0.7495–0.7590. With the same procedure, a group of authors built a support model for asthma patients [19]. In [20-22], different groups of authors built different decision support models for unrelated tasks. Each showed the advantages of its model over other available approaches. All of these models were built with data mining classification methods, which shows that data mining methods are useful for designing an efficient decision support model.

III. PROPOSED APPROACH

To build a competent and precise decision support model, five popular data mining algorithms (Naïve Bayes, Decision Tree, KStar, IBK and Hoeffding Tree) are considered with different feature selection procedures and data sets. Before the performance of each algorithm is analyzed on a selected data set, feature selection is carried out with five different data mining procedures. Each algorithm is then evaluated on the data set with each selected feature set, and each classification method is paired with the feature selection procedure under which it produced the highest accuracy. Based on this performance evaluation, the proposed mechanism chooses the two best classification methods and incorporates them into a layer form. The mechanism automatically selects the two most suitable classification methods and the feature selection procedure for the supplied data set; no manual effort is needed. A 10-fold cross-validation process is also integrated into the proposed approach to train and test the built model automatically. Fig. 2 illustrates the working methodology of the proposed mechanism.

[Flowchart: Dataset (UCI Library) -> Data Preprocessing -> Classification Techniques (NB, DT, KStar, IBK, Hoeffding Tree) and Feature Selection Methods -> Analysis of each classification method with 5 assorted feature selection procedures -> Select best two methods with feature assortment process -> Build an ensemble framework dynamically by integrating the functionality of the selected methods into a layer form]

Fig. 2. Building Procedure of Proposed Framework
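The 10-fold cross-validation mentioned in this section can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the folds are plain shuffled folds (the paper does not say whether stratification is used), and `Trainer`, `Model`, `k_fold_indices`, `cross_val_accuracy` and `majority_trainer` are hypothetical names introduced here, not part of the authors' implementation.

```python
import random
from typing import Callable, List, Sequence

Row = Sequence[float]
Model = Callable[[Row], int]                        # maps one feature row to a class label
Trainer = Callable[[List[Row], List[int]], Model]   # fits on (rows, labels), returns a Model


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle the indices 0..n-1 and deal them into k folds of near-equal size."""
    if not 2 <= k <= n:
        raise ValueError("need 2 <= k <= n")
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_val_accuracy(train: Trainer, X: List[Row], y: List[int],
                       k: int = 10, seed: int = 0) -> float:
    """Train on k-1 folds, test on the held-out fold, and average the k accuracies."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held]
        model = train([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(model(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def majority_trainer(X: List[Row], y: List[int]) -> Model:
    """Toy stand-in for a real classifier: always predict the most frequent training label."""
    label = max(set(y), key=y.count)
    return lambda row: label
```

Any classifier that can be wrapped as a `Trainer` could be passed to `cross_val_accuracy`, so the same routine could score each pairing of classifier and feature set before the two best are selected. The toy `majority_trainer` gives a useful floor: on a two-class data set with 500 negative and 268 positive instances it scores roughly 500/768, about 65%.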




IV. EXPERIMENT & RESULT ANALYSIS

To assess the efficiency of the proposed approach against classical and modern decision support mechanisms, different experiments were performed with different data sets collected from the UCI library.

The first experiment addresses clinical decision support and uses the diabetes data set to compare the performance of the proposed methodology with classical and other proposed approaches. The data set contains 768 instances, 268 tested positive and 500 tested negative, and is used to determine whether a patient is diabetic or not. From the 9 original attributes of the data set, the different feature selection methods produce different reduced feature sets, as shown in Fig. 3.

Fig. 3. Assorted Features from Diabetes Data set by Different Data Mining Methods

After feature selection, each classification algorithm is evaluated 5 times, each time with a different feature set chosen by a different feature selection procedure. At the end of the evaluation, the best performance of each algorithm is recorded for comparison. Based on the best accuracy rates, the proposed model chooses two classification approaches along with their feature selection procedures and builds an ensemble framework to enhance the accuracy of the evaluation, which improves the performance of the decision support system. For the diabetes data set, the best outcome of each classification method and of the proposed approach is illustrated in Fig. 4.

Fig. 4. Act of Classical Data Mining Methods & Anticipated Hybrid Algorithm with Assorted Features for Recognition of Diabetes

Fig. 4 shows that the proposed hybrid framework produces better accuracy than the classical methods. The proposed framework dynamically combines Naïve Bayes and Hoeffding Tree in a layer form, because in the individual evaluation these two methods outperformed the Decision Tree, KStar and IBK classification techniques.

To confirm the effectiveness of the proposed mechanism, the approach was evaluated again with a different type of data set: the labor data set, which consists of 57 instances with 17 attributes. As in the previous experiment, feature selection was performed before classification. The feature set recommended by each method is illustrated in Fig. 5.

Fig. 5. Assorted Features from Labor Data set by Different Data Mining Methods

Fig. 6. Act of Classical Data Mining Methods & Anticipated Hybrid Algorithm with Assorted Features for Labor Verdict

The comparative values in Fig. 6 show the strength of the proposed approach. The two experiments above were performed with different data sets, and the improved results of the proposed mechanism show that it performs better than classical data mining methods.

To further test the effectiveness of the proposed approach against a modern decision support model, another experiment was performed with the Australian credit data set used by Yoichi Hayashi [23]. This data set consists of 690 instances with 14 attributes. The instances are of two types: 307 positive and 383 negative. Fig. 7 shows the feature sets selected by the different feature selection methods.

Fig. 7. Assorted Features from Australian Credit Data set

As in the experiments above, all the classification approaches were evaluated with each feature set, and the best performance of each algorithm along with its feature selection method was collected. Finally, the proposed mechanism dynamically chose the two methods with the highest accuracy and generated an ensemble framework by combining them in a layer form. The results depicted in Fig. 8 show that the proposed mechanism produces more accurate results than the available methods.

Fig. 8. Act of Classical Data Mining Methods & Anticipated Hybrid Algorithm with Assorted Features for Australian Credit Data set

The evaluation on the Australian credit risk data set shows that the proposed approach is capable of producing more accurate results than the other evaluated algorithms. These results indicate that the proposed approach is reliable and effective.

V. CONCLUSION

This paper presented a decision support model suitable for multiple domains. Based on a number of researchers' experiences, the proposed model dynamically chooses the most appropriate combination of classification and feature selection procedures for the supplied data set and pools the functionality of the chosen methods in a layer format to enhance accuracy. The performance of the proposed model was examined with five of the most popular classification algorithms over three different data sets: Diabetes, Labor and Australian credit risk. The outcome of each experiment shows that the proposed approach enhances prediction accuracy.

REFERENCES
[1] Shakiba Khademolqorani, Ali Zeinal Hamadani, "An Adjusted Decision Support System through Data Mining and Multiple Criteria Decision Making," 2nd International Conference on Integrated Information, Elsevier, Procedia - Social and Behavioral Sciences 73 (2013), pp. 388-395.
[2] R. Rupnik, M. Kukar, "Decision Support System To Support Decision Processes With Data Mining," Journal of Information and Organizational Sciences, Volume 31, Number 1 (2007), pp. 217-231.
[3] J. H. Heinrichs and J. S. Lim, "Integrating Web-based Data Mining Tools with Business Models for Knowledge Management," Decision Support Systems, Vol. 35, No. 1, 2003, pp. 103-112.
[4] C. C. Kuan, "Decision Support System for Tourism Development: System Dynamics Approach," Journal of Computer Information Systems, Vol. 45, No. 1, 2004, pp. 104-112.
[5] A. Patelis, K. Metaxiotis, K. Nikolopoulos and V. Assimakopoulos, "ForTV: Decision Support System for Forecasting Television Viewership," Journal of Computer Information Systems, Vol. 46, No. 1, 2003, pp. 25-34.
[6] Emrana Kabir Hashi, Md. Shahid Uz Zaman and Md. Rokibul Hasan, "An Expert Clinical Decision Support System to Predict Disease Using Classification Techniques," IEEE International Conference on Electrical, Computer and Communication Engineering (ECCE), February 16-18, 2017, pp. 396-400.
[7] J. Li, L. Wei, G. Li, W. Xu, "An evolution strategy-based multiple kernels multi-criteria programming approach: The case of credit decision making," Decision Support Systems, 2011, pp. 292-298.
[8] Lalit Dole, Jayant Rajurkar, "A Decision Support System for Predicting Student Performance," International Journal of Innovative Research in Computer and Communication Engineering, Vol. 2, Issue 12, December 2014, pp. 7232-7237.
[9] Veenita Kunwar, Khushboo Chandel, A. Sai Sabitha, Abhay Bansal, "Chronic Kidney Disease Analysis Using Data Mining Classification Techniques," IEEE 6th International Conference - Cloud System and Big Data Engineering (Confluence), 2016, pp. 300-305.
[10] Meryem Ouahilal, Mohammed El Mohajir, Mohamed Chahhou, Badr Eddine El Mohajir, "A Comparative Study of Predictive Algorithms for Business Analytics and Decision Support systems: Finance as a Case Study," IEEE International Conference on Information Technology for Organizations Development (IT4OD), 2016.
[11] Yicheng Jiang, Bensheng Qiu, Chunsheng Xu, and Chuanfu Li, "The Research of Clinical Decision Support System Based on Three-Layer Knowledge Base Model," Journal of Healthcare Engineering, Volume 2017, Article ID 6535286, pp. 1-8.
[12] Saeed Piri, Dursun Delen, Tieming Liu, Hamed M. Zolbanin, "A data analytics approach to building a clinical decision support system for diabetic retinopathy: Developing and deploying a model ensemble," Elsevier, Decision Support Systems, 2017.
[13] Emrana Kabir Hashi, Md. Shahid Uz Zaman and Md. Rokibul Hasan, "An Expert Clinical Decision Support System to Predict Disease Using Classification Techniques," IEEE International Conference on Electrical, Computer and Communication Engineering (ECCE), February 16-18, 2017, pp. 396-400.
[14] A. Iyer, S. Jeyalatha and R. Sumbaly, "Diagnosis of diabetes using classification mining techniques," Int. J. of Data M. & Know. Manag. Process, IJDKP, United Arab Emirates, vol. 5, pp. 1-14, January 2015.
[15] Y. Hayashi and S. Yukita, "Rule extraction using Recursive-Rule extraction algorithm with J48graft combined with sampling selection techniques for the diagnosis of type 2 diabetes mellitus in the Pima Indian dataset," Informatics in Medicine Unlocked, Elsevier, Vol. 2, pp. 92-104, 2016.




[16] S. Sa’di, A. Maleki, R. Hashemi, Z. Panbechi and K. Chalabi, "Comparison of data mining algorithms in the diagnosis of type II diabetes," Int. J. on Comput. Sci. & App., Vol. 5, pp. 1-12, October 2015.
[17] G. Huang, K. Huang, T. Lee, and J. Tzu-Ya Weng, "An interpretable rule-based diagnostic classification of diabetic nephropathy among type 2 diabetes patients," BMC Bioinformatics, Vol. 16, pp. 55-65, 2015.
[18] Yung-Fu Chen, Chih-Sheng Lin, Kuo-An Wang, La Ode Abdul Rahman, Dah-Jye Lee, Wei-Sheng Chung and Hsuan-Hung Lin, "Design of a Clinical Decision Support System for Fracture Prediction Using Imbalanced Dataset," Hindawi Journal of Healthcare Engineering, Volume 2018, Article ID 9621640, 13 pages.
[19] M. Monadi, Y. Javadian, M. Cheraghi, B. Heidari, and M. Amiri, "Impact of treatment with inhaled corticosteroids on bone mineral density of patients with asthma: related with age," Osteoporosis International, vol. 26, no. 7, 2015, pp. 2013–2018.
[20] Stanislaw Drosio and Stanislaw Stanek, "Building a safe society environment: a summary of hybrid approaches to crisis decision support systems," Journal of Decision Systems, 2018, Vol. 27, No. s1, pp. 181–190.
[21] Shakuntala Jatav and Vivek Sharma, "An Algorithm For Predictive Data Mining Approach In Medical Diagnosis," International Journal of Computer Science & Information Technology (IJCSIT), Vol 10, No 1, February 2018, pp. 11-20.
[22] Mohamed Hamada and Mohammed Hassan, "Artificial Neural Networks and Particle Swarm Optimization Algorithms for Preference Prediction in Multi-Criteria Recommender Systems," MDPI Journal Informatics, 2018, pp. 1-16.
[23] Yoichi Hayashi, "Application of a rule extraction algorithm family based on the Re-RX algorithm to financial credit risk assessment from a Pareto optimal perspective," Elsevier Operations Research Perspectives 3, 2016, pp. 32–42.

