Improved Algorithm For Network Intrusion Detection System Based On K-Nearest Neighbor: Survey

IJIRST International Journal for Innovative Research in Science & Technology| Volume 3 | Issue 03 | August 2016
ISSN (online): 2349-6010
Jyotika Gupta
M. Tech. Student
Department of Computer Science & Engineering
JSS Academy of Technical Education, Noida, India
Krishna Nand Chaturvedi

Assistant Professor
Department of Computer Science & Engineering
JSS Academy of Technical Education, Noida, India
Abstract
Since the advent of networked systems, security threats, commonly known as intrusions, have become a critical concern for networks, data and information systems. As networks have grown drastically, detection systems have been needed to counter these threats. As systems have developed, attackers have become stronger and repeatedly compromise system security. An Intrusion Detection System (IDS) has therefore become a vital tool in network security. Detection and prevention of such attacks, called intrusions, depends largely on the capability and efficiency of the IDS. Many ensemble mechanisms have been proposed using various methodologies, each with its own benefits and shortcomings. In this paper we focus on different classification techniques.
Keywords: Intrusion Detection, Anomaly Detection, Misuse Detection, KDD Cup 99, Ensemble Approaches
_______________________________________________________________________________________________________
I. INTRODUCTION
Over the past two decades, alongside rapid progress in Internet-based technology, new application areas for computer networks have emerged. At the same time, wide-ranging growth of LAN and WAN applications in the business, financial, industrial, security and healthcare sectors has made us more dependent on computer networks. All of these application areas have made the network an attractive target for abuse and a major vulnerability for the community. What was a fun job or a challenging feat for a few people became a nightmare for others. In many cases, malicious acts have turned this nightmare into reality.
In addition to hacking, new entities such as worms, Trojans and viruses have brought further panic to the networked society. As the current situation is a relatively new phenomenon, network defenses are weak. However, given the popularity of computer networks, their connectivity and our ever-growing dependence on them, realization of a threat can have devastating consequences. Securing such a vital infrastructure has become the top-priority research area for many researchers.
The aim of this paper is to study current trends in Intrusion Detection Systems (IDS) and to examine some of the open problems in this research area. Compared with some mature and well-established research areas, IDS is a young field of research. However, due to its mission-critical nature, it has attracted significant attention. The volume of research on this subject is constantly rising, and every day more researchers become involved in this field. The threat of a new wave of cyber or network attacks is not just a possibility to be considered; it is an accepted fact that can occur at any time. The current IDS is far from a reliable protective system; instead, the main idea is to make it possible to detect novel network attacks.
One of the main concerns is to make sure that, in the event of an intrusion attempt, the system is able to detect and report it. Once detection is reliable, the next step is to protect the network (response). In other words, the IDS will be upgraded to an Intrusion Detection and Response System (IDRS). However, no part of the IDS is currently fully reliable. Even so, researchers are working concurrently on both the detection and the response sides of the system. A major problem in IDS is guaranteeing intrusion detection. This is why, in many cases, IDSs are used together with a human expert. In this setup, the IDS assists the network security officer and is not reliable enough to be trusted on its own. The reason is the inability of IDSs to detect new or modified attack patterns. Although the latest generation of detection techniques has considerably improved the detection rate, there is still a long way to go.
There are two main approaches to detecting intrusions: signature-based and anomaly-based intrusion detection. In the first approach, the attack patterns or the behavior of the intruder are modeled (the attack signature is modeled), and the system signals an intrusion once a match is detected. In the second approach, the normal behavior of the network is modeled, and the system raises an alarm when the behavior of the network does not match its normal behavior. There is one more Intrusion Detection (ID) approach, called specification-based intrusion detection. In this approach, the normal (expected) behavior of the host is specified and subsequently modeled. As the price of this security, the host's freedom of operation is limited. In this paper, these approaches are briefly discussed and compared.
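The difference between the first two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the byte patterns in `SIGNATURES`, the connections-per-second feature and the z-score threshold of 3 are all hypothetical and do not come from the surveyed systems.

```python
from statistics import mean, pstdev

# Hypothetical attack signatures, chosen only for illustration.
SIGNATURES = [b"/etc/passwd", b"' OR 1=1"]


def signature_alert(payload: bytes) -> bool:
    """Signature-based: signal an intrusion when a modeled pattern matches."""
    return any(sig in payload for sig in SIGNATURES)


def fit_normal_profile(samples):
    """Anomaly-based: model normal behavior as (mean, standard deviation)."""
    return mean(samples), pstdev(samples)


def anomaly_alert(value, profile, z_threshold=3.0):
    """Raise an alarm when a value deviates too far from the normal profile."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


# Assumed training data: connections per second seen during normal operation.
profile = fit_normal_profile([100, 102, 98, 101, 99])
print(signature_alert(b"GET /../../etc/passwd"))  # payload contains a known pattern
print(anomaly_alert(500, profile))                # value far from the normal profile
```

The signature check can only flag patterns it already knows, whereas the anomaly check can flag previously unseen behavior at the cost of possible false alarms.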
The thought of an intruder accessing the system without even being noticed is the worst nightmare for any network security officer. As current ID technology is not accurate enough to provide reliable detection, heuristic methodologies can be a way out. As a last line of defense, and in order to reduce the number of undetected intrusions, heuristic methods such as honeypots (HPs) can be deployed. HPs can be installed on any system and act as a trap or decoy for a resource.
Another major problem in this research area is the speed of detection. Computer networks are dynamic in the sense that the information and data within them are continuously changing. Therefore, to detect an intrusion accurately and promptly, the system has to work in real time. Working in real time means not just performing detection in real time, but also adapting to new dynamics in the network. Real-time IDS is an active research area pursued by many researchers. Most research efforts aim to introduce the most time-efficient methodologies, so that the applied methods are suitable for real-time implementation.
From a different perspective, two approaches can be envisaged for deploying an IDS: it can be either host-based or network-based. A host-based IDS protects only its own local machine (its host). In a network-based IDS, on the other hand, the ID process is distributed across the network. In this approach, where agent-based technology is widely applied, a distributed system protects the network as a whole. In this design, the IDS may control or monitor network firewalls, routers or switches as well as the client machines.
The main emphasis of this paper is on the detection part of the intrusion detection and response problem. Researchers have pursued different approaches, or combinations of approaches, to solve this problem. Each approach has its own theory and assumptions, because there is no exact behavioral model for the legitimate user, the intruder or the network itself.
II. LITERATURE SURVEY
Sumaiya Thaseen and Ch. Aswani Kumar propose an intrusion detection model using chi-square feature selection and a multi-class support vector machine. A parameter tuning technique is adopted to optimize the RBF kernel parameter gamma and the overfitting constant C. The proposed model achieves a high detection rate and a low false alarm rate in comparison to other traditional approaches.
Yogita B. Bhavsar and Kalyani C. Waghmare propose a method of intrusion detection using SVM that can reduce the time required to build the classification model and increase intrusion detection accuracy when a Gaussian RBF kernel is used. When data sets are properly processed and a proper SVM kernel, i.e. the Radial Basis Function (RBF), is selected, the method can overcome the main drawback of SVM, i.e. the extensive time required to build the model.
Asma Gul et al. note that combining more than one data mining algorithm may be used to offset the disadvantages of each. A combined approach should therefore be considered when choosing how to implement an intrusion detection system. Combining a number of trained classifiers leads to better performance than any single classifier.
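One simple way to combine trained classifiers is majority voting. The sketch below is a generic illustration, not the method of any surveyed paper; the three rule-based classifiers, the field names (`failed_logins`, `bytes_sent`, `duration_s`) and the thresholds are hypothetical stand-ins for trained models.

```python
from collections import Counter


# Three hypothetical base classifiers; the field names and thresholds are
# assumptions made for this example only.
def by_failed_logins(conn):
    return "attack" if conn["failed_logins"] > 3 else "normal"


def by_bytes_sent(conn):
    return "attack" if conn["bytes_sent"] > 10_000 else "normal"


def by_duration(conn):
    return "attack" if conn["duration_s"] < 1 else "normal"


def majority_vote(classifiers, conn):
    """Return the label predicted by most of the base classifiers."""
    votes = Counter(clf(conn) for clf in classifiers)
    return votes.most_common(1)[0][0]


ensemble = [by_failed_logins, by_bytes_sent, by_duration]
conn = {"failed_logins": 6, "bytes_sent": 200, "duration_s": 0.4}
print(majority_vote(ensemble, conn))
```

Using an odd number of base classifiers with two classes avoids ties; with more classes, a tie-breaking rule would have to be chosen explicitly.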
Salma Elhag et al. propose a new methodology based on genetic fuzzy systems (GFS) and pairwise learning for the development of a robust and interpretable IDS. The paper makes use of a multi-objective fuzzy model (MOGFIDS), three different GFS schemes developed by Abadeh et al., and a genetic approach for boosting fuzzy association rules. Applying this divide-and-conquer strategy improves the individual accuracy for the different classes of the problem, which is reflected in the high value of the average accuracy metric.
Nikos Tsikoudis et al. present LEoNIDS, whose key idea is to process with higher priority the first few bytes of each flow, which have a higher probability of carrying an attack, to achieve lower latency for them and faster attack detection. The paper proposes two alternative techniques: time sharing, which uses typical priority queue scheduling, and space sharing, which uses dedicated cores with low utilization to process high-priority packets.
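The time-sharing idea can be sketched with a priority queue. This is a minimal illustration and not the LEoNIDS implementation: the class name `FlowPriorityQueue`, the `cutoff` of 1024 bytes and the two-level priority scheme are assumptions made for the example.

```python
import heapq
from collections import defaultdict
from itertools import count

HIGH, LOW = 0, 1  # smaller value is served first


class FlowPriorityQueue:
    """Serve packets from the first `cutoff` bytes of each flow first."""

    def __init__(self, cutoff=1024):  # cutoff value is an assumption
        self.cutoff = cutoff
        self.bytes_seen = defaultdict(int)  # bytes received so far, per flow
        self.heap = []
        self.arrival = count()  # keeps arrival order within a priority level

    def push(self, flow_id, payload):
        priority = HIGH if self.bytes_seen[flow_id] < self.cutoff else LOW
        self.bytes_seen[flow_id] += len(payload)
        heapq.heappush(self.heap, (priority, next(self.arrival), flow_id, payload))

    def pop(self):
        priority, _, flow_id, payload = heapq.heappop(self.heap)
        return flow_id, payload, priority


queue = FlowPriorityQueue(cutoff=4)
queue.push("a", b"12345")  # start of flow a: high priority
queue.push("a", b"678")    # flow a is already past the cutoff: low priority
queue.push("b", b"x")      # start of flow b: high priority
served = [queue.pop() for _ in range(3)]
print(served)
```

The start of flow `b` is served before the later packet of flow `a`, even though it arrived afterwards, which is the latency benefit the time-sharing technique aims for.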
Wenying Feng et al. propose a new algorithm (CSVAC) for generating classifiers with clustering and apply it to the intrusion detection problem. This approach combines two existing machine learning methods (SVM and CSOACN) to achieve better performance in both detection accuracy and running time.
Antonis Papadogiannakis et al. present selective packet discarding, a best-effort approach that gracefully reduces the amount of traffic that reaches the detection engine of the NIDS by selectively discarding packets that are less likely to affect its detection accuracy. This is achieved by setting a cutoff limit on the number of packets to be inspected for each network flow.
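The per-flow cutoff can be sketched as a small filter placed in front of the detection engine. The class name `PacketCutoffFilter` is hypothetical, and the sketch applies the cutoff unconditionally, which is a simplifying assumption; the text above only states that a per-flow limit on inspected packets is used.

```python
from collections import defaultdict


class PacketCutoffFilter:
    """Forward only the first `cutoff` packets of each flow to the detector."""

    def __init__(self, cutoff):
        self.cutoff = cutoff
        self.seen = defaultdict(int)  # packets observed so far, per flow

    def should_inspect(self, flow_id):
        self.seen[flow_id] += 1
        return self.seen[flow_id] <= self.cutoff


packet_filter = PacketCutoffFilter(cutoff=2)
decisions = [packet_filter.should_inspect("flow-1") for _ in range(3)]
print(decisions)
```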
Wei-Chao Lin et al. present a novel feature representation approach, namely CANN, that combines cluster centers and nearest neighbors for effective and efficient intrusion detection. The CANN approach first transforms the original feature representation of a given dataset into a one-dimensional distance-based feature. This new dataset is then used to train and test a k-NN classifier.
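A rough sketch of such a transformation is given below. It rests on one plausible reading that the text above does not spell out: each sample is mapped to the sum of its distances to all cluster centers plus the distance to its nearest neighbor. The cluster centers are assumed to be precomputed (for example by k-means), and the nearest neighbor is searched over the whole dataset for simplicity.

```python
import math


def cann_feature(sample, centers, others):
    """Map one sample to a single distance-based value."""
    to_centers = sum(math.dist(sample, center) for center in centers)
    to_nearest = min(math.dist(sample, other) for other in others)
    return to_centers + to_nearest


def transform(dataset, centers):
    """Turn a dataset of feature vectors into one-dimensional features."""
    features = []
    for i, sample in enumerate(dataset):
        others = dataset[:i] + dataset[i + 1:]  # every sample except this one
        features.append(cann_feature(sample, centers, others))
    return features


# Toy data and cluster centers, assumed for illustration.
dataset = [(0, 0), (0, 1), (4, 0)]
centers = [(0, 0), (4, 0)]
one_dimensional = transform(dataset, centers)
print(one_dimensional)
```

The resulting list holds one number per sample, so a k-NN classifier trained on it only has to compare scalar values.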
Wang Huai-bin et al. put forward a new algorithm called S-K, in which a SOM neural network is combined with the K-means algorithm to detect abnormal nodes in a wireless sensor network, making the system more flexible, more precise and easier to implement.
Alexander J. Gibberd and James D. B. Nelson present a two-stage method for learning dynamic Gaussian graphical models (GGM) with change points in the graphical structure. This method, based on two-stage regularization, has computational efficiency comparable to existing dynamic programming based methods. From an application point of view, they demonstrate how relational structures can be uncovered when modeling network traffic data, which may be useful for building improved attack filters for IDS.
Zubair Md. Fadlullah et al. propose a simple yet effective IDS that can be easily implemented in the cognitive radio software of secondary users. The proposed IDS uses a non-parametric CUSUM algorithm, which offers anomaly detection. By learning the normal mode of operation and the system parameters of a cognitive radio network (CRN), the proposed IDS is able to detect suspicious (i.e., anomalous or abnormal) behavior arising from an attack. The paper also presents an example of a jamming attack against a CRN secondary user and demonstrates how the proposed IDS is able to detect the attack with low detection latency.
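A generic one-sided CUSUM detector can be sketched as follows. It is not the detector of the surveyed paper: the function name, the `slack` and `threshold` parameters and the sample values are assumptions. The cumulative score stays near zero while observations match the learned normal mean and grows once they drift above it.

```python
def cusum_alarm_index(observations, normal_mean, slack, threshold):
    """Return the index of the first alarm, or None if no alarm is raised."""
    score = 0.0
    for i, value in enumerate(observations):
        # Accumulate only the excess over the normal level plus a slack term.
        score = max(0.0, score + (value - normal_mean - slack))
        if score > threshold:
            return i
    return None


# Assumed measurements: a monitored quantity that rises during an attack.
normal_traffic = [1.0, 1.2, 0.9, 1.1]
under_attack = normal_traffic + [3.0, 3.2, 3.1]
print(cusum_alarm_index(normal_traffic, normal_mean=1.0, slack=0.5, threshold=3.0))
print(cusum_alarm_index(under_attack, normal_mean=1.0, slack=0.5, threshold=3.0))
```

A lower `threshold` shortens detection latency but makes false alarms more likely, so the two have to be balanced for the deployment at hand.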
Robert Mitchell and Ing-Ray Chen classify modern cyber-physical system (CPS) Intrusion Detection System (IDS) techniques based on two design dimensions: detection technique and audit material. The paper also summarizes the advantages and drawbacks of each dimension's options.
III. SUMMARY OF DIFFERENT CLASSIFIERS
Table 1
Summary of different Classifiers

Classifier: Support Vector Machine
Method: A support vector machine constructs a hyperplane or set of hyperplanes in a high- or infinite-dimensional space, which can be used for classification, regression or other tasks.
Parameters: The effectiveness of SVM lies in the selection of the kernel and soft margin parameters. For kernels, different pairs of (C, γ) values are tried and the one with the best cross-validation accuracy is picked. Trying exponentially growing sequences of C is a practical method to identify good parameters.
Advantages: 1. Highly accurate. 2. Able to model complex nonlinear decision boundaries. 3. Less prone to overfitting than other methods.
Disadvantages: 1. High algorithmic complexity and extensive memory requirements of the required quadratic programming in large-scale tasks. 2. The choice of the kernel is difficult. 3. The speed both in training and testing is slow.

Classifier: K Nearest Neighbor
Method: An object is classified by a majority vote of its neighbors, with the object being assigned to the class most common amongst its k nearest neighbors (k is a positive integer). If k = 1, the object is simply assigned to the class of its nearest neighbor.
Parameters: Two parameters are considered to optimize the performance of kNN: the number k of nearest neighbors and the feature space transformation.
Advantages: 1. Analytically tractable. 2. Simple to implement. 3. Uses local information, which can yield highly adaptive behavior. 4. Lends itself very easily to parallel implementations.
Disadvantages: 1. Large storage requirements. 2. Highly susceptible to the curse of dimensionality. 3. Slow in classifying test tuples.

Classifier: Artificial Neural Network
Method: An ANN is an adaptive system that changes its structure based on external or internal information that flows through the network during the learning phase.
Parameters: ANN uses a cost function C, an important concept in learning, as it is a measure of how far a particular solution is from an optimal solution to the problem to be solved.
Advantages: 1. Requires less formal statistical training. 2. Able to implicitly detect complex nonlinear relationships between dependent and independent variables. 3. High tolerance to noisy data. 4. Availability of multiple training algorithms.
Disadvantages: 1. "Black box" nature. 2. Greater computational burden. 3. Proneness to overfitting. 4. Requires long training time.

Classifier: Bayesian Method
Method: Based on Bayes' rule, using the joint probabilities of sample observations and classes, the algorithm attempts to estimate the conditional probabilities of classes given an observation.
Parameters: In Bayes, all model parameters (i.e., class priors and feature probability distributions) can be approximated with relative frequencies from the training set.
Advantages: 1. The Naive Bayesian classifier simplifies the computations. 2. Exhibits high accuracy and speed when applied to large databases.
Disadvantages: 1. Relies on the assumption of class conditional independence. 2. Lack of available probability data.

Classifier: Decision Tree
Method: A decision tree builds a binary classification tree. Each node corresponds to a binary predicate on one attribute; one branch corresponds to the positive instances of the predicate and the other to the negative instances.
Parameters: Decision tree induction uses parameters such as a set of candidate attributes and an attribute selection method.
Advantages: 1. Construction does not require any domain knowledge. 2. Can handle high-dimensional data. 3. Representation is easy to understand. 4. Able to process both numerical and categorical data.
Disadvantages: 1. Output attribute must be categorical. 2. Limited to one output attribute. 3. Decision tree algorithms are unstable. 4. Trees created from numeric datasets can be complex.
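The majority-vote rule given for K Nearest Neighbor in Table 1 can be written in a few lines. This is a minimal sketch assuming Euclidean distance and toy two-dimensional data; the function name `knn_predict` and the sample points are illustrative only.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Every training sample is compared with the query, which is why k-NN
    # is slow at classification time and must store the whole training set.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Toy training set, assumed for illustration.
train = [
    ((0, 0), "normal"), ((0, 1), "normal"), ((1, 0), "normal"),
    ((5, 5), "attack"), ((5, 6), "attack"), ((6, 5), "attack"),
]
print(knn_predict(train, (0.2, 0.1), k=3))
print(knn_predict(train, (5.5, 5.2), k=1))
```

With k = 1 the query simply takes the label of its single nearest neighbor, as stated in the table.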
IV. CONCLUSION
Given our growing dependence on the Internet and the rising number of intrusion incidents, building effective intrusion detection systems is essential for protecting Internet resources, yet it remains a great challenge. Many researchers have successfully used k-NN for supervised intrusion detection. Here, k-NN maps network traffic into predefined classes, i.e. normal or a specific attack type, based on training with a labeled dataset. However, the detection rate (DR) and false positive rate (FPR) of k-NN-based IDS still need to be improved. In this study, we propose an ensemble approach, called MANNE, for k-NN-based IDS that evolves k-NN with a Multi-Objective Genetic Algorithm to address this problem. It helps the IDS achieve a high DR, a low FPR, improved accuracy and, in turn, high intrusion detection capability.
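The two metrics named above can be computed from labeled predictions. The paper does not define them, so the sketch uses the usual definitions as an assumption: DR = TP / (TP + FN) and FPR = FP / (FP + TN), with "attack" as the positive class. The label strings and sample data are illustrative.

```python
def detection_metrics(y_true, y_pred, attack_label="attack"):
    """Return (detection rate, false positive rate) for binary IDS output."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == attack_label:
            if pred == attack_label:
                tp += 1  # attack correctly detected
            else:
                fn += 1  # attack missed
        else:
            if pred == attack_label:
                fp += 1  # normal traffic flagged as attack
            else:
                tn += 1  # normal traffic correctly passed
    dr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return dr, fpr


y_true = ["attack"] * 4 + ["normal"] * 4
y_pred = ["attack", "attack", "attack", "normal",
          "normal", "normal", "normal", "attack"]
print(detection_metrics(y_true, y_pred))
```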
REFERENCES
[1] Amin Dastanpour, Suhaimi Ibrahim, and Reza Mashinchi. "Using Genetic Algorithm to Supporting Artificial Neural Network for Intrusion Detection System." In The International Conference on Computer Security and Digital Investigation (ComSec2014), pp. 1-13. The Society of Digital Information and Wireless Communication, 2014.
[2] Alexander J. Gibberd and James D. B. Nelson. "High Dimensional change point detection with a dynamic graphical Lasso." IEEE, 2014.
[3] Alina Oprea, Zhou Li, Ting-Fang Yen, Sang H. Chin, and Sumayah Alrwais. "Detection of early-stage enterprise infection by mining large-scale log data." In Dependable Systems and Networks (DSN), 2015 45th Annual IEEE/IFIP International Conference on, pp. 45-56. IEEE, 2015.
[4] Antonis Papadogiannakis, Michalis Polychronakis, and Evangelos P. Markatos. "Improving the Accuracy of Network Intrusion Detection Systems Under Load Using Selective Packet Discarding." ACM, 2010.
[5] Asma Gul, Aris Perperoglou, Zardad Khan, Osama Mahmoud, Miftahuddin, Werner Adler, Berthold. "Ensemble of a subset of k-NN classifiers." Springer, 2015.
[6] Feng Gu, Julie Greensmith, and U. Aickelin. "The dendritic cell algorithm for intrusion detection." Bio-Inspired Communications and Networking, IGI Global (2011): 84-102.
[7] Kalyanmoy Deb, Amrit Pratap, Sameer Agarwal, and T. Meyarivan. "A Fast and Elitist Multi objective Genetic Algorithm: NSGA-II." IEEE, 2002.
[8] Liyuan Xiao, Yetian Chen, and Carl K. Chang. "Bayesian Model Averaging of Bayesian Network Classifiers for Intrusion Detection." In Computer Software and Applications Conference Workshops (COMPSACW), 2014 IEEE 38th International, pp. 128-133. IEEE, 2014.
[9] M. Govindarajan. "Hybrid Intrusion Detection Using Ensemble of Classification Methods." IJ Computer Network and Information Security 2 (2014): 45-53.
[10] Mradul Dhakar and Akhilesh Tiwari. "A Novel Data Mining based Hybrid Intrusion Detection Framework." Journal of Information and Computing Science 9, no. 1 (2014): 037-048.
[11] Nikos Tsikoudis, Antonis Papadogiannakis, and Evangelos P. Markatos. "LEoNIDS: a Low-latency and Energy-efficient Network-level Intrusion Detection System." IEEE, 2013.
[12] Reema Patel, Amit Thakkar, and Amit Ganatra. "A Survey and Comparative Analysis of Data Mining Techniques for Network Intrusion Detection Systems." ISSN: 2231-2307, Volume-2, Issue-1, March 2012.
[13] Robert Mitchell and Ing-Ray Chen. "A Survey of Intrusion Detection Techniques for Cyber-Physical Systems." ACM Comput. Surv. 46, 4, Article 55, March 2014.
[14] Salma Elhag, Alberto Fernández, Abdullah Bawakid, Saleh Alshomrani, and Francisco Herrera. "On the combination of genetic fuzzy systems and pairwise learning for improving detection rates on Intrusion Detection Systems." ScienceDirect, 11 August 2014.
[15] Sumaiya Thaseen and Ch. Aswani Kumar. "Intrusion detection model using fusion of chi-square feature selection and multi class SVM." 4 October 2015; accepted 3 December 2015.
[16] Wang Huai-bin, Yang Hong-liang, Xu Zhi-jian, and Yuan Zheng. "A clustering algorithm use SOM and K-Means in Intrusion Detection." IEEE, 2010.
[17] Wei-Chao Lin, Shih-Wen Ke, and Chih-Fong Tsai. "CANN: An intrusion detection system based on combining cluster centers and nearest neighbors." ScienceDirect, 23 January 2015.
[18] Weiming Hu, Jun Gao, Yanguo Wang, Ou Wu, and Stephen Maybank. "Online adaboost-based parameterized methods for dynamic distributed network intrusion detection." Cybernetics, IEEE Transactions on 44, no. 1 (2014): 66-82.
[19] Wenying Feng, Qinglei Zhang, Gongzhu Hu, and Jimmy Xiangji Huang. "Mining network data for intrusion detection through combining SVMs with ant colony networks." ScienceDirect, 2014.
[20] Yogita B. Bhavsar and Kalyani C. Waghmare. "Intrusion Detection System Using Data Mining Technique: Support Vector Machine." International Journal of Emerging Technology and Advanced Engineering, Volume 3, Issue 3, March 2013.
[21] Zubair Md. Fadlullah, Hiroki Nishiyama, and Nei Kato (Tohoku University). "An Intrusion Detection System (IDS) for Combating Attacks Against Cognitive Radio Networks." IEEE, 2013.