
International Journal of Computer Trends and Technology (IJCTT), Volume 4, Issue 7, July 2013

ISSN: 2231-2803 http://www.ijcttjournal.org Page 2031



Identifying Network Intrusions using One
Dimensional distance
Greeshma K
Department of Computer Science and Engineering
Calicut University,Kerala,India.

Abstract: Firewalls and other simple boundary devices lack the
intelligence needed to observe, recognize, and identify attack
signatures that may be present in the traffic they monitor and
the log files they collect. This deficiency explains the need for
intrusion detection systems (IDS), which help maintain proper
network security. The simplest way to define an IDS is as a
specialized tool that knows how to read and interpret the
contents of log files from routers, firewalls, servers, and other
network devices. To detect intrusions effectively, this paper
introduces a new approach that combines clustering and
classification. Two distances are measured and summed: the first
is the distance between each data sample and its cluster center,
and the second is the distance between the data sample and its
nearest neighbor in the same cluster. The resulting
one-dimensional, distance-based feature is then used to
represent each data sample.
I. INTRODUCTION
An intrusion detection system (IDS) is a device or software
application that monitors network or system activities for
malicious activities or policy violations and produces reports
to a management station. An IDS inspects all inbound and
outbound network activity and identifies suspicious patterns
that may indicate a network or system attack from someone
attempting to break into or compromise a system. Intrusion
Detection Systems help information systems prepare for and
deal with attacks. They accomplish this by collecting
information from a variety of system and network sources
and then analyzing it for possible security problems.
Intrusion detection provides monitoring and analysis of user
and system activity, auditing of system configurations and
vulnerabilities, assessment of the integrity of critical system
and data files, statistical analysis of activity patterns based
on matching to known attacks, abnormal activity analysis, and
operating system auditing.
Intrusion detection and prevention systems (IDPSes) typically
record information related to observed events, notify security
administrators of important observed
events and produce reports. Many IDPSes can also respond to
a detected threat by attempting to prevent it from succeeding.
They use several response techniques, which involve the IDPS
stopping the attack itself, changing the security environment
(e.g. reconfiguring a firewall) or changing the attack's content.
Intrusion detection (ID) is a type of security management
system for computers and networks. An ID system gathers and
analyzes information from various areas within a computer or
a network to identify possible security breaches, which
include both intrusions (attacks from outside the organization)
and misuse (attacks from within the organization). ID uses
vulnerability assessment (sometimes referred to as scanning),
which is a technology developed to assess the security of a
computer system or network.
The three main components of an intrusion detection system
are as follows. The first is the Network Intrusion Detection
System (NIDS), which analyzes traffic passing over the entire
subnet, works in promiscuous mode, and matches that traffic
against a library of known attacks. The second is the Network
Node Intrusion Detection System (NNIDS), which analyzes the
traffic passed from the network to a specific host; traffic is
monitored on that single host only and not for the entire
subnet. The third is the Host Intrusion Detection System
(HIDS), which takes a snapshot of the existing system files
and compares it with the previous snapshot. If critical system
files were modified or deleted, an alert is sent to the
administrator to investigate.
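
The snapshot comparison performed by a HIDS can be sketched as follows. This is a minimal illustration, not the mechanism of any particular product: the function names `take_snapshot` and `compare_snapshots` are hypothetical, file contents are passed in as an in-memory dictionary rather than read from disk, and SHA-256 is assumed as the fingerprint.

```python
import hashlib


def take_snapshot(files):
    """Map each path to the SHA-256 digest of its content.

    `files` is a dict of path -> bytes (an assumption made for this sketch;
    a real HIDS would read the files from disk).
    """
    return {path: hashlib.sha256(content).hexdigest()
            for path, content in files.items()}


def compare_snapshots(previous, current):
    """Return (modified, deleted) paths relative to the previous snapshot."""
    modified = sorted(p for p in previous
                      if p in current and previous[p] != current[p])
    deleted = sorted(p for p in previous if p not in current)
    return modified, deleted


# Hypothetical example data.
before = take_snapshot({"/etc/passwd": b"root", "/etc/hosts": b"localhost"})
after = take_snapshot({"/etc/passwd": b"root\nintruder"})
modified, deleted = compare_snapshots(before, after)
if modified or deleted:
    print("ALERT: modified", modified, "deleted", deleted)
```

Any path listed in either result would trigger the alert to the administrator.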
Therefore, in this paper, we propose a new approach for
effective and efficient intrusion detection. It is based on
combining cluster centers and nearest neighbours. Specifically,
given a dataset, the k-means clustering algorithm is used to
extract the cluster centers of each pre-defined category. Then,
the nearest neighbor of each data sample in the same cluster is
identified. Next, the distances between a data sample and the
cluster centers are added to the distance between that sample
and its nearest neighbor. The sum is a new distance that serves
as the feature representing the sample in the given dataset.
Consequently, the new dataset, which contains only one
dimension (i.e. the distance-based feature representation), is
used for classification with the averaged one-dependence
estimator (AODE), which allows for effective and efficient
intrusion detection.

II. LITERATURE SURVEY

Network security is a large and growing area of concern for
every network. Most network environments face an
ever-increasing number of security threats in the form of
Trojans, worm attacks, and viruses that can damage computer
systems and communication channels. Firewalls are used as a
security checkpoint in a network environment, but different
types of security issues still keep arising. In order to
further protect the network from illegal access, the concept of
the Intrusion Detection System (IDS) and the Intrusion
Prevention System (IPS) is gaining popularity. Intrusion
detection is the process of monitoring the events occurring in
a computer system or network and analyzing them for signs of
possible incidents, which are violations or imminent threats of
violation of computer security policies or standard security
policies. Intrusion prevention is the process of performing
intrusion detection and attempting to stop detected possible
incidents.
A. HIDE: Hierarchical Network Intrusion Detection System
Using Statistical Preprocessing and Neural Network
Classification
HIDE is an anomaly network intrusion detection system,
with hierarchical architecture, that uses statistical models and
neural network classifiers to detect attacks. Its authors report
experimental results on the performance of five different types
of neural networks, as well as the results of traffic intensity
stress-testing on HIDE. The system is a distributed
hierarchical application, which consists of several tiers with
each tier containing several Intrusion Detection Agents
(IDAs). IDAs are IDS components that monitor the activities
of a host or a network. Different tiers correspond to different
network scopes that are protected by agents affiliated to them.
The intrusion detection system can be divided into three tiers.
Tier 1 agents monitor system activities of the servers and
bridges within a department and periodically generate reports
for Tier 2 agents. Tier 2 agents detect the network status of a
departmental LAN based on the network traffic that they
observe as well as the reports from the Tier 1 agents within
the LAN. Tier 3 agents collect data from the Tier 1 agents at
the firewall and the router as well as data of Tier 2 agents.
B. Intrusion detection in MANET using classification
algorithms
This work employs statistical classification algorithms to
perform intrusion detection in mobile ad hoc networks (MANETs).
Such algorithms have the advantages that they are largely
automated, that they can be quite accurate, and that they are
rooted in statistics. For that reason, they are prime candidates
for use in cost-sensitive classification problems. After
training, they can be used for detection with arbitrary cost
matrices. They have been extensively studied, both theoretically
and experimentally, and have been used with a high degree of
success in many applications, including intrusion detection in
wired networks.
C. One-Dependence Estimators for Accurate Detection of
Anomalous Network Traffic
In this paper, prior to the application of any training
algorithm on a given data set, all features (attributes) are
converted to a format that is intelligible to the
classification algorithm. This reduces the risk that the impact
of certain features on the outcome of the classification is
nullified. In the NSL-KDD data set, all features take numeric
values except three, namely protocol type, service, and flag.
As part of the preprocessing phase, the numeric features are
converted into nominal values so that the AODE training
algorithm, which is used for network traffic classification,
can operate on the data set unaffected. The numeric-to-nominal
conversion is achieved through discretization of the numeric
values, using techniques such as equal-frequency binning, in
which bin boundaries are chosen so that each bin receives
approximately the same number of data values.
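
Equal-frequency binning of a single numeric feature can be sketched as follows. This is an illustrative sketch rather than the preprocessing code used in the surveyed paper: the function names and the sample `durations` values are hypothetical, and ties are resolved by sending equal values to the same bin.

```python
from bisect import bisect_right


def equal_frequency_cut_points(values, n_bins):
    """Choose cut points so each bin holds roughly the same number of values."""
    if not values or n_bins < 1:
        raise ValueError("need at least one value and one bin")
    ordered = sorted(values)
    n = len(ordered)
    cuts = []
    for b in range(1, n_bins):
        cut = ordered[b * n // n_bins]
        # Skip repeated cut points caused by many identical values.
        if not cuts or cut > cuts[-1]:
            cuts.append(cut)
    return cuts


def discretize(value, cuts):
    """Return the index of the bin (nominal value) that `value` falls into."""
    return bisect_right(cuts, value)


# Hypothetical numeric feature, e.g. connection duration in seconds.
durations = [0, 0, 1, 2, 5, 9, 30, 120, 400]
cuts = equal_frequency_cut_points(durations, 3)
nominal = [discretize(v, cuts) for v in durations]
print(cuts, nominal)
```

With nine values and three bins, each bin receives three values, even though the numeric ranges covered by the bins differ widely.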

III. PROPOSED METHOD

To avoid the problems of existing systems, we introduce a new
system for effective and efficient intrusion detection. The
proposed approach is based on two distances used as new
features: the distance between a data sample and its cluster
center, and the distance between the sample and its nearest
neighbor. It comprises extracting cluster centers, finding
nearest neighbors, and forming the new data. The purpose of
this algorithm is to assign an unlabeled data sample to the
class of its k nearest neighbors. This process provides
effective results.


A. Selection of features:
The KDD Cup 99 dataset has been used to examine this
technique. After the dataset is loaded, it passes to the
feature classification step, in which the data are classified
into five classes. After this step, the classified data are
separated into a set of 41 features. Records are detected as
normal or anomalous. A feature selection algorithm is used to
eliminate the unimportant features.

B. Distance of cluster center:
To extract cluster centers, any clustering technique can be
applied at this stage. In this paper, the k-means clustering
algorithm is used. As an example, suppose the chosen dataset
consists of 12 data samples (N1 to N12) and is a five-class
classification problem. The number of clusters is then set to
five (i.e. k = 5) for the k-means clustering algorithm. As a
result, there are five clusters, each containing a cluster
center (i.e. C1, C2, C3, C4, and C5).
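
The extraction of cluster centers with k-means can be sketched as follows. This is a minimal pure-Python version of Lloyd's algorithm, not the implementation used in the paper. Two simplifications are assumptions of this sketch: the first k samples serve as the initial centers, and the sample points are made up, with k = 2 instead of 5 to keep the example short.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, max_iter=100):
    """Return (centers, labels) after running Lloyd's algorithm."""
    # Assumption: the first k samples are used as initial centers.
    centers = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its closest center.
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, labels) if a == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members)
                                   for dim in zip(*members))
    return centers, labels


# Hypothetical two-dimensional samples forming two well-separated groups.
samples = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = k_means(samples, k=2)
print(centers, labels)
```

The returned `labels` give the cluster of each sample, which is what the nearest-neighbor search in the same cluster needs.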

C. Loading one dimensional dataset:
After the cluster center and nearest neighbor of every data
sample in the chosen dataset are extracted and identified, two
types of distances are calculated and then summed. The first
type is the distance from each data point to the cluster
centers. That is, if there are three cluster centers, then
there are three distances, one between the data point and each
cluster center. The second type is the distance from each data
point to its nearest neighbor. The distance between two data
points is the Euclidean distance. Finally, the summed distance
forms the new dataset.
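
The construction of the one-dimensional feature can be sketched as follows. The sketch assumes, following the description above, that the distances to all cluster centers are summed and then added to the distance to the nearest neighbor in the same cluster. The function name `distance_feature`, the fallback of 0.0 for a sample that is alone in its cluster, and the sample data are assumptions of this illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_feature(points, centers, labels):
    """Return one summed distance per point.

    feature = sum of distances to every cluster center
              + distance to the nearest neighbor in the same cluster
    """
    features = []
    for i, p in enumerate(points):
        to_centers = sum(euclidean(p, c) for c in centers)
        same_cluster = [
            euclidean(p, q)
            for j, q in enumerate(points)
            if j != i and labels[j] == labels[i]
        ]
        # Assumption: a point alone in its cluster has no neighbor term.
        to_neighbor = min(same_cluster) if same_cluster else 0.0
        features.append(to_centers + to_neighbor)
    return features


# Hypothetical one-dimensional samples, two centers, known cluster labels.
points = [(0,), (2,), (10,), (14,)]
centers = [(1,), (12,)]
labels = [0, 0, 1, 1]
print(distance_feature(points, centers, labels))  # [15.0, 13.0, 15.0, 19.0]
```

For the first sample, the distances to the two centers are 1 and 12 and the nearest neighbor in its cluster is 2 away, giving 15. Each original sample is thus replaced by a single number, whatever its original dimensionality.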




D. AODE Classifier:
The new dataset is divided into training and testing datasets
to train and test a specific classifier, respectively. In this
paper, we use the Averaged One-Dependence Estimator (AODE) for
classification of network traffic. This classifier classifies
network traffic based on the attributes (features) of the
traffic. A naive Bayes classifier assumes that all attributes
are independent given the class. A super-parent one-dependence
estimator (SPODE) relaxes this assumption by having a single
feature identified as a super-parent, upon which all other
features depend. As a result, a dependency graph is generated
to establish inter-feature relationships. It was observed that
such an approach improves the accuracy of the detection process
significantly. AODE, an enhancement to SPODE, addresses the
attribute independence assumption by averaging the dependencies
between attributes in a given data set over several models.
For the NSL-KDD dataset, our proposed intrusion detection
scheme outperforms all other classifiers in terms of accuracy
in attack detection, lowered false alarm rates, and improved
precision and recall values. In particular, the SPODE
classifier detected 97.8% of all attack traffic, whereas the
AODE classifier successfully detected 99.3% of the attacks,
with a false positive rate of 0.1%, as compared to a detection
rate of 97.3% and a false positive rate of 1% exhibited by the
other classifiers.

IV. CLASSIFICATION
AODE was found to outperform both ODE and SPODE in terms of
speed and accuracy. The training phase of the AODE algorithm
operates by iterating through a given data set with k features
and generating a set of frequency vectors as follows:
1. cfreq[y] - number of data elements belonging to a
given class y
2. afreq[k] - number of times a given data element is
found to possess a value, iterated over all k features
3. vfreq[xi] - number of times the value xi is encountered
in the entire data set
4. freq[y, xi, xj] - the frequency of simultaneous occurrence
of two attribute values xi and xj for a given class y

During the testing phase of the AODE algorithm, the data
elements or instances of the data set are introduced to the
algorithm by hiding the class to which they belong. The task
of the AODE classifier is to predict the probability of the data
element belonging to each of the given classes. The highest
probability is then used to decide the class of the data
element. These values are computed based on the following
equations:

for all $i \in \{1, \dots, k\}$: $p = \hat{P}(x_i \wedge y)$
for all $j \in \{1, \dots, k\}$, if $x_j$ is known: $p = p \cdot \hat{P}(x_j \mid x_i \wedge y)$

where $k$ is the total number of attributes, $x_j$ is the value
of a feature, $\hat{P}(x_i \wedge y)$ is the estimated
probability that the value $x_i$ and the class $y$ are observed
together, and $\hat{P}(x_j \mid x_i \wedge y)$ is the estimated
probability of observing the value $x_j$ given that the value
$x_i$ is observed with class $y$.
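
The training and testing phases can be sketched together as follows. This is a simplified illustration, not the implementation evaluated in the paper: probabilities are plain relative frequencies without smoothing (practical implementations typically smooth these estimates), the two counters stand in for the frequency vectors of the training phase, and the function names and the tiny data set are hypothetical.

```python
from collections import Counter


def train_aode(rows, labels):
    """Count the co-occurrence frequencies needed for prediction."""
    class_value = Counter()  # (y, i, x_i) -> count
    class_pair = Counter()   # (y, i, x_i, j, x_j) -> count
    for row, y in zip(rows, labels):
        for i, xi in enumerate(row):
            class_value[(y, i, xi)] += 1
            for j, xj in enumerate(row):
                if j != i:
                    class_pair[(y, i, xi, j, xj)] += 1
    return {"n": len(rows), "classes": sorted(set(labels)),
            "class_value": class_value, "class_pair": class_pair}


def predict_aode(model, row):
    """Return (predicted class, normalized score per class)."""
    scores = {}
    for y in model["classes"]:
        total = 0.0
        for i, xi in enumerate(row):          # each attribute acts as parent
            parent_count = model["class_value"][(y, i, xi)]
            if parent_count == 0:
                continue                      # value never seen with class y
            p = parent_count / model["n"]     # estimate of P(x_i ^ y)
            for j, xj in enumerate(row):
                if j != i:                    # estimate of P(x_j | x_i ^ y)
                    p *= model["class_pair"][(y, i, xi, j, xj)] / parent_count
            total += p
        scores[y] = total
    norm = sum(scores.values())
    if norm > 0:
        scores = {y: s / norm for y, s in scores.items()}
    return max(scores, key=scores.get), scores


# Hypothetical nominal records: (protocol type, service).
rows = [("tcp", "http"), ("tcp", "http"), ("udp", "dns"),
        ("icmp", "echo"), ("icmp", "echo")]
labels = ["normal", "normal", "normal", "attack", "attack"]
model = train_aode(rows, labels)
print(predict_aode(model, ("icmp", "echo")))
```

Summing over every attribute as parent and then normalizing gives the same ranking as averaging, since dividing by the number of parents does not change which class scores highest.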

V. SIMULATION ANALYSIS
In this section, we analyze the results obtained from
simulations performed to test the effectiveness of our
proposed intrusion detection system. The results were
quantified based on the following metrics, commonly used
for evaluating intelligent classifiers:

1. Accuracy = (TP + TN) / (TP + TN + FN + FP)
2. Recall = TP / (TP + FN)
3. Precision = TP / (TP + FP)
4. Specificity = TN / (TN + FP)
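
These four metrics can be computed from confusion-matrix counts as sketched below, where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives. The function name and the counts in the example are hypothetical and are not results from this paper. Returning 0.0 for an empty denominator is a convention chosen for this sketch.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, recall, precision, and specificity from counts."""
    def ratio(numerator, denominator):
        # Convention for this sketch: an undefined ratio is reported as 0.0.
        return numerator / denominator if denominator else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fn + fp),
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
        "specificity": ratio(tn, tn + fp),
    }


# Made-up counts: 100 attack records and 100 normal records.
metrics = classification_metrics(tp=90, tn=95, fp=5, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```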



Classifier   Accuracy   Detection rate   False alarm
SVM          99.01%     93.29%           0.0289%
k-NN         99.2%      96.921%          0.322%
AODE         99.3%      97.3%            0.310%



Figure 1: Comparison of the precision-recall curves for
different AODE models built using different feature selection
techniques
VI. CONCLUSION
A new approach is proposed in this paper for effective and
efficient intrusion detection. This approach first transforms
the original feature representation of a given dataset into a
one-dimensional distance-based feature. Then, this new dataset
is used to train and test an AODE classifier for
classification. Our experimental results show that this
approach performs similarly to the k-NN and SVM classifiers
using the
original dataset in terms of accuracy, detection rates, and
false alarm rates. However, the important strength of this
method is that it needs less computational effort than the k-NN
and SVM classifiers trained and tested on the original
datasets. That is, although it requires additional computation
for extracting the proposed distance-based feature, it greatly
reduces the training and testing (i.e. detection) time, since
the new dataset contains only one dimension.