
International Journal of Scientific Research Engineering & Technology (IJSRET), ISSN 2278 0882

Volume 3, Issue 8, November 2014

Intrusion Detection System via a Machine Learning Based Anomaly Detection Technique

Adedoyin Adeyinka* and Oloyede Muhtahir O.**
Department of Info. and Comm. Science, University of Ilorin, Ilorin, Nigeria

ABSTRACT
Intrusion detection systems are gaining ground in the field of network security as new ideas and concepts in the intrusion detection process keep surfacing. The aim of this research is to examine the packet header anomaly detection (PHAD) time-based model and introduce into it a modified model in which, when r novel values have been observed, the expected number of novel values occurring exactly once during the testing session is r/2. The 1999 DARPA intrusion detection evaluation data set was used to train the model and analyze its performance. On the 1999 Defense Advanced Research Projects Agency (DARPA) evaluation data sets, the introduced PHAD time-based model detected 31 attacks at a threshold of 1000 false alarms after training the model for 300 seconds.
Keywords: Network Security, Intrusion Detection System, Anomaly Detection Model.

1. INTRODUCTION

Intrusion Detection Systems (IDS) are a new class of security systems that provide efficient approaches to securing computer networks. Some of these approaches rely on learned algorithms to provide the network with an efficient classifier that recognizes and detects intrusive actions [1]. With the increasing advancement and sophistication of attack techniques as technology changes, there is a need to keep these algorithms accurate and abreast of the latest network attacks [2].
A CSI survey report [3] stated that about 97% and 94% of organizations use security tools such as antivirus software and firewalls, respectively, against their attackers, but these tools have been found to be imperfect. This is because the majority of them rely on technologies that attempt to identify known and broadly distributed attacks with recognizable patterns. Over time, attackers have gradually increased the sophistication of their methods and are arriving at a point where it is possible to bypass these tools more or less at will, within a limited time frame. This makes the stringent rules set by security personnel for firewall and antivirus systems insufficient [4]. Consequently, due to the increased advancement and sophistication of these attack techniques, there is a need for an effective and efficient intrusion detection system that detects these inevitable attacks in real time, so as to stop an attack in progress [5].
1.1 Anomaly Detection System

An anomaly detection system detects a network intrusion by creating a normal profile of the network or host under observation and flagging any deviation from that profile as a probable intrusion [6]. In an anomaly IDS, attacks are detected without prior knowledge of exactly what an attack looks like, because they stand out sufficiently from normal network traffic [7]. An anomaly detection system operates by monitoring and registering users' activities during the operation of the computer system; these data are used as the normal profile of the network or host under observation [8]. Anomaly detection systems are known for their ability to detect previously unknown and insider attacks [9]. Figure 1 is a block diagram showing the mode of operation of a typical anomaly detection system [10].

Figure 1: A typical anomaly detection system [10]
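As a minimal, hypothetical sketch of this profile-then-flag loop (the record layout and feature names are illustrative, not taken from any real IDS):

```python
# Build a per-feature profile from attack-free records, then flag any
# feature whose value was never seen during normal operation.

def build_profile(training_records):
    """Record the set of values seen for each feature during training."""
    profile = {}
    for record in training_records:
        for feature, value in record.items():
            profile.setdefault(feature, set()).add(value)
    return profile

def flag_anomalies(profile, record):
    """Return the features whose values deviate from the normal profile."""
    return [f for f, v in record.items() if v not in profile.get(f, set())]

normal = [{"dst_port": 80, "protocol": "tcp"}, {"dst_port": 53, "protocol": "udp"}]
profile = build_profile(normal)
print(flag_anomalies(profile, {"dst_port": 31337, "protocol": "tcp"}))  # ['dst_port']
```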

2. RELATED WORKS

A network anomaly detection system like the Next-generation Intrusion Detection Expert System (NIDES) is a statistical model that learns normal network traffic and flags any deviations from this model. NIDES uses a frequency-based model in which the probability of an event is estimated by its average frequency during training. The model is based on the distribution of source and destination IP addresses and ports per transaction. NIDES models ports and addresses, and


flags the differences between short-term and long-term behavior [5].
The Event Monitoring Enabling Responses to Anomalous Live Disturbances (EMERALD) system contains a statistical component called eStat, which maintains short- and long-term distribution information for several types of measures, using a decay mechanism to age out less recent events. It also has a component that combines signature- and anomaly-based approaches, called eBayes. eBayes uses a belief network to determine from a number of features whether the values of those features fit some normal behavior (e.g., HTTP, FTP), some predefined bad behavior (mailbomb, ipsweep, etc.), or neither of these (other) [11].
Packet Header Anomaly Detector (PHAD) [12], Learning Rules for Anomaly Detection (LERAD) [13], Application Layer Anomaly Detector (ALAD) [14] and Detecting Network Intrusions via a Statistical Analysis of Network Packet Features [15] use a time-based model in which the probability of an event depends on the time since it last occurred. PHAD, ALAD and LERAD differ in the attributes that they monitor. PHAD monitors 33 attributes from the Ethernet, IP and transport-layer packet headers, while ALAD and LERAD model incoming server TCP requests: source and destination IP addresses and ports, opening and closing TCP flags, and the list of commands in the application payload. Depending on their attributes, they build separate models for each target host, port number or host/port combination.

2.1 PHAD
PHAD is an anomaly detection system that learns the normal ranges of values for each packet header field at the data link (Ethernet), network (Internet Protocol (IP)) and transport/control layers (Transmission Control Protocol (TCP), User Datagram Protocol (UDP), Internet Control Message Protocol (ICMP)) [12]. PHAD has two features that distinguish it from other conventional network-based anomaly detection systems. First, it models protocols rather than user behavior, which allows PHAD to detect two of the four attack categories described by Kendall (1998) [16]. Second, it uses a time-based model, which assumes that network statistics can change rapidly in a short period of time. When a series of recurring anomalies is detected, PHAD flags only the first anomaly it detects as an alert; this feature helps regulate the flood of alarms that would otherwise be caused by a spurt of anomalous events [17]. PHAD uses only syntactic knowledge to parse the header into fields, and then figures out which fields are important; it models Ethernet, IP, TCP, UDP and ICMP packet header fields without distinguishing between incoming and outgoing traffic [18]. PHAD examines 33 packet header fields, each 1 to 4 bytes wide. Fields smaller than 8 bits (such as the TCP flags) are grouped into a single 1-byte field, while fields larger than 4 bytes (such as the 6-byte Ethernet addresses) are split. The attributes are drawn from the Ethernet, IP, TCP, UDP and ICMP headers [16].
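As an illustration of this grouping and splitting, the sketch below cuts a raw Ethernet header into the kind of 1-4 byte attributes described above. The offsets follow the standard Ethernet frame layout, but the field names and the choice of 3-byte halves are assumptions for illustration, not PHAD's actual source code.

```python
# Cut raw header bytes into 1-4 byte attributes: each 6-byte Ethernet
# address is split into two 3-byte fields, while multi-bit sub-byte fields
# (like the TCP flags) would stay grouped in one 1-byte field.

def be_int(data: bytes) -> int:
    """Interpret a byte slice as a big-endian integer."""
    return int.from_bytes(data, "big")

def ethernet_fields(frame: bytes) -> dict:
    return {
        "ether_dst_hi": be_int(frame[0:3]),    # 6-byte address split ...
        "ether_dst_lo": be_int(frame[3:6]),    # ... into two 3-byte fields
        "ether_src_hi": be_int(frame[6:9]),
        "ether_src_lo": be_int(frame[9:12]),
        "ether_type":   be_int(frame[12:14]),  # 2-byte field kept whole
    }

frame = bytes.fromhex("aabbccddeeff" "112233445566" "0800")  # sample header
print(ethernet_fields(frame))
```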

2.2 Data Source
The experiments were performed using the 1999 DARPA Intrusion Detection Evaluation off-line data sets from the Massachusetts Institute of Technology Lincoln Lab (http://www.ll.mit.edu/IST/ideval/). These data were used to configure the models and train the free parameters. The week 3 attack-free inside sniffer data, which contains 7 days of traffic (2.5 GB of tcpdump files), was downloaded to train the packet header anomaly detection system. To test the anomaly detection system, the week 4 and 5 inside sniffer data sets, which contain 201 attacks, were also downloaded. However, the week 4 day 2 data was missing, reducing the number of available attacks in the data sets to 183. The inside sniffer traffic was chosen for these experiments because the inside data contains evidence of attacks from both inside and outside the network [12].

3. PHAD TIME-BASED MODEL

Packet header anomaly detection [12] uses the rate of anomalies during the training period to estimate the probability of an anomaly while in detection mode. It is based on the idea that if a packet field is observed n times with r distinct values, then r anomalies must have occurred during the training period. If this rate continues, the probability that the next observation will be anomalous is given by:

P = r/n (1)

where n is the number of observations and r is the number of anomalous (distinct) values seen during training.

To capture the dynamic behavior of real-time traffic, PHAD uses a non-stationary model while in detection mode. This model is based on the observation that if an anomaly was last detected in the network traffic t seconds ago, the likelihood that it will occur in the next one second is inversely proportional to the time since it last occurred, i.e.

p = 1/t (2)

To apply time-based modeling to packet header anomaly detection, PHAD assigns an anomaly score of 1/p to each packet field containing an anomalous value, i.e.

Field Score = 1/p (3)

Substituting equations (1) and (2) into (3):

Field Score = 1/((1/t) × (r/n)) = tn/r (4)

During detection, the anomaly score of a packet with more than one anomalous attribute is given by:

Packet Score = Σᵢ tᵢnᵢ/rᵢ (5)

where the summation in equation (5) is over the anomalous attributes. Note that there is no theoretical justification for summing the inverse probabilities, because the attributes are neither independent nor fully dependent; it was found experimentally that a summation works better in practice [12].
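As a compact sketch of equations (1)-(5), the fragment below keeps, for each field, the training counts n and r, the set of observed values, and the time of the last anomaly, and scores packets accordingly. It is a minimal reconstruction of the arithmetic described above, not the original PHAD source (which also clusters values into ranges; see section 3.2).

```python
# Per-field time-based model: train() implements the r/n estimate of
# equation (1); score() implements the t*n/r field score of equation (4);
# packet_score() implements the summation of equation (5).

class FieldModel:
    def __init__(self):
        self.n = 0               # number of training observations
        self.r = 0               # number of distinct values seen in training
        self.seen = set()
        self.last_anomaly = 0.0  # time this field last held an anomalous value

    def train(self, value):
        """After training, P(anomaly) for this field is estimated as r/n."""
        self.n += 1
        if value not in self.seen:
            self.seen.add(value)
            self.r += 1

    def score(self, value, now):
        """Return t*n/r for an anomalous value, 0 otherwise (equation 4)."""
        if value in self.seen or self.r == 0:
            return 0.0
        t = now - self.last_anomaly  # seconds since this field's last anomaly
        self.last_anomaly = now
        return t * self.n / self.r

def packet_score(models, packet, now):
    """Equation (5): sum t_i*n_i/r_i over the packet's anomalous fields."""
    return sum(models[f].score(v, now) for f, v in packet.items() if f in models)
```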
3.1 The Introduced PHAD Time-based Model
Packet header anomaly detection [12] uses the rate of anomalies during the training period to estimate the probability of an anomaly while in detection mode: if a packet field is observed n times with r distinct values, and this rate continues, the probability that the next observation will be anomalous is approximated by r/n. This model is consistent with Prediction by Partial Matching, method C (PPMC), one of the models used to predict novel values in data compression algorithms, and it is assumed to be an overestimate. PPMC does not require events to be independent, as PHAD does [16, 19].

M.V. Mahoney and P.K. Chan [20] compared inbound client traffic from the IDEVAL inside sniffer traffic of weeks 1 and 3 with real traffic and found that many attributes have a higher value of r in real traffic due to greater variation in the protocols. This implies that those attributes would generate more false alarms in real traffic. In order to improve the performance of the model and reduce the number of false alarms it generates, the Good-Turing probability estimate for novel events by Gale and Sampson (1995) [21] is introduced. They suggested that if n observations of a random variable include r novel values, then the expected number of novel values occurring exactly once is r/2. The probability that the next value of any discrete random variable will be novel then becomes, starting from equation (1) with P = r/n and substituting r₁ = r/2 from Good-Turing:

p = (r/2)/n = r/2n (6)

An anomaly score of tn/2r, following equation (4), is therefore assigned in the modified model. This implies that the estimated probability of a novel event is half that of the original model [19], which is expected to reduce the number of false alarms generated by PHAD; this is verified in the experiment.
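Under this modification only the constant in the per-field score changes; a minimal sketch, assuming the same t, n and r as in equations (1)-(4):

```python
# Good-Turing adjustment of section 3.1: the novel-value probability r/n is
# halved to r/2n (equation 6), so the per-field score of equation (4)
# becomes t*n/(2r).

def field_score(t: float, n: int, r: int) -> float:
    """Original PHAD per-field score, equation (4): t*n/r."""
    return t * n / r if r else 0.0

def modified_field_score(t: float, n: int, r: int) -> float:
    """Modified per-field score following equation (6): t*n/(2r)."""
    return t * n / (2 * r) if r else 0.0

# Example: a field seen n=10000 times with r=25 distinct values,
# scored 60 seconds after the field's last anomaly.
print(field_score(60, 10000, 25))           # 24000.0
print(modified_field_score(60, 10000, 25))  # 12000.0
```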
3.2 Clustering and Tuning of PHAD
Clustering is the grouping of similar objects from a given set of inputs [22]. In order to store potentially large sets of packet field values, such as source or destination addresses, during the training period, PHAD treats field attributes as continuous and clusters them into at most K ranges. Whenever the number of clusters exceeds K, PHAD examines the adjacent ranges and merges the two closest clusters, where the distance between clusters is the smallest difference between two cluster elements. For instance, given the set of field ranges {32-64, 98-365, 740-1500, 2500-5500, 32768-65535} and K=4, the smallest gap is between 32-64 and 98-365; these would be merged to form a new cluster {32-365} [12, 16]. A sketch of this merge rule follows.
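The following minimal sketch, assuming ranges are kept as sorted (low, high) pairs, reproduces the example above; it is an illustration of the rule, not PHAD's implementation.

```python
# K-range clustering: when the number of ranges exceeds K, merge the two
# adjacent ranges separated by the smallest gap.

def merge_closest(ranges):
    """Merge the adjacent pair of (low, high) ranges with the smallest gap."""
    gaps = [ranges[i + 1][0] - ranges[i][1] for i in range(len(ranges) - 1)]
    i = gaps.index(min(gaps))
    merged = (ranges[i][0], max(ranges[i][1], ranges[i + 1][1]))
    return ranges[:i] + [merged] + ranges[i + 2:]

def add_value(ranges, value, k):
    """Insert a value as a point range, then merge down to at most k ranges."""
    ranges = sorted(ranges + [(value, value)])
    while len(ranges) > k:
        ranges = merge_closest(ranges)
    return ranges

# The example from the text: with K=4 the smallest gap (64 to 98) is merged.
r = [(32, 64), (98, 365), (740, 1500), (2500, 5500), (32768, 65535)]
print(merge_closest(r))  # [(32, 365), (740, 1500), (2500, 5500), (32768, 65535)]
```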
In order to improve the performance of PHAD on the DARPA evaluation data set, the clustering values K=32 and K=1000 were used to tune the experiment during the training and testing periods. This means that PHAD-K32 and PHAD-K1000 store the observed values for each field in a list of up to 32 and 1000 ranges respectively [9]. M.V. Mahoney and P.K. Chan [9] concluded in a survey report that PHAD detects the same number of attacks whether a cluster value of K=32 or K=1000 is used. For this project, in order to improve the performance of PHAD on the DARPA evaluation data set, clustering values over the range 32, 64, 128, ..., 1000 are used in tuning the algorithms to verify Mahoney and Chan's statement.


4. EVALUATION PROGRAM

By DARPA criteria, a software program known as the EVAL program is recommended for evaluating any kind of intrusion detection system on the 1999 DARPA Lincoln Laboratory data sets [23]; it is used to evaluate this research work.
The EVAL program [23] interprets a .sim file (e.g. filename.sim), the output file of an intrusion detection system generated during training/testing of the IDS on the 1999 DARPA data sets. It reports the number of attacks detected in the .sim file at the lowest threshold and considers an attack detected when a flagged alarm identifies the address of the targeted host (i.e. its IP address) within a period of 60 seconds. The target addresses, as specified in the 1999 DARPA truth table, are any addresses on the networks 172.16.x.x or 192.168.x.x. In the EVAL program, any flagged alarm with no identified attack is regarded as a false alarm (a sketch of this counting rule is given after the list below). The EVAL program uses two options when printing the results of the evaluated model: a reporting level (0-4) and a threshold on the number of false alarms (100 by default; a different number can be specified). The EVAL reporting levels (0-4) are described below:
i. Level 0 prints a warning list about flagged alarms containing errors, which are ignored.
ii. Level 1 prints a table of detected attacks at the specified false alarm rate (100 by default), listing and categorizing them into the different categories of attack described by Kendall (1998).
iii. Level 3 also prints the list of each detected attack in descending order of the highest-scoring alarm that detected it.
iv. Level 4 prints the list of all alarms above the threshold of the specified false alarm limit [23].
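The following minimal sketch illustrates the detection-counting rule described above, under stated assumptions (the alarm and attack tuple layouts and a symmetric 60-second window are assumptions); it approximates the behavior described, and is not the actual EVAL program [23].

```python
# An alarm counts as a detection if it names the target host of some attack
# and falls within 60 seconds of it; anything else is a false alarm, and
# scoring stops once the false alarm limit is exceeded.

def evaluate(alarms, attacks, false_alarm_limit=100):
    """alarms: (score, time, target_ip) tuples, scored highest first.
    attacks: (attack_id, time, target_ip) tuples from the truth table.
    Returns (set of detected attack ids, number of false alarms counted)."""
    detected, false_alarms = set(), 0
    for score, when, ip in sorted(alarms, reverse=True):
        hits = [a for a, t, tip in attacks if tip == ip and abs(when - t) <= 60]
        if hits:
            detected.update(hits)  # one alarm may detect overlapping attacks
        else:
            false_alarms += 1
            if false_alarms > false_alarm_limit:
                break              # stop once the limit is exceeded
    return detected, false_alarms
```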

5. EXPERIMENTAL SET-UP

Simulations were conducted to compare and evaluate the performance of the modified PHAD time-based model against the original PHAD time-based model. Both models were implemented and simulated in an offline manner using the downloaded PHAD source code, as illustrated in Figure 2. The source code was run on an Intel Centrino Duo laptop with 4 GB of RAM and 120 GB of hard disk space, running the Linux operating system. The algorithms were evaluated using data from the DARPA evaluation data set.

[Figure 2 shows the experimental pipeline: the DARPA week 3 attack-free inside sniffer network traffic feeds the training of the algorithm (PHAD time model); the DARPA week 4 & 5 inside sniffer network traffic feeds the testing of the algorithm, whose output is a .sim file; the .sim file is then evaluated with the EVAL program.]

Figure 2: Format of the Experimental Set-up


5.1 Implementation of the PHAD Time-based Model

The PHAD [13] time-based model was tested on the 1999 DARPA/Lincoln Labs IDS data set. The PHAD algorithm was trained using the week 3 inside sniffer traffic, which contains 7 days of attack-free traffic, 2.5 GB in size, all in tcpdump format. The data were input in chronological order, i.e. the order in which they were captured, and the output was written to a .sim file (the PHAD output extension). The model was trained for 300 seconds. After training, it was tested using the week 4 and 5 data, which contain 183 attacks. During the training and testing period, in order to improve the performance of PHAD, the clustering value K was tuned over the range K = 32, 64, 128, 356, 712 and 1000.
After the training and testing session, the model was evaluated using the EVAL program as recommended [23] to identify and list attack information. A reporting level of 4 and a false alarm threshold of 1000 were used for the EVAL program, e.g.:

eval phad.sim 4 1000 (detections at 1000 false alarms)

where eval is the compiled EVAL source code, phad.sim is the output file from the training and testing session, 4 is the reporting level (i.e. reporting the results from levels 0 to 4 as described in section 4) and 1000 is the specified end point for the EVAL program: once the number of evaluated alarms reaches 1001, the evaluation process stops, in order to count the number of detections between 1000 and 1001. The rationale for using 1000 is that with a lower value such as 100, only the highest-scoring alarms would be evaluated, leaving a large number of other anomalous fields unevaluated [13, 23]. The basic idea of these experiments comes from M.V. Mahoney and P.K. Chan's technical report on PHAD [12].
5.2 Implementation of the Modified PHAD Time-based Model
The modified PHAD time-based model was trained using the 1999 DARPA/Lincoln Labs IDS data set. The whole experimental process discussed in section 5.1 above was repeated using the modified PHAD time-based model. In order to improve its performance during the training and testing session, the clustering value K was again tuned from 32 to 1000. The output .sim file was then evaluated using the EVAL program.

6. EXPERIMENTAL RESULTS

During the evaluation phase, the false alarm threshold was set to 1000. The rationale was that when a lower value was used, some of the packet fields were left unevaluated; the EVAL program stops evaluating and discards any further alarms once the false alarm limit of 1000 is reached, i.e. it stops at 1001. The DARPA evaluation data set was used to train the modified PHAD time-based model, and at K=32, 31 of the 183 attacks were detected. Table 1 lists the detections at 1000 false alarms (detections/total; weeks 4-5 only, except row W2).

Table 1: Detections for the modified PHAD time-based model at 1000 false alarms

         All     Probe   DOS     R2L     U2R     Data    New     Stealthy
W45      31/201  6/37    16/65   5/56    3/37    2/16    11/62   5/36
IT       27/177  4/34    16/60   5/54    2/27    0/7     7/52    3/30
OT       21/151  5/32    11/44   3/46    2/26    0/11    4/38    2/23
BSM      3/38    0/1     1/12    0/10    1/11    1/6     1/8     1/6
NT       6/33    1/3     2/7     1/10    2/12    1/4     4/26    0/0
FS       28/189  6/37    15/62   5/56    2/31    0/11    9/54    5/34
Pascal   7/55    2/8     3/20    0/12    1/11    1/6     1/11    1/9
Hume     8/48    2/7     3/15    1/12    2/13    1/5     4/31    0/2
Zeno     5/22    1/7     4/9     0/3     0/3     0/1     0/2     0/6
Marx     6/44    1/6     3/17    2/18    0/2     0/2     2/11    2/10
Poor     10/72   1/21    4/17    4/15    1/18    1/7     7/38    3/29
W2       0/43    0/9     0/6     0/12    0/13    0/3     0/0     0/0

31 detections, 1506 alarms, 37 true, 1001 false, 468 not evaluated.
In Table 1, each cell lists the number of detections out of the total for various combinations of the 201 attacks listed in the DARPA truth table; e.g., out of the 60 DoS attacks with evidence in the inside sniffer traffic (IT), 16 were detected. There are also 37 true alarms, but 6 of these detected attacks had already been detected by another alarm, bringing the total number of detected attacks to 31. An attack may appear under two overlapping rows if it is detected in both. The generated alarms are reported in Eastern Daylight Time; this was achieved by changing the time zone of the computer to EDT (5 hours behind GMT). The notations labelling the rows of the table are [23]:
W45 - the number of week 4 and 5 attack types found.
IT - the number of inside sniffer traffic attacks.
OT - shows evidence of outside sniffer traffic attacks.

Poor - shows evidence of attacks that were poorly detected in the 1999 evaluation.
BSM - shows evidence of attacks in Solaris BSM system call traces.
NT - shows evidence of attacks in NT audit logs (LOG + hume).
FS - shows evidence in file system dumps.
DOS - shows evidence of denial of service attacks.
R2L - shows evidence of remote to local attacks.
U2R - shows evidence of user to root attacks.
Probe - shows evidence of probe attacks.
New - shows evidence of new attacks.


6.1 Attacks Detected

Table 2 shows the packet fields that contributed to the 31 detected attacks, along with the names of the attacks. The number in parentheses after each attack name is the number of instances of that attack among the 31 detections.

Table 2: Packet field attributes and attacks that contributed to the 31 detections

Packet Field Attribute   Attacks
IP Protocol              crashiis(1), snmpget(2), portsweep(1), ntinfoscan(1), queso(1), guesstelnet(1)
IP Time of Service       casesen(1), ncftp(2), ntfsdos(1), secret(1), warezclient(1)
IP Fragment pointer      teardrop(3), pod(3)
TCP flags                dosnuke(1), portsweep(1), ps(1), smurf(1)
TCP Urgent pointer       dosnuke(2), insidesniffer(1)
UDP Checksum             udpstorm(2)
TCP Option               mailbomb(1), processtable(1)
TCP Checksum             insidesniffer(1)

Table 3 shows the percentage contribution of each attribute to the 31 detected attacks, in descending order.

Table 3: Percentage of packet field attributes that contributed to the detections

Packet Field Attribute   Total no. of Attacks   % Contribution
IP Protocol              7                      22.58
IP Fragment pointer      6                      19.35
IP Time of Service       6                      19.35
TCP flags                4                      12.90
TCP Urgent pointer       3                      9.68
UDP Checksum             2                      6.45
TCP Option               2                      6.45
TCP Checksum             1                      3.22

6.2 Categories of Attacks Detected
Table 4 lists the attacks detected in the 1999 Lincoln Labs IDS evaluation data by category, according to Kendall's (1998) [13] taxonomy of attacks.

Table 4: Categories of attacks detected in the 1999 Lincoln Labs IDS evaluation data

Probe              DoS                U2R          R2L
queso(1)           crashiis(1)        casesen(1)   guesstelnet(1)
insidesniffer(2)   dosnuke(3)         ntfsdos(1)   ncftp(2)
ntinfoscan(1)      mailbomb(1)        ps(1)        snmpget(2)
portsweep(2)       processtable(1)
                   teardrop(3)
                   pod(3)
                   smurf(1)
                   udpstorm(2)
                   warezclient(1)

Table 5: Percentage of attack categories that contributed to the detections

Attack Category   Total no. of Attacks   % Contribution
Probes            6                      20.00
DOS               16                     53.33
U2R               3                      10.00
R2L               5                      16.67

Table 5 shows the percentage of each attack category among the 31 detections at a threshold of 1000 false alarms. The modified PHAD time-based model mostly detects DOS attacks that exploit the protocols it analyzes, which account for 53.33% of the detected attacks. It does poorly on the probes and R2L attacks and misses most of the U2R attacks, which are difficult to detect in network traffic [13].

6.3 Results of Tuning the Clustering Value

The clustering value K sets the number of ranges used to store the observed values of each field. With the aim of improving the performance of the PHAD algorithm on the evaluation data sets, the clustering value K was tuned during the training session; the results can be seen in Figure 3.

Figure 3: Summary of the number of detections using different clustering values

According to Mahoney (2003) [13], the method of approximating large sets is not critical, because it only affects attributes with a large r, and PHAD detects the same number of attacks whether it uses a cluster value of K=32 or K=1000. Figure 3 verifies this: each model detected the same number of attacks as the clustering value was varied over 32, 64, 128, 356, 712 and 1000. A clear illustration of this can be seen in Figure 4.

Figure 4: Detection line graph for the two models with different clustering values

6.4 The Modified PHAD Time-based Model
The modified PHAD time-based model was implemented on the DARPA evaluation data sets along with the original PHAD time-based model, and the following results were obtained after the training and testing sessions using different clustering values, as shown in Table 6.

Table 6: Number of attacks detected by the two models with different clustering values

Number of Clusters   Psc = tn/r   Psc = tn/2r
32                   28           31
64                   28           31
128                  28           31
356                  28           31
712                  28           31
1000                 28           31

A clear representation of this can be seen in Figure 5.

Figure 5: Number of attacks detected by the two PHAD models with different clustering values

From Figure 5, it can be deduced that the modified PHAD model detected more attacks than the original PHAD model; the modified PHAD thus tends to be more sensitive in terms of detection rate than the original PHAD model, which makes it the better model in terms of rate of attack detection.

6.5 False Positive Alarms Detected by the Models

Comparing the modified model with the original model in terms of the number of false positive alarms generated, it can be seen from Table 7 that the modified model generated more false positive alarms than the original PHAD model. Tuning the clustering values did not have much effect on the models; the same numbers of false positive alarms
were generated by both models as the clustering values were tuned. This implies that the models are not very sensitive to increases in the clustering value range.

Table 7: Summary of the number of false positive alarms

Number of Clusters   Psc = tn/r   Psc = tn/2r
32                   994          1506
64                   995          1503
128                  995          1503
356                  995          1503
712                  995          1503
1000                 995          1503

A clear illustration of this can be seen in Figure 6.

Figure 6: Rate of false positive alarms


Analyzing the results in Figure 6, the margin between the number of alarms flagged by the modified model and by the original model is extremely high. This means that the modified model falsely flags a high percentage of anomalous events that are not intrusive as intrusive. It can thus be assumed that the modified model is highly sensitive to anomalous behavior in network traffic, but in practice a high false positive alarm rate is not workable. This makes the original PHAD model the better model in terms of the number of false positive alarms it generates, though not in terms of its rate of detection.
6.6 True Positive Alarms Detected
Table 8 shows the results of both models in terms of the number of true positive alarms they generate. It can be deduced that the modified PHAD model generated more true positive alarms than the original PHAD model.

Table 8: Summary of the number of true positive alarms

Number of Clusters   Psc = tn/r   Psc = tn/2r
32                   35           37
64                   35           37
128                  35           37
356                  35           37
712                  35           37
1000                 35           37

A clear illustration of this is demonstrated in Figure 7.

Figure 7: True positive alarms

Comparing the modified model with the original PHAD model in Figure 7, the modified model recorded better true positive detection than the original PHAD model, although tuning the cluster values did not have any effect on either model. It can be concluded that the modified model is more sensitive than the original PHAD model in its response when analyzing events or activities that tend to lead to an attack.
6.7 Detection - False Alarm Tradeoff
In the previous sections the PHAD models were analyzed using a threshold of 1000 false alarms, and the reason that rate was used was explained. Figure 8 demonstrates the effect of varying the false alarm threshold on the detection rates of the two models [13].

Figure 8: Detection line graph for 200 to 1000 false alarms

From Figure 8, it can be deduced that:
i. As the threshold is adjusted, there is a tradeoff between the number of false alarms and the number of missed attacks. As the threshold is increased, the number of detections for both models increases between 200 and 400, remains stable between 400 and 600, and after 600 starts increasing again.
ii. Between 200 and 800, the two models had the same number of detections, but after 800 the number of detections of the modified model began to surpass that of the original model.

7. CONCLUSION

Overall, the implementation of the modified PHAD time-based model was successful in its efforts to detect novel events in network traffic, although the model appears insensitive to the tuning of the clustering value ranges that was tried in a bid to improve its performance. The modified model outperformed the original PHAD model, with a detection rate of 31 attacks to 28. However, in spite of its overall high detection rate for novel events, the false alarm rate it generates does not seem practicable. From the results of the experiment, the modified model recorded a high number of false positive alarms compared with the original model, which means that it falsely flags a high percentage of non-intrusive anomalous events as intrusive. It can thus be assumed that the modified model is highly sensitive to anomalous behavior in network traffic, but in practice such a high false positive alarm rate is not workable; this makes the original PHAD model the better model in terms of the number of false positive alarms it generates, though not in terms of its rate of detection. The modified model also recorded better true positive detection than the original PHAD model, although tuning the cluster values had no effect on the number of true positive alarms generated by either model. It can be concluded that the modified model is more sensitive than the original PHAD model in its response when analyzing events or activities that tend to lead to an attack.
REFERENCES
1. Lamees Alhazzaa; Intrusion Detection Systems using Genetic Algorithms; King Saud University, Computer Science College, CSC590. [Online] Available at: http://docs.ksu.edu.sa/PDF/Articles17/Article170744.pdf
2. Lu Sheng, Gong Jian, Rui Suying (2003); A Load Balancing Algorithm for High Speed Intrusion Detection; Department of Computer Science and Engineering, Southeast University, Nanjing, China. [Online] Available at: http://www.njcert6.edu.cn/papers/2003/shlu_2003_1.pdf
3. Robert Richardson; CSI Computer Crime & Security Survey (2008). [Online] Available at: http://i.zdnet.com/blogs/csisurvey2008.pdf
4. Wun-Hwa Chen, Sheng-Hsun Hsu, Hwang-pin Shen; Application of SVM and ANN for intrusion detection; Computers & Operations Research, Vol. 32 (2005), pages 2617-2634.
5. D. Anderson, T.F. Lunt, H. Javitz, A. Tamaru and A. Valdes (1995); Detecting Unusual Program Behavior Using the Statistical Component of the Next-generation Intrusion Detection Expert System (NIDES); SRI Computer Science Laboratory, SRI-CSL-95-06. Available at: http://www.sdl.sri.com/papers/5sri/5sri.pdf
6. Animesh Patcha and Jung-Min Park; Network anomaly detection with incomplete audit data; Computer Networks, Vol. 51, Issue 12 (2007), pages 3935-3955.
7. Azzedine Boukerche, Renato B. M., Kathia R.L., Mirela S.M.A. Notare; An agent based and biological inspired real-time intrusion detection and security model for computer network operation; Computer Communications, Vol. 30 (2007), pages 2649-2660.
8. W. Haines, R.P. Lippmann, D.J. Friend, M.A. Zissman; 1999 DARPA Intrusion Detection Evaluation: Design and Procedures; MIT Lincoln Laboratory Technical Report TR-1062. [Online] Available at: http://www.ll.mit.edu/mission/communications/ist/files/TR-1062.pdf
9. Animesh Patcha and Jung-Min Park; An overview of anomaly detection techniques: Existing solutions and latest technological trends; Computer Networks, Vol. 51, Issue 12 (2007), pages 3448-3470.
10. Aurobindo Sundaram; An Introduction to Intrusion Detection; ACM Crossroads, Special Issue on Computer Security, Vol. 2, Issue 4 (1996), pages 3-7. [Online] Available at: http://www.acm.org/crossroads/xrds24/intrus.html?CFID=44804116&CFTOKEN=15172725
11. P.A. Porras and P.G. Neumann (2005); EMERALD: Event Monitoring Enabling Responses to Anomalous Live Disturbances; Computer Science Laboratory, SRI International.
12. M.V. Mahoney and P.K. Chan; PHAD: Packet Header Anomaly Detection for Identifying Hostile Network Traffic; Florida Institute of Technology Technical Report CS-2001-04. Available at: http://cs.fit.edu/~mmahoney/paper3.pdf
13. Matthew V. Mahoney and Philip K. Chan; Learning Nonstationary Models of Normal Network Traffic for Detecting Novel Attacks; SIGKDD 2002. Available at: http://cs.fit.edu/~mmahoney/
14. Matthew V. Mahoney and Philip K. Chan; Learning Models of Network Traffic for Detecting Novel Attacks; Florida Institute of Technology Technical Report CS-2002-08. Available at: http://cs.fit.edu/~mmahoney/paper5.pdf
15. Eric Chiejina (2008); Detecting Network Intrusions via a Statistical Analysis of Network Packet Features; University of Hertfordshire, School of Computer Science.
16. Matthew V. Mahoney (2003); A Machine Learning Approach to Detecting Attacks by Identifying Anomalies in Network Traffic; Ph.D. dissertation, Florida Institute of Technology. Available at: http://cs.fit.edu/~mmahoney/dist/diss.pdf
17. M. Ali Aydin, A. Halim Zaim and K. Gokhan Ceylan; A hybrid intrusion detection system design for computer network security; Computers & Electrical Engineering, Vol. 35 (2009), pages 517-526.
18. Matthew V. Mahoney and Philip K. Chan; Detecting Novel Attacks by Identifying Anomalous Network Packet Headers; Florida Institute of Technology Technical Report CS-2001-2.
19. T. Bell, Ian H. Witten and John G. Cleary; Modeling for Text Compression; ACM Computing Surveys, Vol. 21, Issue 4 (December 1989), pages 557-591.
20. M.V. Mahoney and P.K. Chan; An Analysis of the 1999 DARPA/Lincoln Laboratory Evaluation Data for Network Anomaly Detection; Florida Institute of Technology Technical Report CS-2003-02.
21. W.A. Gale and Geoffrey Sampson; Good-Turing Frequency Estimation Without Tears; Journal of Quantitative Linguistics (1995), Vol. 2, No. 3, pages 217-237.
22. S. Zanero and S.M. Savaresi; Unsupervised Learning Techniques for an Intrusion Detection System; 2004 ACM Symposium on Applied Computing, pages 412-419.
23. M.V. Mahoney and P.K. Chan (2002-2003); Network Anomaly Intrusion Detection Research at Florida Tech. Available at: http://cs.fit.edu/~mmahoney/dist/