
CHAPTER ONE

1.0 INTRODUCTION

Security is an important issue for the networks of companies and

institutions today. Intruders try in many ways to gain access to the data

networks and Web services of these organisations, even though many

measures, such as firewalls and encryption, have been developed to reduce

intrusion into network infrastructure over the Internet. Attacks on

networks are still on the increase. NICE (Network Intrusion Detection and

Countermeasure Selection) is a relatively new intrusion detection

technology that has emerged in recent times. An intrusion is an attempt to

gain unauthorized access to a system with the purpose of testing the

security of the network, using the facility as a launching pad for further

attacks on other systems, modifying or stealing information, etc.

Intrusion Detection (ID) is the art of detecting intrusions and taking

appropriate action against them. The main role of an intrusion detection

system in a network is to help computer systems prepare for and deal with

network attacks. In computer security, Network Intrusion Detection and

Countermeasure Selection in Virtual Network Systems (NICE) is an intrusion

detection system that attempts to discover unauthorized access to a

computer network by analysing traffic on the network for signs of malicious

activity. The use of networks has become extremely common among

technical users as well as corporate users. This extensive usage has led to

increasing concern about network security. Improper use of computer

networks and network resources to access private data and to deploy attacks

on systems is a leading security issue. Hackers and illegal traders exploit

vulnerable applications and resources on network servers to open loopholes

into the systems and the data stored on the network. This makes it

difficult to detect hackers on the network and easy for them to escape

unnoticed. Recent studies have shown that users migrating to network

computing consider security the most important factor, because intruders

can exploit vulnerabilities of a network system to gain access in order to

deploy viruses or other malware, or to launch a Denial of Service (DoS) or

Distributed Denial of Service (DDoS) attack. In addition, network users may

themselves install and use vulnerable applications in their data centre,

which can lead to misuse of the network. NICE (Network Intrusion Detection

and Countermeasure Selection in Virtual Network Systems) has been proposed

to establish a defense-in-depth intrusion detection framework. For better

attack detection, NICE incorporates the blocking of specific network

addresses into the intrusion detection process to improve network

security. NICE significantly advances current network IDS/IPS solutions by

using a programmable virtual networking approach that allows the system to

construct a dynamically reconfigurable IDS. NICE is a new multi-phase

distributed network intrusion detection and prevention framework in a

virtual networking environment that captures and inspects suspicious

network traffic without interrupting users’ applications and network

services. The purpose of an IDS is to help computer systems deal with

attacks. An IDS collects information from several different sources within

computer systems and networks and compares this information with

pre-existing patterns to determine whether there are attacks or

weaknesses.

TYPES OF ATTACKS

1. Eavesdropping: To eavesdrop means to listen secretly to a private

conversation. In network eavesdropping, data is exposed by capturing the

data packets that travel between computers on the network. It is easy for

attackers to read the traffic when users do not use an encrypted

connection. The attacker grabs information from the network; this practice

is also called sniffing or snooping.

2. Denial of Service (DoS) attack: A denial of service attack is a network

attack intended to prevent authorized users from using a computer or its

web services. In this technique the attacker floods the network with

useless data in order to generate traffic. Some DoS attacks exploit

limitations of the TCP/IP protocol suite, such as the ping of death and

teardrop. In the ping of death, the attacker sends IP packets larger than

the 64 KB (65,535-byte) maximum, which can leave the operating system

unable to respond, whereas in teardrop the attacker sends data in fragments

that overlap one another when reassembled at the receiving host, which

crashes the system because of a bug in the TCP/IP reassembly code.

3. Man-in-the-middle attack: This is a form of eavesdropping in which the

attacker sits between the user and the server and steals users' private

information from the computer network. Unencrypted transfers to a server

are highly exposed to sniffing, so the attacker can read any information

the user exchanges on the web. The attacker can also substitute its own

public key for the user's and steal private information such as credit

card numbers and email addresses.

4. IP Spoofing: Every computer system on the Internet has an IP address

that identifies it. IP address spoofing, also known as IP address forgery,

is a hijacking technique in which the attacker forges the source IP

address of packets to masquerade as a trusted host and gain access to the

network.

5. Sniffer Attack: A sniffer is a device or program that captures data

packets on the network, from which the attacker extracts important

information such as passwords, credit card numbers and email addresses.

6. DNS (Domain Name System) Spoofing: DNS spoofing is a computer

hijacking technique that hijacks the domain name server and responds to the

user from a fake server. The user does not realise that the information

received comes from the fake server rather than the real DNS server.
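The ping of death and teardrop descriptions in item 2 can be illustrated with a small check over IP fragments. This is a minimal sketch, not part of NICE: the `Fragment` type and both function names are hypothetical, offsets and lengths are in bytes, and the IP header length is ignored for simplicity.

```python
from typing import NamedTuple

MAX_IP_PACKET = 65535  # largest total length an IPv4 header can describe, in bytes


class Fragment(NamedTuple):
    offset: int  # byte offset of this fragment's data in the original packet
    length: int  # number of data bytes carried by this fragment


def exceeds_max_size(fragments):
    """Ping-of-death style: reassembled data would run past the 64 KB limit."""
    return any(f.offset + f.length > MAX_IP_PACKET for f in fragments)


def has_overlap(fragments):
    """Teardrop style: some fragment starts before the previous one ends."""
    ordered = sorted(fragments)  # sorts by offset first
    return any(
        cur.offset < prev.offset + prev.length
        for prev, cur in zip(ordered, ordered[1:])
    )
```

A detector along these lines would flag the fragment set before reassembly is attempted, rather than relying on the reassembly code to survive malformed input.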

1.1 BACKGROUND OF THE STUDY

The purpose of intrusion detection is to supervise network assets in order

to detect anomalous behaviour and misuse in the network. This concept has

been around for nearly twenty years, but only recently has it seen a

dramatic rise in popularity and incorporation into the overall information

security infrastructure. Intrusion detection was born in 1980 with James

Anderson's paper, Computer Security Threat Monitoring and Surveillance.

Anderson was a pioneer of information security and a member of the Defense

Science Board Task Force. Since then, several pivotal events in IDS

technology have advanced intrusion detection to its current state.

Anderson's seminal paper was written for a government organization and

introduced the notion that audit trails contain vital information that

could be valuable in tracking misuse and understanding user behaviour.

With the release of this paper, the concept of "detecting" misuse and

specific user events emerged. In 1983, SRI International and Dr. Dorothy

Denning began working on a government project that launched a new effort

in intrusion detection system development. Their goal was to analyze audit

trails from government mainframe computers and create profiles of users

based upon their activities. One year later, Dr. Denning helped to develop

the first model for intrusion detection, the Intrusion Detection Expert

System (IDES), which provided the foundation for the IDS technology

development that was soon to follow.

Finally, in 1989, the developers from the Haystack project formed the

commercial company Haystack Labs and released the last generation of the

technology, Stalker. Crosby Marks says that "Stalker was a host-based,

pattern matching system that included robust search capabilities to

manually and automatically query the audit data." The Haystack advances,

coupled with the work of SRI and Denning, greatly advanced the development

of network-based intrusion detection technologies.

1.1.1 Reason for Network Intrusion Detection and Countermeasure Selection

The first step in delivering an efficient and secure network intrusion

protection strategy is accurately detecting all possible threats. To achieve


this goal, multiple detection methods need to be employed to ensure

comprehensive coverage and to prevent intruders from exploiting

vulnerabilities of a network system in order to gain unauthorized access,

deploy viruses or other malware, or launch Denial of Service (DoS) attacks

against the network. For these reasons, the development of a multi-phase

distributed vulnerability detection, measurement, and countermeasure

selection mechanism called NICE becomes a necessity.

1.1.2 THE TYPES OF NETWORK THREATS

Every attack on a network can comfortably be placed into one of these

groupings.

a. Denial of Service (DoS): A DoS attack is a type of attack in which the

hacker makes computing or memory resources too busy or too full to

serve legitimate networking requests, thereby denying users access to a

machine, e.g. apache, smurf, neptune, ping of death, back, mail bomb and

UDP storm are all DoS attacks.

b. Remote to User Attacks (R2L): A remote to user attack is an attack in

which a user sends packets over the Internet to a machine to which s/he

does not have access, in order to expose the machine's vulnerabilities

and exploit privileges which a local user would have on the computer,

e.g. xlock, guest, xnsnoop, phf, send mail dictionary etc.

c. User to Root Attacks (U2R): These attacks are exploitations in which

the hacker starts off on the system with a normal user account and

attempts to abuse vulnerabilities in the system in order to gain super

user privileges, e.g. perl, xterm.

d. Probing: Probing is an attack in which the hacker scans a machine or a

networking device in order to determine weaknesses or vulnerabilities

that may later be exploited so as to compromise the system.
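The four groupings above can be expressed as a simple lookup table, which is how a labelled intrusion dataset is often reduced to a handful of classes before analysis. The sketch below is illustrative only: the attack names are the examples listed in items a to c, the `Probe` set is left empty because the text gives no probing examples, and `classify_attack` is a hypothetical helper name.

```python
# Attack names taken from the examples in the text; "Probe" has none listed.
ATTACK_CLASSES = {
    "DoS": {"apache", "smurf", "neptune", "ping of death", "back",
            "mail bomb", "udp storm"},
    "R2L": {"xlock", "guest", "xnsnoop", "phf", "send mail dictionary"},
    "U2R": {"perl", "xterm"},
    "Probe": set(),
}


def classify_attack(name):
    """Return the threat class for a named attack, or 'unknown' if unlisted."""
    key = name.strip().lower()
    for threat_class, names in ATTACK_CLASSES.items():
        if key in names:
            return threat_class
    return "unknown"
```

For example, `classify_attack("Smurf")` yields `"DoS"`, while a name that is not in the table falls through to `"unknown"`.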

1.2 MOTIVATION

The motivation is the realization that, with increasing traffic and the

increasing complexity of attacks, none of the present-day stand-alone

intrusion detection systems can meet the demand for a very high detection

rate and an extremely low false alarm rate. Also, most of the IDSs

available in the literature show a distinct preference for detecting a

certain class of attack with improved accuracy while performing only

moderately on the other classes of attacks.

1.3 CLASSIFICATION OF INTRUSION DETECTION

Intrusion prevention systems can be classified into four different types:

1. Network-based intrusion prevention system (NIPS): monitors the

entire network for suspicious traffic by analyzing protocol activity.

2. Wireless intrusion prevention system (WIPS): monitors a wireless

network for suspicious traffic by analyzing wireless networking

protocols.

3. Network Behaviour Analysis (NBA): examines network traffic to

identify threats that generate unusual traffic flows, such as distributed

denial of service (DDoS) attacks, certain forms of malware and policy

violations.

4. Host-based intrusion prevention system (HIPS): an installed software

package which monitors a single host for suspicious activity by analyzing

events occurring within that host.

1.3.1 DETECTION METHODS

There are various types of detection methods, which include:

1. Signature-based detection: A signature-based IDS monitors packets in the

network and compares them with pre-configured and pre-determined attack

patterns known as signatures.

2. Statistical anomaly-based detection: An IDS which is anomaly-based will

monitor network traffic and compare it against an established baseline. The

baseline will identify what is "normal" for that network – what sort of

bandwidth is generally used and what protocols are used. It may, however,

raise a false positive alarm for legitimate use of bandwidth if the

baselines are not intelligently configured.

3. Stateful protocol analysis detection: This method identifies deviations of

protocol states by comparing observed events with "pre-determined profiles

of generally accepted definitions of benign activity".

4. Host-based intrusion detection system (HIDS): It consists of an agent on

a host that identifies intrusions by analysing system calls, application

logs, file system modifications (binaries, password files, capability

databases, access control lists etc.) and other host activities and

states. In a HIDS, the sensors usually consist of a software agent.

5. Perimeter intrusion detection system (PIDS): A PIDS detects and

pinpoints the location of intrusion attempts on the perimeter fences of

critical infrastructure. Using either electronic or more advanced fibre

optic cable technology fitted to the perimeter fence, the PIDS detects

disturbances on the fence, and if a disturbance is deemed by the system to

be an intrusion attempt, an alarm is triggered.
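The difference between signature-based detection (item 1) and statistical anomaly-based detection (item 2) can be sketched in a few lines. Everything here is an assumption made for illustration: the byte patterns in `SIGNATURES` are hypothetical rather than taken from a real rule set, and the anomaly test uses a plain "more than three standard deviations from the baseline mean" rule.

```python
from statistics import mean, stdev

# Hypothetical byte patterns, for illustration only.
SIGNATURES = {
    "sql-injection": b"' OR '1'='1",
    "path-traversal": b"../../",
}


def match_signatures(payload):
    """Signature-based: names of the known patterns found in the payload."""
    return [name for name, pattern in SIGNATURES.items() if pattern in payload]


def is_anomalous(baseline, observed, threshold=3.0):
    """Anomaly-based: is `observed` more than `threshold` standard
    deviations away from the mean of the baseline measurements?"""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold
```

The signature check can only find what is already in its table, while the anomaly check can flag unseen behaviour but depends entirely on how representative the baseline is, which is the source of the false positives mentioned in item 2.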

1.4 STATEMENT OF PROBLEM

The challenge is to establish an effective vulnerability/attack detection and

response system for accurately identifying attacks and minimizing the

impact of security breaches on virtual network system users. In a virtual

network system where the infrastructure is shared by potentially millions of

users, abuse and nefarious use of the shared infrastructure allows attackers

to exploit vulnerabilities of the virtual network system and use its resources

to deploy attacks more efficiently. Such attacks are more effective in

the virtual network system environment since virtual network system users

usually share computing resources.

1.5 AIM AND OBJECTIVES OF THE STUDY:

The aim of this project is to study NICE and its peculiarity.

The objectives of the project are:

1. To prevent the vulnerable virtual machines from being compromised in

the network server using multi-phase distributed vulnerability detection,

measurement and countermeasure selection mechanism called NICE.

2. To prevent intruders from gaining unauthorized access to the computer

resources and network servers.

1.6 SCOPE OF THE STUDY

The scope of this project is mainly to analyse an IDS using the Windows

platform in a multi-disciplinary network.

1.7 SIGNIFICANCE OF THE STUDY

NICE is important for providing better security to the polytechnic

institution's network and ensuring that system integrity is protected.

The project will provide a multi-phase network intrusion detection and

prevention framework in a virtual networking environment that will identify

attacks and gather information about them, and capture and inspect

suspicious network traffic without interrupting users’ applications and

the network services provided.

1.8 LIMITATIONS

1. It is not uncommon for the number of real attacks to be far below the

number of false alarms. The number of real attacks is often so far below the

number of false alarms that the real attacks are missed or ignored.
2. It cannot compensate for weak identification and authentication

mechanisms or for weaknesses in network protocols. When an attacker

gains access due to weak authentication mechanisms then IDS cannot

prevent the adversary from any malpractice.

3. Encrypted packets are not processed by most intrusion detection devices.

4. Due to the nature of NIDS systems, and the need for them to analyse

protocols as they are captured, NIDS systems can be susceptible to the

same protocol-based attacks to which network hosts may be vulnerable.

Invalid data and TCP/IP stack attacks may cause an NIDS to crash.
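Limitation 1 follows from simple arithmetic: when attacks are rare, even a small false alarm rate produces far more false alerts than true ones. The sketch below applies Bayes' rule; the function name and the example rates are hypothetical and chosen only to show the effect.

```python
def alert_precision(attack_rate, detection_rate, false_alarm_rate):
    """Fraction of raised alerts that correspond to real attacks.

    attack_rate:      share of events that are real attacks
    detection_rate:   share of real attacks that raise an alert
    false_alarm_rate: share of benign events that raise an alert
    """
    true_alerts = attack_rate * detection_rate
    false_alerts = (1 - attack_rate) * false_alarm_rate
    return true_alerts / (true_alerts + false_alerts)


# Hypothetical figures: 1 event in 1000 is an attack, 99% of attacks are
# detected, and 1% of benign events raise a false alarm.
print(round(alert_precision(0.001, 0.99, 0.01), 3))  # about 0.09
```

Under these assumed rates only about 9% of alerts are real, so roughly ten alerts must be reviewed to find one genuine attack.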

1.9 RESEARCH METHODOLOGY

In this project, we propose NICE (network intrusion detection and

countermeasure selection in virtual network systems) to establish a defense-

in-depth intrusion detection framework. For better attack detection, NICE

incorporates attack graph analytical procedures into the intrusion detection

processes. We must note that the design of NICE does not intend to improve

any of the existing intrusion detection algorithms; indeed, NICE employs a

reconfigurable virtual networking approach to detect and counter the

attempts to compromise VMs, thus preventing zombie VMs.

1.10 DEFINITION OF TERMS

1. Intrusion Detection System

It is a device or software application that monitors a network or system for

malicious activities or policy violations. Any malicious activity or

violation is typically reported either to an administrator or collected

centrally using a Security Information and Event Management (SIEM)

system.

2. Network intrusion detection systems

Network intrusion detection systems (NIDS) are placed at a strategic point

or points within the network to monitor traffic to and from all devices on

the network. It performs an analysis of passing traffic on the entire subnet,

and matches the traffic that is passed on the subnets to the library of

known attacks. Once an attack is identified, or abnormal behaviour is

sensed, the alert can be sent to the administrator.

3. Countermeasure Selection

A countermeasure is a process, action, system or device that can prevent or

reduce the effect of threats to a computer server or network.

Countermeasures are selected by the attack analyzer and executed by the

network controller.

4. Virtual Network Systems

Virtual networking is a technology that facilitates the control of one or

more remotely located computers or servers over the internet.

5. Attack Graph

An attack graph is a modelling tool used to illustrate all possible multi-

stage, multi host attack paths that are crucial to understand threats and

then to decide appropriate countermeasures. In an attack graph, each node

represents either precondition or consequence of an exploit. The actions

are not necessarily an active attack since normal protocol interactions can

also be used for attacks. Attack graph is helpful in identifying potential

threats, possible attacks and known vulnerabilities in a network system.

6. Internet Protocol (IP)

A set of rules governing the format of data sent over the Internet or other

network.

7. Zombie

It is a computer connected to internet that has been compromised by a

hacker and can be used to perform malicious activities from a remote

location.
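The attack graph defined in item 5 can be pictured as a directed graph whose paths are the multi-stage routes an attacker could take. The sketch below is a simplification made for illustration: nodes are hosts rather than exploit preconditions and consequences, and the network in `GRAPH` and the function name `attack_paths` are hypothetical.

```python
def attack_paths(graph, start, target, path=None):
    """Enumerate every cycle-free path from start to target (depth-first)."""
    path = (path or []) + [start]
    if start == target:
        return [path]
    paths = []
    for nxt in graph.get(start, []):
        if nxt not in path:  # never revisit a node already on this path
            paths.extend(attack_paths(graph, nxt, target, path))
    return paths


# Hypothetical network: an edge X -> Y reads "compromising X lets the
# attacker reach Y".
GRAPH = {
    "internet": ["web_server", "mail_server"],
    "web_server": ["db_server"],
    "mail_server": ["web_server", "db_server"],
    "db_server": [],
}

for p in attack_paths(GRAPH, "internet", "db_server"):
    print(" -> ".join(p))
```

Listing the paths in this way shows which intermediate hosts appear on many routes to the target, which is where a countermeasure is likely to cut off the most attacks.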

CHAPTER TWO

2.0 GENERAL OVERVIEW

Network Intrusion Detection and Countermeasure Selection in Virtual

Network Systems detects attacks by capturing and analyzing network traffic.

It has dedicated software or hardware systems called the NIDS (Network

Intrusion Detection System) that “sit” on a network and analyze network

packets. A NIDS often consists of a set of single-purpose sensors placed at

various points in a network. These sensors monitor network traffic,

performing local analysis of that traffic and reporting attacks to a

centralized console.

2.1 LITERATURE REVIEW

This chapter reviews related literature regarding Network Intrusion

Detection and Countermeasure Selection in Virtual Network Systems.

Tejashree et al. (2015), in India, reported on Secure Network Communication

and Intrusion Detection in Virtual Machines. The aim of the

work is to identify possible attacks in virtual machines, collect information

about the attacks and prevent the reoccurrence of the attacks. The

objective of the work is to detect weaknesses in the security policies of an

organization, keep a record of existing attacks and their threats, and

prevent individuals from violating security policies. The work was

motivated by the need to develop a distributed network intrusion and

prevention framework in a virtual networking environment that captures and

inspects suspicious cloud traffic. Various materials and methods were used

in order to accomplish the task, which includes the attack graph, virtual

networking and light weighted NICE-Agent. The detection of vulnerability

in virtual machines by use of multiphase distributed mechanism was

achieved in multiple server clusters. Challenges encountered included poor

accuracy in attack detection and the inability to filter malicious traffic

without impacting the services as a whole.

Bhagyashri et al. (2015) reviewed Countermeasure Selection and Networking

Intrusion Detection using Virtual Network Systems. They explained that the

essence of the work is to prevent vulnerable virtual machines from being

compromised in a virtual networking environment. The need for appropriate

security and privacy solutions designed for cloud computing and data

sharing is what led to the initiation of the idea. To

achieve the aim they proposed multi-phase distributed vulnerability

detection, measurement and countermeasure selection mechanism called

NICE, which was built on attack graph-based analytical models and

reconfigurable virtual network-based countermeasures. A Scenario Attack

algorithm (SA) was used to achieve both control of requests and

attacker identification. The use of a similar setup for virtual machines in

the cloud, e.g., virtualization techniques, virtual machine OS, installed

vulnerable software, networking, etc., attracted attackers to compromise

multiple virtual machines.

Vikram et al. (2016), in India, reviewed Network Intrusion Detection and

Countermeasure, explaining the security and privacy of the cloud computing

environment. NICE is proposed to detect and mitigate collaborative attacks

in the cloud virtual networking environment. The objective of the paper is

to prevent vulnerable virtual machines from being compromised and to

prevent unauthorized access to the cloud database. The need to prevent

vulnerable virtual machines on the network from being compromised, to

improve detection accuracy and to defeat the victim exploitation phases of

collaborative attacks led to the development of a multi-phase

vulnerability detection and countermeasure mechanism called NICE. It was

built on attack graph-based analytical models and reconfigurable virtual

network-based countermeasures; deep packet inspection is also applied. The

system performance evaluation indicates the

feasibility of NICE and shows that the proposed solution significantly

reduced the risk of the cloud system from being exploited and misused by

internal and external attackers. Cloud users usually have the right to

control software installed on their managed virtual machines because of

their Service Level Agreement (SLA), which made it difficult to fix

loopholes within the network and to work efficiently.

Radharani et al. (2017) documented Countermeasure Selection and Intrusion

Detection using NICE in Virtual Network Systems. The aim of the work is to

prevent susceptible virtual machines from being compromised in the cloud;

the objective is to find unauthorized access to a computer network by

analyzing traffic on the network for signs of intruder activity. The need

to develop an intrusion detection and countermeasure technique for virtual

network systems that will prevent susceptible virtual machines from being

compromised in the cloud prompted the development of the idea. Materials

and methods such as an attack graph-based analytical model with

reconfigurable virtual network-based countermeasures, graph-based attack

model analysis and a reconfiguration plan were used. NICE detects and

minimizes attacks on the cloud server and significantly reduces the danger

of the cloud system being abused by inside and outside attackers. The

proposed solution also improves detection accuracy. The challenges were the

inability to create an efficient attack detection and response system for

correctly recognizing attacks and decreasing the effect of security

problems on cloud users, and the impossibility of fully controlling cloud

traffic and interference.

Ayesha (2015), in India, reviewed Network Intrusion Detection and

Countermeasure Selection. The aim of the work is to initiate a system that

will counter Distributed Denial of Service (DDoS) attacks on cloud users

and to establish a defense-in-depth intrusion detection framework. The

objective of the work is to set up a technique that will scan the network

for zombies or unauthorized access, prevent the cloud server from being

compromised and patch any loopholes that may be found in the network. The

need to keep vulnerable virtual machines from being compromised in the

cloud led to the proposal of a multi-phase distributed vulnerability

detection, measurement, and countermeasure selection system called NICE,

which is based on attack graph-based analytical models and reconfigurable

virtual network-based countermeasures. The system and security evaluations

show the efficiency and effectiveness of the proposed solution, and NICE

optimizes the implementation on cloud servers to minimize resource

consumption. Difficulties arose when it came to fixing loopholes because

cloud users sometimes install illegal programs on their machines.

James et al. (2016), in Ghana, documented Network Intrusion Detection and

Countermeasure Selection in Virtual Networks. The aim of the work is to

build a frequency-based Intrusion Detection System (IDS) to detect

Distributed Denial of Service (DOS) attack. The objective of the work is to

build a range of security techniques to detect and report malicious system

and network activity and to record evidence of intrusion. The need to

maintain confidentiality, integrity, control, authenticity, availability, and

utility of the network resources, prompted the introduction of an effective

and efficient approach for network intrusion detection as an effective

countermeasure for various network attacks. Network Simulator2 (NS2)

software, Frequency-based Intrusion Detection System (IDS) to detect DOS

attack, and Discrete Fourier Transform (DFT) were considered. The results

indicate the traffic visualization and connection-based activities analysis of

various network nodes in the form of frequency patterns. The system lacks

the ability to detect recently emerging threats such as Reflective

Distributed Denial of Service (RDDoS) attacks.
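The idea behind a frequency-based IDS can be shown with a direct Discrete Fourier Transform over a traffic time series: regular, machine-generated traffic shows up as a strong peak at one frequency. This is a minimal sketch under stated assumptions and not the cited authors' implementation; the function names and the synthetic `traffic` series are hypothetical, and the O(n²) transform is written out with the standard library for clarity rather than speed.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitude of each DFT coefficient X[k] = sum_t x[t]*exp(-2*pi*i*k*t/n)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples)))
        for k in range(n)
    ]


def dominant_frequency(samples):
    """Index k in 1..n/2 of the strongest periodic component (k=0 is the mean)."""
    mags = dft_magnitudes(samples)
    return max(range(1, len(samples) // 2 + 1), key=lambda k: mags[k])


# Hypothetical packets-per-second series that repeats every 4 seconds.
traffic = [10 + 5 * math.cos(2 * math.pi * t / 4) for t in range(16)]
print(dominant_frequency(traffic))  # 16 samples / period 4 -> component k = 4
```

A detector built on this would compare the spectrum of live traffic against the spectrum of known-normal traffic and raise an alert when an unexpected peak appears.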

Naga et al. (2017) reviewed Network Intrusion Detection and

Countermeasure Selection in Virtual Network Systems. The aim of

the work was to build a maintenance and control plane for distributed

programmable virtual switches, and to enhance the identification of attacks

and alleviate attack consequences. The objective of the framework is

to identify and remove threats in a cloud virtual system environment. The

need to obtain an accurate attack detection and diffusion process prompted

the design of the Intrusion Detection System (IDS) framework. The work

deployed a lightweight mirroring-based network intrusion detection agent

(NICE-A) on each cloud server to capture and analyze cloud traffic, and a

novel attack graph approach that detects and prevents attacks by

correlating attack behavior and suggests effective countermeasures. The

security performance evaluation shows that the approach achieves the

design security goals: to prevent vulnerable VMs from being compromised and

to do so in a less intrusive and cost-effective manner. The challenge faced

was the difficulty of creating an effective attack/vulnerability detection

and response system that accurately detects attacks and lowers the effect

of security breaches in the cloud system.

Pinki et al. (2016) reviewed Network Intrusion Detection and

Countermeasure Selection in Virtual Network Systems. The main

aim of this project is to prevent vulnerable virtual machines from being

compromised in the cloud environment. The objective of the work is to

identify possible attacks, collect information about them, try to stop

their occurrence and finally report them to the system administrator. The

need to avert these virtual machines from being compromised brought about

the multi-phase solution NICE. An Alert Correlation Algorithm, a

Countermeasure Selection Algorithm and a reconfigurable virtual networking

approach to detect and counter attempts to compromise virtual machines

were employed. The Intrusion Detection System (IDS) framework introduced

incorporates a software switching solution to quarantine and inspect

suspicious virtual machines for further investigation and protection.

Through programmable network approaches, it improved the attack detection

probability and the resiliency to VM exploitation attacks without

interrupting existing normal cloud services. The challenges noted were the

lack of a detection and prevention framework in a virtual networking

environment and poor accuracy in attack detection.
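A countermeasure selection step of the kind mentioned above has to trade the benefit of a countermeasure against what it costs and how much it disrupts legitimate users. The sketch below is not the algorithm from the reviewed work: the scoring rule (benefit divided by cost plus intrusiveness), the `Countermeasure` type and all numeric values are assumptions chosen for illustration, and cost plus intrusiveness is assumed to be positive.

```python
from typing import NamedTuple


class Countermeasure(NamedTuple):
    name: str
    benefit: float        # assumed reduction in attack likelihood
    cost: float           # assumed resource cost of applying it
    intrusiveness: float  # assumed disruption to legitimate services


def select_countermeasure(options):
    """Pick the option with the highest benefit per unit of cost + disruption."""
    return max(options, key=lambda cm: cm.benefit / (cm.cost + cm.intrusiveness))


# Hypothetical candidates and scores.
OPTIONS = [
    Countermeasure("quarantine VM", benefit=0.8, cost=2, intrusiveness=4),
    Countermeasure("block port", benefit=0.6, cost=1, intrusiveness=2),
    Countermeasure("deep packet inspection", benefit=0.4, cost=3, intrusiveness=1),
]

print(select_countermeasure(OPTIONS).name)  # block port
```

With these assumed values, quarantining the VM gives the largest benefit but is penalised for its disruption, so the cheaper port block scores highest.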

Vidya et al. (2017) reported on Network Intrusion Detection and

Countermeasure Selection in Wireless Sensor Network. The main aim of the

work is to improve the attack detection probability and the resiliency to

virtual machine exploitation attacks without interrupting existing normal

virtual network system services. The need to optimize the implementation

of an IDS on virtual network system servers to minimize resource

consumption prompted the initiation of NICE. It employs a novel attack

graph approach for attack detection and prevention by correlating attack

behavior, suggests effective countermeasures, and uses a reconfigurable

virtual networking approach to detect and counter attempts to compromise

virtual machines. The result of the system implementation shows that it

incurs less computational overhead than proxy-based network intrusion

detection solutions.

Nirmala et al. (2015) reported on a Network Intrusion Detection System

with Deceptive Virtual Hosts for Industrial Control Networks. The

work seeks to develop a multi-phase distributed network intrusion detection

system that effectively manages cloud traffic without interrupting users’

applications and cloud services. The need to build a system that will

monitor vulnerable virtual machines and prevent them from being

compromised in the cloud server effectively and accurately prompted the

idea. Various tools and models such as honeypots, a reconfigurable virtual

network and honeyd were used. The proposed system implementation succeeded

in achieving automatic virtual network creation by means of virtual hosts

and in effectively detecting malicious behaviours in a network system. The

inability to handle security is a major challenge faced.

CHAPTER THREE
RESEARCH METHODOLOGY
3.1 Research Methodology
This study investigates Network Intrusion Detection and

Countermeasure Selection and carries out a root cause analysis of

performance using data mining techniques. It further provides a solution to

fill performance gaps by suggesting a capability work culture model for

improving the quality of network usage, which in turn will provide an

accurate countermeasure for any attack detected.

To investigate these factors, data mining techniques were considered.

DATA MINING: Data mining is the process of discovering hidden

knowledge from data. It uses a combination of a knowledge base,

sophisticated analytical algorithms, and domain knowledge to reveal hidden

patterns.

DATA MINING TOOLS: These are tools that can be used for performing

data mining. They include MATLAB, Orange, RapidMiner, WEKA

(Waikato Environment for Knowledge Analysis), KNIME, Sisense, SSDT

(SQL Server Data Tools) and Apache Mahout, among others. For the purpose of

this project work WEKA is used.

WEKA (Waikato Environment for Knowledge Analysis): WEKA is an

open-source and easily available tool which has a wide range of

algorithms for this purpose. Weka is a collection of machine learning

algorithms for data mining tasks. The algorithms can be applied directly to a

dataset. Weka contains features and methods that support data

pre-processing, visualization of data and the application of

classification, regression, clustering, and association rules.

Intelligence and prior knowledge are required in every phase of software development. The motivation for this study is to improve accuracy in terms of attack detection rate and to prevent virtual machines from being compromised in the network.

Figure 3.1: Project Framework

The project framework consists of the following steps: problem definition, data collection and preprocessing, applying the classification technique, and deriving results. The project objective was analyzed, and the research problem was defined along with the framing of a hypothesis that takes the constraints into account. Based on the problem definition, data was collected. The data was then preprocessed so that it could be applied to various classifiers. The classifiers were compared on the basis of the results they generated and their accuracy.

3.2 Constraints

This work enables one to apply data mining techniques to analyze large volumes of data regarding network intrusion characteristics in order to reveal those attributes which contribute to good performance. A major constraint is that this study is limited to web-based applications which are more of a service than a product. These include applications such as property portals, online shopping sites, search engines and email portals. The rationale for this limitation is the constraint on research time and the limited availability of empirical data from the network. However, with the data set obtained, a study of network-based investigations was carried out. The project work progressed by formulating the hypothesis below.

Hypothesis: Certain attributes pertaining to the effect of network intrusion on a network environment, under such working conditions, determine the impact of intrusion on network performance and thereby influence the quality of network usage.

Some of the attributes considered are programming skills, reasoning skills,

data mining skills and data communication and networking skills. Thus the

impact of such attributes on performance is estimated using data mining

techniques.

3.3 Data Collection

Dataset information was collected from projects developed in a similar domain using similar technology and programming language. After framing the objectives, the next stage was data collection. The data came from various sources and was very large: more than 1000 data packets with different attributes were collected. At this point it was necessary to determine the sample for the empirical study. The sample should be of an optimum size so as to represent the population of the data. If it is too large, the results may not be consistent; if it is too small, it may not give proper and accurate results. For the purpose of this project, a data set of 100 records with 39 attributes was used.


This study took attributes related to network transmission protocol such as

TCP/IP & HTTP, network security protocols such as HTTPS & SFTP and

network management protocols like SNMP & ICMP, as well as analysis and

performance prediction.

Dataset Attribute

Attribute            Type          Null  Default
spkts                varchar(5)    Yes   NULL
dpkts                varchar(5)    Yes   NULL
sbytes               varchar(6)    Yes   NULL
dbytes               varchar(6)    Yes   NULL
rate                 varchar(4)    Yes   NULL
sttl                 varchar(4)    Yes   NULL
dttl                 varchar(4)    Yes   NULL
sload                varchar(5)    Yes   NULL
dload                varchar(5)    Yes   NULL
sloss                varchar(5)    Yes   NULL
dloss                varchar(5)    Yes   NULL
sinpkt               varchar(6)    Yes   NULL
dinpkt               varchar(6)    Yes   NULL
sjit                 varchar(4)    Yes   NULL
djit                 varchar(4)    Yes   NULL
swin                 varchar(4)    Yes   NULL
stcpb                varchar(5)    Yes   NULL
dtcpb                varchar(5)    Yes   NULL
dwin                 varchar(4)    Yes   NULL
tcprtt               varchar(6)    Yes   NULL
synack               varchar(6)    Yes   NULL
ackdat               varchar(6)    Yes   NULL
smean                varchar(5)    Yes   NULL
dmean                varchar(5)    Yes   NULL
trans_depth          varchar(11)   Yes   NULL
response_body_len    varchar(17)   Yes   NULL
ct_srv_src           varchar(10)   Yes   NULL
ct_state_ttl         varchar(12)   Yes   NULL
ct_dst_ltm           varchar(10)   Yes   NULL
ct_src_dport_ltm     varchar(16)   Yes   NULL
ct_dst_sport_ltm     varchar(16)   Yes   NULL
ct_dst_src_ltm       varchar(14)   Yes   NULL
is_ftp_login         varchar(12)   Yes   NULL
ct_ftp_cmd           varchar(10)   Yes   NULL
ct_flw_http_mthd     varchar(16)   Yes   NULL
ct_src_ltm           varchar(10)   Yes   NULL
ct_srv_dst           varchar(10)   Yes   NULL
is_sm_ips_ports      varchar(15)   Yes   NULL
g_output             varchar(8)    Yes   NULL

Table 3.1: Initial List of attributes

The above-mentioned data attributes were taken into consideration during data collection. A few attributes were obtained from the network department. Many attributes were then removed deliberately, and a few were removed so that the conclusions would be unbiased.

3.4 Data Analysis and Pre-processing

The data came from various sources, as described under data collection. After the sample was selected, it was necessary to integrate and normalize the data as required by the classification techniques.

Data pre-processing is the most crucial aspect of data mining. More than 30 attributes were taken initially. Data pre-processing involves removing first the unnecessary attributes and then the redundant tuples. It can be carried out by various methods, such as Genetic Algorithm, Neural Networks, Rough Set Theory and Entropy Based Discretization. Basically, all methods give a hierarchical order of attributes so that unimportant attributes can be easily trimmed. This study used the Entropy Based Discretization method, since it is capable of extracting knowledge from a large and unclear dataset. Entropy Based Discretization reduces the attributes and also helps in improving the accuracy. The sample should represent the entire data properly, with only the redundant tuples removed. After removing the unimportant attributes, the redundant tuples were removed by using ID3 (Iterative Dichotomiser 3).

The approach for data pre-processing has been shown in Figure 3.2. As

shown in figure 3.2 it has been done in two phases. Firstly, unimportant and

insignificant attributes have been removed. This step is also called column

reduction. The second phase is reducing unimportant and redundant tuples.

This phase is also called row reduction.

Identifying and eligibility attributes were not considered for the project, since the project work focused on finding those technical factors which influenced performance. Some attributes were integrated, aggregated and normalized. Others were removed because they ranked very low in the hierarchy of importance produced by the data pre-processing method, i.e. when column reduction was done. The final list of attributes taken for data analysis is shown in Table 3.2. Column reduction was done using two entropy-based methods in the Weka tool: Information Gain and Gain Ratio.

3.4.1 Attribute Selection Measures (Column Reduction)

An attribute selection measure is a heuristic for splitting the dataset; such measures are therefore also called splitting rules in forming decision trees. An attribute selection measure also provides a ranking of the attributes in the dataset. The most important attribute becomes the root, and the other attributes follow in the tree according to their importance. Attribute selection measures can also be used in data preprocessing for column reduction, i.e. removing unimportant attributes.

Two methods were used for this task: information gain and gain ratio, both based on entropy. Entropy is considered to be a very good discretization measure. It was introduced by Claude Shannon in his seminal work on information theory (Weiss et al., 1998). Information gain, used in ID3 (Iterative Dichotomiser 3) classification, is based on the same concept. Gain ratio, which is used in the C4.5 decision tree, is also based on entropy. Entropy-based discretization reduces data size effectively. It can be used for row reduction as well as column reduction. Column reduction methods are also called attribute selection methods.

Let D be the dataset. Let there be m class labels or output classes denoted by Ci (i = 1..m). Let Ci,D be the set of tuples of class Ci in D, and let |D| and |Ci,D| denote the number of tuples in D and in Ci,D respectively. The concepts of information gain and gain ratio are explained below.

a) Information Gain

Information gain is used in ID3 as the splitting criterion. Information gain is based on entropy. The attribute with the highest information gain is chosen as the splitting attribute for node N. This is the attribute that minimizes the information needed to classify the tuples in the resulting partitions and reflects the least randomness.

The information needed to classify a tuple in D, also called the entropy of D, is given by equ (3.1):

$$\mathrm{Info}(D) = -\sum_{i=1}^{m} p_i \log_2(p_i) \qquad \text{equ (3.1)}$$

where the class label has m distinct values, Ci,D is the set of tuples of class Ci in D, and |D| and |Ci,D| denote the number of tuples in D and in Ci,D. In equation (3.1), pi is the probability that an arbitrary tuple in D belongs to class Ci and is estimated by |Ci,D|/|D|.

Suppose the dataset has an attribute A with v distinct values {a1, ..., av}. When a test is carried out on attribute A, there will be v outcomes; that is, attribute A can be used to split D into v partitions (D1, ..., Dv), where Dj consists of those tuples in D whose outcome is aj. These partitions are also the branches from node N. However, after splitting, a partition may contain tuples from more than one class, so the partitions are generally impure. At this point more information, again measured as entropy, is needed to arrive at a classification. InfoA(D) is the expected information required to classify a tuple from D based on the partitioning by A.

$$\mathrm{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \mathrm{Info}(D_j) \qquad \text{(equ 3.2)}$$

$$\text{where } \mathrm{Info}(D_j) = -\sum_{i=1}^{m} p_i \log_2(p_i) \qquad \text{(equ 3.3)}$$

Here pi is the probability that a tuple in Dj, i.e. a tuple in D having attribute value aj, belongs to Ci.

Information gain is given by

$$\mathrm{Gain}(A) = \mathrm{Info}(D) - \mathrm{Info}_A(D) \qquad \text{(equ 3.4)}$$

The attribute with highest gain is chosen as the splitting criteria at node N.

Information gain also indicates the importance of each attribute and is therefore a good basis for attribute selection and data reduction.
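As a minimal sketch of equations (3.1) to (3.4), the entropy and information gain of an attribute can be computed by simple counting. The function names `entropy` and `info_gain` and the toy data are illustrative assumptions, not part of the WEKA workflow used in this project.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Info(D): entropy of a list of class labels, in bits (equ 3.1)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Gain(A) = Info(D) - Info_A(D) for the attribute at column index attr."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Info_A(D): entropy of each partition, weighted by its share of D (equ 3.2)
    info_a = sum(len(p) / len(labels) * entropy(p) for p in partitions.values())
    return entropy(labels) - info_a

# Hypothetical toy data: two binary attributes and a yes/no class label.
rows = [(0, 1), (0, 0), (1, 1), (1, 0)]
labels = ["yes", "yes", "no", "no"]

# Rank attribute indices from most to least informative.
ranking = sorted(range(2), key=lambda a: info_gain(rows, labels, a), reverse=True)
```

In this toy data attribute 0 separates the classes perfectly while attribute 1 carries no information, so attribute 0 is ranked first; low-ranked attributes are the candidates for column reduction.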

b) Gain Ratio

To overcome the shortcomings of information gain, Quinlan used gain ratio in C4.5, which is a successor of ID3. It extends information gain by using split information. Split information normalizes the information gain and is given in equ (3.5):

$$\mathrm{SplitInfo}_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \times \log_2\!\left(\frac{|D_j|}{|D|}\right) \qquad \text{(equ 3.5)}$$

It differs from information gain since it considers the number of tuples having each particular outcome of the test on A with respect to the total number of tuples in D. The gain ratio is defined in equation 3.6:

$$\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}_A(D)} \qquad \text{(equ 3.6)}$$

The attribute with maximum gain ratio is selected for splitting.
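The effect of the normalization in equations (3.5) and (3.6) can be illustrated with a short sketch. The helper names and the toy columns below are assumptions made for illustration; returning 0.0 when the split information is zero is also a choice of this sketch, not something specified in the text.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Entropy in bits of the value distribution of a list."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())

def gain_ratio(column, labels):
    """GainRatio(A) = Gain(A) / SplitInfo_A(D) for one attribute column."""
    partitions = {}
    for value, label in zip(column, labels):
        partitions.setdefault(value, []).append(label)
    info_a = sum(len(p) / len(labels) * entropy(p) for p in partitions.values())
    gain = entropy(labels) - info_a
    # SplitInfo_A(D) is the entropy of the partition sizes |Dj| / |D|.
    split_info = entropy(column)
    return gain / split_info if split_info > 0 else 0.0

labels = ["yes", "yes", "no", "no"]
flag = [0, 0, 1, 1]        # hypothetical binary attribute
record_id = [1, 2, 3, 4]   # hypothetical attribute with a unique value per tuple
constant = [7, 7, 7, 7]    # hypothetical attribute with a single value
```

Both `flag` and `record_id` have an information gain of 1 bit here, but the gain ratio of `record_id` is halved because it splits D into four single-tuple partitions. This is the bias towards many-valued attributes that gain ratio is meant to correct.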

By applying the attribute selection methods, i.e. Information Gain and Gain Ratio, unimportant attributes were removed. Another method for discretization is Histogram Analysis. In this method the values are partitioned such that each partition contains the same number of tuples. The partitioning is then applied recursively until the pre-specified number of levels has been reached.
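A single level of the equal-frequency partitioning described above can be sketched as follows. The function name and the sample values are hypothetical, and the recursive application to further levels is left out.

```python
def equal_frequency_bins(values, n_bins):
    """Assign each value a bin index so that bins hold (nearly) equal counts."""
    # Indices of the values in ascending order of value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        # The rank of a value decides its bin: the lowest ranks go to bin 0.
        bins[idx] = rank * n_bins // len(values)
    return bins

# Hypothetical byte counts for eight records, split into four bins.
sbytes = [496, 1762, 1068, 900, 2126, 784, 1960, 1384]
bins = equal_frequency_bins(sbytes, 4)
```

Because bins are assigned by rank, equal values can end up in neighbouring bins; a production implementation would normally move the bin boundary so that ties stay together.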

Information gain in the Weka tool gave an attribute ranking. Thereafter, the attributes with low rankings, i.e. the irrelevant attributes, were eliminated, and column reduction was thereby achieved. This trims the data considerably for analysis.

3.5 Mining Models

Data mining offers a large collection of methods for mining knowledge: association, clustering, classification and regression. Considering the data type for this study, it was most appropriate to apply a supervised learning method, since the output class labels were available in the data. Therefore, the main focus of this study was on classification. The classifiers used are detailed below.

3.5.1 Classification

There are many classification techniques, such as neural networks, k-nearest neighbour methods, Bayesian classifiers and support vector machines. In this study, however, Bayesian classification and decision trees are used, since decision trees are easy to interpret and understand and also give rules which act as the final information needed for management policies. A decision tree is a flowchart-like structure consisting of internal nodes, leaf (terminal) nodes and branches. A branch represents an outcome of a test, an internal node represents a test on an attribute, and each leaf holds a class label. The most popular decision trees are ID3 (Iterative Dichotomiser 3), CART (Classification and Regression Tree) and Random Forest (Witten I et al. 2005).

Classification is a major aspect of data mining. It predicts the value of Y given the values of a vector of predictor variables X. In this project Y, a finite set of unordered values (the performance outputs), is classified based on the input attributes X, which are the technical capabilities.

The following classification method is used in this project.

3.5.1.1 Bayesian Classification

Classification by the Bayesian method is the process by which a model is created or chosen to best predict the probability of an outcome. In many cases the model is chosen on the basis of detection theory to estimate the probability of an outcome given a set amount of input data. Bayesian classification is a predictive data mining technique which makes predictions using historical data. Predictive models have the specific aim of allowing one to predict the unknown value of a variable of interest given known values of other variables. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined in advance by one or more experts of that domain who examine the data.

Bayes classification methods in pattern recognition and data mining are based on Bayes' rule of conditional probability. Bayes' theorem offers a way to unfold experimental distributions in order to get the best estimates of the true ones. Bayes' rule is a technique to estimate the likelihood of a property given a set of data as input, also called the evidence. The approach is called "naïve" because it assumes independence between the various attribute values. Naïve Bayes classification can be viewed as both a descriptive and a predictive type of algorithm: the probabilities are descriptive, and they are then used to predict the class membership of a target tuple with certain attribute values. The naïve Bayes approach is simple to use because it requires only a small training data set and relies on simplifying assumptions.

Naïve Bayes has been used in many real-time experiments for prediction. Bayesian networks provide a probabilistic method of reasoning under uncertainty. They reveal the patterns in the data which illustrate the high-probability factors, also called reasons. They have been used for predicting software defects, where the probability of defects assisted in removing the bugs (Fenton et al., 2008), and for finding the location of a fault in an electric power delivery system based on the database provided. The Bayesian network could classify fault and non-fault cases depending on the associated probability.

Bayesian decision making refers to choosing the most likely class given the values of the features or attributes. The probabilities of class membership are calculated from Bayes' theorem, which is explained below.

Let the tuple X be denoted by the vector (x1, ..., xd) and let Ci be a class. P(Ci) denotes the prior probability that a random sample is a member of class Ci, and P(X|Ci) is the conditional probability of obtaining attribute values X given that the sample is from Ci. Our goal is to estimate the probability that a sample belongs to class Ci given that it has attribute values X. This probability is denoted by P(Ci|X) and can be calculated according to Bayes' theorem (equ 3.10). The derivation of Bayes' classification can thus be written as below:

D: set of tuples
Each tuple is a 'd'-dimensional attribute vector
X: (x1, x2, x3, …, xd)
Let there be 'k' classes: C1, C2, C3, …, Ck

If there are d attributes or features and k classes, then the probability of the attribute vector is given by equation 3.7:

$$P(x_1, \ldots, x_d) = \sum_{j=1}^{k} P(x_1, \ldots, x_d \mid C_j)\, P(C_j) \qquad \text{(equ 3.7)}$$

which can be computed by assuming that each attribute is independent within each class, as in equation 3.8:

$$P(x_1, \ldots, x_d \mid C_j) = P(x_1 \mid C_j)\, P(x_2 \mid C_j) \cdots P(x_d \mid C_j) \qquad \text{(equ 3.8)}$$

Bayes' theorem of conditional probability states that the probability that a tuple with attribute values x1, x2, ..., xd belongs to class Ci is given by equation 3.9:

$$P(C_i \mid x_1, \ldots, x_d) = \frac{P(C_i)\, P(x_1 \mid C_i) \cdots P(x_d \mid C_i)}{\sum_{j=1}^{k} P(C_j)\, P(x_1 \mid C_j) \cdots P(x_d \mid C_j)} \qquad \text{(equ 3.9)}$$

The naïve Bayes classifier predicts that X belongs to class Ci if

P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ k, j ≠ i

The maximum a posteriori hypothesis is given by equation 3.10:

$$P(C_i \mid X) = \frac{P(X \mid C_i)\, P(C_i)}{P(X)} \qquad \text{(equ 3.10)}$$

Bayes classification aims to maximize P(X|Ci) P(Ci), as P(X) is constant across classes. With many attributes, it is computationally expensive to evaluate P(X|Ci); therefore the naïve assumption of "class conditional independence" is made. The final derived equation under this assumption is given by equations 3.11 and 3.12:

$$P(X \mid C_i) = \prod_{l=1}^{d} P(x_l \mid C_i) \qquad \text{(equ 3.11)}$$

$$P(X \mid C_i) = P(x_1 \mid C_i) \times P(x_2 \mid C_i) \times \cdots \times P(x_d \mid C_i) \qquad \text{(equ 3.12)}$$

Here, X is the vector related to the project personnel having d attributes with values x1, x2, …, xd. The output classes are C1 (good), C2 (average) and C3 (poor) for performance. This project aims at finding, through Bayesian classification, the attribute values that give a high probability in the respective classes as per equation 3.12 (Butler et al., 1992).
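The naïve Bayes computation of equations (3.9) to (3.12) can be sketched in a few lines. This is a minimal illustration and not the WEKA implementation used in this project: the function names and toy data are hypothetical, and the add-one (Laplace) smoothing is an assumption introduced here so that an unseen attribute value does not force a probability of zero.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Collect the counts needed to estimate P(Ci) and P(x_l | Ci)."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    domains = defaultdict(set)           # attribute index -> observed values
    for row, label in zip(rows, labels):
        for l, value in enumerate(row):
            value_counts[(label, l)][value] += 1
            domains[l].add(value)
    return class_counts, value_counts, domains

def posterior(model, row):
    """Return P(Ci | X) for every class, following equ (3.9)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total                      # prior P(Ci)
        for l, value in enumerate(row):          # product of P(x_l | Ci), equ (3.11)
            # Add-one smoothing (an assumption of this sketch).
            score *= (value_counts[(c, l)][value] + 1) / (n_c + len(domains[l]))
        scores[c] = score
    evidence = sum(scores.values())              # P(X), the denominator of equ (3.9)
    return {c: s / evidence for c, s in scores.items()}

def predict(model, row):
    """Choose the class with the highest posterior probability."""
    probs = posterior(model, row)
    return max(probs, key=probs.get)

# Hypothetical toy data: two binary attributes and a yes/no class label.
rows = [(1, 0), (1, 1), (1, 0), (0, 1), (0, 0)]
labels = ["yes", "yes", "yes", "no", "no"]
model = train_naive_bayes(rows, labels)
```

Since P(X) is the same for every class, `predict` would return the same answer if the division by `evidence` were skipped; it is kept so that `posterior` returns proper probabilities.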

3.6 Implementation of classification methods

Firstly, the experiment was conducted on a small dataset using Bayes classification. On obtaining a positive pattern and important revelations, the Weka toolkit was used to continue the experiments and validate the result of the Bayes classification.

The algorithms used for classification in this project are ID3 and CART. Under "Test options", 10-fold cross-validation is selected as the evaluation approach. Since there is no separate evaluation dataset, this option was necessary to get a reasonable idea of the accuracy of the generated model. The model is generated in the form of a decision tree, as shown in the following chapter. These predictive models provide an analytical way for performance analysis.
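The idea behind 10-fold cross-validation can be sketched as follows: the instances are shuffled and divided into ten folds, and each fold serves once as the test set while the other nine are used for training. The function name and the seed are hypothetical, and this sketch does not stratify the folds by class, so it should be read as an illustration of the principle rather than a reproduction of WEKA's procedure.

```python
import random

def k_fold_indices(n_instances, k=10, seed=1):
    """Yield (train, test) index lists; each instance is tested exactly once."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (nearly) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# For the 100-instance dataset: ten splits of 90 training and 10 test instances.
splits = list(k_fold_indices(100, k=10))
```

The accuracy reported for the model is then the proportion of correctly classified instances accumulated over all ten test folds.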

CHAPTER FOUR

DATASET EVALUATION AND RESULT

4.1 Introduction

Network Intrusion Detection and Countermeasure Selection brings a number

of new challenges to a packet filtering system, such as flexibility, scalability

and robustness. Both the existing packet-filtering facilities. In this chapter,

we present a novel approach in the design of a domain-specific language for

network intrusion detection.

4.1.1 Data Discretization

Discretization is the process of putting values into buckets so that there are a limited number of possible states. The buckets themselves are treated as ordered and discrete values. Both numeric and string values can be discretized. There are various techniques of data discretization, including histogram analysis, binning, correlation analysis and clustering analysis.

For our data discretization, the clustering analysis technique was used. Below is the result of the discretized dataset.

Dataset

Columns (left to right, in the order of Table 3.1): spkts, dpkts, sbytes, dbytes, rate, sttl, dttl, sload, dload, sloss, dloss, sinpkt, dinpkt, sjit, djit, swin, stcpb, dtcpb, dwin, tcprtt, synack, ackdat, smean, dmean, trans_depth, response_body_len, ct_srv_src, ct_state_ttl, ct_dst_ltm, ct_src_dport_ltm, ct_dst_sport_ltm, ct_dst_src_ltm, is_ftp_login, ct_ftp_cmd, ct_flw_http_mthd, ct_src_ltm, ct_srv_dst, is_sm_ips_ports, g_output. Each row is one record and every value is 0 or 1; digits printed without a space between them (for example 10 or 000) are separate column values.

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 1 0 1

1 1 1 0 0 01 0 0 1 1 0 1 111 1 0 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 1 0 1 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 1 0 0 0 1 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 0 0 0

0 0 0 0 0 11 0 0 0 0 0 1 101 1 1 1 1 1 1 1 0 1 0 1 1 0 0 0 0 0 0 0 0 0 1 0

0 0 0 0 0 11 0 0 0 0 0 1 101 1 1 1 1 1 1 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0

0 0 0 0 0 11 0 0 0 0 0 1 101 1 1 1 1 1 1 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0

0 0 0 0 0 01 0 0 0 0 0 0 001 0 1 1 1 0 1 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 0


0 0 0 0 0 11 0 0 0 0 0 0 001 1 1 1 1 1 1 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 0

0 0 0 0 0 11 0 0 0 0 0 0 001 0 1 1 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 0 11 0 0 0 0 0 1 101 1 1 1 1 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1

1 1 0 1 0 01 0 0 0 1 0 0 011 0 1 1 1 1 1 0 1 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1

0 0 0 0 0 11 0 0 0 0 0 0 001 1 0 1 1 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1

1 1 1 0 0 11 0 0 1 0 0 0 001 0 1 1 1 1 1 1 0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1

0 0 0 0 1 10 1 0 0 0 0 0 000 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 0 1

Fig.1: Discretized Dataset
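The text does not state which clustering algorithm produced the 0/1 values in the discretized dataset. One simple possibility, shown here purely as an assumption, is to cluster each numeric column into two groups with one-dimensional 2-means and code the low cluster as 0 and the high cluster as 1. The function name and the sample values are hypothetical.

```python
def two_means_discretize(values, iterations=20):
    """Split numeric values into cluster 0 (low) and cluster 1 (high)."""
    low, high = min(values), max(values)   # initial cluster centres
    for _ in range(iterations):
        # Assign each value to the nearer centre (ties go to the low cluster).
        codes = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        lows = [v for v, c in zip(values, codes) if c == 0]
        highs = [v for v, c in zip(values, codes) if c == 1]
        if not highs:                      # all values identical: one cluster
            break
        # Move each centre to the mean of the values assigned to it.
        low, high = sum(lows) / len(lows), sum(highs) / len(highs)
    return codes

# Hypothetical raw values of one attribute for five records.
rate = [11.1, 9.5, 250000.0, 10.2, 333333.3]
codes = two_means_discretize(rate)
```

Applying such a function to every column of the raw data would yield a table of 0/1 values of the same shape as Fig. 1.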

4.1.2 Information Gain

Information gain is the amount of information that is gained by knowing the value of the attribute, which is the entropy of the distribution before the split minus the entropy of the distribution after it. The largest information gain is equivalent to the smallest entropy after the split. Information theory measures information in bits:

$$\mathrm{entropy}(p_1, p_2, \ldots, p_n) = -\sum_{i=1}^{n} p_i \log_2 p_i = -p_1 \log_2 p_1 - p_2 \log_2 p_2 - \cdots - p_n \log_2 p_n$$

Information gain = (entropy of the distribution before the split) − (entropy of the distribution after it)

Information gain formula:

$$\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{Values}(A)} \frac{|S_v|}{|S|}\, H(S_v)$$

where v ranges over the possible values of A, S is the set of examples, and Sv is the subset of S where XA = v.

Gain Ratio Interface

Fig.2: Gain ratio front face

In order to access the gain ratio information, the process button is clicked and the gain ratio information is displayed below.

Fig.3: Gain ratio interface

The above interface reports three different measures: Info (D), Split Info and Gain Ratio. These measures evaluate the different packets in a network and show the packet which is most vulnerable to attacks.

Fig. 4: Evaluation on training set on Weka

Time taken to test model on training data: 0.01 seconds

Summary

Correctly Classified Instances        82        82      %
Incorrectly Classified Instances      18        18      %
Kappa statistic                        0.2991
Mean absolute error                    0.2628
Root mean squared error                0.364
Relative absolute error               87.8326 %
Root relative squared error           94.7293 %
Total Number of Instances            100

Detailed Accuracy By Class

               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC    ROC Area  PRC Area  Class
               0.927    0.667    0.864      0.927   0.894      0.308  0.773     0.933     yes
               0.333    0.073    0.500      0.333   0.400      0.308  0.773     0.447     No
Weighted Avg.  0.820    0.560    0.798      0.820   0.805      0.308  0.773     0.845

Confusion Matrix

  a   b   <-- classified as
 76   6 | a = yes
 12   6 | b = No

Naïve Bayes classification using WEKA
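Most of the summary figures can be re-derived from the four cells of the confusion matrix, which is a useful consistency check. The sketch below takes the class "yes" as the positive class; the variable names are illustrative.

```python
from math import sqrt

# Confusion matrix from the WEKA output: rows = actual class, columns = predicted.
tp, fn = 76, 6    # actual yes: predicted yes / predicted No
fp, tn = 12, 6    # actual No:  predicted yes / predicted No
total = tp + fn + fp + tn

accuracy = (tp + tn) / total
tp_rate = tp / (tp + fn)            # recall for class yes
fp_rate = fp / (fp + tn)            # share of actual No instances predicted as yes
precision = tp / (tp + fp)
f_measure = 2 * precision * tp_rate / (precision + tp_rate)

# Cohen's kappa: agreement beyond what the row and column totals give by chance.
expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
kappa = (accuracy - expected) / (1 - expected)

# Matthews correlation coefficient.
mcc = (tp * tn - fp * fn) / sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
```

The computed values agree with the reported output. They also show that the 82% accuracy is driven largely by the majority class: only 6 of the 18 "No" instances are recognised, which is why the kappa statistic is as low as 0.2991.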

Fig. 5: Visualize Error

The above interface visualizes the classification errors, based on the order of the instances in the evaluation process.

Fig. 6: Margin Curve

The margin curve generates points illustrating the prediction margin. The margin is defined as the difference between the probability predicted for the actual class and the highest probability predicted for the other classes. A classifier that increases the margin on the training data tends to give better performance on test data.
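The margin defined above can be computed per instance from the predicted class probabilities. The following sketch uses hypothetical probability distributions and a hypothetical function name; sorting the margins gives the values that a margin curve plots cumulatively.

```python
def prediction_margins(distributions, actual):
    """Margin = P(actual class) - highest P(any other class), per instance."""
    margins = []
    for probs, true_class in zip(distributions, actual):
        others = [p for c, p in probs.items() if c != true_class]
        margins.append(probs[true_class] - max(others))
    return sorted(margins)

# Hypothetical predicted distributions for three instances.
dists = [{"yes": 0.9, "No": 0.1}, {"yes": 0.6, "No": 0.4}, {"yes": 0.7, "No": 0.3}]
actual = ["yes", "No", "yes"]
margins = prediction_margins(dists, actual)
```

A negative margin marks a misclassified instance (here the second one, whose actual class "No" received the lower probability), while margins close to 1 mark confident correct predictions.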

4.4 NIDPS's Performance in Analysis Mode (Sniffer Mode)

Here, the Snort NIDPS has been configured in analysis or sniffer mode. The following metrics were recorded: the number of packets received of the total packets sent; the number of packets analyzed of the total packets received; the number of packets dropped of the total packets received; the number of packets rejected of the total packets received; and the number of packets outstanding of the total packets received. Specific results are given in the following sections.

4.4.1. Data Attribute Graph

[Figure: Attribute Graph. x-axis ticks from 1 to 100; y-axis ticks from 0 to 20. Legend: spkts, dpkts, sbytes, dbytes, rate, sttl, dttl, sload, dload, sloss, dloss, sinpkt, dinpkt, sjit, djit, swin, stcpb, dtcpb, dwin, tcprtt, synack, ackdat, smean, dmean, trans_depth, response_body_len, ct_srv_src, ct_state_ttl, ct_dst_ltm, ct_src_dport_ltm, ct_dst_sport_ltm, ct_dst_src_ltm, is_ftp_login, ct_ftp_cmd, ct_flw_http_mthd, ct_src_ltm, ct_srv_dst, is_sm_ips_ports.]

Figure 7: Data Attribute Graph


The data attribute graph explains the different packets collected from

different networks and analyzed by their semantic similarities.

Figure 8: Dataset Table Sample
The above figure shows the discretized dataset used as a sample in the

Weka environment.

4.4.2 Check Performance in the Confusion Matrix

Use the confusion matrix plot to understand how the currently selected

classifier performed in each class. To view the confusion matrix after

training a model, on the Classification Learner tab, in the Plots section,

click Confusion Matrix. The confusion matrix helps you identify the areas

where the classifier has performed poorly.

When you open the plot, the rows show the true class, and the columns show

the predicted class. The diagonal cells show where the true class and

predicted class match. If these cells are green and display high percentages,

the classifier has performed well and classified observations of this true class

correctly.

The default view shows summaries per true class in the last two columns on

the right.

Using the dataset, the top row shows all observations whose true class is the first class, and the columns show the predicted classes. In the top row, 83% of the observations are correctly classified, so this percentage is the true positive rate for correctly classified points in this class, shown in the green cell in the True Positive Rate column.

If you want to see numbers of observations instead of percentages, under Plot, select Absolute observations.

If false positives are important in your classification problem, plot results

per predicted class (instead of true class) to investigate false discovery rates.

To see results per predicted class, under Plot, select the Positive Predictive

Values False Discovery Rates option. The confusion matrix now shows

summary rows underneath the table. Positive predictive values are shown in

green for the correctly predicted points in each class, and false discovery

rates are shown below in red for the incorrectly predicted points in each

class.

If you decide there are too many misclassified points in the classes of

interest, try changing classifier settings or feature selection to search for a

better model.

Figure 9: Confusion Matrix

CHAPTER FIVE
SUMMARY, CONCLUSION AND RECOMMENDATION

5.0 SUMMARY

Network computing has increased in many organizations. It provides many

benefits in terms of low cost and accessibility of data. Ensuring the security

of the network is a major factor in network environment, as users of the

network often store sensitive information with network storage providers but

these providers may be untrusted. In this project an Intrusion Detection and

Countermeasure Selection in Virtual Network Systems mechanism called

NICE is used to prevent vulnerable virtual machines from being

compromised in the network. NICE detects and mitigates collaborative

attacks in the virtual networking environment. The system performance

evaluation demonstrates the feasibility of NICE and shows that the proposed

solution can significantly reduce the risk of the network system from being

exploited and abused by internal and external attackers.

5.1 CONCLUSION

Network Intrusion Detection and Countermeasure Selection (NICE), together with the related security concepts, is used to detect and mitigate collaborative attacks in the virtual networking environment. NICE uses the attack graph model to conduct attack detection and prediction. It investigates how to use the programmability of software-switch-based solutions to improve detection accuracy and defeat the victim exploitation phases of collaborative attacks.

1. NICE will improve the attack detection probability and improve the

resiliency to Virtual Machines (VM) exploitation attack without interrupting

existing normal network services.

2. NICE employs a novel attack graph approach for attack detection and

prevention by correlating attack behaviour and also suggests effective

countermeasures.

5.2 RECOMMENDATION

A network-based intrusion detection system (IDS) plugs directly into your network and monitors activity. Such a system places very little overhead on the network because it only watches the network traffic, sends alerts if it detects anything abnormal within the network, and counters any such abnormality or attacker. These systems are primarily passive devices that are virtually undetectable by hackers, but they are not perfect.

• The NICE framework can be used in any organization, institution or office that is involved in network computing.

• It can be used not only for monitoring but also for controlling any number of PCs connected in a LAN.

• It is used to monitor network activities and send alerts to the network administrator.

REFERENCES
Ayesha, A. (2015). Network Intrusion Detection and Countermeasure Selection:

International Journal of Innovative Research in Computer and Communication

Engineering, Vol. 3, Issue 8.

Bhagyashri, S., and Sonali, P. (2015). Review on Countermeasure Selection and

Networking Intrusion Detection using Virtual Network Systems: International

Journal of Electronics, Communication & Soft Computing Science and

Engineering, ISSN: 2277-9477.

James, D., Maxwell, C., and Griffith S. (2016). Network Intrusion Detection and

Countermeasure Selection in Virtual Network (NIDCS): IJSPTM, Vol 5, No 1.

Naga, R., and Subhani, S. (2017). Network Intrusion Detection and

Countermeasure Selection in Virtual Network Systems: International Journal

of Computers, Electrical and Advanced Communication Engineering, Volume

1, Issue 11, PP: 89 – 100.

Nirmala, D., and Selvalakshmi, C. (2015). Network intrusion detection system

with Deceptive Virtual Hosts for Industrial Control Networks: International

Journal of Scientific Engineering and Applied Science (IJSEAS), Vol. 1,

Issue.3.

Pinki, K., and Avni, K. (2016). Network Intrusion Detection and Countermeasure

Selection in Virtual Network Systems: International Journal of Electrical and

Electronics Research, ISSN 2348-6988, Vol. 4, Issue 3, pp: (9-13).

Radharani, S., and Leela, P. (2017). Countermeasure Selection and Intrusion

Detection Using NICE in Virtual Network Systems: International Journal of

Pure and Applied Mathematics, Volume 117, No. 19, ISSN: 1314-3395.

Rakhi, R. (2016). Network Intrusion Detection and Countermeasure: International

Journal of Innovative and Emerging Research in Engineering Volume 2, Issue

3.

Tejashree, A., Raksha, S., Gayatri, D., and Monika, V. (2015). Secure Network

Communication and Intrusion Detection in Virtual Machines: IJCSMC, Vol. 4,

Issue. 4, pg.36 – 40.

Vidya, B., and Megha, F. (2016). Network Intrusion Detection and

Countermeasure Selection in Wireless Sensor Network: International Journal

of Engineering Science and Computing, vol. 1, ISSN 2321 3361.

Vikram, K., Anitha, B., Padmavathi, G., and Sravani, D. (2016). Network Intrusion

Detection and Countermeasure: International Journal of Computer Science

And Technology, Vol. 7, Issue 3.

Vipin, S., and Himanshu, A. (2015). Network Intrusion Detection using Feature

Selection and PROAFTN Classification: International Journal of Scientific &

Engineering Research, Volume 6, Issue 4, pp: (466-472).
