
Detection of Application Layer DDoS Attack using Hidden Semi Markov Model

I thank the Almighty, who has been with me through every walk of my life,
guarding me and showering his blessings throughout the endeavor to put
forth my dissertation.

It gives me great pleasure to place on record my heartfelt gratitude to
Firstsoft Technologies Pvt. Ltd. for its invaluable guidance and constant
encouragement throughout the course of the project.


I am thankful to all the staff for extending their support during the
course of the project. Last but not least, I wish to thank my parents and
friends, who provided extensive support in making this work a success.

Firstsoft Technologies Private Limited is a Chennai-based software
development company specialized in customized client/server software
solutions, Internet-centric application development, business process
outsourcing and project consulting. Firstsoft has successfully executed
projects in varied business domains on client/server-based platforms. The
company is committed to providing software services and products of assured
quality to ensure customer satisfaction. Its focus has consistently been on
adopting a solutions-oriented approach and state-of-the-art tools and
techniques to implement high-quality, cost-effective business solutions.


Our technical team comprises well-balanced and experienced software
professionals serving the IT industry. Most of our team members have
abundant technical skills that help the team handle the extremely varied
requirements of our clients, thereby providing total solutions under one
banner. They are experienced in developing creative techniques to make
meaningful use of the latest technologies, which enhances the value and
quality of the services. We meet client challenges by having the best
people and processes in place to consistently deliver services and
solutions on time and on budget, across technologies and industries. We
provide the expertise and experience required to deliver solutions that are
data driven, business focused, and measurable in terms of our clients’
critical business requirements.

Detection of Application Layer DDOS Attack
using Hidden Semi Markov Model


The recent tide of Distributed Denial of Service (DDoS) attacks against
high-profile web sites demonstrates how damaging DDoS attacks are and how
defenseless the Internet is under such attacks. The services of these web
sites were unavailable for hours or even days as a result of the attacks.
In such an attack, the adversary simultaneously sends a large volume of
traffic to a victim host or network. The victim is overwhelmed by so much
traffic that it can provide little or no service to its legitimate clients.
Because burst traffic and high volume are common characteristics of both
application-layer DDoS (App-DDoS) attacks and flash crowds, it is not easy
for current techniques to distinguish them merely by the statistical
characteristics of traffic. Therefore, App-DDoS attacks that mimic a normal
flash crowd may be stealthier and more dangerous for popular Websites than
general network-layer DDoS (Net-DDoS) attacks. This project proposes a
scheme to capture the spatial-temporal patterns of a normal flash crowd
event and to implement App-DDoS attack detection. Since the traffic
characteristics of the low layers are not enough to distinguish App-DDoS
attacks from a normal flash crowd event, the objective of this project is
to find an effective method to identify whether a surge in traffic is
caused by App-DDoS attackers or by normal Web surfers. This project defines
the Access Matrix (AM) to capture the spatial-temporal patterns of a normal
flash crowd and to monitor App-DDoS attacks during a flash crowd event. A
hidden semi-Markov model is used to describe the dynamics of the AM and to
achieve numerical and automatic detection. Principal component analysis and
independent component analysis are used to deal with the multidimensional
data for the hidden semi-Markov model, and finally the monitoring
architecture is validated with real flash crowd traffic.


The following software tools are required to implement the system, which
is tested using unit testing applications.


Language : Java JDK 1.6

J2EE Technologies : Servlets, JSP

Application Server : Apache Tomcat 5.0

Operating System : Windows XP


Processor : Pentium IV 500MHz.

Monitor : SVGA
Secondary Storage : 40GB HDD
Floppy Drive : 1.44MB

Any attack on the Internet today can be highly devastating.

Distributed Denial of Service (DDoS) attacks are among the most malicious
Internet attacks. They overwhelm a victim system with data such that the
victim's response is slowed or totally stopped. There have been many
instances where DDoS attacks have caused damages worth billions of
dollars. Defending against DDoS attacks has hence become a major priority
in the Internet community. The attacker’s objective is to interrupt or
reduce the quality of service experienced by legitimate users. Many attacks
have innocent counterparts (e.g., someone sends me an e-mail with a very
large attachment, which blocks my access to other messages).

Basic Concepts:
Flash crowd: It is a sudden, large surge in traffic to a particular Web site.

Denial of Service (DoS): It is an explicit attempt to prevent legitimate
users of a service from using that service.
Attack Types:

1) Bandwidth consumption
i) Attackers have more bandwidth than the victim, e.g. a T3 (45 Mbps)
attacks a T1 (1.544 Mbps).
ii) Attackers amplify their bandwidth by engaging other computers to
attack a victim with higher bandwidth, e.g. 100 hosts on 56 Kbps links
attack a T1.
2) Resource starvation: consumes system resources such as CPU, memory and
disk space on the victim machine by flooding.
Smurf, Fraggle, SYN flood: In Smurf and Fraggle attacks, the attacker
sends sustained packets to the broadcast address of an amplifying
network, with the source address forged to read the victim’s IP
address. Since the traffic was sent to the broadcast address, all hosts
in the amplifying LAN answer to the victim’s IP address. In a SYN
flood, if a few SYN packets are sent by the attacker every 10 seconds,
the victim will never clear its connection queue and stops responding to
new connection requests.
Hidden semi Markov Model:
We apply the hidden semi-Markov model (HSMM) to characterize legitimate
request patterns to a Web server and to detect DDoS (distributed denial
of service) attacks on it. Measurements of real workload often indicate
that a significant amount of variability is present in the traffic
observed over a wide range of time scales, exhibiting self-similar or
long-range dependent characteristics. The major advantages of using an
HSMM are its efficiency in estimating the model parameters that account
for an observed sequence, and the ability of the estimated parameters to
capture various statistical properties of the workload, including
self-similarity and long-range and short-range dependence. Therefore, the
HSMM is effective in better understanding the nature of Web workload and
in detecting the anomalous behavior that a DDoS attack may present.

Existing System:

At present most systems are vulnerable to DoS attacks. DoS attacks are of
particular interest and concern to the Internet community because they
seek to render target systems inoperable and/or target networks
inaccessible. "Traditional" DoS attacks, however, typically generate a
large amount of traffic from a given host or subnet, and it is possible for
a site to detect such an attack in progress and defend itself. Distributed
DoS attacks are a much more nefarious extension of DoS attacks because they
are designed as a coordinated attack from many sources simultaneously
against one or more targets. Some existing attack detection mechanisms are
as follows.

1) Signature detection:
Signature detection (also known as misuse detection) looks for
patterns that signal well-known attacks.

2) Anomaly detection:
Anomaly detection is essentially the identification of something out of
the ordinary.
PHAD (packet header anomaly detector):
PHAD extends the four attributes normally used in network anomaly
detection systems (source and destination IP address, source and destination
port numbers). Transport header (TCP, UDP) fields are tested as
appropriate for each protocol. In testing, we discovered that many attacks
could be detected because of unusual values in these fields. In addition to
IP address anomalies, we found that some attacks generate unusually small
packet sizes or unusual combinations of TCP flags (e.g. urgent data,
missing acknowledgements, reserved flags).
ALAD (application layer anomaly detector):
Instead of modeling single packets, as in PHAD, we model incoming
TCP connections to the well-known server ports (0-1023). Although this
misses a few attacks that exploit IP, UDP or higher-numbered ports (such as
X servers), it does (or should) catch most attacks against servers, which
usually use TCP. Attackers keep trying to establish connections to servers
with a huge number of requests, which resembles a flash crowd in the
network and causes resource starvation.
Time-To-Live (TTL)
Here each router marks packets with a dynamic probability.
Specifically, each router marks a packet with a probability proportional to
the distance it has to travel. As such, a packet that has to traverse long
distances is marked with higher probability, compared with a packet with
shorter distances to traverse. This modification ensures that a packet is
marked with much higher probability compared to existing mechanisms,
which greatly reduces the effectiveness of spoofed marks. It can reduce the
number of false positives by 90%.
1) All legitimate packets would be marked at least once by an
intermediate router before they reach the destination (victim).
2) There is an upper bound on the probability that a spoofed (illegitimate)
packet reaches the destination without being marked. This upper bound is a
function of the distance between the sender (attacker) and the destination
(victim). Attackers may set the TTL to a high value, but routers find the
spoofed packets and reduce the TTL based on the distance to the
destination.

1. The existing attack detection mechanisms use only the request rate of
a particular user and the flash crowd event in the network.
2. Other existing defense methods may be those based on schemes.
Those schemes are not effective against the DDoS attack.
They may annoy users and introduce additional
service delays.
3. Though anomaly detection can detect novel attacks, it has the
disadvantage that it is not capable of discerning intent. It can only
signal that some event is unusual, but not necessarily hostile, thus
generating false alarms.

Proposed System:

The goal of the proposed system is to add new attack detection
on top of the existing system. We propose an attack detection
mechanism, a scheme based on document popularity, that uses an
Access Matrix to define the temporal patterns.
A pattern indicates the sequence of website links that a user
follows along some path. We use a sequence anomaly detector based on
the hidden semi-Markov model to detect App-DDoS attacks.
1. The basic idea behind the proposed system is to isolate and protect
legitimate traffic from huge volumes of DDoS traffic when an attack
occurs.
2. Our first step is to distinguish packets that contain genuine source IP
addresses from those that contain spoofed addresses. This is done by
redirecting a client to a new IP address and port number (to receive
web service) through a standard HTTP redirect message.
3. The proposed system uses advanced detection techniques in
addition to the existing techniques to detect App-DDoS attacks.
4. The proposed system uses the Access Matrix to maintain the access
sequence of every user.


The following are the modules obtained by the detailed design of the proposed
system:
1) MAC Generator
2) MAC verifier
3) IP handler
4) Query Handler
5) Access Matrix
6) Hidden semi Markov Model

Module 1:

MAC Generator

This module distinguishes packets that contain genuine source IP addresses
from those that contain spoofed addresses. Once the very first TCP SYN
packet of a client gets through, the proposed system immediately
redirects the client to a pseudo-IP address (still belonging to the website)
and port number pair, through a standard HTTP URL redirect message. Certain
bits from this IP address and port number pair serve as the Message
Authentication Code (MAC) for the client’s IP address. A MAC is a symmetric
authentication scheme that allows a party A, which shares a secret key k
with another party B, to authenticate a message M sent to B with a signature
MAC(M, k). The signature has the property that, with overwhelming
probability, no one can forge it without knowing the secret key k.
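
As a minimal sketch of how such a MAC could be derived and folded into the
redirect target, the Java fragment below computes an HMAC-SHA256 over the
client's IP address and maps its first 16 bits to a port number. The class
name MacGenerator, the choice of HMAC-SHA256, the 16-bit truncation and the
port mapping are all assumptions made for illustration; the text above does
not say which bits of the pseudo-IP address and port carry the MAC.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative MAC generator; names and bit layout are assumptions. */
final class MacGenerator {
    private final byte[] key;

    MacGenerator(byte[] key) {
        this.key = key.clone(); // secret key k, known only to the server side
    }

    /** First 16 bits of HMAC-SHA256(clientIp, k), as an int in 0..65535. */
    int macBits(String clientIp) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] tag = mac.doFinal(clientIp.getBytes(StandardCharsets.UTF_8));
            return ((tag[0] & 0xFF) << 8) | (tag[1] & 0xFF);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    /** Maps the MAC bits into the unprivileged port range 1024..65535. */
    int redirectPort(String clientIp) {
        return 1024 + macBits(clientIp) % (65536 - 1024);
    }

    /** Location value for the HTTP redirect sent to the client. */
    String redirectUrl(String host, String clientIp) {
        return "http://" + host + ":" + redirectPort(clientIp) + "/";
    }
}
```

Because the port depends on both the client's IP address and the secret key,
a client that never receives the redirect has to guess the port.
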

Module 2:

MAC Verifier

This module guards against attackers using either genuine or spoofed
addresses. Since a legitimate client uses its real IP address to
communicate with the server, it will receive the HTTP redirect message
(hence the MAC). So, all its future packets will have the correct MAC inside
their destination IP addresses and thus be protected. The DDoS traffic with
spoofed IP addresses, on the other hand, will be filtered because the
attackers will not receive the MAC sent to them. So, this technique
effectively separates legitimate traffic from DDoS traffic with spoofed IP
addresses.
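
The verification step can be sketched as follows: the server recomputes the
tag it would have issued for the packet's source IP address and compares it
with the tag carried in the packet. The class name MacVerifier, the use of
HMAC-SHA256 and the two-byte tag are assumptions for illustration only, not
the layout used by the system described here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative MAC verifier; the two-byte tag layout is an assumption. */
final class MacVerifier {
    private final SecretKeySpec key;

    MacVerifier(byte[] secret) {
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    /** Tag the server would have issued to this client IP in the redirect. */
    byte[] expectedTag(String clientIp) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            byte[] full = mac.doFinal(clientIp.getBytes(StandardCharsets.UTF_8));
            return new byte[] { full[0], full[1] };
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    /**
     * Accepts a packet only if the tag it carries matches the tag for its
     * claimed source IP. A spoofing sender never saw the redirect, so it
     * would have to guess the tag.
     */
    boolean accept(String sourceIp, byte[] presentedTag) {
        return MessageDigest.isEqual(expectedTag(sourceIp), presentedTag);
    }
}
```

A two-byte tag leaves a 1 in 65536 chance of a lucky guess per packet, so a
real deployment would likely want more MAC bits than this sketch uses.
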
Module 3:

Attacker Prevention (IP Handler Mechanism)

If the server finds that the request rate from an IP address is higher than
the limit, the IP address is moved to the blocked state and no further
responses are provided. Each time a new request arrives, the server gets
its IP address and checks whether this address is in the blocked state or
the normal state. If it is in the blocked state, the service is not
provided; otherwise the request is handled and an immediate response is
given to the normal user.
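
The blocked/normal state check described above could look roughly like the
sketch below, which counts requests per IP address in a fixed time window.
The class name IpHandler, the fixed-window counting and the permanent block
are illustrative assumptions; the text does not specify how the request rate
is measured or whether a block ever expires.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative per-IP rate check; window and limit values are assumptions. */
final class IpHandler {
    private final int maxRequestsPerWindow;
    private final long windowMillis;
    // For each IP: { window start time, requests seen in that window }
    private final Map<String, long[]> counters = new HashMap<>();
    private final Set<String> blocked = new HashSet<>();

    IpHandler(int maxRequestsPerWindow, long windowMillis) {
        this.maxRequestsPerWindow = maxRequestsPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Returns true if the request should be served, false if refused. */
    synchronized boolean allow(String ip, long nowMillis) {
        if (blocked.contains(ip)) {
            return false; // blocked state: no service
        }
        long[] c = counters.get(ip);
        if (c == null || nowMillis - c[0] >= windowMillis) {
            c = new long[] { nowMillis, 0 }; // start a fresh window
            counters.put(ip, c);
        }
        c[1]++;
        if (c[1] > maxRequestsPerWindow) {
            blocked.add(ip); // rate above the limit: move to blocked state
            return false;
        }
        return true; // normal state: handle the request
    }

    synchronized boolean isBlocked(String ip) {
        return blocked.contains(ip);
    }
}
```

In a servlet container such a check would typically run in a filter before
the request reaches the application, with the time taken from
System.currentTimeMillis().
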

Module 4:

Query Handler:

Attackers try to attack popular websites by sending queries in the URL
path. If the queries are executed, they produce unexpected results for the
website. For example, modify and delete queries lead to serious problems
for popular sites. This module checks the URL path and redirects the
request if it contains unwanted queries.
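
A rough sketch of such a URL check is shown below. It decodes the query
string and looks for a small deny-list of SQL keywords. The class name
QueryHandler, the keyword list and the /error.jsp redirect target are
hypothetical; the text only says that modify and delete queries are
unwanted.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Locale;
import java.util.regex.Pattern;

/** Illustrative URL query check; the keyword deny-list is an assumption. */
final class QueryHandler {
    // Whole-word match on keywords that modify or delete data.
    private static final Pattern SUSPICIOUS =
            Pattern.compile("\\b(insert|update|delete|drop|alter)\\b");

    /** Returns true if the raw (still URL-encoded) query looks unwanted. */
    static boolean isSuspicious(String rawQuery) {
        if (rawQuery == null) {
            return false; // no query string at all
        }
        String decoded;
        try {
            decoded = URLDecoder.decode(rawQuery, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        } catch (IllegalArgumentException e) {
            return true; // malformed percent-encoding: treat as unwanted
        }
        return SUSPICIOUS.matcher(decoded.toLowerCase(Locale.ROOT)).find();
    }

    /** Returns the path to serve, or the (hypothetical) error page. */
    static String route(String path, String rawQuery) {
        return isSuspicious(rawQuery) ? "/error.jsp" : path;
    }
}
```

A keyword deny-list like this is easy to evade and can reject harmless
input, so it is best seen as a first filter; parameterized database queries
remain the usual defense against injected SQL.
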

Module 5:

Access Matrix:

This Access Matrix module stores the list of access path sequences of the
online shopping site in one table. The necessary per-client information,
such as the user's id, IP address, port number, access time and the recent
access path sequence, is stored in another separate table for future
reference.
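
The two tables could be modelled in memory as in the sketch below. The
class and field names (AccessMatrix, ClientRecord, MAX_RECENT) and the
cap of 20 recent pages are assumptions for illustration; the text does not
describe the actual table schema.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative in-memory stand-in for the two Access Matrix tables. */
final class AccessMatrix {
    static final int MAX_RECENT = 20; // assumed cap on the stored sequence

    /** One row of the per-client table. */
    static final class ClientRecord {
        final String userId;
        final String ip;
        final int port;
        long lastAccessMillis;
        final List<String> recentPath = new ArrayList<>();

        ClientRecord(String userId, String ip, int port) {
            this.userId = userId;
            this.ip = ip;
            this.port = port;
        }
    }

    // Table 1: access path sequences of the site itself.
    private final Set<List<String>> knownPaths = new HashSet<>();
    // Table 2: per-client information, keyed by user id.
    private final Map<String, ClientRecord> clients = new HashMap<>();

    void addKnownPath(String... pages) {
        knownPaths.add(new ArrayList<>(Arrays.asList(pages)));
    }

    boolean isKnownPath(List<String> path) {
        return knownPaths.contains(path);
    }

    /** Appends a page visit to the client's recent sequence. */
    ClientRecord recordAccess(String userId, String ip, int port,
                              String page, long nowMillis) {
        ClientRecord r = clients.get(userId);
        if (r == null) {
            r = new ClientRecord(userId, ip, port);
            clients.put(userId, r);
        }
        r.recentPath.add(page);
        if (r.recentPath.size() > MAX_RECENT) {
            r.recentPath.remove(0); // keep only the most recent pages
        }
        r.lastAccessMillis = nowMillis;
        return r;
    }
}
```
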

Module 6:
Hidden semi-Markov model:

This module checks the client’s access path sequence against the Access
Matrix table to identify attackers. If the sequence of the access path
differs, the client's IP address is recorded in a separate table as an
attacker.
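
To illustrate the idea of scoring an access sequence, the sketch below
deliberately swaps in a much simpler technique than an HSMM: a first-order
Markov chain over page-to-page transitions, with add-one smoothing. It has
no hidden states and no state durations, so it is only a stand-in for the
kind of sequence anomaly check this module performs. The class name
PathAnomalyDetector, the pageCount parameter and the threshold are all
assumptions.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in: first-order Markov chain, NOT a full HSMM. */
final class PathAnomalyDetector {
    private final int pageCount; // number of distinct pages on the site
    private final Map<String, Map<String, Integer>> counts = new HashMap<>();
    private final Map<String, Integer> totals = new HashMap<>();

    PathAnomalyDetector(int pageCount) {
        this.pageCount = pageCount;
    }

    /** Counts the page-to-page transitions of one normal access path. */
    void train(List<String> path) {
        for (int i = 0; i + 1 < path.size(); i++) {
            String from = path.get(i);
            String to = path.get(i + 1);
            Map<String, Integer> row = counts.get(from);
            if (row == null) {
                row = new HashMap<>();
                counts.put(from, row);
            }
            Integer c = row.get(to);
            row.put(to, c == null ? 1 : c + 1);
            Integer t = totals.get(from);
            totals.put(from, t == null ? 1 : t + 1);
        }
    }

    /** Mean log-probability per transition, with add-one smoothing. */
    double averageLogLikelihood(List<String> path) {
        if (path.size() < 2) {
            return 0.0; // nothing to score yet
        }
        double sum = 0.0;
        for (int i = 0; i + 1 < path.size(); i++) {
            Map<String, Integer> row = counts.get(path.get(i));
            Integer c = (row == null) ? null : row.get(path.get(i + 1));
            Integer t = totals.get(path.get(i));
            double seen = (c == null) ? 0.0 : c;
            double total = (t == null) ? 0.0 : t;
            sum += Math.log((seen + 1.0) / (total + pageCount));
        }
        return sum / (path.size() - 1);
    }

    /** Flags a path whose score falls below the chosen threshold. */
    boolean isAnomalous(List<String> path, double threshold) {
        return averageLogLikelihood(path) < threshold;
    }
}
```

A client whose path scores below the threshold would then have its IP
address recorded as an attacker; choosing the threshold requires scores
from known-normal traffic.
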
Block Diagram:

[Figure: block diagram showing Client 1, Client 2, ..., Client n, the MAC
Generator & MAC Verifier, the Server, the Query Handler, the IP Handler,
the Access Matrix and the Hidden Semi Markov Model.]

Dataflow Diagram:

[Figure: dataflow diagram showing the IP Handler, Query Handler, MAC
Generator & Verifier, Admin, Branch details, Stock details, Supplier
details, Product Details, Hidden Semi Markov Model, Selecting Product,
Sequence path checker and Check Product.]