
A Model for Automatic Generation of Behaviour-Based Worm Signatures


Sébastien Chainay
University of Nice–Sophia-Antipolis
I3S-CNRS Lab.
Sophia-Antipolis, France
chainay@polytech.unice.fr

Karima Boudaoud
University of Nice–Sophia-Antipolis
I3S-CNRS Lab.
Sophia-Antipolis, France
karima.boudaoud@unice.fr

I. INTRODUCTION

Worms¹ are probably the fastest malware, with a propagation capacity that grows with the speed of networks. Their propagation is made possible by the poor organisation of the virtual memory of processes in operating systems and by poor memory management in some programming languages (such as the C language). Usually, a worm works as follows:

1. First, it injects code (named shellcode) into a remote vulnerable process and binds a specific port. Then, the exploited process starts to listen on this port.

2. After that, the worm attempts to connect to the port (push propagation strategy) or the hijacked process tries to connect to the worm (pull propagation strategy).

3. If the connection is successful, the exploited process executes a command prompt.

4. Finally, the worm can start to send commands to the shell.

The spreading of worms, particularly in high-speed networks, requires systems able to generate signatures characterizing new worms automatically and as soon as possible. In this context, several systems have been designed [3][4][5][6][7][8][9][10][11]. The signatures generated by these systems are content-based, i.e. they focus on finding one or several sets of bytes repeated in the code. Usually, hackers use mechanisms that change the content of a worm to avoid its detection by systems using content-based signatures. However, when a hacker changes the code content of a worm, he doesn't change its behaviour. Thus, what is needed is to define signatures based on the behaviour of the worm rather than on its content. Consequently, in this paper, we propose a model which generates a new kind of signature, which we call behaviour-based signatures. The aim of these signatures is to characterize the propagation of a worm in a network and its execution on an operating system.

This paper is organized as follows. First, we give an overview of worm morphisms. Then, we define the notion of behaviour-based signatures. After that, we present our model. Finally, we conclude with some remarks and future work.

¹ A worm is a piece of software that uses computer networks and security flaws to create copies of itself. A copy of the worm will scan the network for any other machine that has a specific security flaw. It replicates itself to the new machine using the security flaw, and then starts replicating. [1]

II. MORPHISM OF WORMS

A worm having one representation is a monomorphic worm. However, a worm may have several representations. It can be oligomorphic, polymorphic or metamorphic. Moreover, a worm can be encrypted or not. Generally, an encrypted worm contains one decryption function, at the beginning or at the end of the worm, followed by the encrypted body. In oligomorphic worms, the decryption function can be different for some worm replications (i.e. copies). In polymorphic worms, however, the decryption function changes for each worm replication. Metamorphic worms are not encrypted. They recompile themselves with a different coding at each replication.

Existing signature generator systems detect and generate signatures for monomorphic, oligomorphic and polymorphic worms. Currently, Autograph [3] and SweetBait [4] (which uses Honeycomb [5]) detect monomorphic worms. Earlybird [6] and Nemean [7] detect oligomorphic worms. PADS [8], PAYL [9], Polygraph [10] and Hamsa [11] detect polymorphic worms (see Tab.I). All these systems focus on the worm content, except Earlybird, which takes into account the address dispersion [6] by counting the number of connections to different hosts (i.e. different IP addresses).

TABLE I. A COMPARISON OF SIGNATURE GENERATION SYSTEMS
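
Address dispersion, which Earlybird [6] is described as measuring by counting connections to different IP addresses, can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not Earlybird's actual implementation: the function names, the (source, destination) input format and the threshold are all hypothetical.

```python
from collections import defaultdict


def address_dispersion(connections):
    """Map each source IP to the number of distinct destination IPs it contacted.

    `connections` is an iterable of (source_ip, destination_ip) pairs
    (an assumed input format, e.g. parsed from a packet capture).
    """
    destinations = defaultdict(set)
    for src, dst in connections:
        destinations[src].add(dst)
    return {src: len(dsts) for src, dsts in destinations.items()}


def suspicious_sources(connections, threshold):
    """Return the sources whose dispersion reaches `threshold` (an assumed cutoff)."""
    dispersion = address_dispersion(connections)
    return sorted(src for src, count in dispersion.items() if count >= threshold)
```

Repeated connections to the same host do not increase the count, so a source that contacts many different hosts stands out from one that talks repeatedly to a single server.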
Metamorphic worms can't be detected easily by these systems because the worm code (i.e. content) may change considerably (without changing the behaviour), contrary to monomorphic, oligomorphic and polymorphic worms, where only the decryption function may change [2].

Thus, to detect these kinds of worms, it is more judicious to look at the behaviour rather than at the content.

III. BEHAVIOUR-BASED SIGNATURES

In the context of this work, we define the notion of behaviour-based signatures to represent the worm behaviour at the network and system levels. Thus, we decompose the behaviour-based signature into two parts: the network-based signature and the system-based signature.

A. Network behaviour-based signature

The network behaviour-based signature defines the way a worm propagates from a source to a destination by analyzing the following network metrics:

• IP address of the source.
• Source port, destination port, protocol type (tcp, udp…).
• Number of packets.
• Length of the biggest packet.
• Propagation strategy (i.e. Push or Pull).
• Average inter-arrival time between packets.
• Duration between the first and the last packet.

To collect these metrics, several tools are necessary: TCPdump, TCPstat, DNS reverse lookup, etc.

B. System behaviour-based signature

The system behaviour-based signature defines the worm activities on a computer (see Fig.1). We have identified three kinds of activities:

• Execution activities, which concern system calls and library links made by the worm.
• Communication activities, which concern communications with other processes using a communication port.
• Access activities, which concern accesses to devices (mainly disk accesses, in the case of files).

All these activities use CPU and RAM.

Figure 1. Activities of a process in a computer

Thus, to represent the system behaviour-based signature, we consider the following elements:

• Internal ports sequence.
• System calls sequence.
• Library links sequence.
• Devices access sequence.
• CPU profile.
• Memory consumption profile.
• Length evolution of the worm (in the case of zipped worms).
• History of the worm location in the file system.
• Operating system of the source host.

In addition to these elements, we take into account the address dispersion characteristic used by Earlybird. However, in our case, in addition to counting the number of connections (as in Earlybird), we look at the IP generation strategy used by the worm. All these elements can be measured by tools provided by Solaris such as VMstat, MPstat, IOstat, Kstat [12].

Having defined the behaviour-based signatures, we now present our signature generator model.

IV. A MODEL TO GENERATE BEHAVIOUR-BASED SIGNATURES

Our signature generator model is composed of (see Fig.2):

• A collector, which gets worms connecting to vulnerable applications.
• An emulator, which extracts worms by using taint analysis on a virtual operating system.
• An analyzer, which monitors system activities of worms executed on a virtual operating system.
• A generator, which generates the behaviour-based signatures.

All these entities run independently of each other so that information on several worms is collected in parallel.

Figure 2. A model to generate behaviour-based signatures
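
The independence of the four entities can be illustrated with a small Python sketch in which each entity runs in its own thread and hands samples to the next one through a queue. This is a sketch under assumptions, not the implementation of the model: the linear wiring, the stage functions (`collect`, `emulate`, `analyze`, `generate`) and the record fields are hypothetical placeholders, and the real entities (Nepenthes, Argos, the virtual machines) are not invoked.

```python
import queue
import threading

STOP = object()  # sentinel telling a stage to shut down


def run_stage(work, inbox, outbox):
    """Apply `work` to every item read from `inbox` and forward the result."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # propagate shutdown to the next stage
            return
        outbox.put(work(item))


# Placeholder stage functions: each one only annotates the sample record.
def collect(sample):
    return {**sample, "collected": True}


def emulate(sample):
    return {**sample, "extracted": True}


def analyze(sample):
    return {**sample, "system_metrics": {}}


def generate(sample):
    return {**sample, "signature_ready": True}


def run_pipeline(samples):
    """Push samples through the four stages, each running in its own thread."""
    stages = [collect, emulate, analyze, generate]
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(work, queues[i], queues[i + 1]))
        for i, work in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for sample in samples:
        queues[0].put(sample)
    queues[0].put(STOP)

    results = []
    while True:
        item = queues[-1].get()
        if item is STOP:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

Because the stages only communicate through queues, one worm can be under analysis while the next is still being collected, which is the kind of parallelism the model relies on.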
To obtain the network metrics from the worms collected with both the emulator and the collector, we propose to use the TCPdump and TCPstat tools. To obtain the system metrics, we launch the worms on a virtual machine that emulates the operating system of the source host from which the worm has been extracted. We determine this operating system by using a passive fingerprinting tool like Disco, p0f or ettercap.

A. Collector

In the context of this work we don't define a new kind of collector but we use an existing one, named Nepenthes [13]. The aim of this tool is to open vulnerable ports and to analyze received shellcodes in order to extract the URLs where the worms can be downloaded.

B. Emulator

As for the collector, in our model, we don't design a new emulator but we use an existing one, named Argos [14], which is a Linux tool that extracts worms from the network traffic.

C. Analyzer

The aim of the analyzer is to extract the system metrics characterizing the collected worms. To do that, the analyzer starts a virtual system according to the operating system required for the execution of a specific worm. Then, it executes the worm and monitors its system activities.

D. Generator

The generator gathers data from the analyzer, emulator (i.e. Argos) and collector (i.e. Nepenthes) and generates a behaviour-based signature according to the following format composed of two parts:

Net(Source(IP, country), Ports({({T|U}Source; {T|U}Destination)}i), Packets(Number, length of biggest packet), Time(Total duration, IAT), Space(propagation strategy, address dispersion))

Sys(Calls({system}i, {library}i), Comm({Internal ports}i), Devices({location access}i), CPU({%around ten use, duration in seconds}i), Mem({%around ten use, duration in seconds}i), Length({length}i), Location({absolute path location}), address dispersion, background OS name)

with T=TCP, U=UDP, IAT=Inter-Arrival Time.

The generated signatures are then registered in a database in order to be used by a misuse detection system.

V. DISCUSSIONS

A. Time consideration

By using high-speed networks, worms can spread to all vulnerable computers within a very short period of time (see Tab.II). In parallel, servers have more and more resources, which means that a worm can perform more actions in the same amount of time. So the entities that extract network and system metrics (emulator, collector, virtual machine, generator) have to be efficient enough (particularly with good hardware) to respect the time constraint of these very fast spreads. But as the duration of a worm's execution in a virtual machine is unknown, this model does not guarantee that a behaviour-based worm signature is produced before the end of its spread.

TABLE II. A FEW CHARACTERISTICS OF HIGH-SPEED WORMS [15][16]

                   Code-Red II   Theoretical    Slammer Slapper   Theoretical
                                 faster worm                      faster worm
Spread duration    14 h          3.3 s          10 min            1.2 s
Length             4 KB          0.5 KB         0.4 KB            0.4 KB
Protocol-based     TCP (latency-limited)        UDP (bandwidth-limited)

B. Multimode and dual mode worms

Classic multimode worms search for both security holes, whereas dual mode worms start out searching for the first hole until they decide that it has been completely exploited, then they switch to exploiting the second hole [17]. Such worms have several system behaviour-based signatures because they exploit several security holes. To identify them, we have to make groups (or clusters) that gather worms having the same network behaviour-based signature (more discriminatory). Within a cluster, worms that have completely different system behaviour-based signatures are either dual mode worms (if they have two system behaviour-based signatures) or multimode worms (if they have more than two system behaviour-based signatures). If some system behaviour similarities are found between worms inside the same group, then they belong to the same family of worms because they exploit common vulnerabilities (see Fig.3).

Figure 3. The Slapper worm family [18]

VI. CONCLUSION

In this paper, we have proposed a model to generate behaviour-based signatures in order to detect worms that cannot be detected easily by content-based signatures. For future work, we plan to define classification clusters using generated behaviour-based signatures to recognize new worms, multimode worms, dual mode worms and worm families.

REFERENCES

[1] Wikipedia, http://en.wikipedia.org/wiki/Computer_virus
[2] Peter Szor. The Art of Computer Virus Research and Defense, 2005.
[3] K.-A. Kim and B. Karp. Autograph: Toward Automated Distributed Worm Signature Detection. In Proc. of the USENIX Security Symposium, 2004.
[4] G. Portokalidis and H. Bos. SweetBait: Zero-Hour Worm Detection and Containment Using Honeypots, Elsevier Journal on Computer Networks, Special Issue on Security through Self-Protecting and Self-Healing Systems, 2005.
[5] C. Kreibich and J. Crowcroft. Honeycomb - Creating Intrusion Detection Signatures Using Honeypots. ACM SIGCOMM Computer Communication Review 34 (2004) 51-56.
[6] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated worm fingerprinting. In Proc. OSDI, 2004.
[7] V. Yegneswaran, J. Giffin, P. Barford, and S. Jha. An architecture for generating semantic-aware signatures. In USENIX Security Symposium, 2005.
[8] Y. Tang and S. Chen. Defending against internet worms: A signature-based approach. In Proc. of Infocom, 2003.
[9] K. Wang, G. Cretu, and S. J. Stolfo. Anomalous Payload-based Worm Detection and Signature Generation. In Symposium on Recent Advances in Intrusion Detection, 2005.
[10] J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for polymorphic worms. In IEEE Security and Privacy Symposium, 2005.
[11] Zhichun Li, Manan Sanghi, Yan Chen, Ming-Yang Kao and Brian Chavez. Hamsa: Fast Signature Generation for Zero-day Polymorphic Worms with Provable Attack Resilience. In IEEE Symposium on Security and Privacy, 2006.
[12] Bob Netherton. DTrace. Solaris 10 Workshop, 2005.
[13] http://nepenthes.mwcollect.org
[14] http://www.few.vu.nl/argos
[15] S. Staniford, V. Paxson and N. Weaver. How to Own the Internet in Your Spare Time. In Proc. of the 11th USENIX Security Symposium, pp.149-167, USENIX Association, 2002.
[16] S. Staniford, D. Moore, V. Paxson and N. Weaver. The Top Speed of Flash Worms. In Proc. of RAID, 2004.
[17] N. Weaver. Potential Strategies for High Speed Active Worms: A Worst Case Analysis, 2002. http://www.icsi.berkeley.edu/~nweaver/worms.pdf
[18] J. Nazario. Defense and Detection Strategies against Internet Worms. 2005.
