
2010 Seventh International Conference on Information Technology

A Knowledge-Based System Implementation of Intrusion Detection Rules


Eric Flior, Tychy Anaya, Cory Moody, Mohsen Beheshti, Jianchao Han, Kazimierz Kowalski
Computer Science Department
California State University Dominguez Hills
Carson, CA 90747, USA
Email: eflior1@cp.csudh.edu, tanaya1@cp.csudh.edu, cmoody8@cp.csudh.edu

Abstract
This research determines the feasibility of using an Exsys Corvid based expert system to detect and respond to network threats and appropriately administer a Linux-based iptables firewall in real-time. In our implementation, we attempt to replace the human domain expert required for creating the expert system knowledge base with intrusion detection rules created by data mining on network traffic. Our expert system will be used in conjunction with intrusion detection classification rules provided by the See5 data-mining tool, which have, in turn, been created based on the data fusion of normal and malicious network traffic from multiple network sensors.

Key Words: Corvid, Expert System, iptables, Linux, See5.

1. Introduction

Creating a traditional knowledge-based expert system for the purposes of intrusion detection and prevention requires a great deal of work with a human domain expert. In addition to the initial investment of time required to distill the expert's knowledge into a set of rules, there is also a great deal of overhead involved with keeping the rule base current as attacks and exploits evolve. Often, as the number of rules grows in such a system, the number of false positives generated by the system increases and ultimately results in the need for the rule base to be completely re-designed [1]. In response to these and other drawbacks, we propose a knowledge-based expert system which does not rely on knowledge generated by a human domain expert or experts, but instead uses classification data-mining on a set of captured and analyzed network traffic to provide rules which will allow the expert system to make decisions, in real-time, about whether or not to allow traffic across a particular network, and respond appropriately. This research intends to determine whether or not it is possible to create such a system based on a number of existing commercial off-the-shelf (COTS) software technologies.

In Section 2, we will discuss certain existing systems, Snort, See5, Corvid, and iptables, which we propose to use to create our system. Section 3 will deal with how a ruleset created in See5 can be applied in a Corvid expert system. Section 4 discusses how an expert system based in Java can access and manipulate a firewall. We will propose and discuss an initial proof of concept implementation of a knowledge-based system implementation of intrusion detection rules in Section 5. A summary and conclusion will be the subject of the final section.

2. COTS software systems

Our research relies on a number of existing systems, which are used to create a set of production rules and associated logic. These are then incorporated into the knowledge-based expert system. Incoming network traffic across a honeypot sensor network is analyzed and stored by the Snort packet capturing and analysis system. Classification data-mining is performed on the stored packets using See5, which provides us with a ruleset which allows us to classify incoming and previously unseen traffic as either malicious or benign. Finally, these rules are incorporated into a knowledge-based expert system, Corvid, which is used to make decisions and act on the rules provided by See5 and then modify a Linux-based iptables firewall accordingly.

2.1 Snort
Snort, developed by SourceFire, is an open-source software package designed for packet capturing and real-time network traffic analysis [2]. In our research, Snort is used to identify traffic across a honeypot sensor network as either malicious or benign and, if malicious, to provide a signature of the traffic. Snort bases its analysis of the network traffic on a series of rules designed by a community of users. The results of Snort's analysis and the packet information are then stored in a MySQL database for later analysis. The information stored in the Snort database includes, but is not limited to, the malicious identification signature provided by Snort, as well as the IP and TCP, UDP, or ICMP header information of the sensor network traffic [3].

978-0-7695-3984-3/10 $26.00 2010 IEEE DOI 10.1109/ITNG.2010.251

2.2 See5
See5, developed by Ross Quinlan as the successor to the ID3 and C4.5 systems, is a data mining tool which provides data classification capabilities [4]. In our research, we use See5 to discover patterns and regularities in the Snort traffic database, present them in an intelligible form, and use those patterns and regularities to create a ruleset from which we can make predictions about and classify new malicious traffic. See5 takes its input in the form of a .data flat file, which contains information about the stored packets from Snort, and can output its classification as either a decision tree or a ruleset. The ruleset output is a series of If-then rules, based on statistical significance [5].

2.3 Corvid
Corvid, developed by Exsys, is a knowledge automation expert system development tool [6]. It is based on Java and allows for the creation of custom Java classes, providing a great deal of flexibility in modifying the operation of the created expert system. Expert systems created in Corvid are deployed as a Java applet, which allows the expert system to run on any Java-enabled system. Due to its features and flexibility, Corvid is well suited to implementing the intrusion detection rules determined by See5. Some key features include Corvid's Logic blocks, Command blocks, Collection variables, and the user-defined CUSTOM.java class.

2.3.1 Logic blocks. In a Corvid expert system, knowledge is encoded as a series of Logic Blocks, which organize and structure decision-making information into logically related blocks [7]. Decision-making information in a Logic Block is defined as a series of If-Then style rules. The rules can be run through either forward or backward chaining, and can be associated with spreadsheet files, which allows the logic in each block to be applied sequentially to each record in the spreadsheet [8].

2.3.2 Command blocks. Corvid Command blocks control the procedural flow of the expert system, including how the system chains, executes the Logic blocks, loops, and displays results [9].

2.3.3 Collection variables. Collection variables are variables that have lists of strings as their values [10]. They are generated during the execution of the Logic blocks, and can be passed as parameters to custom-made Java functions.

2.3.4 CUSTOM.java. Corvid natively supports a number of standard Java functions, but also recognizes a special function called CUSTOM [11]. CUSTOM.java can be passed any number of parameters, and can be used to add any special functionality that can be programmed in Java.

2.4 Iptables
Iptables is the userspace command line program used to configure the Linux 2.4.x and 2.6.x IPv4 packet filtering ruleset [13]. The firewall that it configures commonly acts as a gateway that keeps unauthorized traffic out while letting authorized traffic through. NAT, or Network Address Translation, rewrites the network address information carried in packet headers. Together these are used to maintain a safe environment on a network for users to access. Information from the NAT can be collected, and an informed decision about modifying the iptables rules of the firewall can then be made.

3. Combining See5 and Corvid


Once the classification rules have been obtained from See5, the expert system that will work as an intrusion detection system can be created. The See5 classification rules are structured as decision statements formulated over the attributes of the packet data flowing through Snort, and Exsys Corvid's Logic Blocks are likewise composed of decision statements. This close similarity makes translating the See5 classification rules into Corvid Logic Blocks simple and convenient.

3.1 Packet attribute conversion


In order for the Logic Block to be created, Corvid must first be able to access the information about the packets that passed through Snort and then See5. The packet information from the network traffic was stored in a .data file, but this format is not sufficient for Corvid to use. The .data file must be converted into a spreadsheet file and then saved as a tab-delimited text file; Microsoft Excel was sufficient for doing this. The purpose of converting the packet information was to import the file into Corvid from the Logic Block.
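The conversion above was done by hand in Microsoft Excel. As a minimal sketch of the same step in Java, assuming the .data file holds one comma-separated record per line and that no field itself contains a comma, the class below rewrites each record as a tab-delimited line under a header row. The class name, the header row, and the sample attribute values are illustrative assumptions, not part of our implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch: turn comma-separated .data records into tab-delimited text. */
class DataToTabDelimited {

    /** Converts one comma-separated record into a tab-delimited line. */
    static String convertLine(String dataLine) {
        String[] fields = dataLine.split(",", -1); // -1 keeps trailing empty fields
        List<String> cleaned = new ArrayList<>();
        for (String field : fields) {
            cleaned.add(field.trim());             // drop padding around each value
        }
        return String.join("\t", cleaned);
    }

    /** Emits a header row of attribute names, then every non-blank record. */
    static List<String> convert(List<String> attributeNames, List<String> dataLines) {
        List<String> out = new ArrayList<>();
        out.add(String.join("\t", attributeNames));
        for (String line : dataLines) {
            if (!line.trim().isEmpty()) {
                out.add(convertLine(line));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical column names and records, for illustration only.
        List<String> header = Arrays.asList("ip_len", "ip_ttl", "tcp_dport", "class");
        List<String> records = Arrays.asList("200, 64, 80, malicious", "1500,128,443,normal");
        for (String row : convert(header, records)) {
            System.out.println(row);
        }
    }
}
```

Whether Corvid expects a header row in the imported file is not stated here, so the header handling should be treated as a guess and adjusted to match the spreadsheet that Corvid actually accepts.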

3.2 Corvid metablocks


In order to use this feature, the Logic Block must be converted into a MetaBlock. The MetaBlock is an option for the Logic Block that provides a method for building generic logic that can be applied to the rows of a spreadsheet. The rules and expressions in the MetaBlock have all of the features of any other Logic Block, but can also include special MetaBlock values that come from a spreadsheet. After each row is processed, certain data may be saved or cleared depending on the nature of the system [8].

3.3 Logic block decision statements

Once the spreadsheet of packet information has been imported, the decision statements of the Logic Block can be created. They can be written in exactly the same structure as the decision statements in the See5 classification rules. The Logic Block's decision statements are composed of IF, AND, and THEN statements. The THEN statements mark each packet as either safe or malicious, representing it as either 0 ("good") or "bad", and the result is stored in a collection variable. The main packet attributes that determine the classification in the decision statements are: ip_flags, ip_len, ip_src, ip_ttl, tcp_csum, udp_csum, icmp_csum, tcp_win, udp_win, icmp_win, tcp_dport, udp_dport, icmp_dport, tcp_sport, udp_sport and icmp_sport.

3.4 Custom.java utilization

After completing the Logic Block, a Command Block must be created to parse the information in the collection variable and pass the data to CUSTOM.java. A method call pertaining to a variable must be created in the Command Block. The variable will be the collection variable and the expression will be "CUSTOM("")", which sets the COMMAND to: "SET [<Collection Variable Name>] CUSTOM("")". This allows Corvid to access CUSTOM.java and carry out whatever processing is required.

4. Corvid and Java

Because we have a custom Java file, we can use it to accept system commands; that is, the applet is issued a command and relays it to the system for execution. This command will be responsible for manipulating iptables, and it will be followed by a series of command-specific arguments that are necessary for it to run properly. The Java file will carry out the command through a system of classes and methods. The system uses two main classes to achieve this: the Runtime class and the Process class. Apart from these main classes, a variety of helper methods will be responsible for monitoring the system and ensuring basic functionality. They are not required for the process to run, but they are highly recommended in order to maintain proper functionality of the applet as a whole.

4.1 Runtime class

Every Java application has a single instance of class Runtime that allows the application to interface with the environment in which the application is running [X]. Within the Runtime class, it is important to focus on the methods named exec, each of which returns a Process. For this to be useful, Java needs to run the command in a new process and under a specific directory where some sub-files may be located. There is a series of exec methods that may be invoked depending on what the user requires. Of these, Process exec(String command) is the most basic; it executes the specified string command in a separate process [Y]. This suits most basic manipulation but may not be the best route, because it guarantees neither the working directory used for file access nor an unambiguous specification of the command's arguments. The next step up is Process exec(String[] cmdarray, String[] envp), which executes the specified command and arguments in a separate process with the specified environment [Z]. Using this method removes the ambiguity of the command issued while still running it as a separate process. Perhaps the best of these methods is Process exec(String[] cmdarray, String[] envp, File dir), which executes the specified command and arguments in a separate process with the specified environment and working directory [X2]. Implementing this method is the best choice because it covers all the bases: it not only executes the command as a separate process with its arguments attached, it also runs it in the desired environment and in the specified directory. The advantage of specifying a directory is that all sub-files required to run the applet optimally will be available.
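To make the three-argument form concrete, the following is a minimal sketch of how a helper might build an iptables command as an argument array and hand it to Runtime.exec(String[] cmdarray, String[] envp, File dir). The class and method names are hypothetical, the DROP rule on the INPUT chain is only one possible response, and 192.0.2.10 is a documentation address rather than observed traffic. The main method only prints the command, because actually running iptables requires a Linux host and root privileges.

```java
import java.io.File;
import java.io.IOException;

/** Sketch: build an iptables command and start it with Runtime.exec. */
class IptablesCommand {

    /** Argument array that appends a DROP rule for one source address. */
    static String[] dropSource(String sourceIp) {
        // Each argument is its own array element, so nothing is re-tokenized.
        return new String[] {"iptables", "-A", "INPUT", "-s", sourceIp, "-j", "DROP"};
    }

    /**
     * Starts the command in a separate process with the given working
     * directory. Passing null for envp makes the child inherit the
     * environment of the current process.
     */
    static Process start(String[] cmdarray, File workingDir) throws IOException {
        return Runtime.getRuntime().exec(cmdarray, null, workingDir);
    }

    public static void main(String[] args) {
        String[] cmd = dropSource("192.0.2.10");
        // Print instead of executing: running iptables needs root on Linux.
        System.out.println(String.join(" ", cmd));
        // To execute for real (assumption: suitable privileges are available):
        // Process p = start(cmd, new File("/"));
    }
}
```

Passing the command as an array rather than a single string is what removes the ambiguity discussed above: the source address can never be split into extra arguments, whatever characters it contains.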

4.2 Process class


The class Process provides methods for performing input from the process, performing output to the process, waiting for the process to complete, checking the exit status of the process, and destroying (killing) the process [Y2]. Apart from actually running the command, it is just as vital to monitor the process in case of unexpected faults. Monitoring prevents endless loops from going unnoticed and notifies the user of either a typo in the command or missing files. It also alerts the user when a command succeeds, so they know what has been issued correctly. This is done through a set of Process class methods.
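The monitoring duties listed above can be sketched in a few lines of Java. The example below is an illustration under stated assumptions, not our CUSTOM.java: the class name, the -1 timeout sentinel, and the use of "java -version" as a harmless stand-in command are all hypothetical choices. It waits for the child with a timeout, destroys it if it hangs, and otherwise reports the exit status along with whatever the child wrote to its standard output and error streams. It assumes the command produces only a small amount of output; a chatty command would need both streams drained concurrently while it runs.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.concurrent.TimeUnit;

/** Sketch: run a command and monitor it with the Process class. */
class ProcessMonitor {

    /** Reads a stream to its end and returns the text, one line per '\n'. */
    static String readAll(InputStream in) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    /** Returns the exit status, or -1 (hypothetical sentinel) on timeout. */
    static int run(String[] cmdarray, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = Runtime.getRuntime().exec(cmdarray);
        process.getOutputStream().close();       // nothing is sent to the child's stdin
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroy();                   // kill a hung or looping command
            return -1;
        }
        // Assumes short output: it is collected only after the child has exited.
        System.out.print(readAll(process.getInputStream()));
        System.err.print(readAll(process.getErrorStream()));
        return process.exitValue();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in command: ask the running JVM's own launcher for its version.
        String java = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        int status = run(new String[] {java, "-version"}, 10);
        System.out.println("exit status: " + status);
    }
}
```

A non-zero exit status or any text on the error stream is the signal that a command was mistyped or that a file was missing, which is the feedback the user needs before a firewall change is assumed to have taken effect.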

4.3 Destroy process

Sometimes it may be necessary to destroy a sub-process that is running. This is achieved with the method public abstract void destroy(), which forcibly terminates the sub-process represented by this Process object [Z2]. Invoking it kills an endless loop or a faulty process.

4.4 Get input stream/output stream

The input of the Process class is represented by public abstract InputStream getInputStream(), which gets the input stream of the sub-process. The stream obtains data piped from the standard output stream of the process represented by this Process object [X3]. The output method of the Process class is represented by public abstract OutputStream getOutputStream(). It is the counterpart of the input stream method: it gets the output stream of the sub-process, and output to the stream is piped into the standard input stream of the process represented by this Process object [Y3].

4.5 Get error stream

The last of the important Process methods is the get error stream method, represented by public abstract InputStream getErrorStream(). This method gets the error stream of the sub-process. The stream obtains data piped from the error output stream of the process represented by this Process object [Z3]. Having this method is vital in case of an entry fault: it allows the user to see where they have gone wrong and correct the error. One can opt out of using it, but implementing it is highly recommended to ensure proper execution of a command.

5. Proof of Concept Implementation

A training set of data was supplied to See5 based on traffic captured by Snort on our honeypot sensor network over a number of months. This data included a random sample of both normal and malicious traffic. Based on this data, See5 generated a ruleset to classify other traffic. This ruleset was then transformed into a logic block for a Corvid expert system. A sample data set was then prepared in spreadsheet format and provided to our expert system, which analyzed the traffic and proposed a response to it.

5.1 See5 ruleset

The ruleset generated by See5 was a series of If-then rules based on packets from the training set of data. The rules were based on packet header information such as: ip_flag numbers, ip_length, ip_ttl, tcp_csum, tcp_dport, tcp_win, tcp_sport, and others. One example of the rules generated is as follows: IF {ip_flags} <= 0, and {ip_len} <= 357, and {tcp_csum} <= 0, and {ip_length} > 120, and {ip_src} <= 1.451703E9, and {tcp_dport} <= 82, and {tcp_win} <= 23, and {ip_len} <= 276, THEN classify as malicious. A number of these rules were generated, and were used as the knowledge base for the expert system.

5.2 Corvid implementation of See5 ruleset

Because both the See5 ruleset and the Corvid expert system logic blocks are based on an If-then style set of rules, modifying the ruleset output from See5 into a form which is accepted by Corvid is trivial. The output from the logic block was an internal collection variable identifying a particular record in the spreadsheet as being part of normal or malicious traffic.

6. Results
When we ran our expert system, using the sample data set spreadsheet as input and the ruleset generated by See5, we observed that it correctly identified all known malicious traffic as malicious and all known normal traffic as normal.
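As a rough illustration of what applying the ruleset to a record involves, the sample rule quoted in Section 5.1 can be written as a plain Java predicate and evaluated row by row, with the labels collected in a list much as a Corvid collection variable would hold them. This is a sketch under stated assumptions rather than the Corvid Logic Block itself: {ip_length} in the quoted rule is read as the same attribute as {ip_len}, records that do not match this single rule are labeled "normal" (the full ruleset contained more rules), and the packet values in main are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch: the sample See5 rule from Section 5.1 as a Java predicate. */
class SampleRule {

    /** True when every condition of the quoted rule holds for the packet. */
    static boolean isMalicious(Map<String, Double> p) {
        // Assumption: {ip_length} in the rule text means ip_len.
        return p.get("ip_flags") <= 0
                && p.get("ip_len") <= 357
                && p.get("tcp_csum") <= 0
                && p.get("ip_len") > 120
                && p.get("ip_src") <= 1.451703E9
                && p.get("tcp_dport") <= 82
                && p.get("tcp_win") <= 23
                && p.get("ip_len") <= 276;
    }

    /** Labels each record; non-matching records default to "normal" (assumption). */
    static List<String> classify(List<Map<String, Double>> packets) {
        List<String> labels = new ArrayList<>();
        for (Map<String, Double> packet : packets) {
            labels.add(isMalicious(packet) ? "malicious" : "normal");
        }
        return labels;
    }

    /** Convenience builder for the six attributes the rule inspects. */
    static Map<String, Double> packet(double ipFlags, double ipLen, double tcpCsum,
                                      double ipSrc, double tcpDport, double tcpWin) {
        Map<String, Double> p = new HashMap<>();
        p.put("ip_flags", ipFlags);
        p.put("ip_len", ipLen);
        p.put("tcp_csum", tcpCsum);
        p.put("ip_src", ipSrc);
        p.put("tcp_dport", tcpDport);
        p.put("tcp_win", tcpWin);
        return p;
    }

    public static void main(String[] args) {
        List<Map<String, Double>> sample = new ArrayList<>();
        sample.add(packet(0, 200, 0, 1.0e9, 80, 16));   // satisfies every condition
        sample.add(packet(0, 300, 0, 1.0e9, 80, 16));   // ip_len exceeds 276
        System.out.println(classify(sample));
    }
}
```

Written this way, the overlapping length conditions collapse to 120 < ip_len <= 276, which shows how redundant tests in a generated rule can be simplified before they are entered into a Logic Block.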

6. Summary and Conclusion


Because of our observation that our knowledge-based system implementation of intrusion detection rules was able to identify malicious network traffic, we believe that it is in fact feasible to create a large-scale system without the need for a human domain expert. The success of our proof-of-concept implementation, combined with Corvid's ability to run customized Java code, and Java's ability to issue system commands shows that it is feasible to create an intrusion detection/prevention expert system based on these COTS systems. While the systems were not expressly designed to work together, it should be possible to create software wrappers that will automate the processing of incoming network traffic to use as input for the expert system.

7. Acknowledgments
This paper is based on work supported by the National Science Foundation (NSF) through grant CNS-05040538 and NGA. Any opinions, findings, and conclusions or recommendations expressed in the paper are those of the authors and do not necessarily reflect the views of the NSF or NGA.

8. References
[1] Stephen F. Owens, Reuven R. Levary, An adaptive expert system approach for intrusion detection, Int. J. Security and Networks, Vol. 1, Nos. 3/4, pp. 206-217.
[2] Rehman, R.U., Intrusion Detection with Snort: Advanced IDS Techniques using SNORT, Apache, MySQL, PHP, and ACID, Prentice Hall PTR, 2003.
[3] B. Caswell, J. Beale, J. C. Foster, and Faircloth, Snort 2.0 Intrusion Detection, Syngress, 2003.
[4] Data Mining Tools See5 and C5.0: http://www.rulequest.com/see5-info.html, last visited October 22, 2009.
[5] See5: An informal tutorial, http://rulequest.com/see5win.html#RULES, last visited October 22, 2009.
[6] Welcome to Exsys Software Products: http://www.exsys.com/productmain.html, last visited October 21, 2009.
[7] Exsys Corvid Knowledge Automation Expert System Software Developer's Guide, Exsys, Inc., Albuquerque, NM, 2007, p. 13.
[8] Exsys Corvid Knowledge Automation Expert System Software Developer's Guide, Exsys, Inc., Albuquerque, NM, 2007, p. 15.
[9] Exsys Corvid Knowledge Automation Expert System Software Developer's Guide, Exsys, Inc., Albuquerque, NM, 2007, p. 18.
[10] Exsys Corvid Knowledge Automation Expert System Software Developer's Guide, Exsys, Inc., Albuquerque, NM, 2007, p. 25.
[11] Exsys Corvid Knowledge Automation Expert System Software Developer's Guide, Exsys, Inc., Albuquerque, NM, 2007, pp. 275-282.
[13] The netfilter.org iptables project, http://www.netfilter.org/projects/iptables/index.html, last visited October 22, 2009.
