CHAPTER 1
INTRODUCTION
With the rapid development of computer and network technology, network security has become increasingly important, the aim being to protect network information from many kinds of attack. To defend a network against the variety of possible abuses, a single firewall can no longer meet the requirements; the network must also be monitored in real time so that intrusions can, as far as possible, be stopped before the attack succeeds.
Intrusion Detection Systems were developed and matured against this background. As a new, active security-defense mechanism, an Intrusion Detection System provides dynamic protection for hosts and networks: it offers real-time protection against internal attacks, external attacks, and misoperation, and it can be combined with other network security products to protect the network comprehensively. Its real-time, proactive character makes it an important complement to the firewall, and intrusion detection has become an indispensable component of any overall network security solution. However, with the continuous expansion of network scale and the growing complexity of attack techniques, Distributed Intrusion Detection Systems have become necessary.
Computer systems have been made increasingly secure over the past decades. However, new attacks and the spread of harmful viruses have shown that better methods must be used. One approach gaining increasing popularity in the computer community is to use Intrusion Detection Systems (IDSs). Intrusion Detection Systems identify attacks against a system or users performing illegitimate actions. Using a common analogy, having an Intrusion Detection System is like having a burglar alarm in your house. The alarm will not prevent the burglar from breaking into your house, but it will detect and warn you of the problem. Following the publication of the first research in Intrusion Detection Systems, a large number of diverse applications have been developed. One method of accomplishing this type of detection is the use of file system integrity tools. When a system is compromised, an attacker will often alter certain key files to provide continued access and to prevent detection. The changes could target any portion of the system software, e.g. the kernel, libraries, log files, or other sensitive files.
DEPT. OF CSE / B.T.L.I.T 1
The first approach is to create a secure database, usually composed of hashes. Each stored hash is periodically checked against a newly computed hash. This method is used by tools such as Tripwire, AIDE, and others.
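The hash-database approach can be sketched in a few lines; this is a minimal illustration (function names and the chunk size are our own choices, not any particular tool's API):

```python
import hashlib
import os

def file_hash(path, algo="sha256"):
    """Compute the hash of a file's contents in fixed-size chunks."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def build_baseline(paths):
    """Create the 'secure database' of known-good hashes."""
    return {p: file_hash(p) for p in paths}

def check_integrity(baseline):
    """Re-hash each file and report any mismatch against the baseline."""
    violations = []
    for path, expected in baseline.items():
        current = file_hash(path) if os.path.exists(path) else None
        if current != expected:
            violations.append(path)
    return violations
```

In a real tool the baseline itself must be stored where an intruder cannot rewrite it (e.g., read-only media), which is exactly the weakness discussed below.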
The second, more recent approach is to create digital signatures of sensitive data, such as executable files using asymmetric cryptography, and use these signatures to check the integrity of the signed file.
Both approaches have advantages and drawbacks, but they share a common flaw: the auditing relies on the validity of the operating system. All the previous applications assume that the OS itself is not corrupted. Once the operating system is compromised, the intruder can easily defeat integrity tools. For example, in the Linux operating system, redirecting system calls using kernel modules can potentially compromise the system.
Also, since the binary of the integrity tool resides on the machine being audited, the attacker may be able to corrupt the binary or the tool's configuration files. This work develops a novel way to overcome the problems of traditional integrity tools: our approach is to use a Distributed Intrusion Detection System based on protocol analysis to perform the integrity detection checks.
The area of distributed computing systems provides a promising domain for applications of machine learning methods. One of the most interesting aspects of such applications is that learning algorithms that are embedded in a distributed computing infrastructure are themselves part of that infrastructure and must respect its inherent local computing constraints (e.g., constraints on bandwidth, latency, reliability, etc.), while attempting to aggregate information across the infrastructure so as to improve system performance (or, availability) in a global sense.
Consider, for example, the problem of detecting anomalies in a wide-area network. While it is straightforward to embed learning algorithms at local nodes to detect node-level anomalies, these anomalies may not be indicative of network-level problems. Indeed, recent work demonstrated a useful role for Principal Component Analysis (PCA) in detecting network anomalies. It showed that the minor components of PCA (the subspace obtained after removing the components with the largest eigenvalues) revealed anomalies that were not detectable in any single node-level trace. While that work did not face the distributed data analysis problem (it involved centralized, off-line analysis of blocks of data), it provides clear motivation for designing a distributed PCA-based system for analyzing network anomalies in real time. Such a design involves several challenging problems that have not been addressed in previous work. Naive solutions that continuously push all data to a central analysis site simply cannot scale to large networks or massive data streams. Instead, viable solutions need to process data in-network and to intelligently control the frequency and size of data communications.
The key underlying problem is to develop a mathematical understanding of how to trade off the quantization arising from local bandwidth restrictions against delay in the data analysis. We also need to understand how this trade-off affects overall detection accuracy. Finally, the implementation needs to be simple if it is to have impact on developers.
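The minor-component (residual subspace) idea can be sketched as follows; this is an illustrative centralized version, with invented data shapes, not the distributed system itself:

```python
import numpy as np

def fit_residual_subspace(X, k):
    """Fit PCA on a traffic matrix X (time x links); keep the top-k
    principal components as the 'normal' subspace and return a projector
    onto the residual (minor-component) subspace, plus the mean."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalue order
    P = eigvecs[:, -k:]                      # top-k columns: normal subspace
    return np.eye(X.shape[1]) - P @ P.T, X.mean(axis=0)

def residual_energy(x, projector, mean):
    """Squared norm of x projected onto the minor components; large values
    flag anomalies invisible in any single link's trace."""
    r = projector @ (x - mean)
    return float(r @ r)
```

A measurement that breaks the usual correlation between links has large residual energy even when each individual link value looks ordinary.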
CHAPTER 2
A traditional intrusion detection system (TIDS) is a device or software application that monitors network and/or system activities for malicious activity or policy violations and produces reports to a management station. Intrusion prevention is the process of performing intrusion detection and attempting to stop detected possible incidents. Intrusion detection and prevention systems (IDPS) are primarily focused on identifying possible incidents, logging information about them, attempting to stop them, and reporting them to security administrators. In addition, organizations use IDPSs for other purposes, such as identifying problems with security policies, documenting existing threats, and deterring individuals from violating security policies. IDPSs have become a necessary addition to the security infrastructure of nearly every organization. They typically record information related to observed events, notify security administrators of important observed events, and produce reports. Many IDPSs can also respond to a detected threat by attempting to prevent it from succeeding, using several response techniques: stopping the attack itself, changing the security environment (e.g., reconfiguring a firewall), or changing the attack's content.
2.1 TERMINOLOGY
Alert/Alarm: A signal suggesting that a system has been or is being attacked.
True positive: A legitimate attack which triggers the TIDS to produce an alarm.
False positive: An event signaling the TIDS to produce an alarm when no attack has taken place.
False negative: A failure of the TIDS to detect an actual attack.
True negative: When no attack has taken place and no alarm is raised.
Noise: Data or interference that can trigger a false positive.
Site policy: Guidelines within an organization that control the rules and configurations of a TIDS.
Site policy awareness: The ability a TIDS has to dynamically change its rules and configurations in response to changing environmental activity.
Confidence value: A value an organization places on a TIDS based on past performance and analysis to help determine its ability to effectively identify an attack.
Alarm filtering: The process of categorizing attack alerts produced from a TIDS in order to distinguish false positives from actual attacks.
Attacker or Intruder: An entity who tries to find a way to gain unauthorized access to information, inflict harm or engage in other malicious activities.
Masquerader: A user who does not have the authority to access a system, but tries to access information as an authorized user. Masqueraders are generally outside users.
Misfeasor: Misfeasors are commonly internal users and can be of two types: 1. An authorized user with limited permissions. 2. A user with full permissions who misuses their powers.
Clandestine user: A user who acts as a supervisor and tries to use his privileges so as to avoid being captured.
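The alarm-related terms above translate directly into the standard detection metrics used to assess a TIDS; a minimal sketch (the metric names and the example counts are illustrative):

```python
def detection_metrics(tp, fp, fn, tn):
    """Summarize TIDS alarm quality from counts of true positives (tp),
    false positives (fp), false negatives (fn) and true negatives (tn)."""
    detection_rate = tp / (tp + fn) if tp + fn else 0.0    # attacks caught
    false_alarm_rate = fp / (fp + tn) if fp + tn else 0.0  # benign events alarmed
    precision = tp / (tp + fp) if tp + fp else 0.0         # alarms that were real
    return {"detection_rate": detection_rate,
            "false_alarm_rate": false_alarm_rate,
            "precision": precision}
```

Alarm filtering aims to raise precision (fewer false positives per alarm) without lowering the detection rate.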
2.2 TYPES
For the purpose of dealing with IT, there are two main types of IDS:
2.2.1 Network intrusion detection system (NIDS)
It is an independent platform that identifies intrusions by examining network traffic and monitors multiple hosts. Network intrusion detection systems gain access to network traffic by connecting to a network hub, network switch configured for port mirroring, or network tap. In a NIDS, sensors are located at choke points in the network to be monitored, often in the demilitarized zone (DMZ) or at network borders. Sensors capture all network traffic and analyze the content of individual packets for malicious traffic. An example of a NIDS is Snort.
2.2.2 Host-based intrusion detection system (HIDS)
It consists of an agent on a host that identifies intrusions by analyzing system calls, application logs, file-system modifications (binaries, password files, capability databases, access control lists, etc.) and other host activities and state. In a HIDS, sensors usually consist of a software agent. Some application-based IDSs also fall into this category. An example of a HIDS is OSSEC. Intrusion detection systems can also be system-specific, using custom tools and honeypots.
CHAPTER 3
Detection System (DS): a system that monitors the events occurring in protected hosts or networks and analyzes them for signs of intrusions. Intrusion is a major concern for every network and can be harmful to the entire system; we therefore need a detection system that detects intrusions before they cause serious damage to the network. An Intrusion Detection System provides dynamic protection for hosts and networks: it offers real-time protection against internal attacks, external attacks, and misoperation, and it can be combined with other network security products to protect the network comprehensively. Its real-time, proactive character makes it an important complement to the firewall, and intrusion detection has become an indispensable component of any overall network security solution. However, with the continuous expansion of network scale and the growing complexity of attack techniques, Distributed Intrusion Detection Systems have become necessary.
CHAPTER 4
The detection technologies commonly used in early Intrusion Detection Systems are misuse detection and anomaly detection. Misuse detection matches observed activity against known methods of intrusion attack in order to identify attacks; the technique most commonly used is simple pattern matching. It is characterized by simplicity, good scalability, and detection efficiency, but it applies only to relatively simple attacks and suffers a high false-alarm rate. Although simple pattern matching has significant performance problems, it is widely used because system implementation, configuration, and maintenance are very convenient. Anomaly detection, by contrast, pre-stores patterns of a user's normal behavior and treats behavior inconsistent with those normal patterns as a possible attack.
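Misuse detection by simple pattern matching can be sketched as follows; the signature set here is purely illustrative, and real systems use far richer rule languages:

```python
# Hypothetical signature set: byte pattern -> description.
SIGNATURES = {
    b"/etc/passwd": "attempt to read the password file",
    b"cmd.exe": "Windows command-shell injection",
    b"' OR '1'='1": "SQL injection probe",
}

def misuse_detect(payload: bytes):
    """Simple pattern matching: flag a packet if any known attack
    signature occurs anywhere in its payload."""
    return [desc for sig, desc in SIGNATURES.items() if sig in payload]
```

Searching the entire payload for every signature is exactly the inefficiency that protocol analysis, described below, is designed to remove.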
Anomaly detection is the main research direction in intrusion detection; it is characterized by detecting abnormal behavior of the system and can discover unknown attack patterns. The key questions in anomaly detection are how to establish normal usage patterns and how to compare current system or user behavior with the model in order to judge the degree of deviation from it. IDSs using only these two methods lack the intelligence to determine the true intent behind the detected patterns; this is where protocol analysis and its advantages come in.
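A minimal sketch of the normal-profile idea, assuming a single numeric behavior feature and a simple standard-deviation threshold (both assumptions are ours):

```python
import statistics

class NormalProfile:
    """Stores a user's normal behavior as the mean/stdev of a numeric
    feature (e.g., logins per hour) learned from a training window."""
    def __init__(self, samples):
        self.mean = statistics.fmean(samples)
        self.stdev = statistics.stdev(samples)

    def deviation(self, value):
        """Degree of deviation from the normal model, in standard deviations."""
        return abs(value - self.mean) / self.stdev

    def is_anomalous(self, value, threshold=3.0):
        return self.deviation(value) > threshold
```

The open problems the text mentions, choosing the features and the threshold, are exactly what this toy model leaves unanswered.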
Protocol analysis is the main technical means by which the new generation of IDSs detects attacks. It exploits the high degree of regularity of protocols, using the reported position of each protocol field to analyze only the information useful for intrusion detection. Protocol decoding applies not only to the lower-layer protocols but also to application-layer protocols. Because protocol analysis directs the search to a clearly defined, specific part of the packet rather than the entire payload, it reduces the search space and improves the efficiency of intrusion detection.
Contextual anomalies: An individual data instance is anomalous within a context. This requires a notion of context; such anomalies are also referred to as conditional anomalies.
Collective anomalies: A collection of related data instances is anomalous. This requires a relationship among data instances, e.g. sequential data, spatial data, or graph data.
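Collective anomalies in sequential data are often detected with short-subsequence (n-gram) models; a minimal sketch, assuming system-call-like event traces (the n-gram size and the traces are illustrative):

```python
def train_ngrams(normal_sequence, n=3):
    """Learn the set of n-grams (short subsequences) that occur in a
    normal trace, e.g., a system-call sequence."""
    return {tuple(normal_sequence[i:i + n])
            for i in range(len(normal_sequence) - n + 1)}

def collective_anomaly_score(sequence, known, n=3):
    """Fraction of the sequence's n-grams never seen in normal data.
    Individually normal events can still form an anomalous subsequence."""
    grams = [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]
    if not grams:
        return 0.0
    unseen = sum(1 for g in grams if g not in known)
    return unseen / len(grams)
```

Each event in an anomalous subsequence may be perfectly normal on its own; only the ordering is suspicious, which is what makes the anomaly collective.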
CHAPTER 5
For the Ethernet MAC frame format there are two different standards: DIX Ethernet V2 and the IEEE 802.3 standard [4]. The Ethernet V2 format is the one most often used for MAC frames today; the upper-layer protocols include IP, IPX, ARP, SNMP, and NetBEUI. Its frame format is shown in fig. 5.1.
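Decoding the Ethernet V2 header of fig. 5.1 is a fixed-offset operation; a minimal sketch (the function and field names are our own):

```python
import struct

def parse_ethernet(frame: bytes):
    """Split an Ethernet V2 frame into destination MAC, source MAC,
    EtherType and payload (the header is 14 bytes, per fig. 5.1)."""
    if len(frame) < 14:
        raise ValueError("frame shorter than Ethernet header")
    dst, src, eth_type = struct.unpack("!6s6sH", frame[:14])
    mac = lambda b: ":".join(f"{x:02x}" for x in b)
    return {"dst": mac(dst), "src": mac(src),
            "type": eth_type, "payload": frame[14:]}
```

The 2-byte type field (e.g., 0x0800 for IP) is what lets the analyzer dispatch directly to the right upper-layer decoder.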
5.1.1 IP datagram
Among the transmission protocols, TCP, UDP, ICMP, and IGMP data are all carried in the IP transmission format. An IP datagram is divided into the IP header and the IP data. The IP header contains the version, header length, type of service, total length, identifier, flags, fragment offset, TTL, protocol type, header checksum, source IP address, and destination IP address. Reference fig. 5.3.
The protocol field occupies 8 bits; its value indicates which protocol the data carried by this IP datagram uses. For example, a protocol field value of 6 indicates that the data portion uses the TCP protocol.
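Decoding the fixed 20-byte IPv4 header, including the 8-bit protocol field just described, can be sketched as follows (options are ignored and the field names are our own):

```python
import struct

# Protocol-field values mentioned in the text
IP_PROTOCOLS = {1: "ICMP", 2: "IGMP", 6: "TCP", 17: "UDP"}

def parse_ip_header(packet: bytes):
    """Decode a fixed 20-byte IPv4 header (no options)."""
    version_ihl, tos, total_len, ident, flags_frag, ttl, proto, cksum = \
        struct.unpack("!BBHHHBBH", packet[:12])
    src, dst = struct.unpack("!4s4s", packet[12:20])
    dotted = lambda b: ".".join(str(x) for x in b)
    return {"version": version_ihl >> 4,
            "header_len": (version_ihl & 0x0F) * 4,   # in bytes
            "total_length": total_len,
            "ttl": ttl,
            "protocol": IP_PROTOCOLS.get(proto, str(proto)),
            "src": dotted(src), "dst": dotted(dst)}
```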
Transmission Control Protocol is a reliable, connection-oriented transport service; data is transmitted in segments, and a connection must be established before data can be exchanged. TCP communicates using a byte stream, that is, unstructured data treated as a stream of bytes. Every byte transmitted by TCP is assigned a sequence number, for reliability. A TCP datagram is divided into the TCP header and the TCP data; the header contains the source port, destination port, sequence number, acknowledgment number, and so on, as shown in fig. 5.3.
Source Port
The 16-bit source port number, used by the receiver to reply.
Destination Port
The 16-bit destination port number.
Sequence Number
The sequence number of the first data byte in this segment. If the SYN control bit is set, the sequence number is the initial sequence number (n) and the first data byte is n+1.
Acknowledgment Number
If the ACK control bit is set, this field contains the value of the next sequence number that the receiver is expecting to receive.
Data Offset
The number of 32-bit words in the TCP header. It indicates where the data begins.
Reserved
Reserved for future use; must be zero.
URG
Indicates that the urgent pointer field is significant in this segment.
ACK
Indicates that the acknowledgment number field is significant in this segment.
Window
Used in ACK segments. It specifies the number of data bytes, beginning with the one indicated in the acknowledgment number field, which the receiver (= the sender of this segment) is willing to accept.
Checksum
The 16-bit one's complement of the one's complement sum of all 16-bit words in a pseudo-header, the TCP header, and the TCP data.
The pseudo-header is the same as that used by UDP for calculating the checksum. It is a pseudo-IP header, used only for the checksum calculation, with the format shown in fig. 5.4.
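The checksum computation over the pseudo-header, TCP header, and data can be sketched directly from the description above (the pseudo-header layout assumed here is the IPv4 one: source address, destination address, a zero byte, protocol 6, and the TCP length):

```python
import struct

def ones_complement_sum(data: bytes) -> int:
    """Sum 16-bit words with end-around carry (one's complement arithmetic)."""
    if len(data) % 2:
        data += b"\x00"          # pad odd-length data with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)   # fold the carry back in
    return total

def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum over the pseudo-header followed by the TCP header and data."""
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return (~ones_complement_sum(pseudo + segment)) & 0xFFFF
```

A receiver verifies a segment by recomputing the sum with the transmitted checksum in place; a correct segment yields 0.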
CHAPTER 6
Data capture module- The major role of the data capture module is to capture data from the network and then send it to the protocol analysis part; its role is simple and easy to implement.
Protocol analysis module- Protocol analysis is the focus of this module; it parses the captured data. Its working principle is as follows. From the Ethernet frame, it extracts the Ethernet header, which is 14 bytes long and consists of the 6-byte destination Ethernet address, the 6-byte source Ethernet address, and the 2-byte frame type. The frame type gives the protocol carried in the data frame, such as ARP, RARP, IP, or IPX; their corresponding protocol numbers are 0806, 8035, 0800, and 8137. ARP and RARP are data-link-layer protocols, while IP and IPX are network-layer protocols; only the IP (0800) protocol is analyzed further. When there are no option fields, the IP header length is 20 bytes; its main contents include the source IP address, destination IP address, fragment flag and offset, and the protocol type of the IP payload (one byte long). The protocol type within the IP packet indicates the protocol of the IP packet's payload, that is, TCP, UDP, or ICMP, whose corresponding protocol numbers are 6, 17, and 1. At the transport layer, when there are no option fields, the TCP header length is 20 bytes; its main contents include the source port, destination port, flags, sequence number, ACK, and so on. The TCP header contains six flags: URG, SYN, ACK, FIN, RST, and PSH. These six flags reflect the status of the TCP connection; for example, a TCP connection always begins with the two sides exchanging SYN packets to create a new connection,
and a connection is terminated with FIN or RST. The packet type can be determined from the source port and destination port of the TCP packet, such as TELNET on port 23, EMAIL on port 25, and so on. The application layer contains many protocols; we analyze only some daily applications, such as FTP, E-MAIL, TELNET, WWW, and so on. After performing this protocol analysis, the protocol analysis module extracts protocol keywords from the application-layer data of the packet; for FTP packets, for example, it can extract RETR (GET operation), STOR (PUT operation), and other protocol keywords. By comparing these keywords in the detector modules, we can determine whether a network intrusion has happened.
6.1.1 Ethernet Version 2
The original Ethernet Version 2 frame varies slightly from the 802.3 Ethernet frame format in that a Type field, also referred to as Ethernet type, is used in place of the Length field (also 2 bytes), as shown on the Ethernet V2 Frame Format Diagram.
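The layered decoding described above (Ethernet frame type, then IP protocol field, then TCP ports, then application-layer keywords) can be sketched end-to-end; the keyword table is an illustrative assumption, not the module's actual rule base:

```python
import struct

# Hypothetical keyword table keyed by well-known server port
PROTOCOL_KEYWORDS = {21: [b"RETR", b"STOR"], 23: [b"login"], 25: [b"MAIL FROM"]}

def extract_keywords(frame: bytes):
    """Walk Ethernet -> IP -> TCP and return application-layer protocol
    keywords found in the payload; non-IP/non-TCP traffic is skipped."""
    if len(frame) < 14 or struct.unpack("!H", frame[12:14])[0] != 0x0800:
        return []                       # only IP (0x0800) is analyzed further
    ip = frame[14:]
    ihl = (ip[0] & 0x0F) * 4            # IP header length in bytes
    if ip[9] != 6:                      # IP protocol field: 6 = TCP
        return []
    tcp = ip[ihl:]
    src_port, dst_port = struct.unpack("!HH", tcp[:4])
    data_offset = (tcp[12] >> 4) * 4    # TCP header length in bytes
    payload = tcp[data_offset:]
    keywords = (PROTOCOL_KEYWORDS.get(dst_port, []) +
                PROTOCOL_KEYWORDS.get(src_port, []))
    return [k.decode() for k in keywords if k in payload]
```

Note how each layer's header pins down exactly where to look next, so only a few candidate keywords are ever searched for, rather than the whole payload against every rule.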
6.1.2 Ethernet 802.3
The 802.3 format of an Ethernet frame is shown on the IEEE 802.3 Ethernet Frame Format Diagram. It has a Length field instead of a Type field, and an 802.2 LLC header (not shown) in the Information field. As mentioned in the previous lesson, the LLC header's DSAP field indicates the protocol being carried and steers the frame to the appropriate process in the Network Layer. An 802.3 Length field will always have a value of less than 0x0600.
6.1.3 802.1Q VLAN Frames
With the establishment of the 802.1Q VLAN standard, it is now possible to mix vendor switch equipment and have the VLANs interoperate. That is, frames travelling from switch to switch between VLANs carry VLAN membership information that all equipment meeting the standard recognizes. The 802.1Q tag follows the standard MAC header in Ethernet frames. If the frame is VLAN-tagged, the Type field contains a value of 0x8100. The VLAN-tag format uses the next 2 bytes after the 0x8100 Type field for the VLAN tag. These 16 bits contain the 3-bit frame priority, the canonical format indicator (CFI), and the 12-bit VLAN ID. Another way of looking at this is that Ethernet frames have either a Length or a Type field. When using LLC, the field is Length-encoded. If not using LLC, the field is Type-encoded. Following the VLAN tag would be the original 802.3 Length field or Version 2 Type value that the frame would have carried had it not been tagged. That is, if this is an 802.1Q-tagged Type-encoded frame carrying IP, the 2 bytes after the VLAN tag will be 0x0800. If the original frame was Length-encoded, the 2 bytes following the VLAN tag would be a Length field, followed by the LLC header as the first part of the Information field. If Length-encoded: 8100 0020 01A6--The 8100h and 0020h are the 4 additional VLAN bytes; the 01A6 is an example of a valid 802.3 length; the LLC header would follow in the Information field. This concept is illustrated on the 802.1Q Length-Encoded Frame Format Diagram.
If Type-encoded: 8100 0020 0800--The 0800h is in the Type field, indicating IP is being carried. This concept is illustrated on the 802.1Q Type-Encoded Frame Format Diagram.
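The tag layout just described can be decoded with a few bit operations; a minimal sketch (the function and field names are our own):

```python
def parse_vlan_tag(frame: bytes):
    """If the 2 bytes after the MAC header are 0x8100, decode the 802.1Q
    tag: 3-bit priority, 1-bit CFI, 12-bit VLAN ID, followed by the
    original Type/Length field."""
    tpid = int.from_bytes(frame[12:14], "big")
    if tpid != 0x8100:
        return None                          # untagged frame
    tci = int.from_bytes(frame[14:16], "big")
    return {"priority": tci >> 13,
            "cfi": (tci >> 12) & 0x1,
            "vlan_id": tci & 0x0FFF,
            "inner_type": int.from_bytes(frame[16:18], "big")}
```

Run against the document's own Type-encoded example (8100 0020 0800), this yields VLAN ID 0x020 and inner type 0x0800 (IP).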
In this model, Internet data passes through the Detect Module and the Process Module and arrives at a user's computer only after detection. When a network intrusion is detected, the Process Module sends an intrusion signal to the Response Module to alert the user. Each detection module is a miniature data-analysis system; the modules send the reports obtained from their analysis through a high-speed link to the Process Module, and the Response Module determines how to respond to an intrusion. Together, the Detect Module and Process Module make up a complete intrusion detection system.
The Process Module contains a rule base holding the keyword sets of currently common intrusion patterns; as new intrusion techniques emerge the rule base grows, and obsolete keywords can be deleted. The Detect Module sends keywords to be compared against the rule base; if the arriving keyword strings match a rule, then an intrusion has happened. The Response Module then reports the intrusion to the user, as well as advising the user of the means of attack and its target, allowing the user to take timely preventive measures to avoid losses.
CHAPTER 7
7.1 SCENARIOS
The detection of certain attacks against a networked system of computers requires information from multiple sources. A simple example of such an attack is the so-called doorknob attack. In a doorknob attack, the intruder's goal is to discover, and gain access to, insufficiently protected hosts on a system. The intruder generally tries a few common account and password combinations on each of a number of computers. These simple attacks can be remarkably successful. As a case in point, the UC Davis NSM recently observed an attacker of this type gaining super-user access to an external computer which did not require a password for the super-user account. In this case, the intruder used telnet to make the connection from a university computer system, and then repeatedly tried to gain access to several different computers at the external site. In cases like these, the intruder tries only a few logins on each machine (usually with different account names), which means that an IDS on each host may not flag the attack. Even if the behavior is recognized as
an attack on the individual host, current IDSs are generally unable to correlate reports from multiple hosts; thus they cannot recognize the doorknob attack as such. Because DIDS aggregates and correlates data from multiple hosts and the network, it is in a position to recognize the doorknob attack by detecting the pattern of repeated failed logins, even though there may be too few on any single host to alert that host's monitor.
In another incident, our NSM recently observed an intruder gaining access to a computer using a guest account which did not require a password. Once the attacker had access to the system, he exhibited behavior which would have alerted most existing IDSs (e.g., changing passwords and failed events). In an incident such as this, DIDS would not only report the attack, but might also be able to identify the source of the attack. That is, while most IDSs would report the occurrence of an incident involving user "guest" on the target machine, DIDS would also report that user "guest" was really, for example, user "smith" on the source machine, assuming that the source machine was in the monitored domain. It may also be possible to go even further back and identify all of the different user accounts in the "chain" to find the initial launching point of the attack. Another possible scenario is what we call network browsing. This occurs when a (network) user looks through a number of files on several different computers within a short period of time. The browsing activity level on any single host may not be sufficiently high to raise any alarm by itself. However, the network-wide, aggregated browsing activity level may be high enough to raise suspicion about this user. Network browsing can be detected as follows. Each host monitor reports that a particular user is browsing on that system, even if the corresponding degree of browsing is small. The expert system can then aggregate such information from multiple hosts to determine that all of the browsing activity corresponds to the same network user. This scenario presents a key challenge for DIDS: the tradeoff between sending all audit records to the director and missing attacks because thresholds on individual hosts are not exceeded.
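The aggregation step behind both scenarios, counting per-user evidence across hosts until a network-wide threshold is crossed, can be sketched as follows (the class, threshold, and message are illustrative, not the DIDS expert system's actual interface):

```python
from collections import defaultdict

class Director:
    """Aggregates per-host reports so that attacks too small to flag on
    any single host (e.g., a doorknob attack) become visible network-wide."""
    def __init__(self, network_threshold=5):
        self.failed_logins = defaultdict(set)   # NID -> set of (host, account)
        self.threshold = network_threshold

    def report_failed_login(self, nid, host, account):
        """Called by a host monitor; returns an alert once the aggregated
        count for this network user crosses the threshold."""
        self.failed_logins[nid].add((host, account))
        if len(self.failed_logins[nid]) >= self.threshold:
            return f"doorknob attack suspected for network user {nid}"
        return None
```

No single host ever sees more than one or two failures here, which is exactly why a per-host threshold misses the pattern.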
In addition to the specific scenarios outlined above, there are a number of general ways that an intruder can use the connectivity of the network to hide his trail and to enhance his effectiveness. Some of the attack configurations which have been hypothesized include chain and parallel attacks. DIDS combats these inherent vulnerabilities of the network by using the very same connectivity to help track and detect the intruder. Note that DIDS should be at least as effective as host-based IDSs (if we implement all of their functionality in the DIDS host monitor), and at least as effective as the stand-alone NSM.
This problem is unique to the network environment and has not been dealt with before in this context. The solution to the multiple user identity problem is to create a network-user identification (NID) the first time a user enters the monitored environment, and then to apply that NID to any further instances of the user. All evidence about the behavior of any instance of the user is then accountable to the single NID. In particular, we must be able to determine that "smith@host1" is the same user as "jones@host2", if in fact they are. Since the network-user identification problem involves the collection and evaluation of data from both the host and LAN monitors, examining it is a useful method to understand the operation of DIDS. In the following subsections we examine each of the components of DIDS in the context of the creation and use of the NID.
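Tracking a single NID across a chain of cross-host logins is essentially an identity-merging problem; a minimal union-find sketch under that interpretation (the class and method names are ours, not the DIDS implementation):

```python
class NIDTable:
    """Assigns one network-user identification (NID) per physical user and
    merges identities when a cross-host login (e.g., rlogin) links them."""
    def __init__(self):
        self.parent = {}

    def _find(self, user):
        self.parent.setdefault(user, user)
        while self.parent[user] != user:
            self.parent[user] = self.parent[self.parent[user]]  # path halving
            user = self.parent[user]
        return user

    def link(self, src_user, dst_user):
        """Record that src_user logged in as dst_user on another host."""
        self.parent[self._find(dst_user)] = self._find(src_user)

    def nid(self, user):
        """The single NID to which all of this user's instances resolve."""
        return self._find(user)
```

After "smith@host1" rlogins to "jones@host2", both accounts resolve to the same NID, so evidence gathered under either name accumulates against one network user.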
The host monitor examines each audit record to determine if it should be forwarded to the expert system for further evaluation. Certain critical audit records are always passed directly to the expert system (i.e., notable events); others are processed locally by the host monitor (i.e., profiles and attack signatures, which are sequences of noteworthy events which indicate the symptoms of attacks) and only summary reports are sent to the expert system.
Thus, one of the design objectives is to push as much of the processing as possible down to the low-level monitors. In order to do this, the HEG creates a more abstract object called an event. The event includes any significant data provided by the original audit record plus two new fields: the action and the domain. The action and domain are abstractions which are used to minimize operating system dependencies at higher levels. Actions characterize the dynamic aspect of the audit records. Domains characterize the objects of the audit records. In most cases, the objects are files or devices and their domain is determined by the characteristics of the object or its location in the file system. Since processes can also be objects of an audit record, they are also assigned to domains, in this case by their function. The actions are: session start,
session end, read (a file or device), write (a file or device), execute (a process), terminate (a process), create (a file or (virtual) device), delete (a file or (virtual) device), move (rename a file or device), change rights, and change_user_id. The domains are: tagged, authentication, audit, network, system, sys_info, user_info, utility, owned, and not_owned. The domains are prioritized so that an object is assigned to the first applicable domain. Tagged objects are ones which are thought a priori to be particularly interesting in terms of detecting intrusions. Any file, device, or process can be tagged (e.g., /etc/passwd). Authentication objects are the processes and files which are used to provide access control on the system (e.g., the password file). Similarly, audit objects relate to the accounting and security auditing processes and files. Network objects are the processes and files not covered in the previous domains which relate to the use of the network. System objects are primarily those which are concerned with the execution of the operating system itself, again exclusive of those objects already assigned to previously considered domains. Sys_info and user_info objects provide information about the system and about the users of the system, respectively. The utility objects are the bulk of the programs run by the users (e.g., compilers and editors). In general, the execution of an object in the utility domain is not interesting (except when the use is excessive), but the creation or modification of one is. Owned objects are relative to the user. Not_owned objects are, by exclusion, every object not assigned to a previous domain. They are also relative to a user; thus, files in the owned domain relative to "smith" are in the not_owned domain relative to "jones".
All possible transactions fall into one of a finite number of events formed by the cross product of the actions and the domains, and each event may also succeed or fail. Note that no distinction is made between files, directories or devices, and that all of these are treated simply as objects. Not every action is applicable to every object; for example, the terminate action is applicable only to processes. The choice of these domains and actions is somewhat arbitrary in that one could easily suggest both finer and coarser grained partitions. However, they capture most of the interesting behavior for intrusion detection and correspond reasonably well with what other researchers in this field have found to be of interest. By mapping an infinite number of transactions to a finite number of events, we not only remove operating system dependencies, but also restrict the number of permutations that the expert system will have to deal with. The concept of the domain is one of the keys to detecting abuses. Using the domain allows us to make assertions about the nature of a user's behavior in a straightforward and systematic way. Although we lose some details provided by the raw audit information, that is more than made up for by the increase in portability, speed, simplicity, and generality. An event reported by a host monitor is called a host audit record (har). The record syntax is:
har(Monitor-ID, Host-ID, Audit-UID, Real-UID, Effective-UID, Time, Domain, Action, Transaction, Object, Parent Process, PID, Return Value, Error Code).
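The har record can be rendered as a simple Python structure for illustration. The field names follow the syntax above; the sample values are invented.

```python
# Minimal rendering of the host audit record (har) fields listed above.
from collections import namedtuple

HAR = namedtuple("HAR", [
    "monitor_id", "host_id", "audit_uid", "real_uid", "effective_uid",
    "time", "domain", "action", "transaction", "object",
    "parent_process", "pid", "return_value", "error_code",
])

# Hypothetical record: user 1001 reading /etc/passwd via a login process.
rec = HAR("mon1", "hostA", 1001, 1001, 0, "09:14:02",
          "authentication", "read", "open", "/etc/passwd",
          "login", 4242, 0, 0)
print(rec.domain, rec.action, rec.object)
```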
Of all the possible events, only a subset are forwarded to the expert system. For the creation and application of the NID, it is the events which relate to the creation of user sessions or to a change in an account that are important. These include all the events with session_start actions, as well as ones with an execute action applied to the network domain. These latter events capture such transactions as executing the rlogin, telnet, rsh, and rexec UNIX programs. The HEG consults external tables, which are built by hand, to determine which events should be forwarded to the expert system. Because they relate to events rather than to the audit records themselves, the tables and the modules of the HEG which use them are portable across operating systems. The only portion of the HEG which is operating system dependent is the module which creates the events.
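The table-driven forwarding decision the HEG makes can be sketched as follows. The table entries mirror the examples in the text (session_start events and execute actions in the network domain); the `None`-means-any-domain convention is an assumption of this sketch, not the HEG's actual table format.

```python
# Hand-built forwarding table keyed on (action, domain) events.
FORWARD_TABLE = {
    ("session_start", None),   # all session_start events, any domain
    ("execute", "network"),    # e.g. rlogin, telnet, rsh, rexec transactions
}

def should_forward(action, domain):
    """Forward an event to the expert system iff it matches the table.
    A None domain in the table means 'any domain' for that action."""
    return (action, domain) in FORWARD_TABLE or (action, None) in FORWARD_TABLE

print(should_forward("session_start", "system"))   # True
print(should_forward("execute", "network"))        # True
print(should_forward("execute", "utility"))        # False
```

Because the table is keyed on events rather than raw audit records, this logic stays portable across operating systems, as the text notes.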
Fig 7.2 shows a generalized DIDS target environment. The DIDS architecture combines distributed monitoring and data reduction with centralized data analysis. The DIDS architecture consists of the DIDS director, a single host monitor per host, and a single LAN monitor for each broadcast LAN segment in the monitored network. In DIDS, the host and LAN monitors report events which may indicate intrusive activity to a centrally located DIDS director. The director employs an expert system to detect possible intrusion attacks. This architecture provides accountability by tying users to their actions. The host and LAN monitors are responsible for the collection of evidence of suspicious activity, and the DIDS director is responsible for its evaluation. Reports are sent independently and asynchronously from the host and LAN monitors to the DIDS director through a communications architecture shown in figure 7.2. The high-level communication protocols between the components are based on Common Management Information Protocol (CMIP) recommendations. The architecture provides bidirectional communication between the DIDS director and any monitor in the configuration.
Similar to the host monitor, the LAN monitor uses several simple analysis techniques to identify significant events. The events include the use of certain services (e.g., rlogin and telnet) as well as activity by certain classes of hosts (e.g., a PC without a host monitor). The LAN monitor also uses and maintains profiles of expected network behavior. The profiles consist of expected data paths (e.g., which systems are expected to establish communication paths to which other systems, and by which service) and service profiles (e.g., what a typical telnet, mail, or finger is expected to look like). The LAN monitor also uses heuristics in an attempt to identify the likelihood that a particular connection represents intrusive behavior. These heuristics consider the capabilities of each of the network services, the level of authentication required for each of the services, the security level for each machine on the network, and signatures of past attacks. The abnormality of a connection is based on the probability of that particular connection occurring and the behavior of the
DEPT. OF CSE / B.T.L.I.T 29
connection itself. Upon request, the LAN monitor is also able to provide a more detailed examination of any connection, including capturing every character crossing the network (i.e., a wire-tap). This capability can be used to support a directed investigation of a particular subject or object. Like the host monitor, the LAN monitor forwards relevant security information to the director through its LAN agent. An event reported by a LAN monitor is called a network audit record (nar). The record syntax is: nar(Monitor-ID, Source_Host, Dest_Host, Time, Service, Domain, Status). A large amount of low level filtering and some analysis is performed by the host monitor to minimize the use of network bandwidth in passing evidence to the director.
The LAN monitor has several responsibilities with respect to the creation and use of the NID. The LAN monitor is responsible for detecting any connections related to rlogin and telnet sessions. Once these connections are detected, the LAN monitor can be used to verify the owner of a connection. The LAN monitor can also be used to help track tagged objects moving across the network. The SSO can also ask for a wire-tap on a certain network connection to monitor a particular user's behavior.
CHAPTER 8
Like the host monitor, the LAN monitor consists of a LAN event generator (LEG) and a LAN agent. The LEG is currently a subset of UC Davis NSM. Its main responsibility is to observe all of the traffic on its segment of the LAN to monitor host-to-host connections, services used, and volume of traffic. The LAN monitor reports on such network activity as rlogin and telnet connections, the use of security-related services, and changes in network traffic patterns.
The DIDS director consists of three major components that are all located on the same dedicated workstation. Because the components are logically independent processes, they could be distributed as well. The communications manager is responsible for the transfer of data between the director and each of the host and the LAN monitors. It accepts the notable event records from each of the host and LAN monitors and sends them to the expert system. On behalf of the expert system or user interface, it is also able to send the requests to the host and LAN monitors for more information regarding a particular subject.
The expert system is responsible for evaluating and reporting on the security state of the monitored system. It receives the reports from the host and the LAN monitors and, based on these reports, makes inferences about the security of each individual host, as well as the system as a whole. The expert system is a rule-based system with simple learning capabilities. The director's user interface allows the System Security Officer (SSO) interactive access to the entire system. The SSO is able to watch activities on each host, watch network traffic (by setting "wire-taps"), and request more specific types of information from the monitors.
The architecture provides bidirectional communication between the DIDS director and any monitor in the configuration and the communication consists of notable events and anomaly reports. The director makes requests for more detailed information from the distributed monitors.
The host monitor consists of a host event generator and a host agent. The event generator collects and analyzes audit records from the host operating system, scanning them for notable events. The notable events are sent to the director for further analysis. The LAN monitor consists of a LAN event generator and a LAN agent. The LAN event generator is a subset of NSM and is responsible for observing all the traffic on its segment of the LAN, in order to monitor host-to-host connections, services used, and volume of traffic.
The DIDS director consists of three major components, namely a communications manager, an expert system, and a user interface. The communications manager transfers data between the director and the monitors; it accepts the notable event records from each of the host and LAN monitors and sends them to the expert system. It also sends requests to the host and LAN monitors for information regarding a particular user.
The expert system is responsible for evaluating and reporting the security state of the monitored system; it receives the reports from the host and the LAN monitors and makes inferences about the security of each individual host. The expert system also has simple learning capabilities.
The packet is read, in real time, off the network through a sensor that resides on a network segment located somewhere between the two communicating computers. The sensor is usually a stand-alone machine or network device.
The network packet is created when one computer communicates with another. A sensor-resident detection engine is used to identify predefined patterns of misuse. If a pattern is detected, an alert is generated. The security officer is notified about the misuse. This can be done through a variety of methods, including audible, visual, pager, email, or any other available method. A response to the misuse is generated. The response subsystem matches alerts to predefined responses or can take responses from the security officer. The alert is stored for correlation and review at a later time. Reports are generated that summarize the alert activity. Data forensics is used to detect long-term trends. Some systems allow archiving of the original traffic to replay sessions.
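The pipeline described above, from pattern match to alert, notification, response, and storage, can be sketched in a few lines of Python. Everything here is illustrative: the signature, the response table, and the function names are invented for the sketch and are not part of any real sensor.

```python
# Hedged sketch of the sensor pipeline: match -> alert -> notify -> respond -> store.
SIGNATURES = {b"USER root": "suspicious-root-login"}   # invented signature

alert_store = []          # alerts kept for later correlation and review

def notify(alert):        # stand-in for audible/visual/pager/email channels
    print("ALERT:", alert)

def respond(alert):       # match alert to a predefined response
    return {"suspicious-root-login": "reset-connection"}.get(alert, "log-only")

def process_packet(payload):
    """Run every signature against the packet payload; act on any match."""
    for pattern, name in SIGNATURES.items():
        if pattern in payload:
            notify(name)
            alert_store.append((name, respond(name)))

process_packet(b"220 ftp ready\r\nUSER root\r\n")
print(alert_store)   # [('suspicious-root-login', 'reset-connection')]
```

Real systems separate these stages into subsystems (detection engine, notification, response, storage), but the data flow follows this shape.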
A few years ago all commercial network intrusion detection systems used promiscuous-mode sensors. However, these technologies were subject to packet loss on high speed networks. A new architecture for network intrusion detection was created that dealt with the performance problem on high speed networks by distributing sensors to every computer on the network. In network-node intrusion detection each sensor is concerned only with packets directed at the target in which the sensor resides. The sensors then communicate with each other and the main console to aggregate and correlate alarms. However, this network-node architecture has added to the confusion over the difference between network and host-based intrusion detection. A network sensor that is running on a host machine does not make it a host-based sensor. Network packets directed to a host and sniffed at a host are still considered network intrusion detection.
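The defining behavior of a network-node sensor, keeping only packets addressed to the host it resides on, can be shown with a minimal filter. The packet representation and names here are invented for illustration.

```python
# Network-node ID: each sensor analyzes only traffic directed at its own host.
HOST_IP = "192.168.1.20"   # hypothetical address of the sensor's host

def sensor_filter(packets, host_ip=HOST_IP):
    """Keep only packets whose destination is the host the sensor resides on."""
    return [p for p in packets if p["dst"] == host_ip]

traffic = [
    {"src": "10.0.0.1", "dst": "192.168.1.20", "data": b"telnet attempt"},
    {"src": "10.0.0.1", "dst": "192.168.1.99", "data": b"other host"},
]
print(len(sensor_filter(traffic)))   # 1 -> only the packet aimed at this host
```

This per-host filtering is what sidesteps the promiscuous-mode packet-loss problem on high-speed networks; the aggregation of alarms across sensors happens afterwards at the console.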
A network packet is created. The packet is read in real-time off the network through a sensor resident on the destination machine. A detection engine is used to identify pre-defined patterns of misuse. If a pattern is detected, an alert is generated and forwarded to a central console or to other sensors in the network. The security officer is notified. A response is generated. The alert is stored for later review and correlation. Reports are generated summarizing alert activity. Data forensics is then used to look for long-term trends.
These architectures also require operational modes in order to operate. Operational modes describe the manner in which the intrusion detection system will operate and partially describe the end goals of monitoring. There are two primary operational modes for network-based intrusion detection: tip-off and surveillance. In tip-off mode, the system is used to detect misuse as it is happening. This is the traditional context for intrusion detection systems. By observing patterns of behavior, suspicious behavior can be detected to tip off the security officer that misuse may be occurring. The defining characteristic of tip-off is that the system is detecting patterns that have not been detected before. During surveillance, targets are observed more closely for patterns of misuse. Surveillance is characterized by an increased observance of the behavior of a small set of subjects. Unlike tip-off, surveillance takes place when misuse is already suspected, and it results from a tip-off from either the intrusion detection system or another indicator.
In order for there to be a tip-off a data source needs to be searched for suspicious behavior. Host-based intrusion detection systems analyze data that originates on computers, such as application and operating system event logs and file attributes. Host data sources are numerous and varied, including operating system event logs, such as kernel logs, and application logs such as syslog. These host event logs contain information about file accesses and program executions associated with inside users. If protected correctly, event logs may be entered into court to support the prosecution of computer criminals.
There are many attack scenarios that host-based intrusion detection guards against. One of these scenarios is the abuse-of-privilege attack: a user has root, administrative, or some other privilege and uses it in an unauthorized manner. Another scenario involves contractors with elevated privileges. This usually happens when an administrator gives a contractor elevated privileges to install an application. Most security policies restrict non-employees from having root or administrator privileges; however, it may be easier to elevate the user's privileges and reduce them later, and the administrator might then forget to remove the privileges. A third attack scenario involves ex-employees utilizing their old accounts. Most organizations have policies in place to delete or disable accounts when individuals leave.
However, accounts take time to delete or disable, leaving a window for a user to log back in. Another scenario involves modifying web site data. There have been many cases, against government agencies in particular, that resulted in uncomplimentary remarks posted on web sites. While these attacks originate from outside the network, they are perpetrated on the machine itself through alteration of data. With a review of what attacks host-based intrusion detection systems prevent, it is important to examine the architecture to see how it prevents those attacks. In the centralized architecture, data is forwarded to an analysis engine running independently from the target. Fig. 8.4 represents the typical life cycle of an event record running through this type of architecture, and Fig. 8.5 represents a distributed real-time host-based intrusion detection architecture. The difference between the two is that in Fig. 8.4 the raw data is forwarded to a central location before it is analyzed, while in Fig. 8.5 the raw data is analyzed in real time on the target first and then only alerts are sent to the command console. There are advantages and disadvantages to each method; however, the best systems offer both types of processing.
An event record is created. This occurs when an action happens, such as a file being opened or a program being executed (e.g., a text editor like Microsoft Word). The record is written into a file that is usually protected by the operating system's trusted computing base.
The target agent transmits the file to the command console. This happens at predetermined time intervals over a secure connection. The detection engine, configured to match patterns of misuse, processes the file. A log is created that becomes the data archive for all the raw data that will be used in prosecution.
An alert is generated. When a predefined pattern is recognized, such as access to a mission critical file, an alert is forwarded to a number of various subsystems for notification, response, and storage.
The security officer is notified. A response is generated. The response subsystem matches alerts to predefined responses or can take response commands from the security officer. Responses include reconfiguring the system, shutting down a target, logging off a user, or disabling an account.
The alert is stored. The storage is usually in the form of a database. Some systems store statistical data as well as alerts. The raw data is transferred to a raw data archive. This archive is cleared periodically to reduce the amount of disk space used. Reports are generated. Reports can be a summary of the alert activity. Data forensics is used to locate long-term trends and behavior is analyzed using both the stored data in the database and the raw event log archive. The lifecycle of an event record through a distributed real-time architecture is similar, except that the record is discarded after the target detection engine analyzes it. The advantage to this approach is that everything happens in real-time. The disadvantage is that the end users suffer from system performance degradation.
Data forensics is used to search for long-term trends. However, because there is no raw data archive and no statistical data, this capacity is limited. Reports are generated.
An event record is created. The file is read in real-time and processed by a target resident detection engine. The security officer is notified. Some systems notify directly from the target, while others notify from a central console. A response is generated. The response may be generated from the target or console. An alert is generated then sent to a central console. The alert is stored. Statistical behavioral data outside alert data are not usually available in this architecture.
CHAPTER 9
The model is the basis of the rule base. It serves both as a description of the function of the rule base and as a touchstone for the actual development of the rules. The IDM consists of six layers, each layer representing the result of a transformation performed on the data (see Table 9.1).
The objects at the first level of the model are the audit records provided by the host operating system, by the LAN monitor, or by a third party auditing package. The objects at this level are both syntactically and semantically dependent on the source. At this level, all of the activity on the host or LAN is represented.
At the second level, the event (which has already been discussed in the context of the host and LAN monitors) is both syntactically and semantically independent of its source; it provides a standard format for events.
The third layer of the IDM creates a subject. This introduces a single identification for a user across many hosts on the network. It is the subject who is identified by the NID. Upper layers of the model treat the network-user as a single entity, essentially ignoring the local identification on each host. Similarly, above this level, the collection of hosts on the LAN is generally treated as a single distributed system with little attention being paid to the individual hosts.
The fourth layer of the model introduces the event in context. There are two kinds of context: temporal and spatial. As an example of temporal context, behavior which is unremarkable during standard working hours may be highly suspicious during off hours. The IDM, therefore, allows for the application of information about wall clock time to the events it is considering. Wall-clock time refers to information about the time of day, weekdays versus weekends and holidays, as well as periods when an increase in activity is expected. In addition to the consideration of external temporal context, the expert system uses time windows to correlate events occurring in temporal proximity. This notion of temporal proximity implements the heuristic that a call to the UNIX who command followed closely by a login or logout is more likely to be related to an intrusion than either of those events occurring alone. Spatial context implies the relative importance of the source of events. That is, events related to a particular user, or events from a particular host, may be more likely to represent an intrusion than similar events from a different source. For instance, a user moving from a low-security machine to a high-security machine may be of greater concern than a user moving in the opposite direction. The model also allows for the correlation of multiple events from the same user or source. In both of these cases, multiple events are more noteworthy when they have a common element than when they do not.
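The two kinds of temporal context above, wall-clock weighting and time-window correlation, can be illustrated with a small sketch. The weights, window size, and event names are invented for the example; the IDM itself does not prescribe these values.

```python
# Illustrative temporal context: the same event scores higher off hours,
# and a `who` followed closely by a login/logout scores higher than either alone.
WORK_HOURS = range(8, 18)   # assumed standard working hours, 08:00-17:59

def context_weight(hour):
    """Off-hours activity is weighted as more suspicious (factor invented)."""
    return 1.0 if hour in WORK_HOURS else 2.5

def correlated(events, window=60):
    """events: list of (timestamp_seconds, name) pairs.
    True if a 'who' is followed by a login/logout within `window` seconds."""
    times = {name: t for t, name in events}
    return ("who" in times and
            any(0 <= times.get(e, -1) - times["who"] <= window
                for e in ("login", "logout")))

print(context_weight(3))                              # 2.5 -> off hours
print(correlated([(100, "who"), (130, "logout")]))    # True: within 60 s
print(correlated([(100, "who"), (500, "logout")]))    # False: too far apart
```

Spatial context would add a second multiplier based on the source host's security level, in the same spirit.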
The fifth layer of the model considers the threats to the network and the hosts connected to it. Events in context are combined to create threats. The threats are partitioned by the nature of the abuse and the nature of the target. In other words, what is the intruder doing, and what is he doing it to? Abuses are divided into attacks, misuses, and suspicious acts. Attacks represent abuses in which the state of the machine is changed. That is, the file system or process state is different after the attack than it was prior to the attack. Misuses represent out-of-policy behavior in which the state of the machine is not affected. Suspicious acts are events which, while not a violation of policy, are of interest to an IDS. For example, commands which provide information about the state of the system may be suspicious. The targets of abuse are characterized as being either system objects or user objects and as being either passive or active. User objects are owned by non-privileged users and/or reside within a non-privileged user's directory hierarchy. System objects are the complement of user objects. Passive objects are files, including executable binaries, while active objects are essentially running processes.
At the highest level, the model produces a numeric value between one and 100 which represents the overall security state of the network. The higher the number the less secure the network. This value is a function of all the threats for all the subjects on the system. Here again we treat the collection of hosts as a single distributed system. Although representing the security level of the system as a single value seems to imply some loss of information, it provides a quick reference point for the SSO. In fact, in the current implementation, no information is lost since the expert system maintains all the evidence used in calculating the security state in its internal database, and the SSO has access to that database.
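A toy aggregation of per-threat scores into the single 1-100 security value can be sketched as follows. The combining rule used here (a capped sum with clamping into the valid range) is an assumption for illustration only; the text does not specify how the expert system actually combines threat evidence.

```python
# Toy aggregation into the model's top-level 1..100 security value.
def security_state(threat_scores):
    """threat_scores: per-threat severities (arbitrary units).
    Higher result means the network is LESS secure."""
    raw = sum(threat_scores)
    return max(1, min(100, round(raw)))   # clamp into the 1..100 range

print(security_state([]))            # 1  -> quiet network
print(security_state([12, 30, 8]))   # 50 -> moderate concern
print(security_state([90, 80]))      # 100 -> capped at the maximum
```

As the text notes, collapsing to a single value loses nothing in practice, since the underlying evidence stays available in the expert system's database.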
In the context of the network-user identification problem we are concerned primarily with the lowest three levels of the model: the audit data, the event, and the subject. The generation of the first two of these has already been discussed; thus, the creation of the subject is the focus of the following subsection.
The expert system is responsible for applying the rules to the evidence provided by the monitors. In general, the rules do not change during the execution of the expert system. What does change is a numerical value associated with each rule. This Rule Value (RV) represents our confidence that the rule is useful in detecting intrusions. These rule values are manipulated using a negative reinforcement training method which allows the expert system to continually lower the number of false attack reports. When a potential
attack is reported by the expert system, the SSO determines the validity of the report and gives feedback to the expert system. If the report is deemed faulty, the expert system lowers the RVs associated with the rules that were used to draw that conclusion. In addition to this directed training, which may lower some rule values, the system also automatically increases the RVs of all the rules on a regular basis. This recovery algorithm allows the system to adapt to changes in the environment as well as recover from faulty training. Logically, the rules have the form: antecedent => consequence, where the antecedent is either a fact reported by one of the distributed monitors or a consequence of some previously satisfied rule. The antecedent may also be a conjunction of these. The overall structure of the rule base is a tree rooted at the top; thus, many facts at the bottom of the tree lead to a few conclusions at the top of the tree.
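The negative-reinforcement scheme and the recovery step can be sketched in Python. The rule names, decay factor, and recovery increment here are invented; the text states only that faulty reports lower the RVs of the rules involved and that all RVs are periodically increased.

```python
# Sketch of RV training: penalize rules behind a faulty report, then let
# all RVs drift back up on a schedule (factors are assumptions).
rules = {"rule_login_offhours": 0.8, "rule_who_then_logout": 0.6}

def penalize(used_rules, factor=0.9):
    """SSO marked the report faulty: lower RVs of the rules that fired."""
    for name in used_rules:
        rules[name] = round(rules[name] * factor, 4)

def recover(increment=0.01, cap=1.0):
    """Periodic recovery: every RV drifts back up toward the cap."""
    for name in rules:
        rules[name] = min(cap, round(rules[name] + increment, 4))

penalize(["rule_login_offhours"])
print(rules["rule_login_offhours"])   # 0.72 after the penalty
recover()
print(rules["rule_login_offhours"])   # 0.73 after one recovery step
print(rules["rule_who_then_logout"])  # 0.61, untouched by the penalty
```

The recovery step is what lets the system shed the effects of mistaken SSO feedback over time rather than permanently silencing a rule.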
The expert system shell consists of approximately a hundred lines of Prolog source code. The shell is responsible for reading new facts reported by the distributed monitors, attempting to apply the rules to the facts and hypotheses in the Prolog database, reporting suspected intrusions, and maintaining the various dynamic values associated with the rules and hypotheses. The syntax for rules is: rule(n, r, (single, [A]), (C)). where n is the rule number, r is the initial RV, A is the single antecedent, and C is the consequence. Conjunctive rules have the form: rule(n, r, (and, [A1, A2, A3]), (C)). where A1, A2, A3 are the antecedents and C is the consequence. Disjunctive rules are not allowed; that situation is dealt with by having multiple rules with the same consequence.
9.1 ADVANTAGES
The distributed Intrusion Detection Model based on Protocol analysis has the following advantages:
The system consists of three modules: Detect Module, Process Module, and Response Module. Data transmission between the modules therefore does not pass through many intermediate layers, which enhances the transfer rate between the modules. When large volumes of data flow through the network, this gives the intrusion detection system a great advantage. When there is more data traffic on the network, the undetected rate of general intrusion detection systems rises sharply, which gives a hacker an opportunity: the attacker can send a large number of flooding packets to litter the network, and if there is any delay between the detection and processing parts, or the matching time against the rule base is too long for the data sent, then a large part of the data will not be detected. The hacker can then mix intrusion packets among the litter packets slipping through the openings, so as to achieve his purpose. This model uses a high-speed link, which greatly improves the data transmission speed.
In the Detect Module, only the important characteristics of each packet are extracted and passed to the Process Module for processing. Their length is often only a small percentage of the length of the full packet, which not only saves resources in the detection part but also greatly improves the transmission rate of packet characteristics per unit time. Because the rule base of the central part is constituted by the characteristics of these intrusion data, this also saves resources in the Process Module. And because the characteristic strings are short, the matching speed can be greatly enhanced; even if there is a lot of data to be processed at the same time, the system can still complete its matching tasks, detect intrusions in a timely manner, and improve the detection rate.
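The feature-extraction step above can be sketched as follows. The choice of fields in the characteristic vector (addresses, port, flags, payload length) is an assumption for the sketch; the paper does not enumerate the exact characteristics extracted.

```python
# Illustrative Detect Module extraction: forward a short characteristic
# vector to the Process Module instead of the whole packet.
def extract_features(packet):
    """Reduce a raw packet dict to the few fields the rule base matches on."""
    return (packet["src"], packet["dst"], packet["dport"],
            packet["flags"], len(packet["payload"]))

pkt = {"src": "10.0.0.5", "dst": "10.0.0.9", "dport": 23,
       "flags": "S", "payload": b"x" * 1400}

features = extract_features(pkt)
print(features)            # ('10.0.0.5', '10.0.0.9', 23, 'S', 1400)
print(len(str(features)))  # far shorter than the 1400-byte payload
```

Matching the rule base against these short tuples rather than full payloads is what keeps the per-packet matching time low under heavy traffic.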
Table 9.3 illustrates the advantages and disadvantages of a real-time distributed intrusion detection system. This table is a mirror image of Table 9.2 with a few minor additions.
Host-based and network-based systems are both required because they provide significantly different benefits. Detection, deterrence, response, damage assessment, attack anticipation and prosecution support are available at different degrees from the different technologies. Table 9.4 summarizes these differences.
CONCLUSION
Intrusion detection technology based on protocol analysis has become one of the key technologies for next-generation intrusion detection systems. This paper presents a Distributed Intrusion Detection System based on protocol analysis which is simple in structure, fast in detection speed, efficient in detection, and economical in resources; it is an affordable intrusion detection system. However, the diversity of network intrusions makes complete detection impossible, especially because the rule base can only capture the characteristics of known intrusions, so some intrusions go unrecognized, resulting in missed detections. Distributed Intrusion Detection research is still at an initial stage; as the technology develops, the system must be able to adapt to changing trends in network data, giving it self-learning and adaptive capabilities. Nevertheless, the protocol analysis presented in this paper should help improve the performance of existing distributed intrusion detection systems, and should have practical significance for the future of the Distributed Intrusion Detection System.
FUTURE WORK
The Distributed Intrusion Detection System (DIDS) is being developed to address the shortcomings of current single host IDSs by generalizing the target environment to multiple hosts connected via a network (LAN). Most current IDSs do not consider the impact of the LAN structure when attempting to monitor user behavior for attacks against the system. Intrusion detection systems designed for a network environment will become increasingly important as the number and size of LANs increase. The prototype has demonstrated the viability of our distributed architecture in solving the network-user identification problem.
The system was tested on a sub-network of Sun SPARCstations, and it correctly identified network users in a variety of scenarios. Work continues on the design, development, and refinement of rules, particularly those which can take advantage of knowledge about particular kinds of attacks. The initial prototype expert system was written in Prolog, but it is currently being ported to CLIPS due to the latter's superior performance characteristics and easy integration with the C programming language. A signature analysis component is being designed for the host monitors to detect events and sequences of events that are known to be indicative of an attack in a specific context. In addition to the current host monitor, which is designed to detect attacks on general-purpose multi-user computers, the intention is to develop monitors for application-specific hosts such as file servers and gateways. In support of the ongoing development of DIDS, there is a plan to extend the model to a hierarchical Wide Area Network environment.