
CHAPTER 1

INTRODUCTION

1.1 Motivation:
Peer-to-peer (P2P) networks have become an important infrastructure in recent years, evolving from simple systems like Napster and Gnutella to more sophisticated ones based on distributed hash tables, such as CAN and Chord. Although schemes based on hash functions provide good performance for point queries (where the search key is known exactly), they work poorly for approximate, range, or text queries. In those cases we must flood messages, as Gnutella does, and lose scalability and performance. For such queries we need a different infrastructure, one based on semantic relations among peers and the data they contain. Two main intuitions come to mind.

First, queries can be routed only to a semantically chosen subset of peers that are able to answer them. If a peer cannot answer a query fully enough, it forwards the query only to those neighbors that may also have answers, and so on. As a result, the number of flooding messages is reduced.

Second, shared data in P2P systems often has a pronounced ontological structure because of its origin and its relation to real-world concepts (music, scientific papers, and movies), so it is possible to sort such data into parts, classify its content, and identify semantically similar groups.

These intuitions were realized in a conception and are presented in this write-up with several extensions from other papers.

1.2 Related Work:


We simulated peer-to-peer networks using the Eclipse-based simulator OMNeT++ 4.2.1 with the INET and OverSim frameworks, and analyzed the results using the Scave tool. We compared the efficiency and operation of four prominent protocols currently used in P2P networks, namely Kademlia, Koorde, Chord and Pastry, on factors such as fault tolerance, throughput, scalability, time complexity, space complexity, and hop count.

CHAPTER 2

Chord

2.1 Introduction:


Chord is a peer-to-peer protocol which presents a new approach to the problem of efficient location. Chord uses routed queries to locate a key with a small number of hops, which stays small even if the system contains a large number of nodes. What distinguishes Chord from other approaches is its simplicity, its provable performance and its provable correctness. Basically, Chord supports just one operation: given a key, it maps the key onto a node. Data location can be implemented by associating each key with a data item.

2.2 Properties of Chord

2.2.1 Decentralization

In a peer-to-peer system using Chord, there exists no central server or superpeer. Each node is of the same importance as any other node. Therefore, the system is very robust, since it does not have a single point of failure.

2.2.2 Availability

The protocol functions very well even if the system is in a continuous state of change: despite major failures of the underlying network and despite the joining of a large number of nodes, the node responsible for a key can always be found.

2.2.3 Scalability

The cost of a Chord lookup grows only logarithmically in the number of nodes in the system, so Chord can be used for very large systems.

2.2.4 Load balance

Chord uses a consistent hash function to assign keys to nodes. Therefore, the keys are spread evenly over the nodes.

2.2.5 Flexible naming

Chord imposes no constraints on the key structure, so the user is granted a large amount of flexibility in how the data can be named.

2.3 Chord Protocol Brief


The Chord protocol is one solution for connecting the peers of a P2P network. Chord consistently maps a key onto a node. Both keys and nodes are assigned an $m$-bit identifier. For nodes, this identifier is a hash of the node's IP address. For keys, this identifier is a hash of a keyword, such as a file name. It is not uncommon to use the words "nodes" and "keys" to refer to these identifiers, rather than actual nodes or keys. There are many other algorithms in use by P2P, but this is a simple and common approach.[2] A logical ring with positions numbered $0$ to $2^m - 1$ is formed among nodes.

Key $k$ is assigned to node successor($k$), which is the node whose identifier is equal to or follows the identifier of $k$. If there are $N$ nodes and $K$ keys, then each node is responsible for roughly $K/N$ keys. When a node joins or leaves the network, responsibility for $O(K/N)$ keys changes hands.

If each node knows only the location of its successor, a linear search over the network could locate a particular key. This is a naive method for searching the network, since any given message could potentially have to be relayed through most of the network. Chord implements a faster search method. Chord requires each node to keep a "finger table" containing up to $m$ entries. The $i$-th entry of node $n$ contains the address of successor($(n + 2^{i-1}) \bmod 2^m$).
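The successor rule and the finger-table search can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 6-bit ring, the ten example node identifiers, and the helper names `successor`, `finger_table`, `in_half_open` and `lookup` are chosen for illustration and are not taken from the Chord implementation used in our simulations. It assumes the standard finger rule, in which entry $i$ of node $n$ points at successor($(n + 2^{i-1}) \bmod 2^m$).

```python
from bisect import bisect_left

M = 6            # identifier bits; a small value assumed for readability
RING = 2 ** M    # ring positions are numbered 0 .. 2^m - 1


def successor(node_ids, ident):
    """Return the first node whose identifier is equal to or follows ident.

    node_ids must be a sorted list of the identifiers of all live nodes.
    """
    idx = bisect_left(node_ids, ident % RING)
    return node_ids[idx % len(node_ids)]  # wrap past the highest identifier


def finger_table(node_ids, n):
    """Entry i (1-based) is successor((n + 2^(i-1)) mod 2^m)."""
    return [successor(node_ids, (n + 2 ** (i - 1)) % RING)
            for i in range(1, M + 1)]


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b], walking clockwise from a."""
    x, a, b = x % RING, a % RING, b % RING
    if a < b:
        return a < x <= b
    return x > a or x <= b  # the interval wraps around position 0


def lookup(node_ids, start, key):
    """Follow fingers from start; return (responsible node, hops taken)."""
    n, hops = start, 0
    while True:
        succ = successor(node_ids, n + 1)
        if in_half_open(key, n, succ):
            return succ, hops
        # Jump to the farthest finger that still precedes the key.
        nxt = succ
        for finger in reversed(finger_table(node_ids, n)):
            if in_half_open(finger, n, key - 1):
                nxt = finger
                break
        n, hops = nxt, hops + 1


nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])  # example ring (assumed)
print(finger_table(nodes, 8))
print(lookup(nodes, 8, 54))
```

In this example the query for key 54 starting at node 8 is forwarded twice (to nodes 42 and 51) before node 56 is identified as responsible, whereas following successors only would pass through every node in between.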

fig1[2]

CHAPTER 3

Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems
3.1 INTRODUCTION
Peer-to-peer Internet applications have recently been popularized through file-sharing applications like Napster, Gnutella and FreeNet. While much of the attention has been focused on the copyright issues raised by these particular applications, peer-to-peer systems have many interesting technical aspects like decentralized control, self-organization, adaptation and scalability. Peer-to-peer systems can be characterized as distributed systems in which all nodes have identical capabilities and responsibilities and all communication is symmetric.

Pastry is intended as a general substrate for the construction of a variety of peer-to-peer Internet applications like global file sharing, file storage, group communication and naming systems. Several applications have been built on top of Pastry to date, including a global, persistent storage utility called PAST and a scalable publish/subscribe system called SCRIBE. Other applications are under development.

Pastry provides the following capability. Each node in the Pastry network has a unique numeric identifier (nodeId). When presented with a message and a numeric key, a Pastry node efficiently routes the message to the node with a nodeId that is numerically closest to the key, among all currently live Pastry nodes. The expected number of routing steps is O(log N), where N is the number of Pastry nodes in the network. At each Pastry node along the route that a message takes, the application is notified and may perform application-specific computations related to the message. Pastry takes into account network locality; it seeks to minimize the distance messages travel, according to a scalar proximity metric like the number of IP routing hops. Each Pastry node keeps track of its immediate neighbors in the nodeId space, and notifies applications of new node arrivals, node failures and recoveries.
Because nodeIds are randomly assigned, with high probability the set of nodes with adjacent nodeIds is diverse in geography, ownership, jurisdiction, etc.[4] Applications can leverage this, as Pastry can route to one of k nodes that are numerically closest to the key. A heuristic ensures that among a set of nodes with the k closest nodeIds to the key, the message is likely to first reach a node near the node from which the message originates, in terms of the proximity metric.

Applications use these capabilities in different ways. PAST, for instance, uses a fileId, computed as the hash of the file's name and owner, as a Pastry key for a file. Replicas of the file are stored on the k Pastry nodes with nodeIds numerically closest to the fileId. A file can be looked up by sending a message via Pastry, using the fileId as the key. By definition, the lookup is guaranteed to reach a node that stores the file as long as one of the k nodes is live. Moreover, it follows that the message is likely to first reach a node near the client, among the k nodes; that node delivers the file and consumes the message. Pastry's notification mechanisms allow PAST to maintain replicas of a file on the k nodes closest to the key, despite node failures and node arrivals, using only local coordination among nodes with adjacent nodeIds.

As another sample application, in the SCRIBE publish/subscribe system, a list of subscribers is stored on the node with nodeId numerically closest to the topicId of a topic, where the topicId is a hash of the topic name. That node forms a rendezvous point for publishers and subscribers. Subscribers send a message via Pastry using the topicId as the key; the registration is recorded at each node along the path. A publisher sends data to the rendezvous point via Pastry, again using the topicId as the key. The rendezvous point forwards the data along the multicast tree formed by the reverse paths from the rendezvous point to all subscribers.[4]

3.2 Design of Pastry:


A Pastry system is a self-organizing overlay network of nodes, where each node routes client requests and interacts with local instances of one or more applications. Any computer that is connected to the Internet and runs the Pastry node software can act as a Pastry node, subject only to application-specific security policies. Each node in the Pastry peer-to-peer overlay network is assigned a 128-bit node identifier (nodeId). The nodeId is used to indicate a node's position in a circular nodeId space, which ranges from $0$ to $2^{128} - 1$. The nodeId is assigned randomly when a node joins the system. It is assumed that nodeIds are generated such that the resulting set of nodeIds is uniformly distributed in the 128-bit nodeId space. For instance, nodeIds could be generated by computing a cryptographic hash of the node's public key or its IP address. As a result of this random assignment of nodeIds, with high probability, nodes with adjacent nodeIds are diverse in geography, ownership, jurisdiction, network attachment, etc.

For the purpose of routing, nodeIds and keys are thought of as a sequence of digits with base $2^b$. Pastry routes messages to the node whose nodeId is numerically closest to the given key. This is accomplished as follows. In each routing step, a node normally forwards the message to a node whose nodeId shares with the key a prefix that is at least one digit (or $b$ bits) longer than the prefix that the key shares with the present node's id. If no such node is known, the message is forwarded to a node whose nodeId shares a prefix with the key as long as the current node, but is numerically closer to the key than the present node's id.
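The routing step just described can be sketched as a single function. The sketch makes several assumptions for readability: identifiers are shortened to 16 bits (four hexadecimal digits with $b = 4$) instead of 128 bits, the node picks from a flat list `known` rather than from separate routing-table and leaf-set structures, and the names `shared_prefix_len`, `ring_distance` and `next_hop` are illustrative, not part of Pastry.

```python
B = 4                     # bits per digit (b); assumed value
ID_BITS = 16              # shortened from 128 bits so ids fit in four hex digits
DIGITS = ID_BITS // B
SPACE = 1 << ID_BITS      # size of the circular nodeId space


def shared_prefix_len(a, b):
    """Number of leading base-2^b digits that a and b have in common."""
    for i in range(DIGITS):
        shift = B * (DIGITS - 1 - i)
        if (a >> shift) != (b >> shift):   # first i+1 digits differ
            return i
    return DIGITS


def ring_distance(a, b):
    """Numerical distance between two ids in the circular id space."""
    d = abs(a - b)
    return min(d, SPACE - d)


def next_hop(current, key, known):
    """Pick the node to forward to, or None if current should keep the message."""
    here = shared_prefix_len(current, key)
    # Preferred case: a node sharing a strictly longer prefix with the key.
    longer = [n for n in known if shared_prefix_len(n, key) > here]
    if longer:
        return max(longer, key=lambda n: shared_prefix_len(n, key))
    # Fallback: same prefix length, but numerically closer to the key.
    closer = [n for n in known
              if shared_prefix_len(n, key) == here
              and ring_distance(n, key) < ring_distance(current, key)]
    if closer:
        return min(closer, key=lambda n: ring_distance(n, key))
    return None


print(hex(next_hop(0x65A1, 0xD46A, [0xD13D, 0xD4A2, 0x1234])))
```

Returning `None` models the case where no known node is a better match, so the message is delivered at the current node.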

3.3 Pastry node state:


Each Pastry node maintains a routing table, a neighborhood set and a leaf set. We begin with a description of the routing table. A node's routing table, $R$, is organized into $\lceil \log_{2^b} N \rceil$ rows with $2^b - 1$ entries each.[4]

The $2^b - 1$ entries at row $n$ of the routing table each refer to a node whose nodeId shares the present node's nodeId in the first $n$ digits, but whose $(n+1)$th digit has one of the $2^b - 1$ possible values other than the $(n+1)$th digit in the present node's id. Each entry in the routing table contains the IP address of one of potentially many nodes whose nodeIds have the appropriate prefix; in practice, a node is chosen that is close to the present node, according to the proximity metric.[4]
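To give a feel for the size of the routing table, the row and column counts can be computed directly from $N$ and $b$. The function name `routing_table_shape` and the example values ($N = 10^6$, $b = 4$) are assumptions made for illustration; integer arithmetic is used to avoid rounding problems in the ceiling of the logarithm.

```python
def routing_table_shape(n_nodes, b):
    """Return (rows, entries_per_row) = (ceil(log_{2^b} N), 2^b - 1)."""
    base = 2 ** b
    rows, reach = 0, 1
    while reach < n_nodes:      # smallest rows with base**rows >= n_nodes
        reach *= base
        rows += 1
    return rows, base - 1


rows, cols = routing_table_shape(10**6, 4)
print(rows, cols, rows * cols)
```

With these example values the table has 5 rows of 15 entries, i.e. at most 75 entries for a network of one million nodes.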

3.4 Pastry API:


Next, we briefly outline Pastry's application programming interface (API). The presented API is slightly simplified for clarity. Pastry exports the following operations:

nodeId = pastryInit(Credentials, Application) causes the local node to join an existing Pastry network (or start a new one), initialize all relevant state, and return the local node's nodeId. The application-specific credentials contain information needed to authenticate the local node. The application argument is a handle to the application object that provides the Pastry node with the procedures to invoke when certain events happen, e.g., a message arrival.

route(msg,key) causes Pastry to route the given message to the node with nodeId numerically closest to the key, among all live Pastry nodes.

Applications layered on top of Pastry must export the following operations:

deliver(msg,key) is called by Pastry when a message is received and the local node's nodeId is numerically closest to key, among all live nodes.

forward(msg,key,nextId) is called by Pastry just before a message is forwarded to the node with nodeId = nextId. The application may change the contents of the message or the value of nextId. Setting the nextId to NULL terminates the message at the local node.

newLeafs(leafSet) is called by Pastry whenever there is a change in the local node's leaf set. This provides the application with an opportunity to adjust application-specific invariants based on the leaf set.[4]
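The three callbacks an application must export can be pictured as an abstract interface. The following is a hypothetical Python rendering, not Pastry's actual binding: the class names `PastryApplication` and `RecordingApp`, the snake_case method names, and the convention that `forward` returns the (possibly modified) message and next hop as a tuple are all assumptions made for this sketch.

```python
from abc import ABC, abstractmethod


class PastryApplication(ABC):
    """Callbacks that Pastry would invoke on the application (hypothetical)."""

    @abstractmethod
    def deliver(self, msg, key):
        """Called when the local node is numerically closest to key."""

    @abstractmethod
    def forward(self, msg, key, next_id):
        """Called before forwarding; returns (msg, next_id), None next_id stops."""

    @abstractmethod
    def new_leafs(self, leaf_set):
        """Called whenever the local leaf set changes."""


class RecordingApp(PastryApplication):
    """Toy application that records deliveries and drops selected keys."""

    def __init__(self, blocked_keys=()):
        self.delivered = []
        self.leaf_set = []
        self.blocked_keys = set(blocked_keys)

    def deliver(self, msg, key):
        self.delivered.append((key, msg))

    def forward(self, msg, key, next_id):
        if key in self.blocked_keys:
            return msg, None          # terminate the message at this node
        return msg, next_id           # let Pastry forward it unchanged

    def new_leafs(self, leaf_set):
        self.leaf_set = list(leaf_set)


app = RecordingApp(blocked_keys=[7])
print(app.forward(b"hello", 7, 42))
```

Returning a tuple from `forward` stands in for the ability to change the message or nextId in place, which the API description allows.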

3.5 Self Organization and Adaptation:


3.5.1 Node arrival:

When a new node arrives, it needs to initialize its state tables, and then inform other nodes of its presence. We assume the new node knows initially about a nearby Pastry node A, according to the proximity metric, that is already part of the system. Such a node can be located automatically, for instance, using expanding ring IP multicast, or be obtained by the system administrator through outside channels.

Pastry uses an optimistic approach to controlling concurrent node arrivals and departures. Since the arrival/departure of a node affects only a small number of existing nodes in the system, contention is rare and an optimistic approach is appropriate. Briefly, whenever a node A provides state information to a node B, it attaches a timestamp to the message. B adjusts its own state based on this information and eventually sends an update message to A (e.g., notifying A of its arrival). B attaches the original timestamp, which allows A to check if its state has since changed. In the event that its state has changed, it responds with its updated state and B restarts its operation.

Node departure

Nodes in the Pastry network may fail or depart without warning. In this section, we discuss how the Pastry network handles such node departures. A Pastry node is considered failed when its immediate neighbors in the nodeId space can no longer communicate with the node.

3.5.2 Route Locality:


The entries in the routing table of each Pastry node are chosen to be close to the present node, according to the proximity metric, among all nodes with the desired nodeId prefix. As a result, in each routing step, a message is forwarded to a relatively close node with a nodeId that shares a longer common prefix or is numerically closer to the key than the local node. That is, each step moves the message closer to the destination in the nodeId space, while traveling the least possible distance in the proximity space. Since only local information is used, Pastry minimizes the distance of the next routing step with no sense of global direction. This procedure clearly does not guarantee that the shortest path from source to destination is chosen; however, it does give rise to relatively good routes. Two facts are relevant to this statement. First, given a message was routed from node A to node B at distance d from A, the message cannot subsequently be routed to a node with a distance of less than d from A. This follows directly from the routing procedure, assuming accurate routing tables.

3.5.3 Locating the nearest among k nodes:



Some peer-to-peer applications we have built using Pastry replicate information on the k Pastry nodes with the numerically closest nodeIds to a key in the Pastry nodeId space. PAST, for instance, replicates files in this way to ensure high availability despite node failures. Pastry naturally routes a message with the given key to the live node with the numerically closest nodeId, thus ensuring that the message reaches one of the k nodes as long as at least one of them is live. Moreover, Pastry's locality properties make it likely that, along the route from a client to the numerically closest node, the message first reaches a node near the client, in terms of the proximity metric, among the k numerically closest nodes. This is useful in applications such as PAST, because retrieving a file from a nearby node minimizes client latency and network load. Moreover, observe that due to the random assignment of nodeIds, nodes with adjacent nodeIds are likely to be widely dispersed in the network. Thus, it is important to direct a lookup query towards a node that is located relatively near the client.[4]

Pastry uses a heuristic to overcome the prefix mismatch issue described above. The heuristic is based on estimating the density of nodeIds in the nodeId space using local information. Based on this estimation, the heuristic detects when a message approaches the set of k numerically closest nodes, and then switches to numerically nearest address based routing to locate the nearest replica. Pastry is able to locate the nearest node in over 75%, and one of the two nearest nodes in over 91% of all queries.

3.5.4 Arbitrary node failures and network partitions:


It is assumed that Pastry nodes fail silently. Here we discuss how a Pastry network could deal with arbitrary node failures, where a failed node continues to be responsive, but behaves incorrectly or even maliciously. The Pastry routing scheme is deterministic. Thus, it is vulnerable to malicious or failed nodes along the route that accept messages but do not correctly forward them. Repeated queries could thus fail each time, since they normally take the same route.

In applications where arbitrary node failures must be tolerated, the routing can be randomized. In order to avoid routing loops, a message must always be forwarded to a node that shares a longer prefix with the destination, or shares the same prefix length as the current node but is numerically closer in the nodeId space than the current node. However, the choice among multiple nodes that satisfy this criterion can be made randomly. In practice, the probability distribution should be biased towards the best choice to ensure low average route delay. In the event of a malicious or failed node along the path, the query may have to be repeated several times by the client, until a route is chosen that avoids the bad node. Furthermore, the protocols for node join and node failure can be extended to tolerate misbehaving nodes.

Another challenge is posed by IP routing anomalies in the Internet that cause IP hosts to be unreachable from certain IP hosts but not others. Pastry routing is tolerant of such anomalies; Pastry nodes are considered live and remain reachable in the overlay network as long as they are able to communicate with their immediate neighbors in the nodeId space. However, Pastry's self-organization protocol may cause the creation of multiple, isolated Pastry overlay networks during periods of IP routing failures. Because Pastry relies almost exclusively on information exchange within the overlay network to self-organize, such isolated overlays may persist after full IP connectivity resumes.

One solution to this problem involves the use of IP multicast. Pastry nodes can periodically perform an expanding ring multicast search for other Pastry nodes in their vicinity. If isolated Pastry overlays exist, they will be discovered eventually, and reintegrated. To minimize the cost, this procedure can be performed randomly and infrequently by Pastry nodes, only within a limited range of IP routing hops from the node, and only if no search was performed by another nearby Pastry node recently. As an added benefit, the results of this search can also be used to improve the quality of the routing tables.
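The randomized forwarding rule discussed in this section, choosing randomly among valid next hops while favoring the best one, can be sketched as a weighted draw. The geometric weighting, the `bias` parameter, and the function name `pick_next_hop` are assumptions made for illustration; the text only requires that the distribution be biased towards the best choice.

```python
import random


def pick_next_hop(candidates, rng=random, bias=0.5):
    """Pick one candidate at random, favoring earlier (better) entries.

    candidates must already satisfy the loop-avoidance criterion and be
    sorted best first; the candidate at rank r gets weight bias**r.
    """
    if not candidates:
        return None
    weights = [bias ** rank for rank in range(len(candidates))]
    return rng.choices(candidates, weights=weights, k=1)[0]


rng = random.Random(0)   # fixed seed only to make the demonstration repeatable
picks = [pick_next_hop(["best", "second", "third"], rng) for _ in range(1000)]
print({name: picks.count(name) for name in ("best", "second", "third")})
```

A repeated query then has a chance of taking a different route and avoiding a bad node, while most messages still take the preferred hop.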


CHAPTER 4

Kademlia protocol:

4.1 Introduction:


Kademlia is a peer-to-peer ⟨key,value⟩ storage and lookup system. Kademlia has a number of desirable features not simultaneously offered by any previous peer-to-peer system. It minimizes the number of configuration messages nodes must send to learn about each other. Configuration information spreads automatically as a side-effect of key lookups. Nodes have enough knowledge and flexibility to route queries through low-latency paths. Kademlia uses parallel, asynchronous queries to avoid timeout delays from failed nodes. The algorithm with which nodes record each other's existence resists certain basic denial-of-service attacks. Finally, several important properties of Kademlia can be formally proven using only weak assumptions on uptime distributions (assumptions we validate with measurements of existing peer-to-peer systems).

Kademlia takes the basic approach of many peer-to-peer systems. Keys are opaque, 160-bit quantities (e.g., the SHA-1 hash of some larger data). Participating computers each have a node ID in the 160-bit key space. ⟨key,value⟩ pairs are stored on nodes with IDs close to the key for some notion of closeness. Finally, a node-ID-based routing algorithm lets anyone locate servers near a destination key.

Many of Kademlia's benefits result from its use of a novel XOR metric for distance between points in the key space. XOR is symmetric, allowing Kademlia participants to receive lookup queries from precisely the same distribution of nodes contained in their routing tables. Without this property, systems such as Chord do not learn useful routing information from queries they receive. Worse yet, because of the asymmetry of Chord's metric, Chord routing tables are rigid. Each entry in a Chord node's finger table must store the precise node preceding an interval in the ID space; any node actually in the interval will be greater than some keys in the interval, and thus very far from the key. Kademlia, in contrast, can send a query to any node within an interval, allowing it to select routes based on latency or even send parallel asynchronous queries.[1]


4.2 System Description:


Each Kademlia node has a 160-bit node ID. Node IDs are constructed as in Chord, but to simplify this paper we assume machines just choose a random, 160-bit identifier when joining the system. Every message a node transmits includes its node ID, permitting the recipient to record the sender's existence if necessary. Keys, too, are 160-bit identifiers. To publish and find ⟨key,value⟩ pairs, Kademlia relies on a notion of distance between two identifiers. Given two 160-bit identifiers, $x$ and $y$, Kademlia defines the distance between them as their bitwise exclusive or (XOR) interpreted as an integer, $d(x, y) = x \oplus y$.

We first note that XOR is a valid, albeit non-Euclidean, metric. It is obvious that $d(x, x) = 0$, $d(x, y) > 0$ if $x \neq y$, and $\forall x, y : d(x, y) = d(y, x)$. XOR also offers the triangle property: $d(x, y) + d(y, z) \geq d(x, z)$. The triangle property follows from the fact that $d(x, z) = d(x, y) \oplus d(y, z)$ and $\forall a \geq 0, b \geq 0 : a + b \geq a \oplus b$.

Like Chord's clockwise circle metric, XOR is unidirectional. For any given point $x$ and distance $\Delta > 0$, there is exactly one point $y$ such that $d(x, y) = \Delta$. Unidirectionality ensures that all lookups for the same key converge along the same path, regardless of the originating node. Thus, caching ⟨key,value⟩ pairs along the lookup path alleviates hot spots. Like Pastry and unlike Chord, the XOR topology is also symmetric ($d(x, y) = d(y, x)$ for all $x$ and $y$).[1]
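The metric properties listed above are easy to check exhaustively on a small identifier space. The sketch below assumes 4-bit identifiers instead of 160-bit ones so that every combination can be enumerated; the function name `distance` and the variable names are illustrative.

```python
from itertools import product


def distance(x, y):
    """Kademlia distance: bitwise XOR interpreted as an integer."""
    return x ^ y


BITS = 4                       # tiny id space (assumed) so we can enumerate it
ids = range(2 ** BITS)

identity = all(distance(x, x) == 0 for x in ids)
positive = all(distance(x, y) > 0
               for x, y in product(ids, repeat=2) if x != y)
symmetric = all(distance(x, y) == distance(y, x)
                for x, y in product(ids, repeat=2))
triangle = all(distance(x, y) + distance(y, z) >= distance(x, z)
               for x, y, z in product(ids, repeat=3))
# Unidirectionality: for each x and each delta > 0 exactly one y matches.
unidirectional = all(sum(1 for y in ids if distance(x, y) == delta) == 1
                     for x in ids for delta in range(1, 2 ** BITS))

print(identity, positive, symmetric, triangle, unidirectional)
```

The unique point at distance $\Delta$ from $x$ is simply $x \oplus \Delta$, which is why the last check succeeds.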


4.3 Node State:


Kademlia nodes store contact information about each other. For each $0 \leq i < 160$, a node keeps a list of ⟨IP address, UDP port, node ID⟩ triples for nodes at distance between $2^i$ and $2^{i+1}$ from itself. We call these lists k-buckets. Each k-bucket is kept sorted by time last seen: least-recently seen node at the head, most-recently seen at the tail. For small values of i, the k-buckets will generally be empty (as no appropriate nodes will exist). For large values of i, the lists can grow up to size k, where k is a system-wide replication parameter. k is chosen such that any given k nodes are very unlikely to fail within an hour of each other (for example k = 20).

When a Kademlia node receives any message (request or reply) from another node, it updates the appropriate k-bucket for the sender's node ID. If the sending node already exists in the recipient's k-bucket, the recipient moves it to the tail of the list. If the node is not already in the appropriate k-bucket and the bucket has fewer than k entries, then the recipient just inserts the new sender at the tail of the list. If the appropriate k-bucket is full, however, then the recipient pings the k-bucket's least-recently seen node to decide what to do. If the least-recently seen node fails to respond, it is evicted from the k-bucket and the new sender inserted at the tail. Otherwise, if the least-recently seen node responds, it is moved to the tail of the list, and the new sender's contact is discarded.[1]

k-buckets effectively implement a least-recently seen eviction policy, except that live nodes are never removed from the list. This preference for old contacts is driven by our analysis of Gnutella trace data collected by Saroiu et al. Figure 1 shows the percentage of Gnutella nodes that stay online another hour as a function of current uptime. The longer a node has been up, the more likely it is to remain up another hour. By keeping the oldest live contacts around, k-buckets maximize the probability that the nodes they contain will remain online. A second benefit of k-buckets is that they provide resistance to certain DoS attacks. One cannot flush nodes' routing state by flooding the system with new nodes. Kademlia nodes will only insert the new nodes in the k-buckets when old nodes leave the system.
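The k-bucket update rule can be sketched with an ordered dictionary whose first entry is the least-recently seen contact. This is a minimal sketch, not the simulator's implementation: the class name `KBucket`, the helper `bucket_index`, and the `ping` callback (a stand-in for the PING RPC that returns True when the node answers) are assumptions made for illustration.

```python
from collections import OrderedDict


def bucket_index(own_id, other_id):
    """Index i such that 2^i <= distance < 2^(i+1); ids must differ."""
    return (own_id ^ other_id).bit_length() - 1


class KBucket:
    """Contacts ordered from least-recently seen (head) to most-recently seen."""

    def __init__(self, k=20):
        self.k = k
        self.contacts = OrderedDict()   # node_id -> (ip, udp_port)

    def update(self, node_id, address, ping):
        """Record that a message from node_id was just received."""
        if node_id in self.contacts:
            self.contacts.move_to_end(node_id)       # known: move to the tail
            self.contacts[node_id] = address
        elif len(self.contacts) < self.k:
            self.contacts[node_id] = address         # room left: append at tail
        else:
            oldest = next(iter(self.contacts))       # head of the list
            if ping(oldest):
                self.contacts.move_to_end(oldest)    # still alive: keep it,
            else:                                    # discard the new sender
                del self.contacts[oldest]            # dead: evict and insert
                self.contacts[node_id] = address


bucket = KBucket(k=2)
bucket.update(1, ("10.0.0.1", 4000), ping=lambda _id: True)
bucket.update(2, ("10.0.0.2", 4000), ping=lambda _id: True)
bucket.update(3, ("10.0.0.3", 4000), ping=lambda _id: True)   # bucket full, 1 alive
print(list(bucket.contacts))
```

Because a responsive old contact is never displaced, flooding the bucket with new node IDs cannot push live nodes out, which is the DoS resistance described above.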

4.4 Kademlia Protocol Brief:

The Kademlia protocol consists of four RPCs: PING, STORE, FIND_NODE, and FIND_VALUE. The PING RPC probes a node to see if it is online. STORE instructs a node to store a ⟨key,value⟩ pair for later retrieval. FIND_NODE takes a 160-bit ID as an argument. The recipient of the RPC returns ⟨IP address, UDP port, node ID⟩ triples for the k nodes it knows about closest to the target ID. These triples can come from a single k-bucket, or they may come from multiple k-buckets if the closest k-bucket is not full. In any case, the RPC recipient must return k items (unless there are fewer than k nodes in all its k-buckets combined, in which case it returns every node it knows about).

FIND_VALUE behaves like FIND_NODE, returning ⟨IP address, UDP port, node ID⟩ triples, with one exception. If the RPC recipient has received a STORE RPC for the key, it just returns the stored value.

In all RPCs, the recipient must echo a 160-bit random RPC ID, which provides some resistance to address forgery. PINGs can also be piggy-backed on RPC replies for the RPC recipient to obtain additional assurance of the sender's network address.

The most important procedure a Kademlia participant must perform is to locate the k closest nodes to some given node ID. We call this procedure a node lookup. Kademlia employs a recursive algorithm for node lookups. The lookup initiator starts by picking $\alpha$ nodes from its closest non-empty k-bucket (or, if that bucket has fewer than $\alpha$ entries, it just takes the $\alpha$ closest nodes it knows of). The initiator then sends parallel, asynchronous FIND_NODE RPCs to the $\alpha$ nodes it has chosen. $\alpha$ is a system-wide concurrency parameter.

In the recursive step, the initiator resends the FIND_NODE to nodes it has learned about from previous RPCs. (This recursion can begin before all $\alpha$ of the previous RPCs have returned.) Of the k nodes the initiator has heard of closest to the target, it picks $\alpha$ that it has not yet queried and resends the FIND_NODE RPC to them. Nodes that fail to respond quickly are removed from consideration until and unless they do respond. If a round of FIND_NODEs fails to return a node any closer than the closest already seen, the initiator resends the FIND_NODE to all of the k closest nodes it has not already queried. The lookup terminates when the initiator has queried and gotten responses from the k closest nodes it has seen. When $\alpha = 1$ the lookup algorithm resembles Chord's in terms of message cost and the latency of detecting failed nodes.

Most operations are implemented in terms of the above lookup procedure. To store a ⟨key,value⟩ pair, a participant locates the k closest nodes to the key and sends them STORE RPCs. Additionally, each node republishes the ⟨key,value⟩ pairs that it has every hour. This ensures persistence (as we show in our proof sketch) of the ⟨key,value⟩ pair with very high probability. Generally, we also require the original publishers of a ⟨key,value⟩ pair to republish it every 24 hours. Otherwise, all ⟨key,value⟩ pairs expire 24 hours after the original publishing, in order to limit stale information in the system. Finally, in order to sustain consistency in the publishing-searching life-cycle of a ⟨key,value⟩ pair, we require that whenever a node w observes a new node u which is closer to some of w's ⟨key,value⟩ pairs, w replicates these pairs to u without removing them from its own database.

To find a ⟨key,value⟩ pair, a node starts by performing a lookup to find the k nodes with IDs closest to the key. However, value lookups use FIND_VALUE rather than FIND_NODE RPCs. Moreover, the procedure halts immediately when any node returns the value. For caching purposes, once a lookup succeeds, the requesting node stores the ⟨key,value⟩ pair at the closest node it observed to the key that did not return the value.

Because of the unidirectionality of the topology, future searches for the same key are likely to hit cached entries before querying the closest node. During times of high popularity for a certain key, the system might end up caching it at many nodes. To avoid over-caching, we make the expiration time of a ⟨key,value⟩ pair in any node's database exponentially inversely proportional to the number of nodes between the current node and the node whose ID is closest to the key ID. While simple LRU eviction would result in a similar lifetime distribution, there is no natural way of choosing the cache size, since nodes have no a priori knowledge of how many values the system will store.

Buckets will generally be kept constantly fresh, due to the traffic of requests traveling through nodes. To avoid pathological cases when no traffic exists, each node refreshes a bucket in whose range it has not performed a node lookup within an hour. Refreshing means picking a random ID in the bucket's range and performing a node search for that ID.

To join the network, a node u must have a contact to an already participating node w. u inserts w into the appropriate k-bucket. u then performs a node lookup for its own node ID. Finally, u refreshes all k-buckets further away than its closest neighbor. During the refreshes, u both populates its own k-buckets and inserts itself into other nodes' k-buckets as necessary.
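The node lookup procedure can be illustrated with a simplified, sequential simulation. Several assumptions apply: the network is a plain dictionary mapping each node ID to the set of IDs it knows, the "parallel" RPCs of one round are issued one after another, every node answers, the initiator seeds the lookup with its k closest known contacts rather than with one bucket, and the names `find_node` and `node_lookup` and the concurrency parameter `alpha` are illustrative.

```python
def find_node(network, node, target, k):
    """Stand-in for the FIND_NODE RPC: the k contacts node knows closest to target."""
    return sorted(network[node], key=lambda n: n ^ target)[:k]


def node_lookup(network, initiator, target, k=3, alpha=2):
    """Return the k closest nodes to target that the initiator can discover."""
    shortlist = set(find_node(network, initiator, target, k))
    queried = set()
    while True:
        # The k closest nodes heard of so far, nearest first.
        closest = sorted(shortlist, key=lambda n: n ^ target)[:k]
        # Pick up to alpha of them that have not been queried yet.
        pending = [n for n in closest if n not in queried][:alpha]
        if not pending:
            return closest          # all k closest nodes have responded
        for n in pending:
            queried.add(n)
            learned = find_node(network, n, target, k)
            shortlist.update(c for c in learned if c != initiator)


# Toy overlay (assumed): each node only knows its neighbors in a chain.
network = {1: {3}, 3: {1, 6}, 6: {3, 9}, 9: {6, 12}, 12: {9, 14}, 14: {12}}
print(node_lookup(network, initiator=1, target=14, k=2, alpha=1))
```

In this toy overlay the initiator only knows node 3, yet each round reveals contacts closer to the target until the two closest nodes, 14 and 12, have both been queried.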

4.5 Sketch of proof:


To demonstrate proper function of our system, we need to prove that most operations take $\lceil \log n \rceil + c$ time for some small constant $c$, and that a ⟨key,value⟩ lookup returns a key stored in the system with overwhelming probability.

We start with some definitions. For a k-bucket covering the distance range $[2^i, 2^{i+1})$, define the index of the bucket to be $i$. Define the depth, $h$, of a node to be $160 - i$, where $i$ is the smallest index of a non-empty bucket. Define node $y$'s bucket height in node $x$ to be the index of the bucket into which $x$ would insert $y$ minus the index of $x$'s least significant empty bucket. Because node IDs are randomly chosen, it follows that highly non-uniform distributions are unlikely. Thus with overwhelming probability the height of any given node will be within a constant of $\log n$ for a system with $n$ nodes. Moreover, the bucket height of the closest node to an ID in the $k$th-closest node will likely be within a constant of $\log k$.

Our next step will be to assume the invariant that every k-bucket of every node contains at least one contact if a node exists in the appropriate range. Given this assumption, we show that the node lookup procedure is correct and takes logarithmic time. Suppose the closest node to the target ID has depth $h$. If none of this node's $h$ most significant k-buckets is empty, the lookup procedure will find a node half as close (or rather whose distance is one bit shorter) in each step, and thus turn up the node in $h - \log k$ steps. If one of the node's k-buckets is empty, it could be the case that the target node resides in the range of the empty bucket. In this case, the final steps will not decrease the distance by half. However, the search will proceed exactly as though the bit in the key corresponding to the empty bucket had been flipped. Thus, the lookup algorithm will always return the closest node in $h - \log k$ steps. Moreover, once the closest node is found, the concurrency switches from $\alpha$ to $k$. The number of steps to find the remaining $k - 1$ closest nodes can be no more than the bucket height of the closest node in the $k$th-closest node, which is unlikely to be more than a constant plus $\log k$.

To prove the correctness of the invariant, first consider the effects of bucket refreshing if the invariant holds. After being refreshed, a bucket will either contain $k$ valid nodes or else contain every node in its range if fewer than $k$ exist. (This follows from the correctness of the node lookup procedure.) New nodes that join will also be inserted into any buckets that are not full. Thus, the only way to violate the invariant is for there to exist $k + 1$ or more nodes in the range of a particular bucket, and for the $k$ actually contained in the bucket all to fail with no intervening lookups or refreshes. However, $k$ was precisely chosen for the probability of simultaneous failure within an hour (the maximum refresh time) to be small. In practice, the probability of failure is much smaller than the probability of $k$ nodes leaving within an hour, as every incoming or outgoing request updates nodes' buckets.
This results from the symmetry of the XOR metric, because the IDs of the nodes with which a given node communicates during an incoming or outgoing request are distributed exactly compatibly with the node's bucket ranges.

Moreover, even if the invariant does fail for a single bucket in a single node, this will only affect running time (by adding a hop to some lookups), not correctness of node lookups. For a lookup to fail, k nodes on a lookup path must each lose k nodes in the same bucket with no intervening lookups or refreshes. If the different nodes' buckets have no overlap, this happens with probability $2^{-k^2}$. Otherwise, nodes appearing in multiple other nodes' buckets will likely have longer uptimes and thus lower probability of failure.
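The bucket index defined earlier in this section (bucket i covers XOR distances from 2^i up to, but not including, 2^(i+1)) can be illustrated with a small sketch. The function name `bucketIndex` and the use of 64-bit identifiers are assumptions made for brevity; Kademlia itself uses 160-bit IDs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of OverSim): returns the index i of the
// k-bucket whose range [2^i, 2^(i+1)) contains the XOR distance between
// two node IDs. 64-bit IDs are an assumption; Kademlia uses 160 bits.
int bucketIndex(std::uint64_t self, std::uint64_t other)
{
    std::uint64_t distance = self ^ other;  // XOR metric
    assert(distance != 0 && "a node does not store itself in a bucket");

    int index = 0;
    while ((distance >>= 1) != 0) {  // find the most significant set bit
        ++index;
    }
    return index;
}
```

For example, IDs 5 and 2 have XOR distance 7, which falls in the range [4, 8), so the contact belongs in bucket 2.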

Now we look at a ⟨key,value⟩ pair's recovery. When a ⟨key,value⟩ pair is published, it is populated at the k nodes closest to the key. It is also re-published every hour. Since even new nodes (the least reliable) have probability 1/2 of lasting one hour, after one hour the ⟨key,value⟩ pair will still be present on one of the k nodes closest to the key with probability $1 - 2^{-k}$. This property is not violated by the insertion of new nodes that are close to the key, because as soon as such nodes are inserted, they contact their closest nodes in order to fill their buckets and thereby receive any nearby ⟨key,value⟩ pairs they should store. Of course, if the k closest nodes to a key fail and the ⟨key,value⟩ pair has not been cached elsewhere, Kademlia will lose the pair.
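The two probabilities used in this argument are easy to evaluate numerically. The sketch below assumes, as the text does, that each node independently survives an hour with probability 1/2; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Probability that at least one of the k closest nodes still holds the
// pair after one hour: 1 - 2^(-k). Assumes independent failures with
// per-node survival probability 1/2.
double pairSurvivalProbability(int k)
{
    assert(k >= 0);
    return 1.0 - std::pow(0.5, k);
}

// Probability 2^(-k^2) that k nodes on a lookup path each lose all k
// contacts of the same bucket, assuming non-overlapping buckets.
double lookupFailureProbability(int k)
{
    assert(k >= 0);
    return std::pow(0.5, static_cast<double>(k) * k);
}
```

Even for a small k such as 2, the pair survives with probability 0.75 and the lookup-failure bound is 1/16; both improve rapidly as k grows.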

4.6 Discussion
The XOR-topology-based routing that we use very much resembles the first step in the routing algorithms of Pastry, Tapestry, and Plaxton's distributed search algorithm. All three of these, however, run into problems when they choose to approach the target node b bits at a time (for acceleration purposes). Without the XOR topology, there is a need for an additional algorithmic structure for discovering the target within the nodes that share the same prefix but differ in the next b-bit digit. All three algorithms resolve this problem in different ways, each with its own drawbacks; they all require secondary routing tables of size $O(2^b)$ in addition to the main tables of size $O(2^b \log_{2^b} n)$. This increases the cost of bootstrapping and maintenance, complicates the protocols, and for Pastry and Tapestry prevents a formal analysis of correctness and consistency. Plaxton has a proof, but the system is less geared for highly faulty environments like peer-to-peer networks.

Kademlia, in contrast, can easily be optimized with a base other than 2. We configure our bucket table so as to approach the target b bits per hop. This requires having one bucket for each range of nodes at a distance $[j \cdot 2^{160-(i+1)b}, (j+1) \cdot 2^{160-(i+1)b}]$ from us, for each $0 < j < 2^b$ and $0 \le i < 160/b$, which amounts to expected no more than $(2^b - 1) \log_{2^b} n$ buckets with actual entries. The implementation currently uses b = 5.
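To get a feel for the bucket-count bound, the expression (2^b - 1) log_{2^b} n can be evaluated directly. This is only an arithmetic sketch; `expectedBuckets` is a hypothetical name and the network sizes used below are illustrative, not measurements.

```cpp
#include <cassert>
#include <cmath>

// Evaluates the bound (2^b - 1) * log_{2^b}(n) on the expected number of
// buckets with actual entries, for b bits per hop and n nodes.
double expectedBuckets(int b, double n)
{
    assert(b >= 1 && n >= 1.0);
    const double digits = std::log2(n) / b;       // log base 2^b of n
    return (std::pow(2.0, b) - 1.0) * digits;
}
```

With b = 1 and n = 1024 the bound is 10 buckets; with b = 5 and n = 2^20 it is 31 * 4 = 124 buckets, which shows the routing-table cost of taking fewer, larger hops.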

4.7 Summary
With its novel XOR-based metric topology, Kademlia is the first peer-to-peer system to combine provable consistency and performance, latency-minimizing routing, and a symmetric, unidirectional topology. Kademlia furthermore introduces a concurrency parameter, $\alpha$, that lets people trade a constant factor in bandwidth for asynchronous lowest-latency hop selection and delay-free fault recovery. Finally, Kademlia is the first peer-to-peer system to exploit the fact that node failures are inversely related to uptime.

CHAPTER 5
Koorde Protocol

5.1 Introduction:
Koorde is a routing protocol that shares almost all aspects with Chord but meets, to within a constant factor, the lower bounds given in Section 5.6: it achieves O(log n) hops with degree 2, or O(log n / log log n) hops with degree log n, and it is fault tolerant. Like Chord, it has O(log n) load balance, or constant load balance at the cost of multiplying the degree by O(log n). In the degree-2 configuration each node has two outgoing and two incoming neighbors, and the routing load is well balanced. Identifying n distinct nodes requires b = log n bits, so a route takes log n hops.[3]

5.2 DHT Routing:


Distributed hash tables (DHTs) implement a hash table interface: they map any ID to the machine responsible for that ID, in a consistent fashion. They are a standard primitive for P2P systems.

Machines are not all aware of each other. Each machine tracks a small set of neighbors, and a request is routed to the responsible node via a sequence of hops between neighbors.

5.3 Performance Measures:


Degree: how many neighbors each node has.

Hop count: how long it takes to reach any destination node.

Fault tolerance: how many nodes can fail.

Maintenance overhead: e.g., making sure neighbors are up.

Load balance: how evenly keys are distributed among nodes.

5.4 Tradeoffs:
With a larger degree, we hope to achieve a smaller hop count and better fault tolerance.

But a higher degree implies more routing table state per node and higher maintenance overhead to keep routing tables up to date.

Load balance is an orthogonal issue.

5.5 Current Systems:


Chord, Kademlia, Pastry, and Tapestry all have O(log n) degree, O(log n) hop count, and an O(log n) load balance ratio. Chord achieves O(1) load balance by running O(log n) virtual nodes per real node, which multiplies the degree to $O(\log^2 n)$.


5.6 Lower Bounds to Shoot For:


Theorem: if the maximum degree is d, then the hop count is at least $\log_d n$. Proof: fewer than $d^h$ nodes lie at distance h. This allows degree O(1) with O(log n) hops, or degree O(log n) with O(log n / log log n) hops.

Theorem: to tolerate half of the nodes failing (e.g. a network partition), the degree must be Ω(log n). Proof: with a smaller degree, some node loses all of its neighbors. With degree Ω(log n) needed anyway, one might as well take O(log n / log log n) hops!
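The counting argument behind the first theorem can be written out as follows. This is a sketch of the standard derivation, stated up to an additive constant; it assumes d ≥ 2 and that every one of the n nodes must be reachable within h hops.

```latex
% Within h hops a node of out-degree at most d (d >= 2) reaches at most
% 1 + d + ... + d^h < d^{h+1} nodes, and all n nodes must be reachable.
\begin{align*}
  n &\le \sum_{i=0}^{h} d^{i} < d^{h+1}
    &&\Longrightarrow\quad h > \log_d n - 1, \\
  d &= 2
    &&\Longrightarrow\quad h = \Omega(\log n), \\
  d &= \log n
    &&\Longrightarrow\quad h = \Omega\!\left(\frac{\log n}{\log\log n}\right).
\end{align*}
```

The last line uses $\log_d n = \log n / \log d$ with $d = \log n$.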

5.7 Koorde Idea:


Chord acts like a hypercube: each finger flips one bit, giving degree log n (log n different flips) and diameter log n.

Koorde uses a de Bruijn network: each finger shifts in one bit, giving degree 2 (2 possible bits to shift in) and diameter log n.
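The hypercube analogy can be made concrete with a few lines of code. Note that this sketches the idealized hypercube the comparison refers to (neighbors differ in exactly one bit), not Chord's actual finger table; `hypercubeNeighbors` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Neighbors of node u in a b-dimensional hypercube: one neighbor per bit
// position, obtained by flipping that bit. The degree is therefore b.
std::vector<std::uint32_t> hypercubeNeighbors(std::uint32_t u, unsigned b)
{
    assert(b >= 1 && b < 32);
    std::vector<std::uint32_t> neighbors;
    for (unsigned i = 0; i < b; ++i) {
        neighbors.push_back(u ^ (1u << i));  // flip bit i
    }
    return neighbors;
}
```

For b = 3, node 000 has the three neighbors 001, 010 and 100, so the degree grows with log n, whereas a de Bruijn node always has two.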

5.8 De Bruijn Graph:


Nodes are b-bit integers (b = log n). Node u has 2 neighbors, obtained by bit shifts: $2u \bmod 2^b$ and $(2u + 1) \bmod 2^b$.
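The two de Bruijn neighbors of a node can be computed with a shift and a mask. The following is a minimal sketch; the function name `deBruijnNeighbors` and the 32-bit identifier width are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Returns the two out-neighbors of node u in a de Bruijn graph on b-bit
// identifiers: (2u mod 2^b, (2u + 1) mod 2^b).
std::pair<std::uint32_t, std::uint32_t>
deBruijnNeighbors(std::uint32_t u, unsigned b)
{
    assert(b >= 1 && b < 32);
    const std::uint32_t mask = (1u << b) - 1u;       // keeps the low b bits
    assert(u <= mask);
    const std::uint32_t shifted = (u << 1) & mask;   // 2u mod 2^b
    return {shifted, shifted | 1u};                  // low bit 0 or 1 shifted in
}
```

For b = 3, node 011 has neighbors 110 and 111, and node 110 has neighbors 100 and 101: the top bit drops out and a new low bit is shifted in.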


5.9 De Bruijn Routing:


Shift in the destination bits one by one; b hops complete the route. Example: routing from 000 to 110 visits 000 → 001 → 011 → 110.

5.10 Routing Code:


Procedure u.LOOKUP(k, toShift)
    /* u is machine, k is target key,
       toShift is target bits not yet shifted in */
    if k = u then
        Return u                /* as owner for k */
    else
        /* do de Bruijn hop */
        t = u ∘ topBit(toShift)
        Return t.LOOKUP(k, toShift << 1)

Initially call self.LOOKUP(k, k)
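The lookup procedure can be traced with a short iterative sketch. It assumes a complete de Bruijn graph in which every b-bit identifier is a live node, so it leaves out the handling of sparse identifier spaces that Koorde adds on top; `deBruijnRoute` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the identifiers visited when routing from `start` to `key` in a
// complete de Bruijn graph on b-bit identifiers (assumption: all 2^b
// nodes exist). Each hop shifts in the top remaining bit of the key.
std::vector<std::uint32_t>
deBruijnRoute(std::uint32_t start, std::uint32_t key, unsigned b)
{
    assert(b >= 1 && b < 32);
    const std::uint32_t mask = (1u << b) - 1u;
    assert(start <= mask && key <= mask);

    std::vector<std::uint32_t> path{start};
    std::uint32_t u = start;
    std::uint32_t toShift = key;                 // target bits not yet shifted in
    for (unsigned hop = 0; hop < b && u != key; ++hop) {
        const std::uint32_t topBit = (toShift >> (b - 1)) & 1u;
        u = ((u << 1) & mask) | topBit;          // de Bruijn hop
        toShift = (toShift << 1) & mask;         // drop the bit just used
        path.push_back(u);
    }
    return path;
}
```

Calling `deBruijnRoute(0, 6, 3)` yields the path 0, 1, 3, 6, that is 000, 001, 011, 110, which takes b = 3 hops. The loop stops early when the current node already equals the key, mirroring the `if k = u` test in the pseudocode.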

CHAPTER 6
Simulation Framework and Tools Used

6.1 OMNeT++ 4.2.1

6.1.1 Introduction
OMNeT++ is a discrete event simulation environment. Its primary application area is the simulation of communication networks, but because of its generic and flexible architecture, it is also successfully used in other areas such as the simulation of complex IT systems, queueing networks, and hardware architectures. OMNeT++ provides a component architecture for models. Components (modules) are programmed in C++, then assembled into larger components and models using a high-level language (NED). Reusability of models comes for free. OMNeT++ has extensive GUI support, and due to its modular architecture, the simulation kernel (and models) can be embedded easily into your applications. Although OMNeT++ is not a network simulator itself, it is currently gaining widespread popularity as a network simulation platform in the scientific community as well as in industrial settings, and is building up a large user community.[11]

6.1.2 Components

simulation kernel library
compiler for the NED topology description language
OMNeT++ IDE based on the Eclipse platform
GUI for simulation execution, links into simulation executable (Tkenv)
command-line user interface for simulation execution (Cmdenv)
utilities (makefile creation tool, etc.)
documentation, sample simulations, etc.

6.1.3 Platforms
OMNeT++ runs on Linux (including Ubuntu 11.10), Mac OS X, other Unix-like systems, and Windows (XP, Win2K, Vista, 7). The OMNeT++ IDE requires Linux (32- or 64-bit), Mac OS X 10.5, or Windows XP.[7]

6.1.4 ScreenShots


6.2 OverSim/INET-OverSim

6.2.1 Introduction
OverSim is an OMNeT++-based open-source simulation framework for overlay and peer-to-peer networks, developed at the Institute of Telematics, University of Karlsruhe (TH), Germany. The simulator contains several models for structured (e.g. Chord, Kademlia, Pastry) and unstructured (e.g. GIA) peer-to-peer protocols. One example application of the framework is an implementation of a peer-to-peer SIP communications network.[6]

6.2.2 Oversim Features:


Some of the main features of the OverSim simulation framework include:

Flexibility: The simulator can simulate both structured and unstructured overlay networks (currently Chord, Pastry, Koorde, Broose, Kademlia, and GIA are implemented). The modular design and use of the Common API facilitate extension with new features or protocols. Module behavior can easily be customized by specifying parameters in a human-readable configuration file.

Interactive GUI: In order to validate and debug new or existing overlay protocols you can make use of the GUI of OMNeT++, which visualizes network topologies, messages, and node state variables like the routing table.

Exchangeable Underlying Network Models: OverSim has a flexible underlying network scheme, which on the one hand provides a fully configurable network topology with realistic bandwidths, packet delays, and packet losses (INETUnderlay), and on the other hand a fast and simple alternative model for high simulation performance (SimpleUnderlay).[5]

Scalability: OverSim was designed with performance in mind. On a modern desktop PC a typical Chord network of 10,000 nodes can be simulated in real time. The simulator was used to successfully simulate networks of up to 100,000 nodes.

Base Overlay Class: The base overlay class facilitates the implementation of structured peer-to-peer protocols by providing an RPC interface, a generic lookup class, and a common API key-based routing interface to the application.

Reuse of Simulation Code: The different implementations of overlay protocols are reusable for real network applications, so that researchers can validate the simulator framework results by comparing them to the results from real-world test networks like PlanetLab. Therefore, the simulation framework is able to handle and assemble real network packets and to communicate with other implementations of the same overlay protocol.

Statistics: The simulator collects various statistical data such as sent, received, or forwarded network traffic per node, successful or unsuccessful packet delivery, and packet hop count.

INET: The INET framework is an open-source communication networks simulation package, written for the OMNEST/OMNeT++ simulation system. The INET framework contains models for several Internet protocols: beyond TCP and IP there is UDP, Ethernet, PPP, and MPLS with LDP and RSVP-TE signalling.[5]

6.3 Scave Tool:
Scave is the result analysis tool of OMNeT++, and its task is to help the user process and visualize simulation results saved into vector and scalar files. Scave is designed so that the user can work equally well on the output of a single simulation run (one or two files) and the result of simulation batches (which may be several hundred files, possibly in multiple directories).[12] Ad-hoc browsing of the data is supported in addition to systematic and repeatable processing. With the latter, all processing and charts are stored as recipes. For example, if simulations need to be re-run due to a model bug or misconfiguration, existing charts do not need to be drawn all over again. Simply replacing the old result files with the new ones will result in the charts being automatically displayed with the new data.

Scave is implemented as a multi-page editor. What the editor edits is the recipe, which includes what files to take as inputs, what data to select from them, what (optional) processing to apply, and what kind of charts to create from them. The pages (tabs) of the editor roughly correspond to these steps. You will see that Scave is much more than just a union of the OMNeT++ 3.x Scalars and Plove tools.

The first page displays the result files that serve as input for the analysis. The upper half specifies what files to select, by explicit filenames or by wildcards. The lower half shows what files actually matched the input specification and what runs they contain. Note that OMNeT++ result files contain a unique run ID and several metadata annotations in addition to the actual recorded data. The third tree organizes simulation runs according to their experiment-measurement-replication labels.[11]

The underlying assumption is that users will organize their simulation-based research into various experiments. An experiment will consist of several measurements, which are typically (but not necessarily) simulations done with the same model but with different parameter settings; that is, the user will explore the parameter space with several simulation runs. To gain statistical confidence in the results, each measurement may be repeated several times with different random number seeds. It is easy to set up such scenarios with the improved ini files of OMNeT++ 4.x. The experiment-measurement-replication labels will then be assigned more or less automatically; please refer to the Inifile document (Configuring Simulations in OMNeT++ 4.x) for more discussion.

The second page displays results (vectors, scalars, and histograms) from all files in tables and lets the user browse them. Results can be sorted and filtered. Simple filtering is possible with combo boxes, or when that is not enough, the user can write arbitrarily complex filters using a generic pattern matching expression language. Selected or filtered data can be immediately plotted, or remembered in named datasets for further processing.[11]


It is possible to define reusable datasets that are basically recipes on how to select and process data received from the simulation. You can add selection and data processing nodes to a dataset. Chart drawing is possible at any point in the processing tree.


Line charts are typically drawn from time-series data stored in vector files. Preprocessing of the data is possible in the dataset. The line chart component can be configured freely to display the vector data according to your needs.[11]


Bar charts are created from scalar results and histograms. Relevant data can be grouped and displayed via the Bar chart component. Colors, chart type, and other display attributes can be set on the component.


The Output Vector View can be used to inspect the raw numerical data when required. It can show the original data read from the vector file, or the result of a computation. The user can select a point on the line chart or a vector in the Dataset View and its content will be displayed.

The Dataset View is used to show the result items contained in the dataset. The content of the view corresponds to the state of the dataset after the selected processing is done.

CHAPTER 7
CODING

Kademlia:
.H file


#ifndef __KADEMLIA_H_
#define __KADEMLIA_H_

#include <deque>
#include <omnetpp.h>
#include <CommonMessages_m.h>
#include <BaseOverlay.h>
#include <GlobalStatistics.h>
#include <NeighborCache.h>
#include "KademliaNodeHandle.h"
#include "KademliaBucket.h"

class Kademlia : public BaseOverlay, public ProxListener
{
protected: // fields: kademlia parameters
    uint32_t k; /**< number of redundant graphs */
    uint32_t b; /**< number of bits to consider */
    uint32_t s; /**< number of siblings */
    uint32_t maxStaleCount; /**< number of timeouts until node is removed from routingtable */
    bool exhaustiveRefresh; /**< if true, use exhaustive-iterative lookups to refresh buckets */
    bool pingNewSiblings;
    bool secureMaintenance; /**< if true, ping not authenticated nodes before adding them to a bucket */
    bool newMaintenance;
    bool enableReplacementCache; /**< enables the replacement cache to store nodes if a bucket is full */
    bool replacementCachePing; /**< ping the least recently used node in a full bucket, when a node is added to the replacement cache */
    uint replacementCandidates; /**< maximum number of candidates in the replacement cache for each bucket */
    int siblingRefreshNodes; /**< number of redundant nodes for exhaustive sibling table refresh lookups (0 = numRedundantNodes) */
    int bucketRefreshNodes; /**< number of redundant nodes for exhaustive bucket refresh lookups (0 = numRedundantNodes) */

    // R/Kademlia
    bool activePing;
    bool proximityRouting;
    bool proximityNeighborSelection;
    bool altRecMode;

    simtime_t minSiblingTableRefreshInterval;
    simtime_t minBucketRefreshInterval;
    simtime_t siblingPingInterval;

    cMessage* bucketRefreshTimer;
    cMessage* siblingPingTimer;

public:
    Kademlia();
    ~Kademlia();

    void initializeOverlay(int stage);
    void finishOverlay();
    void joinOverlay();

    bool isSiblingFor(const NodeHandle& node, const OverlayKey& key,
                      int numSiblings, bool* err);
    int getMaxNumSiblings();
    int getMaxNumRedundantNodes();

    void handleTimerEvent(cMessage* msg);
    bool handleRpcCall(BaseCallMessage* msg);
    void handleUDPMessage(BaseOverlayMessage* msg);

    virtual void proxCallback(const TransportAddress& node, int rpcId,
                              cPolymorphic* contextPointer, Prox prox);

protected:
    NodeVector* findNode(const OverlayKey& key, int numRedundantNodes,
                         int numSiblings, BaseOverlayMessage* msg);

    void handleRpcResponse(BaseResponseMessage* msg, cPolymorphic* context,
                           int rpcId, simtime_t rtt);
    void handleRpcTimeout(BaseCallMessage* msg, const TransportAddress& dest,
                          cPolymorphic* context, int rpcId,
                          const OverlayKey& destKey);

    /** handle an expired bucket refresh timer */
    void handleBucketRefreshTimerExpired();

    OverlayKey distance(const OverlayKey& x, const OverlayKey& y,
                        bool useAlternative = false) const;

    /** updates information shown in GUI */
    void updateTooltip();

    virtual void lookupFinished(bool isValid);
    virtual void handleNodeGracefulLeaveNotification();

    friend class KademliaLookupListener;

private:
    uint32_t bucketRefreshCount; /**< statistics: total number of bucket refreshes */
    uint32_t siblingTableRefreshCount; /**< statistics: total number of sibling table refreshes */
    uint32_t nodesReplaced;

    KeyDistanceComparator<KeyXorMetric>* comparator;
    KademliaBucket* siblingTable;
    std::vector<KademliaBucket*> routingTable;
    int numBuckets;

    void routingInit();

    void routingDeinit();

    int routingBucketIndex(const OverlayKey& key, bool firstOnLayer = false);
    KademliaBucket* routingBucket(const OverlayKey& key, bool ensure);
    bool routingAdd(const NodeHandle& handle, bool isAlive,
                    simtime_t rtt = MAXTIME, bool maintenanceLookup = false);
    bool routingTimeout(const OverlayKey& key, bool immediately = false);

    void refillSiblingTable();
    void sendSiblingFindNodeCall(const TransportAddress& dest);
    void setBucketUsage(const OverlayKey& key);

    bool recursiveRoutingHook(const TransportAddress& dest,
                              BaseRouteMessage* msg);
    bool handleFailedNode(const TransportAddress& failed);
};

#endif

.NED file
module KademliaModules like IOverlay
{
    gates:
        input udpIn;    // gate from the UDP layer
        output udpOut;  // gate to the UDP layer
        input tcpIn;    // gate from the TCP layer
        output tcpOut;  // gate to the TCP layer
        input appIn;    // gate from the application
        output appOut;  // gate to the application

    submodules:
        kademlia: Kademlia {
            parameters:
                @display("p=60,60;i=block/circle");
        }

    connections allowunconnected:
        udpIn --> kademlia.udpIn;
        udpOut <-- kademlia.udpOut;
        appIn --> kademlia.appIn;
        appOut <-- kademlia.appOut;
}

.CC file
#include "KademliaBucket.h"

KademliaBucket::KademliaBucket(uint16_t maxSize,
                               const Comparator<OverlayKey>* comparator)
    : BaseKeySortedVector< KademliaBucketEntry >(maxSize, comparator)
{
    lastUsage = -1;
}

KademliaBucket::~KademliaBucket()
{
}

KOORDE:
.CC file
#include <IPAddressResolver.h>
#include <IPvXAddress.h>
#include <IInterfaceTable.h>
#include <IPv4InterfaceData.h>
#include <GlobalStatistics.h>
#include "Koorde.h"

using namespace std;

namespace oversim {

Define_Module(Koorde);

void Koorde::initializeOverlay(int stage)
{
    // because of IPAddressResolver, we need to wait until interfaces
    // are registered, address auto-assignment takes place etc.
    if (stage != MIN_STAGE_OVERLAY)
        return;

    // fetch some parameters
    deBruijnDelay = par("deBruijnDelay");
    deBruijnListSize = par("deBruijnListSize");
    shiftingBits = par("shiftingBits");
    useOtherLookup = par("useOtherLookup");
    useSucList = par("useSucList");
    setupDeBruijnBeforeJoin = par("setupDeBruijnBeforeJoin");
    setupDeBruijnAtJoin = par("setupDeBruijnAtJoin");

    // init flags
    breakLookup = false;

    // some local variables
    deBruijnNumber = 0;
    deBruijnNodes = new NodeHandle[deBruijnListSize];

    // statistics
    deBruijnCount = 0;
    deBruijnBytesSent = 0;

    // add some watches
    WATCH(deBruijnNumber);
    WATCH(deBruijnNode);

    // timer messages
    deBruijn_timer = new cMessage("deBruijn_timer");

    Chord::initializeOverlay(stage);
}

Koorde::~Koorde()

{
    cancelAndDelete(deBruijn_timer);
}

void Koorde::changeState(int toState)
{
    Chord::changeState(toState);

    switch(state) {
    case INIT:
        // init de Bruijn nodes
        deBruijnNode = NodeHandle::UNSPECIFIED_NODE;
        for (int i=0; i < deBruijnListSize; i++) {
            deBruijnNodes[i] = NodeHandle::UNSPECIFIED_NODE;
        }
        updateTooltip();
        break;
    case BOOTSTRAP:
        if (setupDeBruijnBeforeJoin) {
            // setup de bruijn node before joining the ring
            cancelEvent(join_timer);
            cancelEvent(deBruijn_timer);
            scheduleAt(simTime(), deBruijn_timer);
        } else if (setupDeBruijnAtJoin) {
            cancelEvent(deBruijn_timer);
            scheduleAt(simTime(), deBruijn_timer);
        }
        break;
    case READY:
        // init de Bruijn Protocol
        cancelEvent(deBruijn_timer);
        scheduleAt(simTime(), deBruijn_timer);

        // since we don't need the fixfingers protocol in Koorde cancel timer
        cancelEvent(fixfingers_timer);
        break;
    default:
        break;
    }
}

void Koorde::handleTimerEvent(cMessage* msg)
{
    if (msg->isName("deBruijn_timer")) {
        handleDeBruijnTimerExpired();
    } else if (msg->isName("fixfingers_timer")) {
        handleFixFingersTimerExpired(msg);
    } else {
        Chord::handleTimerEvent(msg);
    }

}

bool Koorde::handleFailedNode(const TransportAddress& failed)
{
    if (!deBruijnNode.isUnspecified()) {
        if (failed == deBruijnNode) {
            deBruijnNode = deBruijnNodes[0];
            for (int i = 0; i < deBruijnNumber - 1; i++) {
                deBruijnNodes[i] = deBruijnNodes[i+1];
            }
            if (deBruijnNumber > 0) {
                deBruijnNodes[deBruijnNumber - 1] = NodeHandle::UNSPECIFIED_NODE;
                --deBruijnNumber;
            }
        } else {
            bool removed = false;
            for (int i = 0; i < deBruijnNumber - 1; i++) {
                if ((!deBruijnNodes[i].isUnspecified()) &&
                    (failed == deBruijnNodes[i])) {
                    removed = true;
                }
                if (removed ||
                    ((!deBruijnNodes[deBruijnNumber - 1].isUnspecified())
                     && failed == deBruijnNodes[deBruijnNumber - 1])) {
                    deBruijnNodes[deBruijnNumber - 1] = NodeHandle::UNSPECIFIED_NODE;
                    --deBruijnNumber;
                }
            }
        }
    }
    return Chord::handleFailedNode(failed);
}

void Koorde::handleDeBruijnTimerExpired()
{
    OverlayKey lookup = thisNode.getKey() << shiftingBits;

    if (state == READY) {
        if (successorList->getSize() > 0) {
            // look for some nodes before our actual debruijn key
            // to have redundancy if our de-bruijn node fails
            lookup -= (successorList->getSuccessor(successorList->getSize() / 2).getKey()
                       - thisNode.getKey());
        }

        if (lookup.isBetweenR(thisNode.getKey(),
                              successorList->getSuccessor().getKey())
            || successorList->isEmpty()) {
            int sucNum = successorList->getSize();
            if (sucNum > deBruijnListSize)
                sucNum = deBruijnListSize;

            deBruijnNode = thisNode;
            for (int i = 0; i < sucNum; i++) {
                deBruijnNodes[i] = successorList->getSuccessor(i);
                deBruijnNumber = i+1;
            }
            updateTooltip();
        } else if (lookup.isBetweenR(predecessorNode.getKey(),
                                     thisNode.getKey())) {
            int sucNum = successorList->getSize();
            if ((sucNum + 1) > deBruijnListSize)
                sucNum = deBruijnListSize - 1;

            deBruijnNode = predecessorNode;
            deBruijnNodes[0] = thisNode;
            for (int i = 0; i < sucNum; i++) {
                deBruijnNodes[i+1] = successorList->getSuccessor(i);
                deBruijnNumber = i+2;
            }
            updateTooltip();
        } else {
            DeBruijnCall* call = new DeBruijnCall("DeBruijnCall");
            call->setDestKey(lookup);
            call->setBitLength(DEBRUIJNCALL_L(call));
            sendRouteRpcCall(OVERLAY_COMP, deBruijnNode, call->getDestKey(),
                             call, NULL, DEFAULT_ROUTING);
        }

        cancelEvent(deBruijn_timer);
        scheduleAt(simTime() + deBruijnDelay, deBruijn_timer);
    } else {
        if (setupDeBruijnBeforeJoin || setupDeBruijnAtJoin) {
            DeBruijnCall* call = new DeBruijnCall("DeBruijnCall");
            call->setDestKey(lookup);
            call->setBitLength(DEBRUIJNCALL_L(call));
            sendRouteRpcCall(OVERLAY_COMP, bootstrapNode, call->getDestKey(),

                             call, NULL, DEFAULT_ROUTING);
            scheduleAt(simTime() + deBruijnDelay, deBruijn_timer);
        }
    }
}

#if 0
void Koorde::handleFixFingersTimerExpired(cMessage* msg)
{
    // just in case not all timers from Chord code could be canceled
}
#endif

void Koorde::handleUDPMessage(BaseOverlayMessage* msg)
{
    Chord::handleUDPMessage(msg);
}

bool Koorde::handleRpcCall(BaseCallMessage* msg)
{
    if (state == READY) {
        // delegate messages
        RPC_SWITCH_START( msg )
        RPC_DELEGATE( DeBruijn, handleRpcDeBruijnRequest );
        RPC_SWITCH_END( )

        if (RPC_HANDLED)
            return true;
    } else {
        EV << "[Koorde::handleRpcCall() @ " << thisNode.getIp()
           << " (" << thisNode.getKey().toString(16) << ")]\n"
           << " Received RPC call and state != READY!" << endl;
    }

    return Chord::handleRpcCall(msg);
}

// ... (the listing is incomplete at this point in the source;
//      only the tail of the next function survives) ...
            1].getKey())) {
            return deBruijnNodes[i];
        }
    }
    return deBruijnNodes[deBruijnNumber-1];
}

const NodeHandle& Koorde::walkSuccessorList(const OverlayKey& key)
{
    for (unsigned int i = 0; i < successorList->getSize() - 1; i++) {
        if (key.isBetweenR(successorList->getSuccessor(i).getKey(),
                           successorList->getSuccessor(i+1).getKey())) {

            return successorList->getSuccessor(i);
        }
    }
    return successorList->getSuccessor(successorList->getSize()-1);
}

void Koorde::updateTooltip()
{
    // Updates the tooltip display strings.
    if (ev.isGUI()) {
        std::stringstream ttString;

        // show our predecessor, successor and de Bruijn node in tooltip
        ttString << "Pred " << predecessorNode << endl
                 << "This " << thisNode << endl
                 << "Suc " << successorList->getSuccessor() << endl
                 << "DeBr " << deBruijnNode << endl;

        ttString << "List ";
        for (unsigned int i = 0; i < successorList->getSize(); i++) {
            ttString << successorList->getSuccessor(i).getIp() << " ";
        }
        ttString << endl;

        ttString << "DList ";
        for (int i = 0; i < deBruijnNumber; i++) {
            ttString << deBruijnNodes[i].getIp() << " ";
        }
        ttString << endl;

        getParentModule()->getParentModule()->
            getDisplayString().setTagArg("tt", 0, ttString.str().c_str());
        getParentModule()->getDisplayString().setTagArg("tt", 0,
                                                        ttString.str().c_str());
        getDisplayString().setTagArg("tt", 0, ttString.str().c_str());

        // draw an arrow to our current successor
        showOverlayNeighborArrow(successorList->getSuccessor(), true,
                                 "m=m,50,0,50,0;ls=red,1");
    }
}

void Koorde::finishOverlay()
{
    // statistics
    simtime_t time = globalStatistics->calcMeasuredLifetime(creationTime);

    if (time >= GlobalStatistics::MIN_MEASURED) {

        globalStatistics->addStdDev("Koorde: Sent DEBRUIJN Messages/s",
                                    deBruijnCount / time);
        globalStatistics->addStdDev("Koorde: Sent DEBRUIJN Bytes/s",
                                    deBruijnBytesSent / time);
    }

    Chord::finishOverlay();
}

void Koorde::recordOverlaySentStats(BaseOverlayMessage* msg)
{
    Chord::recordOverlaySentStats(msg);

    BaseOverlayMessage* innerMsg = msg;
    while (innerMsg->getType() != APPDATA &&
           innerMsg->getEncapsulatedPacket() != NULL) {
        innerMsg = static_cast<BaseOverlayMessage*>(innerMsg->getEncapsulatedPacket());
    }

    switch (innerMsg->getType()) {
    case RPC: {
        if ((dynamic_cast<DeBruijnCall*>(innerMsg) != NULL) ||
            (dynamic_cast<DeBruijnResponse*>(innerMsg) != NULL)) {
            RECORD_STATS(deBruijnCount++;
                         deBruijnBytesSent += msg->getByteLength());
        }
        break;
    }
    }
}

OverlayKey Koorde::findStartKey(const OverlayKey& startKey,
                                const OverlayKey& endKey,
                                const OverlayKey& destKey,
                                int& step)
{
    OverlayKey diffKey, newStart, tmpDest, newKey, powKey;
    int nBits;

    if (startKey == endKey)
        return startKey;

    diffKey = endKey - startKey;
    nBits = diffKey.log_2();

    if (nBits < 0) {
        nBits = 0;
    }

    while ((startKey.getLength() - nBits) % shiftingBits != 0) {
        nBits--;
    }

    step = nBits + 1;

#if 0
    // TODO: work in progress to find better start key
    uint shared;
    for (shared = 0; shared < (startKey.getLength() - nBits);
         shared += shiftingBits) {
        if (destKey.sharedPrefixLength(startKey << shared) >=
            (startKey.getLength() - nBits - shared)) {
            break;
        }
    }

    uint nBits2 = startKey.getLength() - shared;

    newStart = (startKey >> nBits2) << nBits2;
    tmpDest = destKey >> (destKey.getLength() - nBits2);
    newKey = tmpDest + newStart;

    std::cout << "startKey: " << startKey.toString(2) << endl
              << "endKey : " << endKey.toString(2) << endl
              << "diff : " << (endKey-startKey).toString(2) << endl
              << "newKey : " << newKey.toString(2) << endl
              << "destKey : " << destKey.toString(2) << endl
              << "nbits : " << nBits << endl
              << "nbits2 : " << nBits2 << endl;

    // is the new constructed route key bigger than our start key return it
    if (newKey.isBetweenR(startKey, endKey)) {
        std::cout << "HIT" << endl;
        return newKey;
    } else {
        nBits2 -= shiftingBits;
        newStart = (startKey >> nBits2) << nBits2;
        tmpDest = destKey >> (destKey.getLength() - nBits2);
        newKey = tmpDest + newStart;

        if (newKey.isBetweenR(startKey, endKey)) {
            std::cout << "startKey: " << startKey.toString(2) << endl
                      << "endKey : " << endKey.toString(2) << endl
                      << "diff : " << (endKey-startKey).toString(2) << endl
                      << "newKey : " << newKey.toString(2) << endl
                      << "destKey : " << destKey.toString(2) << endl
                      << "nbits : " << nBits << endl
                      << "nbits2 : " << nBits2 << endl;
            std::cout << "HIT2" << endl;
            return newKey;
        }
    }

    std::cout << "MISS" << endl;
#endif

    newStart = (startKey >> nBits) << nBits;
    tmpDest = destKey >> (destKey.getLength() - nBits);
    newKey = tmpDest + newStart;

    // is the new constructed route key bigger than our start key return it
    if (newKey.isBetweenR(startKey, endKey)) {
        return newKey;
    }

    // If the part of the destination key smaller than the one of
    // the original key add pow(nBits) (this is the first bit where
    // the start key and end key differ) to the new constructed key
    // and check if it's between start and end key.
    newKey += powKey.pow2(nBits);

    if (newKey.isBetweenR(startKey, endKey)) {
        return newKey;
    } else {
        // this part should not be called
        throw cRuntimeError("Koorde::findStartKey(): Invalid start key");
        return OverlayKey::UNSPECIFIED_KEY;
    }
}

void Koorde::findFriendModules()
{
    successorList = check_and_cast<ChordSuccessorList*>
                    (getParentModule()->getSubmodule("successorList"));
}

void Koorde::initializeFriendModules()
{
    // initialize successor list
    successorList->initializeList(par("successorListSize"), thisNode, this);
}

}; //namespace


.H File
#ifndef __KOORDE_H_
#define __KOORDE_H_

#include <omnetpp.h>
#include <IPvXAddress.h>
#include <OverlayKey.h>
#include <NodeHandle.h>
#include <BaseOverlay.h>

#include "../chord/ChordSuccessorList.h"
#include "../chord/Chord.h"

namespace oversim {

class Koorde : public Chord
{
public:
    virtual ~Koorde();

    // see BaseOverlay.h
    virtual void initializeOverlay(int stage);

    // see BaseOverlay.h
    virtual void handleTimerEvent(cMessage* msg);

    // see BaseOverlay.h
    virtual void handleUDPMessage(BaseOverlayMessage* msg);

    // see BaseOverlay.h
    virtual void recordOverlaySentStats(BaseOverlayMessage* msg);

    // see BaseOverlay.h
    virtual void finishOverlay();

    virtual void updateTooltip();

protected:
    // parameters
    int deBruijnDelay;    /**< number of seconds between two de bruijn calls */
    int deBruijnNumber;   /**< number of current nodes in de bruijn list; depends on number of nodes in successor list */
    int deBruijnListSize; /**< maximal number of nodes in de bruijn list */
    int shiftingBits;     /**< number of bits concurrently shifted in one routing step */
    bool useOtherLookup;  /**< flag which is indicating that the optimization other lookup is enabled */
    bool useSucList;      /**< flag which is indicating that the optimization using the successorlist is enabled */
    bool breakLookup;     /**< flag is used during the recursive step when returning this node */
    bool setupDeBruijnBeforeJoin; /**< if true, first setup the de bruijn node using the bootstrap node and then join the ring */
    bool setupDeBruijnAtJoin;     /**< if true, join the ring and setup the de bruijn node using the bootstrap node in parallel */

    // statistics
    int deBruijnCount;     /**< number of de bruijn calls */
    int deBruijnBytesSent; /**< number of bytes sent during de bruijn calls */

    // node handles
    NodeHandle* deBruijnNodes; /**< List of de Bruijn nodes */
    NodeHandle deBruijnNode;   /**< Handle to our de Bruijn node */

    // timer messages
    cMessage* deBruijn_timer; /**< timer for periodic de bruijn stabilization */

    virtual void changeState(int state);

    virtual void handleDeBruijnTimerExpired();

    //virtual void handleFixFingersTimerExpired(cMessage* msg);

    // see BaseOverlay.h
    virtual bool handleRpcCall(BaseCallMessage* msg);

    // see BaseOverlay.h
    virtual void handleRpcResponse(BaseResponseMessage* msg,
                                   cPolymorphic* context,
                                   int rpcId, simtime_t rtt);

    // see BaseOverlay.h
    virtual void handleRpcTimeout(BaseCallMessage* msg,
                                  const TransportAddress& dest,
                                  cPolymorphic* context, int rpcId,
                                  const OverlayKey& destKey);

    virtual void handleRpcJoinResponse(JoinResponse* joinResponse);
    virtual void handleRpcDeBruijnRequest(DeBruijnCall* deBruinCall);
    virtual void handleRpcDeBruijnResponse(DeBruijnResponse* deBruijnResponse);
    virtual void handleDeBruijnTimeout(DeBruijnCall* deBruijnCall);

    virtual NodeHandle findDeBruijnHop(const OverlayKey& destKey,
                                       KoordeFindNodeExtMessage* findNodeExt);

    // see BaseOverlay.h
    NodeVector* findNode(const OverlayKey& key,
                         int numRedundantNodes,
                         int numSiblings,
                         BaseOverlayMessage* msg);

    virtual OverlayKey findStartKey(const OverlayKey& startKey,
                                    const OverlayKey& endKey,
                                    const OverlayKey& destKey,
                                    int& step);

    virtual const NodeHandle& walkDeBruijnList(const OverlayKey& key);
    virtual const NodeHandle& walkSuccessorList(const OverlayKey& key);

    // see BaseOverlay.h
    virtual bool handleFailedNode(const TransportAddress& failed);

    virtual void rpcJoin(JoinCall* call);

    virtual void findFriendModules();
    virtual void initializeFriendModules();
};

}; //namespace

#endif

PASTRY: .NED file


package oversim.overlay.pastry;

import oversim.common.BaseOverlay;
import oversim.common.IOverlay;

simple BasePastry extends BaseOverlay
{
    parameters:
        bool enableNewLeafs;             // enable Pastry API call newLeafs()
        bool optimizeLookup;             // whether to search the closest node
                                         // in findCloserNode() calls
        int bitsPerDigit;                // bits per Pastry digit
        int numberOfLeaves;              // number of entries in leaf set
        int numberOfNeighbors;           // number of entries in neighborhood set
        double joinTimeout @unit(s);     // seconds to wait for STATE message
                                         // from closest node
        double repairTimeout @unit(s);   // how long to wait for repair messages
        bool useRegularNextHop;
        bool alwaysSendUpdate;           // tables delayed (should be very small)
        double readyWait @unit(s);       // seconds to wait for missing state
                                         // messages in JOIN phase
        bool proximityNeighborSelection; // enable PNS ?
}

simple Pastry extends BasePastry
{
    parameters:
        @class(Pastry);
        bool partialJoinPath;             // allow join even with missing state
                                          // message along the routing path
        double secondStageWait @unit(s);  // how long to wait before starting
                                          // second stage of init phase
        bool pingBeforeSecondStage;
        // join at nearest node, otherwise use bootstrapnode
        bool useDiscovery;
        // use smaller join state msgs (as described in the 2nd Pastry paper)
        bool minimalJoinState;
        // use state messages for leafset repair, otherwise use leafset messages
        bool sendStateAtLeafsetRepair;
        // how long to wait for leafset pings in discovery stage
        double discoveryTimeoutAmount @unit(s);
        // interval for periodic routing table maintenance
        double routingTableMaintenanceInterval @unit(s);
        // pastry configuration according to the original paper
        bool overrideOldPastry;
        bool overrideNewPastry;           // optimized pastry configuration
        @display("i=block/circle");
}

simple PastryRoutingTable
{
    parameters:
        @display("i=block/table");
}

simple PastryLeafSet
{
    parameters:
        @display("i=block/table");
}

simple PastryNeighborhoodSet
{
    parameters:
        @display("i=block/table");
}

module PastryModules like IOverlay
{
    gates:
        input udpIn;    // gate from the UDP layer
        output udpOut;  // gate to the UDP layer
        input tcpIn;    // gate from the TCP layer
        output tcpOut;  // gate to the TCP layer
        input appIn;    // gate from the application
        output appOut;  // gate to the application

    submodules:
        pastry: Pastry {
            parameters:
                @display("p=60,52;i=block/circle");
        }
        pastryRoutingTable: PastryRoutingTable {
            parameters:
                @display("p=140,68;i=block/table");
        }
        pastryLeafSet: PastryLeafSet {
            parameters:
                @display("p=220,52;i=block/table");
        }
        pastryNeighborhoodSet: PastryNeighborhoodSet {
            parameters:
                @display("p=300,68;i=block/table");
        }

    connections allowunconnected:
        udpIn --> pastry.udpIn;
        udpOut <-- pastry.udpOut;
        appIn --> pastry.appIn;
        appOut <-- pastry.appOut;
}
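The bitsPerDigit parameter determines how a Pastry key is split into digits for prefix routing: the routing table row is chosen by the number of leading digits a key shares with the local node ID, and the column by the next digit of the key. The following is a minimal sketch of that digit arithmetic, assuming 32-bit keys and a bitsPerDigit value that divides 32; kKeyBits, digitAt and sharedPrefixLength are hypothetical names for illustration and are not OverSim functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical key width for this sketch (real overlay keys are wider).
constexpr unsigned kKeyBits = 32;

// Digit at position index, where index 0 is the most significant digit.
// Assumes bitsPerDigit divides kKeyBits and is smaller than kKeyBits.
std::uint32_t digitAt(std::uint32_t key, unsigned index, unsigned bitsPerDigit)
{
    unsigned shift = kKeyBits - (index + 1) * bitsPerDigit;
    return (key >> shift) & ((1u << bitsPerDigit) - 1);
}

// Number of leading digits that two keys have in common.
unsigned sharedPrefixLength(std::uint32_t a, std::uint32_t b,
                            unsigned bitsPerDigit)
{
    unsigned digits = kKeyBits / bitsPerDigit;
    unsigned n = 0;
    while (n < digits &&
           digitAt(a, n, bitsPerDigit) == digitAt(b, n, bitsPerDigit)) {
        ++n;
    }
    return n;
}
```

For example, with bitsPerDigit = 4 the keys 0x12345678 and 0x1234ABCD share four digits, so a lookup for the second key from a node with the first ID would consult row 4, column 0xA of the routing table.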

.H file
#ifndef __PASTRY_H_
#define __PASTRY_H_

#include <vector>
#include <map>
#include <queue>
#include <algorithm>

#include <omnetpp.h>
#include <IPvXAddress.h>
#include <OverlayKey.h>
#include <NodeHandle.h>
#include <BaseOverlay.h>
#include <BasePastry.h>

#include "PastryTypes.h"
#include "PastryMessage_m.h"
#include "PastryRoutingTable.h"
#include "PastryLeafSet.h"
#include "PastryNeighborhoodSet.h"

class Pastry : public BasePastry
{
public:
    virtual ~Pastry();

    // see BaseOverlay.h
    virtual void initializeOverlay(int stage);

    // see BaseOverlay.h
    virtual void handleTimerEvent(cMessage* msg);

    // see BaseOverlay.h
    virtual void handleUDPMessage(BaseOverlayMessage* msg);

    void handleStateMessage(PastryStateMessage* msg);

    virtual void pingResponse(PingResponse* pingResponse,
                              cPolymorphic* context, int rpcId,
                              simtime_t rtt);

protected:
    virtual void purgeVectors(void);

    virtual void changeState(int toState);

    virtual bool recursiveRoutingHook(const TransportAddress& dest,
                                      BaseRouteMessage* msg);

    void iterativeJoinHook(BaseOverlayMessage* msg, bool incrHopCount);

    std::vector<PastryStateMsgHandle> stReceived;
    std::vector<PastryStateMsgHandle>::iterator stReceivedPos;
    std::vector<TransportAddress> notifyList;

private:
    void clearVectors();

    simtime_t secondStageInterval;
    simtime_t routingTableMaintenanceInterval;
    simtime_t discoveryTimeoutAmount;
    bool partialJoinPath;
    int depth;
    int updateCounter;
    bool minimalJoinState;
    bool useDiscovery;
    bool useSecondStage;
    bool sendStateAtLeafsetRepair;
    bool pingBeforeSecondStage;
    bool overrideOldPastry;
    bool overrideNewPastry;

    cMessage* secondStageWait;
    cMessage* ringCheck;
    cMessage* discoveryTimeout;
    cMessage* repairTaskTimeout;

    void doSecondStage(void);
    void doRoutingTableMaintenance();
    bool handleFailedNode(const TransportAddress& failed);
    void checkProxCache(void);
    void processState(void);
    bool mergeState(void);
    void endProcessingState(void);
    void doJoinUpdate(void);

    // see BaseOverlay.h
    virtual void joinOverlay();
};

#endif

.CC file
#include "PastryNeighborhoodSet.h"
#include "PastryTypes.h"

Define_Module(PastryNeighborhoodSet);

void PastryNeighborhoodSet::earlyInit(void)
{
    WATCH_VECTOR(neighbors);
}

void PastryNeighborhoodSet::initializeSet(uint32_t numberOfNeighbors,
                                          uint32_t bitsPerDigit,
                                          const NodeHandle& owner)
{
    this->owner = owner;
    this->numberOfNeighbors = numberOfNeighbors;
    this->bitsPerDigit = bitsPerDigit;

    if (!neighbors.empty()) neighbors.clear();

    // fill Set with unspecified node handles
    for (uint32_t i = numberOfNeighbors; i > 0; i--)
        neighbors.push_back(unspecNode());
}

void PastryNeighborhoodSet::dumpToStateMessage(PastryStateMessage* msg) const
{
    uint32_t i = 0;
    uint32_t size = 0;
    std::vector<PastryExtendedNode>::const_iterator it;

    msg->setNeighborhoodSetArraySize(numberOfNeighbors);
    for (it = neighbors.begin(); it != neighbors.end(); it++) {
        if (!it->node.isUnspecified()) {
            ++size;
            msg->setNeighborhoodSet(i++, it->node);
        }
    }
    msg->setNeighborhoodSetArraySize(size);
}

const NodeHandle& PastryNeighborhoodSet::findCloserNode(const OverlayKey& destination,
                                                        bool optimize)
{
    std::vector<PastryExtendedNode>::const_iterator it;

    if (optimize) {
        // pointer to later return value, initialize to unspecified, so
        // the specialCloserCondition() check will be done against our own
        // node as long as no node closer to the destination than our own was
        // found.
        const NodeHandle* ret = &NodeHandle::UNSPECIFIED_NODE;

        for (it = neighbors.begin(); it != neighbors.end(); it++) {
            if (it->node.isUnspecified()) break;
            if (specialCloserCondition(it->node, destination, *ret))
                ret = &(it->node);
        }
        return *ret;
    } else {
        for (it = neighbors.begin(); it != neighbors.end(); it++) {
            if (it->node.isUnspecified()) break;
            if (specialCloserCondition(it->node, destination)) return it->node;
        }
        return NodeHandle::UNSPECIFIED_NODE;
    }
}

void PastryNeighborhoodSet::findCloserNodes(const OverlayKey& destination,
                                            NodeVector* nodes)
{
    std::vector<PastryExtendedNode>::const_iterator it;

    for (it = neighbors.begin(); it != neighbors.end(); it++)
        if (!it->node.isUnspecified())
            nodes->add(it->node);
}

bool PastryNeighborhoodSet::mergeNode(const NodeHandle& node, simtime_t prox)
{
    std::vector<PastryExtendedNode>::iterator it;

    bool nodeAlreadyInVector = false; // was the node already in the list?
    bool nodeValueWasChanged = false; // true if the list was changed,
                                      // false if the rtt was too big

    // look for node in the set, if it's there and the value was changed,
    // erase it (since the position is no longer valid)
    for (it = neighbors.begin(); it != neighbors.end(); it++) {
        if (!it->node.isUnspecified() && it->node == node) {
            if (prox == SimTime::getMaxTime() || it->rtt == prox)
                return false; // nothing to do!
            neighbors.erase(it);
            nodeAlreadyInVector = true;
            break;
        }
    }

    // look for the correct position for the node
    for (it = neighbors.begin(); it != neighbors.end(); it++) {
        if (it->node.isUnspecified() || (it->rtt > prox)) {
            nodeValueWasChanged = true;
            break;
        }
    }

    // insert the entry there
    neighbors.insert(it, PastryExtendedNode(node, prox));

    // if a new entry was inserted, erase the last entry
    if (!nodeAlreadyInVector) neighbors.pop_back();

    // return whether a new entry was added
    return !nodeAlreadyInVector && nodeValueWasChanged;
}

void PastryNeighborhoodSet::dumpToVector(std::vector<TransportAddress>& affected) const
{
    std::vector<PastryExtendedNode>::const_iterator it;

    for (it = neighbors.begin(); it != neighbors.end(); it++)
        if (!it->node.isUnspecified())
            affected.push_back(it->node);
}

const TransportAddress& PastryNeighborhoodSet::failedNode(const TransportAddress& failed)
{
    std::vector<PastryExtendedNode>::iterator it;

    for (it = neighbors.begin(); it != neighbors.end(); it++) {
        if (it->node.isUnspecified()) break;
        if (it->node.getIp() == failed.getIp()) {
            neighbors.erase(it);
            neighbors.push_back(unspecNode());
            break;
        }
    }

    // never ask for repair
    return TransportAddress::UNSPECIFIED_NODE;
}

std::ostream& operator<<(std::ostream& os, const PastryExtendedNode& n)
{
    os << n.node << ";";
    if (n.rtt != SimTime::getMaxTime()) {
        os << " Ping: " << n.rtt;
    }
    return os;
}
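The idea behind mergeNode() is to keep a fixed-capacity list of neighbors ordered by measured round-trip time, so that a closer node pushes the farthest one out. The following is a minimal standalone sketch of that idea, not the OverSim implementation: it uses a variable-length vector instead of padding with unspecified handles, and the types Neighbor and NeighborhoodSetSketch, the string node IDs and the plain double RTT values are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for PastryExtendedNode: an ID plus a measured RTT.
struct Neighbor {
    std::string id;
    double rtt;
};

// Bounded neighbor list, kept sorted by ascending RTT.
class NeighborhoodSetSketch {
public:
    explicit NeighborhoodSetSketch(std::size_t capacity) : capacity_(capacity) {}

    // Returns true if the set changed.
    bool merge(const std::string& id, double rtt)
    {
        // Remove a stale entry for the same node; its position may be wrong now.
        for (auto it = entries_.begin(); it != entries_.end(); ++it) {
            if (it->id == id) {
                if (it->rtt == rtt) return false;   // nothing to do
                entries_.erase(it);
                break;
            }
        }

        // Find the first entry with a larger RTT.
        auto pos = entries_.begin();
        while (pos != entries_.end() && pos->rtt <= rtt) ++pos;

        // A node farther away than everything in a full set is rejected.
        if (pos == entries_.end() && entries_.size() >= capacity_) return false;

        entries_.insert(pos, Neighbor{id, rtt});
        if (entries_.size() > capacity_) entries_.pop_back();  // evict farthest
        return true;
    }

    const std::vector<Neighbor>& entries() const { return entries_; }

private:
    std::size_t capacity_;
    std::vector<Neighbor> entries_;
};
```

With a capacity of 2, merging a (30), b (10) and c (20) in that order leaves b and c in the set, because c displaces the farther node a.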

Chord .CC file


#include <cfloat>

#include "hashWatch.h"

#include "Chord.h"
#include "ChordSuccessorList.h"
#include "ChordFingerTable.h"

namespace oversim {

Define_Module(ChordFingerTable);

void ChordFingerTable::initialize(int stage)
{
    // because of IPAddressResolver, we need to wait until interfaces
    // are registered, address auto-assignment takes place etc.
    if (stage != MIN_STAGE_OVERLAY)
        return;

    maxSize = 0;

    WATCH_DEQUE(fingerTable);
}

void ChordFingerTable::handleMessage(cMessage* msg)
{
    error("this module doesn't handle messages, it runs only in initialize()");
}

}; //namespace

.H file
#ifndef __CHORD_H_
#define __CHORD_H_

#include <BaseOverlay.h>
#include <NeighborCache.h>

#include "ChordMessage_m.h"

namespace oversim {

class ChordSuccessorList;
class ChordFingerTable;

class Chord : public BaseOverlay, public ProxListener
{
public:
    Chord();
    virtual ~Chord();

    // see BaseOverlay.h
    virtual void initializeOverlay(int stage);

    // see BaseOverlay.h
    virtual void handleTimerEvent(cMessage* msg);

    // see BaseOverlay.h
    virtual void handleUDPMessage(BaseOverlayMessage* msg);

    // see BaseOverlay.h
    virtual void recordOverlaySentStats(BaseOverlayMessage* msg);

    // see BaseOverlay.h
    virtual void finishOverlay();

    // see BaseOverlay.h
    OverlayKey distance(const OverlayKey& x,
                        const OverlayKey& y,
                        bool useAlternative = false) const;

    virtual void updateTooltip();

    void proxCallback(const TransportAddress &node, int rpcId,
                      cPolymorphic *contextPointer, Prox prox);

protected:
    int joinRetry;                 /**< */
    int stabilizeRetry;            /**< // retries before neighbor considered failed */
    double joinDelay;              /**< */
    double stabilizeDelay;         /**< stabilize interval (secs) */
    double fixfingersDelay;        /**< */
    double checkPredecessorDelay;
    int successorListSize;         /**< */
    bool aggressiveJoinMode;       /**< use modified (faster) JOIN protocol */
    bool extendedFingerTable;
    unsigned int numFingerCandidates;
    bool proximityRouting;
    bool memorizeFailedSuccessor;
    bool newChordFingerTable;
    bool mergeOptimizationL1;
    bool mergeOptimizationL2;
    bool mergeOptimizationL3;
    bool mergeOptimizationL4;

    // timer messages
    cMessage* join_timer;             /**< */
    cMessage* stabilize_timer;        /**< */
    cMessage* fixfingers_timer;       /**< */
    cMessage* checkPredecessor_timer;

    // statistics
    int joinCount;                 /**< */
    int stabilizeCount;            /**< */
    int fixfingersCount;           /**< */
    int notifyCount;               /**< */
    int newsuccessorhintCount;     /**< */
    int joinBytesSent;             /**< */
    int stabilizeBytesSent;        /**< */
    int notifyBytesSent;           /**< */
    int fixfingersBytesSent;       /**< */
    int newsuccessorhintBytesSent; /**< */

    int keyLength;                 /**< length of an overlay key in bits */
    int missingPredecessorStabRequests; /**< missing StabilizeCall msgs */

    virtual void changeState(int toState);

    // node references
    NodeHandle predecessorNode;     /**< predecessor of this node */
    TransportAddress bootstrapNode; /**< node used to bootstrap */

    // module references
    ChordFingerTable* fingerTable;     /**< pointer to this node's finger table */
    ChordSuccessorList* successorList; /**< pointer to this node's successor list */

    // chord routines
    virtual void handleJoinTimerExpired(cMessage* msg);
    virtual void handleStabilizeTimerExpired(cMessage* msg);
    virtual void handleFixFingersTimerExpired(cMessage* msg);
    virtual void handleNewSuccessorHint(ChordMessage* chordMsg);

    virtual NodeVector* closestPreceedingNode(const OverlayKey& key);

    virtual void findFriendModules();
    virtual void initializeFriendModules();

    // see BaseOverlay.h
    virtual bool handleRpcCall(BaseCallMessage* msg);

    // see BaseOverlay.h
    NodeVector* findNode(const OverlayKey& key,
                         int numRedundantNodes,
                         int numSiblings,
                         BaseOverlayMessage* msg);

    virtual void joinOverlay();
    virtual void joinForeignPartition(const NodeHandle &node);

    virtual bool isSiblingFor(const NodeHandle& node,
                              const OverlayKey& key,
                              int numSiblings, bool* err);

    int getMaxNumSiblings();
    int getMaxNumRedundantNodes();

    void rpcFixfingers(FixfingersCall* call);
    virtual void rpcJoin(JoinCall* call);
    virtual void rpcNotify(NotifyCall* call);
    void rpcStabilize(StabilizeCall* call);

    virtual void handleRpcResponse(BaseResponseMessage* msg,
                                   cPolymorphic* context,
                                   int rpcId, simtime_t rtt);

    virtual void handleRpcTimeout(BaseCallMessage* msg,
                                  const TransportAddress& dest,
                                  cPolymorphic* context, int rpcId,
                                  const OverlayKey& destKey);

    virtual void pingResponse(PingResponse* pingResponse,
                              cPolymorphic* context, int rpcId,
                              simtime_t rtt);

    virtual void pingTimeout(PingCall* pingCall,
                             const TransportAddress& dest,
                             cPolymorphic* context, int rpcId);

    virtual void handleRpcJoinResponse(JoinResponse* joinResponse);
    virtual void handleRpcNotifyResponse(NotifyResponse* notifyResponse);
    virtual void handleRpcStabilizeResponse(StabilizeResponse* stabilizeResponse);
    virtual void handleRpcFixfingersResponse(FixfingersResponse* fixfingersResponse,
                                             double rtt = -1);

    virtual bool handleFailedNode(const TransportAddress& failed);

    friend class ChordSuccessorList;
    friend class ChordFingerTable;

private:
    TransportAddress failedSuccessor;
};

}; //namespace

#endif
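The declarations distance() and closestPreceedingNode() correspond to the core of Chord routing: distances are measured clockwise on the identifier ring, finger i of node n points at the successor of n + 2^i, and a lookup is forwarded to the known node that most closely precedes the key. The following is a minimal sketch of that arithmetic, assuming an 8-bit identifier ring with plain integer IDs; kKeyBits, clockwiseDistance, fingerStart and closestPrecedingNode are hypothetical helpers and do not reproduce the OverSim code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical 8-bit identifier ring for this sketch.
constexpr unsigned kKeyBits = 8;
constexpr std::uint32_t kRingMask = (1u << kKeyBits) - 1;

// Clockwise distance from one ID to another on the ring.
std::uint32_t clockwiseDistance(std::uint32_t from, std::uint32_t to)
{
    return (to - from) & kRingMask;
}

// Start of the i-th finger interval of a node: (node + 2^i) mod 2^m.
std::uint32_t fingerStart(std::uint32_t node, unsigned i)
{
    return (node + (1u << i)) & kRingMask;
}

// Among the known nodes, pick the one lying strictly between self and key
// that is farthest from self (closest to the key). Returns self if no known
// node lies in that interval. Simplification: key == self yields self.
std::uint32_t closestPrecedingNode(std::uint32_t self, std::uint32_t key,
                                   const std::vector<std::uint32_t>& known)
{
    std::uint32_t best = self;
    std::uint32_t span = clockwiseDistance(self, key);
    for (std::uint32_t n : known) {
        std::uint32_t d = clockwiseDistance(self, n);
        if (d > 0 && d < span && d > clockwiseDistance(self, best))
            best = n;
    }
    return best;
}
```

For node 10 with fingers at 11, 12, 14, 18, 26, 42, 74 and 138, a lookup for key 100 is forwarded to node 74, which halves the remaining distance; repeating this at each hop is what gives the logarithmic lookup cost.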

.NED file
module ChordModules like IOverlay
{
    parameters:
        @display("i=block/network2");

    gates:
        input udpIn;    // gate from the UDP layer
        output udpOut;  // gate to the UDP layer
        input tcpIn;    // gate from the TCP layer
        output tcpOut;  // gate to the TCP layer
        input appIn;    // gate from the application
        output appOut;  // gate to the application

    submodules:
        chord: Chord {
            parameters:
                @display("p=60,60");
        }
        fingerTable: ChordFingerTable {
            parameters:
                @display("p=150,60");
        }
        successorList: ChordSuccessorList {
            parameters:
                @display("p=240,60");
        }

    connections allowunconnected:
        udpIn --> chord.udpIn;
        udpOut <-- chord.udpOut;
        appIn --> chord.appIn;
        appOut <-- chord.appOut;
}

CHAPTER 8

Result Analysis using Scave Tool

Screenshots:
