Contents
1 P2P systems
1.1 Introduction
1.2 Definition
1.3 Time evolution of applications
1.4 Issues
1.4.1 General Issues
1.4.2 Issues for ISP
1.4.3 Issues for Users
1.5 Overlay network
1.6 Family of systems
1.7 Napster
1.8 Gnutella
1.8.1 Analysis
1.8.2 Messages
1.8.3 Characteristics
1.8.4 Performance evaluation
1.9 Chord
1.9.1 Analysis
1.9.2 Example
1.9.3 Issues
1.9.4 Load balance
1.9.5 Comparison between Chord and Gnutella
1.10 CAN
1.10.1 Routing
1.10.2 Join
1.10.3 Performances
1.10.4 Leaving of a node and failures
1.11 Tapestry
1.12 BitTorrent
1.12.1 Analysis
1.12.2 Policies
1.12.3 Case study: Flash Crowd
1.13 Skype
1.14 P2P Streaming systems
1.14.1 Tree-based systems
1.14.2 Mesh-based systems
2 Random graphs
2.1 Introduction and definitions
2.2 Erdős-Rényi Model
2.2.1 Average degree
2.2.2 Degree distribution
2.3 Bender-Canfield Model
2.3.1 Node reachability
2.3.2 Small-world effect
2.3.3 Clustering
2.4 Heavy-Tailed Distribution
2.5 Watts-Strogatz model
2.5.1 Clustering analysis
2.5.2 Small-world analysis
2.6 Theory of evolving networks
2.7 Resume scheme
Chapter 1
P2P systems
1.1 Introduction
From the P2P analysis point of view, the Internet is a structure already defined and perfectly working: only users are taken into account, and they are called hosts or peers. Hosts communicate thanks to the Internet, which can be seen as the transport medium that carries data; therefore the analysis focuses on layers 4 and 7 of the OSI stack. Indeed, it is necessary to have knowledge of the transport layer to understand and predict the behavior of the network, but it is also necessary to know what kind of features users may require from the application layer, since they operate with applications.
1.2 Definition
P2P (peer-to-peer) systems are systems in which users both receive and provide part of the service. This is a general definition; indeed, the concept of service has to be specified. The important thing is that hosts also contribute to service provisioning: it means that the service is distributed, and not centralized like a web browsing application. Depending on the type of service, users provide different things using their resources.
Sharable resources
In this section the attention will focus on the kinds of sharable resources.

A first type are content resources: users share content that they have on their machines. If there are no other users with that content, the quality of service will be very bad while, if a lot of hosts share the same content, the service will be excellent. An example of application is Napster, where the content is music. Types of content might indeed be various; grouping them, it is possible to introduce the following classification:

- file sharing;
- directories.

File sharing groups a lot of possible contents: music, games, videos, films, ebooks. Directories are typically part of a distributed database that, once received, is redistributed, and anyone can access that part (Skype).
Another possible sharable resource is CPU: in this context the computational power is shared. For example, if an application requires a huge computational capacity not owned by a single machine, it can be distributed among Internet hosts to use their computational power, each processing a single part of the application (e.g., applications that search for new forms of life by distributing signal-processing work).
The last possible sharable resource is bandwidth: an example is the case in which a host owns a very popular film requested by a lot of other peers; if it has to distribute it to everyone, a very large bandwidth is required at the access link. Perhaps it is better if it distributes parts of that film to other users that in turn redistribute them: in this way the bandwidth actually exploited is greater. Examples of applications are BitTorrent, P2P TV, gaming.
1.3 Time evolution of applications
At the beginning, the Internet was, in a certain sense, peer-to-peer: in its topology, distributed features and protocols. As it grew, it moved to the client-server paradigm, in which someone provides a service requested by other users: web browsing is a typical client-server application. ISPs developed applications in that sense, and that choice implied having asymmetric access: upload and download treated separately, typically assigning much more bandwidth to download (ADSL). Indeed, usually there is one server with several clients.

With the development of peer-to-peer applications the situation changed to a fairly symmetric one, and now there is no longer a strict division between download and upload bandwidth: if peers have to redistribute contents, they need an access able to exploit in particular the upload bandwidth.
Moreover, with the technological evolution of devices, much more computational power has made it possible to push some tasks from the core network down to the edges.
1.4 Issues
1.4.1 General Issues
Peer-to-peer systems suffer from critical issues. One is churning, the high variability in time of the system. Indeed, hosts can freely join or leave, so the quantity of content available changes very frequently. For example, for P2P TV, resources have to be balanced between the quantity that a peer can redistribute and the quantity that he needs.

Furthermore, a perfect knowledge of the participants is required, such as their IP addresses which, due to churning, can change over time. This knowledge is not strictly necessary in other applications.

If a peer is hidden behind a NAT or a firewall, further information is required, in particular the public IP address of the NAT. The reason is that NATs were developed for client-server kinds of applications. Firewalls, instead, can deny the access of a machine to the P2P application.
Every P2P system has to deal with the join issue: when users want to join the net, they require some information, like the address of the first neighbor. If, in a certain moment, there are no peers in the network, the service cannot be provided. In order to join, it is possible to:

- access a web page which contains a list of peers that are active or were recently active: the new peer contacts them until he finds one up;
- connect to some server that is always on.

These mechanisms are centralized techniques: an application that uses them is BitTorrent.
1.4.2 Issues for ISP
ISPs have to cope with the following troubles:

- traffic engineering: to improve the service, having in mind the goal of satisfying users' requirements, ISPs can balance traffic (symmetric or asymmetric access means different amounts of traffic in the network);
- capacity problems: many applications generate a lot of traffic and ISPs, when exchanging traffic with other ISPs, have to respect the stipulated cost policies; moreover, the quantity of traffic can be huge, because applications do not care about the physical topology, so being neighbors in the peer network does not imply belonging to the same ISP: the consequence is that, in general, ISPs are crossed many times;
- competitive services: ISPs can have their own telephony company, which gives a non-free service; of course they also carry data traffic and, if that traffic is Skype traffic, which is a free VoIP service, they may penalize it, since it is a competitor.
1.4.3 Issues for Users
Considering users, they have to deal with:

- legal issues: some services, for example file sharing, may incur this issue, because contents are distributed violating copyright;
- security and privacy issues: some applications may be malicious and exchange potentially risky traffic (viruses, malware, spyware).
1.5 Overlay network
The layer-7 network that connects peers is called overlay network. The overlay network is completely independent from the physical network, and it can be fully meshed or not (if peers do not know all the other peers, they have only a partial view of the topology). The picture below reports an example.

[Figure: an overlay network whose peers are spread across ISP 1, ISP 2 and ISP 3.]
Links are logical, of course, and two peers connected by a link of the overlay network are neighbors; they may belong to different ISPs, which means that physically they can be located very far away. Links can be created in different ways: with direct TCP connections, for example, or with UDP connections plus some further information.
The overlay network is used to implement functions that differ from application to application, and it is possible to have more than one overlay network nested together. Some examples are:
Gnutella: the number of peers contacted by flooding, with Δ neighbors per peer and TTL H, is

c = Δ + Δ^2 + Δ^3 + ... + Δ^H

It is possible to rewrite the expression as:

c = Σ_{i=1}^{H} Δ^i
Example. Taking values for H and Δ, it is possible to determine realistic values for c:

Δ = 4, H = 7  ⇒  c = Σ_{i=1}^{7} 4^i = 21844 ≈ 22k

If the message was a ping, peers will answer with a pong; therefore, for each ping, in a scenario like the preceding one, the messages exchanged will be about 2c ≈ 44k.
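As a quick check, the counting above can be reproduced with a few lines (a sketch, not part of the notes; Δ is written `delta`):

```python
def contacted_peers(delta: int, h: int) -> int:
    """c = Δ + Δ^2 + ... + Δ^H, the peers reached by flooding with TTL H."""
    return sum(delta ** i for i in range(1, h + 1))

c = contacted_peers(4, 7)
print(c)        # 21844, i.e. about 22k contacted peers
print(2 * c)    # 43688, i.e. about 44k ping + pong messages
```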
Time needed to contact peers

To compute it, first an assumption has to be made: at each level of the tree, the time to contact a peer (from father node to sons) is fixed and equal to T. Implicitly, it means that the time required to send the messages sequentially is considered negligible with respect to the time needed to reach the neighbors.

Under that assumption, considering each level of the tree independent, propagations occur in parallel and so:

Avg time = H · T

In a time H · T, Σ_{i=1}^{H} Δ^i nodes are reached.

Example. Considering H = 7 and T = 200 ms:

Avg time = 0.2 · 7 = 1.4 s

Therefore, it is possible to say that the response from a huge number of peers is received quite quickly.

[Figure: the flooding tree, from the first step down to the H-th level; the depth gives the number of hops and T is the time to cross a hop.]
Probability of not finding a content

This is an inefficiency of the system perceived by users. In general, the number of copies of a given content with popularity p is N · p. It means that each peer has an independent probability p of having that content.

Considering c the number of contacted peers, the probability of not finding the content is:

P(not find) = (1 − p)^c

Choosing a target F under which P(not find) must be kept:

P(not find) < F  ⇒  (1 − p)^c < F

Taking the logarithm (and flipping the inequality, since log(1 − p) < 0):

c · log(1 − p) < log F  ⇒  c > log(F) / log(1 − p)
Example. Considering Δ = 4:

Value of H  |  Value of c
1           |  4
2           |  20
3           |  84
4           |  340
5           |  1364
6           |  5460
7           |  21844

Maintaining Δ = 4 and considering F = 0.01:

- p = 0.05 (5%)  ⇒  c > 90, take H = 4;
- p = 0.01 (1%)  ⇒  c > 458, take H = 5.
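The bound above can be turned into numbers with a short sketch (function names are ours): the minimum c for a target F, and the smallest TTL H whose cumulative reach Σ_{i=1..H} Δ^i covers it.

```python
import math

def min_contacts(p: float, f: float) -> int:
    """Smallest c such that (1 - p)**c < f."""
    return math.ceil(math.log(f) / math.log(1.0 - p))

def min_ttl(delta: int, c: int) -> int:
    """Smallest H with delta + delta**2 + ... + delta**H >= c."""
    h, reached = 0, 0
    while reached < c:
        h += 1
        reached += delta ** h
    return h

for p in (0.05, 0.01):
    c = min_contacts(p, 0.01)
    print(f"p={p}: c = {c}, take H = {min_ttl(4, c)}")
```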
Performance

Performance principally means the average number of hops required before having the first hit. For example:

P(1) = P(find the file at the first hop) = 1 − (1 − p)^Δ

Proceeding:

P(2) = (1 − P(1)) · [1 − (1 − p)^(Δ^2)]

P(3) = (1 − P(1)) · (1 − P(2)) · [1 − (1 − p)^(Δ^3)]

The average time to send a request is:

(Σ_i i · P(i)) · T

The average time to receive an answer is:

(Σ_i i · P(i)) · 2T
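The recursion above can be sketched as follows (variable names are ours; level i is assumed to hold Δ^i peers, consistently with the flooding tree):

```python
def hit_probabilities(p: float, delta: int, h: int) -> list[float]:
    """P(i): probability of the first hit at hop i, per the notes' recursion."""
    probs, prefix = [], 1.0
    for i in range(1, h + 1):
        p_i = prefix * (1.0 - (1.0 - p) ** (delta ** i))
        probs.append(p_i)
        prefix *= 1.0 - p_i        # accumulate (1 - P(1))...(1 - P(i))
    return probs

def avg_hops(probs: list[float]) -> float:
    """Average number of hops before the first hit, sum of i * P(i)."""
    return sum(i * pi for i, pi in enumerate(probs, start=1))

probs = hit_probabilities(p=0.01, delta=4, h=7)
hops = avg_hops(probs)
print(hops, "hops; request time", hops * 0.2, "s; answer time", hops * 0.4, "s")
```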
1.9 Chord
Chord is a structured system (on the overlay), which implies that churning is a big issue, since the topology is fixed. So the choice of the topology is very relevant: it cannot be a star, because in a P2P system there are in general no role distinctions like the one introduced by the star topology with its central node. Moreover, regular structured topologies are also not so good, since they introduce a concept of priority based on the geographical position. The topology actually used is a ring.

The attention must be focused on the P2P technology, so the application layer and the network layer are not considered; using a diagram, the stack is:
Application
P2P Technology
Layer 3/4
The P2P technology concerns features like overlay creation and maintenance, the join operation and the management of messages.

Chord is similar to Gnutella since it is a protocol, but it distributes the information about contents, not the request for a given file. For example, it is possible that the peer that knows where a certain content is located is not its holder: the two aspects are completely separated.
1.9.1 Analysis
A regular structure like the ring gives, implicitly, a knowledge about the distance between nodes. This fact is very useful to help the join operation: a new peer that wants to be connected just has to know in which position he should be placed. The distance knowledge is not provided physically: it would be too complex to manage. Moreover, it would introduce some differences from one peer to another: if the application that runs this protocol becomes very popular in a given country, nodes belonging to that country would be physically placed near each other with respect to nodes belonging to another country. The density would be different.

On the contrary, having a knowledge of distance at the overlay allows considering peers physically located very far away as neighbors.

The way in which nodes are placed on the ring is to apply a function T to a list of information about the peer: the outcome is deterministic and uniformly distributed in an interval. This outcome is a number of m bits, so the ring interval is [0, 2^m − 1] and it is divided into 2^m positions.
Peer Info --T--> Node Id
The function T is realized thanks to cryptography (SHA-1):

- because it makes it difficult to obtain the peer information list from the Node Id;
- because it allows mapping a lot of information into a uniformly distributed space, avoiding proximity among peers;
- because, although the mapping is random in the interval [0, 2^m − 1], the function is deterministic, so receiving two identical inputs it will provide the same output (with possible collisions).
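A minimal sketch of T under assumed details (the input format below is hypothetical; the notes only say that T hashes a list of information about the peer):

```python
import hashlib

def node_id(peer_info: str, m: int) -> int:
    """SHA-1 over the peer information, reduced to the interval [0, 2**m - 1]."""
    digest = hashlib.sha1(peer_info.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

# Deterministic: the same input always maps to the same position,
# and outputs are (approximately) uniform over the interval.
assert node_id("10.0.0.1:6881", 6) == node_id("10.0.0.1:6881", 6)
print(node_id("10.0.0.1:6881", 6))
```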
The Node Id represents the final position of the peer on the ring; thanks to that topology, each peer has just two neighbors, called predecessor (i - 1 in the following picture) and successor (i + 1 in the picture); therefore it is possible to define the neighbors of a node i as its closest active peers.
[Figure: the ring of identifiers from 0 to 2^m − 1, with node i between its predecessor i − 1 and its successor i + 1.]
Join
Up to now, the join operation can occur with the following steps:

- the new node applies the function T to his peer information list, receiving as a result his own position (N7);
- he should know another peer and contact it (N24);
- this peer contacts his successor, and so on, until the right position of the new node is reached;
- when the successor and predecessor of the new node are found, the connection is established and the node becomes a peer.
[Figure: the three steps of N7 joining the ring through N24.]
How information is distributed
Unlike Gnutella, in Chord the information of where contents are located is distributed among peers. Each peer knows that information thanks to keys that are generated by applying a function K to metadata (data that describe the content synthetically). Graphically:

Metadata --K--> Key

Keys are values generated with the same properties as Node Ids, therefore they are uniformly distributed in the same interval [0, 2^m − 1]. An important thing to remark is that T and K, starting from different inputs (peer information list and metadata), are both able to map different kinds of outputs (Node Ids and keys) into the same interval.

To associate keys to Node Ids, the rule used is to assign a key to the nearest peer succeeding the key value.
Queries
When a node N wants to retrieve a content, it runs the function K over the metadata, obtaining the key. Since it knows only its neighbors, it forwards the query to them, and each time the query is redistributed; in this way, sooner or later, the peer that holds the key searched by N is found.

If there are n peers, globally, the expected number of peers to contact before finding the one with the right key is n/2. This holds just because both keys and Node Ids are uniformly distributed. Therefore the order of complexity is quite high with respect to Gnutella, but Chord guarantees that the content is surely found (in Gnutella it depends).
Shortcuts. The query process has been improved by using shortcuts: in practice, each node does not have just the knowledge about his neighbors, but knows the location of more peers. Those peers are not chosen randomly, but with a specific rule: each time, the space of a possible search of a file must be divided into two parts.
The principal advantage of using shortcuts is that the search, instead of being linear (complexity n), becomes dichotomic and therefore the complexity is log n. The main drawback is that a sort of routing table is required: in Chord it is called finger table. For a given node N, it has m entries and it is built as:

Index  |  Value        |  Successor
1      |  N + 2^0      |  successor(N+1)
2      |  N + 2^1      |  successor(N+2)
3      |  N + 2^2      |  successor(N+4)
...    |  ...          |  ...
i      |  N + 2^(i−1)  |  successor(N + 2^(i−1))
...    |  ...          |  ...
m      |  N + 2^(m−1)  |  successor(N + 2^(m−1))

The value of m is critical: if it is large, the probability of having conflicts (same output value applying the function on different inputs) is negligible; on the other side, high values of m imply:

- a large number of bits used;
- a long finger table.
1.9.2 Example
Given the following picture with m = 6 and the number of bits 2
6
= 64:
[Figure: the ring with nodes N4, N8, N14, N32, N39, N42, N48, N51, N56; key K10 is stored at N14, K24 and K30 at N32, K38 at N39 and K54 at N56.]
consider the case in which N8 is looking for K54. The finger table of N8 is:

Index  |  Value      |  Successor
1      |  8+1 = 9    |  N14
2      |  8+2 = 10   |  N14
3      |  8+4 = 12   |  N14
4      |  8+8 = 16   |  N21
5      |  8+16 = 24  |  N32
6      |  8+32 = 40  |  N42
In this case the query is forwarded to N42, which is the nearest preceding peer; the finger table of N42 is:

Index  |  Value                  |  Successor
1      |  42+1 = 43              |  N48
2      |  42+2 = 44              |  N48
3      |  42+4 = 46              |  N48
4      |  42+8 = 50              |  N51
5      |  42+16 = 58             |  N4
6      |  42+32 = 74 mod 64 = 10 |  N14
At this moment, the nearest peer is N51; its finger table is:

Index  |  Value                  |  Successor
1      |  51+1 = 52              |  N56
2      |  51+2 = 53              |  N56
3      |  51+4 = 55              |  N56
4      |  51+8 = 59              |  N4
5      |  51+16 = 67 mod 64 = 3  |  N4
6      |  51+32 = 83 mod 64 = 19 |  N21

Since the key 54 lies between the values 53 and 55, the peer selected is N56: in three hops the key is found.
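The example above can be sketched in code (helper names are ours, not from the notes): finger tables are built with the successor rule, and routing forwards to the closest preceding finger, jumping directly to the key's successor when no finger precedes it.

```python
M = 6
RING = 2 ** M
NODES = [4, 8, 14, 21, 32, 39, 42, 48, 51, 56]

def successor(x: int) -> int:
    """Nearest active node at or after position x (with wrap-around)."""
    x %= RING
    return min((n for n in NODES if n >= x), default=NODES[0])

def finger_table(n: int) -> list[int]:
    """Entry i points to successor(n + 2**i), for i = 0..m-1."""
    return [successor(n + 2 ** i) for i in range(M)]

def between_open(x: int, a: int, b: int) -> bool:
    """True if x lies strictly between a and b going clockwise."""
    x, a, b = x % RING, a % RING, b % RING
    return a < x < b if a < b else x > a or x < b

def lookup(start: int, key: int) -> list[int]:
    """Nodes visited until the one responsible for `key` is reached."""
    path, current = [start], start
    while successor(key) != current:
        nxt = next((f for f in reversed(sorted(finger_table(current)))
                    if between_open(f, current, key)),
                   successor(key))
        path.append(nxt)
        current = nxt
    return path

print(lookup(8, 54))   # [8, 42, 51, 56]: the key is found in three hops
```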
Join procedure with shortcuts
If a new node wants to connect to the P2P application, runs the function
T to discover his Node Id: assume it is N26. In the example, it has to be
placed between N21 and N32. If, for example, he contact N4 to discover his
successor and predecessor, the way in which this search is made is thanks
to shortcuts, exactly like a query: rst the successor of N26 is found and
then contacting N32 is possible discover N21 which will be the predecessor
of N26, but at the moment is the predecessor of N32. After this preliminary
step, all nger tables have to be updated.
Procedure
1. ask to some nodes to retrieve the successor(n) and the predecessor(n);
2. create nger table of n and update nger tables of other nodes; the
update operation is very complex;
3. redistribution of keys.
1.9.3 Issues
A possible problem of consistency takes place when nger tables are up-
dated: for example, if a node is searching a key in a given node N, but if
nger tables that point to N are not updated the content will not be found.
Another issue is a failure of a peer. When it happens due to a simple
switch o of a peer, notications are sent to other nodes, but if a node fails
how notications are sent?
1.9. Chord 23
To avoid some of those issues, it is possible introduce some redundancy:
each node maintains a list of some successors and not only the knowledge of
one predecessor and successor. If, for some reason, the immediate successor
fails, the node considered contact some of other successors.
Stabilization procedure
It is run every some time: each peer n ask to his successor n + 1 to answer
who is its predecessor; if the answer is positive the peer n is actually the
predecessor of n + 1. Otherwise, if the answer is p, two possible anomalies
take place:
1. in the case p > n:
n
p
n + 1
in this case the information is wrong and the node n has to update his
nger table since his own successor is p and not n + 1;
2. in the case p < n:
p
n
n + 1
in this case the information is wrong and the node n+1 has to update
his nger table since his own predecessor is n and not p.
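The check above can be sketched in a few lines (positions are plain integers and wrap-around at 0 is ignored for brevity; the function name is ours):

```python
def stabilize(n: int, s: int, p: int) -> str:
    """Peer n asks its successor s for s's current predecessor p."""
    if p == n:
        return "consistent"                     # n really precedes s
    if n < p < s:                               # case p > n: a node joined in between
        return f"node {n} updates its successor to {p}"
    return f"node {s} updates its predecessor to {n}"   # case p < n

print(stabilize(8, 14, 8))    # consistent
print(stabilize(8, 14, 11))   # node 8 updates its successor to 11
print(stabilize(8, 14, 5))    # node 14 updates its predecessor to 8
```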
1.9.4 Load balance
The amount of work that each peer has to deal with depends how keys are
associated to nodes. Let x:
A
B
x
24 CHAPTER 1. P2P systems
x =
B A
2
m
This parameter x is simply the fraction of the ring that the peer B is in
charge of; larger is x, larger can be the number of key assigned to B, so
that node has to deal with a large amount of work. In other words, it is
also possible to say that x is the probability that B is storing a given key:
since they are uniformly distributed on the space (normalized values in the
picture below), the probability of having a key is proportional to the space
that a node is in charge of:
0 1
x
Assuming that there are keys in the system, the probability that A is
not in charge of having keys is:
{ (A has no keys) = (1 x)
x
i
(1 x)
i
[Figure: the distribution f(n) of the number of stored keys, with the left tail marked as region 1 and the right tail as region 2.]

Region 1 represents nodes that hold few keys, while region 2 describes peers with a huge amount of work to deal with; since the distribution is symmetric with a low variance, the load is assigned quite fairly to nodes.
The mean number of keys stored in peer B is:

E[# keys] = k · x

and, if there are N active peers, due to their uniform distribution on the ring:

x = 1 / N

Therefore:

E[# keys] = k / N

The fair assignment of keys to nodes on average may not be desirable: if, for example, peer A has much more bandwidth than peer B, it would be better to assign more keys to A, in order to provide a better service to all users.
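A quick numeric check of E[# keys] = k · x against the binomial distribution above (a sketch; k keys, a node owning a fraction x = 1/N of the ring):

```python
from math import comb

def expected_keys(k: int, x: float) -> float:
    """Sum of i * C(k, i) * x**i * (1 - x)**(k - i), the binomial mean."""
    return sum(i * comb(k, i) * x ** i * (1 - x) ** (k - i) for i in range(k + 1))

k, n_peers = 1000, 50
print(expected_keys(k, 1 / n_peers))   # close to k / N = 20
```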
1.9.5 Comparison between Chord and Gnutella
                          Chord                     Gnutella
scalability               very good                 very good
robustness (to churning)  poor                      very good
overlay maintenance       complex / less costly     simple / costly
performance (users)       service guaranteed        no service guaranteed
responsiveness            O(log n)                  O(H)
performance (network)     efficient (shortcuts),    inefficient (flooding),
                          O(log n)                  O(Δ^H)
node: complexity          small                     very small
node: storage size        order of m                order of Δ
node: load                balanced                  content dependent
node: contents            no user dependency        user dependency
Robustness in Chord is poor since the routing is deterministic (shortcuts): if churning is high, updating the finger tables implies consistency problems. Indeed, structured systems suffer from an intrinsic issue due to the fact that peers have a quite large knowledge of the topology: the state information is large, therefore its accuracy has to be very high, otherwise the system will not be reliable.

The response time is similar for the two protocols, but actually they are not comparable, because one is a structured system and the other one is unstructured: Chord uses deterministic routing to find contents, while Gnutella uses flooding.
1.10 CAN
CAN (Content Addressable Network) uses the same basic approach as Chord: peers, thanks to a hash function, are mapped on a space like keys. Moreover, the space is the same for both keys and peers; the main difference is that the space is not mono-dimensional like in Chord, but it can have d dimensions.
Peer Info --T--> Node Id
Contents --K--> Keys
For example, with d = 2, the space will have two dimensions, identified by the two coordinates x and y. The way in which keys are assigned to peers is on the basis of distance: the space is divided fairly among peers and each one controls his region. It implies that all the keys placed in a given region are assigned to the peer that is in charge of that region.

[Figure: a 2-dimensional space with peers marked in blue and keys in orange.]
1.10.1 Routing
When a peer is looking for a given key, he follows the shortest path to contact the peer that is in charge of the region where the key is placed. Implicitly, it means that peers have a detailed knowledge about their neighbors (with a routing table): indeed, to select the shortest path, they have to choose among them the best one that guarantees the reachability of the key.
1.10.2 Join
Once a new host has run the hash function, he knows his own final position in the space. First, he has to download, from a web page for example, a list of active peers. Then he contacts one of them: this node, by contacting his neighbors, determines the position of the new peer in the same way in which queries are performed. When the right position is discovered, the node that is in charge of that region has to partition it, assigning a portion to the new node. Regions describe the load that each peer deals with, therefore a high width means a high load. Graphically, the pictures show the scenario before and after the arrival of a new peer (marked in yellow):
[Figure: the region of peer A before the join, and the same region split between A and the new peer B.]
At first, peer A was in charge of a huge area with 2 keys. After the arrival of peer B, the area has been reduced and nodes A and B have to deal with one key each. In practice, step 3 of Chord (the redistribution of keys) is realized in a hidden way, just by dividing the area.

It could happen that the hash function returns very similar values for two different peers: in this scenario it is possible that one of the two nodes is in charge of a region, but does not physically belong to that region. For example:
[Figure: peer B located inside the region of A, while being in charge of a different (yellow) region.]
B is in charge of the yellow region although it does not belong to it. This phenomenon is due to the fact that the algorithm tries to obtain a fair distribution of the load and, therefore, to divide the areas regularly.
1.10.3 Performances
The complexity of a query request or of a join can be evaluated by means of the average path length:

avg path length = (d/4) · n^(1/d)

The formula says that, in order to have a complexity not too high, d must be taken sufficiently large, but large values of d imply having many dimensions and, therefore, many neighbors to contact each time a message is sent.

The parameter d is much more critical than the parameter m analyzed in Chord: indeed, the complexity in Chord grows as log n independently of m, while the complexity of CAN is directly given by the value of d.
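The trade-off above can be made concrete with a short sketch (a regular d-dimensional grid gives 2d neighbors per node, which is the cost side of the trade-off):

```python
def avg_path_length(d: int, n: int) -> float:
    """CAN average path length, (d/4) * n**(1/d)."""
    return (d / 4) * n ** (1 / d)

for d in (2, 4, 8):
    print(f"d={d}: path length {avg_path_length(d, 10_000):.1f} hops, "
          f"about {2 * d} neighbors per node")
```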
1.10.4 Leaving of a node and failures
When a node leaves, notifications must be sent to his neighbors in order to decide which of them has to take care of the leaving peer's region. Periodically, peers send messages containing information to their neighbors: among that information there is also the width of their area. Indeed, the criterion that peers use to incorporate a region is simple: the neighbor with the smallest area will be the new owner. This is done to maintain some uniformity in the space.

When a message is sent and a timeout expires without any notification having been received, the peer realizes that some problem occurred. To recover, a timer is started and that peer waits for some other information about the neighbor that seems to have failed. If nothing arrives, the takeover procedure takes place. The timer is proportional to the area owned by each neighbor of the node that seems to have failed; therefore, being in charge of a small area allows entering the recovery procedure quickly. The takeover consists of:

- sending takeover messages to all the neighbors of the node that is assumed to have failed (it implies that each peer also has knowledge about the neighbors of his neighbors);
- assigning the area of the failed node to someone.

All these managing mechanisms are asynchronous and are only provided in structured systems, which are very complex to manage.
1.11 Tapestry
Tapestry adopts the same method as Chord and CAN: peers and keys are mapped on the same space. The peculiarity is that the space is composed of 160 bits organized into 40 hexadecimal digits.

To know the distances among nodes, the digits that represent a peer are compared; for example, considering node 4227:

- node 4228 has distance 1, so it is a Layer 4 neighbor (1 digit different);
- node 42A2 has distance 2, so it is a Layer 3 neighbor (2 digits different);
- node 43C9 has distance 3, so it is a Layer 2 neighbor (3 digits different);
- node 6FA0 has distance 4, so it is a Layer 1 neighbor (4 digits different).

Therefore:

- Layer 4: 422x;
- Layer 3: 42xx;
- Layer 2: 4xxx;
- Layer 1: xxxx;

where x ∈ [0, F].

The knowledge near the considered node is very detailed, while it is reduced going far away: this mechanism is called mesh routing and allows reducing the complexity.
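The layer classification above can be sketched as follows (4-digit IDs as in the example; real Tapestry identifiers have 40 digits):

```python
def layer(node: str, other: str) -> int:
    """Layer = number of shared leading hex digits + 1."""
    shared = 0
    for a, b in zip(node, other):
        if a != b:
            break
        shared += 1
    return shared + 1

for other in ("4228", "42A2", "43C9", "6FA0"):
    print(other, "is a Layer", layer("4227", other), "neighbor of 4227")
```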
Routing
It is very similar to the longest prefix match: if peer 5230 queries 42A1, the route is

5230 → 400F (L1) → 4277 (L2) → 42A2 (L3) → 42A1 (L4)

The search space is reduced the more deeply the query goes into the layers, but this advantage has a cost: the maintenance of tables that are potentially large. If b is the base of the digits, the complexity is O(log_b(n)).

It could happen that the table is not completely full: it means that some digits are not associated with any peer. This is very risky, because the algorithm was designed for a stable number of peers, and this implies that it is not robust to churning.
1.12 BitTorrent
BitTorrent is a very popular system and it is a bit different with respect to the previously mentioned systems. The objective is to distribute files of huge size to a potentially high number of customers. The peculiar feature is that the content is not stored by a given user, but it is distributed among peers that share, among them, the bandwidth to download it. The overlay, therefore, is designed for this purpose and not for making queries.

The content is divided into small pieces called chunks: to consume the file they all have to be downloaded so, from a peer's point of view, they have the same importance. The usual dimension of chunks is around 64–256 kbit: they are quite small. The neighborhood (overlay) is established randomly, so peers are forced to both download (new chunks) and upload (chunks held). Transmissions occur by means of TCP.
1.12.1 Analysis
The distributor that wants to share the file has to create a .torrent file by means of a hash function: indeed the .torrent is simply a file which indexes all chunks, including the hash keys that guarantee the correctness of the chunks and, therefore, of the file. The .torrent also contains other information, such as: the file name, the file size, the number of chunks into which it is divided, and the address of the tracker.
After the creation of the .torrent, the distributor has to upload it to a website from which peers can download it and start to receive the file. There is a central authority that maintains the list of active peers that are sharing the content: it is called the tracker. The tracker is not connected to the overlay; its purpose is just to help peers download the file and, for reliability, it is better to have more than one tracker managing the overlay for each file.
A new peer A interacts with the system as follows:
1. the distributor uploads the .torrent to a website;
2. A requests the .torrent;
3. A downloads the .torrent;
4. A contacts the tracker;
5. the tracker returns a list of peers.
The list downloaded from the tracker is usually composed of 40 peers: they will become the neighborhood of peer A.
Definitions
. seeders: peers that hold the whole content; they are very important for the good behaviour of the system because every chunk can be downloaded from a seeder;
. leechers: peers that hold just a part of the content;
. swarm: the totality of peers (seeders and leechers) that share the file;
. choked peers: these nodes are not allowed to receive content from a given peer;
. unchoked peers: these nodes are allowed to receive content from a given peer.
Among the list of 40 peers downloaded from the tracker, the node selects just 4 peers: they are effectively the ones it is in contact with.
1.12.2 Policies
This section describes the policies with which a peer selects the 4 nodes to exchange traffic with, and how it selects the chunks to be downloaded.
Selection of chunks
Peers distribute a map that shows which chunks they hold; this map is sent to a peer's neighbors, so they can decide which chunk should be downloaded. The policy is simple: the rarest chunk is selected, and this is done for two reasons:
. avoid the risk that a rare chunk disappears from the network;
. speed up the download.
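The rarest-first policy can be sketched as follows (a hypothetical `rarest_first` helper; the chunk maps are modelled as sets of chunk ids):

```python
import random
from collections import Counter

def rarest_first(my_chunks, neighbor_maps):
    """Pick the next chunk to request: among the chunks we are missing,
    choose one held by the fewest neighbors (rarest-first), breaking
    ties at random."""
    counts = Counter()
    for held in neighbor_maps:       # one chunk map (set of ids) per neighbor
        counts.update(held)
    missing = [c for c in counts if c not in my_chunks]
    if not missing:
        return None                  # nothing left to request
    rarest = min(counts[c] for c in missing)
    return random.choice([c for c in missing if counts[c] == rarest])

# hypothetical maps: chunk 3 is held by a single neighbor
neighbors = [{0, 1, 2}, {0, 1, 3}, {0, 2}]
print(rarest_first({0}, neighbors))  # → 3
```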
Chunks are subdivided into sub-blocks, which are composed of around 10 TCP packets (≈ 16 kbit). If some neighbors have the same chunk, it is possible to open more TCP connections to download sub-blocks in parallel (typically 5 at a time). In this way a higher download bit rate is expected because the bandwidth is enlarged: indeed, if the connection established for downloading a sub-block is very slow, the effect on the global rate is mitigated by the other connections.
Selection of peers
Actually BitTorrent introduces two overlays:
. one for the list of 40 peers downloaded from the tracker (green peers);
. a second that contains the 4 peers (marked in orange) with which a given peer (the blue one) is in contact.
The two overlays are built on top of the physical network: Overlay 1 (the 40 tracker-provided peers) and, above it, Overlay 2 (the 4 active contacts).
The selection is based on the tit-for-tat technique: it depends on how much peers contributed in the past. The global advantage is that connections with large bandwidth are favoured, and the local advantage is that the system forces each peer to share more, because in this way it will receive a better service (avoiding free riders: peers that just want to download without contributing). In conclusion, tit-for-tat:
. improves cooperation among peers;
. provides fairness.
Due to tit-for-tat, there is the distinction between choked and unchoked peers: if a node has contributed very little in the past, it will probably be put in the choked list. Each peer has its own choked list, recomputed every time window (10 s, for example), in which nodes are ordered by how much they shared: the peers in the first positions are unchoked.
The main drawback is that, at the beginning, each node would receive a very bad service, since it is not yet able to contribute much. This is avoided thanks to optimistic unchoking: at each time window, one choked peer is unchoked. Indeed, when a peer receives requests from others, the ones it will serve are the peers that have lots of chunks (they have lots of rare chunks and can contribute to sharing very well). It means that the rarest-first approach cannot be used by beginning users: they have to choose chunks to download randomly; then, when their number of chunks is sufficiently high, they can start using the rarest-first approach, since their contribution will be enough.
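The choking policy described above can be sketched as follows (a minimal illustration with hypothetical names and contribution values; real BitTorrent clients use more elaborate rules):

```python
import random

def recompute_unchoked(contributed, n_slots=4):
    """Tit-for-tat choking: every time window (e.g. 10 s) rank the
    neighbors by how much they uploaded to us, unchoke the best ones
    and give the last slot to a random choked peer (optimistic
    unchoking), so that newcomers get a chance to bootstrap."""
    ranked = sorted(contributed, key=contributed.get, reverse=True)
    unchoked = ranked[:n_slots - 1]                 # best contributors
    choked = ranked[n_slots - 1:]
    if choked:
        unchoked.append(random.choice(choked))      # optimistic unchoke
    return unchoked

# hypothetical upload totals (in chunks) for five neighbors
contributions = {"A": 120, "B": 90, "C": 5, "D": 300, "E": 0}
print(recompute_unchoked(contributions))
```

Note that the reserved random slot is exactly what lets a peer with zero past contribution (like "E") eventually receive its first chunks.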
Tit-for-tat tries to improve fairness by balancing how much a peer contributes with the service it desires, but it is possible that, due to the asymmetry of network flows, it reduces the performance of the system. Imagine that two peers are exchanging chunks belonging to the same content: if the communication follows two different paths, it is possible that one of them is bottlenecked. This implies that one of the two peers (A) has a very slow upload rate with respect to the other (B); therefore B cannot completely exploit its bandwidth, because the mechanism tries to punish A, which has a low contribution.
To improve efficiency and performance the end game mechanism has been introduced: for each chunk, the last sub-blocks are requested by the peer from all of its neighbors. Once the first positive answer is received, the other requests are aborted. This technique avoids the situation in which, being unlucky, the receiver waits too long for the download from a slow peer: indeed, since only one chunk at a time can be downloaded, waiting just for the last sub-blocks is a waste of time that can be avoided. This implies that the download is sped up.
1.12.3 Case study: Flash Crowd
Suppose that a content is very popular and the purpose is to distribute it to the largest possible number of customers. Assume:
. the number of peers interested in it is n = 2^k (for some integer k);
. two cases are available:
1. a client/server scenario;
2. a scenario in which the content is redistributed by peers;
. the content distributed is an atomic entity;
. all peers have the same upload bandwidth b.
If the size of the content is s, the time needed to download/upload the content is:

T = s / b

Plotting the number of peers reached against time (the graph is omitted): after T the source has served one peer, so 2 peers hold the content; after 2T there are 4; after 3T there are 8 — the number of holders doubles every T.
Case 1
Considering the client/server scenario, the service capacity needed is constant in time:

C(t) = B

where B is the global capacity of the server, and B > b.
Case 2
In the other approach the service capacity C(t) starts from b and grows as more peers redistribute the content. It implies that this method is very effective: in a very short time, it reaches the capacity of the client/server approach.
Now consider the case of parallel download: each peer divides its upload bandwidth in two, in such a way that two other peers can download the content simultaneously. This time the time to complete a download is:

T_x = s / (b/2) = 2s / b = 2T
The corresponding graph now shows tripling instead of doubling: after 2T there are 3 peers holding the content, after 4T there are 9 — the number of holders triples every 2T.
If the content is a chunk, comparing the two graphs it is immediately clear that it is better not to divide the bandwidth when distributing it: this speeds up the download because more peers are reached in less time. Moreover, it now becomes clear why the size of chunks is kept small: if s is small, T is also small, and if the download time is small, the redistribution takes place quickly, improving performance.
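The doubling-versus-tripling argument can be checked numerically (a small sketch; `holders` and its parameters are illustrative, not part of the protocol):

```python
def holders(periods, fanout):
    """Peers holding the content after `periods` service rounds, when
    every holder serves `fanout` new peers per round.
    fanout=1: full upload bandwidth, round length T, holders double;
    fanout=2: halved bandwidth, round length 2T, holders triple."""
    n = 1
    for _ in range(periods):
        n += n * fanout
    return n

# same elapsed time 6T in both scenarios:
print(holders(6, 1))  # → 64 peers (six rounds of T, doubling)
print(holders(3, 2))  # → 27 peers (three rounds of 2T, tripling)
```

In the same elapsed time, serving one peer at full bandwidth reaches more than twice as many peers as serving two at half bandwidth.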
The source (colored in blue in both graphs) is the peer that works for the longest time, but the (k − 1)-th step (the most effective one, because it reaches half of the peers interested in the content) works just for a while: it implies that the potential bandwidth (2^(k−1) peers uploading at the last step) is exploited only briefly.
In the Erdos-Renyi model the degree k of a node follows a binomial distribution:

p_k(G) = C(n−1, k) · p^k · (1 − p)^((n−1)−k)

where:
. n − 1 is the total number of possible experiments: all nodes minus the one considered;
. k is exactly the number of successful experiments.
If n → ∞ (with z = p(n − 1) constant):

p_k = e^(−z) · z^k / k!

and it is a Poisson distribution with parameter z: it means that E[k] = z. This approximation is due to the fact that the binomial distribution tends to a Poisson for large values of n and small values of k.
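The quality of the Poisson approximation can be checked numerically (a small sketch with hypothetical values of n and p):

```python
from math import comb, exp, factorial

def binom_pk(k, n, p):
    """Exact Erdos-Renyi degree distribution: k links out of the n-1
    possible ones, each present with probability p."""
    return comb(n - 1, k) * p**k * (1 - p)**((n - 1) - k)

def poisson_pk(k, z):
    """Poisson approximation with parameter z = p*(n-1)."""
    return exp(-z) * z**k / factorial(k)

n, p = 10_000, 0.0005        # z = p*(n-1) ≈ 5
z = p * (n - 1)
for k in (0, 5, 10):
    print(k, binom_pk(k, n, p), poisson_pk(k, z))
```

For these parameters the two probabilities agree to several decimal places.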
2.3 Bender-Canfield Model
This model deals with random graphs that have a given non-Poisson degree distribution. Graphs are built in two steps:
. assign edge-ends to nodes (for each value of the degree probability density function, edge-ends are assigned accordingly);
. randomly connect the edge-ends.
This is a different way to build random graphs with respect to the Erdos-Renyi model, because positions are independent and no notion of locality is present.
The following sections deal with properties derived from this model.
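The two construction steps can be sketched as follows (a minimal illustration of the stub-pairing idea; self-loops and multi-edges may appear, as the model allows):

```python
import random

def configuration_model(degrees):
    """Bender-Canfield construction: give each node as many edge-ends
    ("stubs") as its degree, shuffle all stubs, then pair them up."""
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    random.shuffle(stubs)
    return [(stubs[i], stubs[i + 1]) for i in range(0, len(stubs), 2)]

edges = configuration_model([3, 2, 2, 1])  # degree sum must be even
print(len(edges))  # → 4 edges
```

By construction, every node ends up with exactly its prescribed degree, whatever the (possibly non-Poisson) distribution the degrees were drawn from.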
2.3.1 Node reachability
The node reachability property studies the possibility of having a giant component: if nodes are easily reached, the probability of having a giant component increases, while, on the contrary, bad reachability implies low connectivity and, therefore, the giant component will not be present.
Consider the following topology, in which, starting from a given node (marked in orange), the reachability of 1-hop (light-blue) and 2-hop (violet) neighbors is studied:
. 1-hop neighbors: their number is the degree;
. 2-hop neighbors: to compute their number, the degree distribution of the 1-hop neighbors is required; in principle, each node has the same probability p_k.
To understand this concept, consider the star topology, in which the n nodes are composed in such a way:
. the center, with degree n − 1;
. n − 1 nodes with degree 1.
From this it is possible to derive the degree distribution:

p_1 = (n − 1)/n,    p_{n−1} = 1/n

where the height of each bar in the (omitted) plot is proportional to the probability of that degree and, to be a distribution, it is normalized. Starting from the center, the degree perceived is 1, but starting from any other node the degree perceived is n − 1, because the center is easy to reach. Therefore, if each node counts proportionally to its degree, the center counts:

(n − 1) · p_{n−1}
Therefore the distribution perceived by following a random edge is:

q_k ∝ (k + 1) · p_{k+1}

and, to be a distribution, it has to be normalized:

q_k = (k + 1) · p_{k+1} / Σ_j j p_j
The average is given by:

Avg_q = Σ_k k q_k = Σ_k k (k + 1) p_{k+1} / Σ_j j p_j
By substituting i = k + 1 (so k = 0 ⇒ i = 1):

Avg_q = Σ_{i≥1} i (i − 1) p_i / Σ_j j p_j = Σ_{i≥1} (i² − i) p_i / Σ_j j p_j
By splitting the numerator into two sums:

Avg_q = (Σ_{i≥1} i² p_i − Σ_{i≥1} i p_i) / Σ_j j p_j
Now:
. Σ_{i≥1} i² p_i is the second moment <k²>;
. Σ_{i≥1} i p_i and Σ_j j p_j are the first moment (average) <k>.
Therefore:

Avg_q = (<k²> − <k>) / <k>
Since this represents the average number of nodes discovered in two hops, it will be denoted with z_2. Up to now only the 2-hop neighbors of one 1-hop neighbor of the given node have been considered (the paths highlighted in red in the omitted picture). Of course, the initial node has more neighbors, so, to compute z_2 exactly, all of them have to be considered: to do this, it is just needed to multiply by the number of neighbors of the initial node, and this number is the degree <k> (also called z_1, to emphasize that it counts the 1-hop reachable neighbors):

z_2 = ((<k²> − <k>) / <k>) · z_1 = ((<k²> − <k>) / <k>) · <k> = <k²> − <k>
The formula shows how the number of reachable nodes grows: the dominant value is <k²>.
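The quantities z_1 and z_2 can be computed directly from a degree distribution (a small sketch with a hypothetical two-valued distribution):

```python
def moments(pk):
    """First and second moment of a degree distribution {degree: prob}."""
    m1 = sum(k * p for k, p in pk.items())
    m2 = sum(k * k * p for k, p in pk.items())
    return m1, m2

# hypothetical distribution: half of the nodes have degree 1, half degree 3
m1, m2 = moments({1: 0.5, 3: 0.5})
z1 = m1           # <k>: average number of 1-hop neighbors
z2 = m2 - m1      # <k^2> - <k>: average number of 2-hop neighbors
print(z1, z2, z2 / z1)  # → 2.0 3.0 1.5  (z2/z1 > 1: giant component expected)
```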
Example
If the distribution is Poisson (as in the Erdos-Renyi model) the variance is equal to the mean value:

<k> = <k²> − (<k>)²  ⇒  <k²> = (<k>)² + <k>

Therefore:

z_2 = <k²> − <k> = (<k>)² + <k> − <k> = (<k>)²
Starting from z_2, by iteration, it is possible to discover that:

z_m = ((<k²> − <k>) / <k>) · z_{m−1}
Since:
. z_2 = <k²> − <k>
. z_1 = <k>
the result is:

z_m = (z_2 / z_1) · z_{m−1} = (z_2 / z_1)^(m−1) · z_1
By analysing the fraction z_2/z_1:
. if z_2/z_1 < 1, when m grows (the distance grows) the number of reached nodes stays essentially constant, so there is bad connectivity: it implies that there is no giant component;
. if z_2/z_1 > 1, on the contrary, all conditions lead to a giant component;
. if z_2/z_1 = 1, there is the so-called critical condition: it is difficult to study the behaviour.
Example
Focusing on the Erdos-Renyi model in critical conditions:

z_2 = (<k>)²,    z_2/z_1 = 1  ⇒  (<k>)² / <k> = 1

therefore:

<k> = 1

The condition that leads to a giant component is:

<k> > 1

Since z_2/z_1 = <k>, it is possible to discover that:

z_m = (<k>)^(m−1) · z_1 = (<k>)^(m−1) · <k> = (<k>)^m

it means that the discovery process of reachable nodes grows geometrically.
The average distance l is obtained by imposing that at distance l the whole network (n nodes) is reached:

(z_2/z_1)^(l−1) · z_1 = n

By taking the logarithm:

(l − 1) · log(z_2/z_1) = log(n/z_1)  ⇒  l − 1 = log(n/z_1) / log(z_2/z_1)
In conclusion:

l = log(n/z_1) / log(z_2/z_1) + 1

where such l is the average distance inside the network: it is also called diameter. The parameter l grows as the logarithm of n: if the number of nodes is very large, l does not grow too much, therefore the small-world effect is ensured. It also means that randomly built graphs have a short average distance.
Since in the Erdos-Renyi model z_1 = <k> = z and z_2 = (<k>)² = z²:

l = log(n/z) / log z + 1 = (log n − log z + log z) / log z = log n / log z
This behavior is also valid for tree topologies, while for regular structures:
. the ring has an average distance that grows with n (because it is n/2);
. a grid topology with n² nodes has an average distance that grows with n (the square root of the number of nodes).
It means that regular structures have intrinsically worse performance because they:
. have higher distances;
. are less robust to churning (maintenance is hard).
Example
Consider an average per-hop delay D = 0.2 s; to not exceed a maximum average delay R = 1 s, the distance l should satisfy:

l · D < R

By using l = log n / log z:

(log n / log z) · D < R  ⇒  log z > log n · D / R

Consider:
. n = 10⁴ ⇒ log z > 4 · 0.2 = 0.8 ⇒ z > 6.3
. n = 10⁶ ⇒ log z > 6 · 0.2 = 1.2 ⇒ z > 15.8
The term l · D is the average delay to reach the farthest node.
It means that the degree only needs to increase by a factor of about 2.5 every time the number of nodes increases by a factor of 100.
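The computation of the example can be reproduced as follows (a small sketch; `min_degree` is an illustrative helper using base-10 logarithms, as in the example):

```python
from math import log10

def min_degree(n, D, R):
    """Smallest average degree z such that the average delay l*D stays
    below R, with l = log n / log z (base-10 logarithms)."""
    return 10 ** (log10(n) * D / R)

print(round(min_degree(10**4, 0.2, 1.0), 1))  # → 6.3
print(round(min_degree(10**6, 0.2, 1.0), 1))  # → 15.8
```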
The critical condition z_2/z_1 = 1, i.e. <k²> − <k> = <k>, can be rewritten as:

Σ_k k (k − 2) p_k = 0

By analysing this expression, it is clear that terms with k = 0, 1, 2 have no effect on the final result (the occurrence of the giant component) because:
. terms with k = 0 are isolated nodes;
. in terms of reachability, k = 1 and k = 2 are the same: a degree-2 node just extends a path.
2.3.3 Clustering
The following analysis is performed for any distribution that is not Poisson; the clustering property gives the probability that two neighbors of a given node are themselves neighbors. For this to be verified, the orange link in the following picture (between B and C, both neighbors of A) must be established:
A
B
C
Therefore the clustering coefficient describes how much locality is introduced into the network. Considering that:
. node B has connectivity k_i;
. node C has connectivity k_j,
the clustering coefficient is given by:

c = (<k_i> · <k_j>) / (n · z)

where:
. the numerator represents all the ways in which it is possible to connect the two nodes;
. the denominator represents the average number of links in the network, because it is given by the number of nodes n multiplied by the average degree of each node z.
For 1-hop neighbors the relevant distribution is q, so <k_i> = <k_j> = (<k²> − <k>)/<k> and:

c = (1/(n·z)) · ((<k²> − <k>) / <k>)²
By multiplying and dividing by z²:

c = (z/n) · ((<k²> − <k>) / (<k>)²)²
Now, the quantity (<k>)² is added and subtracted in the numerator:

c = (z/n) · ((<k²> − (<k>)² + (<k>)² − <k>) / (<k>)²)²
In this way it is possible to recognize, within the numerator, the variance. Since the coefficient of variation is defined as:

c_v = √Var / avg = √(<k²> − (<k>)²) / <k>

within the clustering coefficient it is possible to recognize its square:

(c_v)² = (<k²> − (<k>)²) / (<k>)²
Therefore:

c = (z/n) · ((c_v)² + (<k> − 1)/<k>)²

Since the clustering coefficient depends on the square of the coefficient of variation, the dominant term is the variance. In conclusion the variance is extremely important: it ensures high connectivity and introduces locality.
2.4. Heavy-Tailed Distribution 65
The (omitted) diagram summarizes this: the variance drives both the giant component and the clustering coefficient.
Example
Using these formulas for the Erdos-Renyi model:

(c_v)² = Var / (<k>)² = <k> / (<k>)² = 1/z

Therefore:

c = (z/n) · (1/z + (z − 1)/z)² = (z/n) · 1 = (n·p)/n = p
Indeed, p is the probability that two nodes are connected by a link, so it is also the clustering coefficient.
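The formula can be checked numerically for the Erdos-Renyi case (a small sketch with hypothetical n and z):

```python
def clustering(z, n, cv2):
    """Clustering coefficient c = (z/n) * (cv^2 + (z-1)/z)^2 for a random
    graph with average degree z and squared coefficient of variation cv^2."""
    return (z / n) * (cv2 + (z - 1) / z) ** 2

# Erdos-Renyi: Var = <k> = z, so cv^2 = z/z^2 = 1/z, and c must equal p = z/n
n, z = 1000, 5
print(clustering(z, n, 1 / z), z / n)  # both ≈ 0.005
```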
A heavy-tailed degree distribution behaves as a power law in the degree, p_k ∝ k^(−α). This behavior is not really good, because both the small-world and the clustering property depend largely on the variance. But the distribution comes from measurements, and the tail is typically difficult to estimate precisely.
Scale-free property
The scale-free property says that after a rescaling of the degree the shape of the distribution does not change. But the mean value is not very representative of the systems described before: think of connection times. There are few users that have very long connection times, while the major part of users have short ones.
2.5 Watts-Strogatz model
This model represents a family of random graphs that is obtained as an intermediate solution between pure random graphs and regular structures. This interpolation provides the peculiar properties of both families:
. regular structures (lattices): notion of locality (clustering);
. random graphs: small-world effect.
Starting from a regular structure (a ring, for example), a Watts-Strogatz model is built by introducing randomness:
The connectivity, considering a given node (marked in blue in the picture), is:
. m nodes in the clockwise order;
. m nodes in the counter-clockwise order.
Therefore, each node has a degree equal to 2m. The average distance between nodes grows linearly with the number of nodes n in the network: thanks to short cuts (as in Chord) it is possible to reduce it. Indeed, the process to obtain a Watts-Strogatz model is:
. for each node:
. take each clockwise link;
. rewire it randomly with probability p (or keep it with probability 1 − p).
The properties mentioned before (small-world effect and clustering) depend on p:
. if it is large, the system tends to be a pure random graph (for p → 1 it tends to an Erdos-Renyi graph);
. if it is small, the system tends to be a regular structure with high clustering (long fixed routes to reach the farthest nodes).
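The ring-plus-rewiring construction can be sketched as follows (a minimal illustration; links are stored as directed pairs for simplicity, and colliding rewires are simply re-drawn):

```python
import random

def watts_strogatz(n, m, p):
    """Watts-Strogatz construction: start from a ring where each node is
    linked to its m clockwise neighbours (total degree 2m), then rewire
    each clockwise link to a random endpoint with probability p."""
    edges = set()
    for u in range(n):
        for j in range(1, m + 1):
            v = (u + j) % n                   # regular clockwise neighbour
            if random.random() < p:
                v = random.randrange(n)       # rewire to a random node
            while v == u or (u, v) in edges:  # avoid self-loops/duplicates
                v = random.randrange(n)
            edges.add((u, v))
    return edges

graph = watts_strogatz(20, 2, 0.1)
print(len(graph))  # → 40: n*m clockwise links in total
```

With p = 0 the result is the pure ring lattice; raising p towards 1 progressively turns it into a random graph.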
2.5.1 Clustering analysis
When p = 0, the clustering coefficient is:

c = 3(m − 1) / (2(2m − 1))

therefore it depends basically on m, and it is very high (a constant greater than 0, while for Erdos-Renyi it is something near 10⁻⁴). It means that the probability for two nodes of being neighbors is high if they have a common neighbor.
Indeed, look at the following picture:
the green nodes are neighbors and have a common neighbor: the blue node. This behavior has to be taken into account not just considering the degree of one node, but considering the degrees of all of them: the result is a very high locality.
When p > 0:

c = (3(m − 1) / (2(2m − 1))) · (1 − p)³

it means that when p increases, the connectivity based on locality decreases.
2.5.2 Small-world analysis
The small-world property describes the distance between nodes. In regular structures the average distance depends on the number of nodes: if it is a grid:
. with 2 dimensions, the complexity is O(√n);
. with 3 dimensions, the complexity is O(n^(1/3)).
In general, for a d-dimensional grid:

l = O(n^(1/d))
Look at the following graph: in the top-left region the values of p lead to a regular structure, while the bottom-right region describes random graphs. In the centre there is a zone in which both the small-world property and clustering are satisfied.
Considering the ring, it is possible to say that, by introducing a few short cuts (few with respect to the number of links), the small-world property starts to be ensured, because those short cuts connect very distant nodes. When the number of inserted short cuts increases, their benefit decreases: it is better, indeed, to introduce few of them, use them just to reach the farthest regions, and then use the locality connections to reach the destination.
With short cuts, the size of the regions obtained by splitting the ring is given by:

n / (n·p) = 1/p

2.6 Theory of evolving networks
When a node s joins the network (at time s) it is attached with m links, so its initial degree is:

k_s(s) = m
Barabasi-Albert criterion
This approach says that scale-free networks are built with a preferential attachment criterion. The algorithm is:
. start with an initial graph;
. at each step a node is attached (with m links);
. links are preferentially attached to nodes based on their degree:

Π(k_s(t)) = k_s(t) / Σ_j k_j(t)    (2.3)
The term:

Σ_j k_j(t)

is a normalization coefficient: statistically, it accounts for the degrees of all the nodes (the sum of all edge ends).
By substituting (2.3) in (2.2), it is possible to obtain:

∂k_s(t)/∂t = m · k_s(t) / (2mt + 2m_0·<k_s>)    (2.4)
where:
. 2mt represents the edge ends of the links already introduced in the network;
. 2m_0·<k_s> is the contribution of the initial graph, since m_0 is the number of links at the start and <k_s> is the average degree at the beginning.
The denominator is, globally, the normalization coefficient seen in (2.3). Equation (2.4) shows that at each step t, 2m new edge ends are introduced: this is the contribution of the degree of two different nodes.
At the beginning:

k_s(s) = m,    <k_s> = 2m

At the end:

k_s(t) ≈ m · (t/s)^(1/2)    for t → ∞
This suggests that the degree increases as a square-root function of time; the denominator s represents the moment at which the node joined: the degree is high if the node is older, therefore it depends on the age of nodes. Consider a second node s′ with s < s′ < t. The ratio of the two degrees is:

k_{s′}(t) / k_s(t) = (s/s′)^(1/2)
Looking at large values of t:

p_k = 2m² · k^(−3)

therefore the probability that a node has degree k is a heavy-tailed distribution: the scale-free property is ensured. As for the small-world property and the clustering:
l ∝ log n / log log n,    c = (m / (8n)) · (log n)²

The small-world property is expected because there are few very well connected nodes: they are the oldest ones. The clustering property is similar to the Erdos-Renyi model, in that it decreases with the number of nodes.
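The preferential-attachment growth can be sketched as follows (a minimal illustration; the `repeated` list holds each node id once per edge end, so sampling from it is automatically proportional to degree, as in eq. (2.3)):

```python
import random

def barabasi_albert(n, m):
    """Grow a graph by preferential attachment: each new node brings m
    links whose targets are sampled proportionally to current degree."""
    targets = list(range(m + 1))
    repeated = []                # each node id appears once per edge end
    for i in targets:            # start from a small complete graph
        for j in targets:
            if i < j:
                repeated += [i, j]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:   # degree-proportional sampling
            chosen.add(random.choice(repeated))
        for t in chosen:
            repeated += [new, t]
    return repeated              # degree of v == repeated.count(v)

degree_ends = barabasi_albert(200, 2)
print(degree_ends.count(199) <= degree_ends.count(0))  # → True (older is richer)
```

The last line reflects the age effect derived above: the newest node has degree exactly m, while the oldest nodes keep accumulating links.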
2.7 Resume scheme

Model                     Small-world                       Clustering
ER                        l = log n / log z                 c = p ∝ 1/n
RG with empirical distr.  l = log(n/z_1) / log(z_2/z_1)     c = (z/n)·((c_v)² + (z−1)/z)²
WS (p = 0)                l ∝ n (not ensured)               c = 3(m−1)/(2(2m−1))
WS (p > 0)                ensured                           high clustering
BA                        ensured                           low clustering
For random graphs with an empirical distribution, both the small-world property and the clustering depend on the variance: with a power law, the scale-free property is ensured.
For Watts-Strogatz the value of p should be taken in the range:

1/n < p < 1