
Peer-Assisted Content Distribution Networks:
Techniques and Challenges
Pei Cao
Stanford University

Traditional Intra-Provider Content Distribution Networks
[Figure: hierarchy of National Center, Regional Centers, Branches, and Users]

Peer-to-Peer Content Distribution
[Figure: hierarchy of National Center, Regional Centers, Branches, and Users]

P2P vs CDN
P2P:

No infrastructure cost
Supply grows linearly with demand
Simple distributed, randomized algorithms
No QoS

CDN:

Initial infrastructure cost


Centralized scheduling algorithms
Network efficiency
Capable of supporting QoS

Combine P2P with CDN?


Use P2P to complement CDN
P2P reduces load on the CDN, covers areas
where CDN is not installed
Must be able to control, or shape, P2P traffic

Use CDN to complement P2P


CDN steps in when peer-based distribution is
falling short, enabling QoS
Must be able to detect when peers won't meet
the delivery time guarantee

Outline
Review of BitTorrent
Traffic-shaping BitTorrent: biased
neighbor selection
QoS in BitTorrent: delivery time
prediction

BitTorrent File Sharing Network
Goal: replicate K chunks of data among
N nodes

Form neighbor connection graph


Neighbors exchange data

BitTorrent: Neighbor Selection
[Figure: a tracker (file.torrent), a seed holding the whole file, and numbered peers]

BitTorrent: Piece Replication
[Figure: a seed holding the whole file, a tracker (file.torrent), and numbered peers]

BitTorrent: Piece Replication Algorithms
Tit-for-tat (choking/unchoking):
Each peer only uploads to 7 other peers at a time
6 of these are chosen based on amount of data
received from the neighbor in the last 20 seconds
The last one is chosen randomly, with a 75% bias
toward newcomers
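
The choking rule above can be sketched roughly as follows. This is a minimal illustration, not the client's actual code: the function name, the `newcomers` set, and the reading of "75% bias" as "pick a newcomer with probability 0.75 when one is available" are all assumptions.

```python
import random

def choose_unchoked(received_last_20s, newcomers, regular_slots=6,
                    newcomer_bias=0.75, rng=random):
    """Return the set of neighbors to unchoke (hypothetical helper).

    received_last_20s: dict neighbor -> bytes received from it in the last 20 s
    newcomers: set of neighbors treated as newcomers (definition assumed)
    """
    # Regular slots: the neighbors that sent us the most data recently.
    ranked = sorted(received_last_20s, key=received_last_20s.get, reverse=True)
    unchoked = set(ranked[:regular_slots])

    # Last slot: one random choked neighbor, biased toward newcomers.
    choked = [n for n in received_last_20s if n not in unchoked]
    if choked:
        new = [n for n in choked if n in newcomers]
        old = [n for n in choked if n not in newcomers]
        if new and (not old or rng.random() < newcomer_bias):
            unchoked.add(rng.choice(new))
        else:
            unchoked.add(rng.choice(old))
    return unchoked
```

With ten neighbors, the call returns seven of them: the six best recent uploaders plus one randomly chosen other.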

(Local) Rarest-first replication:


When peer 3 unchokes peer A, A selects which
piece to download
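
A minimal sketch of local rarest-first selection, assuming the downloading peer knows which pieces each of its neighbors holds. The function name and the deterministic tie-break by lowest piece index are illustrative assumptions, not taken from the slides.

```python
from collections import Counter

def pick_rarest_piece(my_pieces, uploader_pieces, neighbor_piece_sets):
    """Pick the piece to request from a neighbor that just unchoked us.

    my_pieces: set of piece indices we already hold
    uploader_pieces: set of piece indices the unchoking neighbor holds
    neighbor_piece_sets: iterable of piece sets, one per neighbor (local view)
    """
    # Count how many neighbors hold each piece.
    counts = Counter()
    for pieces in neighbor_piece_sets:
        counts.update(pieces)

    # Only pieces the uploader has and we lack are useful.
    candidates = uploader_pieces - my_pieces
    if not candidates:
        return None

    # Fewest local copies wins; ties go to the lowest index (assumption).
    return min(candidates, key=lambda p: (counts[p], p))
```

The counts are local: they cover only the peer's own neighbors, which is why the slide labels the policy "(Local)".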

Analysis of BitTorrent
Conclusion from modeling studies:
BitTorrent is nearly optimal in
idealized, homogeneous networks

Demonstrated by simulation studies


Confirmed by theoretical modeling studies
Intuition: in a random graph,
Prob(Peer A's content is a subset of Peer B's)
≤ 50%

Traffic-Shaping BitTorrent

Random Neighbor Graph


Existing studies all assume random
neighbor selection

BitTorrent is no longer optimal if nodes in
the same ISP only connect to each other

Random neighbor selection leads to high
cross-ISP traffic

Difficulty in Traffic-Shaping P2P Applications
ISPs:
Different links have different monetary costs
Prefer clustering of traffic

P2P Applications:
No knowledge of underlying ISP topology
Use randomized algorithms that don't do
well under clustering

Current solution: throttling, which makes
users suffer

A Network-Friendly
BitTorrent?
ISPs inform BitTorrent of their link
preferences
Algorithm of BitTorrent is adjusted
such that both users and ISPs benefit

Example: Biased Neighbor Selection


Works when cost function is transitive

Biased Neighbor Selection


Idea: of N neighbors, choose N-k from
peers in the same ISP, and choose k
randomly from peers outside the ISP
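
The idea can be sketched as a selection routine that a tracker might run when a peer asks for neighbors. This is a minimal sketch under stated assumptions: the function name, the `candidates` mapping from peer to ISP label, and the fallback when the ISP has too few peers are all hypothetical.

```python
import random

def biased_neighbors(peer_isp, candidates, n, k, rng=random):
    """Choose up to n neighbors: n-k inside the peer's ISP, k outside.

    candidates: dict peer_id -> ISP label, excluding the requesting peer
    """
    internal = [p for p, isp in candidates.items() if isp == peer_isp]
    external = [p for p, isp in candidates.items() if isp != peer_isp]

    # Take n-k neighbors from the same ISP, or all of them if there are fewer.
    chosen = rng.sample(internal, min(n - k, len(internal)))

    # Fill the rest from outside the ISP. With enough internal peers this
    # is exactly k; otherwise more external peers are used (assumption).
    want_external = min(n - len(chosen), len(external))
    chosen += rng.sample(external, want_external)
    return chosen
```

For example, `biased_neighbors("A", candidates, n=35, k=1)` returns 34 peers from ISP "A" and one from elsewhere when enough peers exist; the values 35 and 1 are only illustrative.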

Implementing Biased Neighbor Selection
By Tracker
Need ISP affiliations of peers
Peer-to-AS maps
Public IP address ranges from ISPs
Special X- HTTP header

By traffic shaping devices

Intercept peer-to-tracker messages
and manipulate responses
No need to change tracker or client

Evaluation Methodology
Event-driven simulator
Use actual client and tracker code as much as
possible
Calculate bandwidth contention, assuming perfect fair share from TCP

Network settings
14 ISPs, each with 50 peers, 100Kb/s upload, 1Mb/s
download
Seed node, 400Kb/s upload
Optional university nodes (1Mb/s upload)
Optional ISP bottleneck to other ISPs

Limitation of Throttling

Throttling: Cross-ISP Traffic

Redundancy: Average # of times a data chunk enters the

Biased Neighbor Selection: Download Times

Biased Neighbor Selection: Cross-ISP Traffic

Importance of Rarest-First
Replication
Random piece replication performs
badly

Increases download time by 84% - 150%
Increases traffic redundancy from 3 to 14

Biased neighbors + Rarest-First


More uniform progress of peers

Presence of External High-Bandwidth Peers
Biased neighbor selection alone:
Average download time same as regular
BitTorrent
Cross-ISP traffic increases as the number of university
peers increases
Result of tit-for-tat

Biased neighbor selection + Throttling:

Download time only increases by 12%


Most neighbors do not cross the bottleneck
Traffic redundancy (i.e. cross-ISP traffic) same as
the scenario without university peers

Comparison with Simple Clustering
Gateway peer: only one peer
connects to the peers outside the
ISP; all other peers only connect to
peers inside the ISP
Gateway peer must have high
bandwidth
It is the seed for this ISP
Ends up benefiting peers in other ISPs

Combining Biased Neighbor Selection with Caches
Under random neighbor selection
bandwidth requirement of cache is high

Under biased neighbor selection


bandwidth needed from the cache is
reduced by an order of magnitude

Conclusions
By choosing neighbors well, BitTorrent
can achieve high peer performance
without increasing ISP cost

Biased neighbor selection: choose the initial
set of neighbors well
Can be combined with throttling and
caching

BitTorrent's algorithm can be shaped!

Delivery Time Prediction

Motivation
Provide delivery time guarantee under
P2P+CDN
What contributes to delivery time of a
download via BitTorrent?
From simulations: seed bandwidth and
even replication of blocks
Missing: node join/leave dynamics, TCP
effects, etc.

Side-by-Side Live Experiments
Two clients, running on the same
machine, starting at the same time,
downloading the same file
13 experiments from Apr-May 2006
File sizes: 700MB ~ 1.4GB
Network size: 1100 ~ 2100 peers
Duration: 10 hrs ~ 2 days

Results from Experiments


Effective download rate: 10 ~ 30KB/s
Speed difference between the two
peers: 3% ~ 82%

What made the slower peer slow?

Suspicion #1: Slower Neighbors?
Calculate unweighted average of observed
throughput at application level

R1: average from all neighbors


R2: average from neighbors uploading >250KB of data
R3: average from neighbors uploading >2.5MB of data

Low correlation between download-time ratio and
neighbor-speed ratio

0.57 for R1, 0.43 for R2, 0.47 for R3
Faster neighbors correspond to slower downloads in 3
experiments

Suspicion #2: Fewer Neighbors Uploading to the Peer?
Slot analysis: calculate download concurrency
Maximum number of neighbors: 35
Neighbors come and go, so align neighbors into 35
slots
Calculate time-average of number of concurrent
slots with neighbors uploading
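
The time-averaged concurrency can be computed without materializing the slots, assuming each neighbor's uploading periods are available as (start, end) intervals. That input format and the function name are assumptions made for this sketch; the slides do not describe how the trace was stored.

```python
def average_concurrency(upload_intervals, t_start, t_end):
    """Time-average of the number of neighbors uploading concurrently.

    upload_intervals: list of (start, end) times during which some
    neighbor was uploading to the peer (hypothetical trace format)
    """
    total = 0.0
    for start, end in upload_intervals:
        # Clip each interval to the observation window.
        overlap = min(end, t_end) - max(start, t_start)
        if overlap > 0:
            total += overlap
    # Summed uploading time divided by elapsed time gives the average
    # number of simultaneously active uploaders.
    return total / (t_end - t_start)
```

Summing interval lengths works because the integral of the concurrent-uploader count over time equals the total uploading time across all neighbors.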

Upload concurrency varies from 7 to 11

Explains one of the download-time/neighbor-speed
reversal cases
But doesn't explain the two others

Close Neighbors
90% of data downloaded from 1-4%
of neighbors
Let F(p) and G(p) be the number of
neighbors that provide a fraction p of the data to
peers F and G, respectively; then
F(p) > G(p) implies peer F is slower than G
This holds for p = 90%, 75%, and 50%
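
One way to compute such a count from a download trace is sketched below. It reads "the number of neighbors that provide p of the data" as the smallest number of top contributors whose combined bytes reach a fraction p of the total; that reading, the function name, and the input format are assumptions.

```python
def neighbors_providing(bytes_from_neighbor, p):
    """Smallest number of top-contributing neighbors covering fraction p.

    bytes_from_neighbor: dict neighbor -> bytes downloaded from it
    p: fraction of the total download, e.g. 0.9 for 90%
    """
    total = sum(bytes_from_neighbor.values())
    if total == 0:
        return 0
    covered = 0
    # Walk neighbors from largest contributor to smallest.
    ranked = sorted(bytes_from_neighbor.values(), reverse=True)
    for count, amount in enumerate(ranked, start=1):
        covered += amount
        if covered >= p * total:
            return count
    return len(ranked)
```

Running this for both side-by-side clients at p = 0.9, 0.75, and 0.5 would give the F(p) and G(p) values compared above.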

What makes a neighbor close?
Not related to speed, or order of
connection to peer, or order of
unchoking by peer

Cost of Departure of a Close Neighbor
Departure cost: if one close neighbor
leaves, calculate the time until the
earliest next close neighbor
The average departure cost: 30 min
Convergence of the tit-for-tat algorithm is slow

Why Do Close Neighbors Leave?
Five possible reasons

A: Random disconnect
B: Finished downloading
C: Peer broke off the relationship
D: Neighbor broke off the relationship

Results: B is most common, followed
by C/D, then A

Conclusions
Content delivery time in BitTorrent is
determined by:

Neighbor upload speed


Stability of neighbor relationship
Disruption of the pairing leads to long
delivery time
Neighbors may leave due to random
disconnection, completion of download, or
finding faster neighbors

Using CDN to Complement P2P
Use CDN nodes as high-speed,
specially managed seeds

Seeds are called to help whenever a
node loses a close neighbor

Summary
A way to shape BitTorrent traffic
Predicting BitTorrent performance by
monitoring close peer relationship

Related Work
Many modeling studies of BitTorrent
Simulation studies
Measurements of real torrents

Ongoing Work
Live experiments with biased
neighbor selection
A k-regular graph algorithm with
faster convergence
Prototype implementation of
P2P+CDN
