
Adapting Routing to the Traffic

Challenges
Reacting quickly to alleviate congestion
Avoiding over-reacting and causing oscillations
Limiting bandwidth & CPU overhead on routers

Load-sensitive routing
Routers adapt to link load in a distributed fashion
At the packet level, or on groups of packets

Traffic engineering
Centralized computation of routing parameters
Network-wide measurements of offered traffic

Do IP Networks Manage Themselves?


TCP congestion control
Senders react to congestion
Decrease sending rate
But the TCP sessions receive lower throughput

IP routing protocols
Routers react to failures
Compute new paths
But the new paths may be congested


Do IP Networks Manage Themselves?


In some sense, yes:
TCP senders send less traffic during congestion
Routing protocols adapt to topology changes

But, does the network run efficiently?


Congested link when idle paths exist?
High-delay path when a low-delay path exists?

Adapting the Routing to the Traffic


Goal: modify the routes to steer traffic through the network in the most effective way
Approach #1: load-sensitive protocols
Distribute traffic & performance measurements
Routers compute paths based on load

Approach #2: adaptive management system


Collect measurements of traffic and topology
Management system optimizes the parameters

Debates still today about the right answer



Load-Sensitive Routing Protocols


Advantages
Efficient use of network resources
Satisfying the performance needs of end users
Self-managing network takes care of itself

Disadvantages
Higher overhead on the routers
Long alternate paths consume extra resources
Instability from out-of-date feedback information

Packet-Based Load-Sensitive Routing


Packet-based routing
Forward packets based on forwarding table

Load-sensitive
Compute table entries based on load or delay

Questions
What link metrics to use?
How frequently to update the metrics?
How to propagate the metrics?
How to compute the paths based on metrics?

Original ARPANET Algorithm (1969)


Routing algorithm
Shortest-path routing based on link metrics
Instantaneous queue length plus a constant
Distributed shortest-path algorithm (Bellman-Ford)
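As a rough illustration of the scheme above, the following sketch runs Bellman-Ford relaxation with a link metric equal to the queue length plus a constant. It is a simplification: the relaxation runs in one process rather than being distributed across routers, and the topology, queue lengths, and the value of `HOP_CONSTANT` are made-up assumptions.

```python
# Hypothetical sketch: Bellman-Ford shortest paths where each link metric is
# its instantaneous queue length plus a constant (constant value is assumed).
HOP_CONSTANT = 1

def link_metric(queue_length, constant=HOP_CONSTANT):
    return queue_length + constant

def bellman_ford(nodes, queues, dest):
    """queues maps a directed link (u, v) to its current queue length."""
    dist = {n: float("inf") for n in nodes}
    next_hop = {n: None for n in nodes}
    dist[dest] = 0
    # Relax every link at most len(nodes) - 1 times, stopping early if stable
    for _ in range(len(nodes) - 1):
        changed = False
        for (u, v), q in queues.items():
            candidate = link_metric(q) + dist[v]
            if candidate < dist[u]:
                dist[u], next_hop[u] = candidate, v
                changed = True
        if not changed:
            break
    return dist, next_hop

nodes = ["A", "B", "C", "D"]
queues = {
    ("A", "B"): 0, ("B", "D"): 19,  # B->D is the congested link (metric 20)
    ("A", "C"): 1, ("C", "D"): 2,
}
dist, next_hop = bellman_ford(nodes, queues, "D")
print(dist["A"], next_hop["A"])  # A avoids the congested link via C
```

With these numbers, router A reaches D through C because the long queue on B->D inflates that link's metric.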
[Figure: example network with per-link metrics; one link is marked as congested]

Performance of ARPANET Algorithm


Light load
Delay dominated by transmission & propagation
So, link metrics don't fluctuate much

Medium load
Queuing delay is no longer negligible
Moderate traffic shifts to avoid congestion

Heavy load
Very high metrics on congested links
Busy links look bad to all of the routers
All routers avoid the busy links
Routers may send packets on longer paths

Problem: Out-of-Date Information


[Figure: the Lincoln Tunnel and the Holland Tunnel as alternate routes between NJ and NYC]

A backup at the Lincoln Tunnel reported on the radio triggers congestion at the Holland Tunnel

Routers make decisions based on old information


Propagation delay in flooding link metrics
Thresholds applied to limit number of updates

Old information leads to bad decisions


All routers avoid the congested links
leading to congestion on other links
and the whole thing repeats
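The oscillation can be reproduced with a toy model. The sketch below assumes two parallel links (named after the tunnels), a load report that is always one step old, and routers that all pick whichever link looked least loaded in that report. The function name and traffic volume are illustrative assumptions.

```python
def simulate(steps, total_traffic=10):
    """Return the link chosen at each step when decisions use stale loads."""
    reported = {"lincoln": 0, "holland": 0}  # loads as last advertised
    history = []
    for _ in range(steps):
        # Every router reads the same stale report and makes the same choice
        # (ties are broken alphabetically to keep the run deterministic)
        choice = min(reported, key=lambda k: (reported[k], k))
        actual = {k: 0 for k in reported}
        actual[choice] = total_traffic
        history.append(choice)
        reported = actual  # this report is only seen at the next step
    return history

history = simulate(4)
print(history)
```

All traffic flips from one link to the other on every step, so the link that looked idle is always the one that becomes congested.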


Problem: Frequent Updates


Update messages
Link keeps track of its metric (e.g., queuing delay)
Link transmits updates when the metric changes

Frequency of updates
Frequent changes to the metric lead to frequent updates
Significantly increases the overhead of the protocol

Oscillation makes the problem worse


Oscillation leads to wild swings in the link metrics
Forcing very frequent update messages
that add to the load on the links in the network

Second ARPANET Algorithm (1979)


Link-state protocol
Old: Distributed path computation leads to loops
New: Better to flood metrics and have each router compute the shortest paths

Averaging of the link metric over time


Old: Instantaneous delay fluctuates a lot
New: Averaging reduces the fluctuations

Reduce frequency of updates


Old: Sending updates on each change is too much
New: Send updates if change passes a threshold
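The averaging and threshold ideas can be sketched together. The slides do not say how the average is computed, so the exponentially weighted average used here is an assumption, as are the class name, `alpha`, and `threshold`.

```python
class LinkMetricReporter:
    """Smooths a link's delay samples and decides when to flood an update."""

    def __init__(self, alpha=0.5, threshold=2.0):
        self.alpha = alpha          # weight given to the newest sample
        self.threshold = threshold  # minimum change that triggers an update
        self.average = None
        self.last_sent = None

    def observe(self, delay):
        """Return the metric to advertise, or None if no update is needed."""
        if self.average is None:
            self.average = delay
        else:
            self.average = self.alpha * delay + (1 - self.alpha) * self.average
        if self.last_sent is None or abs(self.average - self.last_sent) >= self.threshold:
            self.last_sent = self.average
            return self.average
        return None

reporter = LinkMetricReporter(alpha=0.5, threshold=2.0)
updates = [reporter.observe(d) for d in [10, 12, 9, 20, 20]]
print(updates)
```

Small fluctuations around 10 produce no update; only the sustained jump to 20 is advertised, and the advertised value approaches 20 gradually.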


Problem of Long Alternate Paths


Picking alternate paths
Long path chosen by one router consumes resources that other packets could have used
Leads other routers to pick other alternate paths

Solution: limit path length


Bound the value of the link metric
E.g., "this link is busy enough to go two extra hops"

Extreme case
Limit path selection to the shortest paths
Pick least-loaded shortest path in the network


Load-Sensitive Routing
Timescales
What timescale of routing decisions?
What timescale of feedback about link loads?

Load-sensitive routing at packet level


Routers receive feedback on load and delay
Routers re-compute their forwarding tables
Fundamental problems with oscillation

Load-sensitive routing for groups of packets


Routers receive feedback on load and delay
Routers compute a path for the next flow or circuit
Less oscillation, as long as circuits last for a while

Reducing Effects of Out-of-Date Info


Send link metrics more often
But, leads to higher overhead
But, propagation delay is a fundamental limit

Make the traffic last longer


Route on groups of packets, rather than packets
Fewer routing decisions, and more accurate feedback

Groups of packets
Telephone network: phone call (3 minutes long)
Internet: TCP connection (10 packets long)
Internet: all traffic between a pair of hosts, or routers,

Traffic Engineering as a Network Management Problem: Case Study


Using Traditional Routing Protocols


Routers flood information to learn topology
Determine next hop to reach other routers
Compute shortest paths based on link weights

Link weights configured by network operator



Approaches for Setting the Link Weights


Conventional static heuristics
Proportional to physical distance
Cross-country links have higher weights
Minimizes end-to-end propagation delay

Inversely proportional to link capacity


Smaller weights for higher-bandwidth links
Attracts more traffic to links with more capacity

Tune the weights based on the offered traffic


Network-wide optimization of the link weights
Directly minimize metrics like max link utilization
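As a small example of the capacity-based heuristic, the sketch below sets each weight to a reference bandwidth divided by the link capacity. The reference value, the rounding rule, and the link names are assumptions chosen for illustration.

```python
def inverse_capacity_weights(capacity_mbps, reference_mbps=100_000):
    """Weight = reference / capacity, rounded to an integer and at least 1."""
    return {
        link: max(1, round(reference_mbps / cap))
        for link, cap in capacity_mbps.items()
    }

capacity_mbps = {
    ("A", "B"): 10_000,   # 10 Gb/s
    ("B", "C"): 100_000,  # 100 Gb/s
    ("A", "C"): 1_000,    # 1 Gb/s
}
weights = inverse_capacity_weights(capacity_mbps)
print(weights)
```

Here the path A-B-C costs 11 while the direct low-capacity link A-C costs 100, so shortest-path routing sends traffic over the higher-bandwidth links.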

Example of Tuning the Link Weights


Problem: congestion along the pink path
Second or third link on the path is overloaded

Solution: move some traffic to the bottom path


E.g., by decreasing the weight of the second link

Measure, Model, and Control


[Figure: measure, model, and control loop. Topology/configuration and offered traffic are measured from the operational network and feed a network-wide "what if" model; the model drives control changes to the network.]

Traffic Engineering Problem


Topology
Connectivity and capacity of routers and links

Traffic matrix
Offered load between points in the network

Link weights
Configurable parameters for routing protocol

Performance objective
Balanced load, low latency, service-level agreements

Question: Given the topology and traffic matrix, which link weights should be used?


Key Ingredients of the Approach


Instrumentation
Topology: monitoring of the routing protocols
Traffic matrix: fine-grained traffic measurement

Network-wide models
Representations of topology and traffic
What-if models of shortest-path routing

Network optimization
Efficient algorithms to find good configurations
Operational experience to identify key constraints


Formalizing the Optimization Problem


Input: graph $G(R, L)$
$R$ is the set of routers
$L$ is the set of unidirectional links
$c_l$ is the capacity of link $l$

Input: traffic matrix
$M_{i,j}$ is the load from router $i$ to router $j$

Output: setting of the link weights
$w_l$ is the weight on unidirectional link $l$
$P_{i,j,l}$ is the fraction of traffic from $i$ to $j$ traversing link $l$

Multiple Shortest Paths: Even Splitting


[Figure: traffic split evenly across multiple shortest paths; each link is labeled with its value of $P_{i,j,l}$ (1.0, 0.5, or 0.25)]
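One way to compute the per-link fractions under even splitting is sketched below. It assumes positive link weights and a made-up topology; the function names `dist_to` and `split_fractions` are illustrative, not taken from the slides.

```python
import heapq
from collections import defaultdict

def dist_to(dest, weights):
    """Dijkstra over reversed links: shortest distance from each node to dest."""
    incoming = defaultdict(list)
    for (u, v), w in weights.items():
        incoming[v].append((u, w))
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for u, w in incoming[v]:
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist

def split_fractions(src, dest, weights):
    """Return {link: fraction of src->dest traffic} under even splitting."""
    dist = dist_to(dest, weights)
    if src not in dist:
        return {}
    outgoing = defaultdict(list)
    for (u, v), w in weights.items():
        # Keep only links that lie on some shortest path toward dest
        if u in dist and v in dist and dist[u] == w + dist[v]:
            outgoing[u].append(v)
    node_share = defaultdict(float)
    node_share[src] = 1.0
    fractions = defaultdict(float)
    # Farthest nodes first, so a node's share is final before it is split
    for u in sorted(dist, key=dist.get, reverse=True):
        share = node_share[u]
        if share == 0 or u == dest:
            continue
        for v in outgoing[u]:
            fractions[(u, v)] += share / len(outgoing[u])
            node_share[v] += share / len(outgoing[u])
    return dict(fractions)

weights = {
    ("S", "A"): 1, ("S", "B"): 1, ("A", "C"): 1, ("A", "D"): 1,
    ("C", "T"): 1, ("D", "T"): 1, ("B", "T"): 2,
}
P = split_fractions("S", "T", weights)
print(P)
```

In this example S splits its traffic in half, and A splits its half again, giving the same kind of 0.5 and 0.25 labels as the figure.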


Defining the Objective Function


Computing the link utilization
Link load: $u_l = \sum_{i,j} M_{i,j} P_{i,j,l}$
Utilization: $u_l / c_l$

Objective functions
$\min \max_l (u_l / c_l)$
$\min \sum_l f(u_l / c_l)$
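Both objectives can be evaluated directly once the traffic matrix and the per-link fractions are known. The sketch below takes them as plain dictionaries. The slides do not define $f$, so the quadratic penalty used in the example is purely an assumption, as are the sample numbers.

```python
def link_loads(traffic, fractions):
    """Link load u_l: sum over (i, j) of M[i, j] * P[i, j, l]."""
    loads = {}
    for (i, j), demand in traffic.items():
        for link, frac in fractions.get((i, j), {}).items():
            loads[link] = loads.get(link, 0.0) + demand * frac
    return loads

def max_utilization(loads, capacity):
    """First objective: the largest u_l / c_l over all links."""
    return max(loads.get(l, 0.0) / c for l, c in capacity.items())

def total_penalty(loads, capacity, f):
    """Second objective: the sum over links of f(u_l / c_l)."""
    return sum(f(loads.get(l, 0.0) / c) for l, c in capacity.items())

# Hypothetical inputs: one demand split evenly over two paths
traffic = {("S", "T"): 8.0}
fractions = {("S", "T"): {("S", "A"): 0.5, ("A", "T"): 0.5, ("S", "T"): 0.5}}
capacity = {("S", "A"): 10.0, ("A", "T"): 5.0, ("S", "T"): 8.0}

loads = link_loads(traffic, fractions)
worst = max_utilization(loads, capacity)
penalty = total_penalty(loads, capacity, lambda x: x ** 2)  # assumed f
print(worst, penalty)
```

The max-utilization objective only sees the single worst link, while the summed penalty also rewards lowering the load on the other links.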

[Figure: plot of $f(x)$]

Complexity of the Optimization Problem


Computationally intractable problem
No efficient algorithm to find the link weights
Even for simple objective functions

What are the implications?


Must resort to searching through weight settings


Optimization Based on Local Search


Start with an initial setting of the link weights
E.g., same integer weight on every link
E.g., weights inversely proportional to capacity
E.g., existing weights in the operational network

Compute the objective function


Compute the all-pairs shortest paths to get $P_{i,j,l}$
Apply the traffic matrix $M_{i,j}$ to get link loads $u_l$
Evaluate the objective function from the $u_l / c_l$

Generate a new setting of the link weights

Repeat
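The loop above can be sketched as a simple hill climb. The slides do not specify how new settings are generated, so this version tries every single-weight change and keeps any that lowers the maximum utilization. The topology, the `max_weight` bound, and the function names are assumptions; weights must be positive and every demand is assumed routable.

```python
import heapq
from collections import defaultdict

def max_utilization(weights, capacity, traffic):
    """Route all demands on even-split shortest paths; return max u_l / c_l."""
    loads = defaultdict(float)
    incoming = defaultdict(list)
    for (u, v), w in weights.items():
        incoming[v].append((u, w))
    for dest in {j for (_, j) in traffic}:
        dist, heap = {dest: 0}, [(0, dest)]
        while heap:  # Dijkstra toward dest over reversed links
            d, v = heapq.heappop(heap)
            if d > dist[v]:
                continue
            for u, w in incoming[v]:
                if d + w < dist.get(u, float("inf")):
                    dist[u] = d + w
                    heapq.heappush(heap, (d + w, u))
        inflow = defaultdict(float)
        for (i, j), demand in traffic.items():
            if j == dest:
                inflow[i] += demand
        # Farthest nodes first, so each node's inflow is final before splitting
        for u in sorted(dist, key=dist.get, reverse=True):
            hops = [v for (a, v), w in weights.items()
                    if a == u and v in dist and dist[u] == w + dist[v]]
            for v in hops:
                loads[(u, v)] += inflow[u] / len(hops)
                inflow[v] += inflow[u] / len(hops)
    return max(loads[l] / c for l, c in capacity.items())

def local_search(weights, capacity, traffic, max_weight=5):
    """Change one weight at a time; keep any change that lowers the objective."""
    best = dict(weights)
    best_cost = max_utilization(best, capacity, traffic)
    improved = True
    while improved:
        improved = False
        for link in sorted(best):
            for w in range(1, max_weight + 1):
                candidate = {**best, link: w}
                cost = max_utilization(candidate, capacity, traffic)
                if cost < best_cost:
                    best, best_cost, improved = candidate, cost, True
    return best, best_cost

start = {("S", "T"): 1, ("S", "A"): 1, ("A", "T"): 1}
capacity = {("S", "T"): 10, ("S", "A"): 10, ("A", "T"): 10}
traffic = {("S", "T"): 12}
initial_cost = max_utilization(start, capacity, traffic)
best, best_cost = local_search(start, capacity, traffic)
print(initial_cost, best, best_cost)
```

With unit weights all 12 units take the direct link. Raising that one weight to 2 creates a tie, so the traffic splits evenly over both paths and the maximum utilization is halved.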

Making the Search Efficient


Avoid repeating the same weight setting
Keep track of past values of the weight setting
or keep a small signature of past values
Do not evaluate setting if signatures match

Avoid computing shortest paths from scratch


Explore settings that change just one weight
Apply fast incremental shortest-path algorithms

Limit number of unique link-weight values


Don't explore all $2^{16}$ possible values for each weight

Stop early, before exploring all settings
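The "small signature" idea can be illustrated with a hash of the weight setting. This is only a sketch: the choice of SHA-256, the 16-character truncation, and the names `signature` and `SeenSettings` are assumptions. A truncated hash can collide, in which case an unseen setting would be skipped.

```python
import hashlib

def signature(weights):
    """Short fingerprint of a weight setting, independent of dict order."""
    canonical = repr(sorted(weights.items())).encode()
    return hashlib.sha256(canonical).hexdigest()[:16]

class SeenSettings:
    """Remembers signatures of weight settings that were already evaluated."""

    def __init__(self):
        self._seen = set()

    def is_new(self, weights):
        sig = signature(weights)
        if sig in self._seen:
            return False  # signature matches a past setting: skip evaluation
        self._seen.add(sig)
        return True

seen = SeenSettings()
first = seen.is_new({("A", "B"): 1, ("B", "C"): 2})
repeat = seen.is_new({("B", "C"): 2, ("A", "B"): 1})  # same setting, new order
other = seen.is_new({("A", "B"): 1, ("B", "C"): 3})
print(first, repeat, other)
```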



Incorporating Operational Realities


Minimize number of changes to the network
Changing just 1 or 2 link weights is often enough

Tolerate failure of network equipment


Weights usually remain good after failure
or can be fixed by changing 1-2 weights

Limit effects of measurement accuracy


Good weights remain good, despite noise

Limit frequency of changes to the weights


Joint optimization for day & night traffic matrices

Application to AT&T's Backbone


Performance of the optimized weights
Search finds a good solution within a few minutes
Much better than link capacity or physical distance
Competitive with multi-commodity flow solution

How AT&T changes the link weights


Maintenance every night from midnight to 6am
Predict effects of removing link(s) from network
Reoptimize the link weights to avoid congestion
Configure new weights before disabling equipment


Example from AT&T's Operations Center


Amtrak repairing/moving part of train track
Need to move some of the fiber optic cables
Or, heightened risk of the cables being cut
Amtrak notifies AT&T of the time the work will be done

AT&T engineers model the effects


Determine which IP links go over affected fiber
Pretend the network no longer has these links
Evaluate the new shortest paths and traffic flow
Identify whether link loads will be too high
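The modeling steps above amount to a what-if computation: drop the affected links, recompute the shortest paths, and look for overloaded links. The sketch below is not AT&T's tooling. The fiber-to-link map, the topology, the utilization limit, and the function names are hypothetical, and weights are assumed positive.

```python
import heapq
from collections import defaultdict

def route(weights, traffic):
    """Even-split shortest-path routing. Returns {link: load}."""
    loads = defaultdict(float)
    incoming = defaultdict(list)
    for (u, v), w in weights.items():
        incoming[v].append((u, w))
    for dest in {j for (_, j) in traffic}:
        dist, heap = {dest: 0}, [(0, dest)]
        while heap:  # Dijkstra toward dest over reversed links
            d, v = heapq.heappop(heap)
            if d > dist[v]:
                continue
            for u, w in incoming[v]:
                if d + w < dist.get(u, float("inf")):
                    dist[u] = d + w
                    heapq.heappush(heap, (d + w, u))
        inflow = defaultdict(float)
        for (i, j), demand in traffic.items():
            if j == dest:
                inflow[i] += demand
        # Farthest nodes first, so each node's inflow is final before splitting
        for u in sorted(dist, key=dist.get, reverse=True):
            hops = [v for (a, v), w in weights.items()
                    if a == u and v in dist and dist[u] == w + dist[v]]
            for v in hops:
                loads[(u, v)] += inflow[u] / len(hops)
                inflow[v] += inflow[u] / len(hops)
    return dict(loads)

def what_if(weights, capacity, traffic, failed_links, limit=1.0):
    """Pretend failed_links are gone; return {link: utilization} above limit."""
    remaining = {l: w for l, w in weights.items() if l not in failed_links}
    loads = route(remaining, traffic)
    return {l: loads[l] / capacity[l] for l in loads
            if loads[l] / capacity[l] > limit}

weights = {("S", "T"): 1, ("S", "A"): 1, ("A", "T"): 1}
capacity = {("S", "T"): 10, ("S", "A"): 5, ("A", "T"): 5}
traffic = {("S", "T"): 8}
fiber_to_links = {"fiber-1": {("S", "T")}}  # hypothetical fiber-to-IP-link map

before = what_if(weights, capacity, traffic, set())
after = what_if(weights, capacity, traffic, fiber_to_links["fiber-1"])
print(before, after)
```

Nothing is overloaded before the cut. Once the direct link is removed, both links of the detour run at 160% of capacity, which signals that the weights need to be reoptimized before the maintenance starts.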

Example Continued
If load will be too high
Reoptimize the weights on the remaining links
Schedule time for new weights to be configured
Roll back to old weights when Amtrak is done

Same process applied to other cases


Assessing the network's risk to possible failures
Planning for maintenance of existing equipment
Adapting link weights to installation of new links
Adapting link weights in response to traffic shifts

What About Interdomain Routing?


Border Gateway Protocol
Announcements carry very limited information
E.g., AS path, but nothing about delay, loss, etc.

Challenging to make load-sensitive protocol


Hard to agree upon a common metric
Hard to scale to such a large network
Hard to prevent ASes from gaming the system

Instead, individual ASes act alone


Change routing policies based on link load
E.g., moving some traffic to another provider


Interdomain Traffic Engineering


Predict effects of changes to import policies
Inputs: routing, traffic, and configuration data
Outputs: flow of traffic through the network

[Figure: the topology, externally learned routes, and BGP policy configuration feed a BGP routing model; combined with the offered traffic, the model yields the flow of traffic through the network]

Outbound Traffic: Pick a BGP Route


Easier to control than inbound traffic
IP routing is destination based
Sender determines where the packets go

Control only by selecting the next hop


Border router can pick the next-hop AS
Cannot control selection of the entire path

[Figure: two candidate routes, one via Provider 1 with AS path (1, 3, 4) and one via Provider 2 with AS path (2, 7, 8, 4)]

Outbound Traffic: Shortest AS Path


No import policy on border router
Pick route with shortest AS path
Arbitrary tie break (e.g., smallest router-id)

Performance?
Shortest AS path is not necessarily best
Could have high delays or congestion

Load balancing?
Could lead to uneven split in traffic
E.g., one provider with shorter paths
E.g., too many ties with skewed tie-break
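The selection rule on this slide (shortest AS path, then smallest router ID) can be written as a one-line comparison. This deliberately ignores the other steps of the real BGP decision process. The AS paths (1, 3, 4) and (2, 7, 8, 4) come from the earlier provider example; the router IDs and the path (2, 9, 4) are made up.

```python
def pick_route(routes):
    """Shorter AS path wins; ties go to the smallest router ID."""
    return min(routes, key=lambda r: (len(r["as_path"]), r["router_id"]))

routes = [
    {"via": "Provider 1", "as_path": (1, 3, 4), "router_id": 20},
    {"via": "Provider 2", "as_path": (2, 7, 8, 4), "router_id": 10},
]
best = pick_route(routes)

# Equal-length paths: the arbitrary tie-break decides, regardless of load
tied = pick_route([
    {"via": "Provider 1", "as_path": (1, 3, 4), "router_id": 20},
    {"via": "Provider 2", "as_path": (2, 9, 4), "router_id": 10},
])
print(best["via"], tied["via"])
```

If one provider wins most ties, nearly all prefixes end up on its link, which is the skewed split the slide warns about.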

Outbound Traffic: Load Balancing


Selectively use each provider
Assign local-pref across destination prefixes
Change the local-pref assignments over time

Useful inputs to load balancing


End-to-end path performance data
E.g., active measurements along each path

Outbound traffic statistics per destination prefix


E.g., packet monitors or router-level support

Link capacity to each provider


Billing model of each provider

Balancing Load, Performance, and Cost


Balance traffic based on link capacity
Measure outbound traffic per prefix
Select provider per prefix for even load splitting
But, might lead to poor performance and high bill

Balance traffic based on performance


Select provider with best performance per prefix
But, might lead to congestion and a high bill

Balance traffic based on financial cost


Select provider per prefix over time to minimize the total financial cost
But, might lead to bad performance
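Capacity-based balancing can be sketched as a greedy assignment: take prefixes from largest to smallest and give each to the provider whose link would end up least utilized. The greedy rule, the traffic volumes, and the provider capacities are assumptions; in practice the resulting mapping would be expressed through local-pref settings.

```python
def assign_prefixes(prefix_traffic, provider_capacity):
    """Greedy per-prefix provider selection for even load splitting."""
    load = {p: 0.0 for p in provider_capacity}
    assignment = {}
    # Largest prefixes first; prefix name breaks ties for determinism
    ordered = sorted(prefix_traffic.items(), key=lambda kv: (-kv[1], kv[0]))
    for prefix, volume in ordered:
        # Choose the provider with the lowest utilization after adding this prefix
        best = min(load, key=lambda p: ((load[p] + volume) / provider_capacity[p], p))
        assignment[prefix] = best
        load[best] += volume
    return assignment, load

prefix_traffic = {  # measured outbound traffic per prefix (hypothetical, Mb/s)
    "10.0.0.0/8": 60,
    "192.0.2.0/24": 30,
    "198.51.100.0/24": 20,
    "203.0.113.0/24": 10,
}
provider_capacity = {"provider1": 100, "provider2": 50}
assignment, load = assign_prefixes(prefix_traffic, provider_capacity)
print(assignment, load)
```

Both provider links end up at 80% utilization here, but the sketch knows nothing about path performance or billing, which is the trade-off the slide points out.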

A Fundamental Problem
Everyone is acting alone
Internet is highly decentralized
Each AS is adapting its routes alone

Toward greater coordination


End hosts or edge routers pick the entire path?
Neighbor ASes cooperate to pick better paths?

A largely unsolved problem


The price of anarchy
Is there a better way?

Conclusions
Adapting routing to the traffic
To alleviate congestion
To minimize propagation delay
To be robust to future failures

Two main approaches


Load-sensitive routing protocol
Optimization of configurable parameters
