Академический Документы
Профессиональный Документы
Культура Документы
Outline
Network Processors
Adding Extensions
Scheduling Cycles
Observations
Emerging commodity components
can be used to build IP routers
switching fabrics, network processors,
CS 461
Router Architecture
Control Plane
(BGP, RSVP)
Data Plane
(IP)
Spring 2002
CS 461
Software-Based Router
PC
Control Plane
(BGP, RSVP)
+ Cost
+ Programmability
Performance (~300
Kpps)
Data Plane
(IP)
Spring 2002
Robustness
CS 461
Hardware-Based Router
PC
Control Plane
(BGP, RSVP)
Cost
Programmability
+ Performance (25+
Mpps)
Data Plane
(IP)
ASIC
Spring 2002
+ Robustness
CS 461
PC
NP-Based Router
Architecture
Control Plane
(packet flows)
Data Plane
(packet flows)
IXP1200
Spring 2002
+ Cost ($1500)
+ Programmability
? Performance
? Robustness
CS 461
In General...
Pentium
IXP
...
Pentium
Pentium
IXP
Spring 2002
CS 461
Architectural Overview
Packet Flows
. . . Network Services . . .
Virtual
Router
Forwarding Paths
. . . Hardware Configurations . . .
Spring 2002
CS 461
Switching Paths
Virtual Router
Classifiers
Schedulers
Forwarders
Spring 2002
CS 461
Simple Example
IP
Proxy
Active Protocol
Spring 2002
CS 461
10
Intel IXP
MAC Ports
Scratch
6 MicroEngines
DRAM
FIFOs
IX Bus
SRAM
PCI Bus
StrongARM
Spring 2002
IXP1200 Chip
CS 461
11
Processor Hierarchy
Pentium
StrongArm
MicroEngines
Spring 2002
CS 461
12
Input
FIFO Slots
Output
FIFO Slots
SRAM
(queues,
state)
Input
Contexts
Output
Contexts
64B
Spring 2002
CS 461
13
Spring 2002
CS 461
14
Pipeline Evaluation
Forwarding Rate (Mpps)
input
output
9
8
7
6
5
4
3
2
1
0
Measured
independently
100Mbps Ether
0.142Mpps
0
16
24
MicroEngine Contexts
Spring 2002
CS 461
15
What We Measured
Static context assignment
16 input / 8 output
Spring 2002
CS 461
16
O
Output FIFO
Lock synchronization
Max 3.47 Mpps
Contention lower bound 1.67 Mpps
Spring 2002
CS 461
17
O
Output FIFO
CS 461
18
O
Output FIFO
CS 461
19
Spring 2002
CS 461
20
Cycles to Waste
INPUT context loop
wait_for_data
copy in_fiforegs
Basic_IP_processing
nop
nop
nop
copy regsDRAM
if (last_fragment)
enqueueSRAM
Spring 2002
CS 461
21
0/0
40/4
80/8
160/16
320/32
640/64
Spring 2002
CS 461
22
Basic IP
32
TCP Splicer
45
ACK Monitor
15
Port Filter
26
Wavelet Dropper
28
Spring 2002
CS 461
23
Shared State
Smart Dropper
(data plane)
Spring 2002
CS 461
24
What we do
control IXP1200 Pentium DMA
control MicroEngines
We recommend against
running Linux
Spring 2002
CS 461
25
Performance
310Kpps with
1510 cycles/packet
Pentium
StrongArm
MicroEngines
Spring 2002
CS 461
3.47Mpps w/ no VRP
or
1.13Mpps w/ VRP buget
26
Pentium
Runs protocols in the control plane
e.g., BGP, OSPF, RSVP
Implementation
runs Scout OS + Linux IXP driver
CPU scheduler is key
Spring 2002
CS 461
27
Processes
.
.
.
.
.
.
P
P
P
Input Port
.
.
.
Spring 2002
P
P
Output Port
Pentium
.
.
.
CS 461
28
Performance
Aggregate Forwarding Rate
(Kpps)
350
I nterrupt
300
Polling, 1 Process
250
Polling, 2 Process
200
150
Polling, 3 Process
100
Polling, 3 Process,
w/ o Batching
50
0
0
Spring 2002
CS 461
29
Performance (cont)
300
Kpps
250
200
150
450MHz P-I I
Cisco 7200
100
50
0
Spring 2002
CS 461
30
Scheduling Mechanism
Proportional share forms the base
each process reserves a cycle rate
provides isolation between processes
unused capacity fairly distributed
Eligibility
a process receives its share only when its source
queue is not empty and sink queue is not full
Batching
to minimize context switch overhead
Spring 2002
CS 461
31
Share Assignment
QoS Flows
assume link rate is given, derive cycle rate
conservative rate to input process
keep batching level low
Spring 2002
CS 461
32
Experiment
A (BE)
B (QoS)
A+C
C (QoS)
Spring 2002
CS 461
33
Aggreate
200
Flow A (BE)
150
Flow B
(QoS,
90Kpps)
100
50
Flow C
(QoS,
90Kpps)
0
0
20
40
60
80
Spring 2002
CS 461
34
CPU vs Link
Fix A at 50Kpps, increase its processing
cost
250
Aggreate
200
Flow A (BE)
150
Flow B
(QoS,
90Kpps)
100
50
Flow C
(QoS,
90Kpps)
0
0
10
20
30
40
50
60
Spring 2002
CS 461
35
Aggreate
200
Flow A (BE)
150
Flow B
(QoS,
90Kpps)
100
50
Flow C
(QoS,
90Kpps)
0
0
10
20
30
40
50
60
CS 461
36
Aggreate
200
Flow A (BE)
150
Flow B
(QoS,
90Kpps)
100
50
Flow C
(QoS,
90Kpps)
0
0
10
20
30
40
50
60
CS 461
37
Batching Throttle
Scheduler Granularity: G
flow processes as many packets as possible w/in G
Batch Threshold: Bi
dont consider Flow i active until it has
accumulated at least Bi packets, where Csw / (Bi x
Ci) < T
Delay Threshold: Di
consider a flow that has waited D i active
Spring 2002
CS 461
38
Dynamic Control
Flow specifies delay requirement D
Measure context switch overhead
offline
Record average flow runtime
Set E based on workload
Calculate batch-level B for flow
Spring 2002
CS 461
39
Packet Trace
5500
5000
WFQ
4500
4000
T=10%
D=1000
3500
T=25%
D=1000
3000
T=10%
D=500
2500
2000
T=10%
D=300
1500
1000
500
0
0
10
Spring 2002
15
20
25
30
35
40
CS 461
45
50
55
40