
Bridging in the Data Center With or Without Spanning Tree

BRKDCT-1927

Overview
Transparent bridging in the data center
Spanning Tree Protocol
    How it works, how it fails
    Stability features
    Application to data center design

Virtual Port Channel (vPC)
    Overview, recommendations
    Data center design with vPC

Next Generation Bridging
    Layer 2 routing
    Intra-data center: L2MP/TRILL
    Inter-data center: OTV

BRKDCT-1044

2010 Cisco and/or its affiliates. All rights reserved.

Cisco Confidential

Transparent Bridging in the Data Center


Why Bridging in the Data Center?


Some protocols require it
IP uses it: subnet concept linked to Layer 2

[Figure: one subnet, 172.28.192.1/24, with hosts 172.28.192.2, .3, .4 and .5]


Extend a Subnet across Devices


For port density (not enough ports on device)
For redundancy (without routing protocol involvement)
For provisioning flexibility (add devices without changing L3 network configuration)
Virtual machine mobility

[Figure: the subnet 172.28.192.1/24 extended across several devices, with hosts 172.28.192.2 to .8]

Pure L3 vs. mixed L2/L3 solutions


Pure L3:
    Robust redundancy
    Multipathing
    Host connects to 2 subnets
    Addressing constraints

L3 with L2 (with STP):
    Stateless host
    Mobility/flexibility
    No multipathing
    Failure domain = bridging domain

[Figure labels: L3; L3 over L2 (with STP); NIC teaming]

Bridging is complementary to routing


Bridging provides flexibility
Bridging's main weaknesses are:
    Failure domain = bridging domain (not scalable)
    A tree is required in the data plane: no multipathing

Those limitations are caused by historic constraints in the data plane (not by STP, the spanning tree protocol)


Spanning Tree Protocol


STP Goals
Enforce a tree (at all times)
Spanning eventually
In a plug and play fashion
Notify learning function of topology changes


STP Information
Bridges exchange information using Bridge Protocol Data Units (BPDUs)
The content of a BPDU is equivalent to a long integer, built from these fields in order:
    Root Bridge ID
    Root Path Cost
    Sender Bridge ID
    Sender Port ID
Two different BPDUs can always be compared: the lower value is better

[Figure: on LAN A, port P1 of bridge B1 sends BPDU 1011 and port P1 of bridge B2 sends BPDU 2021]

The BPDU sent by B1 is better than the BPDU sent by B2
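
The comparison rule can be sketched in a few lines of Python. This is only an illustration: it assumes the slide's values 1011 and 2021 are read as four one-digit fields (root bridge ID, root path cost, sender bridge ID, sender port ID), and the `Bpdu` type and `better` helper are hypothetical names, not part of any STP implementation.

```python
from typing import NamedTuple


class Bpdu(NamedTuple):
    # Field order matters: tuples compare field by field, left to right,
    # which mirrors "the BPDU is equivalent to a long integer".
    root_bridge_id: int
    root_path_cost: int
    sender_bridge_id: int
    sender_port_id: int


def better(a: Bpdu, b: Bpdu) -> Bpdu:
    """Return the better of two BPDUs: the lower value wins."""
    return min(a, b)


# Assumed reading of the figure: "1011" sent by B1, "2021" sent by B2.
bpdu_b1 = Bpdu(1, 0, 1, 1)
bpdu_b2 = Bpdu(2, 0, 2, 1)

winner = better(bpdu_b1, bpdu_b2)
print("better BPDU:", winner)
```

Because the root bridge ID is compared first, B1's BPDU wins here before the cost or sender fields are even looked at.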



STP Terminology
Root bridge: the bridge sending the best information (unique in the network)
Designated port: the port sending the best information on a LAN (unique on a LAN)
Root port: the port receiving the best information (unique on a non-root bridge)
A port that is not root or designated is discarding: alternate or backup

[Figure: B1 and B2 are connected through LAN A and LAN B.
    B1 is the root bridge: best information in the network
    B1 P1 (BPDU 1011) is the designated port on LAN A: best information on LAN A
    B1 P2 (BPDU 1012) is the designated port on LAN B: best information on LAN B
    B2 P1 (BPDU 1121) is the root port: it receives the best BPDU on B2 and forwards (FW)
    B2 P2 (BPDU 1122) is an alternate port: not root or designated, it blocks (BLK)]

How Can STP Open a Loop?


Fundamental difference bridging vs. routing:
    Router: no control message means no forwarding. By default, a router drops traffic
    Bridge: no control message means no blocking. By default, a bridge floods traffic

A port that fails to receive BPDUs goes designated (forwarding)
Most STP failures are related to BPDUs being lost or not acted upon


Unidirectional Link Failure


BPDUs lost one way
    A link only transmits traffic in one direction
    BPDUs are dropped
    A unidirectional loop opens (clockwise here)

[Figure: B1 (ports P1, P2) and B2 (ports P1, P2) are connected through LAN A and LAN B. B1 P2 sends BPDU 1012 on LAN B, but it never reaches B2 P2, which would send 1122.]

P2 on B2 does not receive any BPDU: it thinks it is designated and opens a loop


Dispute Mechanism
Protects against unidirectional links
    There can only be one designated port on a LAN
    RSTP (Rapid Spanning Tree) and MST (Multiple Spanning Trees) advertise a role in their BPDUs
    A designated port with worse information is a problem

[Figure: same topology as the unidirectional link failure. On LAN B, B1 P2 sends designated BPDU 1012 and B2 P2 sends the worse designated BPDU 1122. No loop!]

P2 receives inconsistent BPDUs: disputed (blocked)
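
The dispute check can be sketched as follows. This is a deliberately simplified model, not the full RSTP state machine: BPDUs are plain tuples compared as on the "STP Information" slide, roles are strings, and `port_state_after_rx` is a hypothetical helper name.

```python
def port_state_after_rx(own_role: str, own_bpdu: tuple,
                        rx_role: str, rx_bpdu: tuple) -> str:
    """Return the role/state of a port after it receives one BPDU.

    A designated port that hears a *worse* BPDU which also claims the
    designated role has found a second designated port on the same LAN.
    That is inconsistent, so the port is disputed (blocked).
    """
    if (own_role == "designated"
            and rx_role == "designated"
            and rx_bpdu > own_bpdu):  # higher value = worse information
        return "disputed"
    return own_role


# Assumed reading of the figure: B1 P2 sends 1012, B2 P2 sends 1122.
b1_p2 = (1, 0, 1, 2)
b2_p2 = (1, 1, 2, 2)

state = port_state_after_rx("designated", b1_p2, "designated", b2_p2)
print(state)
```

A worse BPDU sent with the root role is normal (a root port sits downstream of the designated port), which is why the check looks at the advertised role and not only at the value.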



Brain Dead Bridge


BPDUs ignored
    B2 does not process BPDUs (CPU)
    B2 still forwards traffic (ASIC)
    Traffic loops in both directions

[Figure: B1 and B2 are connected through LAN A and LAN B. B1 P2 sends BPDU 1012 on LAN B, but the brain dead bridge B2 ignores BPDUs and does not relay them, and a loop forms.]


Bridge Assurance
Identify and configure network ports vs. edge ports
On p2p network ports:
    Send periodic BPDUs, regardless of role
    Expect periodic BPDUs, regardless of role
    If no BPDU is received, the port goes inconsistent (blocking)

[Figure: B1 p1 (designated, BPDU 1011) is connected to B2 p1 (root, BPDU 1121).
    The root network port sends periodic BPDUs
    The designated network port expects BPDUs
    A worse BPDU sent with the root role does not trigger a dispute
    Edge port: does not expect BPDUs]
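
The "expect periodic BPDUs" rule amounts to a per-port watchdog. The sketch below is an assumption-laden illustration: the `NetworkPort` class and the 6-second timeout are hypothetical and not taken from the slides or from any product documentation.

```python
class NetworkPort:
    """Toy model of a Bridge Assurance network port."""

    def __init__(self, timeout: float = 6.0):
        self.timeout = timeout      # hypothetical value, in seconds
        self.last_bpdu_at = None    # no BPDU seen yet

    def bpdu_received(self, now: float) -> None:
        self.last_bpdu_at = now

    def state(self, now: float) -> str:
        # Regardless of STP role: no recent BPDU from the peer means
        # the port is inconsistent and must block.
        if self.last_bpdu_at is None or now - self.last_bpdu_at > self.timeout:
            return "inconsistent"
        return "per-stp-role"


port = NetworkPort()
before_any_bpdu = port.state(now=0.0)
port.bpdu_received(now=0.0)
while_peer_alive = port.state(now=5.0)
after_peer_silent = port.state(now=7.0)
print(before_any_bpdu, while_peer_alive, after_peer_silent)
```

The port starts out inconsistent, which matches the idea that a network port with no peer does not transmit traffic.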

Bridge Assurance
The ultimate brain dead detection mechanism
Introduces a behavior closer to L3: a network port with no peer does not transmit traffic

[Figure: the brain dead bridge topology again. Ports that receive no BPDU from the brain dead bridge B2 become Bridge Assurance inconsistent, which cuts the loop.]


Data Center Design with Redundancy Handled by STP


[Figure: data center core, aggregation and access layers.
    Legend: N = network port, E = edge port, B = BPDUguard, R = Rootguard, L = Loopguard; a normal port type is also shown
    Aggregation: HSRP active and HSRP standby, STP root and backup root
    Core to aggregation: Layer 3
    Between the aggregation switches: Layer 2 (STP + Bridge Assurance), N ports
    Aggregation to access: Layer 2 (STP + BA + Rootguard), N and R on the aggregation downlinks, N and L on the access uplinks
    Access to hosts: Layer 2 (STP + BPDUguard), E and B on host ports]

Virtual Port Channel


A slight change in the data plane


Virtual Port Channel (vPC)


Introduces some changes to the data plane
Provides load balancing
Does not rely on STP for redundancy
Limited to a pair of switches

[Figure: left, redundancy handled by STP, with one blocked port; right, redundancy handled by vPC, with the switch pair forming a vPC domain]

vPC from the Perspective of STP


STP is still run independently on the two peers
Dual attached devices only see the primary peer

[Figure: physical topology: device C is dual attached to primary peer A and secondary peer B, which form a vPC domain. Logical STP view: C attached to the vPC domain.]

vPC from the Perspective of L3


HSRP is run independently on both peers
However, both the HSRP active and the HSRP standby peer forward traffic!

[Figure: the vPC domain (HSRP active and HSRP standby peers) connects to the L3 infrastructure. Those L3 links must not be bundled into a channel.]

vPC Operation
[Figure: a device is attached through one port channel to both vPC peers A and B, which form a vPC domain and are joined by the vPC peer link]

Traffic can use either side (depending on channel hashing)
No traffic on peer link (ideally, all devices are dual attached)
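
"Depending on channel hashing" means that each flow is pinned to one member link by a hash of its header fields. The sketch below illustrates the idea only: the CRC32 hash, the choice of source and destination MAC as hash inputs, and the link names are assumptions, not the algorithm of any particular switch.

```python
import zlib


def pick_member_link(src_mac: str, dst_mac: str, links: list) -> str:
    """Map a flow to one member link of a port channel.

    The same (src, dst) pair always lands on the same link, so frames
    of one flow stay in order; different flows spread across links.
    """
    key = f"{src_mac}>{dst_mac}".encode()
    return links[zlib.crc32(key) % len(links)]


# Hypothetical uplinks of a dual attached device: one to each vPC peer.
uplinks = ["to-peer-A", "to-peer-B"]
first = pick_member_link("00:00:00:00:00:0a", "00:00:00:00:00:0b", uplinks)
again = pick_member_link("00:00:00:00:00:0a", "00:00:00:00:00:0b", uplinks)
print(first, again)
```

Since the attached device makes this choice per flow, either vPC peer may receive the traffic, and neither needs the peer link to deliver it when all devices are dual attached.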

Link Failure or Single Attached Device


[Figure: vPC domain with peers A and B; device C has lost one link or is single attached]

Link failure or single attached device
Traffic going through peer link must not be flooded to dual attached devices
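
The flooding rule above can be sketched as a filter on egress ports. This is a simplified illustration under stated assumptions: the port names and the three port kinds ("peer-link", "dual", "single") are hypothetical, and a port counts as "dual" only while the attached device still has a working link to both peers.

```python
def flood_targets(ingress: str, ports: dict) -> list:
    """Return the ports a flooded frame is sent out of.

    ports maps a port name to its kind: "peer-link", "dual" (towards a
    dual attached device) or "single" (towards a single attached device).
    """
    came_over_peer_link = ports[ingress] == "peer-link"
    targets = []
    for name, kind in ports.items():
        if name == ingress:
            continue  # never send a frame back out of its ingress port
        if came_over_peer_link and kind == "dual":
            # The other peer already delivered the frame on its own link
            # to this device; sending it again would duplicate or loop it.
            continue
        targets.append(name)
    return targets


ports = {"po1": "peer-link", "po10": "dual", "eth1/5": "single"}
from_peer_link = flood_targets("po1", ports)
from_single_host = flood_targets("eth1/5", ports)
print(from_peer_link, from_single_host)
```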

vPC Peer Failure


[Figure: vPC domain with peers A and B and device C. One vPC peer has failed, so the vPC peer link is down.]

Single attached devices might be isolated

vPC Peer Link Failure


[Figure: (broken) vPC domain with peers A and B and device C. The vPC peer link is down.]

Possibility of a dual active scenario
The vPC domain cannot operate as a single switch
How do we differentiate this failure from the previous one?

vPC Peer Keepalive Link


[Figure: (broken) vPC domain. The primary peer and the secondary peer are still connected by the peer keepalive link while the vPC peer link is down.]

The vPC Peer Keepalive Link is a hello mechanism that tests the peer without using the Peer Link
The secondary peer blocks its ports when the Peer Link is down
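
The two failure cases (peer failure vs. peer link failure) can be told apart with a small decision function. This is a sketch of the reasoning on these slides, not of an actual implementation: `secondary_peer_action` and its return values are hypothetical names, and the "forward-alone" branch is an assumption drawn from the peer failure slide.

```python
def secondary_peer_action(peer_link_up: bool, keepalive_ok: bool) -> str:
    """Decide what the secondary vPC peer does with its ports."""
    if peer_link_up:
        # Normal operation: both peers forward.
        return "forward"
    if keepalive_ok:
        # Peer link down but the primary still answers keepalives:
        # only the peer link failed. Block to avoid dual active.
        return "block-ports"
    # Peer link down and no keepalive: the primary is assumed to have
    # failed, so the secondary keeps forwarding on its own (assumption).
    return "forward-alone"


for link_up, ka_ok in [(True, True), (False, True), (False, False)]:
    print(link_up, ka_ok, secondary_peer_action(link_up, ka_ok))
```

This only works if the keepalive path does not share fate with the peer link, since the decision depends on the two signals failing independently.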

vPC Design Recommendations


Always dual attach devices
The Peer Link should be a channel with at least two 10 Gbit/s interfaces in dedicated mode, spanning different line cards
The Peer Keepalive Link must not use the Peer Link; use a separate cable, the mgmt interface or the L3 infrastructure instead
Use LACP to form channels to the vPC pair


STP vs. vPC Solution


L2 with STP:
    Stateless host
    Mobility/flexibility
    No multipathing
    Failure domain = bridging domain

L2 with vPC:
    Stateless host
    Mobility/flexibility
    Multipathing
    Failure domain = bridging domain

[Figure labels: L3 above each design; NIC teaming; EtherChannels; the vPC pair as STP sees it]

vPC Data Center Example


[Figure: data center core, aggregation and access layers, with the aggregation pair forming a vPC domain.
    Legend: N = network port, E = edge port, B = BPDUguard, R = Rootguard, L = Loopguard, X = network or normal port (safety/availability tradeoff); a normal port type is also shown
    Aggregation: HSRP active and HSRP standby, STP root and backup root
    Core to aggregation: Layer 3
    Between the aggregation switches: Layer 2 (STP + Bridge Assurance), N ports
    Aggregation to access: Layer 2 (STP + BA + Rootguard), X and R on the aggregation downlinks, X on the access uplinks
    Access to hosts: Layer 2 (STP + BPDUguard), E and B on host ports]

Fixing STP Problems


By fixing the data plane


Extreme Hierarchical Network Example


4 billion hosts, 32 layers

[Figure: a 32-layer hierarchy. Caption at one device: "It might be acceptable to have 4 billion routes here"; at another: "But not here".]

Routers: 3 summary routes per device
Bridges: 4 billion host routes per device
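
The numbers on this slide are consistent with a binary tree, which the following arithmetic sketch assumes: each device has two children, so 32 layers reach 2^32 hosts; a router then needs one summary route per child plus one route upward, while a bridge needs one entry per host. The tree shape and the function names are assumptions made for illustration.

```python
LAYERS = 32
CHILDREN_PER_DEVICE = 2  # assumption: binary hierarchy


def hosts_reached(layers: int) -> int:
    """Number of hosts at the bottom of the hierarchy."""
    return CHILDREN_PER_DEVICE ** layers


def router_table_size() -> int:
    """One summary route per child, plus one route towards the parent."""
    return CHILDREN_PER_DEVICE + 1


def bridge_table_size(layers: int) -> int:
    """MAC addresses carry no location, so every host needs an entry."""
    return hosts_reached(layers)


print("hosts:", hosts_reached(LAYERS))
print("routes per router:", router_table_size())
print("entries per bridge:", bridge_table_size(LAYERS))
```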

No Routing Table Consequences


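Routing: notion of location associated with addresses
    Equal Cost Multipathing (ECMP), Reverse Path Forwarding Check (RPFC)

[Figure: routers R1 and R2 between hosts A and B, holding routes "To A" and "To B"]

Bridging: flooding requires a tree

[Figure: bridges B1 and B2 between hosts A and B]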


Mac-in-Mac (802.1ah) Model


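Introduced for Service Providers
    Create more services
    Solve MAC address table scalability issues

[Figure: user-space hosts A and B attach to Backbone Edge Bridges with backbone addresses X and Y; W and Z are other backbone addresses. The user frame A->B is carried across the backbone space, through a Backbone Bridge, inside a frame X->Y, then delivered as A->B. A Provider Bridge is also shown.]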


Mac-in-Mac Scalability
Backbone Edge Bridges (BEB) are able to:
    Map MAC addresses between user and backbone spaces
    Encapsulate/decapsulate frames

BEBs only need to learn a subset of the MAC addresses
Backbone Bridges are regular bridges
    They only see backbone space addresses
Now, let's assume that the backbone bridges are not bridges but new special devices
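
The two BEB functions, mapping and encapsulation, can be sketched as below. This is a toy model, not the 802.1ah frame format: the `Frame` and `BackboneEdgeBridge` types are hypothetical, and the mapping table is pre-populated by hand where a real BEB would learn it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    dst: str
    src: str
    payload: object


class BackboneEdgeBridge:
    def __init__(self, backbone_addr: str):
        self.addr = backbone_addr
        self.remote = {}  # user MAC -> backbone address of the BEB behind which it sits

    def encapsulate(self, user_frame: Frame) -> Frame:
        """Wrap a user frame, unchanged, in a backbone frame."""
        return Frame(dst=self.remote[user_frame.dst], src=self.addr,
                     payload=user_frame)

    def decapsulate(self, backbone_frame: Frame) -> Frame:
        """Unwrap a backbone frame and learn where its sender lives."""
        user_frame = backbone_frame.payload
        self.remote[user_frame.src] = backbone_frame.src
        return user_frame


beb_x = BackboneEdgeBridge("X")
beb_y = BackboneEdgeBridge("Y")
beb_x.remote["B"] = "Y"  # assumed already known, for illustration

outer = beb_x.encapsulate(Frame(dst="B", src="A", payload="data"))
inner = beb_y.decapsulate(outer)
print(outer.src, "->", outer.dst, "carrying", inner.src, "->", inner.dst)
```

Backbone bridges between X and Y would forward `outer` using only the addresses X and Y, which is why they never need to learn user MAC addresses.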


Application: Routing Backbone Frames


Backbone addresses are limited in number
    They can be propagated by a control protocol
    A routing table is possible in the backbone!

[Figure: user-space hosts A and B attach to next generation bridges X and Y; devices in the backbone space, such as W, hold routes "To X" and "To Y"]

ECMP, RPFC etc. now possible in the backbone



Adding a TTL
Frames are encapsulated unchanged in a new frame format in the backbone
    The encapsulation can carry a TTL
    A link state protocol allows determining the exact hop count

[Figure: host A sends the frame A->B. Edge X encapsulates it as X->Y with TTL 2; the TTL is 1 after the next backbone hop; edge Y decapsulates and delivers A->B.]
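
Why a TTL matters can be shown by letting a frame circulate in a forwarding loop. The function below is a hypothetical illustration: `circulate` counts hops around a looped ring of devices, with `ttl=None` standing in for classic bridging, where the frame header carries no hop count.

```python
def circulate(ring_size: int, ttl, max_steps: int = 1000) -> int:
    """Count the hops a frame makes around a looped ring before it dies.

    ttl=None models a frame without a TTL: it is only stopped by the
    artificial max_steps cap of this simulation.
    """
    hops = 0
    position = 0
    while hops < max_steps:
        if ttl is not None:
            if ttl == 0:
                break          # TTL exhausted: the frame is dropped
            ttl -= 1           # each forwarding device decrements the TTL
        position = (position + 1) % ring_size
        hops += 1
    return hops


with_ttl = circulate(ring_size=3, ttl=2)
without_ttl = circulate(ring_size=3, ttl=None)
print(with_ttl, without_ttl)
```

With a link state protocol providing the exact hop count, the initial TTL can be set tightly, so a frame caught in a transient loop is dropped after very few extra hops.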


Upcoming Technologies
By introducing a new data plane in the backbone, the advantages of Layer 3 can be added to Layer 2
The backbone addresses are not seen by L2 users; they represent a location, aggregating several devices

[Figure: PC A in user space attaches to backbone device X. Global address of PC A = X.A, where X is the backbone address (location) and A is the MAC address (ID).]

The plug and play aspect of L2 can be maintained


TRILL (Transparent Interconnection of Lots of Links) and Cisco L2MP (Layer 2 Multi Pathing)
Intra-Data Center Solutions


Intra Data Center Solutions


IETF TRILL: Transparent Interconnection of Lots of Links
Cisco L2MP: Layer 2 Multi Pathing
Goal: replace current transparent bridging model
    Add multipathing
    Introduce L3-like stability for bridging
    Add minimal overhead (backbone bridges identified with a compact ID, not a full MAC address)


TRILL/Cisco L2MP
Common use in the data center: backbone = DC L2 network (typically between access and aggregation)
    3 aggregation switches: no design restriction
    ECMP + channels for higher bandwidth

[Figure: core, aggregation and access layers; the access layer encapsulates]


TRILL
Specific details
    Can create adjacencies on shared links, at the price of a larger encapsulation

[Figure: core, aggregation and access layers, with a regular (non-TRILL) bridge in the topology]


Cisco L2MP
Specific details
    Assumes p2p connectivity to a neighbor supporting L2MP
    Compact header (for low latency)
    Emulated bridge

[Figure: core, aggregation and access layers joined by p2p links, with a vPC+ attachment]


L2MP: Conversational Learning


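B sends a frame to C, an unknown address

[Figure: host A attaches to edge X, host B attaches to edge Y. B sends B->C; Y floods it into the backbone, marked with source Y.]

Step 1: C is unknown to Y, flood
Step 2: X receives the frame but does not know C: the B to Y mapping is not learnt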


L2MP: Conversational Learning


B sends a frame to A, and A answers

[Figure: host A attaches to edge X, host B attaches to edge Y. X ends up with the entry B -> Y (step 2); Y ends up with the entry A -> X (step 4).]

Step 1: A is unknown to Y, flood
Step 2: X receives and knows A as a local address. B is learnt.
Step 3: X knows B, the frame is unicast to Y
Step 4: Y knows B, A is learnt
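
Both conversational learning slides can be replayed with a small simulation. This is a sketch of the rule as the slides describe it (learn a remote source only when the destination is a known local address); the `EdgeSwitch` class and its method names are hypothetical.

```python
class EdgeSwitch:
    def __init__(self, name: str, local_macs):
        self.name = name
        self.local = set(local_macs)  # hosts attached to this switch
        self.remote = {}              # remote MAC -> edge switch behind which it sits

    def send(self, src: str, dst: str):
        """A local host sends a frame: unicast if dst is known, else flood."""
        if dst in self.remote:
            return ("unicast", self.remote[dst])
        return ("flood", None)

    def receive(self, src: str, dst: str, from_switch: str) -> bool:
        """A frame arrives from the backbone; learn only real conversations."""
        if dst in self.local:
            self.remote[src] = from_switch
            return True
        return False


x = EdgeSwitch("X", {"A"})
y = EdgeSwitch("Y", {"B"})

# First slide: B sends to the unknown address C.
to_unknown = y.send("B", "C")              # Step 1: flood
x.receive("B", "C", from_switch="Y")       # Step 2: C is not local to X
x_after_unknown = dict(x.remote)

# Second slide: B sends to A, and A answers.
to_a = y.send("B", "A")                    # Step 1: flood
x.receive("B", "A", from_switch="Y")       # Step 2: A is local, B is learnt
answer = x.send("A", "B")                  # Step 3: unicast to Y
y.receive("A", "B", from_switch="X")       # Step 4: B is local, A is learnt

print(x_after_unknown, x.remote, answer, y.remote)
```

The effect is that an edge switch only spends table entries on remote hosts that its own hosts actually talk to, rather than on every source seen in flooded traffic.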


vPC vs. L2MP Solution


L2 with vPC:
    Stateless host
    Mobility/flexibility
    Multipathing
    Failure domain = bridging domain

L2MP:
    Stateless host
    Mobility/flexibility
    Multipathing
    Robust redundancy

[Figure labels: L3 above each design; EtherChannels; virtual bridge, as STP sees it]

Overlay Transport Virtualization (OTV)


Inter-Data Center Solution


Inter Data Center Solution: OTV


OTV: Overlay Transport Virtualization
Backbone: IP network. Backbone address = IP address

[Figure: three data centers (DC1, DC2, DC3), each an L2 domain with core, aggregation and access layers, joined by an overlay network across the provider network (IP). The aggregation layer encapsulates L2 over IP.]

OTV specific features (1)


Tailored for Data Center Interconnect
    STP kept local (distinct roots for different DCs)
    No unknown unicast flooding (data plane protection)
    MAC address knowledge exchanged via IS-IS

[Figure: DC1 (root1, edge device at IP X) and DC2 (root2, edge device at IP Y) exchange MAC addresses via IS-IS. A frame to the known macB is sent from IP X to IP Y; a frame to the unknown macZ is not flooded across.]

MAC table in DC1:
    macA    E1/1
    macB    IP Y

MAC table in DC2:
    macA    IP X
    macB    E1/2
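
The forwarding decision implied by these tables can be sketched as a lookup that never falls back to flooding across the overlay. The table layout mirrors the DC1 table in the figure; the function name `otv_edge_forward` and the returned action strings are hypothetical, not OTV terminology.

```python
def otv_edge_forward(mac_table: dict, dst_mac: str):
    """Decide what a DC edge device does with a unicast frame.

    mac_table maps a MAC address to ("port", name) for local hosts or
    ("overlay", ip) for hosts advertised by a remote data center.
    """
    entry = mac_table.get(dst_mac)
    if entry is None:
        # Unknown unicast: a classic bridge would flood; here nothing
        # is sent across the overlay.
        return ("not-sent-on-overlay", None)
    return entry


# DC1 table as read from the figure.
dc1_table = {
    "macA": ("port", "E1/1"),
    "macB": ("overlay", "IP Y"),
}

known_local = otv_edge_forward(dc1_table, "macA")
known_remote = otv_edge_forward(dc1_table, "macB")
unknown = otv_edge_forward(dc1_table, "macZ")
print(known_local, known_remote, unknown)
```

This is only possible because remote MAC addresses are advertised by a control protocol instead of being learnt from flooded data traffic.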


OTV specific features (2)


Tailored for Data Center Interconnect
    Proxy ARP
    HSRP localization

[Figure: DC1 (edge device at IP X, port E1/1) and DC2 (edge device at IP Y, port E1/2, host macB). An ARP for macB receives a reply. Both data centers show a default gateway labelled DG2, HSRPmac.]

Traffic to HSRPmac goes to the local DG

Conclusion
L2 is desirable for its flexibility (as a complement to L3)
Transparent bridging has some scalability issues
Several stability features have been developed in the control plane, but they will never be enough to match L3
The final solution will be injecting L3 elements in the L2 data plane
The Nexus family of switches provides the HW support for those technologies

