
Advanced Enterprise Campus Design: Resilient Campus Networks

BRKCRS 3032

BRKCRS-3032

2012 Cisco and/or its affiliates. All rights reserved.

Cisco Public

Presenter
Rahul Kachalia CCIE #11732 (R&S and SP)
Technical Marketing Engineer
System Development Unit (SDU)


Design Zone for Borderless Networks


www.cisco.com/go/designzone/borderless

Borderless Campus CVD


http://www.cisco.com/en/US/docs/solutions/Enterprise/Campus/Borderless_Campus_Network_1.0/Borderless_Campus_1.0_Design_Guide.pdf

What Are Your Uptime Requirements?


Global Enterprise Availability

Campus network design is evolving in response to multiple drivers


User Expectations: Always-on access to communications
Industry Requirements: Financial, Healthcare, 7x24x365 global access
Technology Requirements: Services, Applications, Communications, i.e. Unified Communications, Video Collaboration and Real-Time Communication

Requires a Structured and Resilient Design



How Does Downtime Affect Voice?


Availability requirements for UC are more than just five 9s
Also need to consider the subjective impact to real-time communications

Figure: seconds of data loss (0 to 50 sec scale) versus impact on voice
200 ms: no impact to voice
1 sec: minimal impact to voice
5-6 sec: user hangs up
Longer outage: phone resets*

* The time for a phone to reset is variable and depends on the signaling protocol (SCCP or SIP) and the state of the call (active, ringing, ...)
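To put "five 9s" next to the per-event numbers on this slide, the yearly downtime budget can be worked out directly. This is a minimal sketch; the function name and the 365-day year are assumptions made for illustration, not part of the original session.

```python
# Yearly downtime budget for an availability target expressed in "nines".
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, leap years ignored


def downtime_minutes_per_year(nines: int) -> float:
    """Return the allowed downtime per year for an availability of N nines."""
    unavailability = 10 ** (-nines)  # five 9s -> 0.00001
    return MINUTES_PER_YEAR * unavailability


for n in (3, 4, 5):
    print(f"{n} nines: {downtime_minutes_per_year(n):.3f} minutes/year")
```

Five 9s leaves roughly 5.26 minutes per year. Spent as 5-6 second outages, that budget covers only a few dozen events, each long enough for a user to hang up, which is why the subjective impact matters alongside the aggregate figure.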

How Does Downtime Affect Voice or Video?


Network SLAs vary for traditional video conferencing versus TelePresence
Availability requirements for high-definition TelePresence are more stringent than for UC

Metric | TelePresence Target | TelePresence Threshold 1 (Warning) | TelePresence Threshold 2 (Call Drop) | Traditional Video Conferencing
Latency | 150 ms | 200 ms | 400 ms | 400-450 ms
Jitter | 10 ms | 20 ms | 40 ms | 30-50 ms
Loss | 0.05% | 0.10% | 0.20% | 1%

BW: TelePresence 2.5 - 12.6 Mbps + overhead; traditional video conferencing 384 or 768 kbps + overhead
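The TelePresence thresholds above lend themselves to a simple check of measured values. The sketch below copies the three TelePresence columns from the slide. The function name, the metric keys and the choice to treat a value equal to a threshold as having reached it are assumptions for illustration.

```python
# TelePresence thresholds from the slide: (target, warning, call drop).
THRESHOLDS = {
    "latency_ms": (150, 200, 400),
    "jitter_ms": (10, 20, 40),
    "loss_pct": (0.05, 0.10, 0.20),
}


def classify(metric: str, value: float) -> str:
    """Map a measured value onto the slide's target/warning/call-drop bands."""
    target, warning, call_drop = THRESHOLDS[metric]
    if value >= call_drop:
        return "call drop"
    if value >= warning:
        return "warning"
    if value <= target:
        return "within target"
    return "above target"


print(classify("latency_ms", 120))
print(classify("jitter_ms", 25))
print(classify("loss_pct", 0.2))
```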

Design Strategies For Network Survivability

Resiliency Goal: Non-Disruptive Network and Service Availability

Resiliency Strategy and Resiliency Technologies
Network Level Resiliency: ECMP, EtherChannel, UDLD
System Level Resiliency: NSF/SSO, Power Redundancy
Operational Level Resiliency: ISSU, eFSU, GOLD, EEM

High Availability Campus Design


Agenda

Network Level Resiliency
High Availability Design Principles
Simplified and Redundant Campus Design
Campus Routing Best Practices

System Level Resiliency
Integrated Hardware and Software Resiliency
Stateful Switchover and Non-Stop Forwarding
Hitless Switching

Operational Level Resiliency
Single and Multi-Chassis ISSU Upgrade
Hitless NX-OS Software Upgrade

Simple Demand, Complex Design?

Advantage
Highly redundant network design
Operational simplicity
Single control-plane between layers
Redundant network systems and network paths at mission-critical network points
Redundant paths
Protects network availability during major network fault events
Single-chassis system redundancy
Cost-effective solution for small network designs

Disadvantage
Single point of failure
Becomes complex as the design scales
Any major network fault can cause a complete network outage
Increased control and management plane
May not be a very cost-effective design compared with dual systems
Redundant control-plane with redundant topology information

Structured and Modular Designs Work Best

Optimize the interaction of the physical redundancy with the network protocols
Provide the necessary amount of redundancy
Pick the right protocol for the requirement
Optimize the tuning of the protocol

The network looks like this so that we can map the protocols onto the physical topology
We want to build networks that look like this

Diagram callouts: Redundant Supervisor, Redundant Links, Redundant Switches, Layer 2 or Layer 3
Figure: modular campus topology with WAN, Data Center and Internet blocks


Access Layer Redundancy with Dual-Sup

Non-stop business communication with redundant supervisors
Distribute multiple uplinks from both supervisors for the following benefits:
Improve network resource utilization
Minimize control-plane disruption
Improve network recovery to sub-second
Maximize network-level protection

Protects switching capacity, network topology and forwarding information during supervisor switchover

Figure: Catalyst 4500E with redundant supervisors (Sup-1, Sup-2)

Access Layer Redundancy with Single-Sup

Flexible edge network and bandwidth expansion
Multiple built-in supervisor uplink ports for a high-speed distribution-access block
Plan inter-distribution link capacity to handle large data re-routing
Minimize network congestion with distributed high-speed uplink connections to the aggregation system

Figure: Catalyst 4500E access switches with 1G and 10G uplinks

Simplified, Scalable & Reliable Access Network with Cisco StackWise Plus

Physical Network
Network expansion as it grows
Consolidation of several 1G links to 10G
High-speed stack-ring for intra-access traffic

Control and Mgmt Plane
Single management
Centralized control-plane architecture
NSF capable

Network Design
Single point-to-point network
Distributed forwarding architecture
Reduces VLANs and subnets

Uplink Redundancy with Cisco StackWise Plus


Dual vs Quad Uplink Design Alternatives

Dual uplinks
Build uplinks with two stack-member switches
Protocol-driven network recovery with dual uplinks

Quad distributed uplinks
Increase uplink capacity
Hardware-driven network recovery with traditional distribution design
Prevents network topology change and improves network recovery

Figure: stack members SW1 through SW9 with uplinks to Dist-1 and Dist-2

We Will Be Talking About Solutions for Two Distribution Block Models


Traditional Distribution Block Design
Dual standalone systems
Distributed planes
Protocol-dependent fault detection and recovery

Evolution Network Design
Single virtual system
Unified control and management plane
Distributed forwarding plane
Deterministic network recovery

Figure: both models shown serving access VLANs 10, 20 and 30

Traditional Distribution Design


Redundant design with sub-optimal topology and complex operation
Stabilize the network topology with several Layer 2 features:
STP primary and backup root bridge
Rootguard
Loopguard or Bridge Assurance
STP edge protection

Protocol-restricted forwarding topology:
STP FWD/ALT/BLK ports
Single active FHRP gateway
Asymmetric forwarding
Unicast flooding

Protocol-dependent network recovery:
PVST/RPVST+
FHRP tuning

Diagram callouts: STP Root, HSRP Active, Rootguard, Loopguard or Bridge Assurance, Bridge Assurance, BPDU Guard or PortFast, Port Security

Even with Faster Convergence from RPVST+ We Still Have to Wait on FHRP Convergence
VRRP Config
interface Vlan4
 ip address 10.120.4.1 255.255.255.0
 ip helper-address 10.121.0.5
 no ip redirects
 vrrp 1 description Master VRRP
 vrrp 1 ip 10.120.4.1
 vrrp 1 timers advertise msec 250
 vrrp 1 preempt delay minimum 180

HSRP Config
interface Vlan4
 ip address 10.120.4.2 255.255.255.0
 standby 1 ip 10.120.4.1
 standby 1 timers msec 250 msec 750
 standby 1 priority 150
 standby 1 preempt
 standby 1 preempt delay minimum 180

GLBP Config
interface Vlan4
 ip address 10.120.4.2 255.255.255.0
 glbp 1 ip 10.120.4.1
 glbp 1 timers msec 250 msec 750
 glbp 1 priority 150
 glbp 1 preempt
 glbp 1 preempt delay minimum 180

GLBP offers load balancing within a VLAN
For voice, a sub-second hello timer enables < 1 sec traffic recovery upstream
Sub-second protocol timers must be avoided on SSO-capable networks

Diagram callouts: FHRP Active, FHRP Standby

PIM Needs Timer Tuning Too


Multicast recovery depends on PIM DR failure detection in a Layer 2 network
PIM routers exchange the PIM expiration time in query messages
Default query interval: 30 seconds
Expiration: query interval x 3
DR failure detection: ~90 seconds

Tune the PIM query interval to sub-second, as with FHRP, for faster multicast convergence
Sub-second protocol timers must be avoided on SSO-capable networks

interface Vlan4
 ip pim sparse-mode
 ip pim query-interval 250 msec

Diagram callout: PIM DR
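The DR failure-detection arithmetic on this slide (expiration = query interval x 3) can be sketched as follows. The function name is an assumption; the multiplier of 3 and both interval values come from the slide.

```python
def pim_dr_failure_detection(query_interval_s: float, multiplier: int = 3) -> float:
    """Time for a PIM neighbor to expire: query interval x multiplier."""
    return query_interval_s * multiplier


print(pim_dr_failure_detection(30))    # default 30 second query interval
print(pim_dr_failure_detection(0.25))  # tuned 250 msec query interval
```

With the default interval the DR is declared down after about 90 seconds; with the 250 msec interval from the configuration above, after 0.75 seconds.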

Sub-second Protocol Timers and NSF/SSO


NSF is intended to provide availability through route convergence avoidance
Fast IGP timers are intended to provide availability through fast route convergence
In an NSF environment the dead timer must be greater than:
SSO recovery + routing protocol restart + time to send first hello
Recommendation: keep protocol timers at their defaults

Figure: timeline on the NSF-capable switch (NSF restart, RP restart, first OSPF hello) compared with the NSF-aware neighbor's hello timer, showing neighbor loss versus graceful restart
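The dead-timer rule above is a single inequality, sketched here. The restart durations are made-up illustrative values, not measurements from the session. The 40 second figure is the usual OSPF default dead interval on broadcast networks.

```python
def dead_timer_is_safe(dead_timer_s: float,
                       sso_recovery_s: float,
                       protocol_restart_s: float,
                       first_hello_s: float) -> bool:
    """True when the neighbor's dead timer outlasts the whole NSF restart."""
    restart_window_s = sso_recovery_s + protocol_restart_s + first_hello_s
    return dead_timer_s > restart_window_s


# Hypothetical restart durations, for illustration only.
restart = dict(sso_recovery_s=2.0, protocol_restart_s=5.0, first_hello_s=1.0)

print(dead_timer_is_safe(40.0, **restart))  # default-style dead timer
print(dead_timer_is_safe(1.0, **restart))   # aggressive sub-second tuning
```

Under these assumed durations the default timer survives the switchover while the one second dead timer drops the adjacency before the first hello is sent, which defeats the purpose of NSF.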

Simplify STP Network Topology with VSS

Multiple parallel Layer 2 network paths build an STP loop network
VSS with MEC builds a single loop-free network that utilizes all available links
Distributed EtherChannel minimizes STP complexities compared to the standalone distribution design
The STP toolkit should be deployed to safeguard the multilayer network

Diagram callouts: STP Root, Rootguard, BPDU Guard or PortFast, Port Security, STP BLK Port, Loop-free L2 EtherChannel

Simplified, Scalable and Reliable L3 Gateway with VSS

Single logical Layer 3 gateway
Eliminates the need to implement FHRP protocols
Removes FHRP dependencies and increases Layer 3 network scalability
Hardware-based rapid fault detection and network recovery with default protocol timers
Deterministic sub-second network convergence in multiple fault conditions

Diagram callouts: Single IP Gateway, R1

EtherChannel Link Convergence


Optimal Fast Traffic Restoration
1. Link failure detection
2. Removal of the PortChannel entry in the software
3. Update of the hardware PortChannel indices
4. Notify the spanning tree and/or routing protocol processes of the path cost change

Figure: Catalyst switch with PortChannel 1 (G3/1, G3/2, G4/1, G4/2), annotated with steps 1 to 4 above

Layer 2 Forwarding Table
VLAN | MAC | Destination Index
10 | AA | PortChannel 1
11 | BB | G5/1

The load-balancing hash selects the destination port among G3/1, G3/2, G4/1 and G4/2

Multi-Chassis EtherChannel Performs Better In Any Network Design


Network recovery mechanics vary between distribution designs
Standalone: protocol and timer dependent
VSS: hardware dependent

VSS logical distribution system
Single P2P STP topology
Single Layer 3 gateway
Single PIM DR system

Distributed and synchronized forwarding tables: MAC address, ARP cache, IGMP
All links are fully utilized based on EtherChannel load balancing

Figure: convergence (sec, 0 to 0.8 scale) for upstream, downstream and multicast traffic, L2-FHRP versus L2-MEC

The Best Deployment for Standalone Is Routed Access


OSPF SPF Tuning
 timers throttle spf 10 100 5000
 timers throttle lsa all 10 100 5000
 timers lsa arrival 80

Simplified operation with a single control plane: routing protocols
Improved network design: no FHRP, STP, trunk, VTP, etc.
Optimized forwarding topology: Layer 3 ECMP
Improved convergence with fewer protocols

Diagram callouts (Layer 2 versus Layer 3 access): Bridge Assurance, STP Root, HSRP Active, Rootguard, Loopguard or Bridge Assurance, BPDU Guard or PortFast, Port Security, EIGRP/OSPF

VSS Simplifies Routed Access

Builds a single point-to-point routing peer adjacency with MEC
EtherChannel delivers deterministic network recovery
Minimizes the need to adjust protocol timers and parameters

Diagram callout: single EIGRP / OSPF adjacency

Routed Access Optimized Multicast Operation


Layer 2 access has two multicast routers on the access subnet, so one of them has to discard frames
Routed access has a single multicast router, which simplifies management of the multicast topology

Diagram callouts (Layer 2 access): IGMP Querier (low IP address), Designated Router (high IP address), non-DR has to drop all non-RPF traffic
Diagram callout (routed access): Designated Router and IGMP Querier

VSS Optimizes Multicast Performance with Routed Access


Single logical L3 path to the RP from the access layer to join the multicast distribution tree
Single OIL/IIL PIM interface in the multicast routing table
Increases multicast bandwidth capacity, with all MEC member links programmed for switching
Transparent to network faults and provides deterministic sub-second multicast data recovery

OIL = Outgoing Interface List
IIL = Incoming Interface List

Diagram callouts: Single OIL, Single PIM Join Message

Routed Access Provides Rapid Convergence with Optimized Traffic Flow and Ease of Mgmt
CEF and protocol-based network recovery in the standalone routed access design
EIGRP converges in <200 msec
OSPF with sub-second tuning converges in <200 msec
Multicast with sub-second tuning converges in ~600 msec

EtherChannel hash-based network recovery in the VSS routed access design
Deterministic sub-second unicast and multicast network convergence
EtherChannel does not require any further protocol tuning

Figure: convergence (sec, 0 to 0.7 scale) for upstream, downstream and multicast traffic across EIGRP-ECMP, EIGRP-MEC, OSPF-ECMP and OSPF-MEC

Diversify Links For Module Redundancy


Distribute multiple connections to a single or logical remote system across different linecard modules when possible
The recovery mechanism is the same as for a link failure
Prevents topology changes or forwarding updates and provides intra-chassis sub-second recovery
Depending on network load, it minimizes network congestion

Figure: inter-chassis recovery versus intra-chassis recovery

Best Practice for Module OIR


Module OIR is supported on all modular systems
Network recovery has a higher impact with module OIR due to:
OIR detection
Hardware synchronization
Protocol dependencies
Forwarding updates

Minimize network impact with the following techniques:
Admin power down
Admin reset

6500 Standalone
6500E(config)#no power enable module <slot-id>

6500 VSS
6500-VSS(config)#no power enable switch <1|2> module <slot-id>

Figure: convergence (sec, 0 to 2.5 scale) for upstream, downstream and multicast traffic with OIR, power down and soft reset


Core Layer Routing Design Strategy


Design the campus core with simplicity
Optimize routing topologies:
Hide topology: EtherChannel
Hide reachability: route summarization
Filter: stub, distribute-list, route-maps

High-performance, reliable network design
Increase application performance
Deterministic network recovery

VSS Enabled Campus Design


End-to-End VSS Design Option

Figure: end-to-end VSS campus topology with WAN, Data Center and Internet blocks

Scalable and Hitless Core Design Alternative with Nexus 7000


Standalone Redundant Core System
High-scale, high-performance system
Hitless forwarding design with a distributed forwarding architecture, de-coupling centralized control and management
Highly available: hitless forwarding, NSF/SSO, EC, ISSU, etc.

Figure: Nexus 7000 core connecting WAN, Data Center and Internet blocks

Deploy EtherChannel for a Simplified, Optimized and Reliable Core

6500-VSS
Single unified core system
Single point-to-point network per neighbor
Simplified, optimized and resilient unicast and multicast network design
Highly available: VSS, Quad-Sup, NSF/SSO, MEC, eFSU, etc.

Nexus 7000
Standalone redundant core system
Single point-to-point network per neighbor
EtherChannel ECMP to simplify, optimize and build a resilient network design
Highly available: hitless forwarding, NSF/SSO, EC, ISSU, etc.

EIGRP Is Unique with Multi-Level Summarization Capability


The greatest advantages of EIGRP are gained when the network has a structured addressing plan that allows for the use of summarization and stub routers
EIGRP provides the ability to implement multiple tiers of summarization and route filtering
Able to maintain a deterministic convergence time in a very large L3 topology

Figure: multi-level summarization. The upper tier advertises 10.10.0.0/16 and 2001:DB8:10::/48; the lower tier advertises 10.10.0.0/17 and 10.10.128.0/17, with 2001:DB8:10::/56 and 2001:DB8:10:128:/56
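The IPv4 prefixes in the figure illustrate why a structured addressing plan matters: the two /17 blocks are adjacent and collapse into the single /16 advertised one tier up. A minimal check using Python's standard `ipaddress` module (the variable names are illustrative):

```python
import ipaddress

# Lower-tier summaries taken from the slide.
children = [
    ipaddress.ip_network("10.10.0.0/17"),
    ipaddress.ip_network("10.10.128.0/17"),
]

# collapse_addresses merges adjacent or overlapping networks.
summary = list(ipaddress.collapse_addresses(children))
print(summary)

parent = ipaddress.ip_network("10.10.0.0/16")
print(all(child.subnet_of(parent) for child in children))
```

If the lower-tier blocks were not contiguous, `collapse_addresses` would return more than one network, and the upper tier could not advertise a single summary without also covering address space it does not own.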

EIGRP Convergence Is Improved with Summarization, Filtering and EtherChannel


EIGRP convergence is largely dependent on query paths and response times
Implement EtherChannel to reduce query paths
Minimize the number of queries and the query response time to speed up convergence
Summarize distribution block routes upstream to the core
Configure all L3 access switches as EIGRP stub routers

interface TenGigabitEthernet 4/1
 ip summary-address eigrp 100 10.120.0.0 255.255.0.0

router eigrp 100
 network 10.0.0.0
 eigrp stub connected
!

Diagram callouts: Query, Response

Avoid Default Route Black Hole


Know the source of the default route in the network
EIGRP advertises the default route if it exists in the routing table
Maintain network availability in the campus by advertising the following routes to EIGRP stub routers:
Summarized internal route
Default route

router eigrp 100
 network 10.0.0.0
 distribute-list EIGRP_STUB_Routes out <Port-Channel#>
!
ip access-list standard EIGRP_STUB_Routes
 permit 10.0.0.0
 permit 0.0.0.0
!

Figure: campus blocks 10.1.0.0/16, 10.2.0.0/16 and 10.3.0.0/16, WAN (10.4.0.0/16), Data Center (10.5.0.0/16) and Internet

OSPF Area Boundaries Offer Summarization for Improved Scale


Area boundaries provide buffers between fault domains
Keep area 0 for core infrastructure
Do not extend area 0 to the access routers when using routed access

Figure: Area 0 in the core (with WAN, Data Center and Internet blocks) and Areas 100, 110 and 120 in the distribution blocks

OSPF Downstream Summarization Is Accomplished with Multiple Area Types


ABR for a regular area forwards:
Summary LSAs (Type 3)
ASBR summary (Type 4)
Specific externals (Type 5)

Stub area ABR forwards:
Summary LSAs (Type 3)
Summary default (0.0.0.0 - ::/0)

A totally stubby area ABR forwards:
Summary default (0.0.0.0 - ::/0)

router ospf 100
 area 120 stub no-summary
 network 10.120.0.0 0.0.255.255 area 120
 network 10.122.0.0 0.0.255.255 area 0

Figure: OSPF Area 120
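The three area types differ only in which LSA kinds the ABR lets through, so the slide's lists can be expressed as a small filter. This is a sketch of the rules as stated on the slide. The function name and the string labels are assumptions.

```python
def lsas_in_area(area_type: str, backbone_lsa_types: list) -> list:
    """Return what an ABR passes into an area, following the slide's lists."""
    if area_type == "regular":
        allowed, inject_default = {3, 4, 5}, False
    elif area_type == "stub":
        allowed, inject_default = {3}, True
    elif area_type == "totally_stubby":
        allowed, inject_default = set(), True
    else:
        raise ValueError(f"unknown area type: {area_type}")

    passed = [f"Type {t}" for t in backbone_lsa_types if t in allowed]
    if inject_default:
        passed.append("summary default")
    return passed


for kind in ("regular", "stub", "totally_stubby"):
    print(kind, lsas_in_area(kind, [3, 4, 5]))
```

The `area 120 stub no-summary` line in the configuration above corresponds to the totally stubby case: access routers in area 120 see only the summary default.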

OSPF Upstream Summarization Helps Minimize LSA Churn in the Core


Summarize routes from the distribution block upstream into the core
Minimize the number of LSAs and routes in the core
Reduce the need for SPF calculations due to internal distribution block changes
ABRs originate the summaries 10.120.0.0/16 and 2001:DB8:10:120::/48

router ospf 100
 area 120 stub no-summary
 area 120 range 10.120.0.0 255.255.0.0 cost 10
 network 10.120.0.0 0.0.255.255 area 120
 network 10.122.0.0 0.0.255.255 area 0

interface Vlan120
 ip address 10.120.0.1 255.255.255.192
!
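Before relying on an `area range` summary, it is worth confirming that every subnet in the area actually falls inside it. The sketch below checks the Vlan120 interface from the slide against the 10.120.0.0/16 range. The second interface, placed in 10.122.0.0 (the area 0 network statement), uses a hypothetical host address and mask purely as a counter-example.

```python
import ipaddress

# area 120 range 10.120.0.0 255.255.0.0
area_range = ipaddress.ip_network("10.120.0.0/255.255.0.0")

# interface Vlan120: ip address 10.120.0.1 255.255.255.192
vlan120 = ipaddress.ip_interface("10.120.0.1/255.255.255.192")

# Hypothetical address in the area 0 block, used only as a counter-example.
core_link = ipaddress.ip_interface("10.122.0.1/255.255.255.0")

print(vlan120.network, vlan120.network.subnet_of(area_range))
print(core_link.network, core_link.network.subnet_of(area_range))
```

The Vlan120 subnet (a /26) is covered by the range, so changes to it stay hidden behind the summary; the 10.122.0.0 subnet is not covered and is advertised on its own.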

OSPF Cost Matters in EtherChannel Designs


Route metrics (bandwidth) are automatically adjusted on EtherChannel interfaces
Maximum bandwidth for cost computation differs between operating systems:
IOS: 10G (default) *
NX-OS: 40G (default) **

A single core-layer member-link failure in an OSPF EC/MEC design may:
Under-utilize network resources
Build an asymmetric forwarding topology
Increase network convergence time

Nexus 7000
N7K-Core(config-router)#auto-cost reference-bandwidth 10000

* Adjustable. Recommended to keep the default
** Recommended to adjust the OSPF auto-cost reference bandwidth to 10G on Nexus 7000

Figure: core topology toward summary net 10.100.0.0/16, comparing per-link OSPF costs with auto-cost reference bandwidths of 40G and 10G
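The effect of the reference bandwidth on a bundle that loses a member link can be seen with the usual auto-cost rule, cost = reference bandwidth / interface bandwidth, floored at 1. The two-member 10G port-channel below is an assumed example, not a topology taken from the slide.

```python
def ospf_cost(reference_bw_mbps: int, interface_bw_mbps: int) -> int:
    """Auto-cost: reference bandwidth divided by interface bandwidth, minimum 1."""
    return max(1, reference_bw_mbps // interface_bw_mbps)


for reference in (40_000, 10_000):          # 40G and 10G reference bandwidth
    healthy = ospf_cost(reference, 20_000)  # assumed 2 x 10G port-channel
    degraded = ospf_cost(reference, 10_000) # one member link lost
    print(f"reference {reference} Mbps: cost {healthy} -> {degraded}")
```

With a 40G reference the bundle's cost changes from 2 to 4 when a member fails, so OSPF recomputes paths and traffic may shift onto other links. With a 10G reference the cost stays at 1 and the failure is absorbed by the EtherChannel hash alone.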

Optimize EtherChannel Load Balancing

Load share egress data traffic based on an input hash
Optimal load sharing results with:
Bucket-based load sharing
Member links bundled in powers of 2 (2/4/8)
Multiple variations of input for the hash (L2 to L4)

Recommended algorithm *:
Access: Src/Dst IP
Dist/Core: Src/Dst IP + Src/Dst L4 ports

Per-layer settings shown in the diagram (three core platform variants):
Core: default src-dst-ip vlan, recommended src-dst-mixed-ip-port vlan
Core: default src-dst-ip, recommended src-dst-mixed-ip-port
Core: default src-dst-ip vlan, recommended src-dst ip-l4port-vlan
Dist: default src-dst-ip vlan, recommended src-dst-mixed-ip-port vlan
Access: default src-mac, recommended src-dst-ip

* May vary based on your network traffic pattern
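The power-of-2 recommendation follows from bucket-based load sharing: a fixed number of hash buckets is divided among the member links, and only some link counts divide it evenly. The sketch below assumes 8 buckets (a 3-bit hash result). The bucket count and the function name are illustrative assumptions.

```python
def buckets_per_link(num_links: int, num_buckets: int = 8) -> list:
    """Spread a fixed number of hash buckets as evenly as possible over links."""
    base, extra = divmod(num_buckets, num_links)
    return [base + 1 if i < extra else base for i in range(num_links)]


for links in (2, 3, 4, 6, 8):
    print(links, buckets_per_link(links))
```

With 2, 4 or 8 links every member owns the same number of buckets. With 3 links the split is 3/3/2 and with 6 links it is 2/2/1/1/1/1, so some members carry a larger share of flows regardless of the hash inputs chosen.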

Layer 3 Load Balancing Can Be Randomized with a Unique ID Associated with Switch
The Universal ID concept (also called Unique ID) is used to prevent CEF polarization
The Universal ID is generated at bootup (32-bit pseudo-random value seeded by the router's base IP address)
The Universal ID is used as an input to the ECMP hash and introduces variability of the hash result at each network layer
Universal ID is supported on Catalyst 6500 Sup-32 and Sup-720
Universal ID is supported on Catalyst 4500 SupII+10GE, SupV-10GE and Sup6E
Hash using Source IP (SIP), Destination IP (DIP) and Universal ID

Catalyst 4500 Load-Sharing Options
Original: Src IP + Dst IP
Universal*: Src IP + Dst IP + Unique ID
Include Port: Src IP + Dst IP + (Src or Dst Port) + Unique ID

Catalyst 6500 PFC3** Load-Sharing Options
Default*: Src IP + Dst IP + Unique ID
Full: Src IP + Dst IP + Src Port + Dst Port
Full Exclude Port: Src IP + Dst IP + (Src or Dst Port)
Simple: Src IP + Dst IP
Full Simple: Src IP + Dst IP + Src Port + Dst Port

* = default load-sharing mode
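CEF polarization, and how a per-switch ID avoids it, can be reproduced in a few lines. This is a toy model, not the hardware hash: SHA-256 stands in for the real algorithm, and the flow addresses and ID values are hypothetical. It models two tiers of switches with two equal-cost links each, where the second tier only receives the flows the first tier sent on link 0.

```python
import hashlib


def pick_link(src: str, dst: str, num_links: int, unique_id: int = 0) -> int:
    """Choose an equal-cost link from a hash of (src, dst, unique_id)."""
    key = f"{src}|{dst}|{unique_id}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_links


# Hypothetical flows: 200 sources talking to one destination.
flows = [(f"10.1.0.{host}", "10.2.0.10") for host in range(1, 201)]


def links_used_at_tier2(tier1_id: int, tier2_id: int) -> set:
    """Links the tier-2 switch uses for flows that tier 1 sent on link 0."""
    arriving = [f for f in flows if pick_link(f[0], f[1], 2, tier1_id) == 0]
    return {pick_link(f[0], f[1], 2, tier2_id) for f in arriving}


polarized = links_used_at_tier2(0, 0)             # identical hash input on both tiers
randomized = links_used_at_tier2(0x1111, 0x2222)  # distinct per-switch IDs

print("same hash on both tiers:", polarized)
print("unique ID per switch:   ", randomized)
```

When both tiers hash the same inputs, every flow arriving at tier 2 has already been selected for link 0, so tier 2 picks link 0 again and its second link sits idle. Mixing a different ID into each switch's hash decorrelates the two decisions, and both links are expected to carry traffic.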

Simple Network Design Delivers Deterministic Network Recovery


Routing-protocol-independent network convergence
ECMP Prefix-Independent Convergence (PIC) on 6500 (VSS/Standalone) from 12.2(33)SXI2
Hardware-based fault detection and recovery in MEC/EC designs

Figure: time for ECMP/MEC unicast recovery. Convergence (sec, 0 to 3.5 scale) versus number of unicast routes (500 to 25000) for ECMP without PIC, ECMP with PIC and MEC; core/distribution Sup720-10GE

VSS Core Simplifies Multicast Operation, Improves Performance and Redundancy

A standalone core needs Anycast MSDP peering for RP redundancy
A VSS-based core simplifies PIM RP redundancy with NSF/SSO/MMLS technologies
ECMP builds a single multicast forwarding path
MEC increases multicast forwarding capacity by utilizing all member links

Diagram callouts: AnyCast MSDP between core PIM RPs, Single Logical PIM RP, Multiple Multicast Forwarding Paths, Single OIL, Single Logical PIM Interface, Single Logical PIM Router, Single Logical OIL, PIM Join, PIM Router

Simplified Multicast Network Design Delivers Deterministic Network Recovery


ECMP multicast recovery is mroute-scale dependent and could range into seconds
MEC/EC multicast recovery is hardware-based and scale-independent, completing in sub-seconds

Figure: time for ECMP/MEC multicast recovery. Convergence (sec, 0 to 6 scale) versus number of multicast routes (100 to 5000) for ECMP and MEC/EC; core/distribution Sup720-10GE


Do I Still Need Dual Supervisors?

Redundant physical paths
Protects network availability
Converges in sub-second
May not maintain capacity and performance
Increases outage probability during major node failure

Redundant supervisor module
Protects network and services availability
Maintains capacity and performance
System remains in service during major supervisor failure
Hitless to insignificant data loss during switchover

Diagram callouts: Reduced Capacity, Self Recovery Fail, Single Point of Failure

Supervisor Redundancy Provides Stateful Switch Over


1:1 Supervisor Redundancy Architecture
The active supervisor owns the control plane and develops the central and distributed forwarding tables
Graceful system recovery by protecting hardware and software state machines
Architecture varies between modular systems

Stateful Synchronization
System variables
Configuration: running/startup
Layer 2/3 protocol state and topologies
Policies: ACLs, QoS, etc.
Linecard status

NSF Works with SSO to Keep Neighbors Forwarding During a Supervisor Switchover
Non-Stop Forwarding provides graceful restart enhancements to EIGRP, OSPF, IS-IS, BGP and LDP
An NSF-capable router continuously forwards packets during an SSO processor recovery
NSF-aware and NSF-capable routers provide for transparent routing protocol recovery
Graceful restart extensions enable neighbor recovery without resetting adjacencies
Routing database re-synchronization occurs in the background

Diagram callouts: NSF-Aware, NSF-Capable (two switches); NSF-Aware (one switch)

Cisco vs IETF OSPF NSF Capability


Both message flows run between an NSF-capable switch and an NSF-aware neighbor

Cisco OSPF NSF
1. Restart event
2. Fast Hello: Fast Hello (2 sec interval, RS bit set); Fast Hello (2 sec interval, RS bit clear)
3. Out-of-Band Sync: Database Description, LSA Requests/Update
4. Hello (RS bit clear)

IETF OSPF NSF
1. Restart event
2. Announce graceful restart: LS Update (Grace LSA); LS ACK (Grace LSA)
3. OSPF Discovery: Hello to 224.0.0.5
4. Database Exchange: Database Description, LSA Requests/Update
5. Hello

Recommendation: when peering with an IETF-capable device, use the IETF NSF capability by configuring the nsf ietf command under the routing process


StackWise+ Provides Stack-Ring Redundancy

Mixed Processing Architecture
Centralized (master): CDP, LACP, Layer 3 (ARP/routing/multicast) and management plane
Distributed (all stack members): MAC learning, STP, QoS, ACL, etc.

Distributed Forwarding Architecture
Single forwarding table: the master synchronizes the RIB/FIB with all stack-member switches
Local switching: within a port ASIC and between port ASICs through the local switch fabric

1:N master switch redundancy in the stack ring, with dynamic re-election after failure
Protects the distributed L2/L3 FIB
Gracefully restarts routing adjacencies

Diagram callouts: Master, Distributed FIB

Designating Master/Slave Switch in Stack-Ring


Any stack member can become master
Recommended to increase the switch priority for a deterministic role
Master switch failure detection, propagation and re-election could take 2-3 seconds
Network recovery mechanics differ between designs:
Master switch with uplink
Master switch without uplink (recommended)

! Increase master switch priority to 15 (highest)
switch 5 priority 15
! Increase slave switch priority to 14 (lower than master)
switch 6 priority 14

Diagram callouts: Master (Priority=15), Slave (Priority=14)
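Setting priorities 15 and 14 makes both the master and its successor predictable. The sketch below is a simplified model of the election, assuming the highest priority wins and the lowest MAC address breaks ties. The real election considers additional criteria, and the MAC addresses and the third stack member are hypothetical.

```python
def elect_master(members: list) -> int:
    """Pick a master from (switch_number, priority, mac) tuples.

    Simplified rule: highest priority first, then lowest MAC address.
    """
    winner = min(members, key=lambda m: (-m[1], m[2]))
    return winner[0]


# Switch numbers 5 and 6 and their priorities come from the slide.
# Switch 7 and all MAC addresses are hypothetical.
stack = [
    (5, 15, "00:1a:00:00:00:05"),
    (6, 14, "00:1a:00:00:00:06"),
    (7, 1, "00:1a:00:00:00:07"),
]

print(elect_master(stack))                             # initial master
print(elect_master([m for m in stack if m[0] != 5]))   # after switch 5 fails
```

Without explicit priorities every member would tie, and the outcome would depend on the tie-breaker rather than on the design intent of keeping the master role away from uplink-connected switches.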

Multilayer StackWise Plus Master Switch Recovery Analysis

Stack Design 1
Master switch with uplink
No slave switch (same priority)

Stack Design 2
Master switch with uplink
Slave switch (without uplink) set in the stack ring

Stack Design 3 (Recommended)
Master switch without uplink
Slave switch (without uplink) set in the stack ring

Figure: Catalyst 3750-X StackWise Plus master failure analysis. Convergence (sec, 0 to 2.5 scale) for upstream and downstream traffic across Design 1, Design 2 and Design 3

Graceful Routing with StackWise Plus

Routing adjacencies and the L3 FIB are preserved during Master failure.
Graceful routing capability is supported for EIGRP and OSPF.
Network recovery mechanics differ between designs:
Master Switch with Uplink
Master Switch without Uplink (Recommended)

router eigrp 100
 nsf
!
router ospf 100
 nsf

[Figure: EIGRP/OSPF NSF recovery with distributed FIB while the Master role changes]

Routed Access StackWise Plus Master Switch Recovery Analysis


Stack Design 1: Master Switch with Uplink; No Slave Switch (same priority)
Stack Design 2: Master Switch with Uplink; Slave Switch (w/o Uplink) set in stack-ring
Stack Design 3 (Recommended): Master Switch w/o Uplink; Slave Switch (w/o Uplink) set in stack-ring

[Chart: Catalyst 3750-X StackWise Plus Master Failure Analysis, EIGRP Routed Access. Y axis: Convergence (sec), 0 to 1.4. X axis: Design 1, Design 2, Design 3. Series: Upstream, Downstream]
[Chart: Catalyst 3750-X StackWise Plus Master Failure Analysis, OSPF Routed Access. Y axis: Convergence (sec), 0 to 1.4. X axis: Design 1, Design 2, Design 3. Series: Upstream, Downstream]

4500E SSO Architecture Protects Network Availability and Capacity


SSO Redundancy
1+1 Supervisor Redundancy Architecture

Centralized Processing Architecture
Active Supervisor maintains all three planes.
Real-time hardware and software state-machine synchronization from Active to Standby Supervisor.

Centralized Forwarding Engine
Switches data traffic between linecard modules.
Stub linecards: no local switching.

Decouples Control and Forwarding Plane
Protects network capacity during:
Soft/admin-forced switchover
IOS software upgrade

[Figure: Catalyst 4500E with Active and Standby Sup (Forwarding Engine FFE/VFE, Shared Memory Fabric, PPE/IPP) and five line cards]

MultiLayer 4500E Supervisor Switchover Analysis


Stateful Layer 2 Protocol Synchronization
STP, MAC Table, IGMP Snooping, PAgP, etc.

Protects Network Capacity
Maintains all uplinks, including those on the failed Sup.
All linecard modules remain operational.

Deterministic <100 msec Convergence
The forwarding engine decouples the control and forwarding planes.
Sup fabric connectivity remains operational even after failure.

[Chart: 4500E supervisor switchover, multilayer design. Y axis: Convergence (sec), 0 to 0.12. Series: Upstream, Downstream, Multicast]

Routed Access 4500E Supervisor Switchover Analysis


Stateful Layer 3 Protocol Synchronization
EIGRP, OSPF, ARP, etc.

PIM SSO capability is not supported.

Deterministic <100 msec Unicast Convergence
The forwarding engine decouples the control and forwarding planes.
Sup fabric connectivity remains operational even after failure.

[Chart: 4500E supervisor switchover, EIGRP/OSPF routed access. Y axis: Convergence (sec), 0 to 2.5. Series: Upstream, Downstream, Multicast]

6500-E VSS Architecture


[Figure: standalone Catalyst 6500-E (Active Sup and Standby Sup, each with SF, PFC and RP; intra-chassis SSO redundancy over the internal EOBC) next to a VSS pair, VSS-SW1 and VSS-SW2 (inter-chassis SSO redundancy over the external EOBC/VSL)]

Internal EOBC: internal control channel between the supervisor and linecards within a single chassis
External EOBC: external control channel between the supervisors of the two chassis
SF: Switch Fabric
PFC: Policy Feature Card
RP: Route Processor
EOBC: Ethernet Out-of-Band Channel

VSS Dual-Sup Inter-Chassis Redundancy

VSS Dual-Sup (one supervisor per virtual switch) supports inter-chassis SSO redundancy.
The single in-chassis supervisor takes either the SSO Active or the Standby role.
Stateful SSO synchronization and redundancy between virtual switches.

Single Sup System Design
A supervisor switchover requires a chassis reset, including all linecard and service modules.
Network capacity is reduced until the system returns to an operational state.

[Figure: Active/Standby role swap between the two chassis, with NSF recovery and reduced capacity]

VSS Quad-Sup Extends HA Capability


Sup720-10GE Quad-Sup Redundancy
Starting with 12.2(33)SXI4, Sup720-10GE VSS supports two supervisor redundancy modes:
Dual-Sup: one Sup per virtual switch
Quad-Sup: two Sups per virtual switch

Dual-Sup offers a single redundancy option
Inter-chassis only. Resetting the Active or Standby supervisor reboots all installed modules.
A Sup hardware failure may increase MTTR, reduce network capacity and service availability, and leave the network unreliable.

Quad-Sup offers two redundancy options
Inter-chassis: same design as Dual-Sup.
Intra-chassis: allows the virtual switch to return to service, reduces MTTR and stabilizes the network after a major fault.

[Figure: Dual-Sup self-recovery failure; the new Active supervisor becomes a single point of failure with NSF recovery and reduced capacity]

VSS Quad Sup Supports Dual HA Mode


Sup720-10GE Quad-Sup Redundancy
Inter-Chassis Sup Redundancy: SSO across the VSL between the ICA of SW1 (SSO Active) and the ICA of SW2 (SSO Standby)
Intra-Chassis Sup Redundancy: ICS in RPR-WARM mode in each chassis

Dual in-chassis supervisors, each in a different redundancy mode:
In-chassis Active Supervisor (ICA): SSO Active or Standby mode
In-chassis Standby Supervisor (ICS): RPR-WARM mode

Stateful SSO synchronization from the SSO Active to the Standby supervisor.
System configuration synchronization between the ICA and ICS supervisors.
The chassis resets when the ICA supervisor resets.

VSS Quad Sup RPR-WARM Design


Sup720-10GE Quad-Sup Redundancy
Provides system redundancy during a major ICA failure.
The RPR-WARM Sup operates in a hybrid mode:
ICS Supervisor: RPR cold state with extended capabilities
DFC Linecard: distributed linecard with all available 1G/10G uplink ports for network connectivity

The ICS synchronizes the following configuration from the ICA:
Startup-Configuration
VLAN Database
Boot Variable
VSS Virtual-Switch ID

6500#show switch virtual redundancy | inc Switch|Current Software
My Switch Id = 1
Peer Switch Id = 2
Switch 1 Slot 5 Processor Information :
Current Software state = ACTIVE
Switch 1 Slot 6 Processor Information :
Current Software state = RPR-Warm
Switch 2 Slot 5 Processor Information :
Current Software state = STANDBY HOT (switchover target)
Switch 2 Slot 6 Processor Information :
Current Software state = RPR-Warm

[Figure: SW1 with ICA SSO Active and ICS RPR-WARM; SW2 with ICA SSO Standby and ICS RPR-WARM; VSL between them]
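The `show switch virtual redundancy` output on this slide lends itself to a scripted check. The following Python sketch parses the filtered output into a per-supervisor state map and tests for the expected quad-sup pattern (one ACTIVE, one STANDBY HOT, two RPR-Warm). The sample text is copied from the slide; the function names and the pass/fail rule are assumptions made for illustration.

```python
import re

SAMPLE = """\
My Switch Id = 1
Peer Switch Id = 2
Switch 1 Slot 5 Processor Information :
Current Software state = ACTIVE
Switch 1 Slot 6 Processor Information :
Current Software state = RPR-Warm
Switch 2 Slot 5 Processor Information :
Current Software state = STANDBY HOT (switchover target)
Switch 2 Slot 6 Processor Information :
Current Software state = RPR-Warm
"""


def parse_redundancy(text):
    """Map (switch, slot) to the reported software state."""
    states = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"Switch (\d+) Slot (\d+) Processor Information", line)
        if header:
            current = (int(header.group(1)), int(header.group(2)))
            continue
        state = re.match(r"Current Software state = (.+)", line)
        if state and current is not None:
            states[current] = state.group(1).strip()
            current = None
    return states


def quad_sup_roles_ok(states):
    """Hypothetical rule: 1 ACTIVE, 1 STANDBY HOT and 2 RPR-Warm supervisors."""
    values = list(states.values())
    return (
        len(values) == 4
        and sum(v == "ACTIVE" for v in values) == 1
        and sum(v.startswith("STANDBY HOT") for v in values) == 1
        and sum(v == "RPR-Warm" for v in values) == 2
    )


states = parse_redundancy(SAMPLE)
for key in sorted(states):
    print(key, states[key])
print("quad-sup roles ok:", quad_sup_roles_ok(states))
```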

Graceful VSS Quad-Sup Deployment


Sup720-10GE Quad-Sup Redundancy

Software Upgrade
Upgrade the VSS supervisors (Active/Standby) to 12.2(33)SXI4 or later. Maintain network availability during the upgrade with enhanced Fast Software Upgrade (eFSU).

Deploy ICS
Install redundant (ICS) supervisors in each virtual-switch chassis. Boot the ICS supervisor with the same software version and license as the ICA.

Redesign VSL
Build full-mesh VSL physical paths between the quad supervisor modules. Bundle the new VSL connections into the VSL EtherChannel.

Failure to follow the recommended procedure may destabilize the VSS system and network operation.

Installing ICS Supervisor With Mismatch IOS Version


Sup720-10GE Quad-Sup Redundancy
Incompatible IOS software between the ICA and ICS supervisors may force the ICS to fall back to ROMMON mode.
An ICS with Quad-Sup software capability may be allowed to boot with a mismatched IOS version so that a common software version can be installed.
Disabling the IOS version mismatch check has no effect if the ICS boots without Quad-Sup capability (pre-12.2(33)SXI4).

[Figure: SW1 (ICA SSO Active) and SW2 (ICA SSO Standby), each with an ICS in RPR-WARM or ROMMON; software versions shown: 12.2(33)SXI4, 12.2(33)SXI5, 12.2(33)SXI3]

6500-VSS(config)#no switch virtual in-chassis standby bootup mismatch-check

ICS Supervisor IOS Upgrade Process


Sup720-10GE Quad-Sup Redundancy
Step 1 Disable the IOS software version mismatch check from global configuration mode:
6500-VSS(config)#no switch virtual in-chassis standby bootup mismatch-check

Step 2 Insert the ICS supervisor module in both chassis. Intra-chassis role negotiation allows the ICS to complete the bootup process in RPR-WARM mode.

Step 3 Copy the ICA-compatible IOS software version to both ICS supervisor modules:
6500-VSS#copy <image_src_path> sw1-slot6-disk0:<image>
6500-VSS#copy <image_src_path> sw2-slot6-disk0:<image>

Step 4 Re-enable the IOS software version mismatch check from global configuration mode. Leaving it disabled may cause the chassis to go into RPR mode at the next switchover.
6500-VSS(config)#switch virtual in-chassis standby bootup mismatch-check

Step 5 Force a reset of the ICS supervisor modules. On the next bootup, the ICS module boots with an ICA-compatible IOS software version:
6500-VSS#hw-module switch 1 ics reset
6500-VSS#hw-module switch 2 ics reset

Dual and Quad Sup SSO Analysis


Sup720-10GE Quad-Sup Redundancy
MEC-based network recovery mechanics with VSS in Dual-Sup or Quad-Sup design.
Deterministic sub-second network convergence for unicast and multicast data traffic.
Only an SSO Active failure triggers graceful protocol recovery.

[Chart: 6500-VSS Dual/Quad Sup NSF/SSO Analysis, Unicast Application. Y axis: Convergence (sec), 0 to 0.3. X axis: EIGRP - ECMP, EIGRP - MEC, OSPF - ECMP, OSPF - MEC. Series: Upstream, Downstream]
[Chart: 6500-VSS Dual/Quad Sup NSF/SSO Analysis, Multicast Application. Y axis: Convergence (sec), 0 to 140. X axis: ECMP, Active-IIL, Standby-IIL, MEC]

VSS Dual Sup VSL Design


Sup2T and Sup720-10GE Design
Two Cisco-recommended designs:
Profile 1: VSL on Supervisor (Sup2T/Sup720-10GE)
Profile 2: Diversified VSL between Supervisor (Sup2T/Sup720-10GE) and VSL-capable linecard

Profile 1
Cost-effective solution that leverages both uplinks.
Continue to use non-VSL-capable linecards for the 10G core connection.
Redundant fibers connect through a common fabric and ASICs, which could leave system stability vulnerable.
Optimal and preset VSL parameters: load balancing, QoS, HA, traffic engineering, dual-active, etc.
Restricted to bundling 2 x VSL ports, or 20G switching capacity, per virtual-switch node.

Profile 2
Redundant and diversified fibers between the supervisor and next-generation VSL-capable linecards.
Same design as Profile 1, but increases system reliability because the VSL ports are diversified across different fabrics/ASICs.
Optimal and preset VSL parameters: load balancing, QoS, HA, traffic engineering, dual-active, etc.
Flexible scaling up to 8 x VSL ports for high-density systems that aggregate uplinks, service modules, single-homed devices, etc.

VSS Quad Sup VSL Design


Sup720-10GE Quad-Sup VSL Redundancy
Same design as Profile 1 Dual-Sup
Flexible to increase VSL capacity.
Continue to leverage the existing non-VSL 10G linecard for the uplink connection.
Retains all original VSL benefits.
Vulnerable design during any supervisor self-recovery fault incident.

Recommended: Full-Mesh VSL on Quad-Sup
Highly redundant and cost-effective VSL design.
Increases overall VSL capacity.
Maintains 20G VSL capacity during supervisor failure.
Increases network reliability by minimizing the probability of dual-active.

[Figure: two quad-sup VSL layouts between SW1 and SW2 (Sup-1 through Sup-4): Profile 1 style and full-mesh]

VSS Dual-Active Detection Redundancy


Dual-Sup or Quad-Sup VSL Redundancy
Failure of all VSL links forces both virtual switches to transition to the ACTIVE role, a condition known as Dual-Active.
A Dual-Active condition confuses neighbor devices and destabilizes the network.
Two detection and recovery mechanisms:
Direct: Dual-Active Fast-Hello or BFD
Indirect: Enhanced PAgP (ePAgP)
Use both the ePAgP and Fast-Hello mechanisms for redundancy.
The BFD detection mechanism is deprecated starting with 15.0(SY1).

[Figure: VSS pair with a Fast-Hello link and ePAgP over Layer 3 and Layer 2 port-channels to Catalyst 2K/3K/4K neighbors]

!Enable Enhanced PAgP on trusted L2/L3 Port-Channel interface
6500-VSS(config-vs-domain)#dual-active detection pagp trust channel-group 101
!
!Enable dual-active fast-hello on directly connected interface (copper/fiber)
6500-VSS(config)#interface range Gi1/1/1 , Gi2/1/1
6500-VSS(config-if)#dual-active fast-hello

Dual-Active Recovery Analysis


Dual-Sup or Quad-Sup VSL Redundancy
Dual-Active network recovery depends on:
Uplink network design: ECMP vs MEC
Routing protocol: EIGRP vs OSPF
Detection mechanism: Fast-Hello vs ePAgP

OSPF ECMP is faster in failure detection than ePAgP. Slow network convergence.
Starting with 12.2(33)SXI3, Dual-Active Fast-Hello performs rapid failure detection and delivers deterministic recovery independent of network design and protocol.

[Chart: 6500E VSS Dual-Active Recovery Analysis, ePAgP. Y axis: Convergence (sec), 0 to 35. X axis: EIGRP - ECMP, EIGRP - MEC, OSPF - ECMP, OSPF - MEC. Series: Upstream, Downstream]
[Chart: 6500E VSS Dual-Active Recovery Analysis, Fast-Hello. Y axis: Convergence (sec), 0 to 0.5. X axis: EIGRP - ECMP, EIGRP - MEC, OSPF - ECMP, OSPF - MEC. Series: Upstream, Downstream]

High Availability Campus Design


Agenda
Network Level Resiliency: High Availability Design Principles; Simplified and Redundant Campus Design; Campus Routing Best Practices
System Level Resiliency: Integrated Hardware and Software Resiliency; Stateful Switchover and Non-Stop Forwarding; Hitless Switching
Operational Level Resiliency: Single and Multi-Chassis ISSU Upgrade; Hitless NX-OS Software Upgrade

Nexus 7000 Distributed Architecture


[Figure: Nexus 7018 with crossbar fabric modules (fabric ASICs, 46 Gbps/slot each), Active and Standby supervisors holding URIB and MRIB with SSO synchronization, and I/O modules with distributed IP FIB/MFIB and local switching]

URIB: Unicast Routing Info Base
MRIB: Multicast Routing Info Base
FIB: Forwarding Info Base
MFIB: Multicast Forwarding Info Base

Hitless Supervisor Redundancy with Nexus 7000


1+1 Supervisor Redundancy architecture
Decouples the centralized control plane from the distributed forwarding plane
Redundant central arbiter
Hitless Supervisor Switchover with:
Distributed I/O Module
Crossbar Fabric Module

[Figure: Active and Standby supervisors (CPU, CA, CMP) during NSF recovery]

Fabric Module Capacity and Redundancy


Each crossbar fabric module provides 46 Gbps per slot:
2 x 23G channels per I/O module slot
1 x 23G channel per supervisor slot

Per-slot bandwidth by number of fabric modules (Nexus 7018, 8x10GE I/O module):
1 fabric module: 46 Gbps (insufficient capacity)
2 fabric modules: 92 Gbps (required for 80G/slot)
3 fabric modules: 138 Gbps (N+1 redundancy)
4 fabric modules: 184 Gbps (N+1 redundancy and future proof)
5 fabric modules: 230 Gbps
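The per-slot figures on this slide follow from simple arithmetic: each fabric module contributes 46 Gbps to a slot. A minimal Python sketch of that calculation, assuming bandwidth scales linearly with the number of working modules; the function names are hypothetical.

```python
import math

FABRIC_GBPS_PER_SLOT = 46  # per fabric module, as stated on the slide


def per_slot_bandwidth(fabric_modules):
    """Per-slot fabric bandwidth in Gbps for a given number of modules."""
    return fabric_modules * FABRIC_GBPS_PER_SLOT


def modules_required(slot_demand_gbps, spare=0):
    """Smallest module count that covers the demand, plus optional spares."""
    return math.ceil(slot_demand_gbps / FABRIC_GBPS_PER_SLOT) + spare


def congested_after_failure(installed, failed, slot_demand_gbps):
    """True if the surviving modules cannot carry the per-slot demand."""
    return per_slot_bandwidth(installed - failed) < slot_demand_gbps


for n in range(1, 6):
    print(n, "module(s):", per_slot_bandwidth(n), "Gbps per slot")
print("needed for 80G/slot:", modules_required(80))
print("needed for 80G/slot with N+1:", modules_required(80, spare=1))
```

For an 80G/slot I/O module, two modules cover the demand and a third gives N+1, so losing one of three modules leaves 92 Gbps per slot and no congestion under this model.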

Nexus 7000 Crossbar Failure May Cause Fabric Congestion


A crossbar fabric module failure reduces internal switching capacity and may cause congestion.
The supervisor and I/O modules remain operational.
No network topology change is triggered.

[Figure: forwarding capacity turns from symmetric (80G/slot) to asymmetric (80G/slot and 46G/slot) with no topology change]

%XBAR-2-XBAR_INSUFFICIENT_XBAR_BANDWIDTH: Module in slot 1 has insufficient xbar-bandwidth.

Hitless Fabric Switching with Nexus 7000


Hitless Fabric Switchover
1 Right and left ejectors are opened
2 Software is signaled to start graceful data re-routing
3 Hitless data re-routing
4 Fabric interface shutdown
5 Crossbar fabric module power down

High Availability Campus Design


Agenda
Network Level Resiliency: High Availability Design Principles; Simplified and Redundant Campus Design; Campus Routing Best Practices
System Level Resiliency: Integrated Hardware and Software Resiliency; Stateful Switchover and Non-Stop Forwarding; Hitless Switching
Operational Level Resiliency: Single and Multi-Chassis ISSU Upgrade; Hitless NX-OS Software Upgrade

In Service Software Upgrade Allows Upgrade Without Taking Switch Down


In a redundant topology, standard maintenance practice is to shut down devices during an upgrade and let the network converge.
ISSU provides the ability to upgrade software in place without shutting the device down.
Offers significant uptime improvements.

[Figure: scheduled maintenance runs at half capacity; with ISSU, all paths and switches stay active during the upgrade]

ISSU Graceful IOS Software Upgrade Cycle


issu loadversion: Standby Sup reboots with the new software version (Active: OLD, Standby: NEW)
issu runversion: SSO switchover; the new software becomes effective (Active: NEW, Standby: OLD)
issu acceptversion: Acknowledge successful new software activation (optional)
issu commitversion: Commit and reboot the Standby with the new software (Active: NEW, Standby: NEW)
issu abortversion: Return to the original version (Active: OLD, Standby: OLD)
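The upgrade cycle on this slide can be read as a small state machine. The Python sketch below models which software version each supervisor runs after every `issu` step. It is an illustration under simplifying assumptions, not the IOS implementation: the class, the step names and the rule that `abortversion` is only accepted before `commitversion` are assumptions, and rollback timers are not modeled.

```python
class IssuError(Exception):
    """Raised when a step is issued out of order."""


class IssuCycle:
    def __init__(self):
        self.active = "OLD"
        self.standby = "OLD"
        self.step = "idle"
        self.accepted = False

    def _require(self, *allowed):
        if self.step not in allowed:
            raise IssuError(f"not allowed in step {self.step!r}")

    def loadversion(self):
        # Standby Sup reboots with the new software version.
        self._require("idle")
        self.standby = "NEW"
        self.step = "loaded"

    def runversion(self):
        # SSO switchover: the Sup running NEW becomes Active.
        self._require("loaded")
        self.active, self.standby = self.standby, self.active
        self.step = "running"

    def acceptversion(self):
        # Optional acknowledgement of the new software.
        self._require("running")
        self.accepted = True

    def commitversion(self):
        # Standby reboots with the new software; both Sups now run NEW.
        self._require("running")
        self.standby = "NEW"
        self.step = "committed"

    def abortversion(self):
        # Return both Sups to the original version.
        self._require("loaded", "running")
        self.active = self.standby = "OLD"
        self.accepted = False
        self.step = "idle"


cycle = IssuCycle()
cycle.loadversion()
cycle.runversion()
cycle.acceptversion()
cycle.commitversion()
print(cycle.active, cycle.standby, cycle.step)
```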

ISSU Software Upgrade Prep List


Save the system configuration to local storage and to a remote server (TFTP/FTP).
Copy the new software (same version/license) to local storage on both the Active and Standby Supervisor, and change the boot parameters to the new software version.
Verify that NSF capability is enabled under the routing process.
Avoid the following major system changes until the software upgrade process completes:
Adding or removing hardware modules
Modifying the software configuration
Modifying boot registers

Simplified Catalyst 4500E ISSU Upgrade Process


Manual Upgrade
Steps shown: loadversion (LV), runversion (RV), acceptversion (AV), commitversion (CV)
Supported on all Supervisor Modules
Attentive four-step manual software upgrade process
Opportunity to verify and upgrade new software

Automatic Upgrade
Steps shown: issu changeversion (ChV), runversion (RV), commitversion (CV)
Supported Supervisor Modules:
Sup7E: starting 3.1.0SG
Sup6E/Sup6L-E: starting 15.0.2SG
Single-CLI and automated software upgrade process
Opportunity to schedule the upgrade to new software

Recommendation: Use both methods for a safe and graceful software roll-out in large deployments.

Catalyst 4500E Network Recovery with ISSU


Protects network capacity during the entire software upgrade process.
Real-time software upgrade with NSF/SSO capability.
Completes the entire software upgrade process with <50 msec loss in the Multilayer design.

[Chart: 4500E Network Recovery with ISSU Software Upgrade, Multilayer Design. Y axis: Convergence (sec), 0 to 0.02. X axis: issu loadversion, issu runversion, issu commitversion. Series: Upstream, Downstream, Multicast]
[Chart: 4500E Network Recovery with ISSU Software Upgrade, Routed Access Design. Y axis: Convergence (sec), 0 to 2.5. X axis: issu loadversion, issu runversion, issu commitversion. Series: Upstream, Downstream, Multicast]

VSS Inter-Chassis Software Upgrade Process


Dual-Sup eFSU Upgrade Process
Starting with 12.2(33)SXI, 6500 VSS supports enhanced Fast Software Upgrade (eFSU).

1 ISSU LoadVersion
Triggers the Standby chassis to reset with the new software version.

2 ISSU RunVersion
Forces an SSO switchover and makes the new software version operational. The new Active starts graceful protocol recovery. The Active switch starts the ISSU roll-back timer after the Standby becomes operational.

3 ISSU AcceptVersion
Stops the roll-back timer.

4 ISSU CommitVersion
Triggers the Standby chassis to reset with the new software version.

[Figure: SW1 and SW2 joined by the VSL; Active and Standby roles swap at step 2]

VSS Quad Sup Software Upgrade Process


Sup720-10GE Quad-Sup eFSU Upgrade Process

1 ISSU LoadVersion
Triggers the ICA and ICS supervisor modules in the Standby chassis to reset with the new software version.

2 ISSU RunVersion
Forces an SSO switchover and makes the new software version operational. The new Active starts graceful protocol recovery. The Active switch starts the ISSU roll-back timer after the Standby becomes operational.

3 ISSU AcceptVersion
Stops the roll-back timer.

4 ISSU CommitVersion
Triggers the ICA and ICS supervisor modules in the Standby chassis to reset with the new software version.

[Figure: SW1 and SW2 joined by the VSL; Active and Standby roles swap at step 2]

Catalyst 6500E VSS Network Recovery with eFSU

Network availability is maintained with MEC.
Network capacity is reduced until the Standby chassis becomes operational.
MEC-based recovery mechanics allow the complete software upgrade process with ~1-second traffic loss.

[Chart: 6500E VSS Dual/Quad Sup Network Recovery with eFSU Software Upgrade. Y axis: Convergence (sec), 0 to 0.25. X axis: issu loadversion, issu runversion, issu commitversion. Series: Upstream, Downstream, Multicast]

High Availability Campus Design


Agenda
Network Level Resiliency: High Availability Design Principles; Simplified and Redundant Campus Design; Campus Routing Best Practices
System Level Resiliency: Integrated Hardware and Software Resiliency; Stateful Switchover and Non-Stop Forwarding; Hitless Switching
Operational Level Resiliency: Single and Multi-Chassis ISSU Upgrade; Hitless NX-OS Software Upgrade

Nexus 7000 NX-OS ISSU Benefits


Hitless ISSU Upgrade
Simplified: a single CLI command upgrades (system/kickstart) several distributed hardware components.
Automated: fully automates the upgrade process in serial order.
Reliable: runs a compatibility test of the new software against the current hardware inventory and generates an impact report before starting the upgrade.
Hitless: graceful and non-disruptive procedure; leverages the distributed forwarding architecture to upgrade the entire system with zero packet loss.

[Figure: components upgraded on each supervisor (System, Kickstart, CMP, CMP-BIOS) and on each I/O module (I/O, BIOS)]

Nexus 7000 NX-OS ISSU Upgrade Prep List

Save the system configuration to local storage and to a remote server (TFTP/FTP).
Copy the new software to local storage on both the Active and Standby Supervisor.
Run the new software compatibility test and generate a detailed upgrade analysis report:
show install all impact system bootflash:/<system-image-name> kickstart bootflash:/<kickstart-image-name>

Avoid the following major system changes until the software upgrade process completes:
Adding or removing hardware modules
Modifying the software configuration
Modifying boot registers

Nexus 7000 Hitless NX-OS Upgrade Process


1 install all: starts the compatibility test and generates an impact report. Upon user action, proceeds with or terminates the ISSU upgrade process.
2 Updates the boot variable and resets the Standby supervisor to reboot with the new NX-OS software.
3 The Active supervisor resets and performs a hitless SSO switchover, then reboots with the new NX-OS software. This step makes the new NX-OS take effect.
4 Non-disruptive I/O module upgrade starts in serial order. The CPU rolls over with the new NX-OS software in effect. The module remains operational during the upgrade.
5 Upgrades the CMP processor and BIOS on the Active and Standby Supervisor.

N7K#install all system bootflash:///<system-image-name> kickstart bootflash:///<kickstart-image-name>
!
Compatibility check is done:
Module  bootable  Impact          Install-type  Reason
1       yes       non-disruptive  rolling
2       yes       non-disruptive  rolling
5       yes       non-disruptive  reset
6       yes       non-disruptive  reset

[CLI output: per-module image table with columns Module, Image, Running-Version(pri:alt), New-Version, Upg-Required; images listed include lc1n7k, bios, system, kickstart, cmp and cmp-bios, with NX-OS images moving from 5.0(5) to 5.1(1a)]

Do you want to continue with the installation (y/n)? [n] Y

Summary
Simplify and optimize your campus network design with system and network consolidation to maintain application performance even during common network faults.
Leverage hardware-based fault detection for scale-independent and deterministic network recovery.
Build a non-stop communication network with system-level redundancy in all campus layers: Access, Distribution and Core.
Design a mission-critical campus backbone that offers scale flexibility, key foundational services and uncompromised high availability.
Reduce the maintenance window and upgrade systems while maintaining network availability.

