
#CLUS

ACI Multi-Site
Architecture and
Deployment
Max Ardica, Principal Engineer
@maxardica
BRKACI-2125

Cisco Webex Teams
Questions?
Use Cisco Webex Teams to chat
with the speaker after the session

How
1 Find this session in the Cisco Live Mobile App
2 Click “Join the Discussion”
3 Install Webex Teams or go directly to the team space
4 Enter messages/questions in the team space

Webex Teams will be moderated by the speaker until June 16, 2019.
cs.co/ciscolivebot#BRKACI-2125

#CLUS © 2019 Cisco and/or its affiliates. All rights reserved. Cisco Public 3
Session Objectives

At the end of the session, the participants should be able to:


• Articulate the different deployment options to interconnect
Cisco ACI networks (Multi-Pod and Multi-Site) and when to
choose one vs. the other
• Understand the functionalities and specific design
considerations associated to the ACI Multi-Site architecture
Initial assumption:
• The audience already has a good knowledge of ACI main
concepts (Tenant, BD, EPG, L2Out, L3Out, etc.)

Agenda
• ACI Network and Policy Domain Evolution
• ACI Multi-Pod Quick Review
• ACI Multi-Site Deep Dive
  • Overview and Use Cases
  • Introducing the ACI Multi-Site Orchestrator (MSO)
  • Inter-Site Connectivity Deployment Considerations
  • Control and Data Plane
  • Unicast Communication
  • Multicast Routing
  • Connecting to the External Layer 3 Domain
  • Network Services Integration
  • Virtual Machine Manager (VMM) Integration
  • Multi-Pod and Multi-Site Integration
  • Migration Scenarios
• Conclusions and Q&A


ACI Network and
Policy Domain
Evolution
Cisco ACI: Industry Leader

• 5,800+ ACI Customers
• 50+% ACI Attach Rate
• 65+ Ecosystem Partners

Introducing Application Centric Infrastructure (ACI)
[Figure: application policy model with Outside (Tenant VRF), Web, App and DB groups, each with a QoS setting, linked by Filter / Service / Filter policies]
• APIC: Application Policy Infrastructure Controller
• ACI Fabric: Integrated GBP VXLAN Overlay

ACI Anywhere
Fabric and Policy Domain Evolution

ACI Single Pod Fabric
• ACI 1.0 - Leaf/Spine Single Pod Fabric

ACI Multi-Pod Fabric
• ACI 2.0 - Multiple Networks (Pods) in a single Availability Zone (Fabric)
• Pod ‘A’ … Pod ‘n’ connected through the IPN with MP-BGP EVPN, one APIC Cluster

ACI Multi-Site
• ACI 3.0 - Multiple Availability Zones (Fabrics) in a Single Region ‘and’ Multi-Region Policy Management
• Fabric ‘A’ … Fabric ‘n’ connected through the ISN with MP-BGP EVPN

ACI Remote Leaf
• ACI 3.1/4.0 - Remote Leaf and vPod extend an Availability Zone (Fabric) to remote locations

ACI Multi-Cloud
• ACI 4.1 - ACI Extensions to Multi-Cloud

Remote-Leaf session: BRKACI-2387
Virtual Pod (vPod session): BRKACI-2882
Integration with Public Cloud session: BRKACI-2690
Multi-Pod or Multi-Site?

That is the question…

And the answer is…

BOTH!

Regions and Availability Zones
OpenStack and AWS Definitions
OpenStack
• Regions - Each Region has its own full OpenStack deployment, including its own API endpoints, networks and compute resources
• Availability Zones - Inside a Region, compute nodes can be logically grouped into Availability Zones. When launching a new VM instance, we can specify an AZ or even a specific node in an AZ to run the VM instance

Amazon Web Services
• Regions - Separate large geographical areas, each composed of multiple, isolated locations known as Availability Zones
• Availability Zones - Distinct locations within a region that are engineered to be isolated from failures in other Availability Zones and provide inexpensive, low latency network connectivity to other Availability Zones in the same region
Terminology

• Pod – A Leaf/Spine network sharing a common control plane (ISIS, BGP, COOP, …)
  • Pod == Network Fault Domain
• Fabric – Scope of an APIC Cluster, it can be one or more Pods
  • Fabric == Availability Zone (AZ) or Tenant Change Domain
• Multi-Pod – Single APIC Cluster with multiple leaf spine networks
  • Multi-Pod == Multiple Networks within a Single Availability Zone (Fabric)
• Multi-Fabric – Multiple APIC Clusters + associated Pods (you can have Multi-Pod with Multi-Fabric)*
  • Multi-Fabric == Multi-Site == a DC infrastructure region with multiple AZs

* Available from ACI release 3.2


Typical Requirement
Creation of Two Independent Fabrics/AZs

Fabric ‘A’ (AZ 1)

Fabric ‘B’ (AZ 2)

Application
workloads deployed
across availability
zones
Typical Requirement
Creation of Two Independent Fabrics/AZs

• Multi-Pod Fabric ‘A’ (AZ 1): ‘Classic’ Active/Active across Pod ‘1.A’ and Pod ‘2.A’
• Multi-Pod Fabric ‘B’ (AZ 2): ‘Classic’ Active/Active across Pod ‘1.B’ and Pod ‘2.B’
• ACI Multi-Site interconnects Fabric ‘A’ and Fabric ‘B’

Application workloads deployed across availability zones
ACI Multi-Pod
Quick Review
ACI Multi-Pod
Overview
For more information on ACI Multi-Pod: BRKACI-2003

[Figure: Pod ‘A’ … Pod ‘n’ connected through the Inter-Pod Network (up to 50 msec RTT) with a VXLAN data plane and MP-BGP EVPN between Pods; IS-IS, COOP and MP-BGP run inside each Pod; a single APIC Cluster; all Pods belong to one Availability Zone]

• Multiple ACI Pods connected by an IP Inter-Pod L3 network, each Pod consists of leaf and spine nodes
• Up to 50 msec RTT supported between Pods
• Managed by a single APIC Cluster
• Single Management and Policy Domain
• Forwarding control plane (IS-IS, COOP) fault isolation
• Data Plane VXLAN encapsulation between Pods
• End-to-end policy enforcement
Single AZ with Maintenance and Configuration Zones
Scoping ‘Network Device’ Changes

Maintenance Zones – Groups of switches managed as an “upgrade” group

[Figure: ACI Multi-Pod Fabric with the Inter-Pod Network and one APIC Cluster, split into Configuration Zone ‘A’ and Configuration Zone ‘B’]

• Configuration Zones can span any required set of switches. The simplest approach may be to map a configuration zone to an availability zone. They apply to infrastructure configuration and policy only

Single AZ with Tenant Isolation
Isolation for ‘Virtual Network Zone and Application’ Changes

Inter-Pod
Network

ACI Multi-Pod
Fabric

APIC Cluster

Tenant ‘Prod’ Configuration/Change Domain Tenant ‘Dev’ Configuration/Change Domain

• The ACI ‘Tenant’ construct provides a domain of application and associated virtual network policy change
• Domain of operational change for an application (e.g. production vs. test)

ACI Multi-Pod
Most Common Use Cases

• Need to scale up a single ACI fabric above the 200 leaf nodes supported in a single Pod
• Handling 3-tier physical cabling layouts (for example traditional N7K/N5K/N2K deployments)
• True Active/Active DC deployments
  • Single VMM domain across DCs (stretched ESXi Metro Cluster, vSphere HA/FT, DRS initiated workload mobility, …)
  • Deployment of Active/Standby or Active/Active clustered network services (FWs, SLBs) across DCs
  • Application clustering (L2 BUM extension across Pods)

[Figure: a Pod with leaf and spine nodes attached to the Inter-Pod Network; Pod 1 and Pod 2 under one APIC Cluster hosting DB and Web/App workloads]

ACI Multi-Pod
Supported Topologies
[Figure: four topology diagrams; links are labeled 10G*/40G/100G and 1G/10G/40G/100G]

• Intra-DC: Pod 1 … Pod n in the same site, single APIC Cluster
• Two DC sites directly connected: Pod 1 and Pod 2 over dark fiber/DWDM (up to 50** msec RTT), single APIC Cluster
• 3 (or more) DC sites directly connected: Pod 1, Pod 2 and Pod 3 over dark fiber/DWDM (up to 50 msec RTT)
• Multiple DC sites interconnected by a generic L3 network (up to 50 msec RTT)

** 50 msec support added in SW release 2.3(1)
ACI Multi-Site
Deep Dive
Overview and Use Cases
ACI Multi-Site
Overview

[Figure: Site 1 (Availability Zone ‘A’) and Site 2 (Availability Zone ‘B’) in Region 1, connected through the Inter-Site Network with MP-BGP EVPN and a VXLAN data plane; the Multi-Site Orchestrator exposes a GUI and a REST API]

• Separate ACI Fabrics with independent APIC clusters
• No latency limitation between Fabrics
• ACI Multi-Site Orchestrator pushes cross-fabric configuration to multiple APIC clusters providing scoping of all configuration changes
• MP-BGP EVPN control plane between sites
• Data Plane VXLAN encapsulation across sites
• End-to-end policy definition and enforcement
ACI Multi-Site
Most Common Use Cases

• Scale-up model to build a very large intra-DC network (above 400 leaf nodes)
• Data Centre Interconnect (DCI)
  • Extend connectivity and policy between ‘loosely coupled’ DC sites
  • Disaster Recovery and IP mobility use cases
• ACI Multi-Cloud
  • Integration between on-prem and public clouds

ACI Multi-Site
Software and Hardware Requirements

• ACI Multi-Site introduced from release 3.0(1)
• Supports all ACI leaf switches (1st Generation, -EX and -FX)
• Only -EX spines (or newer) can connect to the Inter-Site Network (ISN)
  • New 9364C/9332C non-modular spines (64/32 40G/100G ports) also supported
  • Can have only a subset of spines connecting to the IP network
• 1st generation spines (including 9336PQ) not supported
  • Can still leverage those for intra-site leaf to leaf communication

[Figure: a site mixing 1st Gen and -EX spines; only the -EX spines connect to the ISN]

ACI Multi-Site
Network and Identity Extended between Fabrics

• Network information carried across Fabrics (Availability Zones): VTEP IP, VNID
• Identity information carried across Fabrics (Availability Zones): Class-ID
• Packet format: VTEP IP | VNID | Class-ID | Tenant Packet
• No multicast requirement in the backbone: Head-End Replication (HER) is used for any Layer 2 BUM traffic

[Figure: fabrics connected through the Inter-Site Network with MP-BGP EVPN; Multi-Site Orchestrator]
ACI Multi-Site
Namespace Normalisation

Translation of Class-ID, VNID (scoping of name spaces)

Spine Translation Table
            Rem. Site    Local Site
VNID        16678781     16547722
Class-ID    49153        32770

[Figure: EP1 in Site 1 sends traffic to EP2 in Site 2. Inside Site 1 and across the Inter-Site Network the packet carries VNID 16678781 and Class-ID 49153; in Site 2 the spine rewrites them to VNID 16547722 and Class-ID 32770. Leaf to leaf, the VTEP and Class-ID are local to each Fabric. Packet format: VNID | Class-ID | Tenant Packet]

• Maintain separate name spaces with ID translation performed on the spine nodes
• Requires specific HW on the spine to support this functionality
• Multi-Site Orchestrator instructs local APIC to program translation tables on spines
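
The translation step can be sketched in Python. This is only an illustration, assuming the spine table behaves like a plain lookup keyed by the remote-site value. The names Ids, VNID_MAP, CLASS_ID_MAP and translate are hypothetical, and raising LookupError on a missing entry is a modeling choice of this sketch, not documented spine behavior. The numeric values are the ones shown in the Spine Translation Table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ids:
    """VNID and Class-ID carried with a packet (hypothetical container)."""
    vnid: int
    class_id: int


# Remote-site value -> local-site value, as in the Spine Translation Table.
VNID_MAP = {16678781: 16547722}
CLASS_ID_MAP = {49153: 32770}


def translate(remote: Ids) -> Ids:
    """Rewrite remote-site IDs into the local name space.

    Raising LookupError for an unknown ID is an assumption of this sketch.
    """
    if remote.vnid not in VNID_MAP or remote.class_id not in CLASS_ID_MAP:
        raise LookupError(f"no translation entry for {remote}")
    return Ids(VNID_MAP[remote.vnid], CLASS_ID_MAP[remote.class_id])


local = translate(Ids(vnid=16678781, class_id=49153))
print(local)
```

Keeping one map per ID type mirrors the two rows of the table and makes it explicit that both rewrites must succeed for inter-site traffic to be normalized.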
ACI Multi-Site
Inter-Site Policies and Spines’ Translation Tables

• Inter-Site policies defined on the ACI Multi-Site Orchestrator are pushed to the respective APIC domains
  • End-to-end policy consistency
  • Creation of ‘Shadow’ EPGs to locally represent the policies
• Inter-site communication requires the installation of translation table entries on the spines (namespace normalization)
• Up to ACI release 4.0(1) translation entries are populated only in two cases:
  1. Stretched EPGs/BDs
  2. Creation of a contract between non-stretched EPGs

[Figure: Site 1 and Site 2 connected through an IP Network; EP1 and EP2 belong to EPGs linked by contract C, and each site holds a ‘Shadow’ EPG representing the EPG defined in the other site]
ACI Multi-Site
Removing Policy Enforcement: Preferred Groups
(ACI 4.0(2) Release)

[Figure: Multi-Site Preferred Group containing the Web, App and DB EPGs with free communication among them; contracts C1 and C2 are required to communicate with EPG(s) external to the Preferred Group (Non-PG EPG)]

• Multi-Site Preferred Group configuration from the Multi-Site Orchestrator is supported from ACI 4.0(2) release
  • Creates ‘shadow’ EPGs and translation table entries ‘under the hood’ to allow ‘free’ inter-site communication
  • 250 Preferred Groups supported as of ACI release 4.1(1)
• Typically desired in legacy to ACI migration scenarios
• vzAny support from MSO scoped for a future ACI release (post 4.2(1) release)
• No current plans to support “VRF unenforced” with Multi-Site
Allowing Any-to-Any Communication
Preferred Groups for E-W and N-S Flows

• Adding internal EPGs and External EPGs (associated to L3Outs) to the Preferred Group enables free east-west and north-south connectivity
• When adding the Ext-EPG to the Preferred Group:
  • Can’t use 0.0.0.0/0 for classification, needs more specific prefixes
  • As a workaround it is possible to use 0.0.0.0/1 and 128.0.0.0/1 to achieve the same result

[Figure: Site 1 and Site 2 connected through an IP Network, each with an L3Out and an Ext-EPG; EP1 (EPG1), EP2 (EPG2) and the Ext-EPG are part of the Multi-Site Preferred Group defined on MSO]
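
A quick way to convince yourself that the 0.0.0.0/1 plus 128.0.0.0/1 workaround classifies the same address space as 0.0.0.0/0 is to check it with Python's standard ipaddress module. This is only an arithmetic check on the prefixes, not a statement about how the fabric programs them; the variable names are illustrative.

```python
import ipaddress

default_route = ipaddress.ip_network("0.0.0.0/0")
workaround = [
    ipaddress.ip_network("0.0.0.0/1"),
    ipaddress.ip_network("128.0.0.0/1"),
]

# The two halves do not overlap and together hold every IPv4 address.
disjoint = not workaround[0].overlaps(workaround[1])
total_addresses = sum(net.num_addresses for net in workaround)
covers_everything = disjoint and total_addresses == default_route.num_addresses

# collapse_addresses merges adjacent networks, so the halves fold back into /0.
collapsed = list(ipaddress.collapse_addresses(workaround))

print(covers_everything, collapsed)
```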

ACI Multi-Site
Spines in Separate Sites Connected Back-to-Back
(ACI 3.2(1) Release)

[Figure: spines in two sites connected by inter-site E-W links (direct cable or dark fibre); Multi-Site Orchestrator]

• Back-to-back connections only supported between 2 sites from ACI 3.2 release
  • Support for full mesh and ‘square’ topologies
  • Support for more than 2 sites scoped for a future ACI release
• Current restriction is that a site cannot be ‘transit’ for communication between other sites

ACI Multi-Site
CloudSec Encryption for VXLAN Traffic

• Encrypted Fabric to Fabric traffic [GCM-AES-256-XPN (64-bit PN)]
• CloudSec = “TEP-to-TEP MACSec”
• Packet format: VTEP IP | MACSec | VXLAN | Tenant Packet (VTEP information in clear text)
• Supported from ACI 4.0(1) release for FX line cards and 9332C/9364C platforms

[Figure: fabrics connected through the Inter-Site Network with MP-BGP EVPN; Multi-Site Orchestrator]

ACI Multi-Site Networking Options
Per Bridge Domain Behavior
1. Layer 3 only across sites
• Bridge Domains and subnets not extended across Sites
• Layer 3 Intra-VRF or Inter-VRF communication (shared services across VRFs/Tenants)

[Figure: Site 1 and Site 2 connected through the ISN; BD settings shown in the MSO GUI]

ACI Multi-Site Networking Options
L3 Only across Sites

Why not just use normal routing across independent fabrics?

Independent ACI Fabrics (no ACI Multi-Site), each connected to the WAN through an L3Out:
• Need to apply a contract between the internal EPG and the Ext-EPG associated to the L3Out in Fabric 1
• Need to apply a contract between the internal EPG and the Ext-EPG associated to the L3Out in Fabric 2
• Mandates the use of a multi-VRF capable backbone network (VRF-Lite, MPLS-VPN, etc.) to extend multiple VRFs across fabrics
ACI Multi-Site Networking Options
Per Bridge Domain Behavior
Callout: “Should be the behavior for the majority of BDs with Multi-Site”

1. Layer 3 only across sites
• Bridge Domains and subnets not extended across Sites
• Layer 3 Intra-VRF or Inter-VRF communication (shared services across VRFs/Tenants)

2. IP Mobility without BUM flooding
• Same IP subnet defined in separate Sites
• Support for IP Mobility (‘cold’ and ‘live’* VM migration) and intra-subnet communication across sites
• No Layer 2 BUM flooding across sites

3. Layer 2 adjacency across Sites
• Interconnecting separate sites for fault containment and scalability reasons
• Layer 2 domains stretched across Sites, support for application clustering
• Layer 2 BUM flooding across sites

[Figure: for each option, Site 1 and Site 2 connected through the ISN, with the matching BD settings shown in the MSO GUI]

* ‘Live’ migration officially supported from ACI release 3.2
ACI Multi-Site
Scalability Values Supported in 4.1(1) Release

Scale Parameter          Stretched Objects
Sites                    12
Leaf scale               1600 across all sites
Tenants                  400
VRFs                     1000
IP Subnets               8000
BD                       4000
EPGs                     4000
Endpoints                100000
Contracts                4000
L3Outs External EPGs     500 (prefixes)
IGMP Snooping            8,000

• Starting from ACI release 4.1(1) only the scalability values for the ‘stretched objects’ are documented
• The total number of objects (stretched + local site) cannot exceed the maximum values published in the scalability guide
• For more information please refer to:
  https://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/4-x/verified-scalability/Cisco-ACI-Verified-Scalability-Guide-411.html
ACI Multi-Site
Continuous Scale Improvements

                           ACI Release 3.0   ACI Release 3.1   ACI Release 3.2   ACI Release 4.1
                           MSO 1.0           MSO 1.1           MSO 1.2           MSO 2.1
Number of Sites            5                 8                 10                12
Max Leafs (across sites)   250               800               1200              1600
Tenants                    100               200               300               400
VRF                        400               400               800               1000
BD                         800               2,000             3,000             4,000
EPGs                       800               2,000             3,000             4,000
Contracts                  1,000             2,000             3,000             4,000
L3Out External EPGs        500               500               500               500
Isolated EPGs              N/A               400               400               400

Introducing the ACI Multi-Site Orchestrator
ACI Multi-Site
Multi-Site Orchestrator (MSO)
• Three MSO nodes are clustered and run concurrently (active/active)
  • Typical database redundancy considerations (minority/majority rules)
  • Up to 150 msec RTT latency supported between MSO nodes
  • vSphere VM only form factor initially, physical appliance planned for a future ACI release
• OOB Mgmt connectivity to the APIC clusters deployed in separate sites
  • Up to 1 sec RTT latency between MSO and APIC nodes
• Main functions offered by MSO:
  • Monitoring the health-state of the different ACI Sites
  • Provisioning of day-0 infrastructure configuration to establish inter-site EVPN control plane and VXLAN data plane
  • Defining and provisioning tenant policies across sites
  • Day-2 operation functionalities

[Figure: ACI Multi-Site Orchestrator (REST API and GUI) running as three VMs on hypervisors, 150 msec RTT (max) between nodes and 1 sec RTT (max) to Site 1, Site 2, … Site n]

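
The MSO cluster constraints can be sketched as a small Python check. It assumes that “minority/majority rules” means the usual strict-majority quorum of a three-node cluster, which the slide does not spell out; the function and constant names are hypothetical. The latency limits are the 150 msec and 1 sec RTT values quoted for MSO.

```python
MSO_CLUSTER_SIZE = 3
MAX_MSO_NODE_RTT_MS = 150    # between MSO nodes
MAX_MSO_APIC_RTT_MS = 1000   # between MSO and APIC nodes


def has_majority(reachable_nodes, cluster_size=MSO_CLUSTER_SIZE):
    """Assumed quorum rule: strictly more than half of the nodes are reachable."""
    return reachable_nodes > cluster_size // 2


def latency_ok(node_rtts_ms, apic_rtts_ms):
    """Check measured RTTs (in msec) against the supported maximums."""
    return (all(rtt <= MAX_MSO_NODE_RTT_MS for rtt in node_rtts_ms)
            and all(rtt <= MAX_MSO_APIC_RTT_MS for rtt in apic_rtts_ms))


# A partition that isolates one node leaves 2 of 3 reachable.
print(has_majority(2), has_majority(1))
print(latency_ok(node_rtts_ms=[20, 140], apic_rtts_ms=[300, 900]))
```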
ACI Multi-Site
MSO Dashboard

• Health/Faults for all managed sites


• Easy way to identify stretched policies across sites
• Quickly search for any deployed inter-site policy
• Provide direct access to the APIC GUIs in different sites

ACI Multi-Site
MP-BGP/EVPN Infra Configuration

• Configure Day-0 infra policies


• Select spines establishing MP-BGP EVPN peering with remote sites
• Site/Pod Overlay Unicast and Multicast TEPs (O-UTEP and O-MTEP)
• Spine MP-BGP EVPN Router-IDs (EVPN-RIDs)

ACI Multi-Site
UCSD and Ansible Integration
(ACI 4.1/MSO 2.1)
For more information on setting up ACI Multi-Site via Ansible: BRKACI-2291

UCSD 6.6 and Ansible Main Functions
• Site Management
• Site Infra config and test connectivity
• MSC site inventory
• APIC site management (cross-launch)
• User Management
• Tenant Lifecycle and Site Association
• Schema and Template lifecycle (AP, EPGs, Contracts, VRF, BD, etc.)
• L3Out and External EPG
• Deploy Tenants and Schemas to sites
• Monitoring MSC and Management
• Import brownfield tenant policies and deploy across sites
• Troubleshooting

[Figure: UCSD 6.6 and Ansible orchestration driving Site 1, Site 2, … Site n]

APIC vs. Multi-Site Orchestrator Functions

APIC
• Central point of management and configuration for the Fabric
• Responsible for all Fabric local functions
  • Fabric discovery and bring up
  • Fabric access policies
  • Domains creation (VMM, Physical, etc.)
  • …
• Maintains runtime data (VTEP address, VNID, Class_ID, GIPo, etc.)
• No participation in the fabric control and data planes

Multi-Site Orchestrator
• Complementary to APIC
• Provisioning and managing of “Inter-Site Tenant and Networking Policies”
• Scope of changes
  • Granularly propagate policies to multiple APIC clusters
• Can import tenant configuration from APIC cluster domains
• End-to-end visibility and troubleshooting
• No run time data, configuration repository
• No participation in the fabric control and data planes

Multi-Site
Orchestrator
Deployment Considerations
ACI Multi-Site
MSO Deployment Considerations
Intra-DC Deployment
• Hypervisors can be connected directly to the DC OOB network
• Each MSO node has a unique routable IP (can be part of separate IP subnets)
• Async calls from MSO to APIC

Interconnecting DCs over WAN
• Up to 150 msec RTT latency supported between MSO nodes
• Higher latency (500 msec to 1 sec RTT) between MSO nodes and managed APIC clusters
• If possible deploy MSO nodes in separate sites for availability purposes (network partition scenarios)

[Figure: intra-DC, three MSO VMs on separate hypervisors attached to the same IP network; over the WAN, MSO VMs spread across the Milan (Site1), Rome (Site2) and New York (Site3) sites]
ACI Multi-Site
MSO and APIC Release Dependency (Current Implementation)

• MSO and ACI releases must currently be aligned
  • For example MSO 1.2(x) is used with all sites running ACI release 3.2(x)
• Different ACI versions across sites are only supported during an ACI SW upgrade procedure
• The supported order of upgrade is:
  1. APIC firmware (for each site)
  2. Switches firmware (for each site)
  3. MSO (as last task)

[Figure: MSO 1.2(x) managing four sites over an IP network, all running ACI 3.2(x)]
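
The alignment rule and the upgrade order can be sketched in Python. The release pairs come from the scale table of this session (ACI 3.0/MSO 1.0 up to ACI 4.1/MSO 2.1); the function names are hypothetical, and the plan assumes each site completes its APIC and switch upgrades before the next site starts, which is one reading of “for each site”.

```python
# ACI release -> MSO release pairs listed in the scale table of this session.
ALIGNED_MSO_RELEASE = {"3.0": "1.0", "3.1": "1.1", "3.2": "1.2", "4.1": "2.1"}


def releases_aligned(mso_release, site_aci_releases):
    """True when every site runs the ACI release paired with this MSO release."""
    return all(ALIGNED_MSO_RELEASE.get(aci) == mso_release
               for aci in site_aci_releases)


def upgrade_plan(sites):
    """Return (target, task) steps: APIC then switches per site, MSO last."""
    steps = []
    for site in sites:
        steps.append((site, "APIC firmware"))
        steps.append((site, "switches firmware"))
    steps.append(("MSO", "MSO software"))
    return steps


print(releases_aligned("1.2", ["3.2", "3.2", "3.2", "3.2"]))
for target, task in upgrade_plan(["Site 1", "Site 2"]):
    print(target, "->", task)
```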

ACI Multi-Site
Decoupling MSO and APIC Releases
(MSO 2.2(1) Release)

• MSO “inter-version” support is planned for MSO release 2.2(1) (Q3CY19)
• Different ACI versions across sites can be supported at steady state
• MSO will have visibility into what functionalities are supported in each fabric (based on the specific ACI releases)
  • Preventing the deployment of unsupported functionalities

[Figure: MSO 2.2(1) managing four sites over an IP network running ACI 3.2(x), ACI 4.0(1), ACI 4.1(1) and ACI 4.2(1)]
How to Define Schemas,
Templates and their Mappings to
ACI Sites?
ACI Multi-Site
MSO Schema and Templates
• Template = ACI policy definition (ANP, EPGs, BDs, VRFs, etc.)
• Schema = container of Templates sharing a common use-case
  • As an example, a schema can be dedicated to a Tenant
• The template is currently the atomic unit of change for policies
  • Such policies are concurrently pushed to one or more sites
• Scope of change: policies in different templates can be pushed to separate sites at different times

[Figure: a Schema whose templates are pushed to Site 1 and Site 2, where they become the effective policy]

ACI Multi-Site
Schema and Templates Definition for the DR Use Case

Option 1: single Template (t1) associated to both the Prod and DR Sites
• Any change applied to the template is pushed to both sites simultaneously
• Easiest way to keep consistent policies deployed across sites

Option 2: separate Templates (t1, t2) associated to the Prod and DR Sites
• Changes made to a template are applied only to the mapped site
• Requires sync between the two templates (manual or performed by a higher level orchestration tool)

Option 3 (future): single Template associated to both the Prod and DR Sites
• Capability of independently applying changes to each site
• Brings together the advantages of the previous two options
Schema Design
One Template per Site, plus a ‘Stretched’ Template

Schema
• Site 1 Template (deployed to Site 1): ANP1 with EPG1 and EPG2; BD1, BD2
• Site 2 Template (deployed to Site 2): ANP1 with EPG3 and EPG4; BD3, BD4
• Site 3 Template (deployed to Site 3): ANP1 with EPG5 and EPG6; BD5, BD6
• Stretched Template (for stretched objects): ANP1 with EPG7; BD7; VRF; Contracts C1, C2


Schema Design
Deployment Considerations

• All objects defined inside the schema are visible and can be referenced via the drop-down list
  • This is not the case for objects referenced across schemas: for those, type at least 3 letters of their name to have them displayed, then create the reference
• Current support limited to 5 templates per schema
  • With four sites you could have a template per site and one stretched template (would not scale to support other combinations)
• Be aware of the maximum object limit in the same schema (500 objects is the current limit)
  • Every object that can be defined in a template counts (EPGs, BDs, VRFs, Contracts, etc.)
  • May make sense to define directly on APIC the objects that are only used locally in a site
• Note: increasing both the number of templates and number of objects in a schema is planned for a future ACI release (2HCY19)
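
The two schema limits can be checked before a design is entered on MSO. The following Python sketch models a schema as a plain dictionary of template name to object names; that representation and the check_schema helper are hypothetical, not an MSO API. The limits (5 templates, 500 objects) are the ones quoted above.

```python
from typing import Dict, List

MAX_TEMPLATES_PER_SCHEMA = 5
MAX_OBJECTS_PER_SCHEMA = 500


def check_schema(templates: Dict[str, List[str]]) -> List[str]:
    """Return a list of limit violations (empty when the schema fits)."""
    problems = []
    if len(templates) > MAX_TEMPLATES_PER_SCHEMA:
        problems.append(
            f"{len(templates)} templates exceed the limit of {MAX_TEMPLATES_PER_SCHEMA}")
    # Every object defined in any template counts toward the schema total.
    total_objects = sum(len(objects) for objects in templates.values())
    if total_objects > MAX_OBJECTS_PER_SCHEMA:
        problems.append(
            f"{total_objects} objects exceed the limit of {MAX_OBJECTS_PER_SCHEMA}")
    return problems


# One template per site plus a stretched template, as in the Schema Design slide.
schema = {
    "Site 1": ["ANP1", "EPG1", "EPG2", "BD1", "BD2"],
    "Site 2": ["ANP1", "EPG3", "EPG4", "BD3", "BD4"],
    "Site 3": ["ANP1", "EPG5", "EPG6", "BD5", "BD6"],
    "Stretched": ["ANP1", "EPG7", "BD7", "VRF", "C1", "C2"],
}
print(check_schema(schema))
```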

How to Define the Policies inside a
Template for a Given Tenant?
ACI Multi-Site Orchestrator
Defining Policies in a Template
Green Field Deployment (Site 1 and Site 2 are both green field)
1a. Model new tenant and policies to a common template on MSO and associate the template to both sites (for stretched objects)
1b. Model new tenant and policies to site-specific templates and associate them to each site
2. Push policies to the ACI sites

Import Policies from an Existing Fabric (Site 1 is an existing fabric, Site 2 is green field)
1. Import existing tenant policies from site 1 to new common and site-specific templates on MSO
2a. Associate the common template to both sites (for stretched objects)
2b. Associate site-specific templates to each site
3. Push the policies back to the ACI sites
ACI Multi-Site Orchestrator
Defining Policies in a Template (2)
Import Policies from Multiple Existing Fabrics (Site 1 and Site 2 are both existing fabrics)

1. Import existing tenant policies from site 1 and site 2 to new common and site-specific templates on ACI MSO
2a. Associate the common template to both sites (for stretched objects)
2b. Associate site-specific templates to each site
3. Push the policies back to the ACI sites

• In the current implementation, MSO does not allow diff/merge operations on policies from different APIC domains
• It is still possible to import policies for the same tenant from different APIC domains, under the assumption those are not conflicting
  • Tenant defined with the same Name
  • Name and policies for stretched objects are also common
Inter-Site Connectivity Deployment
Considerations
ACI Multi-Site
Inter-Site Network (ISN) Requirements

[Figure: sites connected through the Inter-Site Network with MP-BGP EVPN peering; Multi-Site Orchestrator]

• Not managed by APIC, must be independently configured (day-0 configuration)
• IP topology can be arbitrary, not mandatory to connect to all spine nodes
• Main requirements:
  • OSPF on the first hop routers to peer with the spine nodes and exchange TEP address reachability
  • Must use sub-interfaces (with VLAN tag 4) toward the spines
  • Increased end-to-end MTU support (at least 1550 B)
ACI Multi-Site and MTU
Different MTU Meanings

1. Data Plane MTU: MTU of the traffic generated by endpoints (servers, routers, service nodes, etc.) connected to ACI leaf nodes
   • Need to account for 50B of overhead (VXLAN encapsulation) for inter-site communication
2. Control Plane MTU: for CPU generated traffic like EVPN across sites
   • The default value is 9000B, can be tuned to the maximum MTU value supported in the ISN

[Figure: (1) endpoints attached to the leaf nodes in each site; (2) the MP-BGP EVPN session across the ISN; Multi-Site Orchestrator]

ACI Multi-Site and MTU
Tuning MTU for EVPN Traffic across Sites

• Control Plane MTU can be set leveraging the “CP MTU Policy” on APIC (modify the default 9000B MTU value)
• The required MTU in the ISN would then depend on this setting and on the Data Plane MTU configuration
  • Always need to consider the VXLAN encapsulation overhead for data plane traffic (50/54 bytes)

[Figure: configurable MTU for the MP-BGP EVPN session across the ISN; Multi-Site Orchestrator]
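
The dependency between the two MTU settings can be written as a one-line calculation. This Python sketch assumes the ISN must carry the larger of (data plane MTU + VXLAN overhead) and the configured control plane MTU; that is a simplified reading of the two bullets above, and required_isn_mtu is a hypothetical helper name. The 50-byte overhead and the 9000B default come from the slides (use 54 where the larger overhead applies).

```python
VXLAN_OVERHEAD_BYTES = 50   # the slides quote 50/54 bytes
DEFAULT_CP_MTU = 9000       # default control plane MTU


def required_isn_mtu(data_plane_mtu,
                     control_plane_mtu=DEFAULT_CP_MTU,
                     overhead=VXLAN_OVERHEAD_BYTES):
    """Smallest ISN MTU that fits both encapsulated data and control traffic."""
    return max(data_plane_mtu + overhead, control_plane_mtu)


# 1500B endpoints with the control plane MTU tuned down to 1500B.
print(required_isn_mtu(1500, control_plane_mtu=1500))
# 9000B endpoints with the default control plane MTU.
print(required_isn_mtu(9000))
```

With 1500B endpoints and a control plane MTU of 1500B the result is 1550, which matches the “at least 1550 B” ISN requirement stated earlier.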

ACI Multi-Site and QoS
Intra-Site QoS Behavior

• ACI Fabric supports six classes of service
• Traffic is classified only in the ingress leaf
  • The CoS value in the iVXLAN packet is set based on the table below
• Three user configurable classes of service for user data traffic
  • Level3 is the default class (CoS value 0)
• Three reserved classes of service for control traffic and SPAN
  • APIC controller traffic
  • Control traffic for traffic destined to the supervisor
  • SPAN traffic
  • Traceroute traffic
• Each class is configured at the fabric level and mapped to a hardware queue

Class of Service/QoS-group   Traffic Type              Dot1p Marking in VXLAN Header
0                            Level3 user data          0
1                            Level2 user data          1
2                            Level1 user data          2
3                            APIC controller traffic   3
4                            SPAN traffic              4
5                            Control Traffic           5
5                            Traceroute                6

Note: 3 additional user classes have been added in ACI 4.0(1) release

ACI Multi-Site and QoS
Inter-Site QoS Behavior
• Traffic across sites should be consistently prioritised (as it happens intra-site)
• To achieve this end-to-end consistent behavior it is required to perform DSCP-to-CoS
mapping on the spines (Spines-to-ISN and ISN-to-Spines)
• Important: must ensure that no traffic is received by the spines from the IPN with the DSCP
marking associated to Traceroute (spines do not forward this traffic toward the leaf nodes)
• The traffic can then be properly treated inside the ISN (classification/queuing)

[Figure: the sending spines set the outer DSCP field based on the configured mapping, traffic is classified and queued in the ISN, and the receiving spines set the iVXLAN CoS field based on the configured mapping. Labels shown: MP-BGP EVPN, CS5, Multi-Site Orchestrator]
Control Plane Considerations
ACI Multi-Site
BGP Inter-Site Peers
• Spines connected to the Inter-Site Network perform two main functions:
1. Establishment of MP-BGP EVPN peerings with spines in remote sites
 One dedicated Control Plane address (EVPN-RID) is assigned to each spine running MP-BGP EVPN
2. Forwarding of inter-sites data-plane traffic
 Anycast Overlay Unicast TEP (O-UTEP): assigned to all the spines connected to the ISN and used to source and receive L2/L3 unicast traffic
 Anycast Overlay Multicast TEP (O-MTEP): assigned to all the spines connected to the ISN and used to receive L2 BUM traffic
• EVPN-RID, O-UTEP and O-MTEP addresses are assigned from the Multi-Site Orchestrator and must be routable across the ISN

[Figure: four spines (EVPN-RID 1 to EVPN-RID 4) connected to the Inter-Site Network and sharing the Anycast VTEP Addresses O-UTEP & O-MTEP]
ACI Multi-Site
Exchanging TEP Information across Sites

• OSPF peering between spines and Inter-Site network
• Exchange of External Spine TEP addresses (EVPN-RID, O-UTEP and O-MTEP) across sites
 Internal TEP Pool information not needed to establish inter-site communication (should be filtered out on the first-hop ISN router)
 Use of overlapping internal TEP Pools across sites is fully supported

[Figure: Site 1 (spines S1-S4, TEP Pool 1) and Site 2 (spines S5-S8, TEP Pool 2) peer via OSPF with the Inter-Site Network; IS-IS to OSPF mutual redistribution; the advertisement of internal TEP pools into the ISN is filtered out]

IP Network Routing Table
O-UTEP A, O-MTEP A, EVPN-RID S1-S4
O-UTEP B, O-MTEP B, EVPN-RID S5-S8

Leaf Routing Table (Site 1)
IP Prefix    Next-Hop
O-UTEP B     Pod1-S1, Pod1-S2, Pod1-S3, Pod1-S4

Leaf Routing Table (Site 2)
IP Prefix    Next-Hop
O-UTEP A     Pod2-S1, Pod2-S2, Pod2-S3, Pod2-S4
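The filtering of internal TEP pools on the first-hop ISN router can be sketched with a prefix check. Everything below is illustrative: the function name and all addresses are made up, and a real deployment would express this as a routing policy on the ISN device rather than as code.

```python
import ipaddress


def filter_tep_pools(advertised, internal_tep_pools):
    """Keep only prefixes that do not fall inside an internal TEP pool."""
    pools = [ipaddress.ip_network(p) for p in internal_tep_pools]
    kept = []
    for prefix in advertised:
        network = ipaddress.ip_network(prefix)
        inside_pool = any(
            network.version == pool.version and network.subnet_of(pool)
            for pool in pools
        )
        if not inside_pool:
            kept.append(prefix)  # EVPN-RID, O-UTEP and O-MTEP style addresses
    return kept


if __name__ == "__main__":
    # Hypothetical prefixes received from the spines of one site.
    from_site1 = ["10.1.0.0/16", "10.1.4.0/24", "172.16.1.1/32", "172.16.1.100/32"]
    print(filter_tep_pools(from_site1, ["10.1.0.0/16"]))
```

Because the pools never leave the site, two sites can reuse the same internal TEP pool without creating a conflict in the ISN.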
ACI Multi-Site
Inter-Site MP-BGP EVPN Control Plane

• MP-BGP EVPN used to communicate Endpoint (EP) information across Sites
 MP-iBGP or MP-EBGP peering options supported
 Remote host route entries (EVPN Type-2) are associated to the remote site Anycast O-UTEP address
• Automatic filtering of endpoint information across Sites
 Host routes are exchanged across sites only if there is a cross-site contract requiring communication between endpoints

[Figure: the Multi-Site Orchestrator defines and pushes the inter-site policy (contract C between the EPGs of EP1 and EP2); spines S1-S4 in Site 1 and S5-S8 in Site 2 learn local endpoints via COOP and exchange them via MP-BGP EVPN across the Inter-Site Network]

S1-S4 Table         S5-S8 Table
EP1  Leaf 1         EP2  Leaf 4
EP2  O-UTEP B       EP1  O-UTEP A
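The contract-driven filtering of host routes can be sketched as follows. This is a simplified model under stated assumptions: the data structures, the function name and the EPG names are hypothetical, and the real selection logic inside the fabric is not described on the slide.

```python
def endpoints_to_advertise(local_endpoints, contracts, remote_epgs):
    """Return local endpoints whose EPG has a contract with an EPG in the remote site.

    local_endpoints: dict mapping endpoint name -> EPG name
    contracts: set of frozensets, each holding the two EPGs joined by a contract
    remote_epgs: set of EPG names deployed in the remote site
    """
    advertised = []
    for endpoint, epg in local_endpoints.items():
        if any(frozenset((epg, remote)) in contracts for remote in remote_epgs):
            advertised.append(endpoint)  # would be sent as an EVPN Type-2 host route
    return sorted(advertised)


if __name__ == "__main__":
    site1 = {"EP1": "EPG-Web", "EP3": "EPG-Backup"}
    contracts = {frozenset(("EPG-Web", "EPG-App"))}
    print(endpoints_to_advertise(site1, contracts, {"EPG-App"}))
```

In this model EP3 stays local because no cross-site contract references its EPG, which keeps the remote spine tables small.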
Data Plane
ACI Multi-Site
Inter-Sites Unicast Data Plane

Policy information (EP1’s Class-ID) carried across Sites
VXLAN packet fields: VTEP IP | VNID | Class-ID | Tenant Packet

1. EP1 (10.10.10.10) sends traffic to EP2 (20.20.20.20)
2. EP2 unknown, traffic is encapsulated to the local Proxy A Spine VTEP (adding S_Class information)
3. S2 has remote info for EP2 and encapsulates traffic to remote O-UTEP B Address
4. S6 translates the VNID and Class-ID to local values and sends traffic to the local leaf
5. Leaf learns remote Site location info for EP1
6. If policy allows it, EP2 receives the packet

All VXLAN Inter-Site unicast traffic always sourced from O-UTEP A and destined to O-UTEP B

Outer VXLAN header per step (inner packet: source 10.10.10.10, destination 20.20.20.20)
Step   Destination VTEP   Source VTEP
2      Proxy-A            S1-L4-TEP
3      O-UTEP B           O-UTEP A
4      S2-L3-TEP          O-UTEP A

[Figure: Site 1 (spines S1-S4, O-UTEP A, Proxy A) and Site 2 (spines S5-S8, O-UTEP B, Proxy B) connected through the Inter-Site Network; Site 1 leaf table: EP1 e1/3, * Proxy A; Site 2 leaf table: EP2 e1/1, EP1 O-UTEP A, * Proxy B]
ACI Multi-Site
Inter-Sites Unicast Data Plane (2)

Policy information (EP1’s Class-ID) carried across Sites
VXLAN packet fields: VTEP IP | VNID | Class-ID | Tenant Packet

7. EP2 sends traffic back to remote EP1
8. Leaf applies the policy and, if allowed, encapsulates traffic to remote O-UTEP address
9. S6 rewrites the S-VTEP to be O-UTEP B
10. S3 translates the VNID and S_Class to local values and sends traffic to the local leaf
11. Leaf learns remote Site location info for EP2
12. EP1 receives the packet

Outer VXLAN header per step (inner packet: source 20.20.20.20, destination 10.10.10.10)
Step   Destination VTEP   Source VTEP
8      O-UTEP A           S2-L4-TEP
9      O-UTEP A           O-UTEP B
10     S1-L4-TEP          O-UTEP B

[Figure: Site 1 leaf table: EP1 e1/3, EP2 O-UTEP B, * Proxy A; Site 2 leaf table: EP1 O-UTEP A, * Proxy B]
ACI Multi-Site
Inter-Sites Unicast Data Plane (3)

From this point EP1 to EP2 communication is encapsulated Leaf to Remote Spine O-UTEPs in both directions

[Figure: Site 1 (spines S1-S4, O-UTEP A) and Site 2 (spines S5-S8, O-UTEP B) connected through the Inter-Site Network; Site 1 leaf table: EP1 e1/3, EP2 O-UTEP B, * Proxy A; Site 2 leaf table: EP2 e2/5, EP1 O-UTEP A, * Proxy B]
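The unicast walk-through rests on two lookups: the leaf decides where to encapsulate, and the receiving spine translates the remote namespace. The sketch below is a toy model; the function names and the numeric VNID and Class-ID values are hypothetical.

```python
def leaf_outer_destination(leaf_table, destination_endpoint, proxy="Proxy A"):
    """Leaf lookup: known endpoints use the learned location, unknown ones go to the spine proxy."""
    return leaf_table.get(destination_endpoint, proxy)


def translate_on_receiving_spine(vnid_map, class_id_map, vnid, class_id):
    """Receiving spine rewrites the remote site's VNID and Class-ID to local values."""
    return vnid_map[vnid], class_id_map[class_id]


if __name__ == "__main__":
    site1_leaf = {"EP1": "e1/3"}
    # First packet: EP2 is unknown, so traffic is sent to the local proxy.
    print(leaf_outer_destination(site1_leaf, "EP2"))
    # After the return traffic is received, the leaf has learned EP2 behind O-UTEP B.
    site1_leaf["EP2"] = "O-UTEP B"
    print(leaf_outer_destination(site1_leaf, "EP2"))
    # Hypothetical translation tables on a Site 2 spine.
    print(translate_on_receiving_spine({15000001: 16000002}, {49153: 32770}, 15000001, 49153))
```

Once both leaf tables hold the remote O-UTEP entry, the proxy lookup on the local spines is no longer needed for this flow.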
ACI Multi-Site
Layer 2 BUM Traffic Data Plane

1. EP1 generates a BUM frame
2. BUM frame is associated to GIPo1 and flooded intra-site via the corresponding FTAG tree
3. S3 is elected as Multi-Site forwarder for GIPo1 BUM traffic: it creates a unicast VXLAN packet with O-UTEP A as S_VTEP and Multicast O-MTEP B as D_VTEP
4. S7 translates the VNID and the GIPo values to locally significant ones and associates the frame to an FTAG tree
5. BUM frame is flooded along the tree associated to GIPo. VTEP learns EP1 remote location
6. EP2 receives the BUM frame

All VXLAN Inter-Site BUM traffic always sourced from O-UTEP A and destined to O-MTEP B
GIPo1 = Multicast Group associated to EP1’s BD

[Figure: Site 1 (spines S1-S4, O-UTEP A) and Site 2 (spines S5-S8, O-MTEP B) connected through the Inter-Site Network; Site 2 leaf table: EP1 O-UTEP A, * Proxy B]
Tenant Multicast Routing
Tenant Multicast Routing with Multi-Site
Deployment Considerations
• Supported from ACI 4.0 release only on 2nd Gen leaf switches (EX/FX/FX2)
• Support for PIM-ASM and PIM-SSM in the overlay
 For PIM-ASM external RP is required (RP in the fabric planned for 4.2(1) release)
• All sites must have reachability to the external RP via each site's local L3Outs
 Each BL node runs PIM in active mode and forms neighborship with other BLs in the same site
and with the external router(s)
 PIM protocol states (hellos/joins/prunes) are contained within a site
• Supports sources attached to the fabric (as an endpoint) and sources outside of the fabric
(reachable via local L3Out)
 BDs with L3 Multicast sources or receivers may or may not be stretched across the sites
 External sources must be reachable independently from each site via local L3Outs

• Supports receivers attached to the fabric (as an endpoint) and receivers outside of the
fabric (reachable via L3Out)
 BDs with L3 Multicast sources or receivers may or may not be stretched across the sites

Multicast
Forwarding
Multicast Routing with Multi-Site
Forwarding Behavior

• Each defined VRF gets assigned a dedicated multicast group (VRF GIPo)
• L3 Multicast within a site is forwarded using the VRF GIPo tree and delivered to all
leaf nodes where the VRF is deployed (whether there is multicast interest or not)
 Multicast is dropped at the egress leaf in the case where there are no interested
receivers

• Inter-site L3 Multicast for a given VRF is forwarded using HER (Head End
Replication) tunnels whether there is multicast interest or not at the receiving site
 Multicast is dropped at the receiving spine if there are no interested receivers in that site

• No contracts are needed for forwarding L3 Multicast data plane traffic
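The forwarding behavior above can be modelled in a few lines. This is a sketch under assumptions: the function and site names are hypothetical, and it only mirrors the rule that copies are sent to every remote site where the VRF is stretched and dropped on arrival when nobody is interested.

```python
def inter_site_l3_multicast(vrf_sites, source_site, receivers_per_site):
    """Return what happens to the head-end replicated copy at each remote site."""
    outcome = {}
    for site in vrf_sites:
        if site == source_site:
            continue  # intra-site delivery uses the VRF GIPo tree instead
        # A copy is always sent, whether or not the site has interested receivers.
        if receivers_per_site.get(site):
            outcome[site] = "forwarded on local VRF GIPo"
        else:
            outcome[site] = "dropped at receiving spine"
    return outcome


if __name__ == "__main__":
    print(inter_site_l3_multicast(
        ["Site 1", "Site 2", "Site 3"], "Site 1", {"Site 2": ["R3", "R4"]}
    ))
```

The trade-off is extra ISN bandwidth for sites without receivers in exchange for not having to track multicast interest across sites.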

Multicast Routing with Multi-Site
L2 Multicast over Multi-Site (Supported since ACI 3.0)
For more information on ACI Multicast Routing: BRKACI-2608
• Stretched BDs with BUM Traffic Enabled (no PIM configuration required)
• Within a site the L2 multicast is sent to the BD GIPo multicast address (unique per site), so it reaches all the spines and the leaf nodes where the BD is defined (configuration driven)
• The spine elected as Designated Forwarder (DF) replicates the stream to each remote site where the BD is stretched
• At the receiving spine the multicast is sent down the FTAG tree to the receiving site BD GIPo multicast address

[Figure: a source in BD1 at Site 1 and BD1 endpoints at both sites (receivers and one endpoint marked NOT Receiver), connected through the Inter-Site Network. HREP tunnel destination: 10.100.102.200]

Site 1: BD1 VNID 16514962, BD1 GIPo 225.0.195.240
Site 2: BD1 VNID 16711545, BD1 GIPo 225.1.128.160
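The per-site VNID and GIPo values shown for BD1 lend themselves to a small translation example. The table layout and function name below are hypothetical; only the four numeric values come from the slide.

```python
# Values for BD1 as shown on the slide; the dictionary layout is an assumption.
STRETCHED_BDS = {
    "BD1": {
        "Site 1": {"vnid": 16514962, "gipo": "225.0.195.240"},
        "Site 2": {"vnid": 16711545, "gipo": "225.1.128.160"},
    },
}


def translate_bd(source_site, receiving_site, vnid):
    """Map the source site's BD VNID to the receiving site's VNID and GIPo."""
    for sites in STRETCHED_BDS.values():
        if sites[source_site]["vnid"] == vnid:
            local = sites[receiving_site]
            return local["vnid"], local["gipo"]
    raise KeyError(f"no stretched BD with VNID {vnid} in {source_site}")


if __name__ == "__main__":
    print(translate_bd("Site 1", "Site 2", 16514962))
```

After the translation, the receiving spine can flood the frame on the FTAG tree of the local BD GIPo.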
Multicast Routing with Multi-Site
L3 Multicast over Multi-Site (Source Inside the Fabric)
• Built as Routing-First Approach (decrement TTL at source and destination ACI leaf node)
• L3 Multicast is always sent to the VRF GIPo within a site (existing behavior)
• Between sites it is sent over the HREP tunnel to the Multicast TEP of the remote sites
where the VRF is stretched (the VXLAN header will include the source site VRF VNID)
• L3 Multicast at the receiving site will be sent in the VRF GIPo of the receiving site
[Figure: source in BD1 at Site 1 with receivers R1 and R2 in Site 1 and R3 and R4 in Site 2 (BD1/BD2), external RP reachable via the L3Outs; TTL is decremented on the source leaf and on the destination leaves. HREP tunnel dest: 10.100.102.200, VXLAN header carrying Site1 VRF VNID 2293762]

Site 1: VRF VNID 2293762, VRF GIPo 225.1.248.32
Site 2: VRF GIPo 225.1.248.16
Multicast Routing with Multi-Site
L3 Multicast over Multi-Site (Source Outside the Fabric)

• Local L3Out must be used to receive traffic from an external source


• Multicast traffic from external sources is dropped on the spines (to avoid traffic duplication)

[Figure: external source and RP reachable via the L3Out of each site; receivers R1 in Site 1 and R3 and R4 in Site 2; TTL is decremented on the leaf nodes. Multicast traffic from external sources is dropped on the spines (O-MTEP) and is not sent over HREP tunnels]

Site 1: VRF VNID 2293762, VRF GIPo 225.1.248.32
Site 2: VRF GIPo 225.1.248.16
PIM Sparse Mode
Control and Data Planes
Multicast Routing with Multi-Site
External RP requirement with ACI 4.0

• RP must be external to the fabric
• All sites can point to the same RP address

[Figure: Site 1 (L3Out-1) and Site n (L3Out-2) connect to a PIM Enabled Core; both sites are configured with RP 1.1.1.1, which resides in the core]
Multicast Routing with Multi-Site
External RP requirement with ACI 4.0

• Sites can be connected to different RP addresses with inter-domain multicast

[Figure: Site 1 (L3Out-1) uses RP 1.1.1.1 and Site n (L3Out-2) uses RP 2.2.2.2; the two RPs reside in the PIM Enabled Core and are connected via MSDP]
Source Inside, Receiver Inside
Control and Data Planes

Multicast Routing with Multi-Site
Sources Inside, Receivers Inside (Control Plane)
 A receiver is connected to a leaf node in a site and sends an IGMP Join for group G
 PIM Shared tree with (*,G) state is built from the leaf node toward the external RP

[Figure: a receiver in Site 1 sends an IGMP Join for G; (*,G) state is created on the leaf and on BL11 (label: COOP); a PIM (*,G) Join is sent via L3Out-1 toward the external RP, creating (*,G) state along the path; Site 2 with BL21 and L3Out-2]
Multicast Routing with Multi-Site
Sources Inside, Receivers Inside (Control Plane cont.)
 Control plane activities when a source connected to a leaf node starts streaming traffic

[Figure: Source S in Site 1 sends MC traffic to G; PIM Register* and PIM Register Stop are exchanged with the external RP; the source's IP subnet is advertised out of the L3Out; PIM (S,G) Joins create (S,G) state on the path between the RP and the fabric; receiver in Site 2]

* PIM register packets are unicast packets (sent from first-hop router to the RP external to the fabric), with PIM protocol number (103) set in the IP header
Multicast Routing with Multi-Site
Sources Inside, Receivers Inside (Data Plane)
• When the RP receives the register from the source, it forwards multicast down the shared tree
• When the BL (BL21) installs (S,G) and sees that the source is part of a pervasive BD, it sends a PIM prune towards the RP

[Figure: Source S in Site 1 sends MC traffic to G; (S,G) state on BL11 and BL21; PIM Prunes are sent toward the RP; multicast data reaches the receiver in Site 2]
Source Outside, Receiver Inside
Control and Data Planes

Multicast Routing with Multi-Site
Sources Outside, Receivers Inside (Control Plane)
 A receiver is connected to a leaf node in a site and sends an IGMP Join for group G
 PIM Shared tree with (*,G) state is built from the leaf node toward the external RP

[Figure: receivers in Site 1 and Site 2 send IGMP Joins for G (label: COOP); BL11 and BL21 send PIM (*,G) Joins via L3Out-1 and L3Out-2 toward the external RP, creating (*,G) state along the path; Source S is outside the fabric]
Multicast Routing with Multi-Site
Sources Outside, Receivers Inside (Data Plane)
 Each site must receive multicast sent from external sources via a local L3Out
 Multicast traffic from external sources is dropped on spine. Not sent over HREP tunnels

[Figure: external Source S sends MC traffic to G; multicast data reaches the receivers in Site 1 and Site 2 through L3Out-1 (BL11) and L3Out-2 (BL21), with (S,G) state on both BL nodes]
Multicast Routing with Multi-Site
Sources Outside, Receivers Inside with Transit Case
 Transit multicast use case is supported. One site can be transit for an external source and that multicast flow can arrive at another site via the local L3Out. Multicast is not sent over the ISN in this case
 Multicast traffic from external sources is dropped on spine. Not sent over HREP tunnels

[Figure: Site 1 with L3Out-1a and L3Out-1b on BL11, Site 2 with L3Out-2 on BL21; external Source S and RP; multicast data reaches the receivers in both sites]
Source Inside, Receiver Outside
Control and Data Planes

Multicast Routing with Multi-Site
Sources Inside, Receivers Outside (Control Plane)
 Control plane activities when a source connected to a leaf node starts streaming traffic

[Figure: Source S in Site 1 sends MC traffic to G; PIM Register* and PIM Register Stop are exchanged with the external RP; the source's IP subnet is advertised out of the L3Out; PIM (S,G) Joins create (S,G) state on the path toward the external receiver]

* PIM register packets are unicast packets (sent from first-hop router to the RP external to the fabric), with PIM protocol number (103) set in the IP header
Multicast Routing with Multi-Site
Sources Inside, Receivers Outside (Data Plane)
• When the RP receives the register from the source, it forwards multicast down the shared tree
• When the BL (BL21) installs (S,G) and sees that the source is part of a pervasive BD, it sends a PIM prune towards the RP

[Figure: Source S in Site 1 sends MC traffic to G; (S,G) state on BL11 and BL21; PIM Prunes are sent toward the RP; multicast data reaches the external receiver]
Multicast Routing with Multi-Site
Sources Inside on a Stretched BD, Receivers Outside (Control Plane)
• In scenarios where the source is part of a stretched BD, the RP may send the PIM (S,G) Join to either site

[Figure: Source S in Site 1 sends MC traffic to G; PIM Register* and PIM Register Stop are exchanged with the external RP; the source's IP subnet is advertised out of both L3Out-1 and L3Out-2; the PIM (S,G) Join is sent toward Site 2, creating (S,G) state; external receiver]

* PIM register packets are unicast packets (sent from first-hop router to the RP external to the fabric), with PIM protocol number (103) set in the IP header
Multicast Routing with Multi-Site
Sources Inside, Receivers Outside (Data Plane)
• In scenarios where the source is part of a stretched BD, the RP may send the PIM (S,G) Join to either site
• Multicast flows may be sent to the external receivers through the L3Out of a remote site

[Figure: Source S in Site 1 sends MC traffic to G; multicast data crosses the Inter-Site Network and leaves through L3Out-2 (BL21) in Site 2 toward the RP and the external receiver; (S,G) state along the path]
PIM SSM
Control and Data Planes
Source Inside, Receiver Inside
Control and Data Planes

PIM SSM Multicast Routing with Multi-Site
Sources Inside, Receivers Inside (Control Plane)
 A receiver is connected to a leaf node in a site and sends an IGMPv3 Join for group (S,G)
 An (S,G) state is created on the leaf nodes where the endpoint is connected and on the BL node

[Figure: Source S in Site 1 and a receiver in Site 2 sending an IGMPv3 Join for (S,G); (S,G) state in Site 2 (label: COOP); BL11 with L3Out-1 and BL21 with L3Out-2]
PIM-SSM Multicast Routing with Multi-Site
Sources Inside, Receivers Inside (Data Plane)
 As soon as the multicast source starts sending traffic, it is encapsulated and sent to all the local leaf nodes and all the remote sites where the VRF is defined
 The traffic will hence be received by the remote receiver

[Figure: Source S in Site 1 sends MC traffic to G; (S,G) state in both sites; the receiver is in Site 2; BL11 with L3Out-1 and BL21 with L3Out-2]
Source Outside, Receiver Inside
Control and Data Planes

PIM-SSM Multicast Routing with Multi-Site
Sources Outside, Receivers Inside (Control Plane)
 A receiver is connected to a leaf node in a site and sends an IGMPv3 Join for group (S,G)
 PIM (S,G) Joins are sent from the BL nodes toward the external network to build (S,G) state up to the last router where the source is connected

[Figure: receivers in Site 1 and Site 2 send IGMPv3 Joins for (S,G) (label: COOP); BL11 and BL21 send PIM (S,G) Joins via L3Out-1 and L3Out-2 toward the external Source S, creating (S,G) state along the path]
PIM-SSM Multicast Routing with Multi-Site
Sources Outside, Receivers Inside (Data Plane)
 Each site must receive multicast sent from external sources via a local L3Out
 Multicast traffic from external sources is dropped on spine. Not sent over HREP tunnels

[Figure: external Source S sends MC traffic to G; multicast data reaches the receivers in Site 1 and Site 2 through L3Out-1 (BL11) and L3Out-2 (BL21)]
Source Inside, Receiver Outside
Control and Data Planes

PIM-SSM Multicast Routing with Multi-Site
Sources Inside, Receivers Outside (Control Plane)
 Control plane activities when a receiver connected to an external network wants to join an (S,G) stream

[Figure: the external receiver sends an IGMPv3 Join for (S,G); the source's IP subnet is advertised out of the L3Out; a PIM (S,G) Join reaches BL11 via L3Out-1, creating (S,G) state; Source S is in Site 1]

* PIM register packets are unicast packets (sent from first-hop router to the RP external to the fabric), with PIM protocol number (103) set in the IP header
Multicast Routing with Multi-Site
Sources Inside, Receivers Outside (Data Plane)
 As soon as the multicast source starts sending traffic, it is encapsulated and sent to all the local leaf nodes and all the remote sites where the VRF is defined
 Traffic flows out of the local L3Out toward the external receiver

[Figure: Source S in Site 1 sends MC traffic to G; multicast data leaves through L3Out-1 (BL11) toward the external receiver; Site 2 with BL21 and L3Out-2]
Multicast Routing with Multi-Site
Sources Inside on a Stretched BD, Receivers Outside (Control Plane)
• In scenarios where the source is part of a stretched BD, the PIM (S,G) Joins could be sent toward either site

[Figure: the external receiver sends an IGMPv3 Join for (S,G); the source's IP subnet is advertised out of both L3Out-1 and L3Out-2; the PIM (S,G) Join is sent toward Site 2, creating (S,G) state; Source S is in Site 1]

* PIM register packets are unicast packets (sent from first-hop router to the RP external to the fabric), with PIM protocol number (103) set in the IP header
PIM-SSM Multicast Routing with Multi-Site
Sources Inside, Receivers Outside (Data Plane)
• In scenarios where the source is part of a stretched BD, the PIM (S,G) Joins could be sent toward either site
• Multicast flows may be sent to the external receivers through the L3Out of a remote site

[Figure: Source S in Site 1 sends MC traffic to G; multicast data crosses the Inter-Site Network and leaves through L3Out-2 (BL21) in Site 2 toward the external receiver; (S,G) state along the path]
Connecting to the External Layer 3 Domain
Connecting to the External Layer 3 Domain
‘Traditional’ L3Outs on the BL Nodes

• Connecting to WAN Edge devices at Border Leaf nodes
• VRF-Lite hand-off for extending L3 multi-tenancy outside the ACI fabric
• Up to 400 L3Outs/VRFs currently supported on the same BL nodes pair
• Support for host routes advertisement out of the ACI Fabric from ACI release 4.0(1)

[Figure: Border Leafs connect through an L3Out to the WAN, where the Client resides]
Connecting to the External Layer 3 Domain
‘GOLF’ L3Outs

• Connecting to WAN Edge devices at Spine nodes (directly or indirectly)
 VXLAN data plane with MP-BGP EVPN control plane
• High scale tenant L3Out support
• Simplified and automated tenant L3Out configuration with OpFlex
• Support for host routes advertisement out of the ACI Fabric

[Figure: GOLF Routers (ASR 9000, ASR 1000, Nexus 7000) perform VXLAN encap/decap and offer different WAN hand-off options: VRF-Lite, MPLS-VPN, LISP*. Labels shown: Client, WAN, OTV/VPLS]
ACI Multi-Site and ‘GOLF’ L3Outs
Deployment Options
Distributed GOLF Routers
 Each ACI site utilises a separate pair of GOLF routers for communication with the WAN
 Local EVPN peering between spines and GOLF routers
 GOLF routers connect to the ISN or directly to the spines

Shared GOLF Routers (from ACI 3.1)
 Common pair of GOLF routers shared by all sites for communication with the WAN
 GOLF routers can be connected to the ISN or directly to the spines

[Figure: in both options the spines of each site establish MP-BGP EVPN sessions with the GOLF routers, which connect the sites to the WAN]
ACI Multi-Site and ‘GOLF’ L3Outs
Must Use Host-Route Advertisement for Stretched BDs with GOLF L3Outs
With host-route advertisement
10.1.0.10/32: G1, G2
10.1.0.20/32: G3, G4
 Host-route advertisement into the WAN for stretched BDs
 Ensures that ingress traffic is always delivered to the ‘right site’

Without host-route advertisement
10.1.0.0/24: G1-G4
 Without host-route advertisement traffic destined to a stretched IP subnet may enter the ‘wrong site’
 A site can’t be used as ‘transit’ for traffic destined to an endpoint part of a stretched BD and remotely located (traffic is dropped on the receiving spines)

[Figure: stretched subnet 10.1.0.0/24 with endpoints .10 and .20 in different sites; GOLF routers G1-G4 connect the sites to the WAN; example traffic destined to 10.1.0.20]
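The effect of host routes on ingress path selection is plain longest-prefix matching in the WAN, which the following sketch reproduces with the prefixes from the slide. The function name and the table layout are assumptions made for the example.

```python
import ipaddress


def wan_next_hops(routing_table, destination):
    """Longest-prefix match over a {prefix: [next hops]} table."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), hops)
        for prefix, hops in routing_table.items()
        if address in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    # The most specific prefix wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


WITH_HOST_ROUTES = {
    "10.1.0.10/32": ["G1", "G2"],
    "10.1.0.20/32": ["G3", "G4"],
}
WITHOUT_HOST_ROUTES = {
    "10.1.0.0/24": ["G1", "G2", "G3", "G4"],
}

if __name__ == "__main__":
    print(wan_next_hops(WITH_HOST_ROUTES, "10.1.0.20"))
    print(wan_next_hops(WITHOUT_HOST_ROUTES, "10.1.0.20"))
```

With only the /24 present, any of G1-G4 may be selected, so traffic for 10.1.0.20 can land in the site that does not host the endpoint.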
Multi-Site and L3Outs
Endpoints always Use Local L3Outs for Outbound Traffic

[Figure: Supported Design (✓): endpoints in Site 1 and Site 2 each reach the WAN through the L3Out of their own site. Not Supported Design (X): endpoints reach the WAN across the Inter-Site Network through the L3Out of the other site]

Note: the same consideration applies to both Border Leaf L3Outs and GOLF L3Outs
ACI Multi-Site and L3Outs
Support of Inter-Site L3Out (ACI 4.2(1) Release)

• Starting with ACI Release 4.2(1) it will be possible for endpoints in a site to send traffic to resources (WAN, Mainframes, etc.) accessible via L3Out of a remote site
• Traffic will be directly encapsulated to the TEP of the remote BL nodes
• The BL nodes will get assigned an address part of an additional (configurable) prefix that must be routable across the ISN
• Same solution will also support transit routing across sites (L3Out to L3Out)

[Figure: Site 1 and Site 2 connected through the Inter-Site Network, each with an L3Out toward the WAN]
Multi-Site and L3Out
Endpoints always Use Local L3Outs for Outbound Traffic

[Figure: Web-EPG and Ext-EPG with contract C1; endpoints 10.10.10.10 (Site 1) and 10.10.10.11 (Site 2); IP subnet 10.10.10.0/24 is advertised from the L3Out of both sites, each protected by an Active/Standby FW pair; traffic is dropped because of lack of state in the FW]
Multi-Site and L3Out
Use of Host-Routes Advertisement (ACI 4.0(1) Release)

• Ingress optimisation requires host-routes advertisement on the L3Out
 Native support on ACI Border Leaf nodes available from ACI release 4.0
 Supported also on GOLF L3Outs

[Figure: Web-EPG and Ext-EPG with contract C1; endpoints 10.10.10.10 (Site 1) and 10.10.10.11 (Site 2), each site with an Active/Standby FW pair and a local L3Out; host routes 10.10.10.10/32 and 10.10.10.11/32 are injected into the WAN*; host-route advertisement is enabled at the BD level]

* Alternative could be running an overlay solution (LISP, GRE, etc.)
Multi-Site and L3Out
Active/Standby FW Deployed across Sites (ACI 4.2(1) Release)

• Inbound and outbound flows are forced through the site with the active perimeter FW node
• Mandates inter-site L3Out support (ACI release 4.2)

[Figure: Web-EPG and Ext-EPG with contract C1; endpoints 10.10.10.10 (Site 1) and 10.10.10.11 (Site 2); Active FW node at the L3Out of Site 1 and Standby FW node at the L3Out of Site 2]
Multi-Site and Network Services
Integration
Multi-Site and Network Services
Integration Models

Deployment options fully supported with ACI Multi-Pod

Active/Standby pair
• Active and Standby pair deployed across Pods
• Currently supported only if the FW is in L2 mode or in L3 mode but acting as default gateway for the endpoints
• From ACI 4.2(1) will be also supported as perimeter FW

Active/Active Cluster
• Active/Active FW cluster nodes stretched across Sites (single logical FW)
• Requires the ability of discovering the same MAC/IP info in separate sites at the same time
• Not currently supported (scoped for a future ACI release)

Independent Active/Standby pairs (one per site)
• Recommended deployment model for ACI Multi-Site
• Option 1: supported from 3.0 for N-S if the FW is connected in L3 mode to the fabric, which mandates the deployment of traffic ingress optimization
• Option 2: supported from 3.2 release with the use of Service Graph with Policy Based Redirection (PBR)
Independent Active/Standby FW Pairs across Sites
Use of Service Graph and Policy Based Redirection (ACI 3.2(1) Release)

• SW and HW dependencies:
 Supported from ACI release 3.2(1)
 Mandates the use of EX/FX leaf nodes (both for compute and service leaf switches)

• The PBR policy applied on a leaf switch can only redirect traffic to a service
node deployed in the local site
 Requires the deployment of independent service node function in each site
 Various design options to increase resiliency for the service node function: per site
Active/Standby pair, per site Active/Active cluster, per site multiple independent Active
nodes
 Only a single service node function (FW) supported in the PBR policy with 3.2 release
 Two service node functions (FW + SLB) supported in the PBR policy with 4.0 release

Use of Service Graph and Policy Based Redirection
Resilient Service Node Deployment in Each Site

Active/Standby Cluster (Site1, L3 Mode)
• The Active/Standby pair represents a single MAC/IP entry in the PBR policy

Active/Active Cluster (Site1, L3 Mode)
• The Active/Active cluster represents a single MAC/IP entry in the PBR policy
• Spanned Ether-Channel Mode supported with Cisco ASA/FTD platforms

Independent Active Nodes (Site1, L3 Mode: Active Node 1, Active Node 2, Node 3)
• Each Active node represents a unique MAC/IP entry in the PBR policy
• Use of Symmetric PBR to ensure each flow is handled by the same Active node in both directions
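The idea behind symmetric PBR with independent active nodes can be illustrated with a direction-independent hash. This is a generic sketch, not the hashing used by the ACI hardware (which the slides do not describe); the function name, node names and the choice of SHA-256 are all assumptions.

```python
import hashlib


def pick_service_node(source_ip, destination_ip, nodes):
    """Select a service node so that both directions of a flow map to the same node."""
    # Sorting the two addresses makes the key identical for A->B and B->A.
    key = "|".join(sorted((source_ip, destination_ip)))
    digest = hashlib.sha256(key.encode("ascii")).digest()
    index = int.from_bytes(digest[:4], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    active_nodes = ["Active Node 1", "Active Node 2", "Active Node 3"]
    forward = pick_service_node("10.10.10.10", "192.0.2.50", active_nodes)
    reverse = pick_service_node("192.0.2.50", "10.10.10.10", active_nodes)
    print(forward, reverse, forward == reverse)
```

Keeping both legs of a flow on one node matters for stateful devices such as firewalls, which would otherwise see only half of the connection.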
Use of Service Graph and Policy Based Redirection
North-South and East-West Use Cases

[Figure: North-South topology in VRF1 (WAN Edge, L3Out, Ext-BD, Service-BD, Web-BD, Web-EPG) and East-West topology in VRF1 (Web-EPG, App-EPG, Web-BD, Service-BD, App-BD)]

• Best practice recommendations for both North-South and East-West use cases:
  ▪ Service Node deployed in ‘one arm’ mode (‘two-arms’ mode also supported but not preferred)
  ▪ Service-BD must be stretched across sites (BUM flooding can/should be disabled)
  ▪ Ext-EPG must also be a stretched object, mapped to the individual L3Outs defined in each site
  ▪ Web-BD and App-BD can be stretched across sites or locally defined in each site
• North-South use case
  ▪ Only intra-VRF is supported in current releases
• East-West use case
  ▪ Supported intra-VRF or inter-VRFs/Tenants
  ▪ Requires configuring the IP range for the endpoints under the Provider EPG

Use of Service Graph and Policy Based Redirection
North-South Communication – Inbound Traffic

[Figure: Site1 and Site2 connected through the Inter-Site Network. The Ext EPG and the Web EPG are tied by contract C; each can be Consumer or Provider. L3Out-Site1 and L3Out-Site2 are shown with the addresses 10.10.10.10 (Site1) and 10.10.10.11 (Site2), and an L3 Mode Active/Standby service node is deployed in each site. Call-out in both sites: the compute leaf always applies the PBR policy.]

• Inbound traffic can enter any site when destined to a stretched subnet (if ingress optimisation is not deployed or possible)
• PBR policy must always be applied on the compute leaf node where the destination endpoint is connected
  ▪ Requires the VRF to have the default policies for enforcement preference and direction
  ▪ Supported only intra-VRF in ACI release 3.2
  ▪ Ext-EPG and Web EPG can each be either provider or consumer of the contract
Use of Service Graph and Policy Based Redirection
North-South Communication – Outbound Traffic

[Figure: same topology as for inbound traffic. Site1 and Site2 are connected through the Inter-Site Network, the Ext EPG and the Web EPG are tied by contract C, L3Out-Site1 and L3Out-Site2 are shown with the addresses 10.10.10.10 and 10.10.10.11, and an L3 Mode Active/Standby service node is deployed in each site. Call-out in both sites: the compute leaf always applies the PBR policy.]

• PBR policy always applied on the same leaf where it was applied for inbound traffic
• Ensures the same service node is selected for both legs of the flow
• Different L3Outs can be used for inbound and outbound directions of the same flow
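The North-South behaviour described on the inbound and outbound slides can be summarised in a few lines. This is a simplified sketch with hypothetical names (`north_south_redirect` and the string labels are invented for illustration): the point it shows is that the leaf applying the PBR policy, and therefore the service node, depends only on the site of the internal endpoint and not on the L3Out used.

```python
def north_south_redirect(endpoint_site, l3out_site):
    """Describe where PBR is applied for a North-South flow.

    endpoint_site: site where the internal (e.g. Web) endpoint is connected.
    l3out_site:    site whose L3Out carries the packet to or from the WAN.
    """
    return {
        # The compute leaf of the internal endpoint always applies the policy.
        "pbr_leaf": f"compute-leaf@{endpoint_site}",
        # A leaf can only redirect to a service node in its own site.
        "service_node": f"service-node@{endpoint_site}",
        # The L3Out is recorded but does not influence the two fields above.
        "l3out": f"L3Out-{l3out_site}",
    }


inbound = north_south_redirect(endpoint_site="Site2", l3out_site="Site1")
outbound = north_south_redirect(endpoint_site="Site2", l3out_site="Site2")
# Both legs use the service node in Site2, even though the L3Outs differ.
```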

Use of Service Graph and Policy Based Redirection
East-West Communication (1)

[Figure: Site1 and Site2 connected through the Inter-Site Network. The Web EPG and the App EPG are tied by contract C, with Provider and Consumer roles, and an L3 Mode Active/Standby service node is deployed in each site. Call-out: the Provider leaf always applies the PBR policy.]

▪ EPGs can be locally defined or stretched across sites
▪ EPGs can be part of the same VRF or in different VRFs (and/or Tenants)
▪ PBR policy is always applied on the leaf switch where the Provider* endpoint is connected
  • The leaf applying the PBR policy always redirects traffic to a service node in its local site

*From the ACI 4.0(1) release; in earlier releases the policy was applied on the Consumer side
Use of Service Graph and Policy Based Redirection
East-West Communication (2)

[Figure: Site1 and Site2 connected through the Inter-Site Network. The Web EPG and the App EPG are tied by contract C, with Provider and Consumer roles, and an L3 Mode Active/Standby service node is deployed in each site. Call-outs: the Provider leaf always applies the PBR policy; the Consumer leaf does not apply the PBR policy.]

▪ The Consumer leaf must not apply the PBR policy, to ensure proper traffic stitching to the FW node that has built connection state
▪ Ensures both legs of the flow are handled by the same service node
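The East-West rule can be sketched as follows. The function names and the tuple encoding of the release are assumptions made for this example; the release boundary is the one given in the footnote of the previous slide (Provider side from ACI 4.0(1), Consumer side before that).

```python
def enforcing_role(release):
    """Return which side applies the PBR policy for a given release tuple."""
    # Assumption: releases are encoded as tuples, e.g. 4.0(1) -> (4, 0, 1).
    return "provider" if release >= (4, 0, 1) else "consumer"


def east_west_service_site(consumer_site, provider_site, release=(4, 0, 1)):
    """Site whose service node handles BOTH directions of the flow."""
    # Only one leaf applies the policy, and it redirects to a local node,
    # so the service node site does not depend on packet direction.
    if enforcing_role(release) == "provider":
        return provider_site
    return consumer_site


site = east_west_service_site(consumer_site="Site2", provider_site="Site1")
# With 4.0(1) and later the flow is stitched through the node in Site1.
```

Keeping the decision on a single leaf is what avoids the flow being redirected to two different firewalls, one per site.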

Multi-Site and Virtual Machine Manager (VMM) Integration
ACI Multi-Site and VMM Integration
Option 1 – Separate VMM per Site
[Figure: Site 1 (DC1) and Site 2 (DC2) connected through the ISN, each with its own VMM Domain. VMM 1 manages HV Cluster 1 (vSwitch1) in Site 1; VMM 2 manages HV Cluster 2 (vSwitch2) in Site 2.]

• Typical deployment model for an ACI Multi-Site
• Creation of separate VMM domains in each site, which are then exposed to the Multi-Site Orchestrator
ACI Multi-Site and VMM Integration
Option 2 – Single VMM Managing Host Clusters in Separate Sites
[Figure: Site 1 (DC1) and Site 2 (DC2) connected through the ISN, each with its own VMM Domain. A single VMM 1 manages both HV Cluster 1 (vSwitch1) in Site 1 and HV Cluster 2 (vSwitch2) in Site 2.]

• Even the deployment of a single VMM leads to the creation of separate VMM domains across sites

ACI Multi-Site and VMM Integration
Workload Migration across Sites
[Figure: Site 1 (DC1) and Site 2 (DC2) connected through the ISN, each with its own VMM Domain. vCenter Server 1 with SRM manages HV Cluster 1 (VDS1) and vCenter Server 2 with SRM manages HV Cluster 2 (VDS2); EPG1 is present on both. Live vMotion/Cold Migration is shown between the two clusters.]

• Live virtual machine migration across sites is supported only with vCenter deployments (for both the single and multiple vCenter options)
  ▪ Requires vSphere 6.0 or newer; no support for DRS, vSphere HA/FT
• Use of Site Recovery Manager (SRM) or a similar higher-level orchestrator for workload recovery across sites
Multi-Pod and Multi-Site Integration
ACI Multi-Pod and Multi-Site
Main Use Cases

For more information on how to set up ACI Multi-Pod + Multi-Site from scratch: BRKACI-2291

▪ Adding a Multi-Pod Fabric as a ‘Site’ on the Multi-Site Orchestrator (MSO)
▪ Converting a single Pod Fabric (already added to MSO) to a Multi-Pod fabric

ACI Multi-Pod and Multi-Site
Connectivity between Pods and Sites
[Figure: a single external network is used for both IPN and ISN. Site 1 is a Multi-Pod fabric (Pod ‘A’ and Pod ‘B’, 1st Gen spines, APIC Cluster) and reaches Site 2 through the shared IPN/ISN.]

▪ Only 2nd generation spines must be connected to the external network
  • Need to add 2nd gen spines in each Pod (at least two per Pod) and migrate connections to the IPN from 1st gen spines to 2nd gen spines
▪ Single ‘infra’ L3Out and set of uplinks to carry both Multi-Pod and Multi-Site East-West traffic

ACI Multi-Pod and Multi-Site
Connectivity between Pods and Sites

[Figure: separate networks are used for IPN and ISN. The Pods of Site 1 (Pod ‘A’ and Pod ‘B’, 1st Gen spines, APIC Cluster) are interconnected by the IPN, and Site 1 reaches Site 2 across the IP WAN.]

▪ Only 2nd generation spines must be connected to the external network
  • Need to add 2nd gen spines in each Pod (at least two per Pod) and migrate connections to the IPN from 1st gen spines to 2nd gen spines
▪ Single ‘infra’ L3Out and set of uplinks to carry both Multi-Pod and Multi-Site East-West traffic

Connectivity between Pods and Sites
Not Supported Topology

[Figure: not supported topology, with separate uplinks between the spines and the external networks (IPN and IP WAN). Site 1 is a Multi-Pod fabric (Pod ‘A’ and Pod ‘B’, 1st Gen spines, APIC Cluster) and Site 2 is reached across the IP WAN.]

▪ Only 2nd generation spines must be connected to the external network
  • Need to add 2nd gen spines in each Pod (at least two per Pod) and migrate connections to the IPN from 1st gen spines to 2nd gen spines
▪ Single ‘infra’ L3Out and set of uplinks to carry both Multi-Pod and Multi-Site East-West traffic

ACI Multi-Pod and Multi-Site
BGP Spine Roles

▪ Spines in each Multi-Pod fabric can have one of these two roles:
  1. BGP Speakers: establish EVPN peerings with BGP speakers in remote sites and with BGP Forwarders in the local site (intra- and inter-Pod)
     • Recommended to deploy two speakers per Multi-Pod fabric (in separate Pods)
     • Explicitly configured on MSO
     • A Multi-Site Speaker must be a Multi-Pod spine as well
  2. BGP Forwarders: establish BGP EVPN peerings with BGP speakers in the local site
     • All the spines that are not speakers implicitly become forwarders

[Figure: Site 1 is a Multi-Pod fabric (Pods including Pod ‘B’, 1st Gen spines, APIC Cluster) connected to the IPN/ISN; its spines are labelled either BGP Speakers or BGP Forwarders.]
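The two roles determine which EVPN sessions exist. The sketch below derives them from a role assignment; the `evpn_sessions` function, the dictionary layout and the spine names are hypothetical, and it assumes a full mesh between the speakers of different sites. Sessions between speakers of the same site are not covered by the slide and are left out.

```python
from itertools import product


def evpn_sessions(sites):
    """Derive EVPN peerings from spine roles.

    sites: {site_name: {"speakers": [...], "forwarders": [...]}}
    Returns (inter_site, intra_site) as sets of (spine_a, spine_b) pairs.
    """
    inter_site, intra_site = set(), set()
    names = sorted(sites)
    # Speaker-to-speaker sessions between every pair of distinct sites.
    for i, site_a in enumerate(names):
        for site_b in names[i + 1:]:
            pairs = product(sites[site_a]["speakers"], sites[site_b]["speakers"])
            inter_site.update(pairs)
    # Speaker-to-forwarder sessions inside each site (across all its Pods).
    for roles in sites.values():
        intra_site.update(product(roles["speakers"], roles["forwarders"]))
    return inter_site, intra_site


topology = {
    "Site1": {
        "speakers": ["s1-pod1-spine1", "s1-pod2-spine1"],
        "forwarders": ["s1-pod1-spine2", "s1-pod2-spine2",
                       "s1-pod3-spine1", "s1-pod3-spine2"],
    },
    "Site2": {"speakers": ["s2-spine1", "s2-spine2"], "forwarders": []},
}
inter, intra = evpn_sessions(topology)
# 2 x 2 = 4 inter-site sessions, 2 x 4 = 8 intra-site sessions in Site1.
```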

ACI Multi-Pod and Multi-Site
Inter-Site and Intra-Site EVPN Sessions
Legend:
• Inter-Site MP-BGP EVPN Peering (Speaker-to-Speaker)
• Intra-Site MP-BGP EVPN Peering (Speaker-to-Forwarders)

[Figure: Site 2 with two BGP Speakers, connected across the IP WAN to Site 1. Site 1 is a Multi-Pod fabric whose Pods are interconnected by the IPN and which hosts two BGP Speakers and four Forwarders.]
ACI Multi-Pod and Multi-Site
Inter-Site L2/L3 Unicast Traffic

[Figure: Site 2 (EP1, O-UTEP-S2) connected across the IP WAN to Site 1, a Multi-Pod fabric whose Pods are interconnected by the IPN: Pod1 (EP2, O-UTEP-S1P1), Pod2 (EP4, O-UTEP-S1P2) and Pod3 (EP3, O-UTEP-S1P3).]

Site 2 Spine Table
  EP1   Leaf 1
  EP2   O-UTEP-S1P1
  EP3   O-UTEP-S1P3
  EP4   O-UTEP-S1P2

Site 1-Pod1 Spine Table
  EP2   Leaf 1
  EP1   O-UTEP-S2
  EP3   O-UTEP-S1P3
  EP4   O-UTEP-S1P2

Site 1-Pod2 Spine Table
  EP4   Leaf 4
  EP1   O-UTEP-S2
  EP2   O-UTEP-S1P1
  EP3   O-UTEP-S1P3

Site 1-Pod3 Spine Table
  EP3   Leaf 4
  EP1   O-UTEP-S2
  EP2   O-UTEP-S1P1
  EP4   O-UTEP-S1P2
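The spine tables on this slide can be read as a two-step lookup: the source spine resolves a remote endpoint to the O-UTEP of the Pod or site where it lives, and the spine owning that O-UTEP resolves it to a local leaf. The sketch below encodes the four tables from the slide; the dictionary layout, the `O_UTEP_OWNER` mapping and the `resolve` function are hypothetical helpers added for illustration.

```python
# Spine tables as shown on the slide: endpoint -> local leaf or remote O-UTEP.
SPINE_TABLES = {
    "Site2": {"EP1": "Leaf 1", "EP2": "O-UTEP-S1P1",
              "EP3": "O-UTEP-S1P3", "EP4": "O-UTEP-S1P2"},
    "Site1-Pod1": {"EP2": "Leaf 1", "EP1": "O-UTEP-S2",
                   "EP3": "O-UTEP-S1P3", "EP4": "O-UTEP-S1P2"},
    "Site1-Pod2": {"EP4": "Leaf 4", "EP1": "O-UTEP-S2",
                   "EP2": "O-UTEP-S1P1", "EP3": "O-UTEP-S1P3"},
    "Site1-Pod3": {"EP3": "Leaf 4", "EP1": "O-UTEP-S2",
                   "EP2": "O-UTEP-S1P1", "EP4": "O-UTEP-S1P2"},
}

# Which spines own each overlay unicast TEP address.
O_UTEP_OWNER = {
    "O-UTEP-S2": "Site2",
    "O-UTEP-S1P1": "Site1-Pod1",
    "O-UTEP-S1P2": "Site1-Pod2",
    "O-UTEP-S1P3": "Site1-Pod3",
}


def resolve(start, endpoint):
    """Follow the spine tables from `start` until a local leaf is found."""
    hops = []
    location = start
    # Bounded loop: a correct set of tables needs at most two lookups.
    for _ in range(len(SPINE_TABLES)):
        next_hop = SPINE_TABLES[location][endpoint]
        hops.append((location, next_hop))
        if next_hop not in O_UTEP_OWNER:
            return hops  # reached a local leaf
        location = O_UTEP_OWNER[next_hop]
    raise RuntimeError(f"lookup loop while resolving {endpoint}")


path = resolve("Site2", "EP4")
# path == [("Site2", "O-UTEP-S1P2"), ("Site1-Pod2", "Leaf 4")]
```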
ACI Multi-Pod and Multi-Site
Inter-Site L2 BUM Traffic
Legend:
• BUM sent via Ingress Replication
• BUM sent via PIM-Bidir

[Figure: a BUM frame is generated by an endpoint in Site 1, a Multi-Pod fabric (EP2, EP4, EP3; one DF spine shown per Pod) whose Pods are interconnected by the IPN. Site 2 (EP1, O-MTEP-S2) is reached across the IP WAN.]

Call-outs:
• BUM originated in the local Pod, so use Ingress Replication toward the Remote Site
• Use PIM-Bidir for replication across the IPN
• Spines in the other Pods: only forwarding the BUM frame into the local Pod
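The call-outs on this slide amount to a per-spine decision based on where the BUM frame originated. The following is a rough sketch of that decision only; the `bum_actions` function, the action labels and the `O-MTEP-<site>` naming pattern are assumptions made for the example (the slide only shows O-MTEP-S2).

```python
def bum_actions(spine_pod, origin_pod, remote_sites):
    """Replication actions taken by the forwarding spine in `spine_pod`.

    origin_pod:   Pod of the same Multi-Pod fabric where the frame originated.
    remote_sites: site identifiers (e.g. "S2") where the BD is stretched.
    """
    if spine_pod == origin_pod:
        # Frame originated locally: reach the other Pods through the IPN
        # with PIM-Bidir, and the remote sites with Ingress Replication.
        actions = [("pim-bidir", "IPN")]
        actions += [("ingress-replication", f"O-MTEP-{s}") for s in remote_sites]
        return actions
    # Frame arrived from another Pod: only forward it into the local Pod.
    return [("forward-local", spine_pod)]


origin = bum_actions("Pod1", origin_pod="Pod1", remote_sites=["S2"])
other = bum_actions("Pod2", origin_pod="Pod1", remote_sites=["S2"])
```

Restricting Ingress Replication to the Pod of origin avoids the remote site receiving one copy of the frame per Pod.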
ACI Multi-Pod and Multi-Site
Inter-Site L2 BUM Traffic
Legend:
• BUM sent via Ingress Replication
• BUM sent via PIM-Bidir

[Figure: EP1 in Site 2 generates an L2 BUM frame, which the Site 2 DF spine sends across the IP WAN. Site 1 (O-MTEP-S1) is a Multi-Pod fabric (EP2, EP4, EP3) whose Pods are interconnected by the IPN.]

Call-outs:
• EP1 generates an L2 BUM frame
• Ingress Replicated to the O-MTEP address of Site 1
• Use PIM-Bidir for replication (shown twice)
• L2 BUM frame cannot be re-injected into the IPN
ACI Multi-Pod and Multi-Site
TEP Pools Deployment and Advertisement
[Figure: Site 2 (TEP Pool Site 2: 10.1.0.0/16) connected across the IP WAN to Site 1, a Multi-Pod fabric whose Pods are interconnected by the IPN (TEP Pool Pod1: 10.1.0.0/16, TEP Pool Pod2: 10.2.0.0/16, TEP Pool Pod3: 10.3.0.0/16).]

Call-outs:
• All Pods TEP Pools advertised into the IPN
• Filter out the advertisement of the TEP Pools into the backbone (shown for both Site 1 and Site 2)
• TEP Pool advertisement not needed for inter-Sites communication
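The filtering called out on this slide can be illustrated with Python's standard `ipaddress` module. This is a sketch of the logic, not router configuration: the `filter_backbone_advertisements` function is a hypothetical helper, and the 172.16.1.0/24 prefix is an invented example of a route that is not part of any TEP Pool.

```python
import ipaddress


def filter_backbone_advertisements(prefixes, tep_pools):
    """Return the prefixes that may be advertised into the backbone.

    Any prefix that falls inside one of the TEP Pools is dropped, because
    TEP Pool reachability is not needed for inter-site communication.
    """
    pools = [ipaddress.ip_network(pool) for pool in tep_pools]
    kept = []
    for prefix in prefixes:
        network = ipaddress.ip_network(prefix)
        if any(network.subnet_of(pool) for pool in pools):
            continue  # inside a TEP Pool: do not leak into the backbone
        kept.append(prefix)
    return kept


site1_tep_pools = ["10.1.0.0/16", "10.2.0.0/16", "10.3.0.0/16"]
candidates = ["10.1.0.0/16", "10.2.0.0/16", "10.3.0.0/16", "172.16.1.0/24"]
advertised = filter_backbone_advertisements(candidates, site1_tep_pools)
# advertised == ["172.16.1.0/24"]
```

Keeping the TEP Pools out of the backbone is also what allows the same range (10.1.0.0/16 on the slide) to appear in two different sites.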
Conclusions
Multi-Pod and Multi-Site
Complementary Architectures

[Figure: Multi-Pod Fabric ‘A’ (AZ 1) with Pod ‘1.A’ and Pod ‘2.A’, and Multi-Pod Fabric ‘B’ (AZ 2) with Pod ‘1.B’ and Pod ‘2.B’, interconnected through ACI Multi-Site. Each Multi-Pod fabric is labelled ‘Classic’ Active/Active. Call-out: application workloads deployed across availability zones.]
Where to Go for More Information
▪ ACI Multi-Pod White Paper
  http://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-737855.html?cachemode=refresh

▪ ACI Multi-Pod Configuration Paper
  https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739714.html

▪ ACI Multi-Pod and Service Node Integration White Paper
  https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739571.html

▪ BRKACI-2003 @ Cisco Live Barcelona 2019
  https://ciscolive.cisco.com/on-demand-library/?search=BRKACI-2003#/session/1532112828758001tmf6

▪ ACI Multi-Site White Paper
  https://www.cisco.com/c/en/us/solutions/collateral/data-center-virtualization/application-centric-infrastructure/white-paper-c11-739609.html

▪ Deploying ACI Multi-Site from Scratch
  https://www.youtube.com/watch?v=HJJ8lznodN0

Thank you
