
IXP Training Workshops

Contact: training@apnic.net

WROU03_v1.0

Introduction to the Internet


Topologies and Definitions
IP Addressing
Internet Hierarchy
Gluing it all together

Topologies and Definitions
What does all the jargon mean?

Some Icons
Router (layer 3, IP datagram forwarding)
Ethernet switch (layer 2, frame forwarding)

Network Cloud

Routed Backbone

ISPs build networks covering regions
  Regions can cover a country, sub-continent, or even the globe
  Each region has points of presence built by the ISP
Routers are the infrastructure
Physical circuits run between routers
Easy routing configuration, operation and troubleshooting
The dominant topology used in the Internet today

MPLS Backbones

Some ISPs & Telcos use Multi Protocol Label Switching (MPLS)
MPLS is built on top of router infrastructure
  Used to replace old ATM technology
  Tunnelling technology
Main purpose is to provide VPN services
  Although these can be done just as easily with other tunnelling technologies such as GRE

Points of Presence

PoP: Point of Presence
  Physical location of ISP's equipment
  Sometimes called a "node"
vPoP: virtual PoP
  To the end user, it looks like an ISP location
  In reality a back-hauled access point
  Used mainly for consumer access networks
Hub/SuperPoP: large central PoP
  Links to many PoPs

PoP Topologies

Core routers
  high speed trunk connections
Distribution routers
  higher port density, aggregating network edge to the network core
Access routers
  high port density, connecting the end users to the network
Border routers
  connections to other providers
Service routers
  hosting and servers
Some functions might be handled by a single router

Typical PoP Design


[Figure: typical PoP design, showing Border routers (to other ISPs), the Network Core (backbone links to other PoPs), Service routers (ISP services: DNS, Mail, News, FTP, WWW; hosted services), the Network Operation Centre, and Access routers (business customer aggregation, consumer aggregation)]

More Definitions

Transit
  Carrying traffic across a network
  Usually for a fee
Peering
  Exchanging routing information and traffic
  Usually for no fee
  Sometimes called settlement free peering
Default
  Where to send traffic when there is no explicit match in the routing table
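The "Default" definition can be illustrated with a small longest-prefix-match lookup. This is only a sketch: the prefixes and next-hop names are hypothetical, and a real router uses a far more efficient data structure than a linear scan.

```python
import ipaddress

# Hypothetical routing table: prefix -> next hop (all names are illustrative).
ROUTES = {
    ipaddress.ip_network("158.43.0.0/16"): "peer-at-ixp",
    ipaddress.ip_network("158.43.128.0/17"): "customer-link",
    ipaddress.ip_network("0.0.0.0/0"): "transit-upstream",  # the default route
}


def lookup(destination: str) -> str:
    """Return the next hop of the longest prefix that matches destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in ROUTES if addr in net]
    # 0.0.0.0/0 matches every IPv4 address, so matches is never empty here.
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]
```

With this table, `lookup("158.43.200.1")` follows the more specific /17, while an address with no explicit match, such as `lookup("8.8.8.8")`, falls through to the default route.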

Peering and Transit example


provider A

IXP-West

Backbone
Provider D

IXP-East

provider B
provider C

A and B peer for free, but need transit arrangements with D to get packets to/from C

Private Interconnect
Autonomous System 334

ISP B
border

border

ISP A

Autonomous System 99


Public Interconnect
A location or facility where several ISPs are present and connect to each other over a common shared media
Why?
  To save money, reduce latency, improve performance
IXP: Internet eXchange Point
NAP: Network Access Point

Public Interconnect
Centralised (in one facility)
Distributed (connected via WAN links)
Switched interconnect
  Ethernet (Layer 2)
  Technologies such as SRP, FDDI, ATM, Frame Relay, SMDS and even routers have been used in the past
Each provider establishes a peering relationship with other providers at the IXP
  ISP border router peers with all other provider border routers

Public Interconnect
ISP 1

ISP 2

ISP 3

ISP 4

IXP

ISP 5

ISP 6

Each of these represents a border router in a different autonomous system



ISPs participating in Internet

Bringing all pieces together, ISPs:
  Build multiple PoPs in a distributed network
  Build redundant backbones
  Have redundant external connectivity
  Obtain transit from upstream providers
  Get free peering from local providers at IXPs

Example ISP Backbone Design
[Figure: example ISP backbone with PoP 1 to PoP 4 joined by backbone links through the network core, connections to Upstream 1 and Upstream 2, and ISP peers reached through an IXP]

IP Addressing
Where to get address space and who from

IP Addressing
Internet uses classless routing
  Concept of IPv4 class A, class B or class C is no more
  Engineers talk in terms of prefix length, for example the class B 158.43 is now called 158.43/16
All routers must be CIDR capable
  Classless InterDomain Routing
  RFC1812 Router Requirements
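Prefix-length notation can be explored with Python's standard `ipaddress` module. The sketch below uses the slide's own 158.43/16 example; the split into two /17 blocks is just an illustration that a classless prefix can be divided at any bit boundary.

```python
import ipaddress

# The former "class B" 158.43 written as a classless prefix.
net = ipaddress.ip_network("158.43.0.0/16")

prefix_length = net.prefixlen       # number of leading network bits
netmask = str(net.netmask)          # the same information as a dotted mask
address_count = net.num_addresses   # 2 ** (32 - prefix_length)

# Classless routing allows the block to be split at any bit boundary,
# for example into two /17 halves.
halves = [str(n) for n in net.subnets(prefixlen_diff=1)]
```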

IP Addressing

Pre-CIDR (before 1994)
  Big networks got a class A
  Medium networks got a class B
  Small networks got a class C
The CIDR IPv4 years (1994 to 2010)
  Sizes of IPv4 allocations/assignments made according to demonstrated need (CLASSLESS)
IPv6 adoption (from 2011)
  The size of IPv4 address allocations and assignments is now very limited as IANA's free pool has run out

IP Addressing

IP Address space is a resource shared amongst all Internet users
  Regional Internet Registries delegated allocation responsibility by the IANA
  AfriNIC, APNIC, ARIN, LACNIC & RIPE NCC are the five RIRs
  RIRs allocate address space to ISPs and Local Internet Registries
  ISPs/LIRs assign address space to end customers or other ISPs
All usable IPv4 address space has been allocated to the RIRs by the IANA (February 2011)
  The time for IPv6 is now

Non-portable Address Space

Provider Aggregatable or PA Space
  Customer uses RIR member's address space while connected to Internet
  Customer has to renumber to change ISP
  Aids control of size of Internet routing table
  Need to fragment provider block when multihoming
PA space is allocated to the RIR member
  All assignments made by the RIR member to end sites are announced as an aggregate to the rest of the Internet
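The aggregation property of PA space can be checked mechanically. In this sketch every prefix is hypothetical (chosen only for illustration); it shows that assignments carved from the provider block are covered by one announced aggregate, and that the same addresses do not fit inside another provider's block, which is why the customer has to renumber.

```python
import ipaddress

# Hypothetical PA block held by an RIR member (the ISP).
provider_block = ipaddress.ip_network("158.43.0.0/16")

# Hypothetical assignments the ISP has made to end sites.
assignments = [
    ipaddress.ip_network("158.43.0.0/24"),
    ipaddress.ip_network("158.43.1.0/24"),
    ipaddress.ip_network("158.43.64.0/22"),
]

# Every assignment sits inside the provider block, so only the
# aggregate needs to be announced to the rest of the Internet.
all_covered = all(a.subnet_of(provider_block) for a in assignments)
announced = [provider_block] if all_covered else assignments

# A customer moving to an ISP with a different (hypothetical) block cannot
# take its PA addresses along: they are not part of the new aggregate.
new_provider_block = ipaddress.ip_network("144.254.0.0/16")
must_renumber = not assignments[0].subnet_of(new_provider_block)
```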

Portable Address Space

Provider Independent or PI Space

  Customer gets or has address space independent of ISP
  Customer keeps addresses when changing ISP
  Is very bad for size of Internet routing table
  Is very bad for scalability of the routing system
  PI space is rarely distributed by the RIRs

Internet Hierarchy
The pecking order


High Level View of the Global Internet
[Figure: Global Providers, Regional Providers 1 and 2, Content Providers 1 and 2, Access Providers 1 and 2, an Internet Exchange Point, and Customer Networks]

Detailed View of the Global Internet

Global Transit Providers
  Connect to each other
  Provide connectivity to Regional Transit Providers
Regional Transit Providers
  Connect to each other
  Provide connectivity to Content Providers
  Provide connectivity to Access Providers
Access Providers
  Connect to each other across IXPs (free peering)
  Provide access to the end user

Categorising ISPs
[Figure: hierarchy of Tier 1, Tier 2 and Tier 3 ISPs, with IXPs and links marked "$"]

Inter-provider relationships

Peering between equivalent sizes of service providers (e.g. Tier 2 to Tier 2)
  Shared cost private interconnection, equal traffic flows
  No cost peering
Peering across exchange points
  If convenient, of mutual benefit, technically feasible
Fee based peering
  Unequal traffic flows, market position

Default Free Zone


The default free zone is made up of Internet routers which have explicit routing information about the rest of the Internet, and therefore do not need to use a default route
NB: is not related to where an ISP is in the hierarchy

Gluing it together


Gluing it together

Who runs the Internet?
  No one
  (Definitely not ICANN, nor the RIRs, nor the US, ...)
How does it keep working?
  Inter-provider business relationships and the need for customer reachability ensure that the Internet by and large functions for the common good
Any facilities to help keep it working?
  Not really. But...
  Engineers keep working together!

Engineers keep talking to each other...

North America
  NANOG (North American Network Operators Group)
  NANOG meetings and mailing list
  www.nanog.org
Latin America
  Foro de Redes
  NAPLA
  LACNOG supported by LACNIC
Middle East
  MENOG (Middle East Network Operators Group)
  www.menog.net

Engineers keep talking to each other...

Asia & Pacific
  APRICOT annual conference
    www.apricot.net
  APOPS & APNIC-TALK mailing lists
    mailman.apnic.net/mailman/listinfo/apops
    mailman.apnic.net/mailman/listinfo/apnic-talk
  PacNOG (Pacific NOG)
    mailman.apnic.net/mailman/listinfo/pacnog
  SANOG (South Asia NOG)
    E-mail to sanog-request@sanog.org

Engineers keep talking to each other...

Europe
  RIPE meetings, working groups and mailing lists
  e.g. Routing WG: www.ripe.net/mailman/listinfo/routing-wg
Africa
  AfNOG meetings and mailing list
And many in-country ISP associations and NOGs
IETF meetings and mailing lists
  www.ietf.org

Summary
Topologies and Definitions
IP Addressing
  PA versus PI address space
Internet Hierarchy
  Local, Regional, Global Transit Providers
  IXPs
Gluing it all together
  Engineers cooperate, common business interests


The Value of Peering
ISP Training Workshops

The Internet

Internet is made up of ISPs of all shapes and sizes
  Some have local coverage (access providers)
  Others can provide regional or per country coverage
  And others are global in scale
These ISPs interconnect their businesses
  They don't interconnect with every other ISP (over 41000 distinct autonomous networks): that won't scale
  They interconnect according to practical and business needs
Some ISPs provide transit to others
  They interconnect other ISP networks

Categorising ISPs
[Figure: hierarchy of Global, Regional and Access ISPs, with IXPs and links marked "$"]

Peering and Transit

Transit
  Carrying traffic across a network
  Usually for a fee
  Example: Access provider connects to a regional provider
Peering
  Exchanging routing information and traffic
  Usually for no fee
  Sometimes called settlement free peering
  Example: Regional provider connects to another regional provider

Private Interconnect

Two ISPs connect their networks over a private link
  Can be peering arrangement
    No charge for traffic
    Share cost of the link
  Can be transit arrangement
    One ISP charges the other for traffic
    One ISP (the customer) pays for the link
[Figure: ISP 1 linked directly to ISP 2]

Public Interconnect

Several ISPs meet in a common neutral location and interconnect their networks
  Usually is a peering arrangement between their networks
[Figure: ISP 1 to ISP 6, each connected to a central IXP]

ISP Goals

Minimise the cost of operating the business
Transit
  ISP has to pay for circuit (international or domestic)
  ISP has to pay for data (usually per Mbps)
  Repeat for each transit provider
  Significant cost of being a service provider
Peering
  ISP shares circuit cost with peer (private) or runs circuit to public peering point (one off cost)
  No need to pay for data
  Reduces transit data volume, therefore reducing cost

Transit: How it works

Small access provider provides Internet access for a city's population
  Mixture of dial up, wireless and fixed broadband
  Possibly some business customers
  Possibly also some Internet cafes
How do their customers get access to the rest of the Internet?
ISP buys access from one, two or more larger ISPs who already have visibility of the rest of the Internet
  This is transit: they pay for the physical connection to the upstream and for the traffic volume on the link

Peering: How it works

If two ISPs are of equivalent sizes, they have:
  Equivalent network infrastructure coverage
  Equivalent customer size
  Similar content volumes to be shared with the Internet
  Potentially similar traffic flows to each other's networks
This makes them good peering partners
If they don't peer
  They both have to pay an upstream provider for access to each other's network/customers/content
  Upstream benefits from this arrangement, the two ISPs both have to fund the transit costs

The IXP's role

Private peering makes sense when there are very few equivalent players
  Connecting to one other ISP costs X
  Connecting to two other ISPs costs 2 times X
  Connecting to three other ISPs costs 3 times X
  Etc (where X is half the circuit cost plus a port cost)
The more private peers, the greater the cost
IXP is a more scalable solution to this problem

The IXP's role

Connecting to an IXP
  ISP costs: one router port, one circuit, and one router to locate at the IXP
Some IXPs charge annual maintenance fees
  The maintenance fee has potential to significantly influence the cost balance for an ISP
Generally connecting to an IXP and peering there becomes cost effective when there are at least three other peers
  The real $ amount varies from region to region, IXP to IXP
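The private-versus-IXP cost comparison above can be written down as a small model. This is a sketch only: the function names and all dollar amounts are hypothetical, chosen so that the break-even lands at three peers as the slide suggests; real costs vary by region and IXP.

```python
def private_peering_cost(peers: int, circuit_cost: float, port_cost: float) -> float:
    """Cost of peering privately with `peers` ISPs: peers times X."""
    x = circuit_cost / 2 + port_cost   # X = half the circuit cost plus a port cost
    return peers * x


def ixp_cost(circuit_cost: float, port_cost: float,
             router_cost: float, maintenance_fee: float) -> float:
    """Cost of one IXP connection: one circuit, one port, one router, plus any fee."""
    return circuit_cost + port_cost + router_cost + maintenance_fee


def break_even_peers(circuit_cost: float, port_cost: float,
                     router_cost: float, maintenance_fee: float) -> int:
    """Smallest number of peers at which private peering costs at least as much
    as the IXP connection (assumes all costs are positive)."""
    target = ixp_cost(circuit_cost, port_cost, router_cost, maintenance_fee)
    peers = 1
    while private_peering_cost(peers, circuit_cost, port_cost) < target:
        peers += 1
    return peers


# Hypothetical annual figures, for illustration only.
example = break_even_peers(circuit_cost=5000, port_cost=500,
                           router_cost=1000, maintenance_fee=1000)
```

Raising `maintenance_fee` in the call pushes the break-even point to more peers, which is the "cost balance" effect the slide mentions.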

Who peers at an IXP?

Access Providers
  Don't have to pay their regional provider transit fees for local traffic
  Keeps latency for local traffic low
  "Unlimited" bandwidth through the IXP (compared with costly and limited bandwidth through transit provider)
Regional Providers
  Don't have to pay their global provider transit for local and regional traffic
  Keeps latency for local and regional traffic low
  "Unlimited" bandwidth through the IXP (compared with costly and limited bandwidth through global provider)

The IXP's role

Global Providers can be located close to IXPs
  Attracted by the potential transit business available
Advantageous for access & regional providers
  They can peer with other similar providers at the IXP
  And in the same facility pay for transit to their regional or global provider
  (Not across the IXP fabric, but a separate connection)
[Figure labels: IXP, Transit, Access]

Connectivity Decisions

Transit
  Almost every ISP needs transit to reach rest of Internet
  One provider = no redundancy
  Two providers: ideal for traffic engineering as well as redundancy
  Three providers = better redundancy, traffic engineering gets harder
  More than three = diminishing returns, rapidly escalating costs and complexity
Peering
  Means low (or zero) cost access to another network
  Private or Public Peering (or both)

Transit Goals
1. Minimise number of transit providers
   But maintain redundancy
   2 is ideal, 4 or more is bad
2. Aggregate capacity to transit providers
   More aggregated capacity means better value
   Lower cost per Mbps
   4x 45Mbps circuits to 4 different ISPs will almost always cost more than 2x 155Mbps circuits to 2 different ISPs
   Yet bandwidth of latter (310Mbps) is greater than that of former (180Mbps) and is much easier to operate
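The capacity comparison on this slide is simple arithmetic, sketched below. The "capacity after one failure" helper is an added illustration (not from the slide) of how much bandwidth remains when a single provider is lost; no prices are modelled, because the slide gives none.

```python
def total_capacity(circuits: int, mbps_each: int) -> int:
    """Aggregate bandwidth across all transit circuits, in Mbps."""
    return circuits * mbps_each


def capacity_after_one_failure(circuits: int, mbps_each: int) -> int:
    """Bandwidth left when one circuit (one provider) is lost, in Mbps."""
    return (circuits - 1) * mbps_each


four_small = total_capacity(4, 45)     # 4x 45Mbps to 4 different ISPs
two_large = total_capacity(2, 155)     # 2x 155Mbps to 2 different ISPs

four_small_degraded = capacity_after_one_failure(4, 45)
two_large_degraded = capacity_after_one_failure(2, 155)
```

Under these figures the two-provider design has more capacity both in normal operation and with one provider down.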

Peering or Transit?
How to choose?
Or do both?
It comes down to cost of going to an IXP
  Free peering
  Paying for transit from an ISP co-located in same facility, or perhaps close by
Or not going to an IXP and paying for the cost of transit directly to an upstream provider
  There is no right or wrong answer, someone has to do the arithmetic

Private or Public Peering

Private peering
  Scaling issue, with costs, number of providers, and infrastructure provisioning
Public peering
  Makes sense the more potential peers there are (more is usually greater than two)
Which public peering point?
  Local Internet Exchange Point: great for local traffic and local peers
  Regional Internet Exchange Point: great for meeting peers outside the locality, might be cheaper than paying transit to reach the same consumer base

Local Internet Exchange Point

Defined as a public peering point serving the local Internet industry
Local means where it becomes cheaper to interconnect with other ISPs at a common location than it is to pay transit to another ISP to reach the same consumer base
  Local can mean different things in different regions!

Regional Internet Exchange Point

These are also local Internet Exchange Points
But also attract regional ISPs and ISPs from outside the locality
  Regional ISPs peer with each other
  And show up at several of these Regional IXPs
Local ISPs peer with ISPs from outside the locality
  They don't compete in each other's markets
  Local ISPs don't have to pay transit costs
  ISPs from outside the locality don't have to pay transit costs
  Quite often ISPs of disparate sizes and influences will happily peer to defray transit costs

Which IXP?

How many routes are available?
  What is traffic to & from these destinations, and by how much will it reduce cost of transit?
What is the cost of co-lo space?
  If prohibitive or space not available, pointless choosing this IXP
What is the cost of running a circuit to the location?
  If prohibitive or competitive with transit costs, pointless choosing this IXP
What is the cost of remote hands/assistance?
  If no remote hands, doing maintenance is challenging and potentially costly with a serious outage

Example: South Asian ISP @ LINX

Date: October 2011
Facts:
  Route Server plus bilateral peering offers 81k prefixes
  IXP traffic averages 55Mbps/15Mbps
  Transit traffic averages 35Mbps/3Mbps
Analysis:
  61% of inbound traffic comes from 81k prefixes available by peering
  39% of inbound traffic comes from remaining 287k prefixes from transit provider

Example: South Asian ISP @ HKIX

Date: October 2011
Facts:
  Route Server plus bilateral peering offers 34k prefixes
  IXP traffic is 130Mbps/30Mbps
  Transit traffic is 125Mbps/40Mbps
Analysis:
  51% of inbound traffic comes from 42k prefixes available by peering
  49% of inbound traffic comes from remaining 326k prefixes from transit provider
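The inbound percentages in the two examples above follow directly from the traffic averages. A minimal sketch, assuming the first figure of each "a/b" pair is inbound traffic (which is consistent with the quoted 61% and 51%); the function name is illustrative:

```python
def peering_share_percent(ixp_mbps: float, transit_mbps: float) -> int:
    """Share of traffic arriving via the IXP, as a whole-number percentage."""
    return round(100 * ixp_mbps / (ixp_mbps + transit_mbps))


linx_inbound = peering_share_percent(55, 35)     # LINX: 55Mbps IXP, 35Mbps transit
hkix_inbound = peering_share_percent(130, 125)   # HKIX: 130Mbps IXP, 125Mbps transit
```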

Example: South Asian ISP

Summary:
  Traffic by Peering: 185Mbps/45Mbps
  Traffic by Transit: 160Mbps/43Mbps
  54% of incoming traffic is by peering
  52% of outbound traffic is by peering

Example: South Asian ISP

Router at remote co-lo
  Benefits: can select peers, easy to swap transit providers
  Costs: co-lo space and remote hands
Servers at remote co-lo
  Benefits: mail filtering, content caching, etc
  Costs: co-lo space and remote hands
Overall advantage:
  Can control what goes on the expensive connectivity "back to home"

Value propositions

Peering at a local IXP
  Reduces latency & transit costs for local traffic
  Improves Internet quality perception
Participating at a Regional IXP
  A means of offsetting transit costs
  Managing connection back to home network
  Improving Internet Quality perception for customers

Summary

Benefits of peering
  Private
  Internet Exchange Points
Local versus Regional IXPs
  Local services local traffic
  Regional helps defray transit costs

Worked Example

Single International Transit


Versus
Local IXP + Regional IXP + Transit


Worked Example

ISP A is a local access provider
  Some business customers (around 200 fixed links)
  Some co-located content provision (datacentre with 100 servers)
  Some consumers on broadband (5000 DSL/Cable/Wireless)
  Some consumers on dial (1000 on V.34 type speeds)
They have a single transit provider
  Connect with a 16Mbps international leased link to their transit's PoP
  Transit link is highly congested

Worked Example (2)

There are two other ISPs serving the same locality
  There is no interconnection between any of the three ISPs
  Local traffic (between all 3 ISPs) is traversing International connections
Course of action for our ISP:
  Work to establish local IXP
  Establish presence at overseas co-location
First Step
  Assess local versus international traffic ratio
  Use NetFlow on border router connecting to transit provider

Worked Example (3)

Local/Non-local traffic ratio
  Local = traffic going to other two ISPs
  Non-local = traffic going elsewhere
Example: balance is 30:70
  Of 16Mbps, that means 5Mbps could stay in country and not congest International circuit
  16Mbps transit costs $50 per Mbps per month in traffic charges = $250 per month, or $3000 per year for local traffic
  Circuit costs $100k per year: $30k is spent on local traffic
Total is $33k per year for local traffic
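The arithmetic behind the $33k figure can be reproduced step by step. This sketch uses only the numbers on the slide; the variable names are illustrative, and integer arithmetic is used so the results are exact.

```python
LINK_MBPS = 16                  # international transit link
LOCAL_PERCENT = 30              # local share from the 30:70 balance
PRICE_PER_MBPS_MONTH = 50       # transit traffic charge, in dollars
CIRCUIT_PER_YEAR = 100_000      # leased circuit cost, in dollars

# 30% of 16Mbps is 4.8Mbps, which the slide rounds to 5Mbps.
local_mbps = round(LINK_MBPS * LOCAL_PERCENT / 100)

traffic_per_month = local_mbps * PRICE_PER_MBPS_MONTH
traffic_per_year = traffic_per_month * 12

# Share of the circuit cost that carries local traffic.
circuit_share = CIRCUIT_PER_YEAR * LOCAL_PERCENT // 100

total_local_per_year = traffic_per_year + circuit_share
```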

Worked Example (4)

IXP cost:
  Simple 8 port 10/100 managed switch plus co-lo space over 3 years could be around US$30k total; or $3k per year per ISP
  One router to handle 5Mbps (e.g. 2801) would be around $3k (good for 3 years)
  One local 10Mbps circuit from ISP location to IXP location would be around $5k per year, no traffic charges
  Per ISP total: $9k
  Somewhat cheaper than $33k
  Business case for local peering is straightforward: $24k saving per annum
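The per-ISP total and the saving can be checked the same way. The sketch below takes the slide's rounded $3k per year per ISP for the switch and co-lo share (the exact share of $30k over 3 years and 3 ISPs would be slightly higher); variable names are illustrative.

```python
SWITCH_AND_COLO_PER_ISP_YEAR = 3_000   # slide's rounded share of US$30k / 3 years / 3 ISPs
ROUTER_COST = 3_000                    # one router, good for 3 years
ROUTER_LIFETIME_YEARS = 3
LOCAL_CIRCUIT_PER_YEAR = 5_000         # 10Mbps circuit to the IXP, no traffic charges
TRANSIT_COST_OF_LOCAL_TRAFFIC = 33_000 # yearly cost of local traffic over transit

router_per_year = ROUTER_COST // ROUTER_LIFETIME_YEARS

ixp_total_per_year = (SWITCH_AND_COLO_PER_ISP_YEAR
                      + router_per_year
                      + LOCAL_CIRCUIT_PER_YEAR)

saving_per_year = TRANSIT_COST_OF_LOCAL_TRAFFIC - ixp_total_per_year
```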

Worked Example (5)

After IXP establishment
  5Mbps removed from International link
  Leaving 5Mbps for more International traffic, and that fills the link within weeks of the local traffic being removed
Next step is to assess transit charges and optimise costs
  ISP visits several major regional IXPs
  Assesses routes available
  Compares routes available with traffic generated by those routes from its NetFlow data
  Discovers that 30% of traffic would transfer to one IXP via peering

Worked Example (6)

Costs:
  Router for Regional IXP (e.g. 2801) at $3k over three years
  Co-lo space at Regional IXP venue at $3k per year
  Best price for transit at the Regional IXP venue by competitive tender is $30 per Mbps per month, plus $1k port charge
  30% of traffic offloads to IXP, leaving 70% of 16Mbps to transit provider = $330 per month, or $5k per annum
  Total with this model is $9k per year, plus the cost of the circuit (still $100k)
  Compare this with paying $50 per Mbps per month to the transit provider = $10k per annum (plus cost of the circuit)
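The regional IXP comparison can be recomputed from the slide's inputs. This is a sketch under one stated assumption: the $1k port charge is treated as an annual charge, which the slide does not say explicitly. Unrounded results differ slightly from the slide's rounded figures ($336 versus "$330" per month, for instance).

```python
LINK_MBPS = 16
OFFLOAD_PERCENT = 30             # traffic that moves to the regional IXP
NEW_PRICE_PER_MBPS_MONTH = 30    # transit price won by competitive tender
OLD_PRICE_PER_MBPS_MONTH = 50    # original transit price
PORT_CHARGE_PER_YEAR = 1_000     # assumption: the $1k port charge is per year
ROUTER_PER_YEAR = 3_000 / 3      # $3k router spread over three years
COLO_PER_YEAR = 3_000

# Transit now carries only the remaining 70% of 16Mbps.
transit_per_month = LINK_MBPS * (100 - OFFLOAD_PERCENT) * NEW_PRICE_PER_MBPS_MONTH / 100
transit_per_year = transit_per_month * 12 + PORT_CHARGE_PER_YEAR

new_total_per_year = ROUTER_PER_YEAR + COLO_PER_YEAR + transit_per_year
old_total_per_year = LINK_MBPS * OLD_PRICE_PER_MBPS_MONTH * 12
```

Rounded to the nearest thousand, these give the slide's $5k transit, $9k new total and $10k old total; the circuit cost is the same in both cases and is left out.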

Worked Example (7)

Result:
  ISP co-locates at Regional IXP
  Pays reduced transit charges to transit provider (competitive tender)
  Pays no charges for traffic across Regional IXP
Bonuses:
  Rate limits on router at Regional IXP co-lo
    Can prioritise congestion dependent on customer demands
  Install servers at Regional IXP co-lo facility
    Filters e-mail (spam and viruses): relieves some capacity on link
    Caches content: relieves a little more capacity on link

Conclusion

Within the original costs of having one international transit provider:
  ISP has turned up at the local IXP and offloaded local traffic for free
  ISP has turned up at a major regional IXP and offloaded traffic, avoiding paying transit charges to transit provider
  ISP has reduced remaining transit charges by competitive tender at the regional IXP co-location facility
Caveat
  These numbers are typical of the Internet today
  As ever, your mileage may vary, but do the financial calculations first and in the context of potential technical advantages too


Introduction to OSPF
ISP Training Workshops

OSPF

Open Shortest Path First
Link state or SPF technology
Developed by OSPF working group of IETF (RFC 1247)
OSPFv2 standard described in RFC2328
Designed for:
  TCP/IP environment
  Fast convergence
  Variable-length subnet masks
  Discontiguous subnets
  Incremental updates
  Route authentication
Runs on IP, Protocol 89

Link State
[Figure: router X holds X's, Q's and Z's link state in its database and derives a routing table from it: A via Q (cost 2), B via Z (cost 13), C via X (cost 13)]
Topology Information is kept in a Database separate from the Routing Table

Link State Routing

Neighbour discovery
Constructing a Link State Packet (LSP)
Distribute the LSP
  (Link State Advertisement, LSA)
Compute routes
On network failure
  New LSPs flooded
  All routers recompute routing table

Low Bandwidth Utilisation


[Figure: LSA flooding via R1]
Only changes propagated
Uses multicast on multi-access broadcast networks

Fast Convergence

Detection Plus LSA/SPF
  Known as the Dijkstra Algorithm
[Figure: networks N1 and N2 connected via routers R1, R2 and R3, with a primary path and an alternate path]
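The SPF step can be sketched with a textbook Dijkstra implementation run over a link state database. The three-router topology and its link costs below are hypothetical (the slide gives no costs), and real OSPF implementations add much more, such as per-area databases and equal-cost paths.

```python
import heapq

# Hypothetical link state database: router -> {neighbour: link cost}.
LSDB = {
    "R1": {"R2": 5, "R3": 1},
    "R2": {"R1": 5, "R3": 5},
    "R3": {"R1": 1, "R2": 5},
}


def shortest_paths(graph, source):
    """Dijkstra's algorithm: return (cost to each node, previous hop on the best path)."""
    dist = {source: 0}
    prev = {}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already found
        for neighbour, link_cost in graph[node].items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                prev[neighbour] = node
                heapq.heappush(queue, (new_cost, neighbour))
    return dist, prev


def without_link(graph, a, b):
    """Copy of the database with the link between a and b removed (a link failure)."""
    return {n: {m: c for m, c in nbrs.items() if {n, m} != {a, b}}
            for n, nbrs in graph.items()}


primary_cost = shortest_paths(LSDB, "R1")[0]["R3"]
failed_dist, failed_prev = shortest_paths(without_link(LSDB, "R1", "R3"), "R1")
```

Before the failure R1 reaches R3 directly; after the R1 to R3 link is removed, rerunning SPF on the updated database yields the alternate path through R2.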

Fast Convergence

Finding a new route
  LSA flooded throughout area
  Acknowledgement based
  Topology database synchronised
  Each router derives routing table to destination network
[Figure: LSA about N1 flooded from R1]

OSPF Areas

Area is a group of contiguous hosts and networks
  Reduces routing traffic
Per area topology database
  Invisible outside the area
Backbone area MUST be contiguous
  All other areas must be connected to the backbone
[Figure: Area 0 (Backbone Area) with routers Ra, Rb, Rc and Rd, and Areas 1 to 4 attached to it, containing routers R1 to R8]

Virtual Links between OSPF Areas

Virtual Link is used when it is not possible to physically connect the area to the backbone
ISPs avoid designs which require virtual links
  Increases complexity
  Decreases reliability and scalability
[Figure: Area 0 (Backbone Area) with routers Ra, Rb, Rc and Rd; Area 1 and Area 4 with routers R3 to R8]

Classification of Routers
[Figure: routers in Areas 0 to 3 labelled by role: IR, ABR/BR, IR/BR and ASBR (to other AS)]
Internal Router (IR)
Area Border Router (ABR)
Backbone Router (BR)
Autonomous System Border Router (ASBR)

OSPF Route Types

[Figure: Areas 0 to 3 with internal routers, ABR/BR routers and an ASBR connecting to another AS]
Intra-area Route
  all routes inside an area
Inter-area Route
  routes advertised from one area to another by an Area Border Router
External Route
  routes imported into OSPF from other protocol or static routes

External Routes

Prefixes which are redistributed into OSPF from other protocols
  Flooded unaltered throughout the AS
  Recommendation: Avoid redistribution!!
OSPF supports two types of external metrics
  Type 1 external metrics
  Type 2 external metrics (Cisco IOS default)
[Figure: R2 redistributing RIP, EIGRP, BGP, Static, Connected etc. into OSPF]

External Routes

Type 1 external metric: metrics are added to the summarised internal link cost
[Figure: R1 reaches R2 at cost 10 and R3 at cost 8; R2 advertises N1 with external cost 1, R3 advertises N1 with external cost 2]

Network  Type 1  Next Hop
N1       11      R2
N1       10      R3  (Selected Route)

External Routes

Type 2 external metric: metrics are compared without adding to the internal link cost
[Figure: R1 reaches R2 at cost 10 and R3 at cost 8; R2 advertises N1 with external cost 1, R3 advertises N1 with external cost 2]

Network  Type 2  Next Hop
N1       1       R2  (Selected Route)
N1       2       R3
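The difference between the two external metric types can be shown with the costs from the two slides above. This is a sketch, not router code: the class and function names are illustrative, and the Type 2 tie-break on internal cost is included only so the comparison is well defined when external costs are equal.

```python
from typing import NamedTuple


class ExternalPath(NamedTuple):
    next_hop: str
    internal_cost: int   # cost from R1 to the advertising router
    external_cost: int   # external metric advertised for N1


# Paths from R1 to N1, using the costs shown on the slides.
PATHS_TO_N1 = [
    ExternalPath("R2", internal_cost=10, external_cost=1),
    ExternalPath("R3", internal_cost=8, external_cost=2),
]


def best_type1(paths):
    """Type 1: the external metric is added to the internal cost."""
    return min(paths, key=lambda p: p.internal_cost + p.external_cost)


def best_type2(paths):
    """Type 2: external metrics are compared alone; internal cost only breaks ties."""
    return min(paths, key=lambda p: (p.external_cost, p.internal_cost))
```

`best_type1(PATHS_TO_N1)` picks the path via R3 (total 10 against 11), while `best_type2(PATHS_TO_N1)` picks the path via R2 (external cost 1 against 2).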

Topology/Link State Database

A router has a separate LS database for each area to which it belongs
All routers belonging to the same area have identical database
SPF calculation is performed separately for each area
LSA flooding is bounded by area
Recommendation:
  Limit the number of areas a router participates in!!
  1 to 3 is fine (typical ISP design)
  >3 can overload the CPU depending on the area topology complexity

The Hello Protocol

Responsible for establishing and maintaining neighbour relationships
Elects designated router on multi-access networks
[Figure: routers exchanging Hello packets]

The Hello Packet

Contains:
  Router priority
  Hello interval
  Router dead interval
  Network mask
  List of neighbours
  DR and BDR
  Options: E-bit, MC-bit, ... (see A.2 of RFC2328)
[Figure: routers exchanging Hello packets]

Designated Router

There is ONE designated router per multi-access network
  Generates network link advertisements
  Assists in database synchronization
[Figure: two multi-access networks, each with a Designated Router and a Backup Designated Router]

Designated Router by Priority

Configured priority (per interface)
  ISPs configure high priority on the routers they want as DR/BDR
Else determined by highest router ID
  Router ID is 32 bit integer
  Derived from the loopback interface address, if configured, otherwise the highest IP address
[Figure: R1 (interface 131.108.3.2, loopback 144.254.3.5, Router ID = 144.254.3.5) and R2 (interface 131.108.3.3, Router ID = 131.108.3.3) on a shared network, with the DR marked]
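The ordering rule (priority first, then highest router ID) can be sketched as below. This is a simplification: the real OSPF election also chooses a BDR and does not pre-empt an existing DR. The router IDs come from the slide; the priority values are assumptions made for illustration.

```python
import ipaddress
from typing import NamedTuple


class Router(NamedTuple):
    name: str
    priority: int     # per-interface OSPF priority; 0 means never eligible
    router_id: str    # dotted-quad form of the 32 bit router ID


def elect_dr(routers):
    """Pick the router with the highest priority, breaking ties on router ID."""
    eligible = [r for r in routers if r.priority > 0]
    if not eligible:
        return None
    return max(
        eligible,
        key=lambda r: (r.priority, int(ipaddress.IPv4Address(r.router_id))),
    )


# Equal (assumed) priorities: the comparison falls back to the router ID.
equal_priority = elect_dr([
    Router("R1", 1, "144.254.3.5"),
    Router("R2", 1, "131.108.3.3"),
])

# A higher configured priority wins regardless of router ID.
configured = elect_dr([
    Router("R1", 1, "144.254.3.5"),
    Router("R2", 10, "131.108.3.3"),
])
```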

Neighbouring States

Full
  Routers are fully adjacent
  Databases synchronised
  Relationship to DR and BDR
[Figure: routers in state Full with the DR and BDR]

Neighbouring States

2-way
  Router sees itself in other Hello packets
  DR selected from neighbours in state 2-way or greater
[Figure: routers in state 2-way with each other, alongside the DR and BDR]

When to Become Adjacent


Underlying network is point to point
Underlying network type is virtual link
The router itself is the designated router or the backup designated router
The neighbouring router is the designated router or the backup designated router

LSAs Propagate Along Adjacencies

[Figure: LSAs flowing between the DR, the BDR and the other routers]
LSAs acknowledged along adjacencies

Broadcast Networks

IP Multicast used for Sending and Receiving Updates
  All routers must accept packets sent to AllSPFRouters (224.0.0.5)
  All DR and BDR routers must accept packets sent to AllDRouters (224.0.0.6)
Hello packets sent to AllSPFRouters (Unicast on point-to-point and virtual links)

Routing Protocol Packets

Share a common protocol header
Routing protocol packets are sent with type of service (TOS) of 0
Five types of OSPF routing protocol packets
  Hello: packet type 1
  Database description: packet type 2
  Link-state request: packet type 3
  Link-state update: packet type 4
  Link-state acknowledgement: packet type 5

Different Types of LSAs

Six distinct types of LSAs
  Type 1:           Router LSA
  Type 2:           Network LSA
  Type 3 & 4:       Summary LSA
  Type 5 & 7:       External LSA (Type 7 is for NSSA)
  Type 6:           Group membership LSA
  Type 9, 10 & 11:  Opaque LSA (9: Link-Local, 10: Area, 11: AS)

Router LSA (Type 1)


Describes the state and cost of the router's links to the area
All of the router's links in an area must be described in a single LSA
Flooded throughout the particular area and no more
Router indicates whether it is an ASBR, ABR, or end point of virtual link

Network LSA (Type 2)


Generated for every transit broadcast and NBMA network
Describes all the routers attached to the network
Only the designated router originates this LSA
Flooded throughout the area and no more

Summary LSA (Type 3 and 4)

Describes the destination outside the area but still in the AS
Flooded throughout a single area
Originated by an ABR
Only inter-area routes are advertised into the backbone
Type 4 is the information about the ASBR

External LSA (Type 5 and 7)


Defines routes to destinations external to the AS
Default route is also sent as external
Two types of external LSA:
  E1: Considers the total cost up to the external destination
  E2: Considers only the cost of the outgoing interface to the external destination
(Type 7 LSAs used to describe external LSA for one specific OSPF area type)

Inter-Area Route Summarisation

Prefix or all subnets
Prefix or all networks
"Area range" command

With summarisation:
  Network  Next Hop
  1        R1
Without summarisation:
  Network  Next Hop
  1.A      R1
  1.B      R1
  1.C      R1
[Figure: R1 is the ABR between Backbone Area 0 (containing R2) and Area 1 (networks 1.A, 1.B and 1.C)]

No Summarisation

Specific Link LSA advertised out of each area
Link state changes propagated out of each area
[Figure: Areas 1, 2 and 3 each advertise their specific networks (1.A to 1.D, 2.A to 2.D, 3.A to 3.D) into Area 0]

With Summarisation

Only summary LSA advertised out of each area
Link state changes do not propagate out of the area
[Figure: Areas 1, 2 and 3 (networks 1.A to 3.D) each advertise only a summary into Area 0]

No Summarisation

Specific Link LSA advertised in to each area
Link state changes propagated in to each area
[Figure: each of Areas 1, 2 and 3 receives the specific networks of the other two areas from Area 0]

With Summarisation

Only summary link LSA advertised in to each area
Link state changes do not propagate in to each area
[Figure: each of Areas 1, 2 and 3 receives only the summaries of the other two areas from Area 0]

Types of Areas

Regular
Stub
Totally Stubby
Not-So-Stubby
Only "regular" areas are useful for ISPs
  Other area types handle redistribution of other routing protocols into OSPF; ISPs don't redistribute anything into OSPF
The next slides describing the different area types are provided for information only

Regular Area (Not a Stub)

From Area 1's point of view, summary networks from other areas are injected, as are external networks such as X.1
[Figure: an ASBR attaches external network X.1 to Area 0; each of Areas 1, 2 and 3 receives the summaries of the other two areas plus X.1]

Normal Stub Area

Summary networks, default route injected
Command is area x stub
[Figure: an ASBR attaches external network X.1 to Area 0; each stub area receives the summaries of the other areas plus a default route in place of X.1]

Totally Stubby Area

Only a default route injected
  Default path to closest area border router
Command is area x stub no-summary
[Figure: an ASBR attaches external network X.1 to Area 0; the totally stubby area receives only a default route, while the other areas receive summaries plus a default route]

Not-So-Stubby Area

Capable of importing routes in a limited fashion
Type-7 LSAs carry external information within an NSSA
NSSA Border routers translate selected type-7 LSAs into type-5 external network LSAs

[Figure: ASBR with external network X.1 attached to Area 0; a not-so-stubby area imports external network X.2 and receives a default route]

113

ISP Use of Areas

ISP networks use:

  Backbone area
  Regular area

Backbone area

  No partitioning

Regular area

  Summarisation of point-to-point link addresses used within areas
  Loopback addresses allowed out of regular areas without summarisation (otherwise iBGP won't work)

114

Addressing for Areas


Area 0
network 192.168.1.0
range 255.255.255.192

Area 1
network 192.168.1.64
range 255.255.255.192

Area 2
network 192.168.1.128
range 255.255.255.192

Area 3
network 192.168.1.192
range 255.255.255.192

Assign contiguous ranges of subnets per area to


facilitate summarisation
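
The per-area plan above can be checked with a few lines of Python. This is only a sketch: the parent block 192.168.1.0/24 and the /26 ranges come from the slide, while the dictionary layout and variable names are illustrative assumptions.

```python
import ipaddress

# Parent block from the slide; mask 255.255.255.192 is a /26.
block = ipaddress.ip_network("192.168.1.0/24")

# Four equal, contiguous ranges, one per area (Area 0 .. Area 3).
area_ranges = dict(enumerate(block.subnets(new_prefix=26)))

for area, net in area_ranges.items():
    # Prints the same "network / range" pairs shown on the slide.
    print(f"Area {area}: network {net.network_address} range {net.netmask}")
```

Because each area owns one aligned /26, an ABR can advertise a single summary per area.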
115

Summary

Fundamentals of Scalable OSPF Network


Design

Area hierarchy
DR/BDR selection
Contiguous intra-area addressing
Route summarisation
Infrastructure prefixes only

116

Introduction to
OSPF
ISP Training Workshops

117

Deploying OSPF for


ISPs
ISP Training Workshops

118

Agenda
OSPF Design in SP Networks
Adding Networks in OSPF
OSPF in Cisco's IOS

119

OSPF Design
As applicable to Service
Provider Networks

120

Service Providers

SP networks are divided


into PoPs
PoPs are linked by the
backbone
Transit routing information
is carried via iBGP
IGP is only used to carry
the next hop for BGP
Optimal path to the next
hop is critical

121

SP Architecture

Major routing information is ~430K prefixes via BGP
Largest known IGP routing table is ~9-10K
Total of 440K
10K/440K: IGP routes are about 2% of the routes in an ISP network
A very small factor but has a huge impact on network convergence!

Area 6/L1
BGP 1

POP

POP

Area 1/L1
BGP 1

Area 2/L1
BGP 1

IP Backbone
Area0/L2
BGP 1

POP

Area 5/L1
BGP 1
POP

Area 3/L1
BGP 1
POP

Area 4/L1
BGP 1
POP

122

SP Architecture

Regional
Core

You can reduce the IGP


size from 10K to approx
the number of routers in
your network
This will bring really fast
convergence
Optimise where you must
and summarise where you
can
Stops unnecessary flapping

RR

IGP
Access

customer

customer

customer 123

OSPF Design: Addressing

OSPF Design and Addressing go together

Objective is to keep the Link State Database


lean
Create an address hierarchy to match the
topology
Use separate Address Blocks for loopbacks,
network infrastructure, customer interfaces &
customers

Customer Address Space | PtP Links | Infrastructure | Loopbacks


124

OSPF Design: Addressing

Minimising the number of prefixes in OSPF:

Number loopbacks out of a contiguous address


block

Use contiguous address blocks per area for


infrastructure point-to-point links

But do not summarise these across area boundaries: iBGP


peer addresses need to be in the IGP

Use area range command on ABR to summarise

With these guidelines:

Number of prefixes in area 0 will then be very close to


the number of routers in the network
It is critically important that the number of prefixes and
LSAs in area 0 is kept to the absolute minimum
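
As a rough illustration of why contiguous point-to-point blocks help, the sketch below collapses a set of /30 link prefixes into the fewest covering prefixes, which is what an area range on the ABR would advertise. The link addresses and the function name are hypothetical.

```python
import ipaddress

def area_summary(link_prefixes):
    """Collapse link prefixes into the fewest prefixes that cover them exactly."""
    nets = [ipaddress.ip_network(p) for p in link_prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# Hypothetical point-to-point links numbered from one contiguous block.
contiguous = ["192.168.10.0/30", "192.168.10.4/30",
              "192.168.10.8/30", "192.168.10.12/30"]
print(area_summary(contiguous))

# Links scattered across the address space cannot be summarised.
scattered = ["192.168.10.0/30", "192.168.10.8/30"]
print(area_summary(scattered))
```

The contiguous block collapses to one prefix; the scattered links stay as separate entries in area 0.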
125

OSPF Design: Areas

Examine physical topology

  Is it meshed or hub-and-spoke?

Use areas and summarisation

  This reduces overhead and LSA counts (but watch next-hop for iBGP when summarising)

Don't bother with the various stub areas

  No benefits for ISPs, causes problems for iBGP

Push the creation of a backbone

  Reduces mesh and promotes hierarchy

126

OSPF Design: Areas

One SPF per area, flooding done per area

  Watch out for overloading ABRs

Avoid externals in OSPF

  DO NOT REDISTRIBUTE into OSPF
  External LSAs flood through entire network

Different types of areas do different flooding

Normal areas
Stub areas
Totally stubby (stub no-summary)
Not so stubby areas (NSSA)

127

OSPF Design: Areas

Area 0 must be contiguous

Do NOT use virtual links to join two Area 0 islands

Traffic between two non-zero areas always goes


via Area 0

There is no benefit in joining two non-zero areas


together
Avoid designs which have two non-zero areas touching
each other
(Typical design is an area per PoP, with core routers
being ABR to the backbone area 0)

128

OSPF Design: Summary

Think Redundancy

  Dual links out of each area, using metrics (cost) for traffic engineering

Too much redundancy

  Dual links to backbone in stub areas must be the same cost, otherwise sub-optimal routing will result
  Too much redundancy in the backbone area without good summarisation will affect convergence in Area 0
129

OSPF Areas: Migration

Where to place OSPF Areas?

Follow the physical topology!


Remember the earlier design advice

Configure one area at a time!

Start at the outermost edge of the network


Log into routers at either end of a link and change the
link from Area 0 to the chosen Area
Wait for OSPF to re-establish adjacencies
And then move onto the next link, etc
Important to ensure that there is never an Area 0 island
anywhere in the migrating network
130

OSPF Areas: Migration


A

Area 0

C
D
Area 10

Migrate small parts of the network, one area at a


time

Remember to introduce summarisation where feasible

With careful planning, the migration can be done


with minimal network downtime

131

OSPF for Service


Providers
Configuring OSPF & Adding
Networks

132

OSPF: Configuration

Starting OSPF in Cisco's IOS

router ospf 100


Where 100 is the process ID

OSPF process ID is unique to the router

Gives possibility of running multiple instances of OSPF


on one router
Process ID is not passed between routers in an AS
Many ISPs configure the process ID to be the same as
their BGP Autonomous System Number

133

OSPF: Establishing
Adjacencies

Cisco IOS OSPFv2 automatically tries to establish


adjacencies on all defined interfaces (or subnets)
Best practice is to disable this

Potential security risk: sending OSPF Hellos outside of


the autonomous system, and risking forming
adjacencies with external networks
Example: Only POS4/0 interface will attempt to form an
OSPF adjacency
router ospf 100
passive-interface default
no passive-interface POS4/0
134

OSPF: Adding Networks


Option One

Redistribution:

Applies to all connected interfaces on the router but


sends networks as external type-2s which are not
summarised
router ospf 100
redistribute connected subnets

Do NOT do this! Because:

  External type-2 routes are carried in Type-5 LSAs, which flood through the entire network
  These LSAs are not all useful for determining paths through the backbone; they simply take up valuable space

135

OSPF: Adding Networks


Option Two

Per link configuration from IOS 12.4 onwards

OSPF is configured on each interface (same as ISIS)


Useful for multiple subnets per interface

interface POS 4/0


ip address 192.168.1.1 255.255.255.0
ip address 172.16.1.1 255.255.255.224 secondary
ip ospf 100 area 0
!
router ospf 100
passive-interface default
no passive-interface POS 4/0
136

OSPF: Adding Networks


Option Three

Specific network statements

Every active interface with a configured IP address


needs an OSPF network statement
Interfaces that will have no OSPF neighbours need
passive-interface to disable OSPF Hellos

That is: all interfaces connecting to devices outside the ISP


backbone (i.e. customers, peers, etc)

router ospf 100


network 192.168.1.0 0.0.0.3 area 51
network 192.168.1.4 0.0.0.3 area 51
passive-interface Serial 1/0
137

OSPF: Adding Networks


Option Four

Network statements wildcard mask

Every active interface with configured IP address covered


by wildcard mask used in OSPF network statement
Interfaces covered by wildcard mask but having no OSPF neighbours need passive-interface (or use passive-interface default and then activate the interfaces which will have OSPF neighbours)
router ospf 100
network 192.168.1.0 0.0.0.255 area 51
passive-interface default
no passive-interface POS 4/0
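
The difference between Options Three and Four comes down to how the wildcard mask is applied. The sketch below shows the usual interpretation (wildcard bits set to 1 are "don't care"); the function name and the interface addresses are illustrative assumptions, not taken from IOS.

```python
import ipaddress

def matches_network_statement(interface_ip, network, wildcard):
    """True if interface_ip is covered by 'network <network> <wildcard>'."""
    ip = int(ipaddress.IPv4Address(interface_ip))
    net = int(ipaddress.IPv4Address(network))
    wild = int(ipaddress.IPv4Address(wildcard))
    care = ~wild & 0xFFFFFFFF  # bits that must match exactly
    return (ip & care) == (net & care)

# Option Three: one specific statement per link (0.0.0.3 covers a /30).
print(matches_network_statement("192.168.1.1", "192.168.1.0", "0.0.0.3"))
print(matches_network_statement("192.168.1.5", "192.168.1.0", "0.0.0.3"))

# Option Four: one wide statement (0.0.0.255) covers every interface in the /24.
print(matches_network_statement("192.168.1.5", "192.168.1.0", "0.0.0.255"))
```

With the wide wildcard every covered interface joins OSPF, which is why passive-interface default matters in Option Four.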
138

OSPF: Adding Networks


Recommendations

Don't ever use Option 1
Use Option 2 if supported; otherwise:
Option 3 is fine for core/infrastructure routers

  Doesn't scale too well when router has a large number of interfaces but only a few with OSPF neighbours
  Solution is to use Option 3 with "no passive" on interfaces with OSPF neighbours

Option 4 is preferred for aggregation routers

Or use iBGP next-hop-self


Or even ip unnumbered on external point-to-point links

139

OSPF: Adding Networks


Example One (Cisco IOS >= 12.4)
Aggregation router with large number of leased line customers and just two links to the core network:

interface loopback 0
ip address 192.168.255.1 255.255.255.255
ip ospf 100 area 0
interface POS 0/0
ip address 192.168.10.1 255.255.255.252
ip ospf 100 area 0
interface POS 1/0
ip address 192.168.10.5 255.255.255.252
ip ospf 100 area 0
interface serial 2/0:0 ...
ip unnumbered loopback 0
! Customers connect here ^^^^^^^
router ospf 100
passive-interface default
no passive-interface POS 0/0
no passive-interface POS 1/0

140

OSPF: Adding Networks


Example One (Cisco IOS < 12.4)
Aggregation router with large number of leased line customers and just two links to the core network:

interface loopback 0
ip address 192.168.255.1 255.255.255.255
interface POS 0/0
ip address 192.168.10.1 255.255.255.252
interface POS 1/0
ip address 192.168.10.5 255.255.255.252
interface serial 2/0:0 ...
ip unnumbered loopback 0
! Customers connect here ^^^^^^^
router ospf 100
network 192.168.255.1 0.0.0.0 area 51
network 192.168.10.0 0.0.0.3 area 51
network 192.168.10.4 0.0.0.3 area 51
passive-interface default
no passive-interface POS 0/0
no passive-interface POS 1/0

141

OSPF: Adding Networks


Example Two (Cisco IOS >= 12.4)
Core router with only links to other core routers:
interface loopback 0
ip address 192.168.255.1 255.255.255.255
ip ospf 100 area 0
interface POS 0/0
ip address 192.168.10.129 255.255.255.252
ip ospf 100 area 0
interface POS 1/0
ip address 192.168.10.133 255.255.255.252
ip ospf 100 area 0
interface POS 2/0
ip address 192.168.10.137 255.255.255.252
ip ospf 100 area 0
interface POS 2/1
ip address 192.168.10.141 255.255.255.252
ip ospf 100 area 0
router ospf 100
passive-interface loopback 0

142

OSPF: Adding Networks


Example Two (Cisco IOS < 12.4)
Core router with only links to other core routers:
interface loopback 0
ip address 192.168.255.1 255.255.255.255
interface POS 0/0
ip address 192.168.10.129 255.255.255.252
interface POS 1/0
ip address 192.168.10.133 255.255.255.252
interface POS 2/0
ip address 192.168.10.137 255.255.255.252
interface POS 2/1
ip address 192.168.10.141 255.255.255.252
router ospf 100
network 192.168.255.1 0.0.0.0 area 0
network 192.168.10.128 0.0.0.3 area 0
network 192.168.10.132 0.0.0.3 area 0
network 192.168.10.136 0.0.0.3 area 0
network 192.168.10.140 0.0.0.3 area 0
passive-interface loopback 0

143

OSPF: Adding Networks


Summary

Key Theme when selecting a technique:


Keep the Link State Database Lean

Increases Stability
Reduces the amount of information in the Link
State Advertisements (LSAs)
Speeds Convergence Time

144

OSPF in Cisco IOS


Useful features for ISPs

145

Areas

An area is stored as
a 32-bit field:

Defined in IPv4
address format (i.e.
Area 0.0.0.0)
Can also be defined
using single decimal
value (i.e. Area 0)

0.0.0.0 reserved for


the backbone area
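
Since the area ID is just a 32-bit number, the two notations are interchangeable. A small sketch of the conversion (the function names are illustrative):

```python
import ipaddress

def area_to_dotted(area):
    """Render a decimal area ID in IPv4 address format."""
    return str(ipaddress.IPv4Address(area))

def area_to_decimal(dotted):
    """Render an area ID given in IPv4 address format as a single decimal."""
    return int(ipaddress.IPv4Address(dotted))

print(area_to_dotted(0))           # backbone area
print(area_to_dotted(51))
print(area_to_decimal("0.0.1.0"))
```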

Area 3

Area 0
Area 2
Area 1

146

Logging Adjacency Changes


The router will generate a log message
whenever an OSPF neighbour changes state
Syntax:

[no] [ospf] log-adjacency-changes


(OSPF keyword is optional, depending on IOS
version)

Example of a typical log message:

%OSPF-5-ADJCHG: Process 1, Nbr 223.127.255.223 on Ethernet0 from LOADING to FULL, Loading Done
147

Number of State Changes

The number of state transitions is available via SNMP (ospfNbrEvents) and the CLI:

show ip ospf neighbor [type number] [neighbor-id] [detail]

  detail: (Optional) Displays all neighbours in detail (lists all neighbours). When specified, neighbour state transition counters are displayed per interface or neighbour ID

148

State Changes (Continued)

To reset OSPF-related statistics, use the


clear ip ospf counters command

This will reset neighbour state transition


counters per interface or neighbour id
clear ip ospf counters [neighbor [<type
number>] [neighbor-id]]

149

Router ID
If the loopback interface exists and has an IP address, that is used as the router ID in routing protocols: stability!
If the loopback interface does not exist, or has no IP address, the router ID is the highest IP address configured: danger!
OSPF sub command to manually set the
Router ID:

router-id <ip address>
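
The selection order described above can be sketched as follows. This is a simplified model, not IOS code: the interface dictionary, the name-prefix test for loopbacks and the function name are all assumptions made for illustration.

```python
import ipaddress

def select_router_id(interfaces, configured=None):
    """Pick a router ID: manual setting, else loopback, else highest address.

    interfaces maps an interface name to its IPv4 address (as a string).
    """
    if configured is not None:
        return configured

    def highest(addresses):
        return str(max(ipaddress.IPv4Address(a) for a in addresses))

    loopbacks = [ip for name, ip in interfaces.items()
                 if name.lower().startswith("loopback")]
    if loopbacks:
        return highest(loopbacks)      # stable: loopbacks do not go down
    if interfaces:
        return highest(interfaces.values())  # may change if interfaces change
    return None

router = {"POS0/0": "192.168.10.1", "Loopback0": "192.168.255.1"}
print(select_router_id(router))
print(select_router_id({"POS0/0": "192.168.10.1", "POS1/0": "192.168.10.5"}))
```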


150

Cost & Reference


Bandwidth

Bandwidth used in Metric calculation

  Cost = 10^8/bandwidth (bandwidth in bps)
  Not useful for interface bandwidths > 100 Mbps

Syntax:

  ospf auto-cost reference-bandwidth <reference-bw>

Default reference bandwidth still 100 Mbps for


backward compatibility
Most ISPs simply choose to develop their own
cost strategy and apply to each interface type
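
A short worked example shows why the default reference bandwidth stops being useful above 100 Mbps. The sketch assumes integer division with a minimum cost of 1; the function name and parameters are illustrative.

```python
def ospf_cost(bandwidth_bps, reference_bps=10**8):
    """Interface cost = reference bandwidth / interface bandwidth, minimum 1."""
    return max(1, reference_bps // bandwidth_bps)

for name, bps in [("Ethernet", 10**7), ("FastEthernet", 10**8),
                  ("GigEthernet", 10**9), ("10GE", 10**10)]:
    print(f"{name}: default cost {ospf_cost(bps)}, "
          f"cost with 100 Gbps reference {ospf_cost(bps, 10**11)}")
```

With the default reference, FastEthernet, GigEthernet and 10GE all end up with cost 1, so OSPF cannot tell them apart.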
151

Cost: Example Strategy


Interface       Bandwidth   Cost
100GE           100Gbps     cost = 1
40GE/OC768      40Gbps      cost = 2
10GE/OC192      10Gbps      cost = 5
OC48            2.5Gbps     cost = 10
GigEthernet     1Gbps       cost = 20
OC12            622Mbps     cost = 50
OC3             155Mbps     cost = 100
FastEthernet    100Mbps     cost = 200
Ethernet        10Mbps      cost = 500
E1              2Mbps       cost = 1000
152

Default routes

Originating a default route into OSPF

default-information originate metric <n>


Will originate a default route into OSPF if there is
a matching default route in the Routing Table
(RIB)
The optional always keyword will always
originate a default route, even if there is no
existing entry in the RIB

153

Clear/Restart

OSPF clear commands

If no process ID is given, all OSPF processes on the router are assumed

clear ip ospf [pid] redistribution

  This command clears redistribution based on OSPF routing process ID

clear ip ospf [pid] counters

  This command clears counters based on OSPF routing process ID

clear ip ospf [pid] process

  This command will restart the specified OSPF process. It attempts to keep the old router-id, except in cases where a new router-id was configured or an old user configured router-id was removed. Since this command can potentially cause network churn, a user confirmation is required before performing any action
154

Use OSPF Authentication

Use authentication

Too many operators overlook this basic requirement

When using authentication, use the MD5 feature

Under the global OSPF configuration, specify:


area <area-id> authentication message-digest

Under the interface configuration, specify:


ip ospf message-digest-key 1 md5 <key>

Authentication can be selectively disabled per


interface with:
ip ospf authentication null

155

Point to Point Ethernet


Links

For any broadcast media (like Ethernet), OSPF will


attempt to elect a designated and backup designated
router when it forms an adjacency

If the interface is running as a point-to-point WAN link, with only 2 routers on the wire, configuring OSPF to operate in "point-to-point mode" scales the protocol by reducing the link failure detection times
Point-to-point mode improves convergence times on Ethernet
networks because it:

Prevents the election of a DR/BDR on the link,


Simplifies the SPF computations and reduces the router's memory
footprint due to a smaller topology database.

interface fastethernet0/2
ip ospf network point-to-point
156

Tuning OSPF (1)

DR/BDR Selection

ip ospf priority 100 (default 1)


This feature should be in use in your OSPF
network
Forcibly set your DR and BDR per segment so
that they are known
Choose your most powerful, or most idle routers,
so that OSPF converges as fast as possible under
maximum network load conditions
Try to keep the DR/BDR limited to one segment
each
157

Tuning OSPF (2)

OSPF startup

max-metric router-lsa on-startup wait-for-bgp


Avoids blackholing traffic on router restart
Causes OSPF to announce its prefixes with highest possible metric until iBGP is up and running
When iBGP is running, OSPF metrics return to normal, making the path valid

ISIS equivalent:

set-overload-bit on-startup wait-for-bgp

158

Tuning OSPF (3)

Hello/Dead Timers

ip ospf hello-interval 3 (default 10)


ip ospf dead-interval 15 (default is 4x hello)
This allows for faster network awareness of a failure,
and can result in faster reconvergence, but requires
more router CPU and generates more overhead

LSA Pacing

timers lsa-group-pacing 300 (default 240)


Allows grouping and pacing of LSA updates at configured
interval
Reduces overall network and router impact

159

Tuning OSPF (4)

OSPF Internal Timers

timers spf 2 8 (default is 5 and 10)


Allows you to adjust SPF characteristics
The first number sets wait time from topology
change to SPF run
The second is hold-down between SPF runs
BE CAREFUL WITH THIS COMMAND; if you're not sure when to use it, it means you don't need it; default is sufficient 95% of the time

160

Tuning OSPF (5)

LSA filtering/interface blocking

Per interface:

  ip ospf database-filter all out (no options)

Per neighbor:

  neighbor 1.1.1.1 database-filter all out (no options)

OSPF routers will flood an LSA out of all interfaces except the receiving one; LSA filtering can be useful in cases where such flooding is unnecessary (e.g. NBMA networks), where the DR/BDR can handle flooding chores

area <area-id> filter-list <acl>

  Filters out specific Type 3 LSAs at ABRs

Improper use can result in routing loops and black-holes that can be very difficult to troubleshoot
161

Summary
OSPF has a bewildering number of
features and options
Observe ISP best practices
Keep design and configuration simple
Investigate tuning options and suitability
for your own network

Don't just turn them on!

162

Deploying OSPF for


ISPs
ISP Training Workshops

163

Introduction to BGP
ISP Training Workshops

164

Border Gateway Protocol

A Routing Protocol used to exchange routing


information between different networks

Described in RFC4271

Exterior gateway protocol


RFC4276 gives an implementation report on BGP
RFC4277 describes operational experiences using BGP

The Autonomous System is the cornerstone of


BGP

It is used to uniquely identify networks with a common


routing policy

165

BGP
Path Vector Protocol
Incremental Updates
Many options for policy enforcement
Classless Inter Domain Routing (CIDR)
Widely used for Internet backbone
Autonomous systems

166

Path Vector Protocol

BGP is classified as a path vector routing


protocol (see RFC 1322)

A path vector protocol defines a route as a


pairing between a destination and the
attributes of the path to that destination.

12.6.126.0/24 207.126.96.43 1021 0 6461 7018 6337 11268 i

AS Path: 6461 7018 6337 11268
167

Path Vector Protocol


AS6337

AS11268

AS7018

AS500
AS6461
AS600
168

Definitions
Transit carrying traffic across a network,
usually for a fee
Peering exchanging routing information
and traffic
Default where to send traffic when there
is no explicit match in the routing table

169

Default Free Zone


The default free zone is made
up of Internet routers which
have explicit routing
information about the rest of
the Internet, and therefore do
not need to use a default route
NB: is not related to where an
ISP is in the hierarchy

170

Peering and Transit


example
provider A

IXP-West

Backbone
Provider D

IXP-East

provider B
provider C

A and B can peer, but need


transit arrangements with D to
get packets to/from C
171

Autonomous System (AS)


AS 100

Collection of networks with same routing policy


Single routing protocol
Usually under single ownership, trust and
administrative control
Identified by a unique 32-bit integer (ASN)

172

Autonomous System
Number (ASN)

Two ranges

  0-65535            (original 16-bit range)
  65536-4294967295   (32-bit range, RFC4893)

Usage:

  0 and 65535        (reserved)
  1-64495            (public Internet)
  64496-64511        (documentation, RFC5398)
  64512-65534        (private use only)
  23456              (represent 32-bit range in 16-bit world)
  65536-65551        (documentation, RFC5398)
  65552-4294967295   (public Internet)

32-bit range representation specified in RFC5396

  Defines "asplain" (traditional format) as standard notation
173
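
The usage table can be expressed as a small lookup. The ranges below are exactly those listed on the slide (later registry changes are not reflected); the function name and return strings are illustrative.

```python
def classify_asn(asn):
    """Return the usage category of an AS number, per the slide's table."""
    if not 0 <= asn <= 4294967295:
        raise ValueError("ASN must fit in 32 bits")
    if asn in (0, 65535):
        return "reserved"
    if asn == 23456:
        return "represents 32-bit range in 16-bit world"
    if 64496 <= asn <= 64511 or 65536 <= asn <= 65551:
        return "documentation"
    if 64512 <= asn <= 65534:
        return "private use only"
    return "public Internet"

for asn in (0, 100, 23456, 64500, 64512, 65540, 131076):
    print(asn, classify_asn(asn))
```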

Autonomous System
Number (ASN)

ASNs are distributed by the Regional Internet Registries

  They are also available from upstream ISPs who are members of one of the RIRs

Current 16-bit ASN allocations up to 61439 have been made to the RIRs

  Around 42000 are visible on the Internet

Each RIR has also received a block of 32-bit ASNs

  Out of 3100 assignments, around 2800 are visible on the Internet

See www.iana.org/assignments/as-numbers
174

Configuring BGP in Cisco


IOS

This command enables BGP in Cisco IOS:


router bgp 100

For ASNs > 65535, the AS number can be


entered in either plain or dot notation:
router bgp 131076
or

router bgp 2.4

IOS will display ASNs in plain notation by default


Dot notation is optional:
router bgp 2.4
bgp asnotation dot
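
The two notations in the example are the same number: in dot notation the part before the dot is the high-order 16 bits and the part after is the low-order 16 bits, so 2.4 = 2 x 65536 + 4 = 131076. A minimal conversion sketch (function names are illustrative):

```python
def asplain_to_asdot(asn):
    """Render an ASN in dot notation; 16-bit ASNs stay as plain integers."""
    high, low = divmod(asn, 65536)
    return f"{high}.{low}" if high else str(asn)

def asdot_to_asplain(text):
    """Convert 'high.low' (or a plain integer string) to a plain ASN."""
    if "." not in text:
        return int(text)
    high, low = (int(part) for part in text.split("."))
    return high * 65536 + low

print(asplain_to_asdot(131076))
print(asdot_to_asplain("2.4"))
```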

175

BGP Basics
Peering
A

AS 100

AS 101
D

Runs over TCP port 179


Path vector protocol
Incremental updates
Internal & External BGP

AS 102
176

Demarcation Zone (DMZ)


A

AS 100

DMZ
Network

AS 101
D

AS 102

DMZ is the link or network shared between ASes

177

BGP General Operation


Learns multiple paths via internal and
external BGP speakers
Picks the best path and installs it in the
routing table (RIB)
Best path is sent to external BGP
neighbours
Policies are applied by influencing the best
path selection

178

Constructing the
Forwarding Table

BGP in process

  receives path information from peers
  results of BGP path selection placed in the BGP table
  best path flagged

BGP out process

  announces best path information to peers

Best path stored in Routing Table (RIB)

Best paths in the RIB are installed in forwarding table (FIB) if:

  prefix and prefix length are unique
  lowest protocol distance
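
The "lowest protocol distance" rule can be sketched as below. The distance values are the Cisco IOS defaults; the candidate tuples, next-hop labels and function name are hypothetical and only illustrate the comparison, not the real RIB data structures.

```python
# Cisco IOS default administrative distances (lower is preferred).
DEFAULT_DISTANCE = {"connected": 0, "static": 1, "ebgp": 20,
                    "ospf": 110, "ibgp": 200}

def select_for_fib(candidates):
    """Keep, for each exact prefix/length, the route with the lowest distance.

    candidates is a list of (prefix, protocol, next_hop) tuples.
    """
    best = {}
    for prefix, protocol, next_hop in candidates:
        current = best.get(prefix)
        if (current is None
                or DEFAULT_DISTANCE[protocol] < DEFAULT_DISTANCE[current[0]]):
            best[prefix] = (protocol, next_hop)
    return best

rib = [
    ("10.0.0.0/8", "ospf", "A"),
    ("10.0.0.0/8", "ebgp", "B"),    # same prefix and length: eBGP wins
    ("10.1.0.0/16", "ibgp", "C"),   # different length: separate entry
]
print(select_for_fib(rib))
```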
179

Constructing the
Forwarding Table
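[Figure: the BGP in process receives everything from the peer and either discards or accepts it into the BGP table; best paths go to the routing table and on to the forwarding table; the BGP out process sends best paths out to the peer]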
180

eBGP & iBGP


BGP used internally (iBGP) and externally
(eBGP)
iBGP used to carry

Some/all Internet prefixes across ISP backbone


ISP's customer prefixes

eBGP used to

Exchange prefixes with other ASes


Implement routing policy

181

BGP/IGP model used in ISP


networks

Model representation

eBGP

eBGP

eBGP

iBGP

iBGP

iBGP

iBGP

IGP

IGP

IGP

IGP

AS1

AS2

AS3

AS4
182

External BGP Peering


(eBGP)
A

AS 100

AS 101

Between BGP speakers in different AS


Should be directly connected
Never run an IGP between eBGP peers

183

Configuring External BGP


Router A in AS100

interface ethernet 5/0
 ip address 102.102.10.2 255.255.255.240
!
router bgp 100
 network 100.100.8.0 mask 255.255.252.0
 neighbor 102.102.10.1 remote-as 101
 neighbor 102.102.10.1 prefix-list RouterC in
 neighbor 102.102.10.1 prefix-list RouterC out
!

Callouts:
  102.102.10.2: ip address on ethernet interface
  router bgp 100: Local ASN
  remote-as 101: Remote ASN
  102.102.10.1: ip address of Router C ethernet interface
  prefix-list RouterC in/out: Inbound and outbound filters
184

Configuring External BGP


Router C in AS101

interface ethernet 1/0/0
 ip address 102.102.10.1 255.255.255.240
!
router bgp 101
 network 100.100.64.0 mask 255.255.248.0
 neighbor 102.102.10.2 remote-as 100
 neighbor 102.102.10.2 prefix-list RouterA in
 neighbor 102.102.10.2 prefix-list RouterA out
!

Callouts:
  102.102.10.1: ip address on ethernet interface
  router bgp 101: Local ASN
  remote-as 100: Remote ASN
  102.102.10.2: ip address of Router A ethernet interface
  prefix-list RouterA in/out: Inbound and outbound filters
185

Internal BGP (iBGP)


BGP peer within the same AS
Not required to be directly connected

IGP takes care of inter-BGP speaker


connectivity

iBGP speakers must be fully meshed:

They originate connected networks


They pass on prefixes learned from outside the
ASN
They do not pass on prefixes learned from
other iBGP speakers
186

Internal BGP Peering


(iBGP)
AS 100
A

Topology independent
Each iBGP speaker must peer with every other
iBGP speaker in the AS
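
The full-mesh requirement means the number of iBGP sessions grows as n(n-1)/2. A quick sketch of the arithmetic (router names are hypothetical):

```python
from itertools import combinations

def ibgp_full_mesh(routers):
    """List every iBGP session needed so that each speaker peers with all others."""
    return list(combinations(routers, 2))

print(ibgp_full_mesh(["A", "B", "C"]))

# Session count for a growing network: n * (n - 1) / 2
for n in (3, 10, 50):
    print(n, "routers ->", len(ibgp_full_mesh(range(n))), "sessions")
```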

187

Peering between Loopback


Interfaces
AS 100
C

Peer with loop-back interface

Loop-back interface does not go down ever!

Do not want iBGP session to depend on state of a


single interface or the physical topology

188

Configuring Internal BGP


Router A in AS100

interface loopback 0
 ip address 105.3.7.1 255.255.255.255
!
router bgp 100
 network 100.100.1.0
 neighbor 105.3.7.2 remote-as 100
 neighbor 105.3.7.2 update-source loopback0
 neighbor 105.3.7.3 remote-as 100
 neighbor 105.3.7.3 update-source loopback0
!

Callouts:
  105.3.7.1: ip address on loopback interface
  router bgp 100: Local ASN
  remote-as 100: Local ASN (iBGP neighbours are in the same AS)
  105.3.7.2: ip address of Router B loopback interface

189

Configuring Internal BGP


Router B in AS100

interface loopback 0
 ip address 105.3.7.2 255.255.255.255
!
router bgp 100
 network 100.100.1.0
 neighbor 105.3.7.1 remote-as 100
 neighbor 105.3.7.1 update-source loopback0
 neighbor 105.3.7.3 remote-as 100
 neighbor 105.3.7.3 update-source loopback0
!

Callouts:
  105.3.7.2: ip address on loopback interface
  router bgp 100: Local ASN
  remote-as 100: Local ASN (iBGP neighbours are in the same AS)
  105.3.7.1: ip address of Router A loopback interface

190

Inserting prefixes into BGP

Two ways to insert prefixes into BGP

redistribute static
network command

191

Inserting prefixes into BGP

Configuration Example: redistribute static

router bgp 100
 redistribute static
ip route 102.10.32.0 255.255.254.0 serial0

Static route must exist before redistribute command will work
Forces origin to be "incomplete"
Care required!

192

Inserting prefixes into BGP

redistribute static

Care required with redistribute!

  redistribute <routing-protocol> means everything in the <routing-protocol> will be transferred into the current routing protocol
  Will not scale if uncontrolled
  Best avoided if at all possible
  redistribute normally used with route-maps and under tight administrative control

193

Inserting prefixes into BGP

Configuration Example: network command

router bgp 100
 network 102.10.32.0 mask 255.255.254.0
ip route 102.10.32.0 255.255.254.0 serial0

A matching route must exist in the routing table before the network is announced
Forces origin to be "IGP"

194

Configuring Aggregation

Three ways to configure route aggregation

redistribute static
aggregate-address
network command

195

Configuring Aggregation

Configuration Example:
router bgp 100
redistribute static
ip route 102.10.0.0 255.255.0.0 null0 250

static route to null0 is called a pull up route

  packets only sent here if there is no more specific match in the routing table
  distance of 250 ensures this is last resort static
  care required: see previously!
196

Configuring Aggregation
Network Command

Configuration Example

router bgp 100


network 102.10.0.0 mask 255.255.0.0
ip route 102.10.0.0 255.255.0.0 null0 250

A matching route must exist in the routing


table before the network is announced
Easiest and best way of generating an
aggregate

197

Configuring Aggregation
aggregate-address command

Configuration Example:
router bgp 100
 network 102.10.32.0 mask 255.255.252.0
 aggregate-address 102.10.0.0 255.255.0.0 [summary-only]

Requires more specific prefix in BGP table before aggregate is announced
summary-only keyword

  Optional keyword which ensures that only the summary is announced if a more specific prefix exists in the routing table

Summary
BGP neighbour status
Router6>sh ip bgp sum
BGP router identifier 10.0.15.246, local AS number 10
BGP table version is 16, main routing table version 16
7 network entries using 819 bytes of memory
14 path entries using 728 bytes of memory
2/1 BGP path/bestpath attribute entries using 248 bytes of memory
0 BGP route-map cache entries using 0 bytes of memory
0 BGP filter-list cache entries using 0 bytes of memory
BGP using 1795 total bytes of memory
BGP activity 7/0 prefixes, 14/0 paths, scan interval 60 secs

Neighbor        V    AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
10.0.15.241     4    10       9       8       16    0    0 00:04:47        2
10.0.15.242     4    10       6       5       16    0    0 00:01:43        2
10.0.15.243     4    10       9       8       16    0    0 00:04:49        2
...

Callouts:
  V: BGP Version
  MsgRcvd/MsgSent: Updates sent and received
  InQ/OutQ: Updates waiting

199

Summary
BGP Table
Router6>sh ip bgp
BGP table version is 16, local router ID is 10.0.15.246
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
              r RIB-failure, S Stale, m multipath, b backup-path, f RT-Filter,
              x best-external, a additional-path, c RIB-compressed,
Origin codes: i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

     Network          Next Hop            Metric LocPrf Weight Path
 *>i 10.0.0.0/26      10.0.15.241              0    100      0 i
 *>i 10.0.0.64/26     10.0.15.242              0    100      0 i
 *>i 10.0.0.128/26    10.0.15.243              0    100      0 i
 *>i 10.0.0.192/26    10.0.15.244              0    100      0 i
 *>i 10.0.1.0/26      10.0.15.245              0    100      0 i
 *>  10.0.1.64/26     0.0.0.0                  0         32768 i
 *>i 10.0.1.128/26    10.0.15.247              0    100      0 i
 *>i 10.0.1.192/26    10.0.15.248              0    100      0 i
 *>i 10.0.2.0/26      10.0.15.249              0    100      0 i
 *>i 10.0.2.64/26     10.0.15.250              0    100      0 i
...
200

Summary
BGP4 path vector protocol
iBGP versus eBGP
stable iBGP peer with loopbacks
announcing prefixes & aggregates

201

Introduction to BGP
ISP Training Workshops

202

BGP Policy Control


ISP Training Workshops

203

Applying Policy with BGP


Policy-based on AS path, community or
the prefix
Rejecting/accepting selected routes
Set attributes to influence path selection
Tools:

Prefix-list (filters prefixes)


Filter-list (filters ASes)
Route-maps and communities

204

Policy Control Prefix List

Per neighbour prefix filter

incremental configuration

Inbound or Outbound
Based upon network numbers (using
familiar IPv4 address/mask format)
Using access-lists in Cisco IOS for filtering
prefixes was deprecated long ago

Strongly discouraged!

205

Prefix-list Command Syntax

Syntax:
[no] ip prefix-list list-name [seq seq-value] permit|deny network/len [ge ge-value] [le le-value]

  network/len: The prefix and its length
  ge ge-value: "greater than or equal to"
  le le-value: "less than or equal to"

Both "ge" and "le" are optional

  Used to specify the range of the prefix length to be matched for prefixes that are more specific than network/len

Sequence number is also optional

  no ip prefix-list sequence-number to disable display of sequence numbers
206

Prefix Lists Examples

Deny default route


ip prefix-list EG deny 0.0.0.0/0

Permit the prefix 35.0.0.0/8


ip prefix-list EG permit 35.0.0.0/8

Deny the prefix 172.16.0.0/12


ip prefix-list EG deny 172.16.0.0/12

In 192/8 allow up to /24


ip prefix-list EG permit 192.0.0.0/8 le 24
This allows all prefix sizes in the 192.0.0.0/8 address block, apart from /25, /26, /27, /28, /29, /30, /31 and /32.
207

Prefix Lists Examples

In 192/8 deny /25 and above


ip prefix-list EG deny 192.0.0.0/8 ge 25
This denies all prefix sizes /25, /26, /27, /28, /29, /30, /31 and /32 in the address block 192.0.0.0/8.
It has the same effect as the previous example

In 193/8 permit prefixes between /12 and /20


ip prefix-list EG permit 193.0.0.0/8 ge 12 le 20
This denies all prefix sizes /8, /9, /10, /11, /21, /22,
and higher in the address block 193.0.0.0/8.

Permit all prefixes


ip prefix-list EG permit 0.0.0.0/0 le 32
0.0.0.0 matches all possible addresses, 0 le 32
matches all possible prefix lengths
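
The ge/le semantics in these examples can be modelled in a few lines. This is a sketch of the commonly documented behaviour (exact match without ge/le, first matching entry wins, implicit deny at the end), not IOS source; the function names and tuple layout are assumptions. The AS110-IN list is the one used in the example configuration in this section.

```python
import ipaddress

def entry_matches(route, network, ge=None, le=None):
    """True if route matches 'network/len [ge x] [le y]'."""
    r = ipaddress.ip_network(route)
    n = ipaddress.ip_network(network)
    if ge is None and le is None:
        low = high = n.prefixlen          # exact prefix length only
    else:
        low = n.prefixlen if ge is None else ge
        high = 32 if le is None else le
    # Length must be in range and the route must sit inside network/len.
    return low <= r.prefixlen <= high and r.subnet_of(n)

def prefix_list_permits(entries, route):
    """Evaluate entries in order; first match decides; no match means deny."""
    for action, network, ge, le in entries:
        if entry_matches(route, network, ge, le):
            return action == "permit"
    return False

AS110_IN = [("deny", "218.10.0.0/16", None, None),
            ("permit", "0.0.0.0/0", None, 32)]

print(entry_matches("192.168.1.0/24", "192.0.0.0/8", le=24))
print(entry_matches("192.168.1.0/25", "192.0.0.0/8", le=24))
print(prefix_list_permits(AS110_IN, "218.10.0.0/16"))
print(prefix_list_permits(AS110_IN, "105.7.0.0/16"))
```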

208

Policy Control Prefix List

Example Configuration
router bgp 100
network 105.7.0.0 mask 255.255.0.0
neighbor 102.10.1.1 remote-as 110
neighbor 102.10.1.1 prefix-list AS110-IN in
neighbor 102.10.1.1 prefix-list AS110-OUT out
!
ip prefix-list AS110-IN deny 218.10.0.0/16
ip prefix-list AS110-IN permit 0.0.0.0/0 le 32
ip prefix-list AS110-OUT permit 105.7.0.0/16
ip prefix-list AS110-OUT deny 0.0.0.0/0 le 32
209

Policy Control Filter List

Filter routes based on AS path

Inbound or Outbound

Example Configuration:
router bgp 100
network 105.7.0.0 mask 255.255.0.0
neighbor 102.10.1.1 filter-list 5 out
neighbor 102.10.1.1 filter-list 6 in
!
ip as-path access-list 5 permit ^200$
ip as-path access-list 6 permit ^150$
210

Policy Control Regular


Expressions

Like Unix regular expressions


.    Match one character
*    Match any number of preceding expression
+    Match at least one of preceding expression
^    Beginning of line
$    End of line
\    Escape a regular expression character
_    Beginning, end, white-space, brace
|    Or
()   brackets to contain expression
[]   brackets to contain number ranges
211

Policy Control Regular


Expressions

Simple Examples
.*             match anything
.+             match at least one character
^$             match routes local to this AS
_1800$         originated by AS1800
^1800_         received from AS1800
_1800_         via AS1800
_790_1800_     via AS1800 and AS790
_(1800_)+      multiple AS1800 in sequence
               (used to match AS-PATH prepends)
_\(65530\)_    via AS65530 (confederations)

212

Policy Control Regular


Expressions

Not so simple Examples


^[0-9]+$                  Match AS_PATH length of one
^[0-9]+_[0-9]+$           Match AS_PATH length of two
^[0-9]*_[0-9]+$           Match AS_PATH length of one or two
^[0-9]*_[0-9]*$           Match AS_PATH length of one or two
                          (will also match zero)
^[0-9]+_[0-9]+_[0-9]+$    Match AS_PATH length of three
_(701|1800)_              Match anything which has gone
                          through AS701 or AS1800
_1849(_.+_)12163$         Match anything of origin AS12163
                          and passed through AS1849
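
These expressions can be tried out with Python's `re` module. The sketch below is an approximation, not the IOS regex engine: it assumes the AS path is a space-separated string and translates `_` into an alternation of start, end, space, comma and brace/bracket characters. The helper name `as_path_matches` is made up for this example.

```python
import re

# Assumed translation of the IOS "_" metacharacter:
# beginning, end, white-space, comma, or a brace/bracket.
DELIM = r"(?:^|$|[ ,{}()])"

def as_path_matches(ios_regex, as_path):
    """Test an IOS-style AS-path regex against a path like '1849 701 12163'."""
    pattern = ios_regex.replace("_", DELIM)
    return re.search(pattern, as_path) is not None

print(as_path_matches("_1800$", "701 1800"))            # originated by AS1800
print(as_path_matches("^[0-9]+_[0-9]+$", "701 1800"))   # path length of two
print(as_path_matches("^$", ""))                        # local routes
```

Note that `_1800$` does not match a path ending in 11800, because `_` requires a delimiter immediately before the AS number.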

213

Policy Control Route Maps

A route-map is like a programme for IOS


Has line numbers, like programmes
Each line is a separate condition/action
Concept is basically:
if match then do expression and exit
else
if match then do expression and exit
else etc

Route-map continue lets ISPs apply multiple


conditions and actions in one route-map
214

Route Maps Caveats

Lines can have multiple set statements
Lines can have multiple match statements
Line with only a match statement

Only prefixes matching go through, the rest are dropped

Line with only a set statement

All prefixes are matched and set
Any following lines are ignored

Line with a match/set statement and no following lines

Only prefixes matching are set, the rest are dropped

215

Route Maps Caveats

Example

Omitting the third line below means that prefixes not
matching list-one or list-two are dropped

route-map sample permit 10


match ip address prefix-list list-one
set local-preference 120
!
route-map sample permit 20
match ip address prefix-list list-two
set local-preference 80
!
route-map sample permit 30 ! Don't forget this
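
The "if match then set and exit" behaviour, and the effect of leaving out line 30, can be modelled with a short Python sketch. Everything here is illustrative: the contents of `LIST_ONE` and `LIST_TWO` are invented (the slides do not define list-one or list-two), and prefix-list matching is reduced to exact membership.

```python
import ipaddress

LIST_ONE = ["10.1.0.0/16"]   # invented contents, not from the slides
LIST_TWO = ["10.2.0.0/16"]   # invented contents, not from the slides

def in_list(prefix, allowed):
    """Stand-in for a prefix-list: exact prefix membership only."""
    nets = {ipaddress.ip_network(p) for p in allowed}
    return ipaddress.ip_network(prefix) in nets

def make_route_map(with_line_30):
    lines = [
        (10, lambda p: in_list(p, LIST_ONE), {"local_pref": 120}),
        (20, lambda p: in_list(p, LIST_TWO), {"local_pref": 80}),
    ]
    if with_line_30:
        # No match statement: matches everything. No set: nothing changes.
        lines.append((30, lambda p: True, {}))
    return lines

def apply_route_map(route_map, prefix, attrs):
    """Return the updated attributes, or None if the prefix is dropped."""
    for _seq, match, sets in sorted(route_map, key=lambda line: line[0]):
        if match(prefix):
            return {**attrs, **sets}   # apply the set actions and exit
    return None                        # implicit deny at the end

print(apply_route_map(make_route_map(True), "172.20.0.0/16", {"local_pref": 100}))
print(apply_route_map(make_route_map(False), "172.20.0.0/16", {"local_pref": 100}))
```

With line 30 present the unmatched prefix passes through unchanged; without it the prefix is dropped.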
216

Route Maps Matching


prefixes

Example Configuration

router bgp 100


neighbor 1.1.1.1 route-map infilter in
!
route-map infilter permit 10
match ip address prefix-list HIGH-PREF
set local-preference 120
!
route-map infilter permit 20
match ip address prefix-list LOW-PREF
set local-preference 80
!
ip prefix-list HIGH-PREF permit 10.0.0.0/8
ip prefix-list LOW-PREF permit 20.0.0.0/8
217

Route Maps AS-PATH


filtering

Example Configuration

router bgp 100


neighbor 102.10.1.2 remote-as 200
neighbor 102.10.1.2 route-map filter-on-as-path in
!
route-map filter-on-as-path permit 10
match as-path 1
set local-preference 80
!
route-map filter-on-as-path permit 20
match as-path 2
set local-preference 200
!
ip as-path access-list 1 permit _150$
ip as-path access-list 2 permit _210_
218

Route Maps AS-PATH


prepends

Example configuration of AS-PATH prepend


router bgp 300
network 105.7.0.0 mask 255.255.0.0
neighbor 2.2.2.2 remote-as 100
neighbor 2.2.2.2 route-map SETPATH out
!
route-map SETPATH permit 10
set as-path prepend 300 300

Use your own AS number when prepending

Otherwise BGP loop detection may cause disconnects


219

Route Maps Matching


Communities

Example Configuration

router bgp 100


neighbor 102.10.1.2 remote-as 200
neighbor 102.10.1.2 route-map filter-on-community in
!
route-map filter-on-community permit 10
match community 1
set local-preference 50
!
route-map filter-on-community permit 20
match community 2 exact-match
set local-preference 200
!
ip community-list 1 permit 150:3 200:5
ip community-list 2 permit 88:6
220

Community-List Processing

Note:

When multiple values are configured in the same


community list statement, a logical AND condition is
created. All community values must match to satisfy
an AND condition
ip community-list 1 permit 150:3 200:5

When multiple values are configured in separate


community list statements, a logical OR condition is
created. The first list that matches a condition is
processed
ip community-list 1 permit 150:3
ip community-list 1 permit 200:5
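
The AND/OR rule can be expressed compactly in Python. This is a sketch under simplifying assumptions: each community-list statement is modelled as a set of required values, only permit statements are considered, and the `exact-match` keyword is not modelled. The function name is hypothetical.

```python
def community_list_permits(statements, route_communities):
    """statements: one set of community values per 'permit' line."""
    route = set(route_communities)
    # AND within a statement (all values required),
    # OR across statements (any one statement may match).
    return any(required <= route for required in statements)

AND_LIST = [{"150:3", "200:5"}]      # ip community-list 1 permit 150:3 200:5
OR_LIST = [{"150:3"}, {"200:5"}]     # two separate permit statements

print(community_list_permits(AND_LIST, ["150:3"]))
print(community_list_permits(OR_LIST, ["150:3"]))
```

A route carrying only 150:3 fails the single-statement (AND) form but passes the two-statement (OR) form.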
221

Route Maps Setting


Communities

Example Configuration
router bgp 100
network 105.7.0.0 mask 255.255.0.0
neighbor 102.10.1.1 remote-as 200
neighbor 102.10.1.1 send-community
neighbor 102.10.1.1 route-map set-community out
!
route-map set-community permit 10
match ip address prefix-list NO-ANNOUNCE
set community no-export
!
route-map set-community permit 20
match ip address prefix-list AGGREGATE
!
ip prefix-list NO-ANNOUNCE permit 105.7.0.0/16 ge 17
ip prefix-list AGGREGATE permit 105.7.0.0/16

222

Route Map Continue

Handling multiple conditions and actions in one route-map (for BGP neighbour relationships only)
route-map peer-filter permit 10
match ip address prefix-list group-one
continue 30
set metric 2000
!
route-map peer-filter permit 20
match ip address prefix-list group-two
set community no-export
!
route-map peer-filter permit 30
match ip address prefix-list group-three
set as-path prepend 100 100
!
223

Order of processing BGP


policy

For policies applied to a specific BGP


neighbour, the following sequence is
applied:

For inbound updates, the order is:


Route-map
Filter-list
Prefix-list

For outbound updates, the order is:


Prefix-list
Filter-list
Route-map

224

Managing Policy Changes

New policies only apply to the updates going
through the router AFTER the policy has been
introduced or changed
To apply a policy change to the entire BGP table
the router holds, the BGP peerings need to be
refreshed
This is done by clearing the BGP session either in or out,
for example:
clear ip bgp <neighbour-addr> in|out

Do NOT forget in or out: omitting it results in a
hard reset of the BGP session
225

Managing Policy Changes

Ability to clear the BGP sessions of groups of


neighbours configured according to several
criteria
clear ip bgp <addr> [in|out]
<addr> may be any of the following
x.x.x.x              IP address of a peer
*                    all peers
ASN                  all peers in an AS
external             all external peers
peer-group <name>    all peers in a peer-group

226

BGP Policy Control


ISP Training Workshops

227

Internet Exchange
Point Design
ISP Training Workshops

228

IXP Design
Background
Why set up an IXP?
Layer 2 Exchange Point
Layer 3 Exchange Point
Design Considerations
Route Collectors & Servers
What can go wrong?

229

A bit of history
In a time long gone

230

A Bit of History
End of NSFnet: one major backbone
Move towards commercial Internet

Need for coordination of routing exchange


between providers

Private companies selling their bandwidth

Traffic from ISP A needs to get to ISP B

Routing Arbiter project created to facilitate


this

231

What is an Exchange Point

Network Access Points (NAPs) established


at end of NSFnet

The original exchange points

Major providers connect their networks


and exchange traffic
High-speed network or ethernet switch
Simple concept: any place where
providers come together to exchange
traffic

232

Internet Exchange Points

Layer 2 exchange point

Ethernet (100Gbps/10Gbps/1Gbps/100Mbps)
Older technologies include ATM, Frame Relay,
SRP, FDDI and SMDS

Layer 3 exchange point

Router based
Has historical status now

233

Why an Internet
Exchange Point?
Saving money, improving QoS,
Generating a local Internet
economy

234

Internet Exchange Point


Why peer?

Consider a region with one ISP

They provide internet connectivity to their customers
They have one or two international connections

Internet grows, another ISP sets up in competition

They provide internet connectivity to their customers
They have one or two international connections

How does traffic from customer of one ISP get to


customer of the other ISP?

Via the international connections

235

Internet Exchange Point


Why peer?

Yes, International Connections

If satellite, RTT is around 550ms per hop


So local traffic takes over 1s round trip

International bandwidth

Costs significantly more than domestic


bandwidth
Congested with local traffic
Wastes money, harms performance

236

Internet Exchange Point


Why peer?

Solution:

Two competing ISPs peer with each other

Result:

Both save money


Local traffic stays local
Better network performance, better QoS,
More international bandwidth for expensive
international traffic
Everyone is happy
237

Internet Exchange Point


Why peer?

A third ISP enters the equation

Becomes a significant player in the region


Local and international traffic goes over their
international connections

They agree to peer with the two other


ISPs

To save money
To keep local traffic local
To improve network performance, QoS,
238

Internet Exchange Point


Why peer?

Private peering means that the three ISPs


have to buy circuits between each other

Works for three ISPs, but adding a fourth or a


fifth means this does not scale

Solution:

Internet Exchange Point

239

Internet Exchange Point

Every participant has to buy just one


whole circuit

From their premises to the IXP

Rather than N-1 half circuits to connect to


the N-1 other ISPs

With 5 ISPs, each has to buy 4 half circuits = 2 whole
circuits, already twice the cost of the IXP
connection
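
The circuit arithmetic generalises to any number of ISPs. The sketch below simply restates the slide's reasoning in Python, treating all circuits as equally priced (an assumption); the function names are invented for illustration.

```python
def private_peering_whole_circuits(n_isps):
    """Each ISP buys N-1 half circuits; two halves make one whole circuit."""
    return (n_isps - 1) / 2

def ixp_whole_circuits(n_isps):
    """Each ISP buys a single whole circuit to the IXP, whatever N is."""
    return 1

for n in (3, 5, 10):
    print(n, private_peering_whole_circuits(n), ixp_whole_circuits(n))
```

The per-ISP cost of private peering grows with N, while the IXP connection stays at one circuit.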

240

Internet Exchange Point

Solution

Every ISP participates in the IXP


Cost is minimal: one local circuit covers all domestic
traffic
International circuits are used for just international
traffic and backing up domestic links in case the IXP
fails

Result:

Local traffic stays local


QoS considerations for local traffic are not an issue
RTTs are typically sub 10ms
Customers enjoy the Internet experience
Local Internet economy grows rapidly
241

Layer 2 Exchange
The traditional IXP

242

IXP Design

Very simple concept:

Ethernet switch is the interconnection media

IXP is one LAN

Each ISP brings a router, connects it to the


ethernet switch provided at the IXP
Each ISP peers with other participants at the
IXP using BGP

Scaling this simple concept is the


challenge for the larger IXPs
243

Layer 2 Exchange
ISP 6

ISP 5

ISP 4

IXP Services:
Root & TLD DNS,
Routing Registry

Ethernet Switch

Looking Glass, etc

ISP 1

ISP 2

IXP
Management
Network

ISP 3
244

Layer 2 Exchange
ISP 6

ISP 5

ISP 4

IXP Services:

IXP
Management
Network

Root & TLD DNS,


Routing Registry

Ethernet Switches

Looking Glass, etc

ISP 1

ISP 2

ISP 3
245

Layer 2 Exchange
Two switches for redundancy
ISPs use dual routers for redundancy or
loadsharing
Offer services for the common good

Internet portals and search engines


DNS Root & TLDs, NTP servers
Routing Registry and Looking Glass

246

Layer 2 Exchange

Requires neutral IXP management

Usually funded equally by IXP participants


24x7 cover, support, value add services

Secure and neutral location


Configuration

Private address space if non-transit and no


value add services
Otherwise public IPv4 (/24) and IPv6 (/64)
ISPs require AS, basic IXP does not
247

Layer 2 Exchange

Network Security Considerations

LAN switch needs to be securely configured


Management routers require TACACS+
authentication, vty security
IXP services must be behind router(s) with
strong filters

248

Layer 3 IXP
Layer 3 IXP is marketing concept used by
Transit ISPs
Real Internet Exchange Points are only
Layer 2

249

IXP Design
Considerations

250

Exchange Point Design

The IXP Core is an Ethernet switch

It must be a managed switch

Has superseded all other types of network


devices for an IXP

From the cheapest and smallest managed 12


or 24 port 10/100 switch
To the largest switches now handling high
densities of 10GE and 100GE interfaces

251

Exchange Point Design


Each ISP participating in the IXP brings a
router to the IXP location
Router needs:

One Ethernet port to connect to IXP switch


One WAN port to connect to the WAN media
leading back to the ISP backbone
To be able to run BGP

252

Exchange Point Design

IXP switch located in one equipment rack


dedicated to IXP

Also includes other IXP operational equipment

Routers from participant ISPs located in


neighbouring/adjacent rack(s)
Copper (UTP) connections made for
10Mbps, 100Mbps or 1Gbps connections
Fibre used for 1Gbps, 10Gbps, 40Gbps or
100Gbps connections

253

Peering

Each participant needs to run BGP

They need their own AS number


Public ASN, NOT private ASN

Each participant configures external BGP


directly with the other participants in the
IXP

Peering with all participants


or
Peering with a subset of participants
254

Peering (more)

Mandatory Multi-Lateral Peering (MMLP)

Each participant is forced to peer with every other
participant as part of their IXP membership
Has no history of success; the practice is strongly
discouraged

Multi-Lateral Peering (MLP)

Each participant peers with every other participant
(usually via a Route Server)

Bi-Lateral Peering

Participants set up peering with each other according to
their own requirements and business relationships
This is the most common situation at IXPs today
255

Routing

ISP border routers at the IXP must NOT be


configured with a default route or carry the full
Internet routing table

Carrying default or full table means that this router and


the ISP network is open to abuse by non-peering IXP
members
Correct configuration is only to carry routes offered to
IXP peers on the IXP peering router

Note: Some ISPs offer transit across IX fabrics

They do so at their own risk (see above)

256

Routing (more)

ISP border routers at the IXP should not


be configured to carry the IXP LAN
network within the IGP or iBGP

Use next-hop-self BGP concept

Don't generate ISP prefix aggregates on


IXP peering router

If connection from backbone to IXP router goes


down, normal BGP failover will then be
successful

257

Address Space

Some IXPs use private addresses for the IX LAN

Public address space means IXP network could be


leaked to Internet which may be undesirable
Because most ISPs filter RFC1918 address space, this
avoids the problem

Some IXPs use public addresses for the IX LAN

Address space available from the RIRs


IXP terms of participation often forbid the IX LAN to be
carried in the ISP member backbone

258

Hardware

Try not to mix port speeds

if 10Mbps and 100Mbps connections available,
terminate on different switches (L2 IXP)

Don't mix transports

if terminating ATM PVCs and G/F/Ethernet,
terminate on different devices

Insist that IXP participants bring their own router

moves buffering problem off the IXP
security is responsibility of the ISP, not the IXP
259

Charging

IXPs should be run at minimal cost to participants
Examples:

Datacentre hosts IX for free

Because ISP participants then use data centre for co-lo
services, and the datacentre benefits long term

IX operates cost recovery

Each member pays a flat fee towards the cost of the switch,
hosting, power & management

Different pricing for different ports

One slot may handle 24 10GE ports
Or one slot may handle 96 1GE ports
96 port 1GE card is tenth price of 24 port 10GE card
Relative port cost is passed on to participants
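
As a worked example of the relative port cost, the figures on the slide can be turned into a per-port ratio. The card price is an arbitrary unit chosen for this sketch; only the ratio is meaningful.

```python
from fractions import Fraction

def per_port_cost(card_price, ports):
    """Cost of one port when the card price is spread evenly over its ports."""
    return Fraction(card_price) / ports

price_10ge_card = Fraction(1)            # arbitrary unit (assumption)
price_1ge_card = price_10ge_card / 10    # "tenth price" from the slide

ratio = per_port_cost(price_10ge_card, 24) / per_port_cost(price_1ge_card, 96)
print(ratio)   # how many times more a 10GE port costs than a 1GE port
```

On these figures a 10GE port costs 40 times as much as a 1GE port, which is the difference passed on to participants.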
260

Services Offered

Services offered should not compete with


member ISPs (basic IXP)

e.g. web hosting at an IXP is a bad idea unless


all members agree to it

IXP operations should make performance


and throughput statistics available to
members

Use tools such as MRTG/Cacti to produce IX


throughput graphs for member (or public)
information
261

Services to Offer

ccTLD DNS

the country IXP could host the country's top level DNS
e.g. "SE." TLD is hosted at Netnod IXes in Sweden
Offer back up of other country ccTLD DNS

Root server

Anycast instances of I.root-servers.net, F.root-servers.net etc are present at many IXes

Usenet News

Usenet News is high volume
could save bandwidth to all IXP members

262

Services to Offer

Route Collector

Route collector shows the reachability


information available at the exchange
Technical detail covered later on

Looking Glass

One way of making the Route Collector routes


available for global view (e.g.
www.traceroute.org)
Public or members only access

263

Services to Offer

Content Redistribution/Caching

For example, Akamised update distribution service

Network Time Protocol

Locate a stratum 1 time source (GPS receiver,
atomic clock, etc) at IXP

Routing Registry

Used to register the routing policy of the IXP
membership (more later)
264

Introduction to
Route Collectors
What routes are available at the
IXP?

265

What is a Route Collector?


Usually a router or Unix system running
BGP
Gathers routing information from service
provider routers at an IXP

Peers with each ISP using BGP

Does not forward packets


Does not announce any prefixes to ISPs

266

Purpose of a Route
Collector

To provide a public view of the Routing


Information available at the IXP

Useful for existing members to check


functionality of BGP filters
Useful for prospective members to check value
of joining the IXP
Useful for the Internet Operations community
for troubleshooting purposes

E.g. www.traceroute.org

267

Route Collector at an IXP

R3
R2

R1

R4

SWITCH

Route Collector

R5

268

Route Collector
Requirements

Router or Unix system running BGP

Peers eBGP with every IXP member

Minimal memory requirements: only holds IXP routes
Minimal packet forwarding requirements: doesn't
forward any packets
Accepts everything; Gives nothing
Uses a private ASN
Connects to IXP Transit LAN

Back end connection

Second Ethernet globally routed


Connection to IXP Website for public access
269

Route Collector
Implementation

Most IXPs now implement some form of


Route Collector
Benefits already mentioned
Great public relations tool
Unsophisticated requirements

Just runs BGP

270

Introduction to
Route Servers
How to scale very large IXPs

271

What is a Route Server?


Has all the features of a Route Collector
But also:

Announces routes to participating IXP


members according to their routing policy
definitions

Implemented using the same specification


as for a Route Collector

272

Features of a Route Server


Helps scale routing for large IXPs
Simplifies Routing Processes on ISP
Routers
Optional participation

Provided as service, is NOT mandatory

Does result in insertion of RS Autonomous


System Number in the Routing Path
Optionally uses Policy registered in IRR

273

Diagram of N-squared
Peering Mesh

For large IXPs (dozens of participants)
maintaining a larger peering mesh becomes
cumbersome and often too hard

274

Peering Mesh with Route


Servers

RS

RS

ISP routers peer with the Route Servers

Only need to have two eBGP sessions rather


than N
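
The scaling argument can be put into numbers. The sketch below counts eBGP sessions for a full bilateral mesh versus peering only with the route servers. It assumes every participant peers with every other participant in the mesh case (so N-1 sessions each, which the slide rounds to N) and that there are two route servers, as in the diagram. Function names are illustrative.

```python
def full_mesh_sessions_per_isp(n_participants):
    """Sessions each ISP maintains when peering with everyone directly."""
    return n_participants - 1

def full_mesh_sessions_total(n_participants):
    """Total sessions across the exchange in a full bilateral mesh."""
    return n_participants * (n_participants - 1) // 2

def route_server_sessions_per_isp(n_route_servers=2):
    """Sessions each ISP maintains when peering only with the route servers."""
    return n_route_servers

for n in (10, 50, 100):
    print(n, full_mesh_sessions_per_isp(n), full_mesh_sessions_total(n),
          route_server_sessions_per_isp())
```

The total grows roughly with the square of the number of participants, which is where the "N-squared" name comes from.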
275

RS based Exchange Point


Routing Flow

RS

TRAFFIC FLOW
ROUTING INFORMATION FLOW
276

Advantages of Using a
Route Server

Advantageous for large IXPs

Helps scale eBGP mesh


Helps scale prefix distribution

Separation of Routing and Forwarding


Simplifies BGP Configuration Management
on ISP routers

277

Disadvantages of using a
Route Server

ISPs can lose direct policy control

Completely dependent on 3rd party

If RS is only peer, ISPs have no control over


who their prefixes are distributed to
Configuration, troubleshooting, etc

Insertion of RS ASN into routing path

(If using a router rather than a dedicated


route-server BGP implementation)
Traffic engineering/multihoming needs more
care
278

Typical usage of a Route


Server

Route Servers may be provided as an


OPTIONAL service

Most common at large IXPs (>50 participants)


Examples: LINX, TorIX, AMS-IX, etc

ISPs peer:

Directly with significant peers


With Route Server for the rest

279

Things to think about...

Would using a route server benefit you?

Helpful when BGP knowledge is limited (but is


NOT an excuse not to learn BGP)
Avoids having to maintain a large number of
eBGP peers
But can you afford to lose policy control? (An
ISP not in control of their routing policy is
what?)

280

What can go
wrong
The different ways IXP
operators harm their IXP

281

What can go wrong?


Concept

Some Service Providers attempt to cash in


on the reputation of IXPs
Market Internet transit services as
Internet Exchange Point

We are exchanging packets with other ISPs,


so we are an Internet Exchange Point!
So-called Layer-3 Exchanges are really Internet
Transit Providers
Router used rather than a Switch
Most famous example: SingTelIX
282

What can go wrong?


Financial

Some IXPs price the IX out of the means of


most providers

IXP is intended to encourage local peering


Acceptable charging model is minimally cost-recovery only

Some IXPs charge for port traffic

IXPs are not a transit service, charging for traffic


puts the IX in competition with members
(There is nothing wrong with charging different flat
fees for 100Mbps, 1Gbps, 10Gbps etc ports as they
all have different hardware costs on the switch.)
283

What can go wrong?


Competition

Too many exchange points in one locale

Competing exchanges defeat the purpose

Becomes expensive for ISPs to connect to


all of them

An IXP:

is NOT a competition
is NOT a profit making business

284

What can go wrong?


Rules and Restrictions

IXPs try to compete with their membership

Offering services that ISPs would/do offer their
customers

IXPs run as a closed privileged club e.g.:

Restrictive membership criteria

IXPs providing access to end users rather than
just Service Providers
IXPs interfering with ISP business decisions e.g.
Mandatory Multi-Lateral Peering

285

What can go wrong?


Technical Design Errors

Interconnected IXPs

IXP in one location believes it should connect


directly to the IXP in another location
Who pays for the interconnect?
How is traffic metered?
Competes with the ISPs who already provide
transit between the two locations (who then
refuse to join IX, harming the viability of the
IX)
Metro interconnections work ok (e.g. LINX,
AMS-IX, DE-CIX etc)
286

What can go wrong?


Technical Design Errors

ISPs bridge the IXP LAN back to their


offices

"We are poor, we can't afford a router"


Financial benefits of connecting to an IXP far
outweigh the cost of a router
In reality it allows the ISP to connect any
devices to the IXP LAN with disastrous
consequences for the security, integrity and
reliability of the IXP

287

What can go wrong?


Routing Design Errors

Route Server implemented from Day One

ISPs have no incentive to learn BGP


Therefore have no incentive to understand
peering relationships, peering policies, &c
Entirely dependent on operator of RS for
troubleshooting, configuration, reliability

RS can't be run by committee!

Route Server is to help scale peering at


LARGE IXPs
288

What can go wrong?


Routing Design Errors

iBGP Route Reflector used to distribute prefixes


between IXP participants
Claimed Advantage (1):

Participants don't need to know about or run BGP

Actually a Disadvantage

IXP Operator has to know BGP


ISP not knowing BGP is big commercial disadvantage
ISPs who would like to have a growing successful
business need to be able to multi-home, peer with other
ISPs, etc; these activities require BGP

289

What can go wrong?


Routing Design Errors (cont)
Route Reflector Claimed Advantage (2):
Allows an IXP to be started very quickly

Fact:

IXP is only an Ethernet switch: setting up an
iBGP mesh with participants is no quicker than
setting up an eBGP mesh

290

What can go wrong?


Routing Design Errors (cont)
Route Reflector Claimed Advantage (3):
IXP operator has full control over IXP activities

Actually a Disadvantage

ISP participants surrender control of:


Their border router; it is located in IXP's AS
Their routing and peering policy

IXP operator is single point of failure


If they aren't available 24x7, then neither is the IXP
BGP configuration errors by IXP operator have real
impacts on ISP operations

291

What can go wrong?


Routing Design Errors (cont)
Route Reflector Disadvantage (4):
Migration from Route Reflector to correct
routing configuration is highly non-trivial

ISP router is in IXP's ASN

Need to move ISP router from IXP's ASN to the ISP's ASN
Need to reconfigure BGP on ISP router, add to ISP's
IGP and iBGP mesh, and set up eBGP with IXP
participants and/or the IXP Route Server

292

More Information

293

Exchange Point
Policies & Politics

AUPs

Acceptable Use Policy
Minimal rules for connection

Fees?

Some IXPs charge no fee
Other IXPs charge cost recovery
A few IXPs are commercial

Nobody is obliged to peer

Agreements left to ISPs, not mandated by IXP


294

Exchange Point etiquette


Don't point default route at another IXP
participant
Be aware of third-party next-hop
Only announce your aggregate routes

Read RIPE-399 first


www.ripe.net/docs/ripe-399.html

Filter! Filter! Filter!

295

Exchange Point Examples

LINX in London, UK
TorIX in Toronto, Canada
AMS-IX in Amsterdam, Netherlands
SIX in Seattle, Washington, US
PA-IX in Palo Alto, California, US
JPNAP in Tokyo, Japan
DE-CIX in Frankfurt, Germany
HK-IX in Hong Kong

All use Ethernet Switches


296

Features of IXPs (1)

Redundancy & Reliability

Multiple switches, UPS

Support

NOC to provide 24x7 support for problems at
the exchange

DNS, Route Collector, Content & NTP servers

ccTLD & root servers
Content redistribution systems such as Akamai
Route Collector: Routing Table view
297

Features of IXPs (2)

Location

neutral co-location facilities

Address space

Peering LAN

AS Number

If using Route Collector/Server

Route servers (optional, for larger IXPs)
Statistics

Traffic data for membership


298

More info about IXPs

http://www.pch.net/documents

Another excellent resource of IXP locations,
papers, IXP statistics, etc

http://www.telegeography.com/ee/ix/index.php

A collection of IXPs and interconnect points for ISPs

299

Summary

L2 IXP most commonly deployed

The core is an ethernet switch


ATM and other old technologies are obsolete

L3 IXP nowadays is a marketing concept


used by wholesale ISPs

Does not offer the same flexibility as L2


Not recommended unless there are overriding
regulatory or political reasons to do so
Avoid!
300

Internet Exchange
Point Design
ISP Training Workshops

301

BGP Configuration
for IXPs
ISP Training Workshops

302

Background

This presentation covers the BGP


configurations required for a participant at
an Internet Exchange Point

It does not cover the technical design of an IXP


Nor does it cover the financial and operational
benefits of participating in an IXP
See the IXP Design Presentation that is part of
this Workshop Material set for financial,
technical and operational details

303

Recap: Definitions

Transit: carrying traffic across a network,
usually for a fee

Traffic and prefixes originating from one AS are
carried across an intermediate AS to reach
their destination AS

Peering: private interconnect between
two ASNs, usually for no fee
Internet Exchange Point: common
interconnect location where several ASNs
exchange routing information and traffic

304

IXP Peering Issues


Only announce your aggregates and your
customer aggregates at IXPs
Only accept the aggregates which your
peer is entitled to originate
Never carry a default route on an IXP (or
private) peering router

305

ISP Transit Issues


Many mistakes are made on
the Internet today due to
incomplete understanding of
how to configure BGP for
peering at Internet
Exchange Points
306

Simple BGP
Configuration
example
Exchange Point Configuration

307

Exchange Point Example

Exchange point with 6 ASes present

Layer 2 ethernet switch

Each ISP peers with the other

NO transit across the IXP is allowed

308

Exchange Point
AS150

AS100

AS110

AS120

AS140

AS130

Each of these represents a border router in a different


autonomous system

309

Router configuration

IXP router is usually located at the


Exchange Point premises

Create a peer-group for IXP peers

Configuration needs to be such that


disconnecting it from the backbone does not
cause routing loops or traffic blackholes
All outbound policy to each peer will be the
same

Ensure the router is not carrying the


default route

Or the full routing table (for that matter)


310

Creating a peer-group &


route-map
router bgp 100
neighbor ixp-peers peer-group
neighbor ixp-peers send-community
neighbor ixp-peers prefix-list my-prefixes out
neighbor ixp-peers route-map set-local-pref in
!
! Only allow AS100 address block to IXP peers
ip prefix-list my-prefixes permit 121.10.0.0/19
!
! Prefixes heard from IXP peers have highest preference
route-map set-local-pref permit 10
set local-preference 150
!

311

Interface and BGP configuration (1)
interface fastethernet 0/0
description Exchange Point LAN
ip address 120.5.10.1 255.255.255.224
! IXP LAN BCP configuration
no ip directed-broadcast
no ip proxy-arp
no ip redirects
!
router bgp 100
neighbor 120.5.10.2 remote-as 110
neighbor 120.5.10.2 peer-group ixp-peers
neighbor 120.5.10.2 prefix-list peer110 in
neighbor 120.5.10.3 remote-as 120
neighbor 120.5.10.3 peer-group ixp-peers
neighbor 120.5.10.3 prefix-list peer120 in
312

Interface and BGP


Configuration (2)
! Peer-group applied to each peer
! Each peer has own inbound filter
neighbor 120.5.10.4 remote-as 130
neighbor 120.5.10.4 peer-group ixp-peers
neighbor 120.5.10.4 prefix-list peer130 in
neighbor 120.5.10.5 remote-as 140
neighbor 120.5.10.5 peer-group ixp-peers
neighbor 120.5.10.5 prefix-list peer140 in
neighbor 120.5.10.6 remote-as 150
neighbor 120.5.10.6 peer-group ixp-peers
neighbor 120.5.10.6 prefix-list peer150 in
!
ip route 121.10.0.0 255.255.224.0 null0
!
ip prefix-list peer110 permit 122.0.0.0/19
ip prefix-list peer120 permit 122.30.0.0/19
ip prefix-list peer130 permit 122.12.0.0/19
ip prefix-list peer140 permit 122.18.128.0/19
ip prefix-list peer150 permit 122.1.32.0/19

313

Exchange Point
Configuration of the other routers in the
AS is similar in concept
Notice inbound and outbound prefix filters

outbound announces my-prefixes only


inbound accepts peer prefixes only

Notice inbound route-map

Set local preference higher than default


ensures that if the same prefix is heard via
AS100 upstream, the best path for traffic is via
the IXP
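
The effect of the inbound route-map can be illustrated with a toy best-path comparison. This sketch models only the local preference step (higher wins, IOS default 100) and ignores weight and all later tie-breakers. The path dictionaries and the `best_path` helper are invented for illustration.

```python
DEFAULT_LOCAL_PREF = 100   # IOS default when no local preference is set

def best_path(paths):
    """Pick the path with the highest local preference (this step only)."""
    return max(paths, key=lambda p: p.get("local_pref", DEFAULT_LOCAL_PREF))

# The same prefix heard twice by AS100: once from an upstream,
# once from an IXP peer where set-local-pref has applied 150.
paths = [
    {"via": "upstream"},                        # default local preference
    {"via": "IXP peer", "local_pref": 150},
]

print(best_path(paths)["via"])
```

Because 150 is higher than the default of 100, traffic for that prefix leaves via the IXP rather than the upstream.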
314

Exchange Point

Ethernet port configuration

Be aware of LAN configuration best practices


Switch off proxy arp, redirects and broadcasts
(if not already default)

IXP border router must NOT carry prefixes


with origin outside local AS and IXP
participant ASes

Helps prevent stealing of bandwidth

315

Exchange Point

Issues:

AS100 needs to know all the prefixes its peers


are announcing
New prefixes require the prefix-lists to be
updated

Alternative solutions

Use the Internet Routing Registry to build


prefix list
Use AS Path filters (could be risky)

316

More Complex BGP


example
Exchange Point Configuration

317

Exchange Point Example

Exchange point with 6 ASes present

Layer 2 ethernet switch

Each ISP peers with the other

NO transit across the IXP allowed


ISPs at exchange points provide transit to their
BGP customers

318

Exchange Point
AS200
AS201

AS110

AS120

AS150

AS100

AS140

AS130

Each of these represents a border router in a different


autonomous system

319

Exchange Point
Router A configuration
interface fastethernet 0/0
description Exchange Point LAN
ip address 120.5.10.1 255.255.255.224
no ip directed-broadcast
no ip proxy-arp
no ip redirects
!
router bgp 100
neighbor ixp-peers peer-group
neighbor ixp-peers send-community
! Filter by ASN rather than by prefix and block bogons too
neighbor ixp-peers prefix-list bogons out
neighbor ixp-peers filter-list 10 out
neighbor ixp-peers route-map set-local-pref in
...next slide
320

Exchange Point
 neighbor 120.5.10.2 remote-as 110
 neighbor 120.5.10.2 peer-group ixp-peers
 neighbor 120.5.10.2 prefix-list peer110 in
 neighbor 120.5.10.3 remote-as 120
 neighbor 120.5.10.3 peer-group ixp-peers
 neighbor 120.5.10.3 prefix-list peer120 in
 neighbor 120.5.10.4 remote-as 130
 neighbor 120.5.10.4 peer-group ixp-peers
 neighbor 120.5.10.4 prefix-list peer130 in
 neighbor 120.5.10.5 remote-as 140
 neighbor 120.5.10.5 peer-group ixp-peers
 neighbor 120.5.10.5 prefix-list peer140 in
 neighbor 120.5.10.6 remote-as 150
 neighbor 120.5.10.6 peer-group ixp-peers
 neighbor 120.5.10.6 prefix-list peer150 in
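
The per-peer statements follow a fixed three-line pattern, so they lend themselves to being generated rather than typed. The following is a minimal sketch, assuming the peer addresses and ASNs shown on this slide; the function name `neighbor_config` and the `PEERS` table are hypothetical and introduced only for illustration.

```python
# IXP LAN address of each peer mapped to its ASN, as used in this example.
PEERS = {
    "120.5.10.2": 110,
    "120.5.10.3": 120,
    "120.5.10.4": 130,
    "120.5.10.5": 140,
    "120.5.10.6": 150,
}


def neighbor_config(peers, peer_group="ixp-peers"):
    """Return the three 'neighbor' lines needed for every IXP peer."""
    lines = []
    for address, asn in peers.items():
        lines.append(f" neighbor {address} remote-as {asn}")
        lines.append(f" neighbor {address} peer-group {peer_group}")
        # Inbound prefix-list is named after the peer's ASN (peer110, ...).
        lines.append(f" neighbor {address} prefix-list peer{asn} in")
    return lines


if __name__ == "__main__":
    print("router bgp 100")
    print("\n".join(neighbor_config(PEERS)))
```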

Exchange Point
ip route 121.10.0.0 255.255.224.0 null0
!
ip as-path access-list 10 permit ^$
ip as-path access-list 10 permit ^200$
ip as-path access-list 10 permit ^201$
!
ip prefix-list peer110 permit 122.0.0.0/19
ip prefix-list peer120 permit 122.30.0.0/19
ip prefix-list peer130 permit 122.12.0.0/19
ip prefix-list peer140 permit 122.18.128.0/19
ip prefix-list peer150 permit 122.1.32.0/19
!
route-map set-local-pref permit 10
set local-preference 150
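
The effect of as-path access-list 10 can be checked offline with a few lines of Python. This is a simplified sketch, not router behaviour in full: it models the AS path as a space-separated string and relies on the fact that the three expressions use only `^` and `$`, which mean the same in Python's `re` module. The function name `permitted_out` is hypothetical.

```python
import re

# The three permit entries of 'ip as-path access-list 10' from this slide.
AS_PATH_ACL_10 = [r"^$", r"^200$", r"^201$"]


def permitted_out(as_path):
    """Return True if a route with this AS path is announced to IXP peers.

    `as_path` is the list of ASNs as seen in AS100's BGP table, leftmost
    first; an empty list means the route is originated by AS100 itself.
    """
    text = " ".join(str(asn) for asn in as_path)
    # No entry matching means the implicit deny at the end of the list applies.
    return any(re.search(pattern, text) for pattern in AS_PATH_ACL_10)


if __name__ == "__main__":
    for path in ([], [200], [201], [110], [200, 300]):
        print(path, permitted_out(path))
```

Routes learned from other IXP peers or from an upstream carry a different AS path and fall through to the implicit deny, which is what prevents transit across the exchange.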

Exchange Point

Notice the change in Router A's configuration

Filter-list instead of prefix-list permits local and customer ASes out to the exchange

Prefix-list blocks Special Use Address prefixes; the rest get out, which could be risky

Other issues as previously

This configuration will not scale as more and more BGP customers are added to AS100

As-path filter has to be updated each time

Solution: BGP communities


More scalable BGP example
Exchange Point Configuration


Exchange Point Example (Scalable)

Exchange point with 6 ASes present

Layer 2 Ethernet switch

Each ISP peers with the other

NO transit across the IXP allowed

ISPs at exchange points provide transit to their BGP customers

(Scalable solution is presented here)


Exchange Point

[Figure: exchange point diagram showing AS100, AS110, AS120, AS130, AS140 and AS150]

Each of these represents a border router in a different autonomous system; each ASN has BGP customers of its own

Router configuration

Take AS100 as an example

Has 15 BGP customers, in AS501 to AS515

AS-path filters will not scale well

Create a peer-group for IXP peers

All outbound policy to each peer will be the same

Communities will be used

Community Policy

AS100 aggregate put into 100:1000

All BGP customer aggregates go into 100:1100

Creating a peer-group & route-map
router bgp 100
 neighbor ixp-peer peer-group
 neighbor ixp-peer send-community
 neighbor ixp-peer route-map ixp-peers-out out
 neighbor ixp-peer route-map set-local-pref in
!
! AS100 aggregate
ip community-list 10 permit 100:1000
! AS100 BGP customers
ip community-list 11 permit 100:1100
!
route-map ixp-peers-out permit 10
 match community 10 11
!
! Prefixes heard from IXP peers have highest preference
route-map set-local-pref permit 10
 set local-preference 150
!

BGP configuration for IXP router
router bgp 100
 neighbor 120.5.10.2 remote-as 110
 neighbor 120.5.10.2 peer-group ixp-peer
 neighbor 120.5.10.2 prefix-list peer110 in
 neighbor 120.5.10.3 remote-as 120
 neighbor 120.5.10.3 peer-group ixp-peer
 neighbor 120.5.10.3 prefix-list peer120 in
 ...etc

Remaining configuration is the same as earlier

Note the reliance again on inbound prefix-lists for peers

Peers need to update the ISP if filters need to be changed

And that's what the IRR is for (otherwise use email)

BGP configuration for AS100's customer aggregation router
router bgp 100
 network 121.10.0.0 mask 255.255.192.0 route-map set-comm
 neighbor 121.10.4.2 remote-as 501
 neighbor 121.10.4.2 prefix-list as501-in in
 neighbor 121.10.4.2 prefix-list default out
 neighbor 121.10.4.2 route-map set-cust-policy in
 ...etc
!
! Set community on AS100 aggregate
route-map set-comm permit 10
 set community 100:1000
!
! Set community on BGP customer routes
route-map set-cust-policy permit 10
 set community 100:1100
!

Scalable IXP policy

ISP community policy is set on ingress

ISP now relies on communities to determine what is announced at the IXP

If a BGP customer announces more prefixes, only the filters at the aggregation edge need to be updated

No need to update any as-path filters, prefix-lists, etc.

And those new prefixes will automatically be tagged with the community to allow them through to AS100's IXP peers

Consult the BGP community presentation for more extensive examples
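
The tag-on-ingress, match-on-egress idea can be modelled in a few lines. This is a simplified sketch rather than real BGP processing: the function names `ingress` and `announce_to_ixp` are hypothetical, and the customer prefixes are documentation prefixes chosen for illustration, not taken from the slides.

```python
AS100_AGGREGATE = "100:1000"   # community-list 10
CUSTOMER_ROUTES = "100:1100"   # community-list 11


def ingress(customer_asn, prefix, filters):
    """Apply the customer's inbound prefix filter, then tag what is accepted."""
    if prefix not in filters.get(customer_asn, set()):
        return None  # dropped by the prefix-list at the aggregation edge
    return {"prefix": prefix, "communities": {CUSTOMER_ROUTES}}


def announce_to_ixp(route):
    """Egress test equivalent to 'match community 10 11' (either list matches)."""
    return bool(route["communities"] & {AS100_AGGREGATE, CUSTOMER_ROUTES})


if __name__ == "__main__":
    # Hypothetical inbound filter for customer AS501.
    filters = {501: {"192.0.2.0/24"}}
    # A new prefix is rejected until the edge filter is updated.
    print(ingress(501, "198.51.100.0/24", filters))
    filters[501].add("198.51.100.0/24")
    route = ingress(501, "198.51.100.0/24", filters)
    # The egress policy towards the IXP needed no change.
    print(announce_to_ixp(route))
```

Only the `filters` table changes when the customer adds a prefix; the egress function, like the route-map towards the IXP peers, stays untouched.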

Route Servers

IXP operators quite often provide a Route Server to assist with scaling the BGP mesh

All prefixes sent to a Route Server are usually distributed to all ASNs that peer with the Route Server (although some IXPs offer ISPs the facility to configure specific policies on their Route Server)

BGP configuration to peer with a Route Server is the same as for any other ordinary peer

But note that the route server will offer prefixes from several ASNs (the IXP membership who choose to participate)

Inbound filter should be constructed appropriately

Route Servers

Route Server software suppresses the ASN of the RS so that it doesn't appear in the AS-path

IOS by default will not accept prefixes from a neighbouring AS unless that AS is first in the AS-path

router bgp 100
 ! Needed so that IOS can receive prefixes without AS65534 being first in path
 no bgp enforce-first-as
 neighbor x.x.x.a remote-as 65534
 neighbor x.x.x.a route-map IXP-RS-in in
 neighbor x.x.x.a route-map ixp-peers-out out
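
Why the first-AS check has to be disabled can be seen from a simplified model of it. This sketch covers only the comparison described on this slide and is not a full account of how a router validates updates; the function name `accept_update` is hypothetical.

```python
ROUTE_SERVER_ASN = 65534  # route server ASN used in this example


def accept_update(as_path, neighbor_asn, enforce_first_as=True):
    """Return True if an update from `neighbor_asn` passes the first-AS check."""
    if not enforce_first_as:
        return True  # corresponds to 'no bgp enforce-first-as'
    # The check requires the neighbour's own ASN to be leftmost in the path.
    return bool(as_path) and as_path[0] == neighbor_asn


if __name__ == "__main__":
    # The route server relays AS110's prefix without adding 65534 to the path.
    path_via_rs = [110]
    print(accept_update(path_via_rs, ROUTE_SERVER_ASN))
    print(accept_update(path_via_rs, ROUTE_SERVER_ASN, enforce_first_as=False))
```

With the check left on, every prefix relayed by the route server would be rejected, because the leftmost ASN is that of the originating member rather than 65534.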


Summary
Exchange Point Configuration


Summary

Ensure that BGP is scalable on your IXP peering router

Manually updating filters every time a new customer connects is tiresome and has potential to cause errors

Only carry local ASN prefixes and customer routes on the IXP peering router

Anything else (e.g. default or full BGP table) has the potential to result in bandwidth theft

Filter IXP peer announcements

Inbound: use the IRR if maintaining prefix-lists is difficult

Outbound: use communities for scalability

BGP Configuration for IXPs
ISP Training Workshops
