Whenever a frame with an unknown source address enters the switch, its source MAC
address, along with the switch port number and VLAN, is recorded in the CAM table.
A time stamp is also recorded and refreshed each time that device sends a frame, so the CAM
table knows which entry is the latest. If the same MAC address shows up on another port with
a newer time stamp, and the switch no longer hears from that device on the old port, the old
entry is deleted and the new entry is used instead of waiting for the normal 300-second
age-out.
By default, the MAC address table ages out an entry after 300 seconds (5 minutes) of inactivity.
You can adjust this manually with ‘(config)#mac address-table aging-time SEC’.
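The learning and aging behaviour described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class name MacTable, its methods, and the explicit 'now' argument are hypothetical names introduced for illustration, not anything a real switch exposes.

```python
import time

DEFAULT_AGING = 300  # seconds, the default aging time mentioned above


class MacTable:
    """Toy model of CAM-table learning and aging (illustrative only)."""

    def __init__(self, aging_time=DEFAULT_AGING):
        self.aging_time = aging_time
        self.entries = {}  # (vlan, mac) -> (port, last_seen)

    def learn(self, mac, vlan, port, now=None):
        # Record or refresh the entry; if the MAC moved to a new port,
        # the newer entry simply replaces the old one.
        now = time.time() if now is None else now
        self.entries[(vlan, mac)] = (port, now)

    def lookup(self, mac, vlan, now=None):
        # Return the known port, or None if the address is unknown or aged out.
        now = time.time() if now is None else now
        entry = self.entries.get((vlan, mac))
        if entry is None:
            return None
        port, last_seen = entry
        if now - last_seen > self.aging_time:
            del self.entries[(vlan, mac)]
            return None
        return port
```

Passing 'now' explicitly makes the aging logic easy to exercise without waiting 5 minutes.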
You can also configure a static MAC address entry, which stays in the CAM table until you
remove it, with ‘(config)#mac address-table static MAC_ADD vlan V_ID interface
TYPE_NO’.
To display the MAC address table, use ‘#show mac address-table dynamic [address
MAC_ADD | interface TYPE_NO | vlan VID]’. To delete entries, use the same command with
‘clear’ in place of ‘show’. To see the CAM table size, use the ‘show mac address-table count’
command.
L2 switch
When a frame arrives at a switch port, it’s placed into one of the ingress queues. Each
ingress queue has a different priority service level, so the higher-priority queues get
faster service. This prevents time-sensitive frames from being delayed (latency is the time
from a frame entering the switch to leaving it).
Not only does the switch have to figure out WHERE to send a frame, it also has to figure
out WHETHER it should forward it and HOW. These are the factors affecting the process; each
operates separately:
- L2 forwarding table, or CAM table.
- Security ACLs, placed in the ternary CAM (TCAM) table, decide whether a frame should be
forwarded.
- QoS ACLs, also stored in the TCAM table, determine which egress queue a frame goes to; each
queue has a different priority so that time-sensitive traffic isn’t delayed.
When a frame comes in, all ACLs are matched against it in parallel.
When a packet comes in, these components will be involved in its forwarding decision:
- L2 forwarding, or CAM table.
- L3 forwarding, or FIB table
- Security ACLs compiled into TCAM table
- QoS ACLs compiled into TCAM table
Type of MLS
Topology-based, or CEF, switching is the method in which the route processor (RP) builds the
RIB and sends a copy called the FIB (containing all the IP prefixes from the routing table) to
the switching engine (SE), which forwards each packet based on its next-hop entry.
When new entries are added, the CEF table is updated; in the meantime, packets are
temporarily switched more slowly by the RP. Special traffic such as Telnet is also switched
more slowly, as it also uses demand-based switching.
TCAM table
ACLs are made up of one or more access control entries (ACEs), or matching statements, that
are implemented in hardware. In MLS, ACLs are implemented by two components:
1) Feature manager (FM) merges ACEs into the TCAM table.
2) Switching Database Manager (SDM) partitions the TCAM if necessary.
TCAM entries are composed of Value, Mask, and Result (VMR) combinations. Fields from the
frame or packet header are compared against the value and mask pair to find a match.
- Values are always 134-bit quantities, consisting of source and destination addresses and
other information, all of which must be matched.
Access List Value and Mask Components, 134 Bits Wide (Number of Bits)
Type                        Value and Mask Components
Ethernet                    Source MAC (48), destination MAC (48), Ethertype (16)
ICMP                        Source IP (32), destination IP (32), protocol (16), ICMP code (8),
                            ICMP type (4), IP type of service (ToS) (8)
Extended IP using TCP/UDP   Source IP (32), destination IP (32), protocol (16), IP ToS (8),
                            source port (16), source operator (4), destination port (16),
                            destination operator (4)
Other IP                    Source IP (32), destination IP (32), protocol (16), IP ToS (8)
IGMP                        Source IP (32), destination IP (32), protocol (16), IP ToS (8),
                            IGMP message type (8)
IPX                         Source IPX network (32), destination IPX network (32),
                            destination node (48), IPX packet type (16)
- Mask is also a 134-bit quantity in exactly the same format as the value, but it does a
different job: a 1 bit marks a value bit that must be matched, and a 0 bit marks a bit that is
ignored.
- Results tell the switch which action to take after the lookup occurs.
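The value/mask/result idea can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the key is only a 32-bit source IP address rather than the full 134-bit quantity, the names ip_to_int, tcam_lookup and ENTRIES are hypothetical, and the loop checks entries one at a time with the first match winning, whereas real TCAM hardware compares all entries at once.

```python
import ipaddress


def ip_to_int(addr):
    # Convert dotted-decimal IPv4 text to a 32-bit integer.
    return int(ipaddress.IPv4Address(addr))


def tcam_lookup(key, entries):
    """Return the result of the first (value, mask, result) entry that matches key.

    Only bits set to 1 in the mask take part in the comparison.
    """
    for value, mask, result in entries:
        if (key & mask) == (value & mask):
            return result
    return None


# Example entries: permit 10.1.1.0/24, then a catch-all deny (mask of all zeros).
ENTRIES = [
    (ip_to_int("10.1.1.0"), ip_to_int("255.255.255.0"), "permit"),
    (0, 0, "deny"),
]
```

With these entries, a lookup for 10.1.1.77 hits the first entry, while 10.1.2.77 falls through to the catch-all.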
The TCAM table is organized by masks; each unique mask has 8 value patterns. However, some
keywords such as ‘gt’, ‘lt’, ‘neq’, or ‘range’ match more than one value, so the FM comes into
play and compiles the TCAM entry using logical operation unit (LOU) register pairs.
Since the number of LOUs is limited, if there are more comparison operators than
LOUs, the FM must break the ACE down into multiple entries to process the statement.
If too many entries are placed in the TCAM table, it may overflow because system resources
run low. This generates a syslog error.
More about MLS
MLS switching
An MLS has logical interfaces called switched virtual interfaces (SVIs) that can perform
Layer 3 functions. The SVI’s Layer 3 address is the default gateway for its VLAN. The VLAN
must be configured before the SVI can be enabled. VLANs and SVIs are independent of each other
even though they interoperate.
You can verify with the ‘show ip interface vlan VID’ command.
Inter-VLAN routing on an MLS: create the VLAN with ‘vlan X’, then assign it an IP address
under ‘int vlan X’. These subnets appear as directly connected subnets.
You can configure a port for Layer 2 mode with ‘interface TYPE/NUM’ -> ‘switchport’.
To configure it for Layer 3 mode, use ‘no switchport’ instead.
Confirm with ‘show interface TYPE MOD/NUM switchport’ and check the ‘switchport:’ line. If it
shows ‘switchport:enabled’, the port is in Layer 2 mode; otherwise, it’s in Layer 3 mode.
Note: an EtherChannel port channel can also be in Layer 3 mode; in that case you assign the
Layer 3 address to the port channel only.
A Layer 3 interface can be a routed port, an SVI, or a Layer 3 EtherChannel interface.
The control plane and data plane are together responsible for building the routing table and
for the actual forwarding of packets.
Control plane is responsible for gathering and organizing information. It runs routing
protocols and handles other control information, and it updates the routing table.
Data plane is where the actual forwarding occurs. It holds information from the control plane
and determines the egress port for a packet.
CEF
CEF operates at the data plane and increases efficiency by using the FIB and adjacency table.
The adjacency table contains all directly connected next hops. As soon as a neighbor is
discovered, the MAC rewrite string needed to reach that device is built and stored as an
entry in the adjacency table.
Host routes (entries with a 255.255.255.255 mask) are also found in the FIB. If a change takes
place in the routing table or ARP table, it is reflected in the FIB. To display the FIB, use
‘show ip cef [TYPE MOD/NUM | vlan VID] [PREF_IP MASK] [longer-prefixes] [detail]’.
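A FIB lookup picks the most specific (longest) prefix that contains the destination, which is why a host route wins over a shorter prefix. The Python sketch below illustrates this with a linear scan; the function name fib_lookup, the dictionary layout and the addresses are hypothetical, and real hardware uses specialised structures rather than a loop.

```python
import ipaddress


def fib_lookup(fib, destination):
    """Return the next hop of the longest prefix in fib that contains destination."""
    dest = ipaddress.ip_address(destination)
    best_net, best_hop = None, None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best_net is None or net.prefixlen > best_net.prefixlen):
            best_net, best_hop = net, next_hop
    return best_hop  # None means no entry was found


# Example FIB with a default route, two prefixes and one host route.
FIB = {
    "0.0.0.0/0": "192.168.1.254",
    "10.1.0.0/16": "192.168.1.2",
    "10.1.1.0/24": "192.168.1.3",
    "10.1.1.5/32": "attached",
}
```

Here 10.1.1.5 matches all four entries, but the /32 host route is chosen because it is the longest match.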
Most of the time, the Layer 3 forwarding engine checks the packet and forwards it in
hardware, but in the following instances packets are marked ‘CEF punt’ and sent
to the Layer 3 engine for further processing:
- Entry can’t be found in FIB
- FIB is full
- TTL is expired
- MTU is exceeded, fragmentation required
- ICMP redirect is involved
- Encapsulation type is not supported
- Packets are tunneled, requiring compression or encryption
- Involves ACL with ‘log’ keyword
- NAT operation (exception: Catalyst 6500 Supervisor 720 can handle NAT in hardware)
- ARP requests and replies
- IP packets that require a response from the router (TTL has expired, MTU is exceeded,
fragmentation is needed, etc.)
- IP broadcasts that will be relayed as unicast (IP helper-address)
- Routing protocol updates
- CDP packets
- IPX routing protocol and service advertisements
- Protocols other than IP or IPX
Advanced CEF
As performance demands increase, the Layer 3 engine has also increased its capability. In the
Catalyst 6500, CEF is optimized with special forwarding hardware, as either accelerated CEF or
distributed CEF.
Accelerated CEF allows a portion of the FIB to be distributed to capable line card modules
in the Catalyst 6500 switch. This allows the forwarding decision to be made on the local line
card using the locally stored scaled-down CEF table. In the event that FIB entries are not
found in the cache, requests are sent to the Layer 3 engine for more FIB information.
Distributed CEF refers to the use of multiple CEF tables distributed across multiple line
cards installed in the chassis. When using dCEF, the Layer 3 engine (MSFC) maintains the
routing table and generates the FIB, which is then dynamically downloaded in full to each of
the line cards, allowing for multiple Layer 3 data plane operations to be performed
simultaneously.
To enable CEF, use ‘(config)#ip cef [distributed]’. To disable, use ‘(config)#no ip cef
[distributed]’ or ‘(config)#no ip route-cache cef’.
Verify with ‘show ip cef [TYPE MOD/NUM] [detail]’
Fallback bridging
Not all routed protocols are supported. Depending on the platform, unsupported routed
protocols must be routed in software, and some can’t be routed at all. The non-routable
protocols can be bridged between different VLANs and routed interfaces of the same bridge
group using fallback bridging, which allows the switch to forward this traffic.
Fallback bridging is enabled by assigning 2 or more switch interfaces to a bridge group.
Once the interfaces have been assigned to a bridge group, the interfaces are able to bridge
all non-routed traffic between them and other member interfaces. BPDUs are exchanged
between members of same bridge group, but not between groups. Note:
- Up to a maximum of thirty two (32) bridge groups can be configured on the switch
- An interface (an SVI or routed port) can be a member of only one bridge group
- Use a different bridge group for each separately bridged network connected to the switch
- Do not configure fallback bridging on a switch configured with private VLANs
- When enabled, all protocols are bridged, except for the following:
IP Version 4
IP Version 6
Address Resolution Protocol (ARP)
Reverse ARP (RARP)
Frame Relay ARP
Shared STP packets are fallback bridged
Adjacency table
The adjacency table maintains a list of next-hop neighbors and directly connected hosts,
mapping each one’s IP address to its MAC address; it is built from the ARP table. You can
display this table with ‘show adjacency [TYPE MOD/NUM | vlan VID] [summary | detail]’.
You will see a line of hexadecimal digits. The first 12 digits are the MAC address of the
attached host, the next 12 digits are the MAC address of this Layer 3 engine’s interface,
and the last 4 digits are the EtherType.
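To make the 12 + 12 + 4 digit layout concrete, here is a minimal Python sketch that splits such a string. The function name parse_rewrite and the sample MAC addresses in the usage note are made up for illustration.

```python
HEX_DIGITS = set("0123456789abcdef")


def dotted_mac(hex12):
    # Format 12 hex digits in Cisco dotted notation, e.g. 000a.5e45.b145.
    return ".".join(hex12[i:i + 4] for i in range(0, 12, 4))


def parse_rewrite(hex_string):
    """Split a 28-digit rewrite string into destination MAC, source MAC, EtherType."""
    s = hex_string.strip().lower()
    if len(s) != 28 or not set(s) <= HEX_DIGITS:
        raise ValueError("expected exactly 28 hexadecimal digits")
    return {
        "dst_mac": dotted_mac(s[0:12]),    # attached host
        "src_mac": dotted_mac(s[12:24]),   # Layer 3 engine's interface
        "ethertype": "0x" + s[24:28],      # 0x0800 is IPv4
    }
```

For example, parse_rewrite("000A5E45B145000E387D51000800") yields a destination of 000a.5e45.b145, a source of 000e.387d.5100, and EtherType 0x0800.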
If an ARP entry is missing, the corresponding FIB entry is marked “CEF glean”, meaning
the packet can’t be forwarded in hardware because the Layer 2 address is missing. This causes
the Layer 3 engine to generate an ARP request for that address. You can list the FIB
entries in the ‘glean’ state with the ‘show ip cef adjacency glean’ command.
While an entry is in the glean state, packets for that destination are dropped so that
multiple ARP requests aren’t sent; this is known as ARP throttling or throttling adjacency.
If an ARP reply is not received within 2 seconds, another ARP request is sent.
For the Catalyst 6500, the Supervisor 720 module is where all the processing takes place. It
contains:
- Multilayer Switch Feature Card 3 (MSFC3) runs all software processes and supports both the
switch and route processors (SP and RP). It builds the CEF FIB and downloads it to the ASICs
of the PFC3, which make the forwarding decisions.
- Policy Feature Card 3 (PFC3) contains the ASICs and performs routing and switching, and
implements ACLs, QoS, and multicast forwarding. It requires the RP to populate the route cache
or optimized route table structure in order to perform L3 switching.
- Switch fabric (switching fabric) is the connection between multiple ports within a switch,
used to transport data.
The 3 refers to version 3, the latest version at the time of writing.
To verify CEF is running well, use ‘show ip route’ and ‘show arp’ to verify correct
information. Display the content of FIB table with ‘show ip cef [distributed]’ command.
Detailed version of last command include ‘show ip cef [network [mask]] [longer-prefixes]
[checksum | detail | internal [checksum]]’ and ‘show ip cef [interface-type interface-number
[checksum | [detail | internal [checksum] | platform]]’
The adjacency table can be viewed with ‘show adjacency [ip-address] [interface-type
interface-number | null number | port-channel number | sysclock number | vlan number |
ipv6-address | fcpa number | serial number] [connectionid number] [link {ipv4 | ipv6 | mpls}]
[detail | encapsulation]’ and ‘show ip cef adjacency [interface-type] [interface-number]
[ip-prefix] [checksum | detail | epoch epoch-number | internal | platform | source]’.
The MSFC can process some packets itself. To prevent it from being oversubscribed,
you should limit the rate at which the PFC sends (punts) frames to the MSFC. This is
controlled for two kinds of traffic:
- CEF Receive: limits frames destined for the switch’s own interfaces
- CEF Glean: limits frames whose destination has no next-hop adjacency yet and therefore
needs an ARP request.
The rate limit is configured with ‘(config)#mls rate-limit unicast cef [glean | receive]’.
Hardware
Switching fabric is the term for the communication channel used by the switch to transport
frames, carry forwarding decision information, and relay management information
throughout the switch. It’s responsible for relaying frames from an ingress port to an
egress port.
There are 2 major types of switch fabric:
- Shared bus: all line cards share the same switching path. A central arbiter determines how
and when to grant requests from each line card (port). Only one transfer can occur at any
time.
When a frame is received, it’s placed in a buffer (queue) and checked for errors; defective
frames are discarded. The line card’s local arbiter requests access to transmit the frame
onto the data bus. A header is added to assist the forwarding decision, and the frame is then
transmitted onto the data bus.
The data bus delivers the frame to all ports (except the one the frame came from), and the
added header decides which ports forward it: the selected ports transmit the frame, while the
others discard it.
- Crossbar: solves the waiting problem of the shared bus by letting multiple line cards
operate simultaneously. It’s available in SFM modules.
As oversubscription can occur at any time, it’s a good idea to buffer excess frames before
they are processed so that they aren’t dropped. 2 types of memory management are used for
switch frame buffers:
- Port buffer memory: high-speed memory specially designed to store excess frames. One
buffer per port; frames are dropped if the buffer is full.
- Shared memory: an older type of frame buffer. All ports share the same buffer, and memory
is allocated dynamically. Allocation units vary by platform, but usually range from 64 to
256 bytes.
Catalyst 5000/5500
Catalyst 5000/5500 (Project Synergy) contains the most fundamental parts of all Cisco
switches.
The Catalyst 5000 switch introduced a 5-slot chassis with one slot for the supervisor module,
hot-swappable line modules, redundant power supplies, redundant fans, and a 1.2 Gbps
backplane bus.
Catalyst 5500 switches have an aggregate switching bandwidth of 3.6 Gbps, using three 1.2
Gbps buses.
Most Catalyst switches are now modular and allow user replacement of components.
The chassis is where all the components reside. It provides the electrical connection
between the Supervisor module and all other modules and line cards, as well as the system
clock and the power supply.
Model 5000 5002 5005 5009 5500
EARL v1 created CAM tables made up of MAC address, associated VID, and an index value.
The VID field takes 16 bits, but only 10 bits were used to identify a VLAN and the rest are:
- Aging bit: for aging addresses
- Trap bit: indicate an exception, such as filtering or blocking
- Static bit: indicate a MAC address is static
- Valid bit: indicate the entry has a running aging timer, meaning it’s still valid.
EARL v2 can rewrite header for each packet flow, or give out rewrite information to ASICs
capable of in-line rewrite.
Power supply uses the external RPS 675, which can supply power to a maximum of 6 switches
and provides immediate failover for the internal power supply.
StackWise switches are connected with 68-pin cables. When two or more switches are
connected via StackWise cables, a switch fabric consisting of dual counter-rotating rings is
formed, with each ring providing 16 Gbps of bandwidth, resulting in 32 Gbps of total
bandwidth. Each ring carries data and is self-healing via a loopback protection mechanism
that is enabled should a StackWise cable or individual switch fail.
This provides high availability even when an ASIC fails.
The 3750 uses a shared token to determine the order in which port ASICs may transmit data
onto the ring. The port ASIC creates a 24-byte header containing the information needed
to make a forwarding decision.
When transferring a packet, the port ASIC uses the ring on which the first token arrives.
If tokens from both rings arrive at the same time, the port ASIC chooses the least-used ring.
The destination port ASIC copies the 24-byte header and the packet data from the ring
and forwards the packet to the correct port.
Catalyst 4500
Switch performance is usually measured by bandwidth (data bus width * clock speed) and by
the number of packets per second the switch can handle.
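As a quick worked example of the bandwidth formula, the Python sketch below multiplies a bus width by a clock speed. The 48-bit width and 25 MHz clock are illustrative inputs assumed here, chosen because they multiply out to the 1.2 Gbps backplane figure quoted for the Catalyst 5000; the function name is hypothetical.

```python
def bus_bandwidth_bps(bus_width_bits, clock_hz):
    # Bandwidth in bits per second = bits moved per clock cycle * cycles per second.
    return bus_width_bits * clock_hz


# Assumed illustrative values: 48-bit data bus clocked at 25 MHz.
EXAMPLE_BPS = bus_bandwidth_bps(48, 25_000_000)  # 1,200,000,000 bps = 1.2 Gbps
```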
Catalyst software, Cat OS
CatOS came from Crescendo, which was purchased by Cisco. It uses ‘set’ and ‘clear’
commands instead of the IOS commands we currently use. IOS commands and their CatOS equivalents:
‘enable password’ = ‘set password’
‘hostname’ = ‘set system name’
‘show version’ = ‘show system’
‘show controller’ = ‘show environment’
‘show run’ + ‘show start’ = ‘show config’ <- no difference.
‘telnet XXX’ = ‘session XXX’
‘vlan VID’ = ‘set vlan VID’
Cables and connectors
FastEthernet 0/1/3, 0 = slot number, 1 = port adapter, 3 = port number
Fast Ethernet
100Base-FX uses an MT-RJ or SC connector. The MT-RJ connector has a tab on top for easy
removal, like RJ-45, and carries 2 fiber-optic strands. The SC connector also carries 2
strands, but the connector is square in shape.
Some people install only 2 of the 4 pairs of wires in a UTP cable to reduce their work, but
this limits future upgrades and makes cables less interchangeable.
Autonegotiation is only available on UTP Fast Ethernet and Gigabit Ethernet links. When only
one side is set to autonegotiate, the duplex setting can’t be detected, so the port falls
back to half duplex.
Fast Ethernet defaults to use full duplex
Gigabit Ethernet
Gigabit Ethernet uses a somewhat different physical layer standard, based on ANSI X3T11
FibreChannel, which provides a base of high-speed ASICs, optical components, and
encoding/decoding and serialization mechanisms.
Type (802.3z)     Wiring Type                                                      Pairs   Cable Length
1000BASE-CX       Shielded twisted pair (STP)                                      1       25 m
1000BASE-T        EIA/TIA Category 5 UTP                                           4       100 m
1000BASE-SX       Multimode fiber (MMF) with 62.5-micron core; 850-nm laser        1       275 m
                  MMF with 50-micron core; 850-nm laser                            1       550 m
1000BASE-LX/LH    MMF with 62.5-micron core; 1300-nm laser                         1       550 m
                  MMF with 50-micron core; 1300-nm laser                           1       550 m
                  SMF with 9-micron core; 1300-nm laser                            1       10 km
1000BASE-ZX       SMF with 9-micron core; 1550-nm laser                            1       70 km
                  SMF with 8-micron core; 1550-nm laser                            1       100 km
The 802.3ab standard provides Gigabit Ethernet over copper. It allows falling back to 10 and
100 Mbps and supports autonegotiation, in which 1000Base-T (full-duplex) has the
highest priority, followed by 1000Base-T (half-duplex).
Gigabit EtherChannel (GEC) supports 2 - 8 Gigabit Ethernet links acting as a single logical
link.
Connections are made using modules, usually GBIC or SFP. GBIC supports SC and RJ-45,
while SFP supports MT-RJ, LC, and RJ-45. These modules can use:
- 1000Base-SX - 1000Base-LX/LH - 1000Base-ZX
- Gigastack (a row of different connectors in one module) - 1000Base-T
If a UTP Category 5 crossover cable is used, pins 1, 2, 3, 4, 5, 6, 7, 8 on one end connect
to pins 3, 6, 1, 7, 8, 2, 4, 5 on the other end.
Only the 1000Base-T module can operate at 10/100/1000 Mbps; the other GBICs operate at 1 Gbps only.
10 Gigabit Ethernet
10 Gigabit Ethernet (802.3ae), or 10GbE, operates only at full duplex. The physical
interfaces are defined by Physical Media Dependent (PMD) types, which fall into two groups:
- LAN PHY: interconnects switches in a campus network, predominantly in the core layer
- WAN PHY: interfaces with SONET and SDH in a MAN
PMD Type*                          Fiber Medium                                  Max Length
10GBASE-SR/SW (850 nm serial)      MMF: 50 micron                                66 m
                                   MMF: 50 micron (2 GHz*km modal bandwidth)     300 m
                                   MMF: 62.5 micron                              33 m
10GBASE-LR/LW (1310 nm serial)     SMF: 9 micron                                 10 km
10GBASE-ER/EW (1550 nm serial)     SMF: 9 micron                                 40 km
10GBASE-LX4/LW4 (1310 nm WWDM)     MMF: 50 micron                                300 m
                                   MMF: 62.5 micron                              300 m
                                   SMF: 9 micron                                 10 km
10GBASE-CX4                        Copper: CX4 with Infiniband connectors        15 m
* S = short, L = long, E = extra-long, C = copper, R = LAN PHY, W = WAN PHY, X = (coding),
  WWDM = wide-wavelength division multiplexing
Cisco Catalyst switches support 10GbE PMDs in XENPAK, X2, and SFP+ transceivers.
Generally, X2 is smaller than XENPAK, and SFP+ is the smallest, allowing greater port
density.
10GbE can also be used in EtherChannel, called 10GEC. For an EtherChannel to form, the
same protocol/standard must be used on all links.
Autonegotiation
10Base-T didn’t have autonegotiation as part of its standard. However, it generates a pulse
called the normal link pulse (NLP) every 16 milliseconds on an idle link. This means that when
the link is not busy, 10Base-T sends the signal to keep the link from going down. When no NLP
is received within a specified time window, the link is considered down.
802.3u added autonegotiation and other functions such as Remote Fault
Indication (detects L1 errors) and Next Page Function (carries information about the
negotiation process).
802.3u is capable of communicating with dissimilar standards. For instance, it uses Parallel
Detection to make the link compatible with 10Base-T devices, which generate NLP signals. The
switch also generates fast link pulse (FLP) bursts, each lasting about 2 ms, to negotiate with
802.3u devices.
Gigabit Ethernet requires that all IEEE 802.3z devices have autonegotiation capability.
Software control of the device can override this function with ‘set port negotiation MOD/PORT
{enable | disable}’.
GE autonegotiation for 802.3z includes:
- Duplex setting (full duplex only)
- Flow control (optional; the receiver asks the sender to slow down by sending a pause frame
to 0180.c200.0001. The sender then holds its data in a buffer, which adds latency.
Use ‘set port flowcontrol MOD/PORT’)
- Remote fault information (detects L1 errors)
Verify with ‘show port capabilities’, available on CatOS.
Switch Port Gigabit        NIC Gigabit                Switch Link   NIC Link
Autonegotiation Setting    Autonegotiation Setting
Enabled                    Enabled                    Up            Up
Disabled                   Disabled                   Up            Up
Enabled                    Disabled                   Down          Up
Disabled                   Enabled                    Up            Down
Summary
Having media or connectors that meet the standard is not enough; to be able to use a
standard, you must verify the link from end to end.
Note that since extended-range VLANs are not stored in vlan.dat in Flash, they are not
supported in VTP client or server mode. This means you must manually delete those VLANs and
reassign the ports when moving from transparent to client or server mode.
By default, all switch ports are assigned to VLAN 1, the VLAN type is Ethernet, and MTU = 1500
bytes. (VLAN 1 uses default values, and they can’t be changed unless the native VLAN is
changed.) VLAN 1 and 1002 - 1005 are for special purposes.
Extended-range VLANs can be used for WAN interfaces, L3 Ethernet ports, and sub-interfaces.
VLANs can be dynamically assigned with VMPS; Cisco uses applications such as CiscoWorks.
When planning VLANs, an important factor to consider is the relationship between VLANs and
IP subnets; Cisco recommends one IP subnet per VLAN.
However, it’s possible to have more than 1 IP subnet per VLAN, for example on VLAN 1.
A VLAN is active (passing traffic) by default. However, you can force it into the suspended
state, in which the VLAN won’t pass any traffic anywhere in the VTP domain (because the state
is propagated). ‘(config-vlan)#state suspend’ only applies to standard-range VLANs; you can’t
force an extended-range VLAN to suspend. The ‘Status’ is ‘suspended’ in ‘show vlan [brief]’.
On the other hand, if you use ‘(config-vlan)#shutdown’ or ‘(config)#shutdown vlan’ on a VLAN,
it is only shut down on that switch, rather than propagating throughout the domain. The
‘Status’ is shown as ‘act/lshut' in ‘show vlan [brief]’.
Cisco Catalyst 6500 series switches support an additional feature called VLAN locking that
allows administrators to provide an extra level of verification when moving ports from one
VLAN to another. This feature, which is enabled via the vlan port provisioning global
configuration command, requires that the VLAN name, NOT number, be entered when a port
is moved from one VLAN to another via the switchport access vlan [VLAN NAME]
interface configuration command.
‘show interface TYPE/NO switchport’ shows how a switch port is configured for trunking and
its status. An ‘Operational Mode’ of static access means no trunk has formed.
‘show interface TYPE/NO trunk’ displays brief information about an interface’s trunk status.
To form a trunk between a DTP-capable device and a DTP-incapable device, you must force
both ports to trunk with ‘switchport mode trunk’. Any other mode will not form a trunk.
This situation occurs when a switch forms a link with a router, or when switches in different
VTP domains want to form a trunk. Unless you use ‘switchport nonegotiate’, DTP is always
enabled by default.
                     Access    Dynamic Auto    Dynamic Desirable    Trunk
Access               Access    Access          Access               Access
Dynamic Auto         Access    Access          Trunk                Trunk
Dynamic Desirable    Access    Trunk           Trunk                Trunk
Trunk                Access    Trunk           Trunk                Trunk
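The table above reduces to two rules, which the following Python sketch encodes: any access port forces an access link, and two dynamic auto ports never initiate a trunk. The function name dtp_result and the lowercase mode strings are hypothetical names chosen for illustration; the outcomes simply mirror the table.

```python
MODES = ("access", "dynamic auto", "dynamic desirable", "trunk")


def dtp_result(local_mode, remote_mode):
    """Return 'access' or 'trunk' for a pair of port modes, following the table."""
    for mode in (local_mode, remote_mode):
        if mode not in MODES:
            raise ValueError("unknown mode: " + mode)
    if "access" in (local_mode, remote_mode):
        return "access"   # an access port on either side wins
    if local_mode == remote_mode == "dynamic auto":
        return "access"   # both sides wait passively, so nothing is negotiated
    return "trunk"
```

For instance, dtp_result("dynamic auto", "dynamic desirable") gives "trunk", while dtp_result("dynamic auto", "dynamic auto") gives "access".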
On old switches, the default mode is dynamic desirable. New switches default to dynamic
auto. Confirm with ‘show dtp [interface TYPE MOD/NUM]’.
DTP also chooses which trunking protocol a trunk link uses. ISL is favored if both ISL and
802.1Q are supported. DTP sends a message every second during negotiation, then every 30
seconds after the trunk has formed.
VLAN 1
VLAN 1 carries control plane traffic and user traffic. Control plane traffic such as VTP, CDP
(advertisement every 60 seconds), and PAgP is tagged with VLAN 1 information and
forwarded on VLAN 1 even if VLAN 1 has been pruned.
sc0 is used for management of the switch, such as Telnet, SNMP, and syslog. Avoiding redundant
links in the management VLAN eliminates the need for STP there, as no loop can be created. If
redundant links are required, use separate physical connections that carry only the
management VLAN. A good design shouldn’t place user traffic on the management VLAN.
Trunk design
If the links between 2 switches are not trunks, then the 2 switches need n links for
the n VLANs they share. Trunking is supported on Fast Ethernet and Gigabit Ethernet, and the
corresponding EtherChannel links.
Frames traveling on trunk links are tagged, while frames traveling on access links are not.
End-to-end VLANs, also called campus-wide VLANs, span the entire
network. End-to-end VLANs are not recommended since broadcast traffic is carried from
one end to the other, creating the possibility of broadcast storms.
Users in an end-to-end VLAN follow the 80/20 rule. Although only 20 percent of the traffic in
a VLAN is expected to cross the network core, this design can allow 100% of the traffic
within a single VLAN to cross the core.
Local VLANs are the opposite of end-to-end VLANs and follow the 20/80 rule: 20% of the
traffic stays local while 80% goes outside the VLAN.
Trunking protocols
The trunking protocol header is inserted at the egress trunk port of the sending switch. The
tag is removed at the ingress port of the receiving switch.
ISL is a Cisco-proprietary trunking protocol that can be used with Ethernet, Token Ring,
FDDI, and ATM frames, identified by a ‘Frame Type’ field. ISL is sometimes called double
tagging because of the extra encapsulation. ISL frames can’t pass through non-ISL switches
and require at least a FastEthernet connection. The header includes the source MAC address of
the device that added the encapsulation.
Old ISL doesn't support untagged VLANs and extended-range VLANs; the new version does. It uses
the multicast address 0100.0c00.0000 or 0300.0C00.0000.
802.1Q, on the other hand, is referred to as single, or internal, tagging. It can be used on
Ethernet or Token Ring, as indicated by the CFI bit (canonical format indicator), which
signals little-endian or big-endian format.
The 4-byte tag is added after the source address field. The first 2 bytes are the Tag
Protocol Identifier (TPID), which is always 0x8100 (indicating an 802.1Q frame).
The remaining 2 bytes are the Tag Control Information (TCI). The TCI
contains a three-bit Priority field, used for CoS functions in 802.1Q/802.1p, and one bit
for CFI. The last 12 bits are the VID, which indicates the source VLAN of the frame. The VID
can have values from 0 to 4095, but VLANs 0, 1, and 4095 are reserved.
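The bit layout of the 4-byte tag (16-bit identifier of 0x8100, then 3 priority bits, 1 CFI bit and a 12-bit VID) can be checked with a short Python sketch. The function names build_tag and parse_tag are hypothetical and only illustrate the packing.

```python
import struct

TPID_8021Q = 0x8100  # identifies an 802.1Q-tagged frame


def build_tag(priority, cfi, vid):
    """Pack priority (3 bits), CFI (1 bit) and VID (12 bits) into a 4-byte tag."""
    if not (0 <= priority <= 7 and cfi in (0, 1) and 0 <= vid <= 4095):
        raise ValueError("field out of range")
    tci = (priority << 13) | (cfi << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)  # network byte order


def parse_tag(tag):
    """Unpack a 4-byte 802.1Q tag into its fields."""
    tpid, tci = struct.unpack("!HH", tag)
    if tpid != TPID_8021Q:
        raise ValueError("not an 802.1Q tag")
    return {"priority": tci >> 13, "cfi": (tci >> 12) & 1, "vid": tci & 0x0FFF}
```

For example, priority 5 with CFI 0 and VID 100 packs to the bytes 81 00 A0 64.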
For a trunk to operate, these settings must be the same on both ends: trunking mode,
trunking protocol, native VLAN (if one exists), and allowed VLANs (if configured).
InterVLAN routing
802.1Q-in-Q tunneling
Traditionally, VLANs can’t extend beyond the WAN boundary and it isn’t really a good design
to do so. However, to connect to remote networks, 802.1Q-in-Q, Ethernet over MPLS
(EoMPLS), Metro Ethernet and VLAN MPLS (VMPLS) can be used to extend VLANs across
WAN links.
The port that provides the tunnel is called a tunnel port, while the customer end uses an
802.1Q trunk (with DTP disabled, use ‘switchport nonegotiate dot1q’). The link is therefore
called asymmetric. Always configure one VLAN for each tunnel.
A tunnel port can’t do Layer 3 routing or use an IP address unless it is part of an SVI. It
can’t be configured with PVLAN, VoIP, fallback bridging, IP ACL, ToS ACL, or DTP.
Traffic including CDP (automatically disabled), STP BPDU (automatically filtered on tunnel
port), VTP, PAgP, EtherChannel, loopback detection and UDLD can pass transparently over
Q-in-Q tunnel.
Frames from these protocols have their destination MAC address changed to 0100.0CCD.CDD0
when they enter the service provider switch, then changed back to the original MAC address
when they are about to enter the customer network. If a frame with that multicast MAC
address is received on a tunnel port, the port is shut down to prevent loops.
This tunnel is known as a Layer 2 protocol tunnel. By default, it is not enabled to transport
CDP, STP, and VTP, and all tunneled frames use CoS value = 5. There is no default shutdown
threshold. This tunnel can’t be created unless both the customer and service provider ports
are access ports.
The basic idea of a Q-in-Q or Layer 2 protocol tunnel is to encapsulate the original
802.1Q frame with another 802.1Q tag when the frame enters the service provider switch. Inside
the provider network, each switch strips the outer tag during processing and adds it back
before the frame leaves the switch.
On the switch between the service provider and the customer, the outer tag is stripped from
the incoming frame but is not added back when the frame leaves toward the customer. This way,
the outer tag is transparent to the customer network.
Frames from the customer network get the outer tag regardless of whether they are already tagged.
When different customers use the same range of VLANs, their traffic does not get
mixed up because the outer tag (the per-customer VLAN assigned by the service provider) is
different.
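How an outer tag keeps overlapping customer VLANs apart can be sketched in Python. Assumptions to note: the helper names push_tag and pop_tag are hypothetical, priority and CFI bits are left at zero, the outer tag reuses the 0x8100 identifier, and the VLAN numbers 10, 100 and 200 are arbitrary examples.

```python
import struct

MAC_HEADER_LEN = 12  # destination MAC (6 bytes) + source MAC (6 bytes)


def push_tag(frame, vid, tpid=0x8100):
    """Insert a 4-byte tag right after the MAC addresses (priority/CFI left at 0)."""
    tag = struct.pack("!HH", tpid, vid & 0x0FFF)
    return frame[:MAC_HEADER_LEN] + tag + frame[MAC_HEADER_LEN:]


def pop_tag(frame):
    """Remove the outermost tag; return (vid, frame without that tag)."""
    _tpid, tci = struct.unpack("!HH", frame[MAC_HEADER_LEN:MAC_HEADER_LEN + 4])
    return tci & 0x0FFF, frame[:MAC_HEADER_LEN] + frame[MAC_HEADER_LEN + 4:]


# Two customers both use VLAN 10; the provider assigns outer VLANs 100 and 200.
untagged = bytes(12) + b"\x08\x00" + b"payload"
customer_frame = push_tag(untagged, 10)       # customer's own 802.1Q tag
in_provider_a = push_tag(customer_frame, 100)  # customer A's outer tag
in_provider_b = push_tag(customer_frame, 200)  # customer B's outer tag
```

The two frames differ inside the provider network even though the inner VLAN is the same, and popping the outer tag returns the customer's original tagged frame unchanged.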
Also note that the service provider doesn’t have to use the same trunking protocol as the
customer, or trunk at all.
When the native VLAN (from the tunnel port) is untagged, it may get mixed with the
untagged native VLAN of other customers. If the outer tag would have the same VLAN ID as the
inner tag, the outer tag is not applied. Solutions:
- Use ISL on the service provider network
- Tag all native VLAN frames using ‘(config)#vlan dot1q tag native’
- Ensure that the native VLAN ID on the edge switch trunk port isn’t within the customer VLAN
range.
Because of the extra tag, you can increase the system MTU using ‘(config)#system mtu'.
When 802.1Q trunks are used in the core switches, the native VLANs of the 802.1Q
trunks must not match any native VLAN of the nontrunking (tunneling) ports on the same
switch because traffic on the native VLAN would not be tagged on the 802.1Q transmitting
trunk port.
PPPoE
PPP is widely used for dial-up connections over point-to-point links. It was designed to work
with serial connections, but it can be encapsulated to work over Ethernet (PPPoE) or ATM
(PPPoA).
PPP uses LCP to determine whether a link can be established and, if so, creates a session
between a PC and the ISP. LCP packets include the fields needed to make that decision.
PPPoE provides support for DSL, but not for Frame Relay or other LAN interfaces. PPPoE
uses the standard PPP methods for encryption, authentication, and compression.
PPPoE creates a virtual point-to-point connection between 2 Ethernet ports using special
software. PPPoE discovery:
1. Initiation: the client software sends a PPPoE Active Discovery Initiation (PADI) packet to
the server to initiate a connection.
2. Offer: if the server accepts, it responds with a PPPoE Active Discovery Offer (PADO) packet.
3. Request: the client sends a PPPoE Active Discovery Request (PADR) packet to the server.
4. Confirmation: the server sends a PPPoE Active Discovery Session-confirmation (PADS)
packet that includes a unique ID for the session.
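The four discovery steps can be checked with a small validator. This is a sketch only: the packet code values are the ones defined in RFC 2516, while the tuple layout and the function name are made up for illustration.

```python
# PPPoE discovery packet codes (RFC 2516).
PADI, PADO, PADR, PADS = 0x09, 0x07, 0x19, 0x65

# Who is expected to send what, in order.
EXPECTED_ORDER = [("client", PADI), ("server", PADO), ("client", PADR), ("server", PADS)]


def run_discovery(exchange):
    """Validate a list of (sender, code, session_id) tuples.

    Returns the session ID carried in the PADS packet, or None when the
    exchange does not follow the Initiation/Offer/Request/Confirmation order.
    """
    if len(exchange) != len(EXPECTED_ORDER):
        return None
    for (sender, code, session_id), expected in zip(exchange, EXPECTED_ORDER):
        if (sender, code) != expected:
            return None
        # The session ID stays 0 until the server confirms with PADS.
        if code != PADS and session_id != 0:
            return None
    final_id = exchange[-1][2]
    return final_id if final_id != 0 else None


good = [("client", PADI, 0), ("server", PADO, 0), ("client", PADR, 0), ("server", PADS, 0x1234)]
bad = [("client", PADI, 0), ("client", PADR, 0), ("server", PADO, 0), ("server", PADS, 0x1234)]
```

`run_discovery(good)` yields the session ID from the PADS packet, while the out-of-order exchange in `bad` is rejected.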
PPPoE works with DHCP to manage the address pool when a session is created or ended.
PPPoE sample configuration:
EdgeRouter(config)# interface fa0/1
EdgeRouter(config-if)# ip address 192.168.100.1 255.255.255.0
EdgeRouter(config-if)# ip nat inside
EdgeRouter(config-if)# pppoe-client dial-pool-number 1
EdgeRouter(config-if)# exit
EdgeRouter(config)# interface dialer1
EdgeRouter(config-if)# mtu 1492
EdgeRouter(config-if)# encapsulation ppp
EdgeRouter(config-if)# ip address negotiated
EdgeRouter(config-if)# ppp authentication chap
EdgeRouter(config-if)# ip nat outside
EdgeRouter(config-if)# dialer pool 1
EdgeRouter(config-if)# dialer-group 1
EdgeRouter(config-if)# exit
EdgeRouter(config)# dialer-list 1 protocol ip permit
EdgeRouter(config)# ip nat inside source list 1 interface dialer1 overload
EdgeRouter(config)# access-list 1 permit 192.168.100.0 0.0.0.255
EdgeRouter(config)# ip route 0.0.0.0 0.0.0.0 dialer1
VTP
VTP messages are sent as multicast frames to 0100-0CCC-CCCC, with SNAP = AA and type =
0x2003. To clear the configuration revision number (to ensure no wipeouts), perform the
following on any new switch introduced, regardless of VTP mode.
- Change the VTP mode to transparent (transparent mode always has a revision number of 0),
then to server
- Change the VTP domain to another name, then back to the current name
A synchronization problem can occur when a new switch 1) is linked to the network by trunk
links, 2) has the same VTP domain, 3) has a higher revision number, and 4) has the same password.
VTP has 3 versions: v1 (default) doesn’t send all necessary updates, v2 is used in switches
running the IOS operating system, and v3 is used in switches running CatOS. VTP v3 allows
extended VLANs to be used and advertised, but this version is not available on all IOS
Catalyst switches.
VTP pruning allows efficient bandwidth usage by forwarding frames (for a VLAN) over a trunk
link only if the receiving switch has ports in that VLAN. VTP pruning uses join messages to
decide whether to forward. This feature is present in both v1 and v2, and is off by default.
‘vtp pruning’ has no effect on transparent switches; by default, it can only prune VLANs
2 to 1001. Pruning in transparent mode must use the ‘switchport trunk pruning vlan’
command, which specifies which VLANs can be pruned.
Verify with ‘show interface TYPE MOD/NUM switchport’.
If a switch has no domain name, it will accept the first domain that is propagated to it.
This means that if there are 2 VTP domains, the first VTP update to reach that switch
determines the VTP domain of that switch. This is dynamic VTP.
VTP can only propagate information for up to 1024 VLANs (use 15 bits); this is why
extended VLANs are not advertised in the VTP domain.
VTP updates VLAN information using 3 kinds of updates, generated by VTP clients/servers:
Summary advertisement: sent when vlan.dat changes, and every 5 min by clients and servers.
The sequence number (like that in TCP) contains the position of the packet in the stream of
packets that follow a summary advertisement; it starts with 1.
Lower VLANs occur before higher ones. Here is the VLAN information field, with 802.10
SAID indicating some security mechanism for Layer 2.
VTP version 3
VTP version 3 is the third version of the VLAN trunk protocol. This version of VTP enhances
its initial functions well beyond the handling of VLANs. VTP version 3 adds a number of
enhancements to VTP version 1 and VTP version 2, which include the following:
■ Support for a structured and secure VLAN environment (Private VLAN, or PVLAN)
■ Support for up to 4000 VLANs
■ Feature enhancement beyond support for a single database or VTP instance
■ Protection from unintended database overrides during insertion of new switches
■ Option of clear text or hidden password protection
■ Configuration option on a per-port basis instead of only a global scheme
■ Optimized resource handling and more efficient transfer of information
VTP version 3 differs from VTP versions 1 and 2 in that it distributes a list of opaque
databases over an administrative domain in situations where VTP version 1 and VTP version
2 interacted with the VLAN process directly. By offering a reliable and efficient transport
mechanism for a database, usability can be expanded from just serving the VLAN
environment.
VTP version 3 uses the same concept of domains as those used in VTP versions 1 and 2,
where only devices belonging to the same VTP domain are able to exchange and process
VTP information. However, unlike versions 1 and 2, which allow a new switch with the
default domain name to configure itself with the domain name in the first received VTP
message, VTP version 3 requires that the domain name be explicitly configured on each
switch. This means that the VTP domain name must be configured before VTP version 3 can
be enabled.
In addition to the traditional VTP roles of server, client, and transparent, VTP version 3
supports an additional switch role called ‘off’. This mode is similar to transparent mode;
however, unlike a transparent mode switch that relays any received VTP messages, a switch
in off mode simply terminates the received messages and does not relay or forward them.
With VTP version 3, off mode can be configured globally or on a per-port basis. Turning VTP
to off allows a VTP domain to connect to devices in a different administrative domain.
Miscellaneous
• A VTP domain should have at least 1 VTP server. Cisco recommends 2 for redundancy.
• Errors can be found in ‘show vtp counters’ command along with statistic information.
• ‘show vtp status’ -> ‘VTP version: 2’ means this switch is version 2-capable.
• ‘vtp filename’ allows you to specify where to store VLAN information. By default, this is
in vlan.dat in Flash.
• The ‘vtp interface TYPE MOD/NUM [only]’ command is used to specify the name of the
interface providing the VTP ID. ‘only’ allows a single IP address to be the VTP ID. Confirm
with ‘show vtp status’.
• You can debug with ‘debug sw-vlan vtp {events | packets | pruning | xmit}’.
• CatOS supports ‘off’ mode, which means a switch doesn’t participate in VTP.
• When using VTP with CatOS, avoid using VLANs 1006 to 1024.
• VTP information is stored in vlan.dat in Flash for client and server. Transparent mode
stores normal-range VLANs in vlan.dat and running-config (NVRAM), while extended-range
VLANs are stored in NVRAM only.
EtherChannel
Port channel is the name of the bundled logical interface, while EtherChannel is the name of
the technology. EtherChannel prevents bridging loops and increases bandwidth by presenting
itself as a single logical link. Note that traffic from a particular source MAC address (or
whatever the load-balance criterion is) always goes through the same physical link, whether
it is a unicast, multicast, or broadcast frame.
These links can be used on any type of device as long as the number of ports is sufficient.
All bundled ports must share the same configuration, such as VLAN, speed, duplex setting,
trunk status, and VTP setting. EtherChannel was primarily designed to operate on ISL.
Member links of an EtherChannel can have different STP path costs.
Interface configuration applies to the interface on which it’s configured, while port channel
configuration applies to the entire EtherChannel. After the EtherChannel is formed, a change
to an individual interface is applied to every other member interface; if the port channel
configuration and the interface configuration mismatch, the most recently configured
command is used.
PAgP
Neighbors are identified by neighbor ID, and port group capabilities are learned to form the
EtherChannel; different device, different device ID.
If any port setting is changed on one port, all ports will change to that setting to keep
the EtherChannel operating.
Default mode is ‘auto’. In auto mode, a port will negotiate with another PAgP port only if
it receives a PAgP packet; it will not actively send PAgP packets.
Management traffic is distributed over all physical interfaces, while PAgP PDUs are only
sent/received on interfaces that are up and have PAgP in auto/desirable mode. If the port
channel is trunked, PAgP PDUs are transferred on the VLAN with the lowest VID or from the
port that comes up first.
You can verify this with ‘show pagp [CHANNEL_NO] neighbor’ or ‘show etherchannel
summary’. The port forwarding PDU traffic is marked as ‘Pd’.
More to know: Cisco Virtual Switching System (VSS) is comprised of two physical Catalyst
6500 series switches acting as a single logical switch. In the VSS, one switch is selected as
the active switch while the other is selected as the standby switch. The two switches are
connected together via an EtherChannel, which allows for the sending and receiving of
control packets between them.
Access switches are connected to the VSS using Multichassis EtherChannel (MEC). An MEC is
simply an EtherChannel that spans the two physical Catalyst 6500 switches but terminates
to the single logical VSS. Enhanced PAgP (PAgP+) can be used to allow the Catalyst 6500
switches to communicate via the MEC in the event that the EtherChannel between them
fails, which would result in both switches assuming the active role (dual active), effectively
affecting forwarding of traffic within the switched network.
LACP, 802.3ad
One of the differences is that LACP can automatically create port channels by exchanging
LACP packets between ports. It gathers data about link capability and informs other ports.
Once done, it can match the links to become a port channel.
LACP requires all links to be full-duplex; half-duplex links are suspended (PAgP allows them).
LACP is a Layer 2, MAC-sublayer protocol.
A port using SPAN will be removed from the EtherChannel group.
Links with different STP costs can be used to form an EtherChannel.
LACP is composed of a collector, a distributor, an LACP agent, and the marker and marker
response.
The collector assembles frames from the physical links; it can parse a marker itself or pass
it to the LACP agent, which can also parse a marker.
The distributor transmits outgoing frames. It’s responsible for the distribution algorithm.
Higher-layer agents, such as the LACP agent, instruct the redistribution in the marker packet.
The recipient agent will reply (by instructing its distributor to create the marker response)
with a marker response packet after successful transmission.
In passive mode, a port will negotiate with another LACP port only if it receives an LACP
packet; it will not actively send LACP packets. The port channel group attaches the interface
to the EtherChannel bundle. Default mode is passive.
Note that converting a PAgP EtherChannel to LACP will cause all existing EtherChannels
to reset to the default channel mode for the new protocol.
After LACP PDUs are exchanged, the switches come to an agreement about each other’s
settings and decide whether the links can become an aggregation by:
- LACP System Priority: defaults to 32768. Used along with the device MAC address to form
the system ID. Configure with ‘(config)#lacp system-priority [1-65535]’ and verify with
‘show lacp sys-id’. The device with the lower priority gets to decide which links are active
and which are standby.
- LACP Port Priority: decides whether a link is active or standby. The lower, the better. If
tied, the lower MAC address wins. Port priority + port number = port identifier. By default,
a maximum of 16 links is allowed: 8 active and 8 standby. Use ‘(config-if)#lacp port-priority
[1-65535]’ and ‘show lacp NO internal’.
- LACP Administrative Key: assigned automatically. Ports with the same administrative key
belong to the same port channel group.
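The priority rules above amount to ordered comparisons. The sketch below assumes the system ID compares as (system priority, MAC address) and the port identifier as (port priority, port number), with the lower value winning in both cases; the function names and sample values are hypothetical.

```python
def system_id(priority, mac):
    """Return a comparable system ID: (system priority, MAC address as an integer)."""
    return (priority, int(mac.replace(":", ""), 16))


def controlling_switch(switches):
    """switches maps a name to (priority, mac); the lowest system ID decides."""
    return min(switches, key=lambda name: system_id(*switches[name]))


def split_active_standby(ports, max_active=8):
    """ports is a list of (port_priority, port_number) port identifiers.

    The lowest identifiers become active; the rest stay in standby.
    """
    ordered = sorted(ports)
    return ordered[:max_active], ordered[max_active:]


switches = {
    "SW1": (32768, "00:11:22:33:44:55"),
    "SW2": (4096, "00:aa:bb:cc:dd:ee"),
}
# Ten candidate links with the default port priority, except port 10.
ports = [(32768, n) for n in range(1, 10)] + [(100, 10)]
active, standby = split_active_standby(ports)
```

With these inputs SW2 controls the bundle because of its lower system priority, and port 10 is promoted to active because of its lower port priority, leaving ports 8 and 9 in standby.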
LACP allows redundancy by:
- HRSP: by default, 8 active links are allowed and a minimum of 2 is needed. To change, use
the ‘(config-if)#lacp max-bundle NO’ and ‘(config-if)#port-channel min-links NO’ commands.
- LACP 1:1 redundancy: this feature allows an active link to hand its traffic to a standby
link when it fails, and to regain the load after it comes back.
Configuration
All Cisco IOS models currently support both PAgP and LACP. You can tell the switch to use
only one protocol on an interface with ‘(config-if)#channel-protocol [pagp | lacp]’.
Assign the interface with ‘(config-if)#channel-group NUM mode {active | on | {auto
[non-silent]} | {desirable [non-silent]} | passive}’. ‘on’, ‘auto [non-silent]’, and
‘desirable [non-silent]’ are used for PAgP, while ‘on’, ‘passive’, and ‘active’ are used for
LACP. The NUM must be between 1 and 64.
By default, PAgP is in silent submode with desirable and auto modes; this is intended to
form an EtherChannel with devices that are not EtherChannel-capable, such as a file server
or packet analyzer. No PAgP frames are required to form the connection.
If the other end of the connection is PAgP or LACP capable, you can add the ‘non-silent’
keyword, telling the switch that you expect such frames on the link. If no such frames are
heard on the active port, the port remains in the up state, but STP will shut the port down.
Note that if both devices are using auto silent mode, it takes about 15 seconds for the
connection to establish, and 45 to 50 seconds if both are using auto non-silent (30 seconds
come from STP).
If you would like to assign an IP address, you MUST assign it to the ‘port-channel’ interface.
Depending on the load-balancing algorithm, the switch may place different loads on
different links, so each link may not carry the same workload. When links fail or are
restored, balance is re-achieved very quickly and transparently to the user.
Load balancing in EtherChannel is performed by a hashing algorithm that uses the
source/destination IP/MAC address and/or UDP/TCP port information to calculate which link
to take.
Depending on the criteria the algorithm uses (for instance, source and destination IP
address), the last 1 bit of each is XORed for a 2-link EtherChannel, the last 2 bits are
XORed for a 4-link EtherChannel, and the last 3 bits are XORed for an 8-link EtherChannel.
If a single criterion is used, such as the source MAC address, only its last 1, 2, or 3 bits
are used.
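The bit selection described above can be tried out with a short function. This is a simplified sketch of the idea for 2, 4, and 8 links, not the exact hash of any particular Catalyst model; `link_index` is a hypothetical name.

```python
import ipaddress


def link_index(src, dst=None, links=2):
    """Pick a member link from the low-order bits of the hash input.

    src and dst are integers (IP or MAC addresses, or port numbers).
    With two inputs the low bits are XORed; with one input they are used as-is.
    """
    if links not in (2, 4, 8):
        raise ValueError("this sketch only covers 2, 4, or 8 links")
    mask = links - 1                     # 1, 2, or 3 low-order bits
    value = src if dst is None else src ^ dst
    return value & mask


src = int(ipaddress.IPv4Address("192.168.1.10"))   # last octet 0b00001010
dst = int(ipaddress.IPv4Address("192.168.1.25"))   # last octet 0b00011001

two_links = link_index(src, dst, links=2)     # low bit:  0 XOR 1
four_links = link_index(src, dst, links=4)    # low 2 bits: 10 XOR 01
eight_links = link_index(src, links=8)        # single input: low 3 bits of src
```

Because the result depends only on the addresses, every frame of a given conversation takes the same link, which is why the load is not necessarily even across the bundle.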
By default, Catalyst 2970 and 3560 use the source MAC address as the criterion, but if Layer 3
switching is performed, source and destination IP address will be used by default. Here is
the list of options following the ‘(config)#port-channel load-balance’ command:
port-channel load-balance   Hash Input                          Hash Operation   Switch Model
src-ip                      Source IP address                   bits             All models
dst-ip                      Destination IP address              bits             All models
src-dst-ip                  Source and destination IP address   XOR              All models
src-mac                     Source MAC address                  bits             All models
dst-mac                     Destination MAC address             bits             All models
src-dst-mac                 Source and destination MAC          XOR              All models
src-port                    Source port number                  bits             6500, 4500
dst-port                    Destination port number             bits             6500, 4500
src-dst-port                Source and destination port         XOR              6500, 4500
To show the statistics of the load balance, use ‘show etherchannel port-channel’ or ‘show
etherchannel load-balance’ command.
802.1D also describes transparent bridging, which is the segmentation of a network into 2
or more collision domains, thus causing fewer collisions. The process of transparent bridging
contains 5 steps:
- Learning
- Flooding
- Filtering: occurs when devices on the same side of the bridge (the same collision domain)
try to communicate with each other
- Forwarding
- Aging: ensures the system only tracks active devices, as a timer keeps note of which
devices are active
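The five steps can be seen together in a toy bridge model. This is a minimal sketch, not how a switch ASIC works: the class and method names are hypothetical, and the 300-second aging time is the default mentioned earlier for the MAC address table.

```python
class TransparentBridge:
    """Toy learning bridge: MAC table maps address -> (port, last_seen)."""

    def __init__(self, ports, aging_time=300):
        self.ports = set(ports)
        self.aging_time = aging_time
        self.table = {}

    def receive(self, src, dst, in_port, now):
        """Process one frame and return the list of ports it is sent out of."""
        # Aging: drop entries that have not been refreshed recently.
        self.table = {
            mac: (port, seen)
            for mac, (port, seen) in self.table.items()
            if now - seen < self.aging_time
        }
        # Learning: remember which port the source lives on.
        self.table[src] = (in_port, now)

        entry = self.table.get(dst)
        if entry is None:
            # Flooding: unknown destination goes out of every other port.
            return sorted(self.ports - {in_port})
        out_port = entry[0]
        if out_port == in_port:
            # Filtering: destination is on the same side as the sender.
            return []
        # Forwarding: known destination on a different port.
        return [out_port]


bridge = TransparentBridge(ports=[1, 2, 3])
first = bridge.receive("A", "B", in_port=1, now=0)     # B unknown
reply = bridge.receive("B", "A", in_port=2, now=1)     # A was learned on port 1
local = bridge.receive("C", "A", in_port=1, now=2)     # A and C share port 1
stale = bridge.receive("B", "A", in_port=2, now=400)   # A has aged out
```

The four calls exercise flooding, forwarding, filtering, and aging in turn.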
Types of BPDU
A BPDU has the source address of the propagating switch and the multicast destination
address 01-80-C2-00-00-00. There are 2 types of BPDU:
- Configuration BPDUs are exchanged to elect the root bridge. The election is ongoing and
depends on the BID, which is composed of the bridge priority and the MAC address; the lower
one wins. STP recalculation only occurs when the Root Bridge changes.
Field Description                          No. of Bytes
Protocol ID = 0                            2
Version = 0                                1
BPDU type = 0x00                           1
Flags                                      1
Root Bridge BID                            8
Cost to Root Bridge (root port)            4
Sender BID                                 8
Port ID                                    2
Message Age (in 256ths of a second)        2
Maximum Age (in 256ths of a second)        2
Hello Time (in 256ths of a second)         2
Forward Delay (in 256ths of a second)      2
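The field sizes above add up to 35 bytes, which maps directly onto a `struct` format string. The parser below is an illustrative sketch: the names are hypothetical, it covers only the BPDU payload (no Ethernet or LLC header), and the flag positions are the TC (bit 0) and TCA (bit 7) bits used by 802.1D.

```python
import struct
from collections import namedtuple

# 2 + 1 + 1 + 1 + 8 + 4 + 8 + 2 + 2 + 2 + 2 + 2 = 35 bytes, network byte order.
BPDU_FORMAT = "!HBBB8sI8sHHHHH"

ConfigBpdu = namedtuple(
    "ConfigBpdu",
    "protocol_id version bpdu_type tc tca root_bid root_path_cost "
    "sender_bid port_id message_age max_age hello_time forward_delay",
)


def parse_config_bpdu(data: bytes) -> ConfigBpdu:
    """Decode a 35-byte configuration BPDU; timers are converted to seconds."""
    (proto, version, bpdu_type, flags, root_bid, cost, sender_bid,
     port_id, msg_age, max_age, hello, fwd_delay) = struct.unpack(
        BPDU_FORMAT, data[:struct.calcsize(BPDU_FORMAT)])
    return ConfigBpdu(
        proto, version, bpdu_type,
        bool(flags & 0x01),          # TC: bit 0 (LSB)
        bool(flags & 0x80),          # TCA: bit 7 (MSB)
        root_bid, cost, sender_bid, port_id,
        msg_age / 256, max_age / 256, hello / 256, fwd_delay / 256,
    )


# Sample BPDU built by hand for illustration (values are made up).
sample = struct.pack(
    BPDU_FORMAT, 0, 0, 0x00, 0x81,
    bytes.fromhex("8001001122334455"), 19,
    bytes.fromhex("8001aabbccddeeff"), 0x8001,
    1 * 256, 20 * 256, 2 * 256, 15 * 256,
)
bpdu = parse_config_bpdu(sample)
```

Parsing the sample gives a hello time of 2 seconds, a maximum age of 20 seconds, and a forward delay of 15 seconds.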
By default, no non-root bridge can originate BPDUs. However, a special case can happen in
UplinkFast.
The flags include the TC (bit 0, LSB) and TCA (bit 7, MSB) bits. If TC is set, this is a
configuration BPDU sent in response to a topology change BPDU. If the TCA bit is set, this is
a configuration BPDU acknowledging the receipt of a topology change BPDU. A TCA BPDU is
sent for every TC BPDU.
A TC configuration BPDU is regenerated by every switch it passes, and a TCA configuration
BPDU is sent in reply by every other switch.
Note: BPDU type = 0x00 for a configuration BPDU, BPDU type = 0x80 for a TCN BPDU
- Topology Change Notification (TCN) BPDUs announce changes in topology, when 1) a port
moves into the forwarding state (and the switch has at least one designated port), or 2) a
port moves from forwarding/learning to the blocking state.
If PortFast is enabled on a port, state changes on that port don’t generate TCN BPDUs.
A TCN BPDU doesn’t carry data about the change but informs recipients that a change has
occurred. This BPDU is sent out the Root Port of the switch where the change takes place.
Then, the Root Bridge sends configuration BPDUs with the TC bit set for a period of (Forward
Delay + Max Age =) 35 seconds; when switches receive this BPDU, they flush CAM entries that
have been inactive for 15 seconds instead of 300 seconds (default). The TCN BPDU includes:
Field Description      No. of Bytes
Protocol ID = 0        2
Version = 0            1
BPDU type = 0x80       1
All switches use their locally configured Hello Time to time the retransmission of TCN
BPDUs. To change the timer, apply the change on the root bridge; this is not recommended.
It’s better to adjust the diameter of the network, which is, by default, 7 switches (including
the root bridge) from the root bridge outward.
Port costs
Path cost refers to the cumulative cost to a switch. Root path cost refers to the cumulative
cost needed to reach the root bridge. The cost is incremented at the ingress port. There are
2 sets of standards for 802.1D.
Link Bandwidth   Old STP cost   New STP cost
4 Mbps           250            250
10 Mbps          100            100
16 Mbps          63             62
45 Mbps          22             39
100 Mbps         10             19
155 Mbps         6              14
622 Mbps         2              6
1 Gbps           1              4
10 Gbps          0              2
Every switch always keeps a copy of the best BPDU. If a better cost comes up, the switch will
choose that as the new path cost.
The above standard uses a 16-bit port cost value. The value can also be manually assigned;
the defaults are only used for ports that haven't been specifically configured for port cost.
The 802.1t standard uses a 32-bit port cost = 20,000,000 / bandwidth (in Mbps); for example,
a 100 Mbps link costs 200,000. You can select the method with ‘(config)#spanning-tree
pathcost method {long | short}’.
Remember that lower costs are preferred and, by default, the 802.1D port cost is used.
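A short calculation shows how the root path cost accumulates under each method. This sketch takes the short-method values from the "New STP cost" row of the table and, for the long method, assumes the commonly published 802.1t reference values (100 Mbps = 200,000, 1 Gbps = 20,000), that is, 20,000,000 divided by the bandwidth in Mbps. The function names are hypothetical.

```python
# Short (16-bit) costs from the "New STP cost" row, keyed by bandwidth in Mbps.
SHORT_COST = {4: 250, 10: 100, 16: 62, 45: 39, 100: 19, 155: 14, 622: 6, 1000: 4, 10000: 2}


def long_cost(mbps: int) -> int:
    """Long (32-bit) cost, assuming 20,000,000 / bandwidth in Mbps."""
    return 20_000_000 // mbps


def root_path_cost(link_speeds_mbps, method="short"):
    """Add up the cost of every ingress port on the way to the root bridge."""
    total = 0
    for speed in link_speeds_mbps:
        total += SHORT_COST[speed] if method == "short" else long_cost(speed)
    return total


# A switch reaches the root over a 100 Mbps link followed by a 1 Gbps link.
short_total = root_path_cost([100, 1000])
long_total = root_path_cost([100, 1000], method="long")
```

The same path costs 19 + 4 = 23 with the short method and 200,000 + 20,000 = 220,000 with the long method; either way the lower total wins.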
There is only 1 designated port per segment and 1 root port per non-root bridge. Two links
to the same segment will cause one to become blocked (unless EtherChannel is implemented);
the blocked port is elected by:
‣ Highest root bridge BID (priority + MAC address)
‣ Highest root path cost (to root bridge)
‣ Highest sender BID (priority + MAC address)
‣ Highest sender port ID (port priority + port number)
Note: EtherChannel ports have, by default, a high port ID and are therefore likely to be
elected as blocked.
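The four tie-breakers are evaluated in order, which is the same as comparing tuples. In the sketch below the lowest tuple is the best BPDU, so the port holding the highest one is the candidate for blocking; the names and sample values are hypothetical.

```python
from collections import namedtuple

# Field order matches the tie-breaker order; namedtuples compare field by field.
Bpdu = namedtuple("Bpdu", "root_bid root_path_cost sender_bid sender_port_id")


def bid(priority, mac):
    """Bridge ID as a comparable pair: priority first, then MAC as an integer."""
    return (priority, int(mac.replace(":", ""), 16))


def best_bpdu(bpdus):
    """The lowest tuple wins."""
    return min(bpdus)


def port_to_block(bpdus_by_port):
    """Given {port: Bpdu}, return the port whose BPDU compares highest."""
    return max(bpdus_by_port, key=bpdus_by_port.get)


root = bid(4096, "00:00:00:00:00:01")
sender = bid(32768, "00:00:00:00:00:0a")

# Two parallel links to the same neighbor: everything ties until the port ID.
links = {
    "Gi0/1": Bpdu(root, 4, sender, 0x8001),
    "Gi0/2": Bpdu(root, 4, sender, 0x8002),
}
blocked = port_to_block(links)
winner = best_bpdu(links.values())
```

Here the two BPDUs differ only in the sender port ID, so the link that hears the higher port ID is the one that ends up blocked.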
Port states
When a port first initializes, it’s in blocking state. Use ‘show spanning-tree interface [TYPE
MOD/NUM] [detail]’, and ‘debug spanning-tree state’ commands to verify the state.
STP State | The Port Can... | The Port Cannot... | Duration
Disabled | N/A | Send or receive data | N/A
Blocking | Receive BPDUs | Send or receive data or learn MAC addresses | Indefinite if loop has been detected
Listening | Send and receive BPDUs | Send or receive data or learn MAC addresses | Forward Delay timer (15 seconds)
Learning | Send and receive BPDUs and learn MAC addresses | Send or receive data | Forward Delay timer (15 seconds)
Forwarding | Send and receive BPDUs, learn MAC addresses, send and receive data | (nothing) | Indefinite as long as port is up and loop is not detected
Note: only ports connected to other switches or bridges are considered STP ports.
A port moves from blocking to listening if the port thinks it can become forwarding.
Types of STP
STP configuration
General
You can enable or disable an instance of spanning tree for VLAN VID with ‘(config)#[no]
spanning-tree vlan VID’. To ensure a good STP design, the root bridge should be set in a
predetermined fashion. A secondary root bridge should also be set up in case of failure.
The root bridge should be placed in the center of the network.
A non-root bridge that contains the Designated port of a particular LAN segment is known as
the Designated switch for that segment.
To manually set a switch as the root bridge, either change the priority with
‘(config)#spanning-tree vlan VID priority PRI’ or use the macro ‘(config)#spanning-tree vlan
VID [root {primary | secondary}] [diameter VAL]’. Diameter ranges from 1 to 7.
The macro is a series of commands that make the switch a favorable candidate. It can’t work
when the current root bridge has a priority that isn’t a multiple of 4096, but you can
manually set the priority to 0. This command only works once, because it won’t guard the
configuration afterwards.
Secondary root has priority 20480 + root priority.
A switch usually has 1024 MAC addresses to allocate for STP, one for each VLAN. After 802.1t
(extended system ID and default port cost) was introduced, only one MAC address is needed
for STP. When using 802.1t, BID = priority (in multiples of 4096) + VID + MAC address.
802.1t is enabled by default on most switches and is used for both standard and extended
VLANs. It can be enabled with ‘(config)#spanning-tree extend system-id’.
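The reason the priority must be a multiple of 4096 becomes clear when the BID is laid out in bits: with the extended system ID, the 16-bit priority field is split into a 4-bit priority and a 12-bit VLAN ID, followed by the 48-bit MAC address. A minimal sketch, with a hypothetical function name:

```python
import struct


def bridge_id(priority: int, vlan_id: int, mac: str) -> bytes:
    """Build the 8-byte BID: 4-bit priority, 12-bit VLAN ID, 48-bit MAC address."""
    if priority % 4096 != 0 or not 0 <= priority <= 61440:
        raise ValueError("priority must be a multiple of 4096 between 0 and 61440")
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be in 1..4094")
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    # The low 12 bits of a multiple of 4096 are zero, so the VID fits there.
    return struct.pack("!H", priority | vlan_id) + mac_bytes


default_vlan10 = bridge_id(32768, 10, "00:11:22:33:44:55")
```

For the default priority of 32768 on VLAN 10, the first two bytes are 0x800a, which is why the priority of that VLAN is displayed as 32778.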
You can also manually change the cost of a path reaching the root bridge (for all VLANs or
for one VLAN) with the ‘(config-if)#spanning-tree [vlan VID] cost VAL’ command. Display the
cost of a specific interface with ‘show spanning-tree interface TYPE/NO cost’. Changing it is
only recommended on the Root Bridge, to prevent suboptimal switching.
Timers
To modify the timers, use the ‘(config)#spanning-tree [vlan VID] [hello-time | forward-time |
max-age] SEC’ command. Hello time ranges from 1 to 10, forward delay ranges from 4 to 30,
and max age ranges from 6 to 40 seconds.
If the timers of a non-root bridge differ from those of the root bridge, it will change its
timers to match those of the root bridge.
The Max Age timer must be the same for all BPDUs in the domain.
The message age timer displays the age of the root bridge BPDU; it increments by 1 for each
switch the BPDU passes through. A BPDU from the root bridge has message age = 0. The
message age timer can be used to determine:
- How far away the Root Bridge is
- The time before the received BPDU is aged out
- Aging time = Max Age - Message Age
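As a quick worked example of the aging formula, assuming the default Max Age of 20 seconds and a message age that grows by 1 per switch (the function name is hypothetical):

```python
def bpdu_aging_time(max_age: int = 20, message_age: int = 0) -> int:
    """Seconds a received BPDU is kept before it ages out: Max Age - Message Age."""
    if message_age >= max_age:
        return 0  # the BPDU is already too old to be kept
    return max_age - message_age


# A switch three hops away from the root bridge sees message age = 3.
three_hops = bpdu_aging_time(max_age=20, message_age=3)
```

A switch three hops from the root therefore keeps the BPDU for 17 seconds, which is also why a large network diameter leaves less margin at the far edge.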
Port ID
A port ID consists of the port priority (8 bits) and the port number (8 bits). The priority
ranges from 0 to 240 (in increments of 16) and defaults to 128; the port number is usually
the last number after the slash, but not always.
The port ID of a Port Channel is always higher than that of unbundled ports; Po ports are
shared ports. All physical links will still participate in STP.
You can find the port ID with ‘show span int TYPE/NO [detail]’ under the ‘Prio.Nbr’ section,
where PPP.NNN means port_priority.port_number. The priority can be changed with
‘(config-if)#spanning-tree [vlan VID] port-priority PRI’.
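The 8-bit + 8-bit layout described above can be packed and displayed as follows. This is a sketch under that layout only; the function names are hypothetical.

```python
def port_id(priority: int = 128, number: int = 1) -> int:
    """Pack an 8-bit port priority and an 8-bit port number into a 16-bit port ID."""
    if priority % 16 != 0 or not 0 <= priority <= 240:
        raise ValueError("priority must be a multiple of 16 between 0 and 240")
    if not 0 <= number <= 255:
        raise ValueError("port number must fit in 8 bits")
    return (priority << 8) | number


def prio_nbr(pid: int) -> str:
    """Format a port ID the way the 'Prio.Nbr' column shows it."""
    return f"{pid >> 8}.{pid & 0xFF}"


default_port = port_id(128, 1)      # default priority, port number 1
preferred_port = port_id(64, 2)     # lower priority beats a lower port number
```

Because the priority occupies the high-order bits, lowering it on one port makes that port win the port ID tie-breaker regardless of its port number.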
UplinkFast is used when you have multiple paths (from the Access switch) to the
distribution layer. Failure of one link causes the other link (called the alternate port) to
transition into the Forwarding state in 1 to 3 seconds. This feature works on all VLANs for
this switch.
This feature is not allowed on the Root Bridge; it changes the switch priority to 49,152 and
increases the port cost by 3000 to prevent this switch from becoming the Root Bridge.
BackboneFast
BackboneFast provides fast failover for indirect link failures. For instance, you have S1 as
the Root Bridge, and S2 and S3 are both connected to S1 and to each other. When the link
between S1 and S2 fails, S2 thinks it’s the Root Bridge and sends out BPDUs to S3.
S3 will ignore the BPDUs (the link between S2 and S3 is inactive because one port is in
Blocking) until the Max Age timer (of the BPDU from the Root Bridge) expires. S3’s port
toward S2 then moves to the Listening state and sends out the BPDU (from the Root Bridge) to
S2. S2 stops thinking it’s the Root Bridge once it hears the better BPDU. This leaves S2 with
at least 50 seconds of no connectivity.
BackboneFast only works if a bridge detects an indirect link failure. The purpose of this
feature is to cut the Max Age wait as much as possible, thus reducing the STP process by
as much as 20 seconds.
BackboneFast sends out an RLQ PDU request (Root Link Query PDU) to all non-designated
ports (Root Port + Blocking ports) except the port where the inferior BPDU was received.
Switches receiving this message will reply with an RLQ response to indicate it’s alive. If
such a reply is received from the Root Bridge (on the Root Port), then S3 sends the Root
Bridge’s BPDU to S2 (and moves the port connected to S2 to Forwarding). If no such reply is
received on the Root Port, then STP reconverges because the Root Bridge can’t be heard.
If the inferior BPDU arrives on the Root Port and there are no Blocked ports, the switch will
assume it has lost connectivity with the Root Bridge. Unless it hears a better BPDU, it will
assume itself to be the Root Bridge.
Note: RLQ PDU requests are sent out S3’s Root Port, while RLQ PDU responses are sent
back from the Designated port. If there is another switch between S3 and S1, and that switch
has confirmed connectivity (Hello BPDUs) to S1, that switch will send the response back
to S3 without passing the request to S1. If there isn’t confirmed connectivity, that switch
will relay the frame to S1.
Note: the RLQ PDU has the same format as a configuration BPDU but a different SNAP address.
BackboneFast can be configured on every switch regardless of design consideration.
BackboneFast is not available on 802.1D but on PVST+.
Verification
Here is a list of show commands
Command Syntax                                  Task
# show spanning-tree                            View all possible STP parameters for all VLANs. Port information is summarized.
# show spanning-tree detail                     View all possible STP information for all VLANs. Port information is very detailed.
# show spanning-tree [vlan vlan-id] summary     View the total number of switch ports currently in each of the STP states.
# show spanning-tree [vlan vlan-id] root        Find the root bridge ID, the root port, and the root path cost.
# show spanning-tree [vlan vlan-id] bridge      Show BID and STP timers for each VLAN of the local switch.
# show spanning-tree interface type port        Show the STP activity on a specific interface.
# show spanning-tree uplinkfast                 Show the STP UplinkFast status.
# show spanning-tree backbonefast               Show the STP BackboneFast status.
Additional features
Root Guard
Root Guard prevents a Designated port from becoming a Root port (prevents a change of Root
Bridge). This feature is enabled on a port (whose switch doesn’t have to be the root bridge)
so that a better configuration BPDU will put the port into the root-inconsistent state, in
which the port can only forward BPDUs. It can’t receive BPDUs and can’t send/receive data for
any VLANs. Once superior BPDUs are no longer received, the port cycles through the normal STP
states.
This feature should be enabled on all switches in a domain. It’s configured with
‘(config-if)#spanning-tree guard root’. Verify with the ‘show spanning-tree inconsistentports’
command for errors; the configuration itself only shows in ‘show run’.
Root Guard can’t be used with BPDU Guard or Loop Guard.
BPDU Guard
When PortFast is enabled on a port, STP is disabled, but the port can still detect a loop
during the first 50 seconds or so.
BPDU Guard is used to protect the PortFast port by putting the port into the errdisable state
any time a BPDU is received. The port remains in that state until 1) you issue ‘shut’ then
‘no shut’, or 2) ‘errdisable recovery cause bpduguard’ re-enables it after 300 seconds
(validate with ‘show errdisable recovery’ and change the timer with ‘(config)#errdisable
recovery interval TIME’).
When the port times out, it returns to the forwarding state through the normal STP cycle.
Ports connected to hubs should have BPDU Guard on, as a hub may repeat BPDUs from another
switch. You should not enable BPDU Guard on any switch uplink.
BPDU Filter
If BPDU Filter is enabled, the switch port can’t send or receive BPDUs. It’s enabled with the
‘(config)#spanning-tree portfast bpdufilter default’ or ‘(config-if)#spanning-tree bpdufilter
{enable | disable}’ command. It doesn’t put such a port in the errdisable state, but it
disables STP on that port (in disabled state); validate with ‘show spanning-tree summary’.
Loop Guard
Loop Guard checks Root Ports and Blocked Ports (all non-designated ports) to ensure they
receive BPDUs. Without it, a port can be brought up (from the blocked state) because BPDUs
are no longer detected (possibly due to a unidirectional link); the port cycles through the
STP states once the Max Age timer for the current BPDU expires. This would result in a loop.
Loop Guard continuously monitors BPDUs on non-designated ports; if they go missing, the port
is placed into the loop-inconsistent state. When BPDUs are received again, the port moves
through the normal states. Loop Guard disables ports on a per-VLAN basis. Loop Guard:
- Can’t be enabled on the same port as Root Guard
- Doesn’t affect UplinkFast or BackboneFast
- Should be enabled on PtP (full-duplex) links only
- Isn’t affected by STP timers
- Can’t detect a unidirectional link (best implemented with UDLD)
- Shouldn’t be enabled on PortFast or Dynamic VLAN ports.
UDLD
All our cables are bidirectional, meaning we can both send and receive data. However, a link
may sometimes become unidirectional, meaning it can only send data or only receive data.
This may cause the device on the other end to think its neighbor is no longer present and
choose to open up the blocking port; this can easily cause a loop to form.
Unidirectional Link Detection (UDLD) solves this problem by monitoring a port to see if it’s
truly unidirectional (if packets are being received on one side only).
A UDLD protocol packet (containing this device’s and the neighbor’s port ID) is sent to the
neighbor switch every 15 seconds (to 01-00-0C-CC-CC-CC), which the neighbor should echo back
along with its acknowledgement (if not, the port is shut down). Link status can be determined
after 3 messages. This takes 45 seconds, which is before STP moves another link to the
forwarding state. Here are the fields contained in a UDLD frame:
Field Description
Device ID This field contains the MAC address of the sending device.
Port ID This field contains the module and port number of the sending device.
Echo This field contains the module and port pair known by the sending device.
Message Interval This field contains the transmit interval of the sending device.
Timeout Interval This field contains the timeout interval of the sending device.
Device Name This field contains the CDP Device ID string of the sending device.
Sequence Number This field contains the number used to validate discovery packets.
Reserved These fields are reserved for future use.
Both ends must be configured for UDLD. This means that on a link, there are 2 UDLD processes
running simultaneously and independently. UDLD can be enabled on any port without design
consideration. UDLD has 2 modes of operation:
- Normal mode: when a unidirectional link is detected, the port is allowed to continue its
operation. UDLD marks the port as ‘undetermined’ and generates a syslog message.
- Aggressive mode: when no acknowledgement is heard back (even if the message comes
back), the switch tries to re-establish the connection by sending messages every second for 8
seconds. If no reply is heard, the port is placed in the errdisable state. You can re-enable
such a port with ‘#udld reset’.
When UDLD is enabled for the first time, it keeps sending UDLD messages until it hears
a reply; only then does it start assessing whether the link is unidirectional.
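The timing argument above can be checked with a few lines of arithmetic. This is a minimal sketch: the 15-second interval and the 3 messages come from the text, while the 802.1D defaults (Max Age 20 s, Forward Delay 15 s) and the function names are assumptions added for illustration.

```python
# Sketch: compare UDLD detection time with the time 802.1D needs to move
# a blocked port to Forwarding. Function names are illustrative only.
UDLD_MESSAGE_INTERVAL_S = 15   # default interval stated in the text
UDLD_MESSAGES_NEEDED = 3       # messages needed to judge the link

STP_MAX_AGE_S = 20             # assumed 802.1D default
STP_FORWARD_DELAY_S = 15       # assumed 802.1D default


def udld_detection_time(interval_s=UDLD_MESSAGE_INTERVAL_S,
                        messages=UDLD_MESSAGES_NEEDED):
    """Seconds UDLD needs before it can judge the link."""
    return interval_s * messages


def stp_time_to_forwarding(max_age_s=STP_MAX_AGE_S,
                           forward_delay_s=STP_FORWARD_DELAY_S):
    """Max Age expiry, then Listening and Learning (one Forward Delay each)."""
    return max_age_s + 2 * forward_delay_s


if __name__ == "__main__":
    print(udld_detection_time(), "s for UDLD,",
          stp_time_to_forwarding(), "s for STP")
```

With these values UDLD needs 45 seconds and STP needs 50, so UDLD can act before the blocked port starts forwarding.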
BPDU
RSTP BPDU has the same format as STP BPDU (thus, backward compatible), except that its
protocol version identifier = 2.
Another difference is that RSTP utilizes all 8 bits (instead of 2) of the flag byte to indicate
different types of BPDU.
Bit 0  Bit 1     Bit 2-3    Bit 4     Bit 5       Bit 6      Bit 7
TC     Proposal  Port role  Learning  Forwarding  Agreement  TCA
RSTP BPDUs are sent from every switch every 2 seconds. Max age is 6 seconds, and
message age is simply used as a hop count (number of switches crossed) instead of in a timer
calculation.
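To make the flag byte concrete, here is a small sketch that decodes it. It assumes the 802.1w bit assignment (bit 0 TC, bit 1 Proposal, bits 2-3 port role, bit 4 Learning, bit 5 Forwarding, bit 6 Agreement, bit 7 TCA); the function and dictionary names are made up for this example.

```python
# Sketch: decode the RSTP BPDU flag byte (bit 0 = least significant bit).
PORT_ROLES = {0: "unknown", 1: "alternate/backup", 2: "root", 3: "designated"}


def decode_rstp_flags(flags):
    """Return the meaning of each flag bit as a dictionary."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("flag byte must fit in 8 bits")
    return {
        "topology_change": bool(flags & 0x01),
        "proposal": bool(flags & 0x02),
        "port_role": PORT_ROLES[(flags >> 2) & 0x03],
        "learning": bool(flags & 0x10),
        "forwarding": bool(flags & 0x20),
        "agreement": bool(flags & 0x40),
        "topology_change_ack": bool(flags & 0x80),
    }


if __name__ == "__main__":
    # 0x3C = designated port that is learning and forwarding
    print(decode_rstp_flags(0x3C))
```

A plain 802.1D BPDU only ever sets bit 0 (TC) and bit 7 (TCA), which is why the same decoder also works for STP flags.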
Port
Synchronization
If a configuration BPDU is not heard back, the switch assumes the neighbor is running STP and
starts using 802.1D logic.
Topology Change
A topology change in RSTP only occurs when a non-edge port moves to the Forwarding state.
In STP, there are TC and TCA BPDUs, whereas in RSTP, there is only the TC Configuration BPDU
(no TCA BPDU and no more TCN BPDU) unless an STP bridge exists. In STP, the TCN BPDU is
generated by the switch where the topology change occurs, and it requires the Root Bridge to
send out configuration BPDUs with the TC bit set.
In RSTP, the switch where the topology change takes place can send out the TC BPDU itself.
When this BPDU is sent out, the switch starts a TC timer (by default, 4 seconds) and keeps
sending the BPDU during this time. This switch flushes all MAC addresses except those
of edge ports. In other words, every port that sends out the TC BPDU has its MAC addresses
flushed.
A switch that receives the TC BPDU flushes all MAC addresses except those on the port where
the BPDU arrived. It also starts a TC timer while it propagates the TC BPDU.
Verify with the ‘show spanning-tree vlan’ command. A type of ‘P2p Peer (STP)’ indicates a
neighbor running STP.
Compatibility
By default, 802.1D drops 802.1W frames. This means that the 802.1D switch will send BPDUs
(thinking it’s the Root Bridge) because the RSTP BPDU can’t be processed.
On the other hand, the 802.1W switch sends out RSTP BPDUs (because no RSTP response is
heard) and starts the migration delay timer, which is 3 seconds by default; the port is now
said to be in compatibility mode. This means the 802.1W switch continues to send RSTP
BPDUs every 3 seconds and the port can accept any type of BPDU.
When the migration timer ends, the 802.1W switch checks the STP type. If an STP BPDU has
been received, it starts to use STP BPDUs (and logic) instead. Note that the RSTP switch
can’t revert to RSTP on its own. This behavior can spread until the entire switch domain
uses STP.
Note: during the migration delay timer, the 802.1W switch can generate responses to 802.1D
BPDUs such as TCN and TCA BPDUs.
Note: if the TC timer is active on a Root Port connected to an 802.1D switch and a BPDU
with the TCA bit set is received, the TC timer is reset.
There are 2 types of RSTP, RPVST+ and MST; both run RSTP, so everything they do
follows the rules of RSTP. To configure RPVST+, use ‘(config)#spanning-tree mode rapid-pvst’;
you need to ‘reload’ the switch. Verify with the ‘show spanning-tree summary’ or
‘show spanning-tree bridge protocol’ command.
Multiple Spanning Tree (MST), defined in 802.1s, has many advantages: all VLANs can
be load balanced, the links are well utilized, and the burden on the CPU is small.
An MST region defines the boundary within which MST operates. For switches to be in the
same region, these items must be the same:
1. MST region name (up to 32 bytes or characters), manually configured
2. Configuration revision number (0 - 65,535), manually configured; it can’t be
dynamically changed or propagated
3. VLAN-to-instance mapping (a table of 4096 entries), even if that instance/VLAN doesn’t
exist on this switch. One VLAN can only be mapped to one instance.
A switch can belong to ONLY one MST region. Thus, that region must include all VLANs the
switch runs.
MST region management can be performed by VTP v3.
An MST BPDU has a format similar to RSTP except that the protocol version identifier = 3, and
MST sends only one BPDU to each neighbor from each switch port. The MST BPDU contains an
MST extension field called the M-record; the M-record for the IST MUST be transmitted, along
with some optional M-records (if that interface contains VLANs belonging to that instance).
The flag byte is the same as the RSTP field.
‘MST configuration digest’ is a field in the MST BPDU that is the hash result of the
VLAN-to-instance mapping. This is used to verify that both switches have the same mapping.
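The region-membership rule can be sketched as a comparison of three values: name, revision, and a digest of the 4096-entry VLAN-to-instance table. This is only an illustration. Plain MD5 is used here as a stand-in for the digest algorithm defined in the standard (which this sketch does not reproduce), and the class and function names are hypothetical.

```python
# Sketch: decide whether two switches would consider themselves to be in
# the same MST region. hashlib.md5 is a stand-in digest, not the real one.
import hashlib
from dataclasses import dataclass, field


@dataclass
class MstConfig:
    name: str
    revision: int
    vlan_to_instance: dict = field(default_factory=dict)  # VLAN -> instance


def mapping_digest(vlan_to_instance):
    """Hash the full 4096-entry table; unmapped VLANs fall into instance 0."""
    table = bytearray()
    for vlan in range(4096):
        table += vlan_to_instance.get(vlan, 0).to_bytes(2, "big")
    return hashlib.md5(bytes(table)).hexdigest()


def same_region(a, b):
    """All three attributes must match for two switches to share a region."""
    return (a.name == b.name
            and a.revision == b.revision
            and mapping_digest(a.vlan_to_instance)
            == mapping_digest(b.vlan_to_instance))


if __name__ == "__main__":
    sw1 = MstConfig("CAMPUS", 1, {10: 1, 20: 2})
    sw2 = MstConfig("CAMPUS", 1, {20: 2, 10: 1})
    print(same_region(sw1, sw2))
```

Because only a digest of the table is compared, a switch never needs to receive its neighbor's full mapping to detect a mismatch.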
When a switch port realizes that its neighbor (on the same segment) is not running the
same MST as it is, it detects the neighbor’s STP type from the BPDU it receives and tries to
cope with the neighbor.
MST computation relies on hop count; the default max hop count = 20.
IST and MISTP
MST groups VLANs into different instances, which can be identified by a number ranging
from 0 to 15; newer switches use instances 0 to 64. Instance 0 is also known as the Internal
Spanning Tree (IST), while any other instance x is also known as multiple instance
STP x (MISTP x), where x can be a value from 0 to 4096.
Neither uses Message Age or Max Age; Path Cost and a max hop count (similar to the IP
TTL) are used to calculate the topology instead. The hop count can be adjusted with
‘(config)#spanning-tree mst max-hops [1 - 40]’
All VLANs that are not mapped to an MST instance become part of the IST, or MST instance 0;
you can also think of it as CST. By default, when you enable MST, it puts all VLANs into the
IST when there is no VLAN association mapping. All switches are connected with the IST.
If a VLAN is not mapped to instance 0, it can’t communicate with the outside world.
The IST presents the MST region as if it’s a single virtual bridge (cost and age are
incremented as for just one switch). Only information in the IST is contained in the BPDUs
sent out from boundary ports. BPDUs are exchanged over the native VLAN of trunks (at boundary
ports).
The IST works out a loop-free topology for the links that connect the MST region to CST. The
Root Bridge in the IST is known as the IST master, elected based on the best BID and Path Cost
to the CST Root.
If there is only a single region (or CST is running), the IST master is also the CST Root. If
the CST Root is outside the region, the closest switch (at the boundary) is the IST master.
The IST master BID and the Root (to CST Root) path cost are included in the MST BPDU.
Each region has its own spanning-tree instance and doesn’t affect the spanning-tree
instance of another region.
MST configuration
Task                              Command Syntax
Set STP type to MST               (config)#spanning-tree mode mst
Enter MST configuration           (config)#spanning-tree mst configuration
Create MST region name            (config-mst)#name MST_NAME
Set revision number               (config-mst)#revision REV_NUM
Create MST mapping                (config-mst)#instance INS_ID vlan VLAN_LIST
Set root bridge (macro)           (config)#spanning-tree mst instance-id root {primary | secondary} [diameter diameter]
Set bridge priority               (config)#spanning-tree mst instance-id priority bridge-priority
Set port cost                     (config-if)#spanning-tree mst instance-id cost cost
Set port priority                 (config-if)#spanning-tree mst instance-id port-priority port-priority
Set STP timers                    (config)#spanning-tree mst hello-time seconds
                                  (config)#spanning-tree mst forward-time seconds
                                  (config)#spanning-tree mst max-age seconds
Display MST config                #show spanning-tree mst configuration
Show MST config for an instance   #show spanning-tree mst INS_ID
Force port type change (‘shut’,   #clear spanning-tree detected-protocols [interface TYPE MOD/NUM]
this command, ‘no shut’)
Campus Network Design
A successful network design should be scalable; you should also aim for networks that can
adapt to a variety of traffic flows.
The distribution layer is where access-layer switches interconnect, and the core layer is where
distribution-layer switches aggregate. There should be 3 kinds of traffic flow:
Service Type  Location of Service             Extent of Traffic Flow
Local         Same segment/VLAN as user       Access layer only
Remote        Different segment/VLAN as user  Access to distribution layers
Enterprise    Central to all campus users     Access to distribution to core layers
To maintain organization, simplicity, and predictability, you can design a campus network
in a logical manner, using a modular approach. In this approach, each layer of the
hierarchical network model can be broken into basic functional units. These units, or
modules, can then be sized appropriately and connected, while allowing for future scalability
and expansion. You can divide enterprise campus networks into:
- Switch block: a group of access-layer switches, with their distribution switches
- Core block: the campus network’s backbone.
Generally, you should provide 2 distribution switches in each switch block for redundancy.
There are 2 possible designs:
- Full mesh between switches: 1) the switch block depends on spanning-tree convergence, and
2) the link between the 2 distribution-layer switches must be a Layer 2 link that carries the
access VLANs.
- Daisy chain, where access switches connect to the distribution switch using a single link:
1) the switch block depends on spanning-tree convergence, 2) the link between the 2
distribution-layer switches must be a Layer 2 link that carries the access VLANs, 3) there is
no redundant path, and 4) users can be isolated from their Layer 3 gateway, causing strange
behavior.
The recommended practice is to keep Layer 2 VLANs within the access switches; however, this is
only possible if the access switch is an MLS. With a full-mesh design, there is then no
dependency on STP because the VLANs don’t extend to the distribution switches (connected to
the access switch by Layer 3 links); the network converges using a routing protocol instead.
The core block needs efficiency and redundancy. The link between a distribution switch and a
core switch can be Layer 2 or Layer 3. In the case of a Layer 2 link, an SVI provides routing
for the VLAN.
Here are 2 core block designs: collapsed core and dual core.
A collapsed core is a network design in which the core layer is also the distribution layer.
This design integrates the core block with the switch block. Each access switch should have
a redundant path to each distribution switch.
Layer 3 redundancy is provided through a redundant gateway protocol such as HSRP.
Dual core is the design of connecting multiple switch blocks in a redundant fashion.
MLS is recommended because this way, multiple links can be used (between core
switches) without being blocked (Layer 3). A core switch must be able to handle its links at
100% capacity, and it must have a high density of high-speed ports.
In a campus network, the large number of connections does not result in too many peerings
because each peer has redundant links, meaning the core switch only counts the number of
actual peers instead of the number of links.
Scaled Switching: composed of only switches at every layer. Low cost, easy management. Only
a single broadcast domain.
Large Switching with Minimal Routing: switches at the access, distribution, and core layers.
Not very efficient; uses the legacy 80/20 rule.
Distributed Routing and Switching: best design for the 20/80 rule, most commonly used. Follows
the LAN hierarchical network model.
When a host wants to send data to a far-end host (different subnet), there are 2 possible
situations. If the local host knows the IP address of its local default gateway, it sends an
ARP request asking for the gateway’s MAC address, or uses an entry (if one already exists) in
its ARP cache.
However, if the default gateway or the subnet mask is misconfigured, the
local router replies to ARP requests for remote hosts. This is proxy ARP.
Either situation leaves a single point of failure. Gateway redundancy protocols (HSRP, VRRP,
GLBP) allow multiple devices to share the default gateway address so that if one device is
down, another can take over its role and continue functioning; this is similar to an IPv6
anycast address.
Note: when an end device has the correct subnet mask and default gateway, and it needs to
reach a remote host, it can directly send the data to the default gateway if the end device
already has its MAC address. On the other hand, if the end device doesn’t have a default
gateway (equal to the router), or has an incorrect subnet mask, it broadcasts an ARP request
for the MAC address of that remote host, which is answered by the router.
HSRP
A good design should have the primary gateway also acting as the STP Root Bridge.
HSRP allows several default gateway devices to appear as a single IP address, which is
activated by ‘(config-if)#standby GROUP_NO ip IP_ADD [secondary]’ (if the interface
already has an IP address configured, this must be a different IP address but in the same
subnet); the ‘secondary’ keyword is used if the actual interface address is a secondary
address.
All routers that provide redundancy for one default gateway address belong to the same
HSRP group, which is identified by a number from 0 to 255.
HSRP can be assigned to a VLAN interface (int vlan x), but at most 16 groups are allowed.
However, every instance of HSRP is locally significant, which means using the same HSRP
identifier in 2 different VLANs creates 2 different HSRP instances.
Because hosts use ARP to find the gateway, a corresponding MAC address for the HSRP group
must exist, which is 0000.0c07.acXX, where XX is the group number in hexadecimal.
You may want to use the switch’s physical MAC address as the virtual MAC address for the
HSRP group: ‘(config-if)#standby use-bia’. Or, if you want a specific MAC address to
participate in HSRP instead of the default, use ‘(config-if)#standby NO mac-address ADD’.
There are 2 versions of HSRP, v1 (Version field = 0) and v2 (Version field = 1), with v1
used by default. All routers communicate and establish their presence by sending Hello
messages to 224.0.0.2, UDP port 1985, every 3 seconds. The hold timer is 10 seconds. Both
can be manually adjusted with ‘(config-if)#standby GROUP_NO timers [msec] HELLO_TIME
[msec] HOLD_TIME’. Use the optional msec keyword if you wish to specify the time
in milliseconds.
HSRP version 2 uses a Type/Length/Value (TLV) format and is capable of millisecond timers.
Group numbers range from 0 to 4095. The source router is identified by its physical MAC
address in the ‘Identifier’ field. The virtual MAC address uses a different range since the
group number range is different: 0000.0C9F.FXXX. Configure the version with
‘(config-if)#standby NO version [1 | 2]’
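The two virtual MAC formats can be derived from the group number as follows. This is a small sketch based only on the formats given above (0000.0c07.acXX for v1, 0000.0C9F.FXXX for v2); the function name is made up for the example.

```python
# Sketch: build the HSRP virtual MAC address for a group number.
def hsrp_virtual_mac(group, version=1):
    """Return the virtual MAC in Cisco dotted notation (lowercase hex)."""
    if version == 1:
        if not 0 <= group <= 255:
            raise ValueError("HSRPv1 group must be 0-255")
        raw = f"00000c07ac{group:02x}"      # XX = group in hex
    elif version == 2:
        if not 0 <= group <= 4095:
            raise ValueError("HSRPv2 group must be 0-4095")
        raw = f"00000c9ff{group:03x}"       # XXX = group in hex
    else:
        raise ValueError("version must be 1 or 2")
    return ".".join(raw[i:i + 4] for i in range(0, 12, 4))


if __name__ == "__main__":
    print(hsrp_virtual_mac(10))
    print(hsrp_virtual_mac(10, version=2))
```

Group 10 therefore answers ARP with 0000.0c07.ac0a under v1 and 0000.0c9f.f00a under v2.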
For MLS, you can configure HSRP groups on different VLANs, using SVIs to establish the
connection. However, most Catalyst switches allow only 16 unique HSRP group numbers. In
order to use more HSRP groups, you can place the same HSRP group number on different
VLANs; this creates different, independent processes.
To verify HSRP, use ‘show standby [brief] [vlan VID | TYPE MOD/NUM]’
HSRP election
The election is based on the highest priority (values range from 0 to 255, with 100 being the
default); if the priority is tied, the highest IP address wins. You can adjust the priority
with ‘(config-if)#standby GROUP_NO priority PRI’.
One router is elected the primary, or active, HSRP router; another becomes the standby HSRP
router; the remaining routers stay in the HSRP Listen state.
The standby router monitors Hello messages from the primary router; if no Hello messages are
received within the hold timer, the primary router is declared down.
The standby router then takes the primary router’s place, and a new standby router is elected
from the routers in the Listen state. No new active router can be elected until the
current active router fails, unless preemption is configured.
Note that the first router to start up becomes the active router (initially no router is
active) even if it doesn’t have the highest priority. To let a router challenge the current
active router, use ‘(config-if)#standby GROUP_NO preempt [delay
[minimum SEC] [reload SEC]]’. By default, preemption is disabled; once enabled, the local
router immediately preempts if it has a higher priority (without optional configuration).
- The minimum keyword forces the router to wait for SEC seconds (0 to 3600) before
attempting to overthrow an active router. This time begins as soon as the router is capable
of assuming the active role, or after the router is up and HSRP is configured.
- The reload keyword forces the router to wait for SEC seconds (0 to 3600) after it has been
reloaded before it attempts to preempt. During this period of time, the router can
gather routing protocol information in order to route.
Preemption doesn’t mean the STP topology also changes.
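The election and preemption rules above can be modeled in a few lines. This is a simplified sketch, not the protocol state machine: it ignores timers and the standby role, and the `Router` class and `elect_active` function are hypothetical names.

```python
# Sketch: who is the active HSRP router after new routers come up?
from dataclasses import dataclass
from ipaddress import IPv4Address


@dataclass
class Router:
    name: str
    ip: str
    priority: int = 100      # default priority from the text
    preempt: bool = False    # preemption is disabled by default


def rank(router):
    """Highest priority wins; a tie is broken by the highest IP address."""
    return (router.priority, int(IPv4Address(router.ip)))


def elect_active(current_active, candidates):
    """Return the active router once the candidates have joined the group."""
    if current_active is None:
        # No active router yet: a normal election among the candidates.
        return max(candidates, key=rank)
    # An existing active router is only replaced by a preempting router
    # with a strictly higher priority.
    challengers = [r for r in candidates
                   if r.preempt and r.priority > current_active.priority]
    return max(challengers, key=rank) if challengers else current_active


if __name__ == "__main__":
    r1 = Router("R1", "10.0.0.2")
    r2 = Router("R2", "10.0.0.4", priority=150, preempt=True)
    print(elect_active(r1, [r2]).name)
```

Setting `preempt=False` on R2 in the usage above leaves R1 active even though R2 has the higher priority, which is the "first router up stays active" behavior described in the text.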
HSRP messages
HSRP states
HSRP authentication
HSRP ignores unauthenticated HSRP messages and, by default, Cisco devices use a plain-text
authentication string of ‘cisco’. You can change the password to another plain-text password
with ‘(config-if)#standby GROUP_NO authentication text PASS’
You can configure MD5 authentication with ‘(config-if)#standby GROUP_NO authentication
md5 key-string [0 | 7] PASS [timeout SEC]’, where the password should be at least 16
characters (recommended) but fewer than 64 characters.
You can also use a key chain instead with ‘(config-if)#standby GROUP_NO authentication
md5 key-chain NAME’, where the key chain has to be configured beforehand.
If ‘0’ is used, this is equivalent to the ‘standby authentication’ command; if ‘7’ is used,
the password is encrypted if ‘service password-encryption’ is enabled. In the key chain
configuration, you can likewise use ‘key-string [0 | 7]’ when specifying the key.
When this command is set, the clear-text password becomes 0, indicating it’s disabled.
The timeout value is the time after which the password becomes invalid.
Interface track
HSRP has a special feature with which it can de-prefer a router if one of its interfaces
fails. The purpose of HSRP is to deliver data reliably; if a router’s outgoing interface is
down, it can’t deliver the data and thus is not reliable.
You can track a certain interface with ‘(config-if)#standby GROUP_NO track TYPE MOD/NUM
[DECRE-VALUE]’, where TYPE MOD/NUM specifies the interface you would like to
track. For instance, if you have enabled HSRP group 2 on Fa 0/0 but would like to track
S 0/0/0, you configure ‘standby 2 track S 0/0/0’ in Fa0/0 configuration mode.
DECRE-VALUE is used to make a router less favorable if the tracked interface
is down. By default, this is 10. This means 10 is taken away from the router priority if S
0/0/0 is down, and regained when S 0/0/0 comes back up. The only way a router can resume its
role after an interface failure is ‘standby preempt’.
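The effect of tracking on the priority is simple subtraction, sketched below. The default decrement of 10 comes from the text; the function name and the tuple layout of `tracked` are assumptions made for this example.

```python
# Sketch: HSRP priority after tracked interfaces go down.
DEFAULT_DECREMENT = 10  # default DECRE-VALUE from the text


def effective_priority(configured, tracked):
    """tracked: iterable of (interface_is_up, decrement_or_None) pairs."""
    priority = configured
    for is_up, decrement in tracked:
        if not is_up:
            priority -= DEFAULT_DECREMENT if decrement is None else decrement
    return max(priority, 0)  # never report a negative priority


if __name__ == "__main__":
    # Priority 100, one tracked serial interface down, default decrement.
    print(effective_priority(100, [(False, None)]))
```

With a peer at priority 95 and preemption enabled, the drop from 100 to 90 in this usage is enough for the peer to take over; when the interface recovers, the original router needs preemption as well to reclaim the active role.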
HSRP load-balance
It’s not possible to load balance traffic across 2 HSRP routers with a single HSRP group;
the solution is to use 2 HSRP groups, each with a different HSRP router as the active router.
Users are evenly distributed across the groups to load balance traffic, so each half of the
devices uses one of the routers as its default gateway.
However, this increases the amount of work on the CPU. One solution is to use client/slave
groups. These groups follow the master group without participating in the HSRP election and
therefore don’t need to exchange many messages. However, they do send refresh
messages to keep their MAC address. Use ‘(config-if)#standby MASTER name NAME’ and
‘(config-if)#standby SLAVE follow NAME’.
VRRP
GLBP
In HSRP and VRRP load-balancing, the hosts must be statically (partially) assigned to
different gateways. GLBP is a Cisco-proprietary protocol used to solve this issue. GLBP
operates at 224.0.0.102, using UDP port 3222.
In GLBP, all switches assigned to the same group can participate and offer load balancing
by forwarding a portion of the overall traffic. All hosts use the same gateway address (same
virtual router IP address), but when a host issues an ARP request, only the virtual MAC
address of a selected router is returned.
You can configure the AVG to store a database of the hosts using the default gateway, and
other information, in the client cache with the ‘glbp client-cache’ command, and verify it
with the ‘show glbp detail’ command.
One router is elected as the active virtual gateway (AVG), using the same election
method as HSRP and VRRP. The AVG assigns and distributes the virtual MAC addresses using a
load-balancing algorithm.
In a group, at most 4 virtual MAC addresses can be assigned by the AVG, to itself and to 3
other routers; a router that holds one is an active virtual forwarder (AVF). The unassigned
routers (or routers that learned a virtual MAC address from hello messages) serve as backup
or secondary virtual forwarders.
The virtual MAC address has the format 0007.b40X.XXYY, where XXX is 2 bits of 0 + 10 bits of
group number in binary (together forming 3 hexadecimal digits), and YY is the 8-bit virtual
forwarder number.
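The bit layout of the GLBP virtual MAC can be reproduced with shifts. This is a sketch of the format described above (0007.b4, zero padding, 10-bit group, 8-bit forwarder number); the function name is invented for the example.

```python
# Sketch: build a GLBP virtual MAC from the group and forwarder numbers.
GLBP_PREFIX = 0x0007B4  # fixed upper 24 bits of the virtual MAC


def glbp_virtual_mac(group, forwarder):
    """Return the virtual MAC in Cisco dotted notation (lowercase hex)."""
    if not 0 <= group <= 1023:
        raise ValueError("group must fit in 10 bits (0-1023)")
    if not 0 <= forwarder <= 255:
        raise ValueError("forwarder number must fit in 8 bits")
    # prefix (24 bits) | zero padding (6 bits) | group (10) | forwarder (8)
    value = (GLBP_PREFIX << 24) | (group << 8) | forwarder
    raw = f"{value:012x}"
    return ".".join(raw[i:i + 4] for i in range(0, 12, 4))


if __name__ == "__main__":
    print(glbp_virtual_mac(1, 1))
    print(glbp_virtual_mac(1023, 4))
```

Group 1, forwarder 1 gives 0007.b400.0101, and the largest group (1023) with forwarder 4 gives 0007.b403.ff04, which shows the group spilling into the "X" digit of 0007.b40X.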
It’s possible that, when one AVF router fails, the AVG assigns its virtual MAC address to
another router that is already an AVF. That router must now carry the load of 2 routers. GLBP
uses the redirect and timeout timers to reclaim the address. They can be configured with the
‘(config-if)#glbp GROUP_NO timers redirect REDIRECT TIMEOUT’ command.
During the redirect timer, traffic is still evenly distributed between all virtual MAC
addresses, meaning the router holding 2 of them continues to forward traffic sent to both
addresses. It defaults to 600 seconds (10 minutes) and ranges from 0 to 3600 seconds.
When the timeout timer expires, the old virtual MAC address is reclaimed and flushed from all
GLBP peers. Hosts using the old MAC address must refresh their ARP entry to obtain a new
virtual MAC address. It defaults to 14,400 seconds (4 hours) and ranges from 600 to 64,800
seconds.
From this, you understand that a GLBP network stops handing out a failed forwarder’s virtual
MAC address after 10 minutes and flushes it after 4 hours by default, and that until then
the traffic is redirected so hosts can continue using that default gateway.
Weighting feature
GLBP uses a weighting function to determine which router becomes the owner of the new
AVF. Each router begins with a maximum weight value (range from 1 to 254, default 100);
when tracked interfaces go down, the weight decreases, and it increases again when the
interfaces come back. A secondary AVF can become AVF once the current AVF has fallen below
the threshold for 30 seconds. If there is no tracked interface, the weight stays at its
maximum value.
A tracked object is configured with ‘(config)#track OBJ_NO interface TYPE MOD/NUM
{line-protocol | ip routing}’, so the interface can be tracked based on line protocol status
or on IP routing (interface up + routing enabled + valid IP address). OBJ_NO can range from
1 to 500.
There is a range, bounded by thresholds, within which the router’s weight qualifies it as an
AVF. This range is established by a lower bound (default 1) and an upper bound (default 100).
The router with the highest weight (in the threshold range) becomes the new AVF.
This uses the command ‘(config-if)#glbp GROUP_NO weighting MAX [lower LOWER]
[upper UPPER]’. MAX is the maximum weight value, which is the initial weight value
assigned to a router.
The amount by which the weight decrements is set with ‘(config-if)#glbp GROUP_NO weighting
track OBJ_NO [decrement VAL]’; the default is 10.
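The interaction between the weight, the tracked objects, and the two thresholds can be sketched as a small state holder. This is an illustration under assumptions: the router is taken to give up forwarding when the weight drops below the lower bound and to resume only once the weight is back at or above the upper bound, the 30-second hold mentioned in the text is not modeled, and `GlbpWeight` is a hypothetical class name.

```python
# Sketch: GLBP weighting with lower/upper thresholds (hysteresis).
class GlbpWeight:
    def __init__(self, maximum=100, lower=1, upper=100):
        self.maximum = maximum
        self.lower = lower
        self.upper = upper
        self.decrements = {}      # tracked object number -> decrement while down
        self.can_forward = True

    @property
    def weight(self):
        """Maximum weight minus the decrement of every object that is down."""
        return max(0, self.maximum - sum(self.decrements.values()))

    def update(self, obj_no, is_up, decrement=10):
        """Record a tracked object's state and re-evaluate the AVF role."""
        if is_up:
            self.decrements.pop(obj_no, None)
        else:
            self.decrements[obj_no] = decrement
        if self.weight < self.lower:
            self.can_forward = False
        elif self.weight >= self.upper:
            self.can_forward = True
        # Between the two bounds the previous state is kept.
        return self.can_forward


if __name__ == "__main__":
    g = GlbpWeight(100, lower=85, upper=95)
    for obj, up in [(1, False), (2, False), (2, True), (1, True)]:
        print(obj, up, g.update(obj, up), g.weight)
```

The gap between the two bounds prevents a router from flapping in and out of the AVF role when a single tracked interface bounces.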
Load-balance algorithm
The AVG assigns virtual MAC addresses using one of the following methods:
- Round robin— Each new ARP request for the virtual router address receives the next
available AVF. Traffic load is distributed evenly across all routers, assuming that each of
the clients sends and receives the same amount of traffic. This is the default method used
by GLBP.
- Weighted— weighting value determines the proportion of traffic that should be sent to
that AVF. A higher weighting results in more frequent ARP replies containing the virtual
MAC address of that router. If interface tracking is not configured, the maximum weighting
value configured is used to set the relative proportions among AVFs.
- Host dependent— Each client that generates an ARP request for the virtual router
address always receives the same virtual MAC address in reply. This method is used if the
clients have a need for a consistent gateway MAC address.
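The three methods differ only in how the AVG picks the virtual MAC for each ARP reply, which the sketch below imitates. It is not the algorithm Cisco implements: the CRC32 hash for the host-dependent case and the repeating pattern for the weighted case are stand-ins chosen for illustration, and all function names are hypothetical.

```python
# Sketch: three ways an AVG could choose a virtual MAC for an ARP reply.
import itertools
import math
import zlib
from functools import reduce


def round_robin(avf_macs):
    """Endless iterator handing out the AVF MACs in turn."""
    return itertools.cycle(avf_macs)


def weighted(weights):
    """Endless iterator in which each MAC appears in proportion to its weight."""
    divisor = reduce(math.gcd, weights.values())
    pattern = [mac for mac, w in weights.items() for _ in range(w // divisor)]
    return itertools.cycle(pattern)


def host_dependent(client_mac, avf_macs):
    """The same client always gets the same MAC (stable hash as a stand-in)."""
    index = zlib.crc32(client_mac.lower().encode()) % len(avf_macs)
    return avf_macs[index]


if __name__ == "__main__":
    macs = ["0007.b400.0101", "0007.b400.0102"]
    rr = round_robin(macs)
    print([next(rr) for _ in range(4)])
    print(host_dependent("aaaa.bbbb.cccc", macs))
```

In the weighted case, weights of 100 and 50 produce a repeating pattern of two replies for the first router and one for the second.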
A switch keeps power disabled while the switch port is down. However, the switch
continuously tries to detect whether a powered device is connected to the port. If one is,
the switch must begin providing power to that device. Then the Ethernet link on that port is
established.
Power detection using 802.3af: the switch applies a small voltage across the transmit
and receive pairs of the copper twisted-pair connection. Resistance is used to determine
whether a powered device is there: if 25K ohm is measured, a powered device is present.
The switch also applies several predetermined voltages to test for corresponding resistance
values. The powered device presents different resistances to identify itself as one of the
five 802.3af power classes.
Power Class  Maximum Power Offered at 48V DC  Notes
0            15.4 W                           Default class
1            4.0 W                            Optional class
2            7.0 W                            Optional class
3            15.4 W                           Optional class
4            Up to 50 W                       Optional class (802.3at)
Class 0 is used when either the switch or the powered device doesn’t support or doesn’t
attempt the optional power class discovery. Class 4 is also used for 802.3at, also known as
PoE Plus.
Power detection using ILP: the switch sends out a 340 kHz test tone on the transmit pair of
the Ethernet cable to detect whether the connected device is inline power capable. If it is,
the test tone is looped back. An ILP-capable device loops the transmit and receive pairs
while it’s powered off.
By default, switches try to detect and offer inline power in ‘auto’ mode, but this can be
changed with ‘(config-if)#power inline {auto [max MILLI_WATT] | static [max MILLI_WATT] |
never}’, where MILLI_WATT is the maximum power offered, in milliwatts.
Verify with ‘show power inline [TYPE MOD/NUM]’. If 0 - 4 appears under ‘Class’, the
device is using 802.3af and that is its power class. If ‘n/a’ appears, the device is using
ILP.
By default, an IP phone contains 3 ports: one to connect the user PC (access), one to connect
the phone itself (access), and one to connect to the upstream switch (access or trunk). Voice
traffic and its QoS information must be carried over the voice VLAN (VVID) or the regular
data VLAN, which is the native VLAN (PVID).
Remember that a single VLAN travels across an access port and multiple VLANs travel across
a trunk port.
The configuration is all performed on the switch; there is no need to configure the phone.
It’s generally recommended to use an access port between the IP phone and the switch
and designate a data VLAN and a voice VLAN; with 802.1Q, QoS can be provided using the 802.1p
bits. A trunk port would receive unnecessary messages, and an instance of STP would also be
needed. This configuration is known as a multiVLAN access port (MVAP).
When a frame is sent from one host to another, 3 basic things can happen:
- Delay is caused by the time required to send the packet (physical, switching, routing); the
time it takes for a packet to travel from start to finish is its latency.
- Jitter is caused by variation in latency, so that packets don’t all arrive at
predictable times.
- Loss is caused by packets going missing due to congestion or errors; a
connection-oriented protocol can recover from it.
To address and alleviate these conditions, a network can employ 3 basic types of QoS:
- Best-effort delivery, used by default, no QoS
- Integrated services model (IntServ, RFC 1633) uses RSVP to arrange a path for
priority data from source to destination. The source application sends a packet with QoS
parameters through RSVP, and each device on the path checks whether it can meet the minimum
requirement of the application. If yes, a special packet is sent to the source to inform it,
and then the source begins sending.
- Differentiated services model (DiffServ) scales to large networks. Each network device
handles packets on an individual basis by applying different policies to different packets.
QoS is handled dynamically, on a per-hop basis (IntServ works on a per-flow basis). Most
popular.
Note: for QoS to be effective, all devices from the sender to the receiver must implement
the same QoS policy.
Layer 2 QoS is carried in the 802.1p field, also known as the CoS bits (3 bits). Values range
from 0 (lowest-priority delivery) to 7 (highest-priority delivery).
- 802.1Q: the native VLAN receives the default CoS.
- ISL uses a 4-bit User field for QoS; the lower 3 bits are used for CoS. Cisco switches can
change trunk encapsulation and copy these bits.
Layer 3 QoS is carried in the ToS or DS byte, which can hold either a 3-bit IP Precedence or
a 6-bit DSCP with bits named DS0 to DS5.
DSCP is backward compatible with IP precedence. The DSCP bits are divided into a 3-bit class
selector (DS5 to DS3) and a 3-bit Drop Precedence (DS2 to DS0) value.
IP Precedence (3 Bits)             DSCP (6 Bits)
Name                  Value  Bits  Per-Hop   Class     Drop        Codepoint  DSCP Bits
                                   Behavior  Selector  Precedence  Name       (Decimal)
Routine               0      000   Default                         Default    000 000 (0)
Priority              1      001   AF        1         1: Low      AF11       001 010 (10)
                                                       2: Medium   AF12       001 100 (12)
                                                       3: High     AF13       001 110 (14)
Immediate             2      010   AF        2         1: Low      AF21       010 010 (18)
                                                       2: Medium   AF22       010 100 (20)
                                                       3: High     AF23       010 110 (22)
Flash                 3      011   AF        3         1: Low      AF31       011 010 (26)
                                                       2: Medium   AF32       011 100 (28)
                                                       3: High     AF33       011 110 (30)
Flash Override        4      100   AF        4         1: Low      AF41       100 010 (34)
                                                       2: Medium   AF42       100 100 (36)
                                                       3: High     AF43       100 110 (38)
Critical              5      101   EF                              EF         101 110 (46)
Internetwork Control  6      110                                              (48–55)
Network Control       7      111                                              (56–63)
DS is backward-compatible with ToS, so non-DiffServ devices can forward the packet using
ToS only. DS and ToS occupy the same byte; they differ only in how its value is interpreted.
The first 3 bits, DS5 to DS3, specify one of 8 classes (DSCP defaults to 0):
- Class 0, default class, best-effort forwarding
- Class 1 - 4, or assured forwarding (AF) service levels. Higher number = higher priority.
Lower priority packets may be dropped
- Class 5, or expedited forwarding (EF), offers premium service. Least likely to be
dropped, for time-critical data
- Class 6 (internetwork control) and 7 (network control) are used for network control traffic
such as STP and routing protocols.
The drop precedence is contained in the remaining 3 bits, DS2 to DS0. Values range from
1 (low) to 3 (high). The higher the value, the more likely the packet is to be dropped. It is
a factor in deciding whether or not to drop a packet.
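The decimal DSCP values in the table follow directly from the bit layout: the class occupies DS5 to DS3 and the drop precedence sits in DS2 and DS1. The sketch below recomputes them; the function names are made up for this example.

```python
# Sketch: derive AF codepoints and IP precedence from the DSCP bit layout.
def af_dscp(af_class, drop_precedence):
    """DSCP value for AFxy: class x in DS5-DS3, drop precedence y in DS2-DS1."""
    if not 1 <= af_class <= 4:
        raise ValueError("AF class must be 1-4")
    if not 1 <= drop_precedence <= 3:
        raise ValueError("drop precedence must be 1-3")
    return (af_class << 3) | (drop_precedence << 1)


def dscp_bits(dscp):
    """Format a DSCP value as 'class-selector drop-precedence' bit groups."""
    return f"{dscp >> 3:03b} {dscp & 0b111:03b}"


def ip_precedence(dscp):
    """A device that only understands IP precedence reads the top 3 bits."""
    return dscp >> 3


if __name__ == "__main__":
    for cls in range(1, 5):
        for drop in range(1, 4):
            value = af_dscp(cls, drop)
            print(f"AF{cls}{drop}", dscp_bits(value), value)
```

EF (46) reads as precedence 5 to an older device, which is the backward compatibility the text refers to.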
1- Classification is used to identify which level of service each packet should receive. Each
packet can be classified based on many things, e.g. L4 protocol, ACL, etc. There are 4 types
of classification:
- Untrusted: the incoming frame loses its CoS value and inherits the default value (0 on the
Catalyst 6500 by default) assigned by the device.
- Trusted-cos: the incoming frame retains its CoS value.
- Trusted-dscp: the incoming frame retains its DSCP value.
- Trusted-ipprec: the incoming frame retains its IP precedence value.
By default, a network should be able to trust QoS parameters assigned by network devices
(such as routers, switches, and IP phones) within the trust boundary. A trust boundary is the
edge of the trusted part of the network (such as access-layer switches and demarc points).
End-user devices such as PCs are not trusted.
2- Scheduling is used to determine whether a packet is dropped. The buffer holding the queue
(where packets are placed) can hold a certain number of packets (hardware-dependent) before
extra packets get dropped.
Different CoS values can be mapped to different drop threshold values, so a frame is
discarded if the queue has filled to that level. For instance, with CoS 0 mapped to threshold
one, which is set to 50%, any CoS 0 packet is dropped if the queue is 50% or more occupied.
This mapping can be found with ‘show qos info config MODULE rx’
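The drop-threshold idea reduces to two lookups and a comparison, sketched below. Only the CoS 0 to threshold 1 at 50% case comes from the text; every other value in the two tables is a hypothetical placeholder, as are the names.

```python
# Sketch: decide whether an arriving frame is dropped, based on its CoS
# and how full the ingress queue is.
COS_TO_THRESHOLD = {0: 1, 1: 1, 2: 2, 3: 2, 4: 3, 5: 3, 6: 4, 7: 4}  # hypothetical map
THRESHOLD_PERCENT = {1: 50, 2: 60, 3: 80, 4: 100}  # only threshold 1 = 50% is from the text


def should_drop(cos, queue_fill_percent):
    """Drop when the queue has reached the threshold mapped to this CoS."""
    limit = THRESHOLD_PERCENT[COS_TO_THRESHOLD[cos]]
    return queue_fill_percent >= limit


if __name__ == "__main__":
    print(should_drop(0, 50))   # CoS 0 at 50% fill
    print(should_drop(5, 50))   # higher CoS, same fill
```

At the same fill level, low-CoS frames are discarded first while higher-CoS frames are still accepted, which is how the queue protects priority traffic under congestion.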
3- Marking and policing are done at the switch engine or PFC/PFC2 using 1) the DSCP or IP
precedence of the packet, 2) the CoS value from the frame, or 3) a user-defined access list.
A packet can get dropped or marked with a lower priority if bandwidth is limited. The switch
can determine this by looking at a single flow (microflow) or at many packets together
(aggregate). On the Catalyst 6500, 63 microflows can be simultaneously supported. Aggregate
policing supports up to 1023 policing configurations.
4- Marking means the switch writes back the CoS or ToS value the frame originally had at
the egress port.
5- Output scheduling determines how and when the frame gets sent out, based on the different
queue values and sizes as well as the threshold map.
Configuration
An upstream switch can instruct an IP phone to extend QoS trust using CDP messages. To
enable QoS on the switch, use ‘(config)#mls qos’. By default, QoS is disabled and all QoS
information is allowed to pass from one switch to another. Enabling QoS configures all switch
ports as untrusted by default.
Define the trusted QoS parameter with ‘(config-if)#mls qos trust {cos | ip-precedence | dscp}’;
it’s recommended to configure ‘cos’ for a port connected to an IP Phone.
‘(config-if)#mls qos trust device cisco-phone’ allows the port to be trusted only if CDP
detects a Cisco phone connected. If not, the port is untrusted.
‘(config-if)#switchport priority extend {cos VAL | trust}’: if the ‘trust’ keyword is used,
the host connected to the IP phone is trusted; in other words, its CoS value is unchanged. If
the ‘cos’ keyword is used instead, it defines the CoS value that the host’s frames are
overwritten with (by the IP phone). By default, the overwritten CoS is 0.
‘(config-if)#mls qos trust cos’ configures the switch to unconditionally trust packets
coming into the port.
By default, Cisco switch uses a CoS-to-DSCP mapping to convert inbound CoS values to
DSCP values since CoS MUST be converted to DSCP or IP precedence.
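As a rough illustration of the CoS-to-DSCP conversion, the sketch below assumes the default
map commonly documented for Catalyst access switches, where each CoS value n maps to DSCP
8 x n. The function names are hypothetical, and the map on a real switch should be checked
with ‘show mls qos maps cos-dscp’ rather than assumed.

```python
def default_cos_to_dscp(cos: int) -> int:
    """Map a 3-bit CoS value to DSCP, assuming the default map CoS n -> DSCP 8*n."""
    if not 0 <= cos <= 7:
        raise ValueError("CoS is a 3-bit value (0-7)")
    # Shifting left by 3 places the CoS bits in the top 3 bits of the 6-bit DSCP.
    return cos << 3


def dscp_to_ip_precedence(dscp: int) -> int:
    """IP precedence occupies the top 3 bits of the 6-bit DSCP field."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    return dscp >> 3


if __name__ == "__main__":
    print([default_cos_to_dscp(c) for c in range(8)])
    # [0, 8, 16, 24, 32, 40, 48, 56]
    print(dscp_to_ip_precedence(46))  # 5 (DSCP 46 is EF, typically used for voice)
```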
Auto-QoS is a macro command that allows you to configure QoS easily; it is recommended for
switches still on their default configuration because the current QoS configuration may be
overwritten. Auto-QoS handles:
- Enable QoS
- CoS-to-DSCP mapping for QoS marking
- Ingress and Egress queue tuning
- Strict priority queues for egress voice traffic
- Establishing an interface QoS trust boundary
Use ‘debug auto qos’ to see all the additional commands generated by the macro.
To verify the QoS trust boundary, use ‘show mls qos interface TYPE MOD/NUM’ or ‘show auto
qos interface TYPE MOD/NUM’.
Wireless
Ethernet frames are expected to be received within an expected amount of time, thus,
both half and full duplex cables must be the same length.
A distribution system (DS) refers to the interconnection between APs of multiple cells. There
are 3 types of DS:
- Integrated: composed of a single BSS network; therefore, no interconnection
- Wired: different APs are linked by physical cables, usually to Ethernet switches
- Wireless: different APs are linked wirelessly.
AP Basics
For a client to form a membership with the AP (called association), the client has to match
one or more of the following criteria: 1) SSID, 2) wireless data rate, 3) authentication
credentials.
The client sends an association request message, and the AP grants or denies the request
(based on the criteria needed) by sending an association reply message.
An AP’s coverage area by its antenna is called a cell. When cell sizes are smaller than
average (for higher throughput), they are often called microcells. This concept can be
further utilized to form picocells (minimize AP power and cell size but increase throughput).
When a client associates with one AP, it can freely move about its range. As the client
moves from one AP’s cell into another (given that both APs have the same SSID), the
client’s association is also passed from one AP to another.
Moving from one AP to another is called roaming. If the client maintains its same IP subnet
as it roams between APs, it undergoes layer 2 roaming. If the client roams between APs
located in different IP subnets, it undergoes Layer 3 roaming.
In an ESS, the APs don’t have to connect to the same switch for roaming. However, roaming
does require the client to stay in the same mobility domain.
An AP that operates standalone within the larger network is called an autonomous
mode AP. Each autonomous AP has its own security policies and set of rules; configuration
may differ between APs, and central management (such as monitoring and QoS) is
difficult.
An AP acts as a translational bridge, where frames from 2 dissimilar media are translated and
then bridged at Layer 2. In other words, the AP is in charge of mapping a VLAN to an SSID.
With multiple VLANs, you need the same number of SSIDs; connecting to a different SSID
means connecting to a different VLAN. In this case, the AP connects to the switch
using a trunk link.
An AP can uplink into an Ethernet network because it has both wireless and wired
capability. APs can also form a mesh topology.
It’s not possible to form a one-way communication path between AP and client, in which the
client can hear the AP but the AP can’t hear the client or vice versa.
An AP can form a single wireless link from one LAN to another over a long distance. In that
case, an AP is needed on each end of the wireless link. AP-to-AP or line-of-sight links are
commonly used for connectivity between buildings or between cities.
802.11 Basics
802.11 WLANs only use half duplex. Transmitting and receiving stations use the same
frequency (channel), so only one station can transmit at any time. For full duplex to happen,
all transmitting should occur on one frequency while receiving occurs on another.
This is possible, but not yet standardized by 802.11.
802.11 uses CSMA/CA, which requires all stations to listen before they transmit a frame.
When a frame needs to be sent, there are 2 possible situations:
- No other device is transmitting: the station sends its frame immediately. An
acknowledgement must be received to verify that the frame was collision-free.
- Another device is transmitting: the station waits until the frame in progress is complete;
then it must wait for a random period of time before transmitting its own frame.
The entire process of collision avoidance is separated into distributed coordination function
(DCF) and point coordination function (PCF).
DCF is composed of interframe space and random backoff time, with interframe space
further separated into SIFS (management frame, between data frame and its
acknowledgement) and DIFS (data frame for new transmission).
A station must wait for the frame to finish transmitting + random backoff time before it
can transmit; the total time is called SIFS or DIFS.
PCF is the optional process that can be used by the AP to send Contention-free poll packet
to each station to give them access to the data transmission. Capable of providing limited
QoS.
PIFS is used to gain access to the medium before any other STA. The PCF-enabled AP
waits for PIFS duration before it occupies the wireless medium; DIFS > PIFS > SIFS.
Channel access in PCF mode is centralized while it is distributed between STAs in DCF mode.
The PCF is located directly above DCF.
802.11 frames
A wireless frame is composed of a 32-byte MAC header, a body ranging from 0 to 2312
bytes, and a 4-byte FCS trailer. The 32-byte header is composed of:
- Frame control field (2 bytes): information that defines the frame, such as fragmentation and
encryption.
- Duration/ID field (2 bytes): contains the transmission time used for CSMA/CA
- Sequence control field (2 bytes): contains the sequence number and fragment number
- Four address fields (24 bytes, 6 bytes each): depending on the frame, they hold some
combination of Source address, Destination address, Receiver address, Transmitter address,
and BSS Identifier (BSSID)
- QoS (2 bytes): CUWNA supports WMM
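The field sizes listed above can be checked with simple arithmetic. The short Python sketch
below just adds them up; the dictionary keys and function names are made up for illustration.

```python
# Field sizes in bytes, taken from the list in the notes.
MAC_HEADER_FIELDS = {
    "frame_control": 2,
    "duration_id": 2,
    "sequence_control": 2,
    "addresses": 4 * 6,  # four 6-byte MAC address fields
    "qos": 2,
}
MAX_BODY_BYTES = 2312
FCS_BYTES = 4


def header_length() -> int:
    """Total MAC header size in bytes."""
    return sum(MAC_HEADER_FIELDS.values())


def frame_length(body_bytes: int) -> int:
    """Header + body + FCS for a body of the given size."""
    if not 0 <= body_bytes <= MAX_BODY_BYTES:
        raise ValueError("body must be 0 to 2312 bytes")
    return header_length() + body_bytes + FCS_BYTES


if __name__ == "__main__":
    print(header_length())               # 32
    print(frame_length(0))               # 36
    print(frame_length(MAX_BODY_BYTES))  # 2348
```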
Wireless standards have a hidden node problem, in which stations can reach the AP but can’t
hear each other, therefore creating possible collisions when both stations transmit. IEEE
introduced Virtual Carrier Sense to solve this problem, which uses request to send (RTS) and
clear to send (CTS).
Station A sends an RTS frame which includes the source, destination and transmission
time. Station A only transmits the frame after the AP replies with a CTS message, which
contains the duration information. All other stations that receive the CTS set their virtual
carrier sense, also known as the network allocation vector (NAV), so that they do not send
frames for the duration specified in the CTS.
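The NAV behaviour can be pictured as a simple timer per station. The Python sketch below is
a toy model under stated assumptions: times are plain integers in microseconds, the class and
method names are hypothetical, and physical carrier sensing and backoff are left out.

```python
class Station:
    """Toy model of virtual carrier sense (NAV) on one 802.11 station."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.nav_expires_at = 0  # microseconds; 0 means no reservation is known

    def hear_cts(self, now_us: int, duration_us: int) -> None:
        """Overhearing a CTS reserves the medium for the advertised duration."""
        # Keep the later expiry if a longer reservation is already in place.
        self.nav_expires_at = max(self.nav_expires_at, now_us + duration_us)

    def may_transmit(self, now_us: int) -> bool:
        """The NAV must have expired before this station may contend again."""
        return now_us >= self.nav_expires_at


if __name__ == "__main__":
    b = Station("B")
    b.hear_cts(now_us=100, duration_us=300)  # CTS sent by the AP for station A
    print(b.may_transmit(399))  # False
    print(b.may_transmit(400))  # True
```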
CUWNA splits the tasks of an autonomous AP between a lightweight access point (LAP) for
individual functions (hardware-based) and a wireless LAN controller (WLC) for centralized
functions. This structure is also known as the split-MAC architecture.
LAPs can be distributed over several WLCs for load balancing or redundancy purposes.
Cisco Hybrid Remote Edge Access Point (HREAP) is a special case for remote sites where
the LAPs are separated from the WLC by a WAN link. With HREAP, the remote LAPs can keep
operating even while the WAN link is down and their WLC is not available, much like an
autonomous AP would do. This allows wireless users to keep communicating within the
remote site until the link (and WLC) is restored.
A LAP can be in the same subnet as the WLC (easier management), but it doesn’t have to be. A
tunnel is needed to forward the data between the 2 (as they are usually separated by a
switch).
The tunneling mechanism is either the Lightweight Access Point Protocol, or LWAPP
(Cisco-proprietary, uses UDP ports 12222 and 12223), or Control and Provisioning of Wireless
Access Points, or CAPWAP (architecture described in RFC 4118, protocol specified in RFC 5415;
uses UDP ports 5246 and 5247). Each of these protocols consists of 2 tunnels:
- Control messages: configure the LAP and manage its operation. Authenticated and encrypted
so the LAP is securely controlled by only the WLC
- Data: packets to and from wireless clients associated with the LAP. Data is encapsulated
with LWAPP or CAPWAP but not secured between LAP and WLC.
CUWNA roaming
Roaming is handled by the WLC, so client roaming becomes faster and easier because client
association is managed in a central location via the tunnels.
Intra-controller roam refers to a user moving from one AP to another, with both APs connected
to the same WLC. This is simple, as the WLC just updates its table and moves the client. This
is Layer 2 roaming.
Inter-controller roam refers to a user moving from one AP to another, with the APs connected
to different WLCs. If both WLCs are in the same subnet, the association can easily be handed
over by sending a mobility message exchanging client information. This is Layer 2 roaming.
If the 2 WLCs are located on different subnets (different VLANs), the client can still roam
without changing its IP address. The original WLC (where the client used to reside) is now
known as the anchor WLC and establishes an Ether-IP tunnel with the current WLC (called the
foreign WLC due to its different IP subnet) that the client is associated with.
Traffic generated by the host is sent out from the foreign WLC, but traffic for the host is
received at the anchor WLC, which sends the data over to the foreign WLC via the Ether-IP
tunnel built using IP protocol 97. This data is encapsulated at the anchor WLC and
decapsulated at the foreign WLC, so the host sees the data as if it were received on the
foreign WLC. This is Layer 2 roaming.
WLCs are configured into logical mobility groups to hand off a client’s association
information. A client must roam between WLCs in the same logical mobility group to keep its
IP address. A logical mobility group can have up to 24 WLCs.
Layer 3 roaming, or change of IP address, occurs when a client roams between different
logical mobility groups. The host’s IP address and all session information will be dropped.
Switch configuration
To configure the switch port for an autonomous AP (usually located in the access layer), you
can apply PortFast and allow the needed VLANs to pass through.
A LAP should be located in the access layer and connect to a VLAN from which it obtains its
IP address (from a DHCP server) to reach the WLC. It’s recommended to use a special VLAN for
LAP management. The LAP and WLC don’t have to be connected by a Layer 2 VLAN or trunk link.
PortFast can also be applied.
The WLC should be located in the distribution layer. If you are using EtherChannel, remember
that the WLC can’t negotiate, so the EtherChannel should be manually configured as on.
When providing DHCP service, you should have a DHCP pool on the switch or MLS, with the
‘(dhcp-config)#option 43 ascii “WLC_ADD_LIST”’ command.
Security
Here are some tips about securing your switch:
- Configure secure password
- Use system banner
- Secure the web interface (apply ACL with ‘ip http access-class ACL_NO’)
- Secure switch console
- Secure virtual terminal access
- Use SSH whenever possible
- Secure unused switch ports (‘switchport host’ is a macro that puts the port into access
mode, enables PortFast, and disables channel grouping)
- Secure STP operation (by enabling BPDU guard, loop guard, and root guard)
- Secure the use of CDP (disable CDP on ports not connected to secure devices)
- Secure SNMP by disabling read-write SNMP access (remove any ‘snmp-server community STR
RW’ command)
Port Security
A switch can implement security control using port security. A switch port can allow a
maximum of 1024 MAC addresses. When the maximum number of MAC addresses is exceeded, a
syslog message appears but the interface stays up.
Port security protects against MAC spoofing attack and CAM table overflow attack.
If an interface is undergoing the ‘restrict’ or ‘protect’ port-security condition, you may need
to clear the learned MAC address so that a specific host can use the switch port. ‘#clear
port-security dynamic [address MAC_ADD | interface TYPE MOD/NUM]’.
Interfaces in ‘shutdown’ mode are put into the err-disabled state when the maximum number of
MAC addresses is exceeded. When a port is shut down this way, you can
1. Manually re-enable it with ‘shutdown’ then ‘no shutdown’
2. Configure automatic recovery with ‘(config)#errdisable recovery cause [all | CAUSE_NAME]’
Static secure addresses are always in the MAC address table by default.
Dynamic secure addresses are stored in the MAC address table and are removed when the
switch is reloaded or shut down. Configured by the ‘(config-if)#switchport port-security
maximum N’ command.
Sticky secure addresses can be stored in either the MAC address table or NVRAM and remain
when the switch reloads or powers down. They can be found with ‘show running-config’.
By default, secure MAC addresses (except dynamic) remain in the MAC address table forever.
You can change the infinite timer with ‘(config-if)#switchport port-security aging {static |
time TIME}’ -> configures all entries in the MAC address table to be flushed after some TIME
(0 means no flushing, default); ‘static’ means static MAC addresses will also be aged.
Verify with ‘show port-security interface TYPE MOD/NUM’ and ‘show port-security address’.
An attacker can use a spoofed DHCP server to obtain user information (specifically the MAC
address) when a user’s DHCP broadcast (sent to FFFF.FFFF.FFFF) reaches the spoofed server.
This can easily form a man-in-the-middle attack because the client will now send its data
through the router/DNS server/WINS server etc. specified by the spoofed DHCP server.
DHCP starvation works with MAC address spoofing by flooding a large number of DHCP
requests with randomly generated MAC addresses, thereby exhausting the available address
space and preventing legitimate clients from obtaining addresses.
The spoofed DHCP server then misdirects the clients and obtains legitimate data.
To prevent this type of attack, you can enforce security with 1) port security, 2) DHCP
snooping, and 3) VLAN ACL.
DHCP snooping categorizes switch ports as trusted or untrusted; a trusted port can transfer
traffic without any checks; trusted ports should be where the DHCP server resides. DHCP
snooping keeps track of completed DHCP bindings (from those legitimate replies).
Any DHCP replies arriving on an untrusted port (DHCP snooping checks inbound traffic only)
are discarded because they must have come from a rogue DHCP server. In addition, the
offending switch port is automatically shut down in the errdisable state.
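The trusted/untrusted decision can be sketched as a small Python function. This is a
simplified model, not switch code: the function name snoop and the string return values are
hypothetical, the set of server message types (OFFER, ACK, NAK) is an assumption about which
messages count as DHCP replies, and the errdisable step mentioned above is not modelled.

```python
# Message types that only a DHCP server should send (assumed set for this sketch).
SERVER_MESSAGES = {"OFFER", "ACK", "NAK"}


def snoop(message_type: str, port_trusted: bool) -> str:
    """Decide what to do with a DHCP message arriving inbound on a port."""
    if port_trusted:
        # Trusted ports are not checked.
        return "forward"
    if message_type.upper() in SERVER_MESSAGES:
        # A server reply on an untrusted port is treated as coming from a rogue server.
        return "drop"
    # Client messages such as DISCOVER or REQUEST are allowed from untrusted ports.
    return "forward"


if __name__ == "__main__":
    print(snoop("OFFER", port_trusted=False))     # drop
    print(snoop("DISCOVER", port_trusted=False))  # forward
    print(snoop("ACK", port_trusted=True))        # forward
```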
‘(config)#ip dhcp snooping’ enables DHCP snooping globally, and ‘(config)#ip dhcp snooping
vlan VID_RANGE’ specifies the VLANs on which to enable it.
By default, all ports are untrusted. To configure a port as trusted (or where the DHCP server
resides), use the ‘(config-if)#ip dhcp snooping trust’ command. This also means that all DHCP
DISCOVER messages are forwarded to this port.
By default, untrusted ports accept an unlimited number of DHCP requests. To
limit the rate of DHCP traffic, use ‘(config-if)#ip dhcp snooping limit rate VAL’, where VAL
can range from 1 to 2048 DHCP packets per second.
Verify with ‘show ip dhcp snooping [binding]’.
DHCP option-82, the DHCP Relay Agent Information option, is described in RFC 3046.
When a DHCP DISCOVER message arrives on an untrusted port, the switch adds its own MAC
address and the switch port identifier into the option-82 field of the request, so that the
DHCP server knows it is assigning an address to a legitimate host.
DHCP reply messages are verified to have been received on trusted ports. By comparing the
option-82 field, the switch can create a mapping of which MAC address is assigned which IP
address; this is stored in the DHCP snooping database. This feature is enabled by default and
can be disabled by ‘(config)#no ip dhcp snooping information option’.
With dynamic ARP inspection (DAI), each switch port is classified as trusted or untrusted.
Inspection is performed inbound on untrusted ports only, not on trusted ports. When packets
arrive on untrusted ports, the source MAC and IP addresses are checked statically (ACL) or
dynamically (based on the DHCP snooping database); if they don’t pass the test, the packets
are dropped.
Dropped packets are recorded if DAI is enabled, which can be seen if you use the ‘(config)#ip
arp inspection vlan logging’ command.
To enable DAI, use ‘(config)#ip arp inspection vlan VID_RANGE’, all switch ports belong to
this VLAN will be inspected because they are untrusted by default.
To specify that a port is trusted, use ‘(config-if)#ip arp inspection trust’.
Verify with ‘show ip arp inspection {vlan VID | interface TYPE MOD/NUM}’
To statically define DAI permissions, use ACL to define static MAC-to-IP binding.
‘(config)#arp access-list acl-name’ -> ‘(config-acl)#permit ip host SOURCE_IP mac host
SOURCE_MAC [log]’.
To apply this, use ‘(config)#ip arp inspection filter ARP_ACL_NAME vlan VID [static]’
ARP packets going through this interface are matched against the ACL. If no match is
found, the DHCP snooping database is searched (if enabled). Using the ‘static’ keyword forces
the switch to check only the ACL; if no match is found, the frame is considered invalid.
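The lookup order just described (ARP ACL first, then the DHCP snooping database unless
‘static’ is set) can be sketched as follows. The data structures are assumptions made for
illustration: the ACL is modelled as a set of permitted (IP, MAC) pairs and the snooping
database as an IP-to-MAC dictionary; the function name dai_permits is hypothetical.

```python
def dai_permits(sender_ip, sender_mac, arp_acl, snooping_bindings,
                static=False, port_trusted=False):
    """Return True if an ARP packet passes this simplified DAI check."""
    if port_trusted:
        # Trusted ports are not inspected.
        return True
    if (sender_ip, sender_mac) in arp_acl:
        # A matching static entry in the ARP ACL permits the packet.
        return True
    if static:
        # With 'static', only the ACL is consulted; no match means invalid.
        return False
    # Otherwise fall back to the DHCP snooping bindings.
    return snooping_bindings.get(sender_ip) == sender_mac


if __name__ == "__main__":
    acl = {("10.1.1.1", "aaaa.bbbb.cccc")}
    bindings = {"10.1.1.2": "dddd.eeee.ffff"}
    print(dai_permits("10.1.1.1", "aaaa.bbbb.cccc", acl, bindings))               # True
    print(dai_permits("10.1.1.2", "dddd.eeee.ffff", acl, bindings))               # True
    print(dai_permits("10.1.1.2", "dddd.eeee.ffff", acl, bindings, static=True))  # False
    print(dai_permits("10.1.1.2", "0000.0000.0001", acl, bindings))               # False
```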
Verify with ‘show arp access-list’
By default, the switch checks the MAC and IP addresses contained in the ARP reply. You can
add further checks with the ‘(config)#ip arp inspection validate {src-mac | dst-mac | ip
[allow zeros]}’ command.
- ‘src-mac’ compares the source MAC address in the Ethernet header to the sender MAC address
(they should be the same) in ARP requests and replies.
- ‘dst-mac’ compares the destination MAC address in the Ethernet header with the recipient
MAC address (they should be the same) in ARP replies.
- ‘ip’ checks the sender’s IP address in all ARP requests and replies, and the recipient’s IP
address in all ARP replies. The addresses must not be any multicast address, 0.0.0.0, or
255.255.255.255. ‘allow zeros’ modifies ‘ip’ so that a sender address of 0.0.0.0 is not denied.
Verify with ‘show ip arp inspection {vlan VID | interface TYPE MOD/NUM}’
IP source guard
Address spoofing is the use of a source address that belongs to the subnet but is not really
in use by the sender, so any return traffic can’t find its way back. This is the IP spoofing
attack.
IP source guard works with DHCP snooping or statically configured IP source bindings to block
this type of attack. The binding table contains the IP address and the associated MAC and VLAN
numbers. A per-port and VLAN ACL (PVACL) is installed on each port with a binding table entry
to permit/deny traffic. IP source guard only works on access or trunk ports.
The binding table can be statically compiled with ‘(config)#ip source binding MAC_ADD vlan
VID IP_ADD interface TYPE MOD/NUM’.
To verify IP source guard, use ‘show ip verify source [interface TYPE MOD/NUM]’.
To verify IP source binding database, use ‘show ip source binding [IP_ADD] [MAC_ADD]
[dhcp-snooping | static] [interface TYPE MOD/NUM] [vlan VID]’
VLAN hopping
VLAN hopping is used to compromise devices on another VLAN. Two primary methods are:
- Switch spoofing: the attacker impersonates (imitates) a switch by generating signals such as
802.1q and DTP. If the attacker can successfully form a trunk with the switch, it can get data
from any VLAN. To reach a remote switch, it sends a frame to the native VLAN, which can be
forwarded without going through a L3 device. To prevent this,
^ Disable DTP by ‘(config-if)#switchport nonegotiate’ on trunk links
^ Configure all other links as access ports
^ Configure a different native VLAN then prune it from the trunk (Although maintenance
protocols such as CDP, PAgP, and DTP normally are carried over the native VLAN of a trunk,
they will not be affected if the native VLAN is pruned from the trunk. They still will be sent
and received on the native VLAN as a special case even if the native VLAN ID is not in the
list of allowed VLANs. VLAN 1 and VLANs 1002 - 1005 are not allowed to be pruned.)
- Double tagging occurs when frames carry two 802.1q tags so that when one is stripped off,
the switch will continue to forward the frame to the destination device. This attack works
even if DTP is off. ISL is not used here as it would create a giant. To mitigate this
situation, use
^ (config)#vlan dot1q tag native -> configures native VLAN to use 802.1Q tag
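A toy Python model can show why tagging the native VLAN defeats double tagging. It is a
deliberately simplified sketch: a frame is reduced to its list of 802.1Q tags (outermost
first), the VLAN numbers are arbitrary, and both function names are hypothetical.

```python
def send_over_trunk(tags, native_vlan, tag_native=False):
    """Return the tag stack as it appears on the trunk after the first switch.

    Without 'vlan dot1q tag native', a frame in the native VLAN is sent untagged,
    so an outer tag equal to the native VLAN is stripped and the inner tag is exposed.
    """
    if tags and tags[0] == native_vlan and not tag_native:
        return list(tags[1:])
    return list(tags)


def vlan_seen_by_next_switch(tags_on_wire, native_vlan):
    """The next switch classifies the frame by its outermost tag (or the native VLAN)."""
    return tags_on_wire[0] if tags_on_wire else native_vlan


if __name__ == "__main__":
    attack = [1, 20]  # outer tag = native VLAN 1, inner tag = target VLAN 20
    on_wire = send_over_trunk(attack, native_vlan=1)
    print(vlan_seen_by_next_switch(on_wire, native_vlan=1))  # 20 (the frame hops VLANs)

    on_wire = send_over_trunk(attack, native_vlan=1, tag_native=True)
    print(vlan_seen_by_next_switch(on_wire, native_vlan=1))  # 1 (stays in native VLAN)
```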
Cisco identity based networking services (IBNS) provides network access control and
policy enforcement at the switch port. This solution is based on 802.1x, EAP, and RADIUS. It
provides:
- Per-user or per-service authentication services
- Policies mapped to network identity
- Port-based network access control based on authentication and authorization policies
- Additional policy enforcement based on access level
When Cisco access control server (ACS) is used as authentication server, these additional
features are provided:
- Time and day restriction
- NAS restriction
- MAC address filtering
- Per-user and per-group VLAN assignment
- Per-user and per-group ACL assignment
802.1x is a LAN authentication protocol used to grant user access to the network and enforce
policy. RADIUS is the protocol used to relay the messages to a database where all the
credentials are stored. There are 3 roles in this process:
- Supplicant/client is an end device that supports EAP and 802.1x. Its credentials are sent
using EAPOL frames (to 0180.C200.0003) and secured with 802.1x. This frame can travel
across any Data Link Layer protocol such as Token Ring, FDDI, or Ethernet.
- Authenticator is an intermediate point that verifies the authenticity of EAPOL and
encapsulates it in RADIUS format to relay the credentials to the authentication server. When
the frame comes back, the RADIUS format is removed and the 802.1x frame is sent to the client.
It may also be the authentication server. Must support 802.1x, EAP, and RADIUS.
- Authentication server can be Cisco ACS, but it must support RADIUS with EAP extensions.
The client first sends an EAPOL frame to start authentication, and the authenticator sends
back a login request asking for credentials. Credentials from the client are relayed (using
RADIUS) to the server, which checks its database. If no match is found, the client stays
blocked.
If a match is found, the server sends back a response demanding further information. If the
client replies with the correct information, s/he is permitted access. VLAN assignment is
added to the access-accept packet from the server.
If the client logs out, the process restarts if s/he wants to log back in.
By default, AAA is disabled; you can enable it with the ‘(config)#aaa new-model’ command. The
new-model keyword refers to the use of method lists, by which authentication methods
and sources can be grouped or organized. The new model is much more scalable than the
“old model,” in which the authentication source was explicitly configured.
Define an external RADIUS server with its secret password (encryption key) by
‘(config)#radius-server host {HOSTNAME | IP_ADD} [key PASSWORD]’. More than one
RADIUS server can be configured.
If there are multiple hosts connected at a single switch port, such as through a hub, you
can use ‘(config-if)#dot1x host-mode multi-host’.
Switch ACLs
A port ACL (PACL) is similar to a router ACL (RACL) but is supported
and configured on Layer 2 interfaces, both physical and EtherChannel. PACLs are not
supported on PVLANs and can’t be logged or reflexive.
This ACL is created in the ACL TCAM and can be configured as either a standard or extended IP
ACL or a MAC ACL. A port ACL performs access control on all incoming traffic entering a
specific Layer 2 port but not on control packets such as DTP, STP, etc.
PACLs are configured with ‘(config)#{mac | ip} access-list extended NAME’ and then applied
to an interface with ‘(config-if)#{mac | ip} access-group {NAME | NO}’.
Know that you can apply only one ACL to an interface; either MAC or IP, not both.
The ‘(config-if)#access-group mode’ command can change the way a PACL interacts with other
ACLs. Here are the parameters:
- ‘prefer port’: If a PACL is configured for an interface, it overrides all other ACLs. If no
PACL is configured, other features combine to produce the overall result. Default for 4500,
used on both 4500 and 6500
- ‘prefer vlan’: the VACL overrides the PACL if there is one. The PACL is only used when no
VACL is present. Used on 4500
- ‘merge’: merges all available ACLs. Default on 6500.
A router ACL (RACL) filters inter-VLAN packets and is compiled into TCAM. Packets in the
same VLAN can only be filtered by a VLAN ACL, or VACL. A VACL is also merged into the TCAM
table. A VACL doesn’t specify a direction (in/out) or an interface (it applies only to VLANs).
A VACL is configured as a VLAN access map (like a route map) by ‘(config)#vlan access-map
NAME [SEQ_NO]’.
Then, you use a match statement along with an ACL in ‘match {ip | ipx | mac} address
{ACL_NO | ACL_NAME}’. Note that a MAC ACL must be a named ACL.
Instead of ‘set’, you use ‘action {drop | forward [capture] | redirect TYPE MOD/NUM}’.
PVLAN
Private VLANs further limit the reachability of a host by transforming a broadcast segment
into an NBMA segment (reaching only specific ends). PVLAN uses 3 types of ports:
- Community VLAN: reaches any member of that community, and devices on promiscuous
ports. Carries traffic between community and promiscuous ports.
- Isolated VLAN: carries traffic from members of the VLAN to promiscuous ports only.
- Promiscuous: reaches any other port. Allows traffic between ports, usually a router or L3
switch. One promiscuous port/switch. Place ACL here.
Interfaces in the promiscuous VLAN are in promiscuous mode, while devices in secondary
VLANs are in host mode. Promiscuous mode ports exhibit bidirectional behavior while host
switch ports exhibit unidirectional or logical behavior.
A PVLAN contains:
- PVLAN itself
- Secondary VLANs (community and isolated)
- Promiscuous port
Private VLANs are configured using special cases of regular VLANs and therefore, shouldn’t
pass its configuration using VTP (switch should be VTP transparent); PVLANs are locally
significant to a switch. VLAN 1, and 1002 - 1005 can’t be used as PVLAN.
PVLAN travel in the network just like any other VLAN
Verify with ‘show vlan private-vlan’ and ‘show interface NAME switchport’ <- Mode +
association + mapping.
Protected ports are supported on lower-end switch models (that don’t support PVLAN).
Ports configured as protected don’t communicate with other protected ports, only with
unprotected ports.
- Protected ports can’t communicate with other protected ports unless a Layer 3 device is
present
- Protected ports can transfer control protocol traffic (such as routing protocol, CDP, etc)
directly with other protected ports
- Protected ports can communicate directly with normal ports in the same subnet without L3
device.
Off by default, turned on with ‘(config-if)#switchport protected’ command, and verify with
‘show interface NAME switchport’. Protected ports are usually used with port blocking.
Port blocking is supported only in Catalyst 3750 and above. Switches, by default, flood
multicast and unknown unicast traffic (except the port where it comes from). You can stop
this with ‘(config-if)#switchport block [multicast | unicast]’. Off by default.
This command doesn’t affect MAC address learning process. Even if an address aged out, it
can still get back in using the normal MAC learning process.
Error management
Cisco Catalyst switches can automatically detect error conditions and shut down the affected
port into the errdisabled state. The port stays down until someone re-enables it or
until a predetermined time has elapsed. You can change this behavior by ‘(config)#[no]
errdisable detect cause [all | CAUSE_NAME]’. Here is a list of causes you can change:
- all— Detects every possible cause
- arp-inspection— Detects errors with dynamic ARP inspection (DAI)
- bpduguard— Detects when a spanning-tree bridge protocol data unit (BPDU) is received
on a port configured for STP PortFast
- channel-misconfig— Detects an error with an EtherChannel bundle
- dhcp-rate-limit— Detects an error with DHCP snooping
- dtp-flap— Detects when trunking encapsulation is changing from one type to another
- gbic-invalid— Detects the presence of an invalid GBIC or SFP module
- ilpower— Detects an error with offering inline power
- l2ptguard— Detects an error with Layer 2 Protocol Tunneling
- link-flap— Detects when the port link state is “flapping” between the up and down states
- loopback— Detects when an interface has been looped back
- pagp-flap— Detects when an EtherChannel bundle’s ports no longer have consistent
configurations
- psecure-violation— Detects conditions that trigger port security configured on a port
- rootguard— Detects when an STP BPDU is received from the root bridge on an
unexpected port
- security-violation— Detects errors related to port security
- storm-control— Detects when a storm control threshold has been exceeded on a port
- udld— Detects when a link is seen to be unidirectional (data passing in only one direction)
- unicast-flood— Detects conditions that trigger unicast flood blocking on a port
- vmps— Detects errors when assigning a port to a dynamic VLAN through VLAN
membership policy server (VMPS)
To re-enable the port, use ‘shutdown’ then ‘no shutdown’. You can also have the switch
automatically re-enable the port every time with ‘(config)#errdisable recovery cause [all |
CAUSE_NAME]’.
The default time to re-enable the port is 300 seconds, but you can change this with
‘(config)#errdisable recovery interval SEC’; the time can be from 30 to 86,400 seconds.
You can see port status with the ‘show interface [status] [err-disabled]’ and ‘show errdisable
{detect | recovery}’ commands.
More about Switching
Addresses starting with 0100.5E are reserved for multicast group addresses for higher layers.
Multicast addresses used by different protocols such as STP or UDLD use different ranges of
multicast addresses.
DCE equipment is usually a type of concentrator or repeater, while DTE equipment usually
generates traffic. A crossover cable can link DCE to DCE or DTE to DTE. The exception to
connecting like
devices is that some devices are manufactured to be connected together. An example would
be that some hubs and switches have an uplink or Media Dependent Interface (MDI) port.
There is typically a selector that allows the user to toggle between MDI and MDI-X (X for
crossover), with MDI-X intentionally reversing the pin out of transmit and receive similar to
a crossover cable. A setting of MDI-X allows two DCE devices, such as two hubs or switches,
to connect to each other using a typical straight through wired twisted-pair cable.
The algorithm used in CSMA/CD after a collision is the Truncated Binary Exponential Backoff
algorithm. When a collision occurs, the device must wait a random number of slot times
before attempting to retransmit the packet. The slot time is contingent upon the speed of
the link. Cisco switches use a more aggressive Max Wait Time.
Excessive collisions typically occur when too much traffic is on the wire or too many
devices are in the collision domain. After the fifteenth retransmission plus the original
attempt, the excessive collisions counter increments, and the packet gets dropped. In this
case, too many devices are competing for the wire. In addition, duplex mismatches can also
cause the problem.
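The standard form of truncated binary exponential backoff can be sketched in Python as
below. It assumes the usual IEEE 802.3 values (the exponent stops growing after 10
collisions, and a frame is given up after 16 attempts); the function names are hypothetical,
and the more aggressive Cisco behaviour mentioned above is not modelled.

```python
import random

BACKOFF_LIMIT = 10  # assumed 802.3 value: the exponent is capped ("truncated") at 10
MAX_ATTEMPTS = 16   # original attempt plus 15 retransmissions


def max_backoff_slots(collision_count: int) -> int:
    """Largest number of slot times a station may wait after the nth collision."""
    return 2 ** min(collision_count, BACKOFF_LIMIT) - 1


def backoff_slots(collision_count: int, rng=random) -> int:
    """Pick a random wait, in slot times, after the nth collision on one frame."""
    if collision_count < 1:
        raise ValueError("collision_count starts at 1")
    if collision_count >= MAX_ATTEMPTS:
        raise ValueError("excessive collisions: the frame is dropped")
    return rng.randint(0, max_backoff_slots(collision_count))


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 15):
        # Upper bounds printed: 1, 3, 7, 1023, 1023
        print(n, max_backoff_slots(n), backoff_slots(n))
```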
Miscellaneous
Important parts of the device, such as the power cord and forwarding engine, may have a
backup in case of failure. Switch platforms such as Catalyst 4500R and 6500 can accept 2
supervisor modules installed in a single chassis.
The active module is fully initialized while the redundant module is only initialized to a
certain level; the more initialized a module is, the less time a failover takes. The
redundant module can be in one of several modes. The mode a module is in affects how the 2
supervisor modules synchronize their information.
- Route processor redundancy (RPR): partially booted and initialized. Every other module
needs to reload to initialize the supervisor functions. Failover generally takes at least 2
minutes. When the secondary engine takes over, 1) all switching modules are reloaded, 2) the
remaining subsystems of the MSFC are brought up, and 3) ACLs are reprogrammed into hardware.
- Route processor redundancy plus (RPR+): the redundant supervisor boots far enough to allow
the supervisor and route engine to initialize; no other modules need to reload, which allows
port states to be retained. Failover generally takes between 30 - 60 seconds.
- Stateful switchover (SSO): fully booted and initialized. The startup and running
configurations are synchronized, so L2 switching and port states can continue without notice.
Failover generally takes at least 1 second.
SRM (single-router mode) simply means that two route processors (integrated into the
supervisors) are being used, but only one of them is active at any time. In DRM (dual router
mode), two route processors are active at all times. HSRP usually is used to provide
redundancy in DRM.
Although RPR and RPR+ have only one active supervisor, the route processor portion is not
initialized on the standby unit. Therefore, SRM is not compatible with RPR or RPR+.
SRM is inherent with SSO, which brings up the standby route processor. You usually will
find the two redundancy terms together, as “SRM with SSO.”
You can configure the redundancy mode with: (config)#redundancy -> (config-red)#mode
{rpr | rpr-plus | sso}. If the switch is being configured for redundancy for the first time,
you must enter the ‘redundancy’ commands on both supervisor modules. Once the
redundancy mode is enabled, you make all configuration changes on the active supervisor
only. The running configuration is synchronized automatically from the active to the
standby module.
Verify with ‘show redundancy states’.
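The steps above might look like the following on a dual-supervisor chassis. This is a minimal sketch: the hostname Switch and the choice of SSO are assumptions made for illustration, and the modes on offer depend on the platform and IOS release.

```
! Enter redundancy configuration mode and select the mode
Switch# configure terminal
Switch(config)# redundancy
Switch(config-red)# mode sso
Switch(config-red)# end
! Check which mode is configured and which mode is actually operating
Switch# show redundancy states
```

Comparing the configured mode with the operating mode in the output is a quick way to spot a fallback to a lower mode.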
If you configure RPR+ with the rpr-plus keyword, the supervisor attempts to bring up RPR+
with its peer module. The IOS images must be of exactly the same release before RPR+
will work. If the images differ, the supervisor automatically falls back to RPR mode instead.
You can also configure what is synchronized between the active and standby supervisors:
(config)#redundancy -> (config-red)#main-cpu -> (config-r-mc)#auto-sync {startup-config
| config-register | bootvar}
The default is ‘(config-r-mc)#auto-sync standard’.
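As an illustration, the sequence below synchronizes only the startup configuration and the boot variable (keyword bootvar). The hostname Switch and the choice of items are assumptions; auto-sync standard remains the default if nothing is entered.

```
Switch# configure terminal
Switch(config)# redundancy
Switch(config-red)# main-cpu
! Choose the items copied from the active to the standby supervisor
Switch(config-r-mc)# auto-sync startup-config
Switch(config-r-mc)# auto-sync bootvar
Switch(config-r-mc)# end
```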
Nonstop Forwarding (NSF, Cisco-proprietary) is an interactive method that focuses on
quickly rebuilding the RIB after a supervisor switchover. NSF works with SSO: the standby
supervisor asks NSF-aware neighbors to provide routing information. The NSF function
must be built into the routing protocol on both routers. NSF supports BGP, EIGRP, OSPF,
IS-IS, and CEF.
Routing Protocol   Configuration Commands
BGP                Router(config)# router bgp as-number
                   Router(config-router)# bgp graceful-restart
EIGRP              Router(config)# router eigrp as-number
                   Router(config-router)# nsf
OSPF               Router(config)# router ospf process-id
                   Router(config-router)# nsf
IS-IS              Router(config)# router isis [tag]
                   Router(config-router)# nsf [cisco | ietf]
                   Router(config-router)# nsf interval [minutes]
                   Router(config-router)# nsf t3 {manual [seconds] | adjacency}
                   Router(config-router)# nsf interface wait seconds
NSF is enabled when SSO is enabled.
Macro
A macro can save you some typing: define it once for a long list of interfaces, then refer
to the list by the macro name. For instance,
(config)#define interface-range MyGroup gig 2/0/1 , gig 2/0/3 - 2/0/5 , gig 3/0/1 , gig
3/0/10 , gig 3/0/32 - 3/0/48
(config)#interface range macro MyGroup
The commands above define a macro called MyGroup for a list of interfaces and then select
all of them at once. Remember to surround any commas and hyphens with spaces when
you enter interface range commands.
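Once the macro is selected, any interface command entered applies to every port in the list. The sketch below assumes a hostname of Switch, a hypothetical access VLAN 10, and the short end-of-range form (3 - 5) that IOS interface ranges accept:

```
! Define the interface list once
Switch(config)# define interface-range MyGroup gigabitethernet 2/0/1 , gigabitethernet 2/0/3 - 5
! Select every interface in the list by macro name
Switch(config)# interface range macro MyGroup
! Commands entered here are applied to all four ports
Switch(config-if-range)# switchport mode access
Switch(config-if-range)# switchport access vlan 10
Switch(config-if-range)# end
```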
ICMP Router Discovery Protocol (IRDP) uses ICMP router advertisement and router
solicitation messages to allow a network host to discover the addresses of operational
gateways on the subnet.
Having a gateway configured doesn’t guarantee that your traffic takes the shortest path,
for example when there are multiple gateways on the subnet. IRDP helps resolve this.
Routers configured with IRDP [(config-if)#ip irdp] advertise themselves as gateways to
which hosts can forward traffic. These advertisements are sent as broadcasts
(255.255.255.255) by default, or as multicasts (224.0.0.1) with ‘(config-if)#ip irdp multicast’.
By default, these messages are sent at intervals of between 450 and 600 seconds. To
change this, use the ‘(config-if)#ip irdp minadvertinterval SEC’ and ‘(config-if)#ip irdp
maxadvertinterval SEC’ commands. Verify with ‘show ip irdp’.
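A minimal sketch of IRDP on one router interface follows. The interface name and the timer values are assumptions chosen for illustration; ip irdp is entered per interface, and the maximum interval is set first so that the minimum never exceeds it.

```
Router# configure terminal
Router(config)# interface gigabitethernet 0/1
! Enable IRDP processing on this interface
Router(config-if)# ip irdp
! Send advertisements to 224.0.0.1 instead of the broadcast address
Router(config-if)# ip irdp multicast
! Advertise every 200 to 300 seconds instead of the default 450 to 600
Router(config-if)# ip irdp maxadvertinterval 300
Router(config-if)# ip irdp minadvertinterval 200
Router(config-if)# end
Router# show ip irdp
```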
StackWise
Up to 9 Catalyst 3750 switches can be stacked on one another to create a single switching
unit with a 32 Gbps switching stack interconnect. All switches can use the same resources,
and a master switch is elected to centrally manage the stack and build the CAM and
routing tables.
During the election, Layer 2 processing is not affected. However, when a new master is
elected, the routing table has to be flushed and rebuilt. Here are some mechanisms that
provide StackWise with high availability.
Mechanism                    Description
CrossStack EtherChannel      Allows multiple switches in a stack to create an EtherChannel
Technology                   connection between them. This prevents the loss of an individual
                             switch from affecting connectivity for other switches in the stack.
Equal Cost Paths             The switch stack can be dual-homed to multiple Distribution Layer
                             switches or multiple routers for redundancy.
1:N Master Redundancy        Each switch in the stack participates in the master election and is
                             therefore eligible to be elected master in the event that the master
                             fails.
Stacking Cable Resiliency    When a break in the stacking cable is detected, switches begin
                             sending data using the alternate path that is currently operational.
Online Insertion and         OIR allows switches to be added and removed from the switch stack
Removal (OIR)                while it is operational.
Distributed Layer 2          If the master switch fails, stack members continue Layer 2
Forwarding                   forwarding using the most recent tables received from the master.
RPR+ for Layer 3             As is the case with RPR+, each stack member is initialized and is
Resiliency                   ready to assume the role of master if the current master fails.
Power redundancy
The redundant power supply must be identical to the first, with the same power input and
output. There are 2 redundancy modes for the Catalyst 4500 and 6500: combined and
redundant.
Combined mode: both power supplies are used at the same time, powering as many
modules as needed. When one fails, the system powers down the modules for which there
isn’t enough power.
Redundant mode: the system draws power from both supplies, but never more in total
than one supply can deliver. When one fails, the other immediately starts providing full
power.
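On the Catalyst 4500 and 6500 the mode is selected in global configuration. The sketch below assumes a hostname of Switch and picks redundant mode purely as an example:

```
Switch# configure terminal
! Choose how the two power supplies share the load
Switch(config)# power redundancy-mode redundant
Switch(config)# end
! Review the power supply status and the power budget per module
Switch# show power
```

Choosing combined instead makes more total power available, at the cost of modules powering down if one supply fails.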
For the Catalyst 3750, the redundant power supply is external. It can be used with a UPS.