Page 1 of 50
Tutorial
What to Study -- and Not to Study
New Paradigms and Metaphors
Cisco's Switch Product Positioning
Failover Requirements
What does this mean in the context of switches?
Availability Terminology
Paging Mr. Murphy
Selecting Recovery Strategies
Cost and Complexity in Selecting Strategies
Recovery Time Requirements in Selecting Strategies
1:N, 1:1, and 1+1 Protection Strategies
Switch Platform Architecture: A Model
Practical Issues: What Are Ports?
Management
Hardware
Software
Control
Forwarding Tables and Populating Them
Forwarding
Ingress Buffering and Processing
Pattern Recognition
Fabric
Shared Bus
Shared Memory
Crossbar
Egress Processing
QoS at the Switch
Interfacing: the GBIC (Gigabit Ethernet Interface Converter)
Characterizing Switch Performance
Throughput
Blocking
Output Blocking
Grandfather Switch: Catalyst 5x00 Platform Family
Stacking and Clustering: 3750 and 2950
Midrange Flexibility: Catalyst 3550 Platform Family
A New Interface Paradigm
Hardware Aspects of Voice Ports
http://www.certificationzone.com/cisco/studyguides/component.html?module=studyguides... 5/31/2005
Management and Control
Forwarding
Catalyst 6000/6500 Platform Family
Management and Control
Database Manager
Forwarding
Switching Functions for High Availability
Layer 1/2 High Availability for Links and Interfaces
Layer 1 Failover
SONET and POS
Unidirectional Links: Detection Protocol (UDLD) and Configuring Unidirectional Ethernet
Layer 2 Aggregation
Preventing Broadcast Storms
Other Layer 2 Security and Management Enhancements
Private VLANs
802.1x -- Port-Based Authentication
DHCP-related Security Features
Growing Frames beyond Normal Size
Single Spanning Tree High Availability
Layer 2 Traceroute
Core/Backbone Switch Failure
Indirect Root Failures
Root Wars
Distribution Switch Failure
Performance Enhancements to Individual Spanning Trees
IEEE 802.1w Rapid Spanning Tree Protocol (RSTP)
Port Types in 802.1d and 802.1w
Port States in 802.1d and 802.1w
PortFast, BPDU Guard, and 802.1w Functional Equivalence
Root Wars and Root Guard
STP Convergence Time
Performance Enhancements to Multiple Spanning Trees
MSTP: Subdividing the Spanning Tree for Faster Convergence
MSTP Regions
IST, CIST, and CST
VLAN Tagging and VLAN Trunk Protocol (VTP)
VTP Pruning
Introduction
While most of the focus of this paper is on L2 switching, there is a significant amount on the architecture and implementation of "L3 switching". L3 switching is really routing, but the term has come to be associated with implementation techniques that do much of the work in specialized hardware. Please don't get confused trying to see how L3 switching is somehow different, in basic principles, from routing. It isn't. At worst, it's purely a marketing term; at best, it emphasizes certain implementations. It's no accident that the Cisco 12000 is called the Gigabit Switch Router (GSR), because it makes extensive use of hardware processing. Since it's targeted at a WAN and ISP market, however, Cisco avoids designating it a switch, to prevent confusion with enterprise and server farm relays.

This particular paper has many cross-references to other CertificationZone tutorials, and for good reason. The focus here is how a switch does something, while such things as the QoS, high availability, and security tutorials define why something is done.
"Is it SAFE?"
Well, the quote is from the movie "Marathon Man," which is guaranteed to give nightmares about going to dentists. However, SAFE itself doesn't seem to be an acronym -- at least, it's not spelled out in the main SAFE blueprint from Cisco. Part of the confusion about SAFE and ECNM seems to be that material about them is not on Cisco CCO. There is mention of ECNM in several security and design instructor-led courses, but there is no corresponding Cisco white paper. My best interpretation is that ECNM really means the overall design resulting from applying the three-layer hierarchical model to each appropriate subsystem of SAFE.

Some Cisco presentations to service provider audiences introduce a fourth hierarchical layer, "collection", between access and distribution. The collection layer involves broadband aggregation (e.g., IP over cable or DSL) between the user premises and the ISP -- it's where the broadband service provider lives. In the Cisco Enterprise SAFE document, http://www.cisco.com/warp/public/cc/so/cuso/epso/sqfr/safe_wp.htm, there is one mention of an "enterprise campus module". This module is composed of the campus proper, the "campus edge", and the edge of service provider networks. Cisco has not made it clear whether the "collection tier" is equivalent to the "campus edge" discussed in enterprise-oriented presentations. You may also want to look at an
If you are studying for the CCNP Switching or CCIE written examinations, you need to know about platforms that are not in the CCIE lab. The 6500 switch, for example, is Cisco's flagship product for large enterprises and internal use within ISPs. It has some unique features on which you might be tested.
Internet-Draft I coauthored, which hopefully will soon move to RFC, "Terminology for Benchmarking BGP Device Convergence in the Control Plane", http://www.ietf.org/internetdrafts/draft-ietf-bmwg-conterm-05.txt, where we draw a distinction between two functions in the Cisco "distribution tier", the "provider edge router" and the "inter-provider border router," as opposed to the "subscriber edge router". This distinction, while informal, captures some of the flavor of Cisco's "campus edge". While not listed as an official coauthor because we weren't allowed to list more than five coauthors, Alvaro Retana of Cisco was part of the team that wrote this document.

For many switches, you will need to recognize that there is a product family that includes more than one numbered series. For example, the 4000 series switches are modular, but the 2948G switches are very similar devices whose configurations are fixed.

Table 1. General Positioning Model for Enterprise Switches

Enterprise size | Wire closet | Server farm | Backbone
Small | Fixed configuration | Fixed configuration | Modular
Midrange | Modular | Fixed configuration | Modular
Large | Modular | Modular | Modular
You will find switches positioned for different functions, and for the same function within organizations of different size. Fixed configuration platforms are most associated with the smaller enterprises, but they also can be quite useful as aggregation platforms inside larger enterprises.
Real consolidation and a clearer picture of future trends came with the introduction of the 3550 and its IOS-based interface. This interface has considerable QoS capability, especially important for Cisco
Table 4. Qualifying the 1999 View for Enterprise Size
Enterprise size | Wire closet | Small | Midrange | Large | Backbone
Table 5. The View in 2003
Wire closet | Server farm | Core
6500
Failover Requirements
Selecting the appropriate level of availability is as much a business as a technical decision. In her book Planning for Survivable Networks, Annlee Hines has written extensively on the basis of these decisions. If you ever plan to recommend real network designs rather than simply pass tests, read her book! [Hines 2002]
My WAN Survival Guide [Berkowitz 2000] discusses some of these cost-benefit trade-offs from the enterprise standpoint, and my Building Service Provider Networks [Berkowitz 2002] looks at the trade-offs from the service provider viewpoint.

Table 6. Broad Goals for High Availability [Berkowitz 2000]

Availability Level | Server | Network
1 "Do nothing special" | Backups | Locked network equipment
2 "Increased availability: protect the data" | Full or partial disk mirroring, transaction logging | Dial/ISDN backup
3 "High availability: protect the system" | Clustered servers | ... backbone
4 "Disaster recovery: protect the ..." | |
High availability involves a great many cost trade-offs, some of which are "Layer 8" business rather than technical considerations.

Table 7. Costs of High Availability Mechanisms

Direct:
- Backup equipment
- Additional lines/bandwidth
- Floor space, ventilation, and electrical power for additional resources

Indirect:
- Design
- Network administrator time due to additional complexity; higher salaries for higher skills
- Performance drops due to fault tolerance overhead
If you choose to "pay me later" and accept failures, what are some of the costs of failures when they occur?

Table 8. Costs of Lack of Availability

Direct:
- Revenue loss
- Overtime charges for repair
- Salaries of idle production staff

Indirect:
- Lost marketing opportunities
- Shareholder suits
- Staff morale

Radia Perlman's doctoral thesis [Perlman 1988] was on the "Byzantine generals problem". She demonstrated that adding more network elements during certain kinds of failures not only does not increase availability but actually decreases it. The theoretical problem deals with a situation where the decision maker receives conflicting information from multiple sources, some of which is known to be untrue -- but it is not known which information is untrue. Sounds familiar from mutual redistribution problems, hmm? It applies to most routing mechanisms and to related mechanisms such as Layer 2 spanning trees.
Availability Terminology
Remember that the CCIE written exam is more concerned with protocol theory and features than with specific configuration of routers to use them. This section will give you a good deal of information relevant to the theory of many protocols. For more detail, see the High Availability tutorial.

We often speak of single points of failure. Multiprotocol Label Switching (MPLS) has refined that definition into the shared risk group (SRG). The basic definition of an SRG is "a set of network elements that will be affected by the same fault". SRGs can apply to all sorts of network resources, and a given resource can belong to more than one SRG. A shared risk group of routers might be all of those on a common electrical power supply.
Infrastructure: Commercial power
Physical: Cable in common duct, single shared medium
Data Link: Cables in common multilink bundle
Network: Router; routing software session/instance
Transport: TCP software
Application: Single DNS server
One of the classic SRGs is the common cable or cable duct that gets cut by construction workers. While building alternate cable runs to the telco end office historically is prohibitively expensive, new Cisco technology gives you some creative alternatives. It may not be expensive, balanced against the cost of downtime, to run a wireless LAN from your main router to a router in a nearby building. That alternate router would connect to the end office, at the very least, via a different cable, and ideally would connect to an entirely different office. The bandwidth available to you from one wireless LAN, or a small number of parallel wireless LANs, usually will be comparable to your normal WAN uplink. When the WAN bandwidth requirements are substantial, you still can get laser or wireless links from non-Cisco vendors, providing short-haul bandwidth up to OC12 (622-Mbps) rates.
This discussion assumes that the recovery technology does have sufficient resources to protect against at least a single failure without human intervention. Outside the scope of this discussion are failures where mean time to repair (MTTR) is significant because it requires human intervention, possibly at unmanned sites, and possibly where spares need to be shipped in. You must, however, always remember why you want a particular level of survivability and build against the defined requirements. Designers and, unfortunately, traditional telephony people often use the 50-ms cutover goal of SONET as the gold standard. This number is derived from SS7 characteristics of large carrier networks. VoIP is much more tolerant of drops, tolerating 140 ms to 2 s.
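One place where recovery-time goals translate directly into configuration is link-failure detection. The sketch below, with an assumed interface name, uses the IOS carrier-delay timer so the router reacts to loss of carrier immediately rather than waiting out the default debounce interval; whether that is wise depends on how flappy the link is.

```
! Illustrative interface only -- report link-down to the control
! plane with no delay, so L2/L3 recovery can start at once
interface GigabitEthernet0/1
 carrier-delay msec 0
```

On a link that bounces frequently, a nonzero carrier-delay deliberately trades slower failover for stability.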
Path Failure (PF): Recovery mechanisms have decided the path has totally lost connectivity.

Link Failure (LF):
... although OSPF does have a specific notification abstraction, especially for demand circuits. Usually associated with an SNMP trap.

BGP or IGP route withdrawal. Generally considered poor practice to announce periodically.

A signal, repeatedly transmitted, that a fault along a path has occurred, passed along the path until it reaches a network element capable of initiating recovery.

Indication that a fault along a working path has been repaired.
You may have real-time applications such as telepresence, telemetry, etc. that must have predictable delay. Delay may also be a commercial differentiator for competitive offerings of mission-critical business applications such as automatic teller machines, credit authorization, and transaction-based Internet commerce.
Dynamic discovery
Both 1:N and 1:1 schemes may use the backup resource for lower-priority traffic, which can be instantly pre-empted if the working resource fails. 1+1 protection adds application complexity, because the applications need to be able to decide which copy of the information should be used. 1+1 is very rare in networking. You will see it in SS7 telephony control networks, but it is not used extensively in enterprise networking. In switches, you may see something similar where Cisco Nonstop Forwarding supports a hot-standby processor.
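As a concrete enterprise example of a 1:1-style protection strategy at Layer 3, HSRP pairs a working router with a protection router that takes over a shared virtual address on failure. The addresses, VLAN, and group number below are illustrative assumptions, not from the text.

```
! Working router -- higher priority, reclaims the active role
! when it recovers
interface Vlan10
 ip address 10.1.10.2 255.255.255.0
 standby 10 ip 10.1.10.1
 standby 10 priority 110
 standby 10 preempt
!
! Protection router -- default priority 100, becomes active
! only when the working router fails
interface Vlan10
 ip address 10.1.10.3 255.255.255.0
 standby 10 ip 10.1.10.1
```

Hosts point their default gateway at the virtual address 10.1.10.1, so the failover is invisible to them.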
You can look at a switch abstractly as a relay. Relays are devices with at least two interfaces, which accept data on one interface and send it out another. A range-extending repeater, operating at the physical layer, is the simplest type of relay, with only one input and one output. Ethernet hubs are still relays, although they copy the data onto an internal shared medium and fan the contents of that medium out to all other ports. You really can't get a good sense of relays until layer 2, where the platform software has to make a decision about which egress interface to use.

While Cisco likes to talk about frames vs. packets vs. segments vs. messages, doing so is not correct OSI terminology. OSI formalism sometimes is very pedantic, but some of its terminology can be very precise and unambiguous. OSI documents speak not of specifically named units at every layer (e.g., frame at layer 2), but of Protocol Data Units (PDUs). At a specific layer, you speak of Transport PDUs or Data Link PDUs. Another useful concept, especially when dealing with protocol encapsulation, is that the layer above the current layer is called (N+1) while the layer below is (N-1). From the perspective of the network layer, it receives (N+1)-PDUs from Transport, and sends out (N-1)-PDUs to Data Link.

A relay, to use the term from the OSI formalism, is a device (or software function) with at least two interfaces. It receives PDUs on one interface and de-encapsulates them until it has the information on which it will make forwarding decisions. Ignoring devices such as multilayer switches, devices such as bridges and LAN and WAN switches accept physical layer bits, build them into Data Link PDUs, and make forwarding decisions on information at Data Link. Routers receive bits, form frames, and extract Network PDUs from the Data Link PDUs.
After examining Network Layer information, they internally forward Network PDUs to an outgoing interface, and then encapsulate these into Data Link PDUs and then Physical Layer information. To make any of these forwarding decisions, the relay must first have an association between destination (and possibly other) information in the PDU at which it makes decisions, and information about the appropriate outgoing interface. The process of learning these associations is path determination. In bridges and LAN switches, path determination involves the spanning tree protocol, VLAN protocols, and source routing. In routers, path determination involves static and dynamic routing, as well as the up/down state of hardware interfaces.
- Source of traffic to be sent to the SPAN monitoring port
- Port associated with SPAN analysis (e.g., RMON)
Don't confuse the physical port types in Table 12 with the spanning tree port types in Table 35. A port can have both a physical type and a spanning tree type.
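To make the SPAN source/monitor distinction concrete, here is a minimal, illustrative IOS SPAN session; the session number and port numbers are assumptions for the example.

```
! Copy frames received and sent on Fa0/1 (the SPAN source)
! to the analyzer attached to Fa0/24 (the SPAN destination)
monitor session 1 source interface FastEthernet0/1 both
monitor session 1 destination interface FastEthernet0/24
```

The destination port stops behaving as a normal switch port while the session is active, which is why it gets its own physical port type.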
Management
In a relay, the management function is concerned with building the forwarding "map", whether that is a spanning tree at OSI Layer 2, a routing table at Layer 3, or content switching tables at higher layers. Other functions include exception processing such as ICMP, running routing and spanning tree protocols, etc. Management obviously includes the automated management functions (e.g., TFTP, logging) and the human interface.
Hardware
Management functions usually are implemented in general-purpose processors. As performance requirements grew more stringent, the processor often became a Reduced Instruction Set Computer (RISC) design rather than a Complex Instruction Set Computer (CISC) design. Under some conditions, forwarding uses the same processor as management.
Software
Management is primarily a software function. Clearly, this is the role of the human interface, be it textual or Web-oriented, or be it any of the different switch operating systems.
Control
Control software runs management functions, including the human interface, as well as topology learning with spanning tree and dynamic routing protocols.
Early route caches were quite small, either 512 or 1024 entries. This small number of entries worked acceptably in an enterprise, which typically has a moderate number of frequently used routes, but was a severe performance limitation in ISP routers. Distributed switching on VIPs was a major performance advance, because the VIP FIB has a one-to-one correspondence with the RIB. With this correspondence, there never will be a cache fault.
Forwarding
At a general level, let's consider the forwarding modes, also called switching paths, in Cisco platforms.

Table 13. L2 switching modes

Switching mode | Speed | FIB:RIB relationship
"Software" | Slowest but most intelligent | RIB and FIB are the same.
"Hardware" -- CAM for L2 | Default mode and most common at layer 2 | May be centralized or distributed. Uses Content Addressable Memory, requiring an exact match.
"Hardware" -- TCAM for L2 and L3 | Good compromise between speed and intelligence | May be centralized or distributed. Uses one or more Ternary Content Addressable Memories.
Table 14. L3 switching modes

Switching mode | Speed | FIB:RIB relationship
Process switching | Slowest but most intelligent | RIB and FIB are the same.
Fast switching | Default mode, faster than process | FIB is in RAM, and is smaller than the RIB.
Autonomous, silicon, and optimum | Fast; hardware-assisted and platform-dependent | FIB is in special hardware, and is much smaller than the RIB.
Express | Fastest, especially when distributed into multiple Versatile Interface Processors | FIB is a full copy of the RIB.
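On IOS routers, the switching path is configurable. A hedged sketch of enabling the "Express" mode above (Cisco Express Forwarding), under the assumption of a CEF-capable image:

```
! Build the FIB as a full copy of the routing table, so there
! are no cache faults on the first packet of a flow
ip cef
!
! On VIP-equipped 7500-class routers, push a copy of the FIB
! down to each line card for distributed forwarding
ip cef distributed
```

`show ip cef` then displays the FIB, which should track the routing table entry for entry.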
Pattern Recognition
Ingress processing, in the real world, gets complicated by frequent requirements to recognize patterns in the packet or frame, patterns other than the destination. Among the most common is what we generically call an access control list (ACL), which checks certain fields, usually with a mask that
indicates whether the value of a bit is to be checked, or if the pattern will accept any bit value in that position (i.e., wild card). When you consider wild cards as well as a bit being one or zero, you introduce ternary logic, a step beyond a simple binary on-or-off decision. Cisco now describes the individual lines in an ACL as access control entries (ACE). You can recognize patterns, at L2 and L3, for various reasons, including security filtering, special routing (e.g., source routing) or QoS recognition and marking.
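Each line in the illustrative numbered ACL below is one ACE, and the wildcard mask shows the ternary idea directly: a 1 bit in the mask means "don't care", exactly the third state a TCAM stores alongside match-0 and match-1. The addresses are assumptions for the example.

```
! ACE 1: permit any host in 10.1.1.0/24 -- the 0.0.0.255 wildcard
! marks the last octet as don't-care bits
access-list 101 permit ip 10.1.1.0 0.0.0.255 any
!
! ACE 2: deny exactly one host -- every bit is checked
access-list 101 deny ip host 10.1.2.99 any
```

A TCAM evaluates all such entries in parallel, which is why hardware ACL processing scales with table size rather than list length.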
Templates
Switch Database Management for TCAMs was introduced on the 3550. Originally, there were four templates, which would set TCAM elements to an optimal solution for:
All the templates assume 8 routed interfaces and 1K VLANs.

Table 15. 3550 Template Assumptions
TCAM unicast MAC address IGMP group Access Default Routing VLAN 1024 2048 5120 1024 1024 1 8 1 5120 1024 512 512 16384 1024 8192 1024 1024 0 0 0
QoS Access Control Element (ACE) 1024 Security ACE Unicast Routes Multicast Route 2048 2048 2048
Notice that the default template is optimized to support a large number of MAC addresses in the MAC table and a large number of IP routes in the routing table. The trade-off is fewer resources for IGMP groups, QoS, and security-related access control entries (lines in access lists). The routing template offers support for twice as many routes (16,000 versus 8,000), but far fewer access control entries and QoS entries. In contrast, the VLAN template disables routing entirely and focuses all resources on L2 and VLAN support.

As Chuck Larrieu put it in his 3550 Tutorial, "While it is unlikely that any CCIE Lab scenario would stress any of these settings, it is possible that a Candidate might be asked to 'assure that SVI support is maximized' or 'ensure that L3 functionality is not compromised by L2 considerations'." It is equally possible that a candidate for a written exam -- CCIE or CCNP -- might be asked a similar question. It's likely that the template model will spread to platforms other than the 3550.
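For reference, the template is selected with the `sdm prefer` global configuration command and only takes effect after a reload; a brief sketch, with the prompt names assumed:

```
Switch(config)# sdm prefer routing
Switch(config)# end
Switch# reload
! After the reload, confirm which template is active
Switch# show sdm prefer
```

Choosing `sdm prefer vlan` instead would maximize L2 resources at the cost of disabling routing, matching the VLAN column of Table 15.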
"Cisco has created within the 3550 platform the means of customizing and optimizing system resource allocation based on particular application or requirement. For example, if a particular Forwarding models switch was strictly Layer 2, or a series of switches had a large number of Demand-based forwarding requires that the first packet of connected stations and a large number of a flow must go through the "slow" or "software" path, VLANs as well, then one could reallocate which then populates a high-speed table. You will see this resources to favor VLAN, while disabling in the Supervisor 1A/MSFC on the 6500. routing and freeing up routing resources. On the other hand, if a particular Topology-based forwarding, on the 6500 with Supervisor 2, installation required extensive QoS or the 4000 with Supervisor 3, and the 3550, breaks the security configurations, an administrator dependence on software lookup. could optimize the switch to allocate resources for those activities."
Fabric
The fabric interconnects the input and output interfaces. There are three main types of fabric:
- Shared bus
- Shared memory
- Crossbar
A given switch will have one or more types of fabric. Indeed, on high-performance switches such as the 6500, the highest-speed fabric is a separate card, not just part of the backplane. Don't make the mistake I did, early in my career, and equate the backplane with the fabric. The backplane tends to be passive or nearly so. The active fabric will be on the supervisor card (or integrated equivalent), and sometimes on a separate plug-in card. Indeed, a single platform can have more than one fabric.

Table 16. Fabrics by Platform

Platform | Fabric type | Speed in Gbps *
2900 | Shared memory | 8.8
2955 | Shared memory | 13.6
3550 | Shared memory | 8.8, 13.6, 24 [1]
3750 | Shared memory | 32 [2]
4000 | Shared memory | 32
4500 | Shared memory | 28, 64
5000 | Shared bus | 1.2
5500 | Shared bus | 3.6
6000 | Shared bus | 32
6500 | Crossbar | 256

* Cisco specifications are not always clear about whether the bandwidth stated is unidirectional or adds together the two directions
[1] Depends on platform model
[2] Total bandwidth for stack
Shared Bus
Most lower-performance devices use a shared bus as the fabric. A single bus allows a connection between two interfaces, with all interfaces contending for the bus. Don't fall into salesdroid traps and assume faster is always better. Shared bus is the cheapest solution, and thus appropriate for workgroup and other small switches where cost is more important than performance. The fabric is usually built into the backplane. Some devices, such as the 5500 switch, may have several busses bridged into one, and the throughput figure is the sum of the bus speeds.
- Each port must arbitrate for access
- Broadcast and multicast are easy
- Oversubscription is normal
- Flooded data decreases end-station performance
- Destination must be only those ports that need that traffic
Shared Memory
Shared memory systems keep the frame or packet in memory until the last egress interface is finished with it. Memory management can be simple or difficult, depending on whether or not there are requirements for QoS and/or multicast. QoS requires static buffer allocation in the shared memory. When you are multicasting, unless there are enough concurrent ports into the memory to service all egress ports in the multicast group simultaneously, the packet or frame has to stay in memory until the last egress port transmits it.
Crossbar
Crossbar designs are a full mesh, allowing concurrent communications between any pair of interfaces. Obviously, there is no contention for unicast forwarding. Crossbars are the fastest fabric technology. There may be several cooperating crossbars within a large switch or router, as the ASICs involved are typically not greatly larger than 16x16. Multicasting on crossbars can be a challenge, since the one-to-one relationship inherent to a crossbar is not a good fit to the one-to-many of multicast involving multiple egress interfaces. Crossbar works perfectly well in the middle of a multicast tree, where you have a single egress interface for a multicast group address. Shared memory fabrics may work better for multiple-egress-interface multicasting.
Egress Processing
In most switches and routers, the bulk of the processing is done at the ingress. Such functions as egress QoS, data link protocol conversion, etc., do take place in the egress card. When the egress port connects to a server that is incapable of wire-speed operation, output buffering may be needed to avoid drops. In such cases, the amount of output buffering designed into the switch involves delicate tradeoffs. Too little buffering causes data drops, but too much buffering can cause unacceptable delay.
Switch#show qos maps dscp tx-queue DSCP-TxQueue Mapping Table (dscp = d1d2) d1 : d2 0 1 2 3 4 5 6 7 8 9 ------------------------------------0 : 01 01 01 01 01 01 01 01 01 01
1 : 01 01 01 01 01 01 02 02 02 02
2 : 02 02 02 02 02 02 02 02 02 02
3 : 02 02 03 03 03 03 03 03 03 03
4 : 03 03 03 03 03 03 03 03 04 04
5 : 04 04 04 04 04 04 04 04 04 04
6 : 04 04 04 04
interface command.
A special bandwidth subcommand of tx-queue, not to be confused with interface bandwidth, can allocate a guaranteed minimum bandwidth to each of the four queues. At present, this is only available on non-blocking Gigabit Ethernet interfaces. For a 4000-specific example of such ports, see Table 22. If you enable global QoS without bandwidth statements, each queue will get 250 Mbps. Do be aware that the switch does not check for consistency among the assignments, and it will let you oversubscribe (e.g., assign 250 Mbps to queues 1 and 2 and 500 Mbps to queues 3 and 4). As long as a transmit queue is below the preconfigured share and shaping values, it is considered high priority and is served by the priority queuing discipline. Queues that have met the share and shape values will be serviced after the high-priority queues. Only if no high-priority queues exist will strict round robin be observed. The priority discussed here is not directly associated with the DSCP
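A hedged sketch of the tx-queue bandwidth subcommand on one of the non-blocking Gigabit ports discussed above; the interface and the bandwidth values are illustrative assumptions.

```
interface GigabitEthernet1/1
 ! Guarantee queue 3 a 300-Mbps minimum and serve it
 ! with the priority discipline while under its share
 tx-queue 3
  bandwidth 300 mbps
  priority high
 ! Guarantee queue 4 a 200-Mbps minimum
 tx-queue 4
  bandwidth 200 mbps
```

Because the switch does not cross-check the assignments, it is up to you to keep the four guarantees within the port's actual capacity.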
Throughput
For the standard definition of throughput, see RFC 2544. Figure 2 shows test configurations for the
Device Under Test (DUT) with both integrated and separate load generators and receivers.
+------------+ | | +------------| tester |<-------------+ | | | | | +------------+ | | | | +------------+ | | | | | +----------->| DUT |--------------+ | | +------------+
+--------+ +------------+ +----------+ | | | | | | | sender |-------->| DUT |--------->| receiver | | | | | | | +--------+ +------------+ +----------+
Figure 2. Standard Throughput Measurement Topology

I find it amusing that the presentations by Cisco technical people at Networkers often give a reduced but practical definition of throughput. For example, you'll hear the figure 256 Gbps used to state the throughput of the fabric module on a 6500 series switch. The maximum one-way throughput, however, is 128 Gbps. The sales figure adds together the maximum speeds in each direction, doubling the throughput. It's less amusing if you are asked to answer a question on "speeds and feeds", and it's not clear whether the question is looking for a unidirectional or bidirectional measurement. There's no simple solution here, other than to read the question carefully and see if it makes the conditions of measurement clear. I'd be more tempted to go with a salesy answer if I were taking a sales certification exam. Somehow, we've managed to avoid widespread propagation of the idea that full-duplex Fast Ethernet has a throughput of 200 Mbps, but this "spin" of the truth still seems popular in describing the throughput of a routing or switching platform.
Blocking
A source of much fear, uncertainty, and doubt (FUD) in switch marketing is whether a forwarding system is blocking or nonblocking. The usual definition of a nonblocking switching fabric is that the fabric is fast enough to transfer all traffic, without loss, while all ports are active. This definition is somewhat flawed. A better way to speak of a nonblocking fabric is one that can keep up with a set of input ports, each of which is outputting to a unique output port of the same or greater speed. Sales presentations for nonblocking relays tout their advantage over blocking devices. In practice, this is often a theoretical rather than a practical advantage. There is an underlying assumption of how nonblocking performance is measured, as shown in Figure 3. In a blocking switch, the fabric is too slow for full noninterfering transmission. In Figure 3, input and output ports are paired, as required by RFC 2544. Every input has a dedicated output.
Output Blocking
Output blocking is fairly common, and you must understand that it is a client or server problem, not a switch or network problem, unless an intermediate, blocking relay is connected to the output port. Output blocking occurs when two or more ingress ports try to send simultaneously to the same egress port. Remember that the RFC 2544 throughput specification is explicit that each ingress port relays only to a single egress port. In this situation, the fabric speed is irrelevant, because the problem is at the egress port (Figure 4). You can trade off delay against data loss by providing output buffering. When QoS must be controlled, you need to think through ingress and egress parameters so unacceptable delay will never occur at an egress port.
Figure 4. Output Blocking -- Don't Blame the Switch!

Some vendors, though not Cisco, support a technique that buffers at the ingress when the external destination cannot accept data fast enough, or the egress interface is busy. Input buffering, unless very carefully designed, can lead to head-of-line blocking (Figure 5). What Cisco has done is produce a GE (Gigabit Ethernet) interface card for the 4000, which has 18 ports that share a 6 Gbps path into the fabric. These numbers were chosen because many Wintel servers can't generate more than 300 Mbps of traffic. With such servers, there's still a benefit to using GE: you reduce latency in transmission, use single GE NICs rather than Fast EtherChannel, and leave room for growth.
In modern switches, head-of-line blocking cannot occur, because there can be concurrent transfers between input and output ports. Such concurrency can be virtual, but is now generally physical: a multiported shared memory, or a crossbar, will not get "stuck" waiting for a frame to transfer. Even with a blocking fabric, modern switch design prevents head-of-line blocking by creating multiple virtual queues in the input buffer, so that one frame can never prevent another frame from reaching the fabric.

With a blocking fabric and a single input buffer queue, however, you can have the scenario in Figure 5. This scenario involves ingress interfaces that each have a single first-in-first-out (FIFO) buffer. Assume that two frames destined to output port 3 arrive simultaneously, one on port 1 and the other on port 2, and that port 2 gains control of the fabric first. Port 3, obviously, can only send one frame at a time, so port 1's frame waits at the head of its queue. Now assume another frame, destined for output port 4, arrives at port 1 behind the waiting frame. Even though port 4 is idle, and in principle the transfer to it could proceed in parallel, port 1 cannot send that frame: the frame at the head of its FIFO is held back by the "backpressure" from output port 3, and the fabric cannot see past the head of a FIFO. That is head-of-line blocking: the data unit "behind" the blocked data unit has to wait. Shared memory buffering in more modern switches tends to avoid head-of-line blocking, since all ports have access to the memory.

You encounter head-of-line blocking in daily life when you are driving in the right lane and come to a traffic light where you want to turn right. Your car, however, is second in line, and the car in front of you wants to go straight. If that car were not at the head of the right lane, you could turn right on red. You are, however, blocked at the head of the line.

Given an understanding that blocking may occur even in a "nonblocking" design, an any-to-any crossbar architecture may not improve performance at lower speeds [Berkowitz 1999, pp. 197-199].
Clustering provides the functionality of stacking, but removes some of the limitations. Stacked switches need to be in close proximity, as in a single wiring closet. A cluster, however, can be defined among switches in different locations reachable over the same LAN. The members of a cluster are selected dynamically rather than by the physical wiring used in a stack. Members of a cluster can be linked with a dedicated cable and GBICs, as in stacks, but also with Fast Ethernet or Fast EtherChannel. Since clustering no longer requires physical proximity of the switches, clustering is available on mixtures of up to 16 switches, including the Catalyst 3550, 2950, 3500 XL, 2900 XL, 2900 LRE XL, and 1900 series. While clustering functionality was introduced with the 3512 XL, 3524 XL, and 3508G XL, only the 3508G XL is still sold; its replacements are the Catalyst 3550 and 2950 series. The 3508G XL is still supported, primarily as a GE concentrator. 3750 series switches are only semi-modular: they have fixed ports for 10/100 Ethernet, plus some number of Small Form-Factor Pluggable (SFP) uplink ports. SFP transceivers, a smaller successor to the Gigabit Interface Converter (GBIC), plug into these ports. You will also find SFP ports on the 3550 series.
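As a hedged sketch, cluster formation on IOS-based members takes only a few commands. The cluster name, member number, and MAC address below are hypothetical, and exact syntax varies by platform and release:

! on the switch that will be the cluster command switch
cluster enable Engineering 0
! candidates are discovered via CDP; add one as member 1
! (the MAC address is a made-up example)
cluster member 1 mac-address 0001.0002.0003

Once a member is added, it is managed through the command switch's IP address, which is part of what removes the physical-proximity requirement of a stack.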
Switch_2(config)#interface ?
  FastEthernet     FastEthernet IEEE 802.3
  GigabitEthernet  GigabitEthernet IEEE 802.3z
  Port-channel     Ethernet Channel of interfaces
  Vlan             Catalyst Vlans
A "port-based" VLAN is a physical port that either has not been configured at all (in which case it is by default a member of VLAN 1) or has been placed into a particular VLAN via the switchport access vlan command. It should be apparent that port-based VLANs are Layer 2 only. A physical port becomes a Layer 3 port when you issue the no switchport command on the interface. Once this has been done, the port can be given an IP address and can be entered into a routing domain.
Switch_2(config-if)#ip address 10.3.3.1 255.255.255.240
% IP addresses may not be configured on L2 links.
Switch_2(config-if)#no switchport
Switch_2(config-if)#ip address 10.3.3.1 255.255.255.240
Switch_2(config-if)#
Figure 6. Creating a Layer 3 Port

A switch virtual interface (SVI) is a logical interface that represents a VLAN of physical switch ports to the routed or bridged processes of the switch. Some detail will be given in the fallback bridging section. For now, let it be said that configuration and capability are similar to those of loopback interfaces. This takes the concept of VLANs just a very small step beyond the thinking on the earlier Catalyst switches. Unlike the creation of a loopback interface, the creation of an SVI is a two-step process:

1. Create the VLAN, using either the vlan database command from privileged EXEC mode or the vlan command from global configuration mode.
2. Invoke the SVI by entering the interface vlan command from global configuration mode.
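As a hedged illustration of the two steps (the VLAN number matches the display in Figure 8; the IP address is hypothetical):

! step 1: create the VLAN in global configuration mode
Switch_2(config)#vlan 307
! step 2: create the SVI; it can then take Layer 3 configuration
Switch_2(config)#interface vlan 307
Switch_2(config-if)#ip address 10.3.7.1 255.255.255.0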
Switch_2#show interface
Vlan307 is up, line protocol is up
  Hardware is EtherSVI, address is 0009.b775.d400 (bia 0009.b775.d400)
  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:01:48, output never, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     0 packets input, 0 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 packets output, 0 bytes, 0 underruns
     0 output errors, 0 interface resets
     0 output buffer failures, 0 output buffers swapped out
Figure 8. Displaying an SVI

Even though the VLAN is not assigned to a physical port, and even though there is no other configuration on the SVI, the SVI shows "up" and "up". The integration of the Layer 2 and Layer 3 functionality takes place at the SVI level. With the 3550 paradigm, you also have the capability to define an L2 port as an access port, a trunk port, or a voice port:
Switch_2(config-if)#switchport ?
  access  Set access mode characteristics of the interface
  trunk   Set trunking characteristics of the interface
  voice   Voice appliance attributes
  <cr>
Switch_2(config-if)#switchport voice vlan 77
Switch_2(config-if)#switchport access vlan 78
Figure 9. SVI Functionality

The above configuration shows that voice and data VLANs can co-exist on the same port.
Forwarding
Like the 4000 and related platforms, the 3550 has shared memory. All forwarding decisions take place in "Satellite" ASICs. Control information is sent on a separate control ring to the egress interfaces, while the data part is stored in shared memory.
Table 17. 3550 Filtering Capacity

Resource                                        Limit
Access control list                             512 security (256 in/256 out); 128 QoS
Access control entry (i.e., a line in an ACL)   4000 security

Depending on the specific 3550 model, there may be more than one TCAM.

Table 18. Number and Use of TCAMs in 3550 Models

Model      TCAMs  TCAM use
3550-24    1      All interfaces on the same TCAM
3550-48    2      Fast Ethernets 1-36 on TCAM 1; all others on TCAM 2
3550-12T   3      Interfaces 1-4 on TCAM 1; 5-8 on TCAM 2; 9-12 on TCAM 3
3550-12G   3      Interfaces 1-4 on TCAM 1; 5-8 on TCAM 2; 9-12 on TCAM 3
For increased availability, load-sharing redundant power supplies work with all models of the 4500 series. Only the 4507R supports redundant supervisors.
Table 19. 4000 Platform Comparison

                 4000 (classic)    4506
Fabric (Gbps)    24                64
Chipset          Three K1 ASICs    NA
Layers           L2                L2, L3
Processors       1                 1
NFSC [1]         No                No
MAC addresses    ...               ...
OS               CatOS             IOS

[NA] Not available
[1] NetFlow Services Card
Forwarding
In the 4000, the actual forwarding is done using a K2 chipset. The packet engine/supervisor populates TCAMs in the forwarding engines. The 4000 fabric is shared memory. With the Supervisor 3, cards have a 6 Gbps path into the fabric, so populating it with 4 cards establishes an aggregate bandwidth of 24 Gbps. Adding two GE uplinks brings the aggregate to 32 Gbps.

The 4000 has an interesting variant on line cards. Each slot has 6 Gbps of bandwidth, so you might assume that a card will not have more than 6 GE interfaces. This is true for cards intended for uplinks, but there is also a server-oriented 18-port GE interface card. This card exploits the reality that most servers with GE interfaces cannot drive those interfaces at full speed. While the original 4000 had L2 switching only, an L3 path has been added with the Supervisor 3 and the Layer 3 Services Module. You can add multiple Layer 3 Services Modules for greater bandwidth.

Table 20. 4000 Forwarding Performance

L2 switching decisions    48 Mpps
L3 routing decisions      48 Mpps
L2-4 ACL processing       48 Mpps
QoS marking/processing    48 Mpps

Filtering is in the fast path. Traffic policing is also in the fast path, but traffic shaping takes place in the supervisor.

Table 21. 4000 Filtering Capacity (Supervisor 3)

Resource                                        Limit
Access control list                             1024 (this number combines security and QoS)
Access control entry (i.e., a line in an ACL)   16,000 in/16,000 out security; 16,000 in/16,000 out QoS

To understand QoS filtering, you must be aware of several assumptions. First, L2 prioritization depends on the QoS value in an ISL or 802.1Q header. Second, L3 IPv4 prioritization depends on either the Differentiated Services Code Point (DSCP) or the IP precedence value; both are carried in the Type of Service (ToS) byte of the IPv4 header. This discussion applies to the Supervisor III and, unless specifically mentioned, to the Supervisor IV. Given that the fabric is nonblocking, there is no input queuing. Each output interface has four queues, of 240 packets each for Fast Ethernet and 1920 packets each for non-blocking Gigabit Ethernet.

Table 22. Blocking and Non-Blocking Port Types on the Catalyst 4000 Series

Non-blocking:
- Supervisor III and IV uplinks
- all ports on the WS-X4306-GB line card
- the two 1000BASE-X ports on the WS-X4232-GB-RJ line card
- the first two ports on the WS-X4418-GB line card
- the two 1000BASE-X ports on the WS-X4412-2GB-TX line card
- the WS-X4424-GB-RJ45 line card
- the WS-X4448-GB-LX line card

Blocking:
- the 10/100/1000T ports on the WS-X4412-2GB-TX line card
- all other ports

Switch supervisors often do not support the range of QoS measures found on a router platform. For example, Weighted Random Early Detection (WRED) is not supported on most switch platforms, but is available on routers like the 7200; the 6500 switch is an exception that supports WRED. Depending on the model, 4500 platforms have 28 to 64 Gbps of shared memory backplane. With the Supervisor III or IV, the fabric is fast enough to allow all interfaces to run at wire speed, without fabric blocking.
Database Manager
On a high-end platform such as the 6500, the more traditional limiting factors such as bandwidth are less often a problem than resource contention and exhaustion. You need to understand which ACL and related functions are done in software, creating a centralized bottleneck. Critical resources also can be in the distributed forwarding cards; in particular, these include masks in TCAM, the Logical Operation Units (LOUs), and the ACL-to-switch interface mapping labels. TCAM entries, LOUs, and ACL labels are limited resources, so depending on your ACL configuration, you might need to be careful not to exhaust them. In addition, with large QoS ACL and VACL configurations, you also might need to consider Non-Volatile Random Access Memory (NVRAM) space. Remember that booting a configuration from a TFTP server is a workaround for configurations that won't fit into NVRAM.

Table 24. ACLs Processed in Software in Cisco Catalyst 6500 Series Switches

ACL denied traffic
  Supervisor 1a with PFC -- ACL-denied packets are processed in software if the interface does not have the no ip unreachables command configured.
  Supervisor 2 with PFC2 -- ACL-denied packets are leaked to the MSFC2 if unreachables are enabled. Packets are leaked at 10 packets per second (pps) per VLAN (Catalyst OS software with Cisco IOS Software) or one packet every two seconds per VLAN (Cisco IOS Software).
  Supervisor 720 with PFC3 -- ACL-denied packets are leaked to the MSFC3 if unreachables are enabled. Packets requiring ICMP unreachables are leaked at a user-configurable rate (500 pps by default).

Traffic denied in an output ACL (Supervisor 1a with PFC only)
  If traffic is denied in an output ACL, an MLS cache entry is never created for the flow. Therefore, subsequent packets do not match a hardware cache entry and are sent to the MSFC, where they are denied in software.

IPX filtering based on unsupported parameters (such as source host)
  On the Supervisor 720, Layer 3 IPX traffic is always processed in software.

ACEs requiring logging (log keyword)
  ACEs in the same ACL that do not require logging are still processed in hardware. Supervisor 2 with PFC2 and Supervisor 720 with PFC3 support rate-limiting of packets redirected to the MSFC for ACL logging.

TCP intercept
  Supervisor 1a with PFC -- Traffic permitted in a TCP intercept ACL is handled in software.
  Supervisor 2 with PFC2 and Supervisor 720 with PFC3 -- The TCP three-way handshake (SYN, SYN/ACK, ACK) and session close (FIN/RST) are handled in software; all remaining traffic is handled in hardware.

Policy-routed traffic (if match length, set ip precedence, or other unsupported parameters are used, or if the mls ip pbr command is not configured)
  The set interface parameter is supported in software, with the exception of the set interface Null0 parameter, which is handled in hardware on Supervisor 2 with PFC2 and Supervisor 720 with PFC3.

WCCP redirection for HTTP requests (Supervisor 1a with PFC only)

Traffic requiring Network Address Translation (NAT) (Supervisor 1a with PFC and Supervisor 2 with PFC2); traffic requiring NAT translation or NetFlow setup (Supervisor 720 with PFC3)

Traffic denied in a uRPF check ACL ACE (Supervisor 2 with PFC2 and Supervisor 720 with PFC3); any uRPF check configuration (Supervisor 1a with PFC)

Non-IP (all Supervisors) and Supervisor 1a with PFC and
Forwarding
Depending on the model and features, the 6000 series may use one of several fabric methods. On the 6000, "classic" line cards are interconnected by Pinnacle ASICs to a 16 Gbps bus. Since the bus is bidirectional, it is marketed as 32 Gbps. In contrast to the 6000, the 6500 has a crossbar fabric, which is mounted on a separate card. This Switch Fabric Module (SFM) has a 128 Gbps one-way, or 256 Gbps full-duplex, capacity. Individual card channels are 8 Gbps; there are two channels per slot. Maximum throughput in the 6000 is 15 Mpps, while the 6500's maximum is 30 to 210 Mpps, depending on whether the SFM is present. Slots can have different speeds, even within the same platform: on the 6506 and 6509 switches, and the 7606 router, all with SFM or SFM2 fabrics, each slot gets 16 Gbps; on the 6513, slots 1-8 get 8 Gbps but slots 9-13 get 16 Gbps. In the 6500, the Medusa ASIC interconnects the local card bus and the crossbar fabric. It also connects fabric-enabled cards to the 32 Gbps shared bus. Remember that the 6500 supports 10 Gbps Ethernet and the 7600 supports OC-192; while these are considered, respectively, LAN and WAN interfaces, their physical layer is identical.

Table 25. 6500 Card Types

Card type             Function
Classic               Bus only. COIL and Pinnacle ASICs.
Fabric enabled        Bus and fabric. Medusa and Pinnacle ASICs.
Fabric only           Fabric only. Medusa and Pinnacle ASICs. Can have a Distributed Forwarding Card.
Switch fabric module  Line card that contains the actual fabric.

6500s also use TCAM tables for CEF and ACLs. Input and output queuing take place in Pinnacle ASICs on the line card.

Table 26. 6500 Filtering Capacity (Supervisor 2)

Resource             Limit
Access control list  512 (this number combines security RACL, QoS ACL, and VLAN ACL (VACL))
Table 27. 65xx Forwarding Capacity

Supervisor 1 and/or classic line cards                                                     15 Mpps
Supervisor 2 with fabric-enabled line cards                                                30 Mpps
Supervisor 2 with SFM and 7 6816 fabric-only line cards                                    107 Mpps
Supervisor 2 with SFM and 7 6816 fabric-only line cards plus card-local traffic switching  170 Mpps
6513 with DFC-enabled fabric-only line cards plus card-local traffic switching             210 Mpps
Layer 1 Failover
Remember that many Layer 1 mechanisms cannot tell when a link has failed in one direction. You need Layer 2 mechanisms, ranging from link keepalives to the Unidirectional Link Detection Protocol, or Layer 3 routing updates, to detect that condition. The first Cisco feature to provide any sort of recovery in the event of link failure was dial backup, which operates at Layers 1 and 2. Subsequently, dial-on-demand routing (DDR) was adapted to give a Layer 3 capability for such backup. See the CertificationZone High Availability Study Guide for more detail on dial-based recovery.
In APS, only the working ring actually carries user traffic. A management protocol runs over both rings, however. The APS Protect Group Protocol detects failures and triggers ring switchover. However, SONET has been extremely reliable, and duplicating all rings is very expensive. In the 1:N variant shown on the right side of the figure below, one protection ring covers four LTEs. When a failure occurs, the protection ring is activated only between the endpoints affected by the actual failure.
SONET no longer needs to run over its own physical fiber, but can run on a wavelength of DWDM. This allows links in multiple protection rings to run over the same fiber, with due regard not to put both links of the same ring over the same physical fiber, which would create a shared risk group (SRG).
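On Cisco POS interfaces, 1+1 APS pairing is configured with the aps working and aps protect commands. The sketch below is hedged: the interface numbers and the IP address (which the protect interface uses to reach the process controlling the working interface, via the Protect Group Protocol) are hypothetical.

! on the working circuit
interface POS1/0
 aps working 1
! on the protect circuit
interface POS2/0
 aps protect 1 10.1.1.1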
Layer 2 Aggregation
Layer 2 aggregation distributes frames across two or more links, normally load-sharing but also providing fallback if a link fails. LAN and MAN standards in this area come from IEEE, primarily under
802.3, but also under 802.17.

Table 28. Recent IEEE 802.3 Standards

Protocol  Function
802.3aa   Updates to 802.3u Fast Ethernet
802.3ab   Gigabit Ethernet over Cat 5
802.3ac   Frame extension for baby giants
802.3ad   Link aggregation
There are two schemes for aggregating 802.3 traffic: Cisco's earlier, proprietary EtherChannel, and the newer IEEE 802.3ad standard. EtherChannel uses a control protocol called the Port Aggregation Protocol (PAgP); 802.3ad uses the Link Aggregation Control Protocol (LACP). Both methods use at least two parallel links between two routers or switches, protecting you against a single link failure or a failure of an interface at either end of one link.
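As a hedged sketch, a two-link aggregate on an IOS-based switch might be built as follows. The interface numbers are hypothetical; "mode active" selects LACP, while "mode desirable" would select PAgP:

! put both physical ports into channel group 1
interface range FastEthernet0/1 - 2
 channel-group 1 mode active
! then configure the bundle as a single logical interface
interface Port-channel1
 switchport mode trunk

Configuring the Port-channel interface, rather than the members, is also the easiest way to keep the member ports' configurations consistent.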
Figure 10. Basic 802.3 Aggregation Protection between Switches

You can also use 802.3 aggregation between a switch or router and a server with a suitable NIC (Figure 11). Using more links gives you protection against more link failures.
Figure 11. Multiported Servers

Other benefits of 802.3 aggregation include load balancing, in which source-destination pairs of MAC addresses are assigned to specific links in the bundle. Should a link fail, the addresses are redistributed onto the working links; again, routing will be unaware of this redistribution. To implement 802.3 aggregation, first be sure that your interface card supports it. Check the platform-specific restrictions, such as which ports can be bundled and whether they need to be contiguously numbered. See Dan Farkas' LAN Switching tutorials for configuration details and Chuck Larrieu's paper on 3550-specific features. An easy way to ensure that all ports have a common configuration is to create the channel first and then configure one port in the channel. Perhaps the most basic application of 802.3 aggregation is a bundle between two switches. If one link fails, traffic flow continues without impact on STP. It should have little effect on user traffic, although a frame in transit on the failing link might be lost.
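The MAC-pair load distribution described above is tunable on many IOS-based switches. A hedged sketch follows; the available keywords vary by platform and release:

! hash on the source-destination MAC pair to select a member link
port-channel load-balance src-dst-mac
! verify the scheme in use
show etherchannel load-balance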
Figure 12. Basic Link/Interface Protection between Access and Distribution Switches

Figure 13 shows a fairly complex configuration I implemented for a client, which protected against both default router failure and distribution failure and could protect against access switch failure. To protect against access switch failure, a host would need two NICs, each connected to a different access switch; STP would keep one of those NICs in the blocking state. See Dan Farkas' LAN switching papers for configuration details.
Figure 13. Link/Interface Protection to Default Router(s)

By using multiple NICs that the host knows how to bundle into an 802.3 aggregation, you can also protect against failures between network element and host. Again, a frame in transit might be lost.

Multilink PPP also protects you against failures of interfaces or links in a bundle, but the technologies involved are appropriate for WANs rather than LANs. Potentially, multichassis multilink protects you against a failure of an access server in a stack of access servers. If you simply have one hunt group phone number for the entire stack, it will be random whether different calls go to the same or different access servers. Perhaps the extreme case of using multilink to avoid single points of failure is PPP over L2TP.

Resilient Packet Ring (RPR), now under development in the IEEE 802.17 working group, is intended as a more efficient replacement for SONET/SDH, allowing better use of backup facilities. MANs, and RPR in general, are intended to smooth some of the disconnects between enterprise-oriented LANs and long-haul SONET/DWDM [Vijeh 2000]. While SONET/SDH are Layer 1 technologies, RPR is a Layer 2 MAC that runs over arbitrary physical facilities, including those compatible with SONET/SDH, metropolitan Gigabit Ethernet, etc. The basic unit of data transfer on RPR is an Ethernet frame, not a bit. RPR's L2 technology replaces the framing and protection mechanisms of SONET/SDH; unlike Ethernet, it offers protection switching at SONET speeds. RPR accepts that some traffic can be preempted if one ring fails, an idea certainly consistent with QoS prioritization. Other information is available from the IP over RPR Working Group in the IETF's sub-IP area, and from an industry forum, the RPR Alliance, which is being formed. As a technique primarily used in metropolitan and wide area carrier networks, RPR is beyond the scope of this paper.
Private VLANs
Mentioned frequently in the Cisco SAFE blueprints, private VLANs impose an NBMA topology on a single Ethernet subnet. This is especially useful in broadband provider applications, where you do not want any user to be able to see the traffic of any other user.

Table 29. Types of Ports in Private VLANs

Port type    Communicates with
promiscuous  all other private VLAN ports; this is the port used to communicate with devices such as routers, LocalDirector, backup servers, and administrative workstations
isolated     promiscuous ports only
community    other ports in the same community and their promiscuous ports; community ports are isolated at Layer 2 from all ports in other communities and from isolated ports within their private VLAN
Once you have defined the ports, you define pairs of VLANs (e.g., primary to community) that permit communications between them. Table 30. Sub-VLANs in a Private VLAN
Sub-VLAN   Traffic rules
primary    forwards incoming traffic arriving at a promiscuous port to all other promiscuous, isolated, and community ports
isolated   allows isolated ports to communicate with the promiscuous ports
community  used by a group of community ports to communicate among themselves and to transmit traffic outside the group via the designated promiscuous port
The simplest private VLAN consists of one primary VLAN and one sub-VLAN of either the isolated or the community type. You are allowed to have additional isolated or community sub-VLANs, which do not communicate with one another. In your configuration, you must bind the isolated and/or community VLAN(s) to the primary VLAN and assign the isolated or community ports to the appropriate sub-VLAN. You will find that many of the private VLAN constraints (Table 31) also apply to 802.1x (Table 32).

Table 31. Private VLAN Constraints

Feature                     Constraint
BPDU Guard                  Automatically enabled
VLAN membership             Set to static
Access ports                Redefined as host ports
VTP                         VTP does not understand private VLANs
VTP mode                    Transparent mode cannot be changed to client or server
primary VLAN                Only 1 isolated VLAN and/or multiple communities can be associated with it
isolated or community VLAN  Only 1 primary VLAN
VLAN numbers                Private VLANs cannot be numbered 1 or 1001 through 1005
Port restrictions           A private VLAN port cannot use channeling or dynamic membership; it can only be trunking if it is an MSFC port
ASIC consistency            On the same ASIC, you cannot have one port that is a trunk or a SPAN destination and others that are community, isolated, or promiscuous; this is hardware platform specific
Spanning tree parameters    Must be identically configured on primary and isolated/community VLANs
Destination SPAN            Mutually exclusive with a private VLAN port
Source SPAN                 May belong to a private VLAN
Remote SPAN                 Cannot be used with private VLANs
EtherChannel                Cannot be used on private VLAN ports
IGMP snooping               Not supported
! define the primary VLAN
set vlan vlan_num pvlan-type primary
! define the isolated or community VLAN(s)
set vlan vlan_num pvlan-type {isolated | community}
! bind the isolated or community VLAN(s) to the primary VLAN
set pvlan primary_vlan_num {isolated_vlan_num | community_vlan_num}
! associate the isolated or community port(s) to the private VLAN
set pvlan primary_vlan_num {isolated_vlan_num | community_vlan_num} mod/ports
! map the isolated/community VLAN to the primary VLAN on the promiscuous port
set pvlan mapping primary_vlan_num {isolated_vlan_num | community_vlan_num} mod/ports
! check the configuration
show pvlan [vlan_num]
show pvlan mapping
Figure 14. Configuring Private VLANs

Before using private VLANs in production, check platform and line-card-specific constraints. Many of these are 6500-specific, so they won't show up in the CCIE lab.
Table 32. 802.1x Constraints: Incompatible Port Types

Dynamic ports
EtherChannel port
Secure port (i.e., with MAC filters)
Switch Port Analyzer (SPAN) destination port
SPAN source port
To authenticate with 802.1x, the switch allows only Extensible Authentication Protocol over LAN (EAPOL) frames through the port, destined for an authentication server or an authentication function on the switch, until the user on the port successfully authenticates. In effect, 802.1x extends the functionality of RADIUS-style authentication to the switch port.
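A hedged sketch of a minimal 802.1x configuration against a RADIUS server follows. The server address, key, and interface number are hypothetical, and command details vary by IOS release:

! define AAA and point dot1x authentication at RADIUS
aaa new-model
aaa authentication dot1x default group radius
radius-server host 10.1.1.10 key MySecretKey
! enable 802.1x globally, then per port
dot1x system-auth-control
interface FastEthernet0/1
 dot1x port-control auto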
show ip dhcp snooping binding would result in a display like that in Table 33.
Table 33. Hypothetical DHCP Snooping Table

MAC address  IP address  Lease (sec)  Type     VLAN  Interface
...          ...         ...          dynamic  86    FE 2/1
! enable globally
ip dhcp snooping
! enable for a VLAN
ip dhcp snooping vlan vlan-number
! enable information insertion for subscriber tracking with
! dhcp option 82
ip dhcp snooping information option
! define the interface as trusted (e.g., inside firewall)
! or untrusted
ip dhcp snooping trust
! set rate limit for accepting DHCP requests, in packets per second
ip dhcp snooping limit rate rate
There is also a set of DHCP snooping features relevant to the DHCP database agent, but that agent and its files are beyond the scope of this discussion of switch functionality. IP source guard, applied to Layer 2 ports, complements DHCP snooping with Layer 3 checks. Operating on the principle of least privilege, IP source guard blocks all user traffic until a DHCP request and response have completed. Once that process is complete, you can permit either:

- IP traffic only, with the IP source address captured during the DHCP exchange, or
- IP traffic that both meets the previous IP address criterion and originates from the MAC address observed during the DHCP exchange.
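On platforms supporting IP source guard, the per-interface commands below select between those two modes. This is a hedged sketch; the interface number is hypothetical:

interface FastEthernet0/1
 ! filter on the DHCP-learned source IP address only
 ip verify source
 ! or, to also check the source MAC address (requires port security):
 ! ip verify source port-security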
The first case is the rather oxymoronic one of "baby giants". This deals with the need for a switch not to drop VLAN-associated frames that carry either the additional 4 bytes of an 802.1Q tag or the considerably longer encapsulation of Cisco's proprietary ISL. Baby giants are not a problem for switches that have ports associated with VLANs as well as their trunks, but the problem arises when you have an intermediate switch (e.g., for aggregation into GE) that only forwards trunk frames. This is a case that matters purely at Layer 2, for inter-switch trunking. The danger is that such an intermediate switch might drop perfectly legal VLAN trunk frames because they exceed 1518 bytes. IEEE 802.3ac provides a specification under which baby giants are legal. The second case has both L2 and potentially L3 implications. Some switches can support frames with lengths up to 9216 bytes. Such "jumbo" frames reduce the relative cost of the per-frame overhead: 26 bytes of header and preamble, plus the interframe gap (9.6 microseconds on 10 Mbps Ethernet). Jumbo frames may be kept purely in a switched environment, or you also may need to increase the default MTU of 1500 on IP interfaces.
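A hedged sketch of the second case follows; commands, maximum sizes, and interface numbers vary by platform and IOS release:

! raise the Layer 2 MTU on a Gigabit port that supports jumbo frames
interface GigabitEthernet0/1
 mtu 9216
! if the jumbo frames are routed, raise the IP MTU on the SVI or
! routed port as well
interface Vlan307
 ip mtu 9000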
Layer 2 Traceroute
Traditionally, traceroute could only be used with routers and L3 hosts. Cisco developed an extension to traceroute that uses Cisco Discovery Protocol to identify L2-only devices in the path. For this feature to work, a number of criteria must be met.

1. All L2 switches in the path must have IP connectivity.
2. You cannot trace more than ten hops.
3. You can only trace within one VLAN. A given MAC address being traced can belong to more than one VLAN, but you can trace only its role in a single VLAN.
4. Multicast addresses are not supported.
5. L3 addresses in the path are identified with ARP. If ARP requests cannot be resolved, the L2 traceroute fails.
6. Hubs will break L2 traceroute, because it assumes one device per switch port.
To execute a L2 traceroute, enter the following command from privileged EXEC mode:
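On Catalyst switches this is the traceroute mac command; the MAC addresses below are purely illustrative, and the exact syntax should be verified for your platform and release:

```
! trace the L2 path between two hosts, identified by source
! and destination MAC address (hypothetical addresses)
Switch# traceroute mac 0000.0201.0601 0000.0201.0201
```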
This section assumes that you understand the basic 802.1d implementation including Bridge Protocol Data Units (BPDUs). See Farkas' LAN Switching and Larrieu's 3550 for details.

Table 34. 802.1 and Related Cisco Protocol Summary

IEEE Protocol   Function                          Cisco Proprietary Equivalent   IEEE Enhancement
802.1d          Basic spanning tree               PVST                           Second edition 802.1d, 802.1w
802.1s          Multiple spanning trees           None
802.1w          Rapid spanning tree; see 802.1d   PortFast
802.1q          Basic VLAN                        ISL
802.1s          VLAN Extensions
802.1x*         Port authentication
Its basic approach is for bridges to announce themselves to other bridges using Bridge Protocol Data Units (BPDUs). From BPDU information, the bridges elect a root bridge, and then prevent loops by blocking all but one link between pairs of bridges. We already discussed EtherChannel as a protection against interface and link failures. At the next logical level, several things can go wrong with a spanning tree or VLAN. They include:

1. More than one bridge assumes that it is root ("root wars").
2. A new non-bridging device attached to a switch port takes too long to start forwarding.
3. A device, typically an end host, endangers the network with a malfunction.
4. A correct spanning tree takes too long to converge, even in the absence of failures.
5. A distribution switch fails and it takes the network a long time to reconverge.
6. A core switch fails and results in long reconvergence time.

On a switch, ports will eventually be put in the roles in Table 35.

Table 35. STP Port Types

Port type                  Function
Root port (RP)             Receives the root's BPDUs (best path toward the root)
Nondesignated port (NDP)   All other ports
Alternate port*
Backup port*
*IEEE 802.1w RSTP.
IEEE 802.1w is an upgrade of 802.1d that significantly improves recovery time in switched networks. Cisco has long had proprietary mechanisms to improve convergence after certain failures or addition of devices to the network. Many of these capabilities appear in 802.1w as well. Cisco's hierarchical design model nicely identifies places where you can have STP failures that can get special handling:
- Core switch failure and failover
- Distribution switch failure and failover
- Access switch failure and failover
- Multiported host NIC/port failure and failover
Figure 15. Link Failure between Core Switches

CS1 is intended to be the root. However, what if link 1 fails? How does DS1 know that it now needs to unblock link 3 and forward directly to CS1?
When CS2 loses link 1, it will start sending inferior BPDUs to its subordinate bridges. Under normal conditions, when a bridge starts to receive inferior BPDUs, it will ignore them until its STP aging timer expires. At the expiry of that timer, the default condition would be to go to a new STP reconvergence, blocking all forwarding until all bridges agree on the new root. If there are no blocked ports on the bridge whose timer expired, it will decide it is the new root bridge -- which may or may not be true.

One workaround allows the bridge receiving inferior BPDUs to send root link query BPDUs out all blocked alternate paths to the root bridge. If the response to one of these queries indicates there is a path to the root, then all the blocked ports go into listening and learning, and eventually the spanning tree reconverges. But if we know at configuration time which alternate bridge will become root, we can speed recovery by recognizing that reality and giving the distribution switches a fast mechanism to find the backup core switch without a full STP recomputation. Cisco's original proprietary solution was BackboneFast. IEEE 802.1w has an equivalent solution.
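BackboneFast itself takes a single global command; as a minimal sketch (it should be enabled on every switch in the spanning tree domain for the root link query mechanism to work end to end):

```
! enable BackboneFast on each switch in the domain
spanning-tree backbonefast
```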
Root Wars
One of the special problems of root switch failures is that you can wind up in failure modes where several switches, some of which were not intended ever to become root, decide that they are root. There may even be a root war. Root wars tend to happen most frequently when the spanning tree is quite large and contains relatively slow links. Root wars were especially common in bridging over WANs with link speeds below 56 Kbps. They are very rare at LAN speeds.
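Cisco's Root Guard feature constrains where the root may appear: a port with root guard refuses superior BPDUs and goes into a root-inconsistent state instead of accepting a new root from that direction. A minimal sketch, with an illustrative interface name:

```
! apply on ports that must never lead toward a (new) root,
! e.g., ports facing access switches or customer equipment
interface FastEthernet0/1
 spanning-tree guard root
```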
Spanning tree, of course, will block one of these ports. If the port (or the associated link) fails, the spanning tree algorithm eventually will unblock the alternate port on the wiring closet switch. Deciding which port to unblock requires the spanning tree algorithm to run and possibly elect a new root. During this decision process, forwarding can stop for up to 90 seconds.

UplinkFast works on the principle of a sergeant saying, "When I want your opinion, I'll tell you what it is." At the time the wiring closet switch is being configured, the network administrator knows which core switch is primary and which is backup. The noncore switch is told which its primary and secondary links are, and, when it detects a failure on the primary, it has been preconfigured with the knowledge of the backup switch to use. It enables the backup interface without going through the 802.1d listening and learning states.
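Like BackboneFast, UplinkFast is a single global command, applied only on the wiring closet (access) switch, not on distribution or core switches:

```
! enable UplinkFast on the access switch with redundant uplinks
spanning-tree uplinkfast
```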
RSTP can provide subsecond reconvergence where traditional 802.1d could take 30 s or more to converge. Remember that, as opposed to routers, bridges do not forward while reconvergence is in process. One of the problems of 802.1d convergence is that it must dynamically discover alternate switches after a failure by rerunning spanning tree. It must do so even if there is only one possible backup switch for the particular switch detecting the failure. While STP is running, no forwarding takes place. Another 802.1d problem is that a newly added port must spend 30 s in the learning state before it can begin forwarding. Obviously, this slows recovery time.
RSTP allows for the creation of several types of ports in addition to the root port and the designated ports of 802.1d. There is an alternate port and a backup port designation kept in the active spanning tree table. When a topology change is detected that affects the current spanning tree, the active tree is immediately flushed, and the backup ports are immediately placed in the forwarding state. Having backup ports allows 802.1w RSTP to provide functionality similar to that of UplinkFast and BackboneFast. After the root is selected, the nonroot switches must know which port is their RP. This is accomplished by each switch calculating path costs back to the root. Costs are based on the speed of the ports on which BPDUs are received.
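On Catalyst switches, the 802.1w machinery is typically turned on by selecting a rapid spanning tree mode rather than by configuring 802.1w directly; a sketch, assuming a release that supports rapid PVST+:

```
! run an 802.1w-based spanning tree, one instance per VLAN
spanning-tree mode rapid-pvst
```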
The PortFast feature bypasses the listening and learning states of the 802.1d algorithm for that port, so that the port can begin forwarding almost immediately. There are cases when an installation tries to make a server more fault-tolerant by giving it multiple interfaces. When these interfaces are in different spanning trees, PortFast does not create any problem. In that case, the challenge is how the server knows which interface to use, a problem that needs to be solved at higher layers or in host software. If, however, the server has two or more interfaces in the same spanning tree, there might or might not be a problem. As long as the host never attempts to forward frames between its interfaces, there will be no problem. The spanning tree will simply see it as two separate host endpoints. If the host is capable of bridging, however, then PortFast must not be enabled. Bridging-capable multiple-interface hosts need to know which interface to put into the blocking state, putting it into a hot standby role.
PortFast is not a recovery mechanism, but rather a mechanism to reduce the delay before a nonbridging end host can start to participate in higher-level communications. IEEE 802.1w provides an equivalent standards-based function. Both prevent the attempted reconvergence of the spanning tree when a new device is plugged into a switch port. Remember that, as opposed to routing protocol reconvergence, all forwarding stops while spanning tree reconvergence is in progress. You must not use PortFast or equivalents on ports to hosts with multiple NICs in the same broadcast domain, although it can be used if you have hosts in different redundant broadcast domains. Using it in the first case will prevent the host from being able to decide which NIC to block to follow spanning tree rules.
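A minimal sketch of PortFast on an access port, paired with BPDU Guard as a safety net that err-disables the port if a bridging device unexpectedly appears (interface name illustrative):

```
interface FastEthernet0/1
 ! host port: skip the listening and learning states
 spanning-tree portfast
 ! shut the port down (err-disable) if a BPDU ever arrives
 spanning-tree bpduguard enable
```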
Spanning Trees   1      1   M   N
VLANs            None   M   M   M
As with routing protocols, especially link state, processing load grows exponentially in some full-information topologies. Techniques for improving convergence in small domains, such as decreasing the interval between hellos, do not scale to large sizes. Instead, the trend is to introduce IS-IS-like hierarchy into advanced spanning trees, restricting information from hierarchically lower parts just as information is withheld from IS-IS or OSPF stub areas. When we discuss VLANs, you will also see that you can reduce STP overhead by choosing to assign a single STP instance to multiple VLANs, although you can create a 1:1 relationship between STPs and VLANs. 1:1, originally introduced in Cisco ISL, does allow optimal topology for each VLAN, but at greater overhead than MISTP and 802.1s.
MSTP Regions
MSTP regions, comparable to Layer 3 routing protocol backbones, have a single STP for the region. Each MSTP switch belongs to one and only one region. The backbone consists of an internal spanning tree (IST) that sends and receives BPDUs and knows about up to 16 MSTP instances -- the nonbackbone areas.

Table 39. Region Definition

Parameter
Name of region
Revision number
Mapping of VLANs to multiple spanning trees
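A sketch of defining a region on an IOS-based switch; the region name, revision number, and VLAN ranges below are illustrative, and every switch in the region must agree on all three:

```
! region membership is determined by matching name, revision,
! and VLAN-to-instance mapping on all switches
spanning-tree mst configuration
 name Region1
 revision 1
 instance 1 vlan 10-20
```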
LAN Emulation (LANE), strictly speaking, does not tag frames. It has an equivalent and implicit function, however, identifying VLAN-equivalent Emulated LANs (ELANs) by the virtual circuit to which they are attached. Especially when multiple tagging methods are in use, there needs to be a way to convey the tagging translations among switches. This is the first function of the VLAN Trunk Protocol (VTP). VTP is less a specific high availability mechanism than a means to distribute -- quickly, efficiently, and with minimum human intervention -- changes made as a result of reconfiguration or recovery. VTP has one or more VTP domains to which VLANs are assigned. One or more physical ports are assigned to each VLAN. VTP propagates changes in these relationships.
VTP Pruning
The VTP pruning function reduces demands on system-wide performance by blocking information irrelevant to the downstream switches on a given path. For example, if a particular switch does not have VLAN 42 configured, VTP pruning prevents VLAN 42 control messages from going to that switch. Configuring VLAN 42 on that switch will inform VTP that the switch now needs to hear information about that VLAN.
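Pruning takes a single global command; enabled on a VTP server, it propagates to the rest of the domain:

```
! enable VTP pruning for the whole VTP domain
vtp pruning
```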
PVST
Per-VLAN Spanning Tree Protocol (PVST) is Cisco's proprietary mechanism for running a separate spanning tree instance in each VLAN. In its rapid variant, it applies the 802.1w enhancements to each per-VLAN instance, giving functionality comparable to the 802.1s multiple spanning tree extensions. PVST causes some changes in VLAN numbering that you need to know about; these are described in the 3550 Study Guide.
Conclusion
To understand "switching", you must avoid the market droids' confusion based on the premise "switch good, router bad". If you make decisions on L3 information, you are routing. At best, "L3 switching" means that some hardware acceleration techniques were used -- but very similar techniques are used on any high-performance router, such as the Cisco 12000 "Gigabit Switch Router." L2 switching does have meaning: it is the combination of bridging with microsegmentation.

Bridging uses spanning trees to form its forwarding tables, and there have been considerable improvements in spanning tree robustness and performance. You need to be familiar with these, both Cisco and IEEE. You also need to be aware of the bandwidth-reducing techniques used to improve performance at L2, such as VTP pruning, CGMP, and IGMP snooping.

It appears that Cisco is asking more and more platform-specific "speeds and feeds" questions on certification exams. You need to know the basic characteristics of platform families, but you also need to be realistic. While the number seems to change on a daily basis, it's not unreasonable to say there are over 500 combinations of families, models within families, and common components such as supervisors and power supplies.
[IENP-SW2-WP1-F03] [2003-12-29-02]