An examination of the methods for protecting MPLS LSPs against failures of network resources
Table of Contents

1. Introduction
2. Background
   2.1 Introduction to MPLS
       2.1.1 Label Distribution
       2.1.2 Tunnels and Label Stacks
   2.2 Components of an MPLS Network
   2.3 Potential Resource Failures
   2.4 Objectives for Failure Survival
   2.5 Detecting Errors
   2.6 Overview of Approaches to Failure Survival
3. Protected Link Resources
4. Local Repair
   4.1 Local Repair in MPLS Networks
   4.2 Traffic Engineering Implications
   4.3 Recovery Speed
   4.4 Crankback
   4.5 Return to Preferred Routes
5. Protection Switching
   5.1 Basic Operation
   5.2 Backup Modes and Options
   5.3 Notifying Error Conditions
   5.4 Alternate Repair Points
   5.5 Recovery Speed
   5.6 Sharing Resources
   5.7 Status of Protection Switching within IETF
6. Fast Re-route
   6.1 Link Protection
   6.2 Node Protection
   6.3 Generalizing Node and Link Protection
   6.4 Automatic Protection using Detours
   6.5 Current Status of Fast Re-route Protection within IETF
7.
8. Summary
9. Glossary
10. References
11.
1. Introduction
Multi-Protocol Label Switching (MPLS) is growing in popularity as a set of protocols for provisioning and managing core networks. The networks may be data-centric like those of ISPs, voice-centric like those of traditional telecommunications companies, or one of the modern networks that combine voice and data. These networks are converging on a model that uses the Internet Protocol (IP) to transport data. MPLS overlays an IP network to allow resources to be reserved and routes pre-determined. Effectively, MPLS superimposes a connection-oriented framework over the connectionless IP network. It provides virtual links or tunnels through the network to connect nodes that lie at the edge of the network.

A well-established requirement in telephone networks is that the network should display very high levels of reliability and availability. Subscribers should not have their calls dropped, and should always have access to their service. Downtime must consequently be kept to a minimum, and backup resources must be provided to take over when any component (link, switch, switch subcomponent) fails.

The data world is increasingly demanding similar levels of service to those common in the arena of telephony. Individual customers expect to be able to obtain service at all times and expect reasonable levels of bandwidth. Corporate customers expect the same services, but may also have data streams that are sensitive to delays and disruption. As voice and data networks merge they inherit the service requirements of their composite functions. Thus, modern integrated networks need to be provisioned using protocols, software and hardware that can guarantee high levels of availability.

High Availability (HA) is typically claimed by equipment vendors when their hardware achieves availability levels of at least 99.999% (five 9s). This may be achieved by provisioning backup copies of hardware and software. When a primary copy fails, processing is switched to the backup. This process, called failover, should result in minimal disruption to the data plane.

Network providers can supply the required levels of service to their customers by building their network from equipment that provides High Availability. This, on its own, is not enough, since network links are also prone to failure, and entire switches may fail. The network provider must also provide backup routes through the network so that data can travel between customer sites even if there is a failure at some point in the network.
This white paper examines the features inherent in MPLS networks that facilitate high availability and considers techniques to build resilient networks by utilizing MPLS. It also examines proposals in the Internet Engineering Task Force (IETF) to standardize methods of signaling and provisioning MPLS networks to achieve protection against failures. Readers familiar with the concepts of MPLS, network components and network failure survival may want to turn straight to section 3.
2. Background
2.1 Introduction to MPLS
Multi-Protocol Label Switching (MPLS) is rapidly becoming a key technology for use in core networks, including converged data and voice networks. MPLS does not replace IP routing, but works alongside existing and future routing technologies to provide very high-speed data forwarding between Label-Switched Routers (LSRs) together with reservation of bandwidth for traffic flows with differing Quality of Service (QoS) requirements. MPLS enhances the services that can be provided by IP networks, offering scope for Traffic Engineering, guaranteed QoS and Virtual Private Networks (VPNs). The basic operation of an MPLS network is shown in the diagram below.
Figure 1: Basic operation of an MPLS network. Host X sends traffic to hosts Y and Z via ingress LSR A; intermediate LSR B swaps label 21 for 47 towards egress LSR D, and label 17 for 11 towards egress LSR C.
MPLS uses a technique known as label switching to forward data through the network. A small, fixed-format label is inserted in front of each data packet on entry into the MPLS network. At each hop across the network, the packet is routed based on the value of the incoming interface and label, and dispatched on an outgoing interface with a new label value.
The path that data follows through a network is defined by the transition in label values, as the label is swapped at each LSR. Since the mapping between labels is constant at each LSR, the path is determined by the initial label value. Such a path is called a Label Switched Path (LSP).

MPLS may also be applied to data switching technologies that are not packet based. The path followed by data through the network is still defined by the transition of switching labels and so is still legitimately called an LSP. However, these non-packet labels (such as wavelength identifiers or timeslots in optical networks) are only used to set up connections, known as cross-connects, at the LSRs. Once the cross-connect is in place all data can be routed without being inspected, so there is no need to place the label value in each packet. Viewed another way, the wavelength or timeslot is itself the label.

At the ingress to an MPLS network, each packet is examined to determine which LSP it should use and hence what label to assign to it. This decision is a local matter but is likely to be based on factors including the destination address, the quality of service requirements and the current state of the network. This flexibility is one of the key elements that make MPLS so useful. The set of all packets that are forwarded in the same way is known as a Forwarding Equivalence Class (FEC). One or more FECs may be mapped to a single LSP.

Figure 1 shows two data flows from host X: one to Y, and one to Z. Two LSPs are shown. LSR A is the ingress point into the MPLS network for data from host X. When it receives packets from X, LSR A determines the FEC for each packet, deduces the LSP to use and adds a label to the packet. LSR A then forwards the packet on the appropriate interface for the LSP.

LSR B is an intermediate LSR in the MPLS network. It simply takes each labeled packet and uses the pairing {incoming interface, label value} to decide the pairing {outgoing interface, label value} with which to forward the packet. This procedure can use a simple lookup table that can be implemented in hardware, together with the label swap and the forwarding of the packet. This allows MPLS networks to be built on existing label switching hardware such as ATM and Frame Relay. This way of forwarding data packets is potentially much faster than examining the full packet header to decide the next hop. In the example, each packet with label value 21 will be dispatched out of the interface towards LSR D, bearing label value 47. Packets with label value 17 will be re-labeled with value 11 and sent towards LSR C.

LSR C and LSR D act as egress LSRs from the MPLS network. These LSRs perform the same lookup as the intermediate LSRs, but the {outgoing interface, label value} pair marks the packet as exiting the LSP. The egress LSRs strip the labels from the packets and forward them using layer 3 routing.
So, if LSR A identifies all packets for host Z with the upper LSP and labels them with value 21, they will be successfully forwarded through the network, emerging from the LSP at D, which then forwards the packets to Z using normal IP routing.

Note that the exact format of a label and how it is added to the packet depends on the layer 2 link technology used in the MPLS network. For example, a label could correspond to an ATM VPI/VCI, a Frame Relay DLCI, or a DWDM wavelength for optical networking. For other layer 2 types (such as Ethernet and PPP) the label is added to the data packet in an MPLS shim header, which is placed between the layer 2 and layer 3 headers. As mentioned above, if the LSP is set up through a network that is not packet switching (such as an optical network), there is no need to place the label in the data packet.
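To make the label-swap operation concrete, the following is a minimal sketch of the {incoming interface, label} lookup performed at an intermediate LSR such as LSR B. The interface names are hypothetical; the label values follow Figure 1.

```python
# Minimal sketch of MPLS label-swap forwarding at an intermediate LSR,
# following the {incoming interface, label} -> {outgoing interface, label}
# mapping described above. Interface names are hypothetical; label values
# follow Figure 1.
forwarding_table = {
    ("if_from_A", 21): ("if_to_D", 47),   # towards egress LSR D
    ("if_from_A", 17): ("if_to_C", 11),   # towards egress LSR C
}

def switch_packet(in_interface, in_label, payload):
    """One table lookup: swap the label and pick the outgoing interface."""
    try:
        out_interface, out_label = forwarding_table[(in_interface, in_label)]
    except KeyError:
        raise ValueError("no cross-connect programmed: packet dropped")
    return out_interface, out_label, payload

# A packet arriving from LSR A with label 21 leaves towards LSR D as label 47.
print(switch_packet("if_from_A", 21, b"data"))  # ('if_to_D', 47, b'data')
```

Because the lookup key is just the {incoming interface, label} pair, the table can be implemented directly in hardware, which is what makes this forwarding model fast.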
A detailed review of how these protocols are used for label distribution is outside the scope of this white paper. For a comparative analysis of RSVP and CR-LDP, refer to the white paper MPLS Traffic Engineering: A choice of Signaling Protocols [1] from Metaswitch.
Figure 2: Label stacks. LSPs from LSR A and LSR B (red and blue labels) are tunneled across the backbone network in a single outer LSP (yellow labels) between LSR C and LSR E; LSR D sees only the outer label.
In Figure 2, two LSPs between LSR A and LSR E, and between LSR B and LSR E, shown as red and blue labels, are transparently tunneled across the backbone network in a single outer LSP between LSR C and LSR E.
At the ingress to the backbone network, LSR C routes both incoming LSPs down the LSP tunnel to LSR E, which is the egress from the backbone. To do this, it pushes an additional label onto the label stack of each packet (shown in yellow). LSRs within the backbone, such as LSR D, are aware only of the outer tunnel, shown by the yellow labels. Note that the inner labels are unchanged as LSRs C and D switch the traffic through the outer tunnel; only the outer label is swapped at LSR D. At the egress of the outer tunnel, the top label is popped off the stack and the traffic is switched according to the inner label.

In the example shown, LSR E also acts as the egress for the inner LSPs, so it pops the inner label too and routes the traffic to the appropriate host. The egress of the inner LSPs could be disjoint from E in the same way that LSR A and LSR B are separate from LSR C. Equally, an LSR can act as the ingress for both levels of LSP.

A label stack is arranged with the label for the outer tunnel at the top and the label for the inner LSP at the bottom. On the wire (or fiber) the topmost label is transmitted first and is the only label used for routing the packet until it is popped from the stack and the next highest label becomes the top label.

When a device allocates a label, it can allocate it either from a per-platform label space (the Global Label Space) or from a per-interface label space. In the first case, the label has global meaning within the device, and therefore the outgoing interface and label for the LSP can be identified from this label alone. In the second case, the incoming label can only be interpreted in the context of the incoming interface.

If each device allocating a label for the bottom label of a stack (the red and blue labels in Figure 2) allocates such labels from the Global Label Space, the outer tunnel can be re-routed transparently to the inner tunnels (provided that the ingress and egress of the re-routed tunnel are LSRs C and E, respectively). This is because when the packets arrive at LSR E, the outer label will be stripped and the inner label will be correctly interpreted from the Global Label Space. If the labels for the bottom of the stack are from per-interface label spaces, this will not be possible. This is because although the re-routed LSP may terminate at the same LSR E, it may terminate on a different interface on LSR E. Once the outer label has been stripped, LSR E will interpret the inner labels as per-interface labels, but now on the wrong interface.

Lastly, it is worth noting that a device can use per-interface label spaces for some interfaces, and the Global Label Space for others. Using the Global Label Space for all interfaces on the device gives maximum flexibility for re-routing, but reduces flexibility on allocation of labels.

For a description of the use of label stacking to support VPNs see the white paper MPLS Virtual
Private Networks: A review of the implementation options for MPLS VPNs including the ongoing standardization work in the IETF MPLS Working Group [2] from Metaswitch.
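To illustrate the stack mechanics described above, here is a small sketch of the push, swap and pop operations along the tunnel of Figure 2. The helper functions are hypothetical, and the outer label values are illustrative rather than taken exactly from the figure.

```python
# Sketch of label-stack handling for the tunnel in Figure 2. The topmost
# (last-pushed) label is the only one used to route the packet until it is
# popped. Helper functions and outer label values are illustrative.

def push(stack, label):
    return stack + [label]            # the new top of stack is the outer label

def pop(stack):
    return stack[-1], stack[:-1]      # returns (top label, remaining stack)

stack = push([], 21)                  # LSR A labels a packet for the red LSP

stack = push(stack, 13)               # LSR C pushes the outer (tunnel) label

top, rest = pop(stack)                # LSR D switches on the outer label only:
stack = push(rest, 6)                 # outer label swapped 13 -> 6, inner untouched

outer, stack = pop(stack)             # LSR E pops the outer label...
inner, stack = pop(stack)             # ...and, as inner egress, pops the inner too
assert inner == 21 and stack == []
```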
Figure 3: Components of an MPLS network, showing the internal structure of an LSR (controller card and line cards).
The diagram highlights some key terms that are expanded here so that they can be used freely throughout the rest of this document.

IP Source
This is the place from which IP data is sent into the network. Usually thought of as a PC (or host), this may be any IP device, for example, an IP telephone. It may also be a gateway device that converts between a non-IP service and IP.

IP Sink
The target of the IP data transmission. A partner device to the IP source.

LSR
Label Switch Router. The key switching component of an MPLS network. Responsible for forwarding data according to the rules established by the MPLS signaling protocol.
LER
Label Edge Router. An LSR at the edge of the network that originates or terminates an LSP.
Ingress LER
The LER that receives IP data from the IP source, classifies it and injects it into an LSP for transmission across the network.
Egress LER
The partner of the ingress LER that terminates an LSP and forwards IP data to the IP sink. Note that the role a device plays can be different for different LSPs. The same device can be an LSR, ingress LER and egress LER for different LSPs.
Cross-connect
The term used to describe the connection in the hardware between an {ingress interface, ingress label} and {egress interface, egress label}.
Link
A physical connection between two nodes in the network. This may be an electrical connection or an optical fiber.
Protected Link
A protected link is a physical link with some form of redundancy built in so that data transfer is not disrupted by a failure of one of the components of the link. A protected link appears to the MPLS control plane as a single point of connection within the network. There are many link protection schemes, but a popular one uses SONET/SDH protocols on an optical fiber loop.
There may be more than one link between a pair of nodes in a network. Unlike a protected link, these individual links do appear as separate points of connection within the network. They may be managed as distinct entities providing different (but parallel) routes within the network, or they can be managed as a bundle where the choice of component link is only available to the nodes that are connected by the link.
Alternate Path
An alternate path is precisely that: a different route through the network between the same two end points. Parallel links provide the simplest alternate paths. More complicated alternate paths will involve traversing distinct links and transiting other nodes. The preferred route is usually calculated using Shortest Path First (SPF) algorithms, or specified at the ingress after performing Traffic Engineering (TE) calculations. Alternate routes may often be longer or less desirable.

Controller Card
The internals of switches and routers are usually organized such that the main processor is present on a controller card. This card usually runs the main software in the system and is responsible for coordinating the other components.

Line Card
Line cards manage the ends of the links, known as ports or interfaces. One card may have multiple ports and so service multiple links. There would typically be many line cards in any one switch. Line cards can be dumb and do nothing more than provide the hardware to terminate the links. Smart line cards also include a processor that may run part or all of the protocol software that signals to set up LSPs on the links.

Backplane
The backplane is like a LAN within the switch. It provides connectivity between the controller cards and line cards.

Backup Card
Resilience against faults within a switch is achieved by having backup cards. There can be backup controller cards and backup line cards. A backup card runs a backup copy of the software on the primary card and takes over processing in the event that the hardware or software on the primary card fails. A single backup card may be dedicated to backing up a specific primary card, or may be shared by several primaries.

Disjoint Paths
Two paths through the network are said to be disjoint if they do not share any links or nodes, other than the ingress and egress nodes.

Link Disjoint Paths
Similar to Disjoint Paths, although Link Disjoint Paths can share nodes, provided that they do not share links.
Even if the repair of an LSP takes longer than 60ms, it is still important that the connection is restored automatically. Consider a telephone user: ideally they will not notice the fault at all; however, it is still better to hear a few clicks on the line and have to say, "Pardon; could you repeat that?" than to lose the connection and have to re-dial.

If there is disruption to the data flow, an important consideration is whether data is lost and, if so, how much. Neither IP networks nor other networks such as ATM or Frame Relay attempt to provide reliable delivery of data, other than by using higher layer end-to-end protocols such as TCP over the network protocols. However, if a substantial amount of data is lost, such protocols may declare the connection failed, and require re-connection.

A slightly lower priority aim is that the signaling service should remain available. That is, it should continue to be possible to establish new connections for data traffic after the failure. It may be that new connections cannot be signaled while the failure is being repaired. Although this is undesirable, it is generally acceptable for a user to retry a connection attempt (e.g. redial at a telephone) if the connection fails to establish the first time. Given the statistical likelihood of a new connection being attempted during a failure repair, it is often considered acceptable that signaling is temporarily suspended.

The process of repair in one part of the network should, of course, cause as little disruption as possible to other parts of the network. Broadcasting failure information around the network could seriously disrupt other signaling and data traffic.

It is worth noting that the typical requirement is to survive a single network failure. Many network providers and device vendors are not attempting to provide solutions that survive multiple concurrent network failures. While this does reduce complexity, it implies that the recovery time of failed equipment must be low to ensure that the period of vulnerability is as short as possible.

All of the solutions to these requirements involve forms of redundancy, whether within links, as extra links in the network, or through the provision of additional hardware components within a switch. The cost of these solutions imposes an additional requirement that redundant resources should be kept to a minimum and preferably shared between potential users. Not many people keep a spare car in the garage at home in case their everyday car breaks down; they prefer to share the cost with other people by relying on taxis or rental cars in the event of a failure of their own vehicle.
operator management, for example to fail processing over to a backup card before the primary card is removed from the rack.

Hardware manager
Most switches built using distributed components have a Hardware Manager that is responsible for monitoring the state of the controller and line cards, and the software running on them. When there is a failure, the Hardware Manager reports the problem to control software that can instigate recovery procedures.

Signaling hellos and keepalives
Many signaling protocols include keepalive processing where adjacent nodes poll each other periodically to check that the link is active and that the signaling software on the partner is active. RSVP includes Hello messages and LDP uses KeepAlive messages. However, the need to have multiple retries to allow for occasional data loss, and the general speed of these mechanisms, may mean that they detect failures much more slowly than hardware or lower layer protocols. This form of protocol exchange is particularly useful for detecting software failure or hardware card failure at an adjacent node, since the physical link itself may be undisturbed by these faults.

IGP hellos
Interior Gateway Protocols (IGPs) disseminate routing and topology information within a network. IGPs also run hello message exchanges within their protocols. Such exchanges are typically relatively infrequent since routing table updates do not need to be rapid. This means that rapid recovery of data paths cannot rely on IGP hellos for error detection.

IGP topology updates
IGPs collect and distribute topology information. These updates will reflect the current state of the network as known by the IGP implementations on the nodes throughout the network, so link failure information is propagated in this way. As mentioned above, the IGP may not detect errors very fast, and does not distribute the topology very often.

Signaling error notifications
MPLS signaling protocols include ways of reporting failures to set up LSPs and of notifying upstream nodes, including the ingress (initiator), when an established LSP fails. Although this does not provide hardware failure detection or notification, much can be inferred from an LSP failure. Additionally, since the whole purpose of failure survival is to preserve or re-establish LSPs, LSP error notifications play an important part in recovery processing.

Notify messages
Generalized MPLS (GMPLS [13]) introduces a new Notify message to the signaling protocols so that LSP failures can be reported to the ingress or some other node responsible for error recovery. Notify messages may provide faster error reporting than the normal error notifications since they can contain information about multiple failed LSPs, and because they are sent direct to the consumer. Note that this function is initially only specified for RSVP-TE signaling and not CR-LDP.

Crankback
Crankback is a process of providing additional information about hardware faults and broken topologies on signaling error or notification messages. Rather than simply reporting the underlying cause of the problem in an error message, such messages would also carry crankback information identifying the failed link or node. This information can provide rapid feedback into topology and routing tables within the network and allows LSPs to be correctly set up around the failed resource.
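As a rough illustration of why the hello-based mechanisms above detect failures slowly, the sketch below computes worst-case detection times. The intervals and retry counts are illustrative assumptions, not values mandated by any protocol; lower layer mechanisms such as Loss of Light react in milliseconds without polling at all.

```python
# Back-of-envelope detection times for polled failure detection. All
# intervals and retry counts below are illustrative assumptions.

def worst_case_detection_ms(poll_interval_ms, missed_polls_allowed):
    # A failure just after a successful poll is declared only once the
    # allowed number of consecutive polls has gone unanswered.
    return poll_interval_ms * (missed_polls_allowed + 1)

print(worst_case_detection_ms(5_000, 3))    # signaling hellos: 20,000 ms
print(worst_case_detection_ms(10_000, 3))   # IGP hellos: 40,000 ms
```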
Hardware and software failures within an individual LSR may be repaired using Fault Tolerance schemes that involve duplication of hardware and software components. Configuration and state information is replicated from the primary to backup components so that the backups are ready to take over if the primaries fail.

Looking through this list, we see that most survival mechanisms involve duplication of resources in some way. This is, of course, expensive for the network provider. Although these costs can be passed on to the user, who pays extra for the additional level of service, it is still desirable to reduce these costs as far as possible. The sections below not only describe the techniques for making an MPLS network capable of surviving failures, but also discuss possible approaches to reduce the costs by sharing the resources.
4. Local Repair
IP is a connectionless protocol designed to send data as datagrams through the network from source to destination, using the best available route. To provide the background for MPLS local repair, we look at raw IP networks first.

Routing protocols run within the network to collect and propagate topology information. This is processed using Shortest Path First (SPF) algorithms to produce a routing table at each node in the network that tells the node in which direction to forward an IP packet based on its destination address.

When there is a link or node failure within an IP network, the change in topology is distributed by the routing protocol and the routing tables are updated at each node. Initially, this may result in data packets being lost for one of three reasons.

- They are sent down the broken link because the local routing table hasn't been updated.
- They are discarded because no suitable route is known.
- They are dropped because the routing table shows the best route to be back in the direction from which the packet came.

Nevertheless, after a period of time (frequently measured in seconds), the routing protocol stabilizes and the routing tables either show that no route exists from source to destination (the network has fragmented) or a new route has been advertised and IP data flows again.

Some Service Providers achieve rapid healing of their networks and protection against failures simply by double-provisioning their entire network. Every link and every router has a shadow. Any single failure is rapidly repaired by the routing protocol to use the backup resource.
do not affect the data flows. Data paths can only change once a new LSP has been signaled and devices on the LSP programmed with the new label mappings.
Figure 4: Re-routing around a link failure in a simple network. IP data flows from the IP source through the ingress LER and egress LER to the IP sink.
Figure 4 illustrates re-routing around a link failure in a simple network. Because this re-signaling is time consuming and may in any case not result in successful re-establishment of the LSP, the signaling protocols impose some restrictions on the extent of local repair that is supported.

- CR-LDP does not include facilities for routing repair at the node that detects the fault. In fact, if a connection between two LSRs fails, CR-LDP mandates that the affected LSPs are torn down and an error notification is sent back to the ingress. An LSP can be re-signaled from the ingress and may merge with components of the old LSP downstream of the fault.
- RSVP-TE has its roots in RSVP (RFC 2205 [8]), which was intended to keep resource allocations in line with IP microflows as they reacted to routing table changes. As an MPLS signaling protocol, RSVP-TE is more flexible than CR-LDP and allows LSPs to be re-routed according to changes in the IP topology. This re-routing is, however, usually restricted to the point of failure detection and the ingress: if each LSR on the path attempted to re-route and re-signal the LSP, but failed (e.g. due to inability to find a route that matched the requested constraints), it might take far too long for the error to finally propagate back to the ingress node.
- LDP is not a Traffic Engineering protocol and is more closely tied to the routing topology. In general, LDP will react to the new routes entered into the routing tables at each LSR by distributing new labels to allow label-based forwarding to operate. Since network topologies are rarely full meshes, local repair might not succeed, and re-routing may need to be resolved at the ingress.
4.4 Crankback
Crankback is a process of reporting information about route failures back along the route towards the ingress. Recent drafts in the IETF [5] define extensions to the protocol messages that report LSP failures, to carry the details of which link has failed at which LSR. Crankback can be used to augment the IGP link state databases, especially those that inform Traffic Engineering. Frequently, this will provide substantially faster feedback than routing protocol updates and can be used to help re-route the LSP, especially from the ingress.

A crankback update could either be used to (a) re-route only the LSP to which it relates, or (b) update the IGP link state databases, for routing all future LSP establishment requests and for affecting existing LSPs. The danger of using the crankback information for anything other than the LSP to which it relates is that there is currently no mechanism to tie together crankback information and other IGP link state information. Hence, it might be difficult to identify when information introduced from a crankback update should be discarded or updated due to IGP updates.

When an LSP crosses domain or area boundaries, crankback can be used to re-route the LSP from the domain boundary without needing to propagate the error all the way back to the ingress.
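As an illustration of this open question, the sketch below shows one hypothetical way an ingress might cache crankback reports with an expiry time, so that entries are eventually dropped rather than conflicting indefinitely with fresher IGP information. Nothing here is defined by [5]; it is simply one possible approach.

```python
# Hypothetical cache of crankback reports at an ingress. Entries expire
# after a guessed time-to-live, since there is no defined mechanism for
# reconciling crankback information with later IGP updates.
import time

class CrankbackCache:
    def __init__(self, ttl_seconds=60.0):
        self.failed_links = {}        # (lsr, link) -> time the failure was reported
        self.ttl = ttl_seconds        # assumption: how long to trust a report

    def report_failure(self, lsr, link):
        self.failed_links[(lsr, link)] = time.monotonic()

    def links_to_exclude(self):
        """Links to avoid in the next path computation, dropping stale entries."""
        now = time.monotonic()
        self.failed_links = {k: t for k, t in self.failed_links.items()
                             if now - t < self.ttl}
        return set(self.failed_links)

cache = CrankbackCache()
cache.report_failure("LSR_B", "link_to_C")   # details carried on the error message
exclude = cache.links_to_exclude()           # fed into the next route computation
```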
5. Protection Switching
5.1 Basic Operation
Protection Switching is a method of ensuring recovery from link or node failure with minimal disruption to the data traffic. Many references to this function include a target failover time of 60ms, which is reputed to be the longest acceptable disruption to voice traffic.

In Protection Switching, data is switched from a failed LSP to a backup LSP at the repair point, which is not the point of failure and is conventionally the ingress, although it may also be at other well-defined points on the LSP. The backup LSP is usually pre-provisioned.

In Figure 5, data is switched on the red and green primary LSPs. The blue backup LSP takes a less favorable path, is ready and set up, but does not carry any data. When an error in one of the primary LSPs is reported back to the ingress LER (perhaps using Notify messages, see later), data is immediately switched to the backup LSP. Note that the blue path shown could be a backup for both the red and the green paths simultaneously. See below for discussion of backup modes.

This can be considerably quicker than local repair since the backup LSP does not need to be signaled at the time of failure.
Figure 5: Protection switching, with primary LSPs and a pre-provisioned backup path.
acceptable way of operating, since the failure of a second primary LSP is assumed to be highly unlikely; in fact, after failover, the blue LSP is itself unprotected. This method reduces the cost of providing backup services.
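A minimal sketch of the ingress behavior follows. Because the backup LSP is already established, failover is purely a local change of forwarding state; no signaling happens at failure time. All names and label values are illustrative.

```python
# Sketch of protection switching at the ingress LER: the backup LSP is
# pre-provisioned and idle, and a failure notification simply flips the
# forwarding state to use it. Names and label values are illustrative.

class LSP:
    def __init__(self, name, ingress_label):
        self.name, self.ingress_label = name, ingress_label

class ProtectedFlow:
    def __init__(self, primary, backup):
        self.primary = primary        # e.g. the red LSP in Figure 5
        self.backup = backup          # pre-signaled backup, carrying no data
        self.active = primary

    def on_failure_notification(self, failed_lsp):
        # No signaling here; that is what makes protection switching fast.
        if failed_lsp is self.primary:
            self.active = self.backup

    def label_for_next_packet(self):
        return self.active.ingress_label

flow = ProtectedFlow(LSP("red-primary", 21), LSP("blue-backup", 99))
flow.on_failure_notification(flow.primary)
assert flow.label_for_next_packet() == 99    # traffic now uses the backup LSP
```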
Dividing the service levels in this way will reduce the pressure on resources, as not all customers will want to pay for protection switching. Nevertheless, service providers will want to search out ways of minimizing the need to reserve backup resources. If this creates an intermediate "silver" service level of "protected most of the time", this will be acceptable.

Several modes of operation have already been raised. The simplest has a single LSP providing the backup for more than one primary LSP. Since it is unlikely that both primaries will fail, this offers a good solution, but it does require that there is more than one primary LSP between ingress and egress, something that may often not be the case.

Another option that works when there are multiple data flows between ingress and egress is to use the backup LSP for low priority data. When the primary fails, the low priority data is dropped or reverts to best effort IP transfer. This is a nice model if MPLS is being used to handle DiffServ traffic as described in [12].

In a complex network, the issue may be wider than reducing the number of end-to-end backup LSPs. In this case, there is a need to reduce the amount of resources used on links in the core of the network.
Figure 6: Two disjoint primary LSPs (red) whose protection switched backups (blue) share a common route between LSR A and LSR B.
Figure 6 shows two entirely distinct primary LSPs in red. Protection switched backups (in blue) are signaled through the network, and for part of their routes they are coincident (between LSRs A and B). Now, the resource reservation load between LSRs A and B due to the backup LSPs could be very high, yet no data is actually passing down this route. At best this may impair the ability of the network to set up primary LSPs between A and B, and at worst it may mean that one of the backups cannot be established.

The ideal is for the resources between A and B to be shared between the two backup LSPs. Since it is highly unlikely that both primary LSPs will fail at the same time, this is a good solution for a silver service.

Several issues with this resource sharing approach are still open for study at the time of writing this white paper. Firstly, how do LSRs A and B know that it is acceptable to share resources between the two LSPs? One possibility would be to mark the signaling requests as backups. There are drafts proposed within the IETF [7, 19] that describe ways this can be done, for example by extending the Protection Information object defined in GMPLS [13].

The second question is how LSR A should behave if both primary LSPs do fail and data starts to flow on both backup LSPs. A probable answer is that the backups are treated as first-come, first-served, so that the data on the second backup to be used is simply dropped at LSR A. This is hardly satisfactory, however, if both primary LSPs believe they are protected, and a better answer involves signaling to the ingress of the second primary that it is no longer protected. One way of doing this is discussed in [19] and relies on use of the Notify message in an RSVP network.

This leads to a third question, which is how restoration of a failed primary can be achieved without disrupting data flow. One way of doing this is described as "bridge and roll",
Lastly, care needs to be given to the process of data forwarding at LSR B. It is important that, however resource sharing is arranged between LSRs A and B, there are still distinct backup LSPs running in parallel. If this is not done, then packets arriving at LSR B will be indistinguishable and LSR B will not know to which egress LSR they should be forwarded.
This system can lead to another level of complexity where links and nodes can have both primary and backup resources. Primary resources can be committed only once, but backup resources could be over-committed many times, leading to two separate resource spaces to be managed. This is something that the IGPs do not currently support.
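The bookkeeping implied by these two resource spaces might look something like the sketch below for a single link: primary bandwidth is hard-committed, while backup reservations share headroom under the single-failure assumption. This is purely illustrative; as just noted, the IGPs do not currently support such a scheme.

```python
# Illustrative accounting for a link with separate primary and backup
# resource spaces. Primary bandwidth is committed once; backup LSPs may
# over-commit a shared pool because only one protected primary is assumed
# to fail at a time.

class LinkResources:
    def __init__(self, capacity):
        self.capacity = capacity
        self.primary_committed = 0.0
        self.backup_requests = []     # bandwidths of backups sharing the pool

    def reserve_primary(self, bw):
        if self.primary_committed + bw > self.capacity:
            return False              # primary bandwidth is hard-committed
        self.primary_committed += bw
        return True

    def reserve_backup(self, bw):
        # Under the single-failure assumption only the largest backup ever
        # carries traffic, so that is all the headroom required.
        needed = max(self.backup_requests + [bw])
        if self.primary_committed + needed > self.capacity:
            return False
        self.backup_requests.append(bw)
        return True

link_ab = LinkResources(capacity=100.0)
link_ab.reserve_primary(60.0)         # committed once, never shared
link_ab.reserve_backup(30.0)          # backup for one primary LSP
link_ab.reserve_backup(35.0)          # shares the same headroom: still fits
```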
6. Fast Re-route
The previous section highlights some of the concerns with protection switching. Errors need to be signaled through the network from the point of detection to the point of repair, and this can significantly delay the repair time.

Fast re-route is a process where MPLS data can be directed around a link failure without the need to perform any signaling at the time that the failure is detected. Unlike protection switching, the repair point is the point of failure detection. Consequently there is no requirement to propagate the error to the repair point using the signaling protocol.

Most fast re-route protection schemes rely on pre-signaled backup resources. When the failure is reported to the repair point, it simply updates the programming of its switch so that data that was previously sent out of one interface with one label is sent out of a different interface with another label.

There are several fast re-route schemes currently under discussion in the IETF. The different approaches address different problems and vary in complexity. Some of the more established solutions are set out below.
Figure 7: Fast re-route link protection. A pre-signaled backup LSP acts as a virtual link in parallel with the protected link between LSR A and LSR B.
The capacity of the backup LSP should, of course, be sufficient to carry the protected LSPs. If all LSPs on a link are to be protected then the capacity should equal the bandwidth of the protected link. This can potentially lead to a huge amount of backup bandwidth being required, especially if multiple links must be protected in this way. Note that not all LSPs using a link need to be protected by the same backup LSP, or even at all. By leaving some LSPs over the link unprotected, the backup bandwidth requirement can be reduced.

Note that there are some very specific limits placed on the use of label spaces when this method of fast re-route is in use. The LSP that provides the backup virtual link is used as an LSP tunnel. That is, the data packets that would have been sent down the physical link have an additional label added, and that top label is used to forward the packet along the backup LSP. When the other side of the broken link is reached (the egress of the backup LSP) the top label is stripped from the packet, and the data is forwarded according to the lower label; this is the label that would have been used to forward the packet down the broken physical link. However, label switches provide a mapping from {ingress interface, ingress label} to {egress interface, egress label} and, although the ingress label has been preserved, the ingress interface will have changed. It will be reported either as the virtual interface that identifies the egress of the backup tunnel, or as the physical interface through which the backup tunnel arrives. There are two solutions.

- Map the virtual interface back to the original ingress physical interface. This may be possible, but it requires that the downstream node understands the reason why the backup LSP was established. This would probably involve configuration intervention.
- Use the Global Label Space for all LSPs protected by the backup tunnel. This is easily achieved.

The main issues with link protection concern the increased complexity of configuration (each protected link must have a backup tunnel configured) and the amount of resources that must be reserved in the network.
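The label operations involved can be sketched as follows. The repair point pushes the backup tunnel's label with no signaling at failure time, and the tunnel egress pops it and resolves the preserved inner label in the Global Label Space (the second of the two solutions above). All label values and names are illustrative.

```python
# Sketch of fast re-route link protection. On failure, the repair point
# pushes the bypass tunnel's label on top of the label that would have
# crossed the broken link. (Along the tunnel that top label is swapped
# hop by hop as usual.) At the tunnel egress the top label is popped and
# the inner label is looked up in the platform-wide Global Label Space.
# All values and names are illustrative.

BYPASS_LABEL = 83                     # first label of the pre-signaled bypass

def forward_over_link(label_stack, link_up):
    if link_up:
        return ("protected_link", label_stack)
    # Local repair: push the bypass label and redirect, with no signaling.
    return ("bypass_first_hop", label_stack + [BYPASS_LABEL])

# At the far side of the broken link (the egress of the bypass tunnel):
global_label_space = {21: ("if_downstream", 47)}  # interface-independent labels

def bypass_egress(label_stack):
    label_stack = label_stack[:-1]    # pop the bypass tunnel's label
    inner = label_stack[-1]           # label preserved across the bypass
    out_if, out_label = global_label_space[inner]
    return out_if, label_stack[:-1] + [out_label]

iface, stack = forward_over_link([21], link_up=False)
print(bypass_egress(stack))           # ('if_downstream', [47])
```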
Figure 8: Fast re-route node protection. A backup tunnel carries traffic around the protected node.
As with fast re-route link protection, not all LSPs passing through a particular node need to be protected. In order to reduce the bandwidth required on the backup tunnel, one may choose to leave low priority traffic unprotected; in this case the LSR at the start of the backup path must decide which LSPs to switch over to the backup tunnel and which to leave (with the consequence that data on the unprotected LSPs will be lost).
Figure 9: Automatic protection using detours, showing detour paths 1 and 2 protecting the primary path through LSRs A, B, C and D.
The draft provides new RSVP signaling messages that are used to request that detour paths are set up once the primary path is set up. LSRs supporting this function can then automatically initiate computation of a detour path that protects against the next downstream node and link in the primary path. Note that LSRs adjacent to the egress LSR can only compute a detour that protects against the link between itself and the egress.

In order to compute the detour paths, the LSR needs to know:

- which downstream nodes the primary path goes through
- which outgoing link the primary path uses on the LSR
- which downstream nodes are to be protected against
- the traffic engineering requirement for the detour.
The first three items can be determined by using recorded route information, which is available during LSP establishment (i.e. on the Resv in RSVP). The traffic engineering requirements can be signaled on path creation within the new fast-reroute object. This allows detour paths to have different requirements to the main path. This might be useful in order to reduce backup bandwidth requirements: if any failure is expected to be short lived, it may be appropriate to reduce the bandwidth requirement over the backup path on the premise that a lower quality of service is better than no service.

Given this information, the LSR determines the destination for the detour path as either the next-but-one LSR in the main path or, if the LSR is the penultimate hop, the egress. It then computes a route to the destination satisfying the following constraints (a sketch of this computation follows the list).

- The detour path must originate from the current LSR.
- It should not traverse the immediate outgoing link.
- It should not traverse the next hop downstream node (unless this is the egress).
- It should satisfy the traffic engineering requirements specified.
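Here is a toy sketch of such a detour computation on a hypothetical four-node topology, using hop count in place of a full constraint-based SPF over the TE database.

```python
# Toy detour computation: find a path from the current LSR to the
# next-but-one LSR that avoids both the protected link and the protected
# node. A real implementation would run CSPF against the TE database with
# the signaled bandwidth constraint; this version just minimizes hop count
# over a hypothetical topology.
import heapq

def detour_path(graph, src, dst, avoid_node, avoid_link):
    queue, seen = [(0, src, [src])], set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nbr in graph[node]:
            if nbr == avoid_node or {node, nbr} == set(avoid_link):
                continue              # skip the protected node and link
            heapq.heappush(queue, (cost + 1, nbr, path + [nbr]))
    return None                       # no detour satisfies the constraints

# Hypothetical topology: primary path A-B-C, with D offering an alternate route.
graph = {"A": ["B", "D"], "B": ["A", "C", "D"],
         "C": ["B", "D"], "D": ["A", "B", "C"]}

# At LSR A, protect against failure of next hop B and of link A-B.
print(detour_path(graph, "A", "C", avoid_node="B", avoid_link=("A", "B")))
# ['A', 'D', 'C']
```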
The results of the detour computation allow the LSR to generate a detour path request. This will be an RSVP Path request containing an explicit route following the computed path, the traffic parameters specified within the original fast-reroute object, and a new detour object that identifies the path as being a detour, providing an indication of the node at the start of the detour and the node that the detour is avoiding (if appropriate).

An LSR can recalculate its detour paths in order to take advantage of any favorable changes in network topology. If a subsequent calculation produces a different detour path to the one currently set up, the LSR can replace it.

The main feature of this approach is the fact that detour paths are set up dynamically without any operator intervention. However, as it stands, the current draft has a number of deficiencies.

- There is no discussion of label allocation for the detour LSPs. As described in section 6.2, some thought must be given to label allocation in order to ensure that data passed over the detour path is correctly forwarded upon arrival at the LSR at the end of the detour.
- The methods described are only suitable for unidirectional LSPs.
- There is no discussion regarding reuse of the bandwidth provisioned for the detour paths when there are no failures.
- Requires reporting of labels in Recorded Route. May be limited to the global label space.
- Detour paths pre-allocated. Need a detour path for each protected link and node.
8. Summary
Building MPLS systems that can survive network failures is not simple. This is partly due to the fact that MPLS is built on top of IP, which has less demanding recovery requirements and its own ways of resolving routing changes. It also owes something to the origins of MPLS, which was not originally designed with much attention to rapid recovery from failures.

Protecting LSPs against network failure can use the inherent properties of IP and IP routing, the techniques of protection switching derived from experience in other signaling protocols, or fast re-route methods developed specifically for MPLS. Each has its own specific characteristics, advantages and disadvantages, so the method(s) used must be chosen carefully with regard to the requirements of the network and the users.

If very rapid repair is needed, for example for voice traffic, then fast re-route will probably provide the best solution. If quick repair with the possibility of sharing backup resources is desired then protection switching can be chosen. If repair time is not crucial and network resources are limited then local repair can be used.
9. Glossary
AS
Autonomous System. A part of the network under a single administration and usually running a single routing protocol for internal routing.

BGP
Border Gateway Protocol. The Exterior Gateway Protocol used for distributing routes over the Internet backbone.

CR-LDP
Constraint-based Routed Label Distribution Protocol. Extensions to LDP to set up Traffic Engineered LSPs, as defined in the Internet Draft Constraint-based LSP Setup using LDP [10].

DiffServ
Differentiated Services. A system of differentiating data packets for IP networks that is based on setting relative priorities and drop precedence for each DSCP. It is defined by the DiffServ Working Group.

DLCI
Data Link Circuit Identifier. The labels used in Frame Relay that are equivalent to MPLS labels.

ECMP
Equal Cost Multi-Path. A scheme where more than one path of the same cost can exist in the routing table. The choice between routes is made using some other principle such as bandwidth requirements.

FEC
Forwarding Equivalence Class. A logical aggregation of traffic that is forwarded in the same way by an LSR. A FEC can represent any aggregation that is convenient for the SP. FECs may be based on such things as destination address and VPN Id.

FT
Fault Tolerance. A scheme where a piece of hardware, such as an LSR, is built using duplicate hardware and software components such that the whole is resilient to failures of individual components and can provide a highly available system.

GMPLS
Generalized MPLS [13, 14, 15]. Extensions to the MPLS Traffic Engineering signaling protocols to support additional features such as optical networks, bi-directional LSPs, and source control of labels and links.
HA
High Availability. High Availability (HA) is typically claimed by equipment vendors when their hardware achieves availability levels of at least 99.999% (five 9s).
IETF
Internet Engineering Task Force. The worldwide grouping of individuals from the computing and networking industry and from academia that devises and standardizes protocols for communications within the Internet. Responsible for the development of MPLS.
IGP
Interior Gateway Protocol. Any routing protocol used for distributing routes within a single Autonomous System such as OSPF.
IP
Internet Protocol. A connectionless, packet-based protocol developed by the IETF and at the root of communications within the Internet.
Labels RSVP
See RSVP-TE.

LDP
Label Distribution Protocol. A protocol defined [11] by the IETF MPLS working group for distributing labels to set up MPLS LSPs.
LER
Label Edge Router. An LSR at the edge of the MPLS network. LERs typically form the ingress and egress points of LSP tunnels.
LMP
Link Management Protocol. A protocol under development by the IETF to discover and manage links, and to detect and isolate link failures.
LOL
Loss Of Light. The process of detecting the failure of an optical link by discovering that no signal is being received.
LSP
Label Switched Path. A data forwarding path determined by labels attached to each data packet where the data is forwarded at each hop according to the value of the labels.
LSP Tunnel
A Traffic Engineered LSP capable of carrying multiple data flows.

LSR
Label Switching Router. A component of an MPLS network that forwards data based on the labels associated with each data packet.
MPLS
MultiProtocol Label Switching. A standardized technology that provides connection-oriented switching based on IP routing protocols and labeling of data packets.
OEO
Opto-Electronic Switch. Short for Optical-Electronic-Optical, this switch terminates each optical connection converting the signal to electronics before forwarding packets to other optical links. Compare with PXC.
OSPF
Open Shortest Path First. A common routing protocol that provides IGP function.
PPP
Point-to-Point Protocol. A common access protocol for VPNs particularly important in providing connection from roaming workstations.
PXC
Photonic Cross-Connect. A type of switch that is capable of switching and forwarding optical signals without needing to convert the signals to electronics. Compare with OEO.
RSVP
Resource ReSerVation Protocol (RFC 2205) [8]. A setup protocol designed to reserve resources in an Integrated Services Internet. RSVP has been extended to form Labels RSVP.
RSVP-TE
Extensions to RSVP to set up Traffic Engineered LSPs [9]. Throughout this document, Labels RSVP or RSVP-TE is referred to simply as RSVP.
SPF
Shortest Path First. An algorithm for selecting a route through a physical topology. Shortest may apply to the number of hops (i.e. nodes) in the route, but some hops may be weighted to reflect other length characteristics such as the absolute length of the physical link between nodes. See OSPF.
TCP
Transmission Control Protocol. A transport level protocol developed by the IETF for reliable data transfer over IP.
TE
Traffic Engineering. The process of balancing the load on a network by applying constraints to the routes which individual data flows may take.
VPI/VCI
Virtual Path Identifier / Virtual Channel Identifier. The labels used in ATM layer 2 networks that are equivalent to MPLS labels.
VPN
Virtual Private Network. A private network provided by securely sharing resources within a wider, common network.
10. References
The following documents are referenced within this white paper. All RFCs and Internet drafts are available from www.ietf.org; URLs are provided for other references. Note that all Internet drafts are work in progress and may be subject to change, or may be withdrawn, without notice.

[1] MPLS Traffic Engineering: A choice of Signaling Protocols, white paper from Metaswitch (www.metaswitch.com)
[2] MPLS Virtual Private Networks: A review of the implementation options for MPLS VPNs including the ongoing standardization work in the IETF MPLS Working Group, white paper from Metaswitch (www.metaswitch.com)
[3] RFC 3031, Multiprotocol Label Switching Architecture
[4] draft-ietf-mpls-lmp, Link Management Protocol (LMP)
[5] draft-iwata-mpls-crankback, Crankback Routing Extensions for MPLS Signaling
[6] draft-swallow-rsvp-bypass-label, RSVP Label Allocation for Backup Tunnels
[7] draft-li-shared-mesh-restoration, RSVP-TE Extensions For Shared-Mesh Restoration in Transport Networks
[8] RFC 2205, Resource ReSerVation Protocol (RSVP)
[9] draft-ietf-mpls-rsvp-lsp-tunnel, Extensions to RSVP for LSP Tunnels
[10] draft-ietf-mpls-cr-ldp, Constraint-based Routed LSP Setup Using LDP
[11] RFC 3036, LDP Specification
[12] draft-ietf-mpls-diff-ext, MPLS Support of Differentiated Services
[13] draft-ietf-mpls-generalized-signaling, Generalized MPLS Signaling
[14] draft-ietf-mpls-generalized-rsvp-te, Generalized MPLS Signaling Extensions for RSVP-TE
[15] draft-ietf-mpls-generalized-cr-ldp, Generalized MPLS Signaling Extensions for CR-LDP
[16] RSVP Refresh Overhead Reduction Extensions
[17] OAM Functionality for MPLS Networks
[18] A Framework for MPLS User Plane OAM
[19] Generalized MPLS Recovery Mechanisms
[20] A Method for MPLS LSP Fast-Reroute Using RSVP Detours
[21] MPLS RSVP-TE Interoperability for Local Protection/Fast Reroute
In addition, the following Internet drafts for MPLS may be of interest.

draft-ietf-mpls-recovery-frmwrk, Framework for MPLS-based Recovery
draft-chang-mpls-path-protection, A Path Protection/Restoration Mechanism for MPLS Networks
draft-chang-mpls-rsvpte-path-protection-ext, Extensions to RSVP-TE for MPLS Path Protection
draft-owens-crldp-path-protection-ext, Extensions to CR-LDP for MPLS Path Protection
draft-shew-lsp-restoration, Fast Restoration of MPLS Label Switched Paths
draft-kini-restoration-shared-backup, Shared Backup Label Switched Path Restoration
draft-suraev-mpls-globl-recov-enhm, Global Path Recovery Enhancement Using Notify Reverse LSP
draft-azad-mpls-oam-messaging, MPLS User-Plane OAM Messaging
draft-harrison-mpls-oam-req, Requirements for OAM in MPLS Networks
draft-many-optical-restoration, Restoration Mechanisms and Signaling in Optical Networks
DC-MPLS is suitable for use in a wide range of IP switching and routing devices including Label Switch Routers (LSRs) and Label Edge Routers (LERs). Support is provided for a range of label distribution methods including Resource ReSerVation Protocol (RSVP), Constraint-based Routed Label Distribution Protocol (CR-LDP) and Label Distribution Protocol (LDP). The rich feature set gives DC-MPLS the performance, scalability and reliability required for the most demanding MPLS applications, including VPN solutions for massively scalable access devices.

DC-MPLS integrates seamlessly with Metaswitch's other protocol products, and uses the same proven N-BASE communications execution environment. The N-BASE has been ported to a large number of operating systems including VxWorks, Linux, OSE, pSOS, Chorus, Nucleus, Solaris and Windows NT, and has been used on many processors including x86, i960, Motorola 860, Sparc, IDT and MIPS.

All of the Metaswitch protocol implementations are built with scalability, distribution across multiple processors and fault tolerance architected in from the beginning. We have developed extremely consistent development processes that result in on-time delivery of highly robust and efficient software. This is backed up by an exceptionally responsive and expert support service, staffed by engineers with direct experience in developing the protocol solutions.
Metaswitch and the Metaswitch logo are trademarks of Metaswitch Networks. All other trademarks and registered trademarks are the property of their respective owners. Copyright 2001 - 2009 by Metaswitch Networks. Metaswitch Networks 100 Church Street Enfield EN2 6BQ England +44 20 8366 1177 http://www.metaswitch.com