
412 IEEE TRANSACTIONS ON RELIABILITY, VOL. 63, NO. 2, JUNE 2014

Fast Recovery From Link Failures in Ethernet Networks
Abishek Gopalan and Srinivasan Ramasubramanian, Senior Member, IEEE

Abstract: Fast-recovery from link failures is a well-studied topic in IP networks. Employing fast-recovery in Ethernet networks is complicated as the forwarding is based on destination MAC addresses, which do not have the hierarchical nature similar to those exhibited in Layer 3 in the form of IP-prefixes. Moreover, switches employ backward learning to populate the forwarding table entries. Thus, any fast recovery mechanism in Ethernet networks must be based on undirected spanning trees if backward learning is to be retained. In this paper, we develop three alternatives for achieving fast recovery from single link failures in Ethernet networks. All three approaches provide guaranteed recovery from single link failures. The approaches differ in the technologies required for achieving fast recovery, namely VLAN rewrite or mac-in-mac encapsulation or both. We study the performance of the approaches developed on five different networks.

Index Terms: Fast recovery, Ethernet, backward learning, link failure, vlans, independent spanning trees, network availability.

ACRONYMS AND ABBREVIATIONS
2E Two-edge connected network
3E Three-edge connected network
BFD Bidirectional Forwarding Detection
BFS Breadth-first Search
CoS Class of Service
DFS Depth-first Search
ESCAP Efficient Scan for Alternate Paths
IP Internet Protocol
MAC Medium Access Control
M-ESCAP Modified Efficient Scan for Alternate Paths
MPLS Multiprotocol Label Switching
MSTP Multiple Spanning Tree Protocol
PBB-TE Provider Backbone Bridge Traffic Engineering
RSTP Rapid Spanning Tree Protocol
SPB Shortest Path Bridging
SPF Shortest Path First
SPT Shortest Path Tree
VLAN Virtual Local Area Network

NOTATION
Graph
Number of nodes or vertices in a graph
Number of edges or links in a graph
Next-hop on VLAN to reach switch
Tree rooted at node
Graph in which is removed
Exit node of
Shortest path length from node to node under no failures
Average shortest path length under no failures
Default tree path length from node to node under no failures
Average path length under no failures
Shortest path recovery length from node to node under link failure
Average shortest path recovery length from node to node under single link failures
Average shortest path recovery length under single link failures
Recovery path length from node to node under link failure
Average recovery path length from node to node under single link failures
Average recovery path length under single link failures
Undirected cycle
Directed cycle
The set of exit links on a cycle

Manuscript received April 28, 2013; revised July 25, 2013; accepted August 05, 2013. Date of publication April 14, 2014; date of current version May 29, 2014. This work was supported in part by the National Science Foundation under Grant CNS-1117274 and by the Hewlett Packard Labs Innovation Research Program. Associate Editor: P.-H. Ho.
The authors are with the Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721 USA (e-mail: abishek@ece.arizona.edu; srini@ece.arizona.edu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TR.2014.2315957

I. INTRODUCTION

ETHERNET is becoming an attractive solution in metropolitan and wide area networks as it offers a cost-effective way to provision high data rate services [2]. However, the simplicity and cost-effectiveness come with two

0018-9529 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
GOPALAN AND RAMASUBRAMANIAN: FAST RECOVERY FROM LINK FAILURES IN ETHERNET NETWORKS 413

major shortcomings: poor support for traffic engineering, and slow failure recovery times [3], [4]. These two shortcomings are a direct consequence of employing an undirected spanning tree as the basis for forwarding. The spanning tree plays a key role in 1) reducing the unnecessary overhead created by broadcasting when a destination address is not available, and 2) retaining the backward learning mechanism, which is crucial in supporting the scalability and mobility of the end-hosts. The spanning tree, however, provides only one path between any node pair, and hence the failure of any link or node would disconnect the spanning tree.

To overcome the deficiencies of the spanning tree approach, there have been several revisions to the original spanning tree protocol, such as support for faster re-convergence (RSTP) [5], and support for multiple spanning trees (MSTP) [6] that can help create smaller regions for recovery. Protocols to reduce fault detection times, such as bi-directional forwarding detection (BFD) [7], were developed. Despite all these efforts, we still lack a fundamental understanding of the application of undirected spanning trees in achieving good resiliency in network design.

In this paper, our goal is to study the use of multiple spanning trees with interesting properties for achieving fast recovery in Ethernet networks. We develop methods to achieve fast recovery from link failures in VLANs using proactive approaches that rely only on local information with a constant overhead. Every spanning tree may be configured with a unique VLAN identifier. The VLANs are precomputed and preconfigured, thus enabling fast recovery from link failures. In addition, traffic may be split over multiple VLANs to provide increased cross-sectional bandwidth. The algorithms and protocols have provable performance guarantees.

A. Fast Recovery in IP Vs. Ethernet

Fast recovery from link and node failures has been studied extensively in the context of IP networks [8]–[11]. Traditionally, IP routing table entries are computed by constructing destination-rooted trees for IP prefixes. The trees for different IP prefixes are computed separately from each other. The destination-rooted trees are directed in nature, directed towards the destination. Thus, the routes between two IP endpoints may not necessarily be symmetrical. Fast recovery in IP networks is achieved by providing one or more backup ports, in addition to the primary forwarding port used under no failures. The forwarding of IP packets is then based on the destination IP address, and some additional information, which is either carried in the packet or derived from the incoming port on the router.

The forwarding mechanism of Ethernet is similar to that of IP networks. Ethernet forwarding is based on destination MAC address, similar to destination IP address, and VLAN tag (equivalent to the additional information used in IP addressing). However, the failure recovery techniques developed for IP networks cannot be directly applied to Ethernet networks due to the fact that Ethernet employs backward learning to compute its forwarding table. The switches in Ethernet networks learn about the endpoints from the packets that arrive at the switch. If a packet with a given VLAN tag and source MAC address arrives on a port, then the switch infers that the host, identified by that VLAN-MAC tuple, may be reached through that port. This backward learning forces the routes to be symmetrical; and as only one entry per host is maintained at every switch, the approach works only on undirected spanning trees. In contrast to IP networks, Ethernet networks extract an undirected tree from the underlying topology, through spanning tree protocol variants, and then learn where the hosts are attached to that tree in order to populate the forwarding table. Depending on the variant of spanning tree protocol employed, the network may employ one or more trees. While each tree may be assigned one or more VLAN tags, each VLAN has a specific tree assigned to it. Any mechanisms developed for forwarding or fast recovery or both must rely on undirected spanning trees unless they provide an alternative mechanism for learning MAC addresses, thus the necessary forwarding logic at the switches.

Techniques such as Provider Backbone Bridge Traffic Engineering (PBB-TE) [12] and Shortest Path Bridging (SPB) [13] avoid the reliance on a spanning tree for regular network operation. PBB-TE provides a centralized architecture, and a connection-oriented approach to computing forwarding entries and protection paths. SPB provides a link state control plane that allows switches to separately compute forwarding tables from a common view of the network topology. In such scenarios, it is possible to employ techniques for fast re-routing that have been developed in IP over MPLS networks. Examples include Colored trees [10], Not-via [14], Maximally redundant trees [15], and ESCAP [16]. As PBB-TE and SPB do not employ spanning trees, they cannot employ backward learning. Thus, these approaches must be augmented with techniques to disseminate the MAC addresses of the devices connected to the switches across the network. Then, every switch will have the knowledge of the destination switch to which a packet needs to be forwarded for a given destination MAC address. However, our focus in this paper is to study techniques for fast rerouting that would retain backward learning.

B. Related Work

An undirected spanning tree gets disconnected by a link or node failure. The spanning tree protocol and its variants recover from failures by computing another tree after the link failure. The convergence time of the original spanning tree algorithm was on the order of 30 to 50 seconds. RSTP [5], which was developed later in 802.1d [17], can reduce the convergence time anywhere from tens or hundreds of milliseconds to a few seconds [18], [19] depending on considerations such as network topology, port manipulation times, time for failure detection, etc. As a spanning tree uses one link fewer than the number of switches it connects, several links in the network remain unused. To improve link utilization in the network, support for multiple spanning trees was developed in 802.1q and 802.1s, where one or more VLANs can be assigned to a tree. Multiple spanning trees, where each tree is identified using a specific VLAN, are employed to distribute traffic in the network [20], [21]. When a failure occurs, the failure is notified to a central system [22] by the switch connected to the failed link or detected by receivers which then notify the senders [23], [24]. The traffic is then redistributed by the source over the remaining spanning trees.

Our goal in this paper is to employ multiple spanning trees, each identified with a unique VLAN, to achieve fast recovery

from link failures. By fast recovery, we refer to the forwarding of packets along an alternate port (path) from the switch connected to the failed link. The switching of traffic to an alternate path implies changing the VLAN tag in the packet, thereby employing a different spanning tree for routing. In [25], [26], the authors employ multiple spanning trees over which traffic is distributed. Upon a failure, an intermediate switch would choose a spanning tree whose VLAN identifier is higher than the one that failed. The papers, however, do not quantify the number of spanning trees needed to guarantee a single link failure recovery, or even mention how the trees are computed. Link-disjoint spanning trees have been proposed as a mechanism for achieving load balancing and failure recovery [27], [28]. In these approaches, two spanning trees are constructed such that no undirected link is common to both trees. The use of link-disjoint spanning trees in conjunction with switching traffic from one spanning tree to another at the point of failure guarantees recovery from single link failures. However, a network has to be four edge connected to obtain two link-disjoint spanning trees [29], [30], which is a stringent requirement. Note that, in contrast, we can compute two destination-rooted link-independent spanning trees¹ in two edge connected networks, and achieve fast rerouting in IP networks [8].

¹ (Link or vertex) Independent spanning trees are rooted spanning trees, such that the paths from any node to the root of the spanning trees are mutually (link or vertex)-disjoint. In this paper, references to independent trees imply link-independent as well as spanning unless noted otherwise.

The use of multiple spanning trees, each identified using a unique VLAN tag, is a popular approach for distributing traffic across different paths. There have been prior works that have looked at the issues of improving fault tolerance and bandwidth guarantees. Works [31]–[33] have studied how one could re-structure the original spanning tree upon link failures so that the changes during re-convergence are mitigated to some extent. Such schemes often rely on messaging [31], [32], or might require the use of several backup trees [33], which cannot scale well. Although these schemes can re-connect the broken tree quickly (termed fast recovery or fast re-connection), it is not obvious how the packets at the switch can be re-forwarded until such a new tree is set up. Thus, there will be packet losses, and control messages being exchanged with other switches until a new tree is set up.

In this paper, our definition of guaranteed fast recovery upon a link failure is the ability of the switches to re-forward packets without having to inform other routers of the failure (purely local recovery), thus limited only by the failure detection time and the time to modify the packet header.

In this paper, we focus on providing guaranteed recovery from single link failures in Ethernet networks using multiple undirected spanning trees, each identified with a unique VLAN. To the best of our knowledge, this is the first work that provides mechanisms that guarantee purely local fast recovery from single link failures in Ethernet networks that are at least two-edge connected. This connectivity requirement is the minimum to ensure single link fault tolerance. Moreover, we also develop several schemes that can achieve this goal, and study the tradeoffs between these schemes that vary based on the network topology, the techniques to be employed for forwarding upon encountering a failure, and the number of VLAN tags required.

C. Contributions

Our contributions in this paper are as follows. We develop three approaches, namely the 3Trees, 2Trees, and 1Tree that each provide different tradeoffs. All approaches are based on multiple spanning trees and shifting the packet from one spanning tree to another upon failure, thus requiring VLAN re-write capability at the switches. In addition, some of the approaches may require MAC-in-MAC encapsulation (802.1ah [34]), with either 1 or 2 levels of encapsulation. Table I shows the summary of the requirements for the different approaches developed in this paper.

TABLE I
A COMPARISON OF NETWORK CONNECTIVITY, NUMBER OF VLANS REQUIRED, AND 802.1AH MAC-IN-MAC ENCAPSULATION REQUIREMENTS FOR THE DIFFERENT APPROACHES DEVELOPED IN THIS PAPER

All three solutions guarantee recovery from single link failures, and the forwarding mechanism upon a failure is based purely on local information. Moreover, all of the solutions are proactive approaches. Thus, the switches may be programmed with the actions to be taken for forwarding a packet under normal circumstances and upon failures.

We extend the contributions made in the conference version [1] of this paper by providing proofs for theorems, additional performance evaluation, discussion of new results, and detailed illustrations of the proposed schemes.

D. Organization

In Section II, we describe issues that are common to any fast recovery approach that employs multiple spanning trees. In Sections III, IV, and V we describe the 3Trees, 2Trees, and 1Tree approaches, respectively. We illustrate the working of each approach with examples. In Section VI, we evaluate the performance of these approaches. We conclude the paper in Section VII. The Appendix describes the necessary theorems and proofs required to ensure the correctness of the 1Tree approach.

II. FAST RECOVERY WITH MULTIPLE SPANNING TREES: GENERAL CONSIDERATIONS

Recall that any solution for fast recovery in Ethernet networks needs to use an undirected spanning tree due to backward learning. One approach to guarantee single link failure recovery is to decompose the network into multiple spanning trees such that, for any given link failure, there exists a spanning tree that doesn’t contain that link. When a failure occurs, the switch connected to the failure may simply transfer the packet to the spanning tree that does not contain the failed link by rewriting

the VLAN tag in the packet. Thus, every switch must have the knowledge of which VLAN to rewrite the packet with, upon a link failure.

One approach to use multiple spanning trees is to designate one tree as primary, and the others as backup. Thus, when a link in the primary tree fails, the switch connected to the failed link would rewrite the VLAN tag in the packet to correspond to the secondary tree and forward it along the secondary tree. However, there is a drawback to this approach. As the switches populate the forwarding table entries using backward learning, the switches would not have learnt about the end hosts on the backup spanning tree. This limitation would force the intermediate switch to broadcast the packet on the backup spanning tree. However, if the traffic is spread over all the spanning trees, then the broadcasting upon a failure may be avoided. If the traffic is sent over all the spanning trees, then the switches would need a way to verify if the incoming packet has already encountered a failure or not. We may encode this information with one bit. We may use one bit from the 3-bit class-of-service (CoS) field in the VLAN header. Note that the above considerations are applicable to any fast recovery mechanism using multiple spanning trees, thus they naturally apply to the approaches developed here.

While employing link-disjoint trees has the simplicity of using two trees, it requires the network to be four edge connected. However, if we increase the number of spanning trees employed, it may relax the requirement on the network to less than four edge connected. Consequently, we are interested in the following problem. What is the minimum number of undirected spanning trees an arbitrary network can be decomposed into such that for any given link, there exists a spanning tree that does not contain the link? For this problem, we have the following results.
1) If the network is four edge connected, we may decompose the network into two spanning trees [30]–[36].
2) For three edge-connected networks, we may decompose the network into three spanning trees.
3) For two edge connected networks, the number of spanning trees required depends on the number of nodes (switches) in the network.

The last result may be readily seen from the following two facts. First, any two edge connected network can be made a minimally two edge connected network. The number of links in a minimally two edge connected network is bounded, as shown in [37]. Second, we may consider removing one link at a time, and reducing the residual graph into a spanning tree, which in the worst-case (a ring network for example) would require as many spanning trees as the number of links.

III. 3TREES APPROACH

Our first approach is for three edge connected networks. We employ three spanning trees, referred to as the red, blue, and green, identified using three VLAN tags. Every link will be present in at most two of the trees by construction. Thus, when a link fails, the switch connected to the failed link can re-write the VLAN corresponding to the tree on which the link is not present.

We may construct the desired spanning trees in two ways. In the first approach, we select an arbitrary node as a root node and construct three link-independent trees [11], [38]. The three link-independent trees are directed trees, however we simply consider the undirected version of the three trees. The second approach is to select an arbitrary node as a root, construct three rooted arc-disjoint trees [39]², and consider the undirected version of the trees. Note that the link-independent trees are also arc-disjoint trees with the additional constraint that the paths from any node to the root on the trees are mutually link-disjoint³. Both approaches will ensure that every link is part of at most two trees, thus there always exists a spanning tree that is unaffected by any single link failure. The computational complexity of the first approach is given in [38], and that of the second approach in [39].

Fig. 2 shows the procedure and the sequence of steps used for forwarding packets in the 3Trees approach. Here, we assume that default packet forwarding in the absence of failures occurs on the red VLAN. Also, if the switch is connected to the destination host, then the packet is forwarded to the host. Thus, Step 1 in the procedure simply forwards the packet along the VLAN in which it was received. This action is red when the packet has seen no failure, and is either blue or green when the packet has already seen a failure. If, however, the forwarding link has failed, the steps for recovering from the link failure are discussed next.

Step 2 is entered when there is a failure of the forwarding link. The step checks to see if the packet has seen a failure already. This is easily known by checking the VLAN tag of the incoming packet, and if it is not red, we drop the packet. We proceed to the third step only when this is the first link failure that the packet has seen. Because all VLANs are spanning, and we are guaranteed (by construction) that there exists one spanning VLAN untouched by the link failure, we can use the unaffected VLAN for recovery.

Step 3(a) checks to see if the blue VLAN is also present on the failed link. If so, the green VLAN which is unaffected is used to recover. If we reach Step 3(b), we know that the blue VLAN is available, and hence employ the same to recover.

We now illustrate, with the help of an example network, the 3Trees approach. The network is as shown in Fig. 1(a), and is three edge-connected. We consider one of the tree approaches described earlier to construct the trees, namely the link-independent spanning trees. Figs. 1(b) through (d) show the three link-independent spanning trees rooted at node A. Figs. 1(e) through (g) show the undirected version of the three link-independent spanning trees, which denote the three desired spanning trees to be employed as VLANs. Consider the failure of link B-C. Consider a packet destined to a host that is connected to switch E, arriving at node C. In the absence of the failure, the packet would have been forwarded on the red VLAN along C-B-E. Now, it is easy to see that the link failure affects both the red and blue VLANs while leaving the green VLAN unaffected. Because each VLAN is spanning, the green VLAN can be safely used to reach E, and is picked according to Step 3(a).

² An arc is a directed link (or edge).
³ If a path on one tree contains a directed link, then the paths on the other two trees will contain neither that directed link nor its reverse.
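The forwarding steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure of Fig. 2 itself: the function name `forward_3trees`, the tag constants, and the three lookup tables are hypothetical, and the sketch assumes each switch already knows, per neighbor, which of the three trees use the link to that neighbor.

```python
RED, BLUE, GREEN = "red", "blue", "green"  # hypothetical VLAN tag names


def forward_3trees(tag, next_hop, link_up, vlans_on_link):
    """Decide what one switch does with one packet (illustrative only).

    tag           -- VLAN tag carried by the incoming packet
    next_hop      -- dict: VLAN tag -> learned neighbor for the destination
    link_up       -- dict: neighbor -> True if the link to it is up
    vlans_on_link -- dict: neighbor -> set of VLANs whose tree uses that link
    Returns a tuple (action, tag, neighbor).
    """
    nbr = next_hop[tag]
    # Step 1: forward on the VLAN the packet arrived on if the link is up.
    if link_up[nbr]:
        return ("forward", tag, nbr)
    # Step 2: a non-red tag means the packet already saw a failure: drop it.
    if tag != RED:
        return ("drop", tag, None)
    # Step 3(a): blue also uses the failed link, so recover on green.
    # Step 3(b): otherwise blue is unaffected, so recover on blue.
    new_tag = GREEN if BLUE in vlans_on_link[nbr] else BLUE
    return ("forward", new_tag, next_hop[new_tag])


# Switch C with link B-C failed; red and blue both use B-C, as in the example.
# The green next hop "D" is a placeholder, since the text does not state it.
hops = {RED: "B", BLUE: "B", GREEN: "D"}
up = {"B": False, "D": True}
on_link = {"B": {RED, BLUE}, "D": {GREEN}}
print(forward_3trees(RED, hops, up, on_link))
```

Because every link belongs to at most two of the three trees, the tag chosen in Step 3 always names a tree that does not use the failed link, which is why the sketch needs no further check under a single link failure.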

Fig. 1. Construction of three spanning trees in an example three edge connected network. (a) Example network. (b)–(d) Three independent spanning trees rooted at node A. (e)–(g) Three undirected spanning trees derived from the independent spanning trees. The three undirected spanning trees have the property that for any link in the network, there exists a spanning tree that does not contain the link. Panels: (a) Example network, (b) Red independent spanning tree, (c) Blue independent spanning tree, (d) Green independent spanning tree, (e) Red undirected tree, (f) Blue undirected tree, (g) Green undirected tree.

Fig. 2. Forwarding procedure in 3Trees approach.

Fig. 3. Construction of two spanning trees in an example two edge connected network. (a) Example network. (b)–(c) Two undirected spanning trees derived from the link-independent trees, rooted at node A. (d)–(f) Three failure scenarios depicting the different backup forwarding mechanisms required. Panels: (a) Example network, (b) Red spanning tree, (c) Blue spanning tree, (d) Failure scenario 1, (e) Failure scenario 2, (f) Failure scenario 3.

The 3Trees approach is designed for three edge connected networks. Although the 3Trees approach requires only VLAN re-write capability at the switches, the three edge connectivity requirement may not be met by some real-life networks. In addition, the approach requires the use of three trees, hence three VLAN tags per virtual network. As the number of VLAN tags available is only 4096, allocating three VLAN tags per network may not be preferred. For these two reasons, it is desired to have fast recovery methods that work on two edge connected networks, and preferably with fewer VLAN tags per network. In the following sections, we develop two approaches to achieve this goal.

IV. 2TREES APPROACH

In an arbitrary two edge connected network, we may require many spanning trees such that for every link, there exists at least one spanning tree that does not contain the link. Spanning trees with this property are required only if we are restricted strictly to using VLAN re-write as the only mechanism for fast recovery. However, we may achieve fast recovery with only two spanning trees, thus two VLAN tags, if we can employ mac-in-mac encapsulation in addition to VLAN re-writing.

We consider a two edge connected network. We select an arbitrary node as the root and construct two link-independent spanning trees⁴. We then turn the two link-independent spanning trees directed towards the root into undirected trees, referred to as the red and blue trees. Fig. 3(a) shows an example two edge connected network. Figs. 3(b) and (c) show the red and blue undirected spanning trees, respectively. The two spanning trees were constructed by constructing two link-independent trees rooted at node A, and viewing them as undirected trees. Although we treat the spanning tree as undirected, we will still use the root as an intermediate node to achieve fast recovery. Thus, at every node, we maintain the forwarding neighbor to reach the root along the red and blue trees.

⁴ An algorithm for constructing them is given in [40].
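One way to obtain the per-node forwarding neighbor toward the root on each undirected tree is a breadth-first traversal from the root. The sketch below is illustrative only: the function name `root_neighbors` is hypothetical, and the red and blue trees are a made-up pair on a four-node ring A-B-C-D-A, not the trees of Fig. 3.

```python
from collections import deque


def root_neighbors(tree_edges, root):
    """Map every node to its neighbor on the unique tree path to the root."""
    adj = {}
    for u, v in tree_edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    toward_root = {root: None}  # the root has no forwarding neighbor
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, []):
            if v not in toward_root:
                toward_root[v] = u  # v reaches the root through u
                queue.append(v)
    return toward_root


# Hypothetical trees on the ring A-B-C-D-A, both rooted at A.
red = [("A", "B"), ("B", "C"), ("C", "D")]
blue = [("A", "D"), ("D", "C"), ("C", "B")]
red_up = root_neighbors(red, "A")
blue_up = root_neighbors(blue, "A")
print(red_up)
print(blue_up)
```

In this made-up pair, the red and blue paths from any node to A share no link, so a single link failure leaves every node connected to the root on at least one of the two trees.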

Fig. 4. Forwarding procedure in 2Trees approach.

Consider a packet destined to an end host at a switch. The VLAN tag in the packet is initially red. In the absence of failures, the switch forwards the packet along the red forwarding edge on which it has learnt the destination. In addition, assume that all the switches have learned all the end hosts on both the trees. The packet carries a failure bit, part of the CoS field, that is initialized to zero. Each switch maintains the forwarding neighbors for the destination on the red and blue trees. In addition, each switch maintains the forwarding neighbor to reach the root of the spanning tree.

The procedure shown in Fig. 4 details the forwarding actions taken at a switch. Because the 2Trees approach will employ mac-in-mac encapsulations, the destination and tag in the procedure refer to the MAC address and VLAN tag of the outermost header in the packet.

In Step 1, we assume that, if the switch is the destination as specified in the packet header but the packet also has an inner MAC header, then Step 1 is executed with the VLAN tag and destination of the inner header. If there is no inner header, and the switch is connected to the destination host, it forwards the packet directly to the host. In all other cases, the packet is forwarded to the forwarding neighbor as described in the procedure.

Step 2 checks the 1 bit failure information in the CoS field to see if the packet has already seen a link failure, and if so, simply drops the packet.

Step 3 is the key step that dictates the actions for recovering from a link failure, and depends on three scenarios described next.

Scenario 1: the forwarding neighbors for the destination differ on the red and blue trees. As the forwarding neighbors, hence links, are different on the red and blue trees to reach the destination, the packet is simply switched from the red to the blue spanning tree, and forwarded to the neighbor on the blue tree.

For example, consider the failure of link E-B shown in Fig. 3(d). Consider a packet at switch E that is destined to an end host attached to switch D. The destination would have been learned over the link E-B on the red tree, and on link E-F on the blue tree. As the forwarding neighbor on the red tree, B, is different from that on the blue tree, F, the packet is re-written with the blue VLAN tag, and forwarded to F.

Scenario 2: the forwarding neighbor for the destination is the same on both trees, but differs from the red forwarding neighbor for the root. As we have only two spanning trees, it is possible that the forwarding neighbor for a destination is the same on the red and blue trees. For example, the forwarding neighbor for a host connected to switch D at switch B is switch C on both the red and blue trees. Thus, the packet cannot be forwarded along the second spanning tree upon a failure. Note that the spanning trees were constructed from link-independent trees rooted at the root node. The link-independent trees have the property that, even under a link failure, every node has connectivity to the root. We exploit this property, and use the root as the intermediate point through which we may reach the destination.

Consider the situation when the red forwarding neighbor to reach the root node of the spanning tree is not the same as the forwarding neighbor for the destination. Note that a spanning tree provides a unique path for every node to reach the root. Thus, if the red forwarding neighbors for a destination and root are different at a switch, then it implies the following. (a) The switch can still reach the root on the red tree. (b) The path to the root from the destination on the red tree traverses the intermediate switch, and therefore, the destination is still connected to the root on the blue tree. Thus, we may forward the packet to the root along the red tree, and forward the packet from the root to the destination along the blue tree. To achieve this solution, we employ mac-in-mac encapsulation. We first re-write the VLAN tag on the original packet to that of the blue spanning tree. We then add an encapsulation (outer) header to forward the packet to the root on the red tree.

Consider the failure of link B-C as shown in Fig. 3(e). Consider a packet destined to a host that is connected to switch D, arriving at node B. The forwarding neighbor for the destination at switch B is the same switch, C, on the red and blue trees. However, the forwarding neighbor for the root on the red tree is A. Thus, upon failure of link B-C, switch B would forward the packet along the path B-A on the red tree. Node A would decapsulate the packet, and forward the inner packet to node D along A-D on the blue tree.

Scenario 3: the forwarding neighbor for the destination on both trees is the same as the red forwarding neighbor for the root. In this scenario, the forwarding neighbors on the red and blue trees for the destination are the same as the forwarding neighbor on the red tree for the root. Thus, we have the following properties. (a) The intermediate switch can reach the root along the blue tree. (b) As the root and the destination are on the same side of the failure on the red tree, the root and destination are still connected on the red tree under a single link failure. Thus, we may forward the packet to the root on the blue tree, and then forward the packet from the root to the destination on the red tree. To achieve this result, we simply leave the original header on the packet as is, and add an encapsulation header to forward the packet to the root on the blue tree.

Consider the failure of link E-F shown in Fig. 3(f). Consider a packet destined to an end host that is connected to switch E, arriving at node F. The forwarding neighbors on the red and

Fig. 5. Construction of a shortest path spanning tree in an example two edge connected network. (a) Example network. (b) Undirected shortest path spanning tree rooted at node A.

blue trees to reach E are the same switch E, which is also the same as the forwarding neighbor for the root on the red tree. Thus, switch F would encapsulate the packet to the root (switch A), and forward along the blue tree. The packet follows the path F-G-D-A, and gets decapsulated at switch A. The inner packet, which is addressed to the destination on the red tree, gets forwarded along the path A-B-E.

Fig. 6. Construction of backup port assignments according to [16] on an example two edge connected network. (a) Tree rooted at node A (destination rooted tree). (b) Backup forwarding node assignments. (c) The red VLAN (red undirected spanning tree). (d) The blue VLAN (blue undirected forest).
The 3Trees and 2Trees approaches developed thus far required that the trees be constructed simultaneously. If one desires a specific tree to be employed under the no failure scenario, then it may not be possible to construct the other tree(s). Consider the example network shown in Fig. 5(a), and assume that the spanning tree protocol is run with node A being chosen as the root. The resulting spanning tree is shown in Fig. 5(b). It can be easily seen that a second link-independent spanning tree rooted at node A cannot be constructed as both links of node A are used in the spanning tree.
In the 3Trees and 2Trees approaches, one may use a separate tree for forwarding under the no failure scenario by dedicating another VLAN tag, referred to as the normal VLAN tag. Thus, under this approach, the traffic under the no failure scenario would be forwarded based on the normal VLAN tag. When the packet encounters a failure, the switch would treat the packet as one that was forwarded based on the VLAN tag of the failed outgoing edge, and select the corresponding backup forwarding action. If used with a normal VLAN tag, the 3Trees and 2Trees approaches would require a total of four and three VLAN tags, respectively.

V. 1TREE APPROACH
To overcome the limitation of the 3Trees and 2Trees approaches, we develop another approach that allows any default tree to be used for forwarding traffic under the no failure scenario. We refer to this approach as the 1Tree approach.
The 1Tree approach is based on the technique for backup port assignment for IP fast re-route developed by Xi et al. [16]. Using their approach as a starting point, we show that it is possible to construct a collection of trees to recover upon a link failure with mac-in-mac encapsulations. The collection of trees can all use the same VLAN tag. Thus, the 1Tree approach requires two VLAN tags: one VLAN tag to identify the spanning tree used for forwarding when there are no failures, and a second VLAN tag that is shared by a collection of trees.
We first briefly describe the ESCAP algorithm introduced in [16]. We then describe how we employ it in our context. Given a two edge connected network and a primary tree rooted at a destination node, the 1Tree-IP approach computes backup ports for every node. Every node has a primary forwarding neighbor, and a backup forwarding neighbor. In the context of IP networks, the packet forwarding is as follows. 1) If a packet is received from any node other than the primary forwarding neighbor, then the packet is forwarded to the primary forwarding neighbor. 2) If the primary forwarding link is unavailable due to a failure, or if the packet is received from the primary forwarding neighbor, then the packet is forwarded to the backup forwarding neighbor.
For a given primary tree rooted at a destination node, the computation of backup forwarding neighbors at all other nodes is as follows. The links in the primary tree are considered in the depth-first or breadth-first manner. Consider a link that connects a node with its primary forwarding neighbor. The removal of this link would disconnect the primary tree into two sub-trees: one containing the node, and the other containing the destination and the primary forwarding neighbor. If the node does not have a backup forwarding node assigned already, then the shortest path from the node to any node in the other sub-tree is computed by using only the edges on the node's own sub-tree except for the last hop, which is the hop used to reach outside of that sub-tree. The node at which the directed path terminates is referred to as the exit node of the node whose primary link was removed. This directed path provides the backup forwarding neighbors for all the nodes in the path, and its terminating node is the exit node for all nodes in the path. This procedure is repeated until the backup forwarding nodes for all the nodes are computed.
An example shortest path tree rooted at node A is shown in Fig. 6(a). The backup forwarding nodes assigned for the primary tree rooted at node A according to the ESCAP algorithm in [16] are shown in Fig. 6(b).
We now adapt the ESCAP approach to the VLAN setting. We first observe that any primary undirected tree to be used as the default VLAN can be made directed and rooted at an arbitrary node. Now, we would like to use all the backward directed arcs that ESCAP would compute on the primary tree, and treat them in an undirected form, so that we can employ it

as the secondary VLAN that can somehow be used to recover from a link failure on the primary VLAN. However, this approach is not straightforward as there could be cycles on the secondary VLAN. Hence, the idea is to modify the original algorithm to avoid these cycles so that we can view all the directed backup-arcs in an undirected form, and employ them as a secondary VLAN. We describe the modified algorithm, denoted as -ESCAP, next, and refer the interested reader to the Appendix for an example of how cycles could result in the absence of any modifications to ESCAP, a detailed description of the properties of -ESCAP, and its proof of correctness.

Fig. 7. Failure scenarios when the first VLAN can be arbitrarily chosen: two failure scenarios depicting the different backup forwarding mechanisms required on the example network. (a) Failure scenario 1. (b) Failure scenario 2.
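For reference, the per-node IP forwarding rule of [16] summarized earlier in this section reduces to a two-way choice between the primary and the backup forwarding neighbor. The following is a minimal Python sketch of that rule only, not of the backup computation itself; the function name, its parameters, and the use of `None` for a locally originated packet are illustrative assumptions and do not come from [16].

```python
from typing import Optional


def escap_next_hop(incoming: Optional[str],
                   primary: str,
                   backup: str,
                   primary_link_up: bool = True) -> str:
    """Choose the next hop at one node for one destination-rooted tree.

    incoming        -- neighbor the packet arrived from (None if locally
                       originated; this convention is an assumption)
    primary, backup -- primary and backup forwarding neighbors of this node
    primary_link_up -- False if the link to `primary` has failed
    """
    # Rule 2: the primary link is down, or the packet came back from the
    # primary neighbor, so it must leave on the backup port.
    if not primary_link_up or incoming == primary:
        return backup
    # Rule 1: a packet received from any other node goes to the primary.
    return primary
```

With primary neighbor B and backup neighbor G, a packet arriving from C is sent to B, while a packet arriving from B, or any packet seen while the link to B is down, is sent to G.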
On the primary tree, we construct the backup forwarding paths for each link failure, considering the links in breadth-first order as in ESCAP. The key change we make to ESCAP is as follows. Assume that we are considering the failure of the primary link for a node. In the backup path of that node, consider the last hop; the node at its far end is, by definition, the exit node. If, at the time of this computation, we find that the exit node does not have a backup forwarding node assigned, we assign the node at the near end of the last hop to be its backup forwarding node. This key modification to the ESCAP algorithm guarantees that the backup forwarding node assignment would never result in a directed loop.⁵ This key change ensures that, when the directed forwarding edges are turned into undirected edges, the resultant sub-network of backup forwarding links would remain a tree. Thus, we can use the same VLAN tag on all of these trees, exploiting spatial re-use of the VLAN tag. Applying this technique on our example network, we obtain the primary tree in red as shown in Fig. 6(c), and the forest in blue as shown in Fig. 6(d). The idea then is to use the forest to take us across the failure to the other side of the primary spanning tree, which contains the node attached to the destination host.
However, because we require the spanning tree and forest to be undirected, we lose the information previously carried implicitly by the directed arcs. Thus, we need to maintain at every node the exit node for itself, and for each of its neighbors that use it to reach the root. These are the children of the node in the directed spanning tree. So, for instance, in the example network in Fig. 6(c), node B would need to save that its exit node is node D, and also save the exit nodes of nodes C and E,⁶ which are nodes D and G respectively. Thus, at every node we maintain the forwarding neighbors to reach the root and all the exit nodes as detailed above.

⁵ Note that the directed loop is not a concern in IP fast rerouting. It is only a concern for our approach as we will turn the directed forwarding edges into undirected edges.
⁶ Children of node B in Fig. 6(a).

Consider a packet destined to an end host, arriving at a switch. Assume that the VLAN tag in the packet corresponds to the red tree. The switch knows the forwarding neighbor for the destination, and the packet carries a failure bit, which is initialized to zero. The switch also knows the forwarding neighbor to reach the root of the spanning tree. When the network has no failures, the switch would forward the packet along the red forwarding edge on which it has learnt the destination.

Fig. 8. Forwarding procedure in 1Tree approach.

Fig. 8 shows the procedure used for forwarding packets under the 1Tree approach. As in the 2Tree approach, the destination and VLAN tag in the procedure correspond to the outermost header of the packet. In Step 1, we assume that, if the switch is the destination as specified in the packet's outermost header, but the packet also has inner MAC header(s), then Step 1 is executed with the VLAN tag and the destination of the next inner header. If there is no inner header, and the switch is connected to the destination host, it forwards the packet directly to the host. In all other cases, the packet is forwarded as described in the procedure. Now, in the event of a link failure at the current switch, the actions for recovering from the failure depend on two scenarios.
Scenario 1: As the forwarding neighbors, hence links, are the same to reach the destination and the root on the primary spanning tree, it implies that the failed link was used to reach the primary forwarding neighbor of the current switch in the directed spanning tree rooted at the root. Thus, we use the blue VLAN to reach the exit node of the current switch, which by definition is outside the sub-tree rooted at the current switch, and hence guaranteed to be connected to the switch with the destination host on the red VLAN, even after the link failure. To achieve this result, we require an encapsulation to reach the exit node of the current switch. From there, the packet can resume normal forwarding on the red VLAN towards the switch to which the host is attached.
For example, consider the failure of link E-B shown in Fig. 7(a). Consider a packet arriving at switch E that is destined to an end host attached to switch D. The destination would have been learned over the link E-B on the red tree. As the forwarding neighbor to reach the destination on the red tree, B, is the same to reach the root A on the red tree, the packet

Fig. 9. Networks considered for performance evaluation. (a) ARPANET (20 nodes, 32 links). (b) NSFNET (14 nodes, 23 links). (c) Node-16 (16 nodes, 24 links). (d) Node-28 (28 nodes, 42 links). (e) Mesh-4×4 (16 nodes, 32 links).

is encapsulated so that it first reaches the exit node of E, which is node G. This encapsulation can be done with an outer blue VLAN tag destined to G. Upon reaching G, the packet is forwarded on the red VLAN towards D, according to the unmodified original header.
Scenario 2: As the forwarding neighbors, hence links, are different to reach the destination and the root on the primary spanning tree, it implies that the failed link leads to a child of the current switch in the directed spanning tree rooted at the root. Thus, we use the primary red VLAN to reach the exit node of that child, then use the VLAN corresponding to the blue forest to reach the child. This approach requires two levels of encapsulation: one to reach the exit node of the child, and the second to reach the child itself. Essentially, we have used a combination of the red spanning tree and the blue forest to tunnel to the other side of the failed link. From there on, the packet can resume usual forwarding on the primary spanning tree towards the switch to which the host is attached.
For example, consider the failure of link A-B shown in Fig. 7(b). Consider a packet arriving at switch A that is destined to an end host attached to switch F. The destination would have been learned over the link A-B on the red tree. As the forwarding neighbor to reach the destination on the red tree, B, is different from that to reach the root A (itself in this case), the packet is encapsulated such that it first reaches the exit node of B, which is switch D, and then reaches switch B. This is achieved with an outermost red VLAN tag with destination D, an inner header with destination B on the blue VLAN tag, and the innermost header being the original one to reach node F on the red VLAN left untouched.

VI. PERFORMANCE EVALUATION
We evaluate the three approaches developed in this paper: the 3Trees, 2Trees, and 1Tree (3T, 2T, and 1T for short) approaches. We consider five topologies: NSFNET, ARPANET, and three hypothetical topologies Node16, Node28, and Mesh4×4, as shown in Fig. 9. The first two of these topologies are wide area networks. All the networks considered are three edge connected, with the exception of NSFNET, to which we added an additional link to make the network three edge connected. Thus, all three schemes developed in this paper can be employed on each of these topologies for failure recovery. We are interested in the average path length of the approaches under single link failures. This information will help quantify the deviation from the path lengths during normal forwarding in the absence of failures. Studying this deviation is useful because it is representative of the end-to-end delay observed by the destination due to the failure, and thus helps in provisioning buffer capacity for real time applications.
In the 3T and 2T approaches, we simply pick the red tree as the default tree. In the 3T approach, upon link failure, we check to see if the failed link is part of the blue VLAN. If so, we use the green VLAN, and otherwise we simply use the blue VLAN. In case of a link failure when using the 2T approach, the re-forwarding behavior is exactly as defined in Section IV. In the 1T approach, we use a shortest path tree to mimic the behavior of the spanning tree protocol as the default tree, and the re-forwarding behavior is as described in Section V. In all these approaches, the root of the trees is chosen based on the selection criteria outlined in Section VI.C.
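The 1T re-forwarding behavior of Section V can be sketched as a function that builds the header stack at the switch that detects the failure. This is a minimal illustration under stated assumptions: the function and parameter names are hypothetical, a header is modeled as a (VLAN, destination switch) pair, the exit-node table holds only the values quoted in the text for Fig. 6(c) (B and C exit at D, E exits at G), and the failure bit and the Step 1 decapsulation logic are left out.

```python
from typing import Dict, List, Optional, Tuple

Header = Tuple[str, str]  # (VLAN tag, destination switch)


def recovery_headers(switch: str,
                     next_hop_to_dest: str,
                     next_hop_to_root: Optional[str],
                     exit_node: Dict[str, str],
                     original: Header) -> List[Header]:
    """Header stack (outermost first) after the link to next_hop_to_dest fails.

    next_hop_to_root is None when `switch` is itself the root.
    exit_node maps a switch to its exit node; a switch is assumed to know
    the entry for itself and for each of its children on the red tree.
    """
    if next_hop_to_dest == next_hop_to_root:
        # Scenario 1: the failed link led towards the root (the parent).
        # One encapsulation on the blue VLAN to this switch's exit node.
        return [("blue", exit_node[switch]), original]
    # Scenario 2: the failed link led to a child. Reach the child's exit
    # node on the red VLAN, then the child itself on the blue VLAN.
    child = next_hop_to_dest
    return [("red", exit_node[child]), ("blue", child), original]


# Exit nodes quoted in the text for the example network of Fig. 6(c).
EXIT_NODE = {"B": "D", "C": "D", "E": "G"}

# Failure of E-B seen at E, host attached to D (Fig. 7(a)).
scenario_1 = recovery_headers("E", "B", "B", EXIT_NODE, ("red", "D"))
# Failure of A-B seen at the root A, host attached to F (Fig. 7(b)).
scenario_2 = recovery_headers("A", "B", None, EXIT_NODE, ("red", "F"))
```

Here `scenario_1` is `[("blue", "G"), ("red", "D")]` and `scenario_2` is `[("red", "D"), ("blue", "B"), ("red", "F")]`, matching the two worked examples in the text.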

In all these schemes, we first compare the path lengths under no failures to that using optimal shortest paths. We then evaluate the average path lengths under single link failures. Formally, we define the following path metrics that are useful in benchmarking and comparing the various schemes.

A. No Failures
We compute the ideal average path lengths in the network when there are no failures by computing the shortest path between every two nodes in the given network topology. Note that it may not be possible to obtain these average path lengths in reality in Layer-2 networks, as spanning trees are typically constructed with one node as root. Thus, while every node may have a shortest path to the root, the path between any two nodes is not guaranteed to be the shortest. This ideal metric, however, serves as a good benchmark for comparison. Consider the shortest path length from a node to a destination in a given network. Then, the average ideal path length under no failures, denoted as SPF-No Failure, is given by

(1)

which is normalized using the number of switches in the network.
Now, we consider the path length between two nodes on a spanning tree. Given the default spanning tree that is employed by any forwarding mechanism when the network does not have failures, consider the path length between a node and a destination on that tree. The default spanning tree for the 1T approach would be the shortest path tree rooted at the root node, and the resulting average is denoted as SPT-No Failure. For the 2T and 3T approaches, this would be the length on the red tree, and the average is denoted as Red Tree-No Failure.

(2)

It is useful to view the relation of the average default tree path length to the average shortest path length as the overhead (or stretch) involved in employing a single undirected spanning tree in Ethernet networks as opposed to using destination-rooted trees in IP networks.

B. One Link Failures
We are interested in computing the average path length when a link failure affects the default path from a source to a destination, which in turn implies only failures of links present on the default spanning tree. All other single link failure scenarios do not affect the re-forwarding schemes, and hence are not considered as they could falsely indicate better performance.
Consider the shortest path recovery length from a source node to a destination, averaged over all link failures that affect the default path between them, computed as

(3)

where the average is taken over the link failures, and normalized by the number of failure scenarios, that affect the default path. For a given failure, consider the switch that sees the failure, also known as the point of local repair. Then, the path length between the source and the destination under that failure is computed as the sum of 1) the path length from the source to the point of local repair on the default or primary VLAN, and 2) the shortest path length from the point of local repair to the destination in the network that has the failed link removed. Thus, the shortest recovery path represents the path that a packet would take until it sees the failure, added to the optimum path from the point of failure to the destination. The average ideal recovery path length under one failure in the network, denoted as SPF-One Failure, is now given by

(4)

Much like in the case of no failures, this metric is an average of optimal paths that one can achieve under single link failures, and hence serves as a good benchmark to evaluate each of the schemes.
We compute the path length under a single link failure for the fast recovery techniques developed as follows. The packet traverses from the source switch to the point of local repair as defined by the path in the default forwarding tree. From the point of local repair, the packet is forwarded on the recovery path dictated by the approach in question. The total path length for the packet is then computed as the sum of the default tree path length to the point of local repair and the recovery path length from there to the destination under the failure scenario, where the latter is different for each of the three approaches.
Now, consider the recovery path length from a source node to a destination, averaged over all link failures that affect the default path between them, computed as

(5)

(6)

C. Root Node Selection
We highlight that there is a choice to be made for the root, the node at which the tree(s) are constructed in each of the three different approaches, and that it can play a role in the performance of the various approaches. This is because we do not have the flexibility of constructing the trees per destination in any of the approaches. This was also alluded to earlier in Section I.A as a key difference between the approaches for Ethernet networks that retain backward learning as opposed to approaches for forwarding in IP networks or Ethernet networks that disable learning.
In choosing a root, we optimize for the scenario under which there are no failures. The rationale for this design choice is that we expect the network to be forwarding traffic under no failures more often than in the presence of failures. Thus, for each approach, we pick the node that minimizes the average distance over all source-destination pairs on the primary VLAN. This choice also implicitly assumes a uniform traffic matrix, and any a priori information on the traffic matrix can be included by appropriately adding weights to corresponding source-destination

Fig. 10. Impact of root node selection on single link failure performance of the three approaches. (a) ARPANET-1T., (b) ARPANET-2T. & ARPANET-3T.

TABLE II
AVERAGE BACKUP PATH LENGTHS USING THE 1T APPROACH

pairs. Formally, the root is picked as defined below.

(7)

where the distance is measured from node to node on the primary VLAN (SPT in the 1T approach, Red Tree in the other two approaches) for a fixed root. In our earlier descriptions of path lengths, we have simply dropped the superscript denoting the root node chosen according to (7) in the interest of clarity.

D. Performance Results
Before we delve into the performance results of the path lengths on the three approaches, we first briefly study the impact of the root selection. Fig. 10 shows the average path length under single link failures in each of the three approaches on one particular network, ARPANET, for different choices of the root node. The white bar denotes the optimal root chosen according to (7). We observe that, although the root selection has not been optimized for the failure performance, the variance in the average single link failure path length is marginal across different root node selections in all the approaches. We observe similar trends in all other networks as well, but omit the figures for brevity.
We now discuss the performance of path lengths under the choice of the corresponding root in each of the three approaches. The average path lengths across all the nodes in the network for the 1Tree, 2Tree, and 3Tree fast recovery techniques developed in this paper are shown in Tables II through IV. Some interesting observations we can draw from the results are discussed next.
We study the no-failure metric across the 1T, 2T, and 3T approaches. This metric represents the average path length on the primary tree in the absence of failures. We see in Tables II, III, and IV that the difference among the approaches is marginal across all the considered networks. The interesting takeaway from this observation is that the choice of the default tree under no failure could be different from the shortest path tree constructed by the spanning tree algorithms without significantly increasing the path length when the network does not have any failures. The reason for this result is that, even in the case of the tree constructed by spanning tree algorithms, the algorithm merely optimizes the paths from all nodes to the root, and thus does not necessarily do well when all pairs are accounted for. Hence, it may not be necessary to dedicate an extra VLAN tag to use the shortest path tree with respect to the root or a specific spanning tree as the default VLAN in the 3T and 2T approaches. This feature may be attractive as these approaches involve no or fewer encapsulations as opposed to the 1T approach. However, for the cases in which a default tree has to be fixed for reasons besides path length optimization (such as policy, economics, etc.), the 1T approach allows for such flexibility absent in the other approaches.

TABLE III
AVERAGE BACKUP PATH LENGTHS USING THE 2T APPROACH

TABLE IV
AVERAGE BACKUP PATH LENGTHS USING THE 3T APPROACH

Next, we study the key performance metric for evaluating the average path length under single link failures. We see from Tables II, III, and IV that in the 1T and 3T approaches, across all networks considered, we have a stretch of 1.33x with respect to the average of the optimal recovery path lengths under single link failures. The 2T approach, on the other hand, has a stretch of 1.82x with respect to the same benchmark. This result is due to the fact that the link-independent trees in the 2T approach are not constrained in any fashion. They could each be of arbitrary depth, and hence re-forwarding paths that need to use the root as a via-point (which is the case for the majority of failures) become lengthy. In the 3T approach, although we employ link-independent trees, we avoid detours involving the root, thus mitigating the stretch involved. In the 1T approach, because the default tree is an SPT, the height of the tree is kept low, which helps keep the detour paths involving the root to shorter lengths. Hence, we suggest that the 2Trees approach be employed only if the network is sparsely (less than three-edge) connected, and two header encapsulations (required in the 1Tree approach) may need to be avoided for packet re-forwarding. Based on the performance results, we observe that the 3Tree approach provides a good compromise in terms of the path lengths achieved and the overhead involved in fast recovery.
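The root selection rule (7) that underlies these results can be sketched for the 1T case as follows. This is a minimal illustration under stated assumptions: links are unweighted (hop count), a breadth-first tree stands in for the shortest path tree, the traffic matrix is uniform as in the text, the average is taken over unordered node pairs (a constant normalization factor does not change which root is chosen), ties are broken by node name, and the four-node topology and all function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_tree(adj, root):
    """Parent pointers of a hop-count shortest path tree rooted at `root`."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u]):  # sorted only to make the tree deterministic
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def tree_distance(parent, s, d):
    """Hop count between s and d along the tree given by `parent`."""
    def path_to_root(x):
        path = []
        while x is not None:
            path.append(x)
            x = parent[x]
        return path

    hops_from_s = {node: i for i, node in enumerate(path_to_root(s))}
    for j, node in enumerate(path_to_root(d)):
        if node in hops_from_s:  # first common ancestor of s and d
            return hops_from_s[node] + j
    raise ValueError("nodes are not in the same tree")


def average_tree_path_length(adj, root):
    """Average tree distance over all node pairs for a given root."""
    parent = bfs_tree(adj, root)
    pairs = list(combinations(sorted(adj), 2))
    return sum(tree_distance(parent, s, d) for s, d in pairs) / len(pairs)


def select_root(adj):
    """Root minimizing the average pairwise distance on its tree."""
    return min(sorted(adj), key=lambda r: average_tree_path_length(adj, r))


# Hypothetical 4-node topology: a ring A-B-C-D with the chord A-C.
TOY = {"A": {"B", "C", "D"}, "B": {"A", "C"},
       "C": {"A", "B", "D"}, "D": {"A", "C"}}
```

On this toy topology, rooting the tree at A (or C) yields a star with an average pairwise distance of 1.5 hops, whereas rooting it at B or D yields 10/6 hops, so `select_root(TOY)` returns `"A"`. A non-uniform traffic matrix could be accommodated by weighting each pair in the sum, as the text suggests.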
Between the 2T and 1T approaches, we observe that the 1T approach provides better path lengths even under single failure scenarios compared to the 2Tree approach. The performance difference is approximately 40% between the two approaches. The tradeoff is in terms of employing two levels of encapsulation in the case of the 1T approach versus one level of encapsulation in the 2T approach.

VII. CONCLUSION
In this paper, we develop three different techniques for achieving fast recovery in Ethernet networks. Two of the approaches (2Tree, 1Tree) are applicable to networks that are at least two edge connected, and one of them (3Tree) is applicable to networks that are at least three edge connected. We show that the approaches provide a tradeoff in terms of path length performance, and the techniques required for achieving fast recovery, such as the number of VLAN tags, dependence on VLAN rewrite, and mac-in-mac encapsulations. Based on the performance results, the approach employing three spanning trees offers a better tradeoff in path length across all the networks considered in this paper.

APPENDIX
The primary tree used in the ESCAP algorithm, when made undirected, is referred to as the red VLAN. Similarly, making all the backup arcs in ESCAP undirected, we get sub-graphs that could be disconnected from one another. We refer to this collection as the blue VLAN. We first show that the blue VLAN could have cycles in it. This condition is not a concern in ESCAP because the incoming link information and the fact that the primary tree is directed and rooted at a node help avoid looping on the backup arcs. However, when we make the paths undirected so that they can be used symmetrically in Ethernet networks, and employ a VLAN ID, we cannot afford to have a cycle as this can cause inconsistencies in the backward learning mechanism on the VLAN, and potential broadcast storm problems as well. In this Appendix, we show how such cycles can be avoided by making a small modification to the original ESCAP algorithm.
Lemma 1: If there exists an undirected cycle on the blue VLAN, then the directed arcs that make up the cycle must also form a directed cycle.
Proof: We prove the result by contradiction. Suppose we have an undirected cycle, and consider the direction of the arcs that created it. Assume that it is not a directed cycle. Then, there is at least one node on the cycle that has two outgoing arcs. However, this is not possible because each node has at most one outgoing arc for backup. The root has none, while all other nodes have exactly one. Thus, we are only concerned with the possibility of a directed cycle.

Fig. 11. Example of a directed cycle when ESCAP is employed. (a) Example network. (b) Primary tree and backup arcs on the example network.

Fig. 11 shows an example network alongside its primary tree and backup port assignments. It can be seen that a cycle is present on the blue VLAN (dotted).
To avoid such directed cycles, we now propose the following modification to the ESCAP algorithm.
Modification to ESCAP: Whenever a node gets picked as the exit node during ESCAP through some exit link, and that node does not have a backup neighbor assigned thus far, we force the node at the other end of the exit link to be its exit node. This modification to the original ESCAP algorithm for single link failures is denoted as -ESCAP.

Fig. 13. Illustration of a scenario that cannot happen in ESCAP. Nodes and
are both in . Node has no backup arc defined yet but node has its
backup arc (and exit link) defined as .
Fig. 12. Illustration of -ESCAP, forcing the backup forwarding neighbor
for an exit node whose backup has not yet been defined, (a) Exit link in
ESCAP, (b) Forcing link in -ESCAP.

We will show that -ESCAP 1) is one of the several outputs that the original ESCAP algorithm would compute, and thus does not violate any of the algorithm's properties; and 2) avoids the formation of directed cycles of length greater than two, thus ensuring that the undirected graph of the backup arcs is a collection of trees.
Theorem 1: -ESCAP is simply one of the several possible outputs of the ESCAP algorithm.
Before we prove the theorem, it is worth noting that the original algorithm can produce several valid outputs (namely the set of backup forwarding ports) because it relies on tree traversal techniques, such as breadth-first search (BFS) or depth-first search (DFS), to compute the backup forwarding neighbors. The outputs of such tree traversal algorithms can differ depending on the implementation (data structures used to represent the graph). In our proofs, we will assume that the primary tree in ESCAP is explored using a DFS, and the backup forwarding ports are computed using a BFS.
Proof: Consider a link, , that is chosen as an exit link for some node (maybe itself or some other node ). The link is not present on the primary tree by definition. Now, the original ESCAP and -ESCAP differ only in the case in which node does not already have a backup neighbor assigned. In such a scenario, we force to be the exit node for , thus creating directed cycles of length two. This forcing is illustrated in Fig. 12(a) and (b).
We now consider the later stage at which node would have its backup path computed in the original ESCAP algorithm. Such a computation would be instigated either for the failure of node 's primary forwarding link, or for the failure of a primary forwarding link of some ancestor of which tries to use to exit the primary tree. Without loss of generality, let us refer to this node that instigates the backup path computation of node as node .
We consider the computations in ESCAP for . Now, given that the link is not on the primary tree, we have two possibilities for the backup path computation for node .
1) The link is outside and hence safe to use.
2) The link is inside and hence cannot be used.
Case 1: The first possibility ensures that node would pick node or some other 1-hop neighbor, the link to whom is not on the primary tree, as the exit link. Hence, -ESCAP is simply one of the choices in ESCAP as desired, and we are done. We note that node will not pick a longer path in ESCAP because is a neighbor that is only one hop away.
Case 2: We now show that the second possibility will never occur in ESCAP, which will complete the proof. Consider Fig. 13 for reference. We know that node has a backup arc and exit link defined at an earlier stage of ESCAP, which implies that some ancestor of (or itself) is using to exit the primary tree. Now, this ancestor, say node , could either be in or an ancestor of on the primary tree.
Case 2(a): Node cannot be in because node itself does not have a backup path yet, and thus, no link failure within will have been considered in the DFS on the primary tree.
Case 2(b): Consider now the case where node is an ancestor of node , and is using the link present in as its exit link. Then, if node were to use to exit the primary tree, then it must also use node because the backup path of any node will have to stay on the primary tree until the last hop on the exit link, . But, because node and hence 's backup path is yet to be computed, node cannot already have a backup path, leading to a contradiction.
Hence, the possibility of link being inside will never occur, completing the proof.

Fig. 14. Illustration for the contradiction of the existence of a directed cycle in -ESCAP.

Theorem 2: -ESCAP guarantees that there is no directed cycle consisting of backup arcs of length greater than two.
Proof: We prove this theorem by contradiction. Assume that there exists a directed cycle of length . Let the nodes on the cycle be denoted as . Now all links on this cycle cannot also be on the primary tree (in the opposite direction) by the definition of a tree. Thus, there exists at least one exit link on this cycle. Let us call the set of exit links on this cycle as , and denote the exit links as which appear in the same order on the directed cycle. Consider Fig. 14 for reference, which shows a directed cycle of backup arcs with the exit links marked on the cycle.
Consider the stage at which the backup arc for node which is exit link is computed. Now, according to -ESCAP, the only reason why will not pick as its backup forwarding
neighbor is because it already has a backup path and exit link computed. Also, because every node has exactly one backup forwarding neighbor in ESCAP, and hence, also in -ESCAP from Theorem 1, we know that the exit link for should also belong to the directed cycle, and hence is . Now, by extension of the same argument, we can see that should also have been assigned an exit link. Continuing the argument in this fashion, we arrive at the last exit link on the directed cycle, . It is easy to see that, if a directed cycle as in Fig. 14 were to exist, then, node will have to have its exit link as . This result would imply that already has a backup neighbor assigned, which contradicts the assumption we started with; namely, that we are at the stage at which the exit link for node is to be computed. We note that this proof argument holds for any number of exit links on the cycle.
Corollary 1: The undirected graph of the set of directed backup arcs does not contain any cycles. Thus, the blue VLAN is a forest (of trees) as required.
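The two properties stated in Theorem 2 and Corollary 1 can be checked mechanically on a given assignment of backup arcs. The sketch below is only an illustration, not the authors' implementation: it assumes the backup arcs are available as a dictionary mapping each node to its single backup forwarding neighbor, and the names `max_backup_cycle_length` and `backup_arcs_form_forest` are hypothetical. A directed cycle of length two collapses into one undirected edge, which is why such cycles do not prevent the undirected graph from being a forest.

```python
from typing import Dict, Hashable, Set, FrozenSet

Node = Hashable


def max_backup_cycle_length(backup: Dict[Node, Node]) -> int:
    """Longest directed cycle when every node has at most one backup arc.

    `backup` is an assumed representation: node -> backup forwarding neighbor.
    """
    longest = 0
    visited: Set[Node] = set()
    for start in backup:
        if start in visited:
            continue
        position: Dict[Node, int] = {}  # order of nodes on the current walk
        node = start
        while node in backup and node not in visited:
            visited.add(node)
            position[node] = len(position)
            node = backup[node]
        if node in position:  # the walk closed on itself: a new cycle
            longest = max(longest, len(position) - position[node])
    return longest


def backup_arcs_form_forest(backup: Dict[Node, Node]) -> bool:
    """True if the undirected graph of the backup arcs has no cycle."""
    edges: Set[FrozenSet[Node]] = set()
    for u, v in backup.items():
        if u == v:  # a self-loop is already a cycle
            return False
        edges.add(frozenset((u, v)))  # arcs u->v and v->u become one edge

    parent: Dict[Node, Node] = {}

    def find(x: Node) -> Node:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for edge in edges:
        u, v = tuple(edge)
        root_u, root_v = find(u), find(v)
        if root_u == root_v:  # u and v already connected: edge closes a cycle
            return False
        parent[root_u] = root_v
    return True


# Hypothetical example: a and b point at each other (a cycle of length two),
# while c and d chain towards a.
ok = {"a": "b", "b": "a", "c": "a", "d": "c"}
# Hypothetical counterexample: a directed cycle of length three.
bad = {"a": "b", "b": "c", "c": "a"}
```

In this example, `ok` has a longest directed cycle of length two and passes the forest check, whereas `bad` contains a cycle of length three and fails it.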
Soc. (INFOCOM 2004), 2004, vol. 4, pp. 2283–2294.
[23] J. Farkas, C. Antal, L. Westberg, A. Paradisi, T. Tronco, and V. Garcia
REFERENCES de Oliveira, “Fast failure handling in ethernet networks,” in Proc. IEEE
Int. Conf. Commun. (ICC’06) , June 2006, vol. 2, pp. 841–846.
[1] A. Gopalan and S. Ramasubramanian, “Fast recovery from link failures [24] J. Farkas, C. Antal, G. Toth, and L. Westberg, “Distributed resilient
in ethernet networks,” in Proc. 9th Int. Conf. Design Rel. Commun. architecture for ethernet networks,” in Proc. 5th Int. Workshop Design
Netw., Budapest, Hungary, Mar. 2013, pp. 1–10. Rel. Commun. Netw. (DRCN 2005). , Oct. 2005, p. 8, pp.
[2] “Metro Ethernet Services - A Technical Overview",” [Online]. [25] M. Huynh, P. Mohapatra, and S. Goose, “Cross-over spanning trees
Available: http://www.metroethernetforum.org/Assets/White_Pa- enhancing metro ethernet resilience and load balancing,” in Fourth
pers/Metro-Ethernet-Services.pdf Int. Conf. Broadband Commun. Netw. Syst. (BROADNETS 2007), Sep.
[3] M. Ali, G. Chiruvolu, and A. Ge, “Traffic engineering in metro eth- 2007, pp. 251–260, Ed..
ernet,” IEEE Netw., vol. 19, no. 2, pp. 10–17, Mar.–Apr. 2005. [26] M. Huynh, P. Mohapatra, and S. Goose, “Spanning tree elevation
[4] R. Sofia, “A survey of advanced ethernet forwarding approaches,” protocol: Enhancing metro ethernet performance and qos,” Comput.
IEEE Commun. Surveys Tutorials, vol. 11, no. 1, pp. 91–115, 2009. Commun. vol. 32, no. 4, pp. 750–765, Mar. 2009 [Online]. Available:
[5] Standard for Local and Metropolitan Area Networks—Rapid Reconfig- http://dx.doi.org/10.1016/j.comcom.2008.12.001, [Online]. Available:
uration of Spanning Tree, IEEE 802.1w, 2001. [27] J. Qiu, M. Gurusamy, K. C. Chua, and Y. Liu, “Local restoration with
[6] Standard for Local and Metropolitan Area Networks—Multiple Span- multiple spanning trees in metro ethernet networks,” IEEE/ACM Trans.
ning Trees, IEEE 802.1s.. Netw., vol. 19, no. 2, pp. 602–614, Apr. 2011.
[7] D. Katz and D. Ward, “Bidirectional forwarding detection (bfd),” pre- [28] B. Yener, Y. Ofek, and M. Yung, “Convergence routing on dis-
sented at the Internet Eng. Task Force (IETF), RFC 5880,, June 2010, joint spanning trees,” Comput. Netw. vol. 31, no. 5, pp. 429–443,
ISSN 2070-1721. 1999 [Online]. Available: http://www.sciencedirect.com/science/ar-
[8] M. Médard, S. G. Finn, and R. A. Barry, “Redundant trees for pre- ticle/B6VRG-3W377PT-T/2/870c35f216bf3252b5f1234aea75e2b5,
planned recovery in arbitrary vertex-redundant or edge-redundant [Online]. Available:
graphs,” IEEE/ACM Trans. Netw., vol. 7, no. 5, pp. 641–652, 1999. [29] W. T. Tutte, “On the problem of decomposing a graph into n connected
[9] G. Xue, L. Chen, and K. Thulasiraman, “Quality-of-service and factors,” J. London Math. So., vol. s1-36, no. 1, pp. 221–230, 1961.
quality-of-protection issues in preplanned recovery schemes using [30] N. C. S. J. A. Williams, “Edge-disjoint spanning trees of finite graphs,”
redundant trees,” IEEE J. Sel. Areas Commun., vol. 21, no. 8, pp. J. London Math. Soc., vol. 36, 1961.
1332–1345, Oct. 2003. [31] J. Qiu, Y. Liu, G. Mohan, and K. C. Chua, “Fast spanning tree recon-
[10] G. Jayavelu, S. Ramasubramanian, and O. Younis, “Maintaining nection for resilient metro ethernet networks,” in Proc. IEEE Int. Conf.
colored trees for disjoint multipath routing under node failures,” Commun. (ICC’09.), June 2009, pp. 1–5.
IEEE/ACM Trans. Netw., vol. 17, no. 1, pp. 346–359, Feb. 2009. [32] J. Qiu, Y. Liu, G. Mohan, and K. C. Chua, “Fast spanning tree re-
[11] A. Gopalan, “Graph Algorithms for Network Tomography and Fault connection mechanism for resilient metro ethernet networks,” Comput.
Tolerance” Ph.D. dissertation, Dept. of Electrical and Computer Engi- Netw., vol. 55, no. 12, pp. 2717–2729, Aug. 2011.
neering, Univ. Arizona, Tuscan, AZ, USA, 2013 [Online]. Available: [33] P. M. V. Nair, S. V. S. Nair, M. Marchetti, G. Chiruvolu, and M. Ali,
http://arizona.openrepository.com/arizona/handle/10150/301548, “Bandwidth sensitive fast failure recovery scheme for metro ethernet,”
[Online] Available: Comput. Netw., vol. 52, no. 8, pp. 1603–1616, June 2008.
[12] Standard for Media Access Control (MAC) Bridges and Virtual Bridge [34] 802.1ah - provider backbone bridges. [Online]. Available: [Online].
Local Area Networks, IEEE 802.1Q, 2011. Available: http://www.ieee802.org/1/pages/802.1ah.html
[13] IEEE Standard for Local and Metropolitan Area Networks—Media [35] R. E. Tarjan, “Edge-disjoint spanning trees and depth-first search,”
Access Control (MAC) Bridges and Virtual Bridged Local Area Net- Acta Informatica vol. 6, pp. 171–185, 1976 [Online]. Available:
works—Amendment 20: Shortest Path Bridging, IEEE Std 802.1aq- http://dx.doi.org/10.1007/BF00268499, 10.1007/BF00268499. [On-
2012 (Amendment to IEEE Std 802.1Q-2011 as amended by IEEE Std line]. Available:
802.1Qbe-2011, IEEE Std 802.1Qbc-2011, IEEE Std 802.1Qbb-2011, [36] J. Roskind and R. E. Tarjan, “English A note on finding minimum-
IEEE Std 802.1Qaz-2011, and IEEE Std 802.1Qbf-2011), 2012, pp. cost edge-disjoint spanning trees,” English Math. Operations Res. vol.
1-340. 10, no. 4, pp. 701–708, 1985 [Online]. Available: http://www.jstor.org/
[14] A Framework for IP and MPLS Fast Reroute Using Not-via Addresses stable/3689437, [Online]. Available:
draft-ietf-rtgwg-ipfrr-notvia-addresses-10, [Online]. Available:, Dec. [37] H. Nagamochi and T. Ibaraki, “A linear-time algorithm for finding a
2012 [Online]. Available: https://datatracker.ietf.org/doc/draft-ietf- sparse k-connected spanning subgraph of a k-connected graph,” Algo-
rtgwg-ipfrr-notvia-addresses/ rithmica vol. 7, pp. 583–596, 1992 [Online]. Available: http://dx.doi.
[15] Algorithms for computing Maximally Redundant Trees for org/10.1007/BF01758778, 10.1007/BF01758778. [Online]. Available:
IP/LDP Fast- Reroute draft-enyedi-rtgwg-mrt-frr-algorithm-03, [38] A. Gopalan and S. Ramasubramanian, On Constructing Three Edge
[Online]. Available:, Oct. 2013 [Online]. Available: https://data- Independent Spanning Trees , Technical Report, 2011 [Online]. Avail-
tracker.ietf.org/doc/draft-enyedi-rtgwg-mrt-frr-algorithm/ able: http://srini.ca/p/3trees.pdf, [Online]. Available:, Univ. Arizona
Abishek Gopalan received the B.E. degree in electronics and communication engineering from Visvesvaraya Technological University, India, in 2007, and the Ph.D. degree in electrical and computer engineering from The University of Arizona, Tucson, in 2013. His research interests are in graph theory, algorithms, network design and architecture, fault tolerant routing, and network tomography. He has been a student intern at the Indian Institute of Science, Bangalore, India, where he has worked on modeling and performance analysis of optical burst switched networks. He has been a visiting research student at Raman Research Institute, Bangalore, India, where he was involved in the design of radio telescope arrays.

Srinivasan Ramasubramanian (S’99–M’02–SM’08) received the B.E. (Hons.) degree in electrical and electronics engineering from Birla Institute of Technology and Science (BITS), Pilani, India, in 1997, and the Ph.D. degree in computer engineering from Iowa State University, Ames, in 2002. He is currently an Associate Professor in the Department of Electrical and Computer Engineering at the University of Arizona, where he held the position of Assistant Professor from August 2002 to July 2008. He is a co-developer of the Hierarchical Modeling and Analysis Package (HIMAP), a reliability modeling and analysis tool, which is currently being used at Boeing, Honeywell, and several other companies and universities. His research interests include architectures and algorithms for optical and wireless networks, multipath routing, fault tolerance, monitoring and localization, network tomography, and performance analysis. He has served as the TPC Co-Chair of BROADNETS 2005, ICCCN 2008, and ICC 2010 conferences and LANMAN 2010 Workshop. He has served on the editorial board of the Springer Wireless Networks Journal from 2005 to 2009. He is currently an Associate Editor for the IEEE/ACM TRANSACTIONS ON NETWORKING.