

12, DECEMBER 2000


A Hierarchical Multilayer QoS Routing System with Dynamic SLA Management

Atsushi Iwata and Norihito Fujita
Abstract: This paper proposes a hierarchical multilayer QoS routing system with dynamic SLA management for large-scale IP networks. A promising approach to providing QoS in large-scale IP networks, a mixture of DiffServ-based QoS management and MPLS-based traffic engineering, has recently been actively discussed. However, the introduction of QoS exacerbates the already existing scalability problems of the standard IP routing protocols. To address this issue, we propose a new scalable routing framework based on hierarchical QoS-aware path computation. We augment the existing OSPF and CR-LDP protocols to support hierarchical QoS routing, QoS aggregation, and QoS reservation in our MPLS-DiffServ-based hierarchical routing network. To provide additional flexibility and cost-efficiency, we augment the network with a policy server that is capable of dynamically handling SLAs between the networks and providing load-balancing management within the network. We implement a prototype of the proposed framework and study its performance with a virtual network simulator and a specially designed QoS routing algorithm simulator. In our simulations, we evaluate both the implementation complexity and the algorithms' performance; the results demonstrate the efficiency of the framework and its advantages over the existing proposals.

Index Terms: Aggregation, DiffServ, hierarchical network, MPLS, path computation, QoS, routing, SLA.

I. INTRODUCTION

The IP differentiated services (DiffServ) approach [2] has been advanced as an efficient and scalable traffic management mechanism that guarantees QoS in a large-scale network without using per-flow resource reservation and per-flow signaling (e.g., integrated services). In DiffServ, the resource provisioning for the core routers is performed by the bandwidth broker [1] in a centralized manner. This provisioning has a traffic-class-based granularity, while per-flow policing and shaping are done only at the ingress routers. The drawback of the centralized approach, however, is the processing bottleneck at the bandwidth broker. Several recent papers have therefore proposed that DiffServ traffic management [22], [24], [26] be combined with a more scalable distributed resource reservation signaling approach, multiprotocol label switching (MPLS). MPLS uses label-switching technology to aggregate a large number of IP flows onto a label at an ingress router, and it supports label-based (or aggregated-flow-based) dynamic QoS management. The label-switched path between the ingress
Manuscript received October 15, 1999; revised April 15, 2000. The authors are with Computer & Communication Media Research, NEC Corporation, Kawasaki, Kanagawa 216-8555 Japan. Publisher Item Identifier S 0733-8716(00)09230-1.

router and egress router can be established by distributed routing and signaling, which supports constraint-path routing capability with resource reservation for the path [23]. MPLS also provides another useful mechanism, multipath routing, which utilizes network resources more effectively by using various hop routes toward the destination [26]. The technique for such improvements is called traffic engineering, which most Internet service providers (ISPs) are willing to use for effective network management. As a distributed QoS path computation mechanism, an interior gateway protocol (IGP, such as OSPF) is also being extended at the Internet Engineering Task Force (IETF) to support advertisement of QoS-related information in addition to the existing topology information [20]. The IETF, however, is specifying only the control packet formats for this advertisement without standardizing QoS-aware path computation algorithms. Several approaches to finding the path that provides the best QoS have been proposed, and they can be broadly classified into three categories [3]: source routing [4]-[6], distributed hop-by-hop routing [4], [15], [9], [10], and hierarchical source routing [11]-[13]. Because of the scalability problems of source routing and the loop problems of distributed routing, hierarchical source routing has been regarded as the most promising scalable QoS routing approach [3], [12], [13]. It has been used in ATM networks as the PNNI routing protocol [11]. The issue of supporting such hierarchical QoS routing in an MPLS-DiffServ environment has not been addressed so far. A hierarchical QoS network has to use QoS aggregation within its layers, yet there have been no proposals specifying the mechanisms of an appropriate QoS aggregation. In this paper, we address this missing piece, which is a necessary building block for scalable QoS-aware IP networks.
We therefore propose to augment the existing IP routing and MPLS signaling protocols to support hierarchical QoS routing and signaling. The routing system using the extended protocols can 1) disseminate QoS information parameters, 2) aggregate QoS information to higher level of the hierarchy, and 3) provide the exact hierarchical QoS path computation and resource reservation. The fundamental protocols we used were the OSPF routing protocol [17] and the signaling protocol, constraint route based label distribution protocol (CR-LDP) [23]. The system is also equipped with a policy server that handles the dynamic management of the service level agreement (SLA) between the service provider and the users and that provides the load-balancing management needed for improving network utilization. These features can be used by network operators to provide more flexible and cost-effective service to the customers.

0733-8716/00$10.00 © 2000 IEEE



Fig. 1. Proposed multilayer QoS routing scheme.

The rest of this paper is organized as follows. Section II describes the hierarchical multilayer QoS routing system with dynamic management of the SLA, and discusses several of the components required. Section III describes the prototype system, and Section IV presents and discusses the results of the performance evaluation. Section V concludes this paper by summarizing it briefly.

II. HIERARCHICAL MULTILAYER QOS ROUTING SYSTEM

A. Overview

Fig. 1 depicts the proposed multilayer QoS routing system for an IP core network interconnecting several ISPs. It lets them use the core IP network as a traditional best-effort IP routing network, and also lets them use it as a QoS-guaranteed network by establishing virtual leased lines (VLLs) over the network. We assume that these VLLs can be created dynamically and that their QoS parameters (e.g., bandwidth and delay) can be changed on demand according to the actual or anticipated traffic load. The main reason for introducing the dynamic VLLs is to ensure the end-to-end QoS, such as packet delay and loss, specified in the SLA between the ISP and its users. This SLA is the user-SLA, and the SLA between the ISP and the core IP network is the ISP-SLA in Fig. 1. While the user-SLA for each DiffServ class is essentially static, the ISP-SLA is dynamic and is affected by the number of active users of the specified class. By making use of such dynamic VLL services, ISPs can replace the existing leased line services with VLLs, and can significantly reduce the total cost of the leased line. To support such services, we propose a hierarchical multilayer QoS routing system consisting of 1) dynamic SLA management of ISP-SLA (policy server), 2) hierarchical QoS reservation for aggregated IP flows (hierarchical CR-LDP signaling), and 3) hierarchical QoS routing (hierarchical QoS-enabled OSPF (QOSPF)).
We assume that the IP core network is administered by a single carrier as a single autonomous system (AS), in which the QOSPF is used for intradomain path computation and the border gateway protocol (BGP) [16] is used for interdomain path computation. The overall procedures of the proposed system are illustrated in Fig. 1. Basically, BGP runs on each ingress router to establish external BGP sessions (e-BGP), based on the network administrator's policy, for exchanging interdomain reachability information with surrounding ISP routers. Internal BGP sessions (i-BGP), in which ingress and egress routers pass this reachability information through the core IP network, are also established. Thus, each ingress router knows the egress router address associated with each reachable destination address.

The policy server manages each ISP-SLA, and the negotiated SLA parameters (IP src-dst addresses, bandwidth, etc.) are downloaded to the policy agent on the ingress router of the IP core network. The policy agent conveys the parameters to the CR-LDP module [23] so that it can establish the MPLS label-switched path (LSP) from the ingress to the egress router and reserve resources along the path. The address of this egress router is resolved by using the BGP reachability information described in the preceding paragraph. Although TE-RSVP signaling [23] could be used in this system, we use CR-LDP because it is easier to implement while offering the same functionality as TE-RSVP, and because it uses hard-state-based LSP maintenance, which requires significantly fewer control packets than soft-state-based TE-RSVP does.

Path computation for the LSP is performed by the QOSPF module. Integrated IS-IS [18] could also be used as an alternative routing protocol, but we used OSPF for our system because it is more widely used in the IP network environment. The proposed extensions of QOSPF are QoS parameter aggregation, hierarchical QoS link state advertisement (LSA), and hierarchical QoS path computation. The proposed path computation uses a combination of multiple static-link-metric-based precomputations and on-demand computations to reduce both the blocking probability and the average delay of finding a QoS-available path. These static multiple precomputed paths are also used for QoS aggregation, which reduces the computational overhead of updating QoS summary LSAs.
The proposed extension of CR-LDP is hierarchical source routing with crankback capability. This extension can be very useful for finding an alternative route on demand if the path specified by the QOSPF is unavailable because of a mismatch between local and global QoS information. The QOSPF link state information stored at each router is not very accurate, because it is not updated frequently and because the propagation delay of link state update messages is not negligible [12], [13]. In hierarchical networks, the aggregation of QoS information can cause such mismatches even more often.

B. Dynamic SLA Management: Policy Server

When an IP core network is given a new set of service features or functions, it is important that the changes on the user side (i.e., those on the ISP side) be minimized. The routers of the ISP should be usable just as they were before, even if there are many changes in the core network. In this context, it may seem attractive to take a centralized approach in which a central policy server provides a user interface that exchanges the dynamic SLA negotiation parameters over a secured communication channel, performs a centralized QoS path computation, and controls the routers inside the IP core network. As pointed out earlier, however, this approach leads to performance bottleneck problems when the network size becomes large. We therefore propose a decentralized approach in which the central policy server only performs SLA management, while the QoS path computation and resource reservation are performed in the routers in a distributed manner.

Fig. 2 depicts the SLA negotiation procedure between an ISP and the core IP network. The ISP administrator who wants to create or change the VLL uses the policy client to issue an SLA request that specifies the source and destination ISP addresses, aggregated IP flow information, bandwidth, and QoS parameters. When the policy server receives this request, it downloads the parameters onto the policy agent in the appropriate ingress router (Node A in Fig. 2), which in turn establishes an LSP using the CR-LDP signaling.

Fig. 2. SLA negotiation procedures.

C. Constraint-Route-Based MPLS Signaling: CR-LDP

This subsection describes the standard behavior of CR-LDP hierarchical signaling, and discusses the behavior of the proposed extension, hierarchical crankback routing.

1) Hierarchical Signaling with CR-LDP: The interaction of the proposed CR-LDP, BGP, and QOSPF modules during the QoS path computation is shown in Fig. 3. When the CR-LDP module receives a trigger for setting up an LSP, it asks the BGP module to resolve the address of an egress router toward the destination IP addresses. It then asks the QOSPF module to find the best QoS-guaranteed path to that egress router. The OSPF routing protocol can have a two-layer hierarchy (i.e., area and backbone area) for supporting a large network within an AS. For hierarchical-source-route-based QoS path computation [11]-[13], the QOSPF module performs QoS path computation at three locations, namely at the ingress router of each of the ingress, backbone, and egress areas. The QOSPF module calculates the path as a combination of a strict source route within the area and a loose source route beyond the area. In the example of hierarchical CR-LDP signaling behavior, shown in Fig.
4, the ingress node A.1 [in ingress area (A.*)] sends the label request message to the egress node C.1 [in egress area (C.*)], behind which the destination (D.*) is located. The ingress node of area A.* specifies the route by [A.2, A.3, A.n][D.*], where A.2 is the next hop router of the ingress router and A.n is an area border router (ABR) between areas A.* and B.*. A label request message follows the source route specified in the message all the way to

Fig. 3. Interaction of the CR-LDP, BGP, and QOSPF modules.

Fig. 4. Hierarchical CR-LDP signaling and QOSPF routing.

the next hop ABR. This ABR in turn calculates the strict source route to the next hop ABR toward the IP destination and attaches (or pushes) a source route path onto the carried source route to get a new path [B.2, B.3, B.n][D.*]. This behavior continues recursively until the label request message reaches the egress router C.*. During the LSP setup, the requested bandwidth and DiffServ traffic class in the message are examined at each transit router by the call admission control (CAC) to see whether or not it is possible to establish the LSP [23]. Note that A, B, C, and D represent the network prefixes of the IP addresses, and that * represents the host parts.

2) Hierarchical Crankback Extension for CR-LDP: As discussed in [12] and [13], the resources needed for establishing a path along the source route specified by QOSPF may not be available because the QoS information provided by QOSPF is inaccurate. We therefore propose that the CR-LDP signaling have a hierarchical crankback capability [25], which in such a case sends back to the ingress border router of each area a message instructing that node to find another path. We have defined a private extension, the crankback TLV [25]. If a transit router finds that the available resources along the requested path are insufficient, it notifies the ingress border router of each area by putting



into the signaling message the ID of the blocked link or node. The ingress border router receiving the signaling message with the crankback TLV computes an alternative path within the area by temporarily pruning the unavailable link or node from its topology database. If no new path can be found at an ABR (or if the request is blocked again), however, the signaling message is returned to the ingress router of the previous area in order to trigger alternative routing there.

D. QoS-Enabled OSPF Routing: QOSPF

The QoS-enabled OSPF (QOSPF) protocol is being standardized in the IETF by extending the OSPF protocol to support QoS link state parameters. It is designed to collect and maintain the QoS topology map used for QoS path computation. This subsection briefly describes previous QOSPF approaches [10], [26], then explains our proposed extension of the link state advertisement (LSA), the resource LSA, and discusses the proposed QoS path computation scheme using this extension. These proposals are explained for both flat and hierarchical networks.

1) Existing Problems of Previous QOSPF Approaches: Two encoding schemes for OSPF QoS extensions have been proposed: type-of-service (TOS)-metrics-based encoding [10] and opaque-LSA encoding [19]. Although the TOS-metrics-based encoding supports backward compatibility, it restricts the encoding of extended parameters and is not flexible enough to accommodate possible future extensions (e.g., other QoS extensions and traffic engineering extensions) [20], [26]. We therefore chose the opaque-LSA encoding for the proposed QoS LSA, the resource LSA [21].

Recent papers on QoS path computation [9], [10] have proposed a QoS-based precomputation scheme: the precomputation module computes new optimized (or load-balanced) paths and updates the routing table whenever it receives new QoS link state information. To reduce the number of LSAs, they have also proposed to use a triggering threshold and hold-down timers.
As the intervals between LSAs become longer, however, the advertised QoS information becomes more inaccurate and thus may not be useful for QoS path computation. As already mentioned, we therefore use a combination of multiple static-link-metric-based precomputations and on-demand computations to address this problem. The combination of the two computations can significantly reduce the computational load of the QOSPF module while providing a small LSP blocking probability.

2) LSA Extension for Basic QoS Routing: The proposed resource LSA represents the maximum link bandwidth, the reserved link bandwidth, the available link bandwidth, and the link metric corresponding to each physical link. Although other parameters, such as delay and delay jitter, could also be represented, the current DiffServ-based traffic management does not include them, so in the work described in this paper we use only the link metric and these bandwidth parameters. The path computation is basically to find a bandwidth-cost constrained path [5], [8], that is, a minimum-cost path satisfying the requested bandwidth. The maximum link bandwidth can be either a physical link bandwidth or the bandwidth allocated for a logical channel such as an ATM virtual channel connection (VCC). The reserved link bandwidth is the bandwidth reserved by the CR-LDP signaling for establishing a QoS-guaranteed LSP, and the available link bandwidth is the current residual bandwidth, calculated by subtracting from the maximum link bandwidth the bandwidth of the flows currently active on the link.

Fig. 5. QOSPF: QoS parameter aggregation.

3) LSA Extension for Hierarchical QoS Routing: Fig. 5 shows the proposed QoS parameter aggregation behavior in a hierarchical network. The proposed resource LSA also supports the same four parameters for a logical link spanning across areas toward the IP destination. We define a logical link as multiple candidate paths from an ABR to the destination IP summary addresses, which are conventionally carried by the standard summary LSA and AS external LSA [17]. These multiple candidate paths are precomputed initially, for example by a sequential Dijkstra algorithm using a static administrative link metric as the cost function. As shown in Fig. 5, for example, there are two ABRs in the egress area (C.*): ABR2 and ABR3. Each ABR initially calculates multiple static precomputed paths from itself to the egress router. Each ABR monitors the maximum bottleneck bandwidth among these precomputed paths whenever a new resource LSA is received. If the change is larger than a specified threshold value (i.e., if the change is significant), the ABR creates a summarized resource LSA that specifies the destination IP summary addresses, the maximum bandwidth available in the precomputed paths, and the accumulated link metric cost of the chosen path. It then advertises the new resource LSA to the backbone area (B.*). The processing load for this is small, since the maximum bottleneck bandwidth of only the dedicated multiple precomputed paths is monitored.
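As a rough illustration of the threshold-triggered aggregation described above, the following Python sketch shows how an ABR might track the maximum bottleneck bandwidth over its precomputed paths and decide when to advertise a summarized resource LSA. The names Link, AbrAggregator, and on_resource_lsa are hypothetical, and treating "a significant change" as an absolute bandwidth difference is an assumption; the text does not fix either detail.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Per-link state carried by a resource LSA (field names are illustrative)."""
    max_bw: float       # maximum link bandwidth
    reserved_bw: float  # bandwidth reserved by signaling
    avail_bw: float     # current residual bandwidth
    metric: int         # static administrative link metric


def bottleneck(path):
    """Bottleneck bandwidth of a path: the smallest available bandwidth on it."""
    return min(link.avail_bw for link in path)


def summarize(paths):
    """Return (maximum bottleneck bandwidth, accumulated metric of that path)."""
    best = max(paths, key=bottleneck)
    return bottleneck(best), sum(link.metric for link in best)


class AbrAggregator:
    """Monitors only the precomputed paths of one ABR toward one destination."""

    def __init__(self, paths, threshold):
        self.paths = paths          # precomputed paths, each a list of Link
        self.threshold = threshold  # assumed absolute change that counts as significant
        self.advertised = None      # last (bandwidth, metric) summary sent upward

    def on_resource_lsa(self):
        """Call after link state was updated; returns a new summary or None."""
        bw, metric = summarize(self.paths)
        if self.advertised is None or abs(bw - self.advertised[0]) > self.threshold:
            self.advertised = (bw, metric)
            return self.advertised
        return None
```

Because only the links on the stored paths are inspected, the cost per received LSA stays proportional to the number and length of the precomputed paths rather than to the size of the area.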
When ABR1 receives these resource LSAs from ABR2 and ABR3, it also verifies whether or not the change in the maximum bottleneck bandwidth of the precomputed paths (from itself to the egress router through either ABR2 or ABR3) is significant. If it is, then ABR1 advertises an updated resource LSA to the ingress area (A.*). This QoS



Fig. 6. QOSPF: QoS path computation procedure.

parameter aggregation scheme is simpler than the ATM aggregation algorithms that we proposed previously [14], because it does not use any linear programming to calculate parameters for an aggregated representation such as the complex node representation [11], and because there are no additive-metric constraints such as delay and jitter parameters.

4) QoS Path Computation: The QoS path computation is performed at each ingress router of each OSPF area (i.e., the ingress area, backbone area, and egress area) by using the ISP-SLA, which specifies destination addresses, required bandwidth, and QoS parameters. The QOSPF module computes a hierarchical QoS path as a strict source route within an area and a loose source route beyond the area, and it sends the path information to the CR-LDP module. To make a hierarchical path computation algorithm, we enhanced the algorithm we had developed for ATM PNNI networks [5], which features a hierarchical source routing method.

Fig. 6 shows a flow chart for the proposed path computation procedure. It has two computation stages for QoS routes: precomputation and on-demand computation. The performance evaluation of this approach is discussed in Section IV. The precomputation part is done in the background, using the static administrative link metric; it only has to run if the topology or the administrative link metrics change. Note that it is not a QoS-based precomputation [9], [10], and that its table keeps either the lowest-cost (smallest-metric) path or multiple candidate paths to possible egress routers. Since the proposed precomputed paths are static except when the topology changes, the computational overhead of this precomputation can be significantly reduced, especially when link resource utilization changes very frequently. However, the drawback of our approach is the suboptimality of precomputed paths, which do not reflect the current link load.
That is why we also use on-demand computation when none of the precomputed paths can provide the required QoS. As discussed later, since on-demand computation adds a fairly large delay to LSP setup, it is very important to minimize the number of on-demand computations. We therefore propose to use multiple precomputed paths for diverse routing and for reducing the number of on-demand computations. The multiple precomputed paths are selected sequentially from the shortest path to longer paths, with randomization among paths having the same accumulated link metric. Although the proposed scheme requires more memory to store multiple precomputed paths in the database, three to four precomputed paths are enough in a real ISP environment [7], so the memory overhead of such a limited number of precomputed paths is not a big problem.

Assuming that the destination is located in an area other than the one the ingress router is in, when the ingress router is going to send an LSP setup message requesting the egress router address and specifying the QoS parameters, it 1) finds several candidate transit ABRs toward the egress router, 2) chooses the precomputed paths to these ABRs from the precomputation table, and 3) checks whether or not these candidate paths satisfy all the QoS parameters requested. As soon as one of the precomputed paths does satisfy them, it is selected for routing. Note that step 1 helps to improve the QoS routing performance in a multihoming environment, where the same egress router can be reached via different ABRs, as discussed in Section IV-B-3.

If no precomputed path satisfies the QoS requirements, on-demand computation begins. It comprises the following two steps: 1) find the minimum metric path by pruning unavailable QoS links; and 2) find the shortest widest QoS path [4]. The first step prunes the links that do not provide the requested bandwidth and calculates the minimum-metric path on the reduced topology database. The second step calculates the shortest widest QoS path from the ingress router to the egress router by using the resource LSA. The motivation for step 2 is to increase the chance of finding a path that satisfies the QoS requirements in a crankback situation caused by the inaccurate QoS information of QOSPF [12], [13].
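The selection order described above (precomputed paths first, then the first on-demand step) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the adjacency-dictionary graph layout, and the per-link avail table are all hypothetical, bandwidth is the only QoS constraint checked, and the shortest widest path of step 2 is left out.

```python
import heapq


def feasible(path, avail, bw):
    """True if every link on the node path offers at least bw."""
    return all(avail[(u, v)] >= bw for u, v in zip(path, path[1:]))


def min_metric_path(graph, avail, src, dst, bw):
    """On-demand step 1: Dijkstra over the topology with links below bw pruned."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, metric in graph.get(u, {}).items():
            if avail[(u, v)] < bw:
                continue  # pruned: link cannot carry the requested bandwidth
            nd = d + metric
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def select_path(precomputed, graph, avail, src, dst, bw):
    """Try precomputed paths in order, then fall back to on-demand computation."""
    for path in precomputed:
        if feasible(path, avail, bw):
            return path, "precomputed"
    path = min_metric_path(graph, avail, src, dst, bw)
    return (path, "on-demand") if path else (None, "blocked")
```

Checking a stored path costs one pass over its links, whereas the fallback runs a full shortest-path search over the area, which is consistent with the text's emphasis on keeping the number of on-demand computations small.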
When the path selected by either precomputation or on-demand computation is blocked at an intermediate router because the required QoS cannot be ensured somewhere along the path, the hierarchical crankback routing of CR-LDP is invoked at the ingress router of each area. The ingress router then tries to use one of the multiple precomputed paths again, if any exist, or performs the second step of the on-demand computation.

III. PROTOTYPE SYSTEM

We have built an experimental prototype of the MPLS-DiffServ-based IP QoS routing system for evaluating the performance of the proposed scheme and for finding the performance bottlenecks. Fig. 7 gives a view of the prototype system. Our prototype implementation used NEC MateNX PCs with 450-MHz Pentium II processors, 128 Mbytes of real memory, 8 Gbytes of disk, and FreeBSD 2.2.8R as the platform for the terminals, the routers, and the policy servers. CR-LDP, QOSPF, and policy server software were implemented as Unix processes. Fig. 8 depicts the screen of the implemented policy server: the topology/LSP window, which shows the network topology and established LSPs in real time. We also used the alternate queuing (ALTQ) package (ver. 2.0) [27], which offers various kinds of packet queuing disciplines



Fig. 9. Topologies evaluated in the prototype system.

Fig. 7. Prototype system.

Fig. 10. CR-LDP signaling: average processing delay for each router.

IV. PERFORMANCE EVALUATION

For performance evaluation, we used three environments: 1) the prototype system, 2) a virtual network simulator, and 3) a QoS routing algorithm simulator. The prototype system was used for measurements of the basic performance (i.e., processing delay) of the CR-LDP signaling. Since the numbers of routers and links are limited in the prototype system, we built a virtual network simulator that can emulate a large-scale MPLS network. The virtual network simulator works as an integrated simulator: it embeds the real prototype system in a virtual large network and injects a large number of routing and signaling control packets into the prototype system according to the emulated topology. The scalability performance (e.g., CPU load and path computation delay as the number of routers increases) of the implemented software was evaluated by controlling the number of nodes and topologies. Fig. 12 shows the structure of this network simulator. Using the measured results as basic parameters, we further evaluate the performance of the proposed QoS routing algorithms more extensively with our specialized QoS routing algorithm simulator.

A. Performance Measured with the Prototype System and Virtual Network Simulator

1) Network Topology for Measurement: Fig. 9(a) and (b) shows two kinds of physical topologies used for measurement:

Fig. 8. Policy server: Topology/LSP window.

in the Unix kernel. We modified the kernel to provide MPLS label-switching and also modified the ALTQ package to provide DiffServ-based absolute/differentiated packet scheduling for handling MPLS-labeled packets as well as pure IP packets. This modification enables the FreeBSD machine to serve as either an ingress, intermediate, or egress MPLS label-switching router (LSR). The implemented LSR is a frame-based LSR [24], which specifies the label in a shim header [24] in front of an IP packet.



Fig. 11. CR-LDP signaling: end-to-end LSP setup delay with and without crankbacks.

Fig. 13. Evaluated topologies on the virtual network simulator.

Fig. 12. Virtual network simulator: the connectivity between real router and simulator.

a straight-line topology and a concatenated meshed-block topology. Both topologies are used for measuring the performance of CR-LDP signaling (we define the processing delay at each hop as a performance unit) for normal routing and for crankback routing. This performance unit is used as a basic parameter in the QoS routing algorithm simulator. The link bandwidth for both topologies was 155 Mb/s.

Fig. 13 shows a large network topology created on the virtual network simulator for measurement of QOSPF performance. The size of the network is controlled by a parameter ranging from 2 to 6, which determines the total number of routers and the total number of links. Although the network size evaluated in this section is a medium size for an ISP network [7], we can extrapolate its performance behavior to larger networks as well. In order to compare the performance of our algorithm with flat (nonhierarchical) routing, we compare the routing performance on the aggregated five areas with the routing performance on a single flat area consisting of the same five areas. We locate the real prototype system at location (A) or (B), and the virtual network simulator connected to it emulates all the nodes other than the one located at (A) or (B), respectively. The performance is measured on this prototype system connected with the simulator. Locations (A) and (B) are for evaluating the load of a standard router and of an area border router (ABR), respectively. The performance of the two

is expected to be different, since an ABR has to handle multiple link state databases from different areas, and the difference is worth examining, as discussed later.

2) Processing Delay of CR-LDP Signaling: The processing delay of the CR-LDP signaling was measured using the topologies shown in Fig. 9. The average processing delay of each MPLS hop (ingress, intermediate, and egress) for the label-request and the label-mapping processing was measured using the topology in Fig. 9(a), and the end-to-end delay for establishing the LSP and allocating the resources along the LSP in a two-level hierarchy was measured using the topology shown in Fig. 9(b). As shown in Fig. 10, most of the processing delay for the label request message occurs at the ingress router. This delay consists of (i) the query-response delay for QoS path computation between the CR-LDP and the QOSPF modules; (ii) the delay for reserving the ingress selector [24], label, and bandwidth resources; and (iii) other transmission delay of signaling packets. Element (i) depends on the hop count because of the QoS path computation algorithm (all links of the path have to pass the feasibility check, which takes more time for longer paths) and is the largest part of the processing delay.

Fig. 11 shows the average end-to-end CR-LDP processing delay measured with and without crankback routing, both for a two-level hierarchy, in which the ingress, backbone, and egress areas each contain a number of blocked units (plotted along the horizontal axis), and for the flat network. Here, the blocked unit is the unit shown in Fig. 9(b). As the size of the network grows, in the case of one or two crankbacks in any area, there is less delay in the hierarchical network than in the flat network. Since crankback routing requires time to return to the ingress router to find another path for rerouting, the crankback distance affects the end-to-end delay performance.
3) CPU Load and Path Computation Delay of QOSPF Routing: Performance of QOSPF routing is measured on



Fig. 14. QOSPF: CPU load against number of routers (flooding interval: 5 s).

Fig. 15. QOSPF: CPU load versus average flooding interval (N = 6, 176 routers).

the topology shown in Fig. 13 by connecting the real QOSPF router with the virtual network simulator. a) CPU Load: The CPU load required for LSA processing and path precomputation under a 5-second flooding interval is plotted in Fig. 14 as a function of the number of routers. The CPU load for 176 routers (N = 6) is plotted in Fig. 15 for flooding intervals from 5 to 20 seconds. Both loads are evaluated on flat and hierarchical networks of various sizes (N from 2 to 6) and with two different precomputation schemes: the QoS-based precomputation [9], [10] (Case 1) and the proposed static-link-metric-based precomputation (Case 2). Locations (A) and (B) in each figure indicate the position at which the real router is located for the performance measurement. As shown in Fig. 14, the CPU load in a hierarchical network can, with either precomputation method, be kept within about one-third of the load in a flat network. This 3:1 ratio is almost constant regardless of the number of routers. The load at location (A) is smaller than that at location (B), because the router at location (B) has to update routing information for two directly connected areas, Area 0 and Area 2, while the router at location (A) maintains only Area 2 information. As shown in Fig. 15, the CPU load in a hierarchical network

can also be kept to about one-third to one-half of the load in a flat network by using either precomputation method. As the flooding interval becomes smaller, the difference becomes much more significant. An interesting result in both figures is that the CPU load due to QoS-based precomputation (Case 1) is greater than that due to the proposed precomputation (Case 2). This difference becomes more significant as the number of routers becomes larger or as the flooding interval becomes smaller. The QoS-based precomputation requires many CPU cycles because it calculates the best available QoS path whenever new LSAs are received, whereas the proposed precomputation does not require any new calculation unless the network topology or the link metrics change. b) Path Computation Delay: Fig. 16(a) shows the path computation delay of the QOSPF in the flat and the hierarchical networks (176 routers, N = 6), for precomputation and for on-demand computation, plotted against the LSP path length from the ingress router to the egress router. Fig. 16(b) and (c) show the corresponding accumulated path computation delay along the LSP and its improvement due to the use of an area scope on-demand computation (described later in this subsection). As shown in Fig. 16(a), since precomputation incurs processing delay only for checking the QoS availability along the precomputed path, the delay in either the flat or the hierarchical network is small and almost independent of LSP length. The delay due to on-demand computation, on the other hand, is quite large and depends significantly on the LSP length. As expected, comparing the performance in the flat network [flat:location (A)] and the hierarchical network [hierarchy:location (A)], the delay in the flat network keeps growing as the LSP length grows, whereas the delay in the hierarchical network grows up to five hops (i.e., the distance between the ingress router and the ABR) and remains constant afterwards.
This is because on-demand computation in the hierarchical network is performed only within the local area. However, when measuring the delay at location (B) in the hierarchical network [hierarchy:location (B)], we observed an unexpected result: the delay is four times as large as that at location (A), and is even worse than that in the flat network with an LSP length of 12 hops. The router located at (B) is an ABR, and it maintains topological information for multiple areas, Area 0 and Area 2, which is twice as much as that at location (A); we find that this causes a processing bottleneck. We then evaluate the average accumulated QoS path computation delay on the ingress router and two intermediate ABRs, and plot the result in Fig. 16(b), where "precomputation*M + on-demand computation*N" means a combination of M precomputations and N on-demand computations. Because of the on-demand computation processing bottleneck at intermediate ABRs, the average processing delay of on-demand*3, which uses on-demand computation on every router, is even worse than that of the flat network. To solve this problem, we propose an area scope on-demand (AS-OD) computation method, which chooses candidates for the transit areas and performs on-demand computation only for the reduced set of transit areas to which candidate ABRs are attached. The candidates for transit areas toward destinations are stored



on an area scope table, which can help to reduce the on-demand computation delay. Since ABRs are normally attached to multiple local areas, reducing (or scoping) the number of areas on which on-demand computation is performed can improve the performance significantly. The improved delay performance based on the AS-OD scheme is shown in Fig. 16(c). Even when on-demand computation is used three times at the ingress and intermediate ABRs (plotted by on-demand*3), the delay can be kept to one-third of that of the flat network. Although we can reduce the on-demand computation delay by using the AS-OD scheme, the precomputation scheme incurs much less delay than the on-demand computation. It is therefore important that the number of on-demand computations be reduced by using multiple precomputation tables.

Fig. 16. QOSPF: QoS path computation delay measured with different QoS path computation schemes. (a) Path computation delay versus LSP path length (N = 6, 176 routers). (b) Accumulated path computation delay versus number of routers. (c) Improved path computation delay with the AS-OD scheme.

Fig. 17. Flat topologies evaluated in QoS routing algorithm simulations.

Fig. 18. Hierarchical topologies evaluated in QoS routing algorithm simulations (N = 6, 176 routers).

B. Performance Evaluated Using the QoS Routing Algorithm Simulator

The previous section described the basic performance of the proposed mechanisms. Using these results, we have evaluated the LSP blocking probability and the average LSP setup delay for different QoS path computation methods and different network topologies by using the QoS routing algorithm simulator. 1) Simulation Model: The three kinds of flat networks evaluated are shown in Fig. 17: (a) square mesh (36 routers, N = 6), (b) full mesh (15 routers), and (c) typical ISP topology [7] (19 routers). The hierarchical network (176 routers) is shown in Fig. 18. We simulated two types of traffic, best effort (BE) and expedited forwarding (EF) classes, and we assumed a uniform distribution of the load between the two classes: each traffic type



consisted of one-half of the total traffic flow. The traffic is sent between randomly chosen routers. Both the LSP holding time and the bandwidth requirement for each LSP were simulated as exponentially distributed random values. This arrangement generates LSPs of longer duration less frequently than LSPs of shorter duration, and LSPs requiring less bandwidth more frequently than those requiring more bandwidth. In this simulation, the average bandwidth requirement was 25 Mb/s and the average LSP holding time was 5 minutes. LSP interarrival times were also simulated as exponentially distributed random values. By choosing appropriate average interarrival times, we simulated different network loads or link utilizations. All LSPs were classified into 11 bandwidth slots ranging from 0–5 Mb/s to 90–95 Mb/s, and the aggregated blocking probability for each range is plotted in the figures in Sections IV-B-2 and IV-B-3.
2) Performance of Different QoS Path Computations: We evaluated different QoS path computation schemes in the flat networks shown in Fig. 17: i) one precomputed path (1-PRE), ii) one precomputed path and one on-demand computation (1-PRE+1-OD), iii) four sets of precomputed paths (4-PRE(XX), where XX denotes one of the two selection algorithms explained below), and iv) four sets of precomputed paths and one on-demand computation (4-PRE(XX)+1-OD). The four routes of iii) and iv) are selected so as to produce diverse routes. If four diverse routes cannot be found, the available routes (fewer than four) are used with the same path selection mechanisms. The reason for choosing four precomputed paths is that the neighbor link connectivity of a current typical ISP [7] is around three to four, and it is usually difficult to obtain more than four diverse routes. Note that on-demand computation includes the case of crankback routing.
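The traffic model of the Simulation Model subsection can be sketched in Python as follows. This is an illustrative sketch, not the authors' simulator: the names `LspRequest` and `generate_lsp_requests`, the use of Python's `random` module, and the choice of assigning each LSP to the BE or EF class with probability one-half are assumptions. Only the exponential distributions, the 25 Mb/s mean bandwidth, and the 5-minute mean holding time come from the text.

```python
import random
from typing import List, NamedTuple


class LspRequest(NamedTuple):
    arrival_s: float        # arrival time in seconds
    bandwidth_mbps: float   # requested bandwidth
    holding_s: float        # LSP holding time in seconds
    traffic_class: str      # "BE" or "EF"


def generate_lsp_requests(count: int,
                          mean_interarrival_s: float,
                          mean_bandwidth_mbps: float = 25.0,
                          mean_holding_s: float = 300.0,
                          seed: int = 0) -> List[LspRequest]:
    """Draw LSP requests with exponential interarrival, bandwidth, and holding time."""
    rng = random.Random(seed)
    now = 0.0
    requests = []
    for _ in range(count):
        # expovariate takes the rate (1 / mean) as its argument.
        now += rng.expovariate(1.0 / mean_interarrival_s)
        requests.append(LspRequest(
            arrival_s=now,
            bandwidth_mbps=rng.expovariate(1.0 / mean_bandwidth_mbps),
            holding_s=rng.expovariate(1.0 / mean_holding_s),
            traffic_class=rng.choice(["BE", "EF"]),  # assumed 50/50 split per LSP
        ))
    return requests
```

Lowering `mean_interarrival_s` raises the offered load, which corresponds to the way different network loads are produced in the simulation.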
For choosing four precomputed paths, we use either a heuristic algorithm (XX=HE) or a node-disjoint-path routing algorithm (XX=ND), and compare the performance of each algorithm. If the network connectivity is dense enough to have multiple disjoint paths, the ND algorithm is expected to perform well. However, in a network with sparse connectivity that does not have enough disjoint paths, the HE algorithm is expected to perform well, finding multiple paths that can be well load-balanced. The heuristic algorithm performs the following steps: 1) the first route is the minimum link metric path chosen by the Dijkstra algorithm; and 2)–4) the second, third, and fourth routes are the minimum link metric paths on the reduced networks in which, respectively, the first link of the first route is pruned, the second link of the first route is pruned, and both the first link of the first route and the second link of the second route are pruned. These steps are performed sequentially until the total number of candidate paths reaches four. If multiple paths of equal cost are found in a step, those paths take precedence over the paths of the following steps. The node-disjoint-path algorithm, on the other hand, performs the following steps: 1) the first route is the minimum link metric path; and 2) the k-th route is the minimum link metric path on the reduced network, in which the routers and links along the previously selected routes (the 1st route, 2nd route, and so on) are pruned, except for the source and destination routers. Examples of both computations are shown in Fig. 19. Thus we compared the performance of six different schemes with three different flat network topologies.
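The two selection procedures described above can be sketched in Python as follows. This is a simplified reading of the description, not the authors' implementation: all function names and the adjacency-dictionary graph format are assumptions, links are treated as undirected, each step keeps only one shortest path (the precedence rule for equal-cost paths is omitted), and a step that yields no path or an already selected path is skipped.

```python
import heapq
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Graph = Dict[str, Dict[str, float]]  # router -> {neighbor: link metric}


def make_graph(edges: List[Tuple[str, str, float]]) -> Graph:
    """Build an undirected graph from (u, v, metric) triples."""
    graph: Graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph


def shortest_path(graph: Graph, src: str, dst: str,
                  banned_links: FrozenSet[Tuple[str, str]] = frozenset(),
                  banned_nodes: FrozenSet[str] = frozenset()) -> Optional[List[str]]:
    """Dijkstra on the graph with the given links and routers pruned."""
    dist = {src: 0.0}
    prev: Dict[str, str] = {}
    heap = [(0.0, src)]
    done: Set[str] = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, w in graph.get(u, {}).items():
            if v in banned_nodes or (u, v) in banned_links:
                continue
            if d + w < dist.get(v, float("inf")):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def link_at(path: Optional[List[str]], i: int) -> Set[Tuple[str, str]]:
    """Both directions of the i-th link of a path, or an empty set if absent."""
    if path is None or i + 1 >= len(path):
        return set()
    return {(path[i], path[i + 1]), (path[i + 1], path[i])}


def heuristic_paths(graph: Graph, src: str, dst: str) -> List[List[str]]:
    """HE: prune selected links of earlier routes and recompute (steps 1 to 4)."""
    first = shortest_path(graph, src, dst)
    if first is None:
        return []
    paths = [first]

    def try_prune(banned: Set[Tuple[str, str]]) -> Optional[List[str]]:
        p = shortest_path(graph, src, dst, banned_links=frozenset(banned))
        if p is not None and p not in paths:
            paths.append(p)
        return p

    second = try_prune(link_at(first, 0))               # step 2
    try_prune(link_at(first, 1))                        # step 3
    try_prune(link_at(first, 0) | link_at(second, 1))   # step 4
    return paths


def node_disjoint_paths(graph: Graph, src: str, dst: str, k: int = 4) -> List[List[str]]:
    """ND: prune routers and links of all earlier routes, keeping src and dst."""
    paths: List[List[str]] = []
    used_nodes: Set[str] = set()
    used_links: Set[Tuple[str, str]] = set()
    while len(paths) < k:
        p = shortest_path(graph, src, dst, frozenset(used_links), frozenset(used_nodes))
        if p is None or p in paths:
            break
        paths.append(p)
        used_nodes.update(p[1:-1])
        for i in range(len(p) - 1):
            used_links |= link_at(p, i)
    return paths
```

On a sparse graph in which every route must pass through the same router, `node_disjoint_paths` returns a single route while `heuristic_paths` can still return alternatives that share that router, which is in line with the expectation stated above for sparse connectivity.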

Fig. 19. QOSPF: example of four precomputed paths.



Fig. 20. QOSPF: average LSP blocking probabilities for flat networks with different QoS path computation schemes. (a) Flat square mesh network (N = 6, 6 × 6, ave. link load = 0.22). (b) Flat full mesh network (15 routers, ave. link load = 0.17). (c) Typical flat ISP network (19 routers, ave. link load = 0.24).


Fig. 20 shows the LSP blocking probability for the square-mesh, full-mesh, and typical ISP topologies. The blocking probability of each scheme increases as the requested bandwidth increases. The single precomputation scheme (1-PRE) has a considerably worse blocking probability than the other schemes. The four-path precomputation schemes [4-PRE(HE/ND)] result in lower blocking probabilities than 1-PRE does. The other three schemes, which use on-demand computation, 1-PRE+1-OD and 4-PRE(HE/ND)+1-OD, improve the performance significantly. Although they achieve almost the same improvement, 4-PRE(HE/ND)+1-OD requires fewer on-demand computations than 1-PRE+1-OD does. In Fig. 20(a), for example, when the requested bandwidth is 70 Mb/s, 4-PRE(ND)+1-OD can reduce the number of on-demand computations to as little as one-third of that of the 1-PRE+1-OD scheme. Comparing the performance of 4-PRE(HE) and 4-PRE(ND), we found that it depends on the network topology. The node-disjoint-path routing algorithm, 4-PRE(ND), performs well in networks with dense connectivity (i.e., square-mesh and full-mesh topologies), in which there are enough candidate paths. The heuristic algorithm, 4-PRE(HE), on the other hand, works well in networks with sparse connectivity (i.e., the ISP topology). Fig. 21 shows the average QoS path computation delay of each scheme for the square-mesh, full-mesh, and typical ISP topologies. The path computation delay is derived from the actual measured delay in Fig. 16(a) and the LSP blocking probability in Fig. 20(a)–(c). As each part of this figure shows, delays are smallest for the precomputation methods, 1-PRE and 4-PRE(HE/ND). As the requested bandwidth increases, the average delay of the on-demand approaches 4-PRE(HE/ND)+1-OD increases, but it can be kept much lower than that of the 1-PRE+1-OD scheme. Therefore, the scheme combining multiple precomputed paths and on-demand computation provides the most beneficial solution for reducing both the LSP blocking probability and the processing delay. Fig. 22 shows another interesting LSP blocking probability result, obtained when simulating the full-mesh network with a network load three times higher than in the case of Fig. 20(b). For requested bandwidth below 65 Mb/s, similar call blocking behavior can be seen in this figure. For bandwidth above 65 Mb/s, the 1-PRE scheme performs best, the 4-PRE(HE/ND) schemes perform next best, and the other three on-demand schemes perform worst. This is because choosing the shortest path for larger requested bandwidth is the best way to save network resources in such a heavy load situation. Fig. 23 depicts the throughput performance corresponding to Fig. 22, which can be calculated by multiplying the average number of succeeded calls by their average requested bandwidth. As shown in this figure, 1-PRE+1-OD and 4-PRE(HE/ND)+1-OD have the highest throughput at lower requested bandwidth and the lowest throughput at higher bandwidth. The 1-PRE scheme exhibits the reverse behavior. Fig. 24 shows the average total throughput of Fig. 23 for each path computation scheme. It shows that the total throughput is almost the same for each scheme, although the 1-PRE+1-OD and 4-PRE(HE/ND)+1-OD schemes provide slightly higher throughput. Analyzing the average link utilization of each scheme, however, we can observe an interesting situation. The average link utilization is shown in Fig. 25. It shows that 1-PRE+1-OD and 4-PRE(HE/ND)+1-OD result in a higher link



Fig. 21. QOSPF: average computation delay for flat networks with different QoS path computation schemes. (a) Flat square mesh network (N = 6, 6 × 6, ave. link load = 0.22). (b) Flat full mesh network (15 routers, ave. link load = 0.17). (c) Typical flat ISP network (19 routers, ave. link load = 0.24).


utilization, whereas the average total throughput is the same as that of the other schemes. The 4-PRE(HE/ND) schemes have reasonably lower link utilization. This is because on-demand computation may find a path with more hops, which then increases the average link utilization. Thus, if one of the optimization goals is to keep the link utilization relatively small under heavy load (while keeping the call blocking probability small), we should choose the multiple precomputation methods (4-PRE(HE/ND)) for good load balancing. 3) Performance of QoS Aggregation in a Hierarchical Network: We evaluated the performance of the proposed QoS aggregation schemes in the hierarchical network in Fig. 18(a) and compared it to the performance in the flat network in Fig. 18(b).



Fig. 22. QOSPF: average LSP blocking probabilities for flat full-mesh network under high load.

Fig. 23. QOSPF: average throughput against requested bandwidth for the flat full-mesh network under high load.

Fig. 24. QOSPF: average total throughput for various QoS path computation algorithms used in the flat full-mesh network under high load.

Fig. 25. QOSPF: average link utilization for various QoS path computation algorithms used in the flat full-mesh network under high load.

Fig. 26. QOSPF: performance comparison between hierarchical network and flat network (N = 6, ave. link load = 0.10). (a) LSP blocking probability for precomputation schemes. (b) LSP blocking probability for on-demand computation schemes. (c) Average path computation delay for each scheme.

Fig. 26(a) and (b) show the LSP blocking probability for each computation scheme as a function of requested bandwidth in the hierarchical network (depicted by AGG:) and the flat network (depicted by FLAT:). This probability covers only the blocking of inter-area traffic among Areas 1, 2, 3, and 4 across Area 0; the intra-area LSP blocking probability in each area is excluded. Fig. 26(a) shows the call blocking probability for the 1-PRE schemes and the 4-PRE(HE/ND) schemes. Both 1-PRE schemes result in high blocking probabilities. The 4-PRE schemes result in lower blocking probabilities, because they can use diverse routes for path selection. Comparing the performance of each 1-PRE scheme and each 4-PRE scheme between the hierarchical network and the flat network, we can observe an interesting result: they work slightly better in the hierarchical network. This is somewhat counterintuitive, since the QoS aggregation



scheme (AGG:) advertises only the aggregated internal network topology and bandwidth information to the outside, and the flat routing scheme (FLAT:) should provide more exact QoS information to help reduce the blocking probability. The reason for this behavior has been analyzed as follows. In the hierarchical network, each area has two ABRs toward destination routers. Within each area, each router has one or four precomputed paths to all intra-area routers, including ABRs. When each router performs inter-area path computation in the hierarchical network, it therefore has two or eight precomputed paths through the two ABRs. Thus, doubling the number of precomputed paths helps to improve the call blocking probability. If we increase the number of ABRs, we can expect further performance improvement in the hierarchical network. Fig. 26(b) shows the call blocking probability for the 1-PRE+1-OD schemes and the 4-PRE(HE/ND)+1-OD schemes. The performance of all on-demand related schemes is almost the same, except that the AGG:1-PRE+1-OD and the AGG:4-PRE(ND)+1-OD schemes have slightly higher blocking probabilities than the others. This is also an unexpected result, in that the performance of hierarchical routing is as good as that of flat routing. This is mainly because doubling the number of precomputed paths and using hierarchical crankback routing can increase the probability of finding a path even when the QoS routing information is inaccurate. Since crankback routing can obtain accurate QoS information on demand and feed it back to the ingress router for a new path computation, it can decrease the blocking probability significantly, even when the available topology and QoS information are inaccurate. Examining the performance difference between AGG:4-PRE(HE)+1-OD and AGG:4-PRE(ND)+1-OD in detail, the former performs better than the latter in this hierarchical network topology.
This is because the former has more precomputed paths than the latter. Fig. 26(c) shows the average QoS path computation delay for each computation scheme. Since inter-area traffic across the backbone area requires three separate path computations, the delay is accumulated along the hops. As this figure shows, the average delay for each scheme is lower in the hierarchical network than in the flat network. This is because, as the path hop length becomes longer, the on-demand path computation delay in the flat network becomes much larger than that in the hierarchical network, even though the call blocking probability is the same.

V. CONCLUSION

This paper proposed a hierarchical multilayer QoS routing system with dynamic SLA management for large-scale IP networks and introduced three augmented components: a policy server, hierarchical CR-LDP signaling, and hierarchical QOSPF routing. We implemented the prototype system and a virtual network simulator so that we could evaluate the performance of the system and its performance bottlenecks. We also developed a simulator for evaluating the performance of the routing algorithm itself. We found that the proposed hierarchical QoS routing and signaling provide a scalable solution delivering better performance than previously proposed approaches. This solution reduces the LSP blocking probability and significantly reduces the control overhead. The applicability of and the motivation for using the policy server were also explained, and the policy server proved to be a quite useful feature for the proposed QoS-enabled network. We also plan to investigate the proposed system and its path computation algorithm more intensively in various kinds of network topologies and with various QoS service scenarios.

ACKNOWLEDGMENT

The authors would like to thank R. Izmailov at NEC USA, Inc. for providing many valuable comments and suggestions for improving this paper.

REFERENCES
[1] X. Xiao et al., "Internet QoS: A big picture," IEEE Network Mag., pp. 8–18, Mar. 1999.
[2] S. Blake et al., "An architecture for differentiated services," IETF RFC 2475, Dec. 1998.
[3] S. Chen et al., "An overview of quality of service routing of next-generation high speed networks: Problems and solutions," IEEE Network Mag., pp. 64–79, Nov. 1998.
[4] Z. Wang, "Quality-of-service routing for supporting multimedia applications," IEEE J. Select. Areas Commun., vol. 14, Sept. 1996.
[5] A. Iwata et al., "ATM routing algorithms with multiple QoS requirements for multimedia Internetworking," IEICE Trans. Commun., vol. E79-B, no. 8, pp. 999–1007, Aug. 1996.
[6] Q. Ma et al., "Quality-of-service routing with performance guarantees," in Proc. IFIP IWQoS Workshop, May 1997.
[7] Q. Ma et al., "On path selection for traffic with bandwidth guarantees," in Proc. IEEE Int. Conf. Network Protocols, 1997, pp. 191–202.
[8] S. Chen et al., "On finding multi-constrained paths," in Proc. IEEE ICC'98, June 1998.
[9] G. Apostolopoulos et al., "Quality of service based routing: A performance perspective," in Proc. ACM SIGCOMM, 1998.
[10] G. Apostolopoulos et al., "Implementation and performance measurements of QoS routing extensions to OSPF," in Proc. Infocom'99, Apr. 1999, pp. 680–688.
[11] ATM Forum, "Private network-network interface (PNNI), v1.0 specification," May 1996.
[12] R. Guerin et al., "QoS routing in networks with inaccurate information: Theory and algorithms," IEEE/ACM Trans. Networking, vol. 7, no. 3, pp. 350–364, June 1999.
[13] A. Orda, "Routing with end-to-end QoS guarantees in broadband networks," IEEE/ACM Trans. Networking, vol. 7, no. 3, pp. 365–374, June 1999.
[14] A. Iwata et al., "QoS aggregation algorithms in hierarchical ATM networks," in Proc. IEEE ICC'98, vol. 1, June 1998, pp. 243–248.
[15] S. Nelakuditi et al., "Quality-of-service routing without global information exchange," in Proc. IEEE IWQoS Workshop'99, Mar. 1999, pp. 129–131.
[16] B. Halabi, Internet Routing Architectures. Cisco Press, 1997.
[17] J. Moy, OSPF: Anatomy of an Internet Routing Protocol. Reading, MA: Addison-Wesley, 1998.
[18] ISO, "Intermediate system to intermediate system intra-domain routing exchange protocol for use in conjunction with the protocol for providing the connectionless-mode network service (ISO 8473)," ISO DP 10589, Feb. 1990.
[19] R. Coltun, "The OSPF Opaque LSA option," IETF RFC 2370, July 1998.
[20] D. Katz et al., "Traffic engineering extensions to OSPF," IETF Internet Draft, draft-katz-yeung-ospf-traffic-00.txt, Apr. 1999.
[21] N. Fujita and A. Iwata, "Traffic engineering extensions to OSPF summary LSA," IETF Internet Draft, draft-fujita-ospf-te-summary-00.txt, Mar. 2000.
[22] D. Awduche, "MPLS and traffic engineering in IP networks," IEEE Commun. Mag., pp. 42–47, Dec. 1999.
[23] A. Ghanwani et al., "Traffic engineering standards in IP networks using MPLS," IEEE Commun. Mag., pp. 49–53, Dec. 1999.
[24] G. Swallow, "MPLS advantages for traffic engineering," IEEE Commun. Mag., pp. 54–57, Dec. 1999.
[25] N. Fujita and A. Iwata, "Crankback routing extensions for CR-LDP," IETF Internet Draft, draft-fujita-mpls-crldp-crankback-00.txt, Mar. 2000.
[26] C. Villamizar, "MPLS optimized multipath (MPLS-OMP)," IETF Internet Draft, draft-villamizar-mpls-omp-01.txt, Feb. 1999.
[27] K. Cho, "A framework for alternate queuing: Toward traffic management by PC-Unix based routers," in Proc. USENIX 1998 Conf., June 1998.

Atsushi Iwata was born in Fukuoka, Japan, in 1964. He received the B.E. and M.E. degrees in electrical engineering from the University of Tokyo, Japan, in 1988 and 1990, respectively. He joined NEC Corporation in 1990, and is a Research Staff Member at Computer & Communication Media Research, NEC Corporation, Kanagawa, Japan. From 1997 to 1998, he was also a Visiting Researcher at the University of California, Los Angeles. His current research interest is the design and analysis of network architectures, routing algorithms, and protocols for computer communication networks.

Norihito Fujita was born in Tokyo, Japan, in 1973. He received the B.E. and M.E. degrees in electrical engineering from Kyoto University, Japan, in 1996 and 1998, respectively. He joined NEC Corporation in 1998, and is a Research Staff Member at Computer & Communication Media Research, NEC Corporation, Kanagawa, Japan. His current research interest is the end-to-end QoS control on computer communication networks.