
LAPLUS: An Efficient, Effective and Stable Switch Algorithm for Flow Control of the Available Bit Rate ATM Service


Sharat Prasad¹, Kamran Kiasaleh and Poras Balsara

Broadband Network Lab, Samsung Telecommunications America, Richardson, TX 75081
Dept of Elect. Engg., University of Texas at Dallas, P.O. Box 830688, Richardson, TX 75083
ABSTRACT

LAPLUS is a novel switch algorithm for flow control of the Available Bit Rate (ABR) Asynchronous Transfer Mode (ATM) service. It ensures a steady-state rate allocation satisfying the MCR-plus-equal-share criterion. It requires only constant-time processing and two tags to be stored per flow. It is naturally able to take Peak Cell Rates of flows into account. LAPLUS solves in a novel way the problem of selecting a measurement interval. The solution allows it to contain queue growth and keep utilization high on one hand, and to control low speed flows and operate with stability on the other. We describe the results of a simulation study of LAPLUS. The results show it to be fair, responsive and stable.

1. Introduction

The ATM Forum has defined the Available Bit Rate (ABR) service for applications that require from the network, in addition to an optional minimum bandwidth, an amount of bandwidth that is difficult to specify precisely. The ABR service dynamically varies the bandwidth allowed to an application based on both the needs of the application and the congestion state of the network.

The members of the ATM Forum have reached an agreement to employ rate-based closed-loop feedback flow control for supporting the ABR service. The Forum has specified [1] the behavior required of end-stations and switches to be compliant with the ABR service. In this paper we focus our attention on the algorithm a switch may use to manage congestion, maximize the utilization of resources and meet the QOS guarantees applicable to ABR connections. There are two applicable guarantees - fair access to the available bandwidth and a specified low Cell Loss Ratio (CLR).

There are several possible fairness criteria [2]. The two we will use here are the Max-Min [3] and the MCR-plus-equal-share criteria. The simple distributed algorithm [3] for Max-Min fair rate allocation, and many others reported in the literature [4, 5, 12], require the current rate of each flow to be stored. This calls for a large amount of high-speed memory in the switches. Some variations of the simple distributed algorithm update the rate allocation each time a forward RM cell arrives. Others recompute the allocation periodically. Both the update and the recomputation require time proportional to the square of the number of flows, making the simple distributed algorithm impractical. On the positive side, the algorithm quickly converges to the Max-Min fair allocation.

At the other end of the spectrum are algorithms which rely on approximations to reduce the computation time and memory size [6-9, 13]. Network configurations can be constructed which make algorithms of this class converge to an unfair rate allocation [10]. In the middle are clever implementations of the basic algorithm, which require a significantly smaller amount of memory and enable efficient computation of the rate allocation [11]. All algorithms that recompute the allocation periodically are prone to problems that arise from the period being too short or too long. A long period makes an algorithm less responsive. A short period, when low speed flows are present, leads to errors in the measurement of traffic load on a link. If a short period is used for measuring the available bandwidth, an unstable control results [16].

All switch algorithms for flow controlling the ABR ATM service have to deal with growth of switch queues during transients. Techniques for containing the queue growth include requiring source end-stations to delay rate increases but to effect rate reductions immediately upon being notified [11, 12], setting aside a fraction of the link bandwidth [7-9, 13], etc. Both techniques lower link bandwidth utilization. Delaying rate increases limits low utilization to periods of transients. But delaying rate increases requires mechanisms in the source end-station which have not been specified by the ATM Forum.

In this paper we present the algorithm named LAPLUS. LAPLUS approximates the bottleneck rate of a link as the LArgest among flow rates PLUs the Surplus bandwidth divided by the number of flows with the largest rate. This estimate, when bounded below at a suitable value, enables links to succeed in determining their bottleneck rates in the order of their bottleneck levels. While rates of flows differ greatly from their respective MCR-plus-fair-share values, this approximation causes flows to change their rates by amounts that have the correct sign but whose magnitude may be imprecise. When rates of flows satisfy a lock condition, LAPLUS estimates the correct bottleneck rate. The algorithm is naturally able to take Peak Cell Rates of flows into account. The algorithm requires only constant-time processing and two tags per flow to be stored while ensuring MCR-plus-equal-share rate allocation in the steady state. It solves in a novel way the problem of selecting a control or measurement interval. It uses nested measurement intervals. High rate flows are controlled using inner (therefore short) intervals to keep utilization high and to contain queue growth. Outer (therefore long) intervals are used to control low rate flows

¹ This work was performed at Texas Instruments, Incorporated.

0-7803-4386-7/98/$10.00 (c) 1998 IEEE


and to measure available bandwidth. This ensures stable operation in the presence of short-lived, large-magnitude changes in the available bandwidth.

The rest of this paper is organized as follows. In Section 2 we consider fair rate allocations and switch algorithms for computing them. Section 4 presents SLAPLUS, the simple version of our algorithm to compute Max-Min fair rate allocation. In Section 5 we describe LAPLUS, the enhanced algorithm which uses nested intervals and the MCR-plus-equal-share criterion and considers PCRs of flows. Simulations highlighting problems and their solutions are presented throughout the paper. Section 6 presents results of simulating the LAPLUS algorithm. We summarize the paper in Section 7 and mention our ongoing work.

2. Switch Algorithms and Fairness

Fairness is one of the two assurances the network offers to the users of the ABR service. Informally, for a rate allocation to be fair, it must offer a flow as big a share of the bandwidth of the most congested link it traverses as any other flow traversing the same link [3].

We define the Max-Min fair rate allocation by describing an iterative procedure [3] for computing it. At the outset, variables $u_1$ and $v_1$ are initialized to the set of all links making up the network and the set of all ABR flows traversing the network, respectively. Variables $b_j$ and $n_j$ are initialized to the bandwidth available to and the number of ABR flows sharing the link $L_j$, respectively. During iteration $l$, we determine $r_l$ as the smallest among the ratios $b_j / n_j$ for all links $L_j \in u_l$. Let $W_l = \{L_j\} \subseteq u_l$ such that $b_j / n_j = r_l$ for each link $L_j \in W_l$. Let $S_l = \{F_i\} \subseteq v_l$ where each flow $F_i$ travels over at least one link in $W_l$. Links in $W_l$ are the level $l$ bottleneck links. Flows in $S_l$ are the level $l$ bottleneck flows. $r_l$ is the bottleneck rate of each link in $W_l$ and the constraint rate of each flow in $S_l$.

Now a reduced network $u_{l+1}$ is constructed by subtracting the set $W_l$ from $u_l$. $v_{l+1} = v_l - S_l$ is the set of flows whose constraint rates remain to be determined. Let $m$ be the number of flows which are in $S_l$ and which also travel over any link $L_j \in u_{l+1}$. To complete the construction, we subtract $m r_l$ from $b_j$ and $m$ from $n_j$ for each $L_j \in u_{l+1}$. If $u_{l+1}$ is null, the bottleneck rate of each link and the constraint rate of each flow have been found.

The above procedure cannot be used as part of a switch algorithm as it requires a central entity having global knowledge. A practical algorithm must allow the links in the network to determine their respective bottleneck rates in a distributed fashion.

Consider again the link $L_j$ with ABR bandwidth $R_j$ shared by $N_j$ ABR flows. The set of flows $V_j$ traversing link $L_j$ can be divided into two subsets - the subset $C_j$ containing the flows that are constrained by other links and the subset $B_j$ containing the flows for which $L_j$ is the bottleneck link. If the bottleneck level of $L_j$ is $k$, then $C_j$ contains flows with bottleneck levels 1 through $k - 1$ inclusive. Let, for each flow $F_i$ in $C_j$, $r_i$ be the constraint rate of flow $F_i$. Each link in the network learns the constraint rates of the flows which traverse it, determines $C_j$ and then computes its bottleneck rate as [3]

$$BR_j = \frac{R_j - \sum_{F_i \in C_j} r_i}{|B_j|} \tag{1}$$

In effect a link $L_j$ assigns flows in $C_j$ their respective constraint rates and then equally divides the remaining bandwidth among flows in $B_j$. The amount of bandwidth received by each flow in $B_j$ is the bottleneck rate of the link. While this is an algorithm for distributed computation of rate allocation, its time complexity is $O(N_j^2)$ and it needs to store constraint rates of all flows.

3. A Simple Distributed Algorithm

As we just saw, the computation of the rate allocation for the set of ABR flows $V$ sharing a link $L$ involves determining the set $C$ of flows. Each flow in $C$ is constrained at a link other than $L$. Any flow with a constraint rate $r_l$ which is less than the bottleneck rate $BR$ of link $L$ is clearly constrained at a link other than $L$. This fact suggests the following method for determining $C$ [11].

Consider a hypothetical sequence with the constraint rates of the flows as elements. Each rate $r_l$ occurs only once in the sequence and has a tag $m_l$ which gives the number of flows which have the rate $r_l$. Let the elements be arranged in descending order and $k$ be the length of the sequence. Consider the following inequality

$$r_{l^*-1} \ge \frac{R - \sum_{l=l^*}^{k} m_l r_l}{N - \sum_{l=l^*}^{k} m_l} > r_{l^*} \tag{2}$$

where $N = \sum_{l=1}^{k} m_l$ is the total number of, and $R$ the total bandwidth available to, ABR flows. Comparing the middle sub-expression of (2) with the right-hand side of expression (1), we see that they are equal if $C$ only contains flows which have a rate $r_l \le r_{l^*}$. Hence inequality (2) tries to find $r_{l^*}$, the smallest $r_l$, such that if all flows with rate smaller than $r_{l^*}$ are considered constrained at other links and $BR$ of the link is computed, it is found to satisfy $r_{l^*-1} \ge BR > r_{l^*}$, as it should.

If the inequality (2) is not satisfied even for $l^* = 2$, then,

$$r_1 < \frac{R - \sum_{l=2}^{k} m_l r_l}{N - \sum_{l=2}^{k} m_l} = \frac{R - \sum_{l=2}^{k} m_l r_l}{m_1} \;\Rightarrow\; \sum_{l=1}^{k} m_l r_l < R$$

and the link is under-subscribed. Hence the flows must be allowed to increase their rates. A value for $BR$ must be determined which will allow flows to increase their rates by amounts which add up to precisely $R - \sum_{l=1}^{k} m_l r_l$, the amount of under-subscription. Let us denote by $l'$ the index such that all flows with rate $r_l \le r_{l'}$ (indices $l \ge l'$) are constrained



elsewhere². Then the only flows that are able to increase their rates are those with rates $r_l > r_{l'}$. Equating the amount of under-subscription to the net increase,

$$R - \sum_{l=1}^{k} m_l r_l = \sum_{l=1}^{l'-1} (BR_U - r_l)\, m_l = BR_U \sum_{l=1}^{l'-1} m_l - \sum_{l=1}^{l'-1} r_l m_l$$

and the bottleneck rate of the link is then given by

$$BR_U = \frac{R - \sum_{l=1}^{k} m_l r_l + \sum_{l=1}^{l'-1} m_l r_l}{\sum_{l=1}^{l'-1} m_l} = \frac{R - \sum_{l=l'}^{k} m_l r_l}{N - \sum_{l=l'}^{k} m_l} \tag{3}$$

The subscript U stands for under-subscribed. Everything else remaining the same, the smaller the $l'$, the larger is the $BR_U$ given by Eqn. (3). For $l' = 1$, $BR_U = \infty$. Since $BR_U$ is the Explicit Rate (ER) feedback sent to the sources and is meant to be used by the sources as an upper bound on their ACR, in the absence of knowledge of the true constraint rates, the largest admissible value should be used. Eqn. (3) gives this value when $l' = 2$. So,

$$BR_U = r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1} \tag{4}$$

If the link is not under-subscribed, the inequality (2) is satisfied for some $l = l^*$ and the $BR$ is given by

$$BR_O = \frac{R - \sum_{l=l^*}^{k} m_l r_l}{N - \sum_{l=l^*}^{k} m_l} \tag{5}$$

where the subscript O stands for over-subscribed.

The procedure described above is a complete algorithm for the distributed computation of the Max-Min fair rate allocation and is referred to as the basic algorithm in this paper. Figure 4 (at the end of the paper) shows the Generic Fairness Configuration II (GFCII) network [15] described further in Section 6. Table 1 gives the Max-Min fair rates for the various sources.

Table 1: Max-Min fair rate allocation for GFCII

Flow groups    Fair rate
a, f           10 Mbps
b, g           5 Mbps
c, d, e        35 Mbps
h              52.5 Mbps

[Figure 1: Nested intervals and rate levels. A level 0 interval lasts 1 ms, a level 1 interval 4 ms and a level 2 interval 16 ms.]

[Figure 2: RM cell not seen due to rate increase. Time axis marked 0, 4T0, 8T0, with RM cell arrivals a1 and a2.]

[Figure 3: RM cell not seen due to a rate decrease. Time axis marked 0, 4T0, 8T0, with RM cell arrivals c1 and c2.]

Unless mentioned otherwise, all simulations presented in this paper are for inter-switch distances of 200 km and end-station-to-switch distances of 200 m. End-stations and switches behave strictly in accordance with [2]. Switches have a single FCFS queue at each output. Figure 5 shows the source ACRs and switch queues when the basic algorithm is employed.

A problem with the above procedure is that it requires a sequence of rates to be maintained. This requires a large amount of high-speed memory and $O(N)$ worst-case processing time, where $N$ is the total number of flows. Computation of the bottleneck rate of a link also requires $O(N)$ worst-case processing time, as elements in the sequence of rates for the link have to be examined sequentially to find $l = l^*$ satisfying inequality (2).

[Figure 5A: Source ACRs for the basic algorithm. ACR (bps) versus time (sec) for sources src_a* through src_h*.]

² Charny et al have proven [11] that if the algorithm implied by Eqn. (1) is used, then once changes in $R$ and the bandwidth demands of sources cease, the convergence process begins and links succeed in determining their bottleneck rates in the order of their bottleneck levels. We assume, at this point in the discussion, that if $L$ is a level $l$ link, links at levels 1 through $l - 1$ have already determined their bottleneck rates. For the analysis in [11] to be applicable, a bottleneck level $l$ flow must at all instants have a rate greater than the level $l - 1$ bottleneck rate.

4. SLAPLUS: Simple LArgest PLUs Surplus

We note from Eqn. (4) that to compute $BR_U$, it suffices to know $r_1$ (the largest rate), $m_1$ (the number of flows with rate equal to the largest rate) and the sum $\sum_{l=1}^{k} m_l r_l$. Knowledge of the rest of the rates individually is not required.
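For contrast with this shortcut, the search implied by inequality (2), together with Eqns. (4) and (5), can be sketched in Python. This is a minimal illustration and not the authors' implementation: the function name `basic_bottleneck_rate`, the use of a `Counter` to build the $(r_l, m_l)$ sequence, and the search order (from $l^* = k + 1$ downward, treating the missing $r_{k+1}$ as minus infinity) are assumptions made here.

```python
from collections import Counter


def basic_bottleneck_rate(rates, R):
    """Bottleneck rate of one link via inequality (2) and Eqns. (4), (5).

    rates: constraint rates of the flows on the link (non-empty, any order).
    R:     bandwidth available to ABR flows, in the same unit as the rates.
    """
    if not rates:
        raise ValueError("at least one flow is required")

    counts = Counter(rates)
    r = sorted(counts, reverse=True)   # r[0] is r_1, the largest rate
    m = [counts[x] for x in r]         # m[i] flows share the rate r[i]
    k = len(r)
    N = sum(m)
    total = sum(ml * rl for ml, rl in zip(m, r))

    tail_rate = 0.0                    # sum of m_l * r_l for l >= l*
    tail_count = 0                     # sum of m_l for l >= l*
    # 0-based index i stands for l* = i + 1; i == k means the set C is empty.
    for i in range(k, 0, -1):
        if i < k:
            tail_rate += m[i] * r[i]
            tail_count += m[i]
        br = (R - tail_rate) / (N - tail_count)
        lower = r[i] if i < k else float("-inf")   # assumed stand-in for r_{k+1}
        if r[i - 1] >= br > lower:                 # inequality (2)
            return br                              # Eqn. (5)

    # Inequality (2) failed even for l* = 2: under-subscribed link, Eqn. (4).
    return r[0] + (R - total) / m[0]
```

With R = 100 and rates 60, 50 and 10, the search stops at $l^* = 3$ and returns 45. With rates 50, 30 and 10 the link is under-subscribed and Eqn. (4) gives 60. The loop visits every distinct rate in the worst case, which is the $O(N)$ cost the text attributes to the basic algorithm.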



[Figure 5B: Switch queues for the basic algorithm. Number of cells versus time (sec) for switches cc_0 through cc_6.]

When the rates of the flows have converged to their Max-Min fair allocation, $\sum_{l=1}^{k} m_l r_l = R$ and $BR = r_1$. In this case also only $r_1$ and $\sum_{l=1}^{k} m_l r_l$ are required. However, when the link is over-subscribed (Eqn. (5)), the complete sequence of the rates is required to determine $l^*$ and compute the exact $BR_O$.

If, instead of storing the complete sequence of rates, only the first (i.e. the largest) $G$ elements (in addition to $\sum_{l=1}^{k} m_l r_l$) are stored, maintaining the sequence reduces to an $O(1)$ task. Whenever $l^* \le G + 1$, maintaining only the first $G$ elements does not affect the computation of the link bottleneck rate. But now whenever $l^* > G + 1$, we are forced to use $l^* = G + 1$. This amounts to assuming that flows with rates $r_l < r_G$ are constrained elsewhere in the network. An incorrect $BR_O$ is computed when it is not so. The search to determine $l^*$ now only requires $O(G)$ time. In this paper we only concern ourselves with the case of $G = 1$, i.e. when only the largest rate is stored. In [17] we report simulation studies of the use of values of $G$ greater than one. It is seen that for the GFCII network configuration described earlier, the performance of the algorithm when $G = 3$ is indistinguishable from when the complete sequence of rates is maintained.

Maintaining only $r_1$, i.e. the largest rate, and assuming that flows with rates $r_l < r_1$ are constrained elsewhere in the network corresponds to $l^* = 2$. Substituting in (5),

$$BR' = \frac{R - \sum_{l=2}^{k} m_l r_l}{N - \sum_{l=2}^{k} m_l} = r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1}$$

$R/N$ is a lower bound on $BR$. $BR$ equals $R/N$ when all flows are bottlenecked at the link under consideration. $BR$ is larger than $R/N$ when one or more flows are constrained at other links in the network. Hence for an over-subscribed link,

$$BR'' = \max\left(\frac{R}{N},\; r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1}\right) \tag{6}$$

Comparing with Eqn. (4), we see that the second operand to the Max function above is the same as the right-hand side of Eqn. (4). It is easy to show that $BR_U$ as given by Eqn. (4) is always greater than $R/N$. Hence Eqns. (4) and (6) can be combined and we can write in general,

$$BR = \max\left(\frac{R}{N},\; r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1}\right) \tag{7}$$

The quantity $r_1$ is the largest among the rates of the flows and the quantity $\left(R - \sum_{l=1}^{k} m_l r_l\right) / m_1$ is the surplus bandwidth divided equally among the flows with rate $r_1$. The phrase Simple LArgest PLUs Surplus (SLAPLUS) will be used to refer to the algorithm just described. Results from simulation of the GFCII network with switches employing the SLAPLUS algorithm are presented graphically in Figure 6. It can be seen that the sources converge to the exact Max-Min fair rates. Preceding the attainment of steady values there are oscillations that last a few measurement intervals. These oscillations are due to two reasons. For over-subscribed links, in the absence of knowledge of rates other than the largest rate, SLAPLUS computes a $BR$ which reduces the largest rate by the amount necessary to end the over-subscription. But this value of $BR$ may cause some of the sources with rates smaller than the largest rate to also reduce their rates. Overall, a larger than required reduction results. Conversely, SLAPLUS may compute a larger than required value for $BR$ for an under-subscribed link.

These under- and over-estimations prolong the time required to reach convergence and affect the amount of buffers required in switches. But they do not lead to persistent oscillations or unfairness [17].

[Figure 6A: Source ACRs when SLAPLUS algorithm is used. ACR (bps) versus time (sec) for sources src_a* through src_h*.]

The aggregate rate of all the flows $\sum_{l=1}^{k} m_l r_l$, the total number of flows $N$, the largest rate $r_1$ and the number $m_1$ of flows with the largest rate are all computed incrementally. To prevent multiple accounting, a flag $seen_i$ is associated with each flow $F_i$ in $S$. The flows are observed over an interval $T_0$.
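The combined estimate of Eqn. (7) and the incremental bookkeeping it relies on can be sketched in Python as follows. This is a sketch under stated assumptions rather than the authors' code: the class name `Slaplus`, its method names, and the use of a Python set in place of per-flow seen flags are illustrative, and returning R when no flow was observed in the interval is an assumption, since the text does not cover that case.

```python
class Slaplus:
    """Per-link SLAPLUS state for one measurement interval (Eqn. (7))."""

    def __init__(self, R):
        self.R = R               # bandwidth available to ABR flows
        self._reset()

    def _reset(self):
        self.seen = set()        # flows already counted in this interval
        self.total = 0.0         # running sum of m_l * r_l
        self.N = 0               # number of distinct flows seen
        self.r1 = 0.0            # largest rate seen so far
        self.m1 = 0              # number of flows at the largest rate

    def on_forward_rm_cell(self, flow_id, rate):
        """Constant-time update on arrival of a forward RM cell."""
        if flow_id in self.seen:
            return               # already accounted for in this interval
        self.seen.add(flow_id)
        self.total += rate
        self.N += 1
        if rate > self.r1:
            self.r1, self.m1 = rate, 1
        elif rate == self.r1:
            self.m1 += 1

    def end_of_interval(self):
        """Compute BR per Eqn. (7), then reinitialize for the next interval."""
        if self.N == 0:
            br = self.R          # assumption: no flows observed
        else:
            surplus_share = (self.R - self.total) / self.m1
            br = max(self.R / self.N, self.r1 + surplus_share)
        self._reset()
        return br
```

For R = 100 and observed rates 60, 50 and 10, this yields max(33.3, 60 - 20) = 40, whereas Eqn. (5) with $l^* = 3$ gives 45. The gap is an instance of the larger than required reduction that SLAPLUS can produce on an over-subscribed link.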
 



[Figure 6B: Switch queues when SLAPLUS algorithm is used. Number of cells versus time (sec) for switches cc_0 through cc_6.]

Each time a forward RM cell for a flow $F_i$ with $seen_i = false$ arrives, $\sum_{l=1}^{k} m_l r_l$, $N$, $r_1$ and $m_1$ are updated and $seen_i$ is set to true. Each update is a constant time operation. At the end of the interval, $BR$ is computed and reinitialization is performed. Computing $BR$ using Eqn. (7) also requires only $O(1)$ time. Unfortunately, resetting the flags $seen_i$ in a straightforward manner requires $O(N)$ time. This is avoided by using two arrays instead of one. During a measurement interval, one array is in use while the other is being reset. At the start of a new measurement interval the roles of the two arrays are swapped.

5. LAPLUS : LArgest PLUs Surplus

A short measurement interval improves the responsiveness of an algorithm. On the other hand, too short a measurement interval has two undesirable effects. First, no RM cell for a flow may be seen during the interval $\Delta T$ if the flow has a cell rate less than $N_{rm} / \Delta T$. This will make the $m_i$s smaller than they really are and the changes in $BR$ larger than they need to be. Oscillations in cell rates result and queues in the switches may grow until buffers overflow.

Second, if too short an interval is used to measure the available bandwidth, the flow control tries to adapt to even short-lived changes in the available bandwidth. An unstable control results [16]. In the presence of these conflicting considerations some researchers have chosen to use large measurement intervals and accept the reduced responsiveness [11]. Others use a short measurement interval but also use methods such as exponential averaging to deal with the errors [4, 9].

A better solution to the problem is using nested intervals. Each of the innermost intervals is as small as is necessary to achieve the desired responsiveness. Each interval other than the innermost is an integer multiple of the length of the next inner interval. For example, the innermost intervals may have a duration of $T_0$ and an interval of level $k$ may have a duration of $T_k = T_0 M^k$. The ATM Forum mandates that an active flow must send an RM cell every 100 ms. Hence the outermost interval must be larger than 100 ms to ensure that at least one RM cell from each active flow is seen during the outermost interval.

Nested intervals enable an outer and hence a large interval to be used for the determination of the available bandwidth. This has the effect of low-pass filtering the available bandwidth versus time function and is known to make the control stable [16].

Use of nested intervals partitions the set of flows traversing a link into levels (Figure 1). A flow with a Current Cell Rate of $CCR$ is said to be of level $k$ given by

$$k = \left\lfloor \log_M \frac{N_{rm}}{CCR \times T_0} \right\rfloor + 1 \tag{8}$$

where $N_{rm} / CCR$ is the inter-arrival time of RM cells. The above relation assigns to a flow the level $k$ such that $N_{rm} / CCR$ is strictly smaller than the duration of the level $k$ measurement interval given by $T_0 \times M^k$. Alternatively, a flow is said to be of level $k$ iff $RL_{k-1} \ge CCR > RL_k$, where $RL_k = N_{rm} / (T_0 M^k)$. A useful observation to make at this point is that when a level $k$ interval expires, intervals at levels $k - 1$, $k - 2$, .., 0 also expire.

Now a pair - $\{R_k, N_k\}$ - is maintained for each level $k$ in the nesting. The elements of the pair are intended to be the aggregate rate of level $k$ flows and their number.

At the start of every level $k$ measurement interval, $R_k$ and $N_k$ are initialized to zero. When a forward RM cell for a flow arrives, the CCR for the flow is used to assign the flow a rate-level $k$. If this is the first forward RM cell for the flow seen during the current level $k$ interval, CCR is added to $R_k$ and $N_k$ is incremented. When intervals at levels $k \le k^*$ expire, $BR$ needs to be computed. To compute $BR$, the aggregate rate $R_T$ of all flows and the total number $N_T$ of flows are required.

Figure 3 shows two levels deep nesting being used. The inner level is referred to as level zero and the outer level as level one. At $t = T_0$, a level zero interval has ended and $N_0$ gives the number of level zero flows. But $N_1$ is not equal to the number of level one flows at $t = T_0$ or any $t < T_1 = 4T_0$. Only at $t = T_1$, when a level one interval has expired, does $N_1$ equal the number of level one flows. To work around this problem, when an interval expires, the corresponding number of flows and the aggregate rate of flows are saved away for use during the following interval. Let $SN_k$ and $SR_k$ be the saved away $N_k$ and $R_k$. The question we seek to answer is whether, at any instant when intervals at levels $k \le k^*$ have expired, $\sum_{k \le k^*} N_k + \sum_{k > k^*} SN_k$ equals $N_T$.

When nested intervals, rather than a single interval, are being used, changes in rates of the flows may result in changes in their levels. Referring again to Figure 3, an RM cell for a flow A is seen at the point in time marked a1 with a value for CCR which classifies the flow as a level one flow. Rate allocation is re-computed at time $4T_0$. $N_0$ and $N_1$ are saved away in $SN_0$ and $SN_1$, respectively, and are initialized to zero. If the flow A then increases its rate sufficiently to move to level zero, an RM cell for the flow may arrive during the level zero interval ending at time $5T_0$. As the seen flag for the flow is clear, the flow is classified as a level zero flow and $N_0$ is incremented. At the expiry of the level zero interval ending at time $5T_0$,



$N_T \ne N_0 + SN_1$ as flow A is being counted twice - once in $N_0$ and once in $SN_1$.

To guard against this error, the $SN_k$ and $SR_k$ values are incrementally updated as necessary. At the beginning of a level $k$ measurement interval, $SR_k$ and $SN_k$ are equal respectively to the aggregate rate of all level $k$ flows and the total number of level $k$ flows seen during the previous level $k$ interval. During the interval, if a flow moves from level $k_P$ to level $k$, $k_P \ne k$, then $SN_{k_P}$, $SR_{k_P}$, $SN_k$ and $SR_k$ are appropriately updated.

To understand the remaining problem, consider the case when the only level zero flow, e.g. C in Figure 3, reduces its rate. An RM cell for flow C arrives at time c1, $2T_0 \le c1 < 3T_0$, with a value for CCR which classifies the flow as a level zero flow. Rate allocation is re-computed at $t = 3T_0$ and $N_0$ is initialized to zero. Then the flow reduces its rate sufficiently to move to level one. When this happens, no RM cell for the flow may be seen during the level zero interval ending at $t = 4T_0$. Flow C is not counted in $N_T = N_0 + SN_1$ and so is invisible to the computation at $t = 4T_0$.

The number of such level $k$ flows which were previously at level $k - 1$ and then reduced their rates is given by

$$NI_k = SN_{k-1} - N_{k-1} \tag{9}$$

Note that (9) is an approximation as, in practice, a level $k - 1$ flow may have reduced its rate to any of the lower levels and not specifically to level $k$. But note that any error introduced is short-lived as eventually an RM cell for the flow arrives, causes the correct $N_k$ to be incremented and, one measurement interval later, $SN_k$ to be set to $N_k$. It also helps the stability of the control, as any flow which drops several levels is taken down one level per measurement interval until an RM cell for the flow does arrive. The total number of flows is now approximated by

$$N_T = \sum_{k} SN_k \tag{10}$$

The aggregate rate of all the flows is now approximated by

$$R_T = \sum_{k \le k^*} R_k + \sum_{1 \le k \le k^*} (SN_k - N_k)\, RL_k + \sum_{k > k^*} SR_k \tag{11}$$

The bottleneck rate is once again computed as before, except that the aggregate rate of all flows and the total number of all flows given above are used. Hence,

$$BR = \max\left(\frac{R}{N_T},\; r_1 + \frac{R - R_T}{m_1}\right) \tag{12}$$

5.1. The Rate-Level and Time-Stamp Tags

A one bit seen flag is inadequate as the flag must also convey the rate level of the flow. A two bit tag may be used for a three levels deep nesting of intervals, with one of the four values standing for unseen. A two bit time-stamp is also associated with each flow. As many modulo-4 counters as the depth of nesting of intervals are maintained. All the counters are initialized to zero at power-up and each is incremented every time an interval of its level expires. Whenever the first RM cell for a flow arrives, its time-stamp is set to the modulo-4 count then associated with its current rate level. This time-stamp makes it possible to tell whether an RM cell for a flow last arrived during the current interval, the previous interval or two intervals back. The complete pseudo-code for the LAPLUS algorithm is given in the Appendix.

5.2. Minimum Cell Rate (MCR) and Peak Cell Rate (PCR) of Flows

It is simple to change the criterion from Max-Min to MCR-plus-equal-share. The available bandwidth is reduced by the sum of the MCRs of the ABR flows, Max(0, ACR-MCR) is used as the constraint rate of a flow in place of its ACR, and the ER field of returning RM cells is compared against BR + MCR.

LAPLUS determines the bottleneck rates by allowing flows to increase or asking them to decrease their rates over a number of measurement intervals. For a link with a bottleneck rate greater than the PCR of some of the flows traversing it, once the largest rate grows beyond the PCRs of any flows, those flows come to be regarded as being constrained elsewhere (in this case, at the source), as they should be. Thus PCRs of flows are taken into consideration in the normal course of operation.

6. Simulation results

Models of an end-station and a switch were built using the OPNET tool [14]. The network shown in Figure 4 was set up. This network is called the Generic Fairness Configuration II in [15]. cc_0 through cc_6 represent seven switches connected by six links. This network has embedded within it the parking-lot and the chain configurations. It is known [10] that the parking-lot configuration causes utilization to be low for some algorithms (e.g. [6]) and the chain configuration causes some algorithms (e.g. [8]) to converge to unfair allocations.

The links connecting cc_0 - cc_1 and cc_5 - cc_6 run at 50 Mbps. The links connecting cc_2 - cc_3 run at 100 Mbps. Finally the links connecting cc_3 - cc_4 and cc_4 - cc_5 run at 150 Mbps. Table 1 gives the fair rate allocation. The end-station to switch distance is assumed to be 200 m and the distance between switches 200 km.

Figure 7 shows the source ACRs and the switch queue sizes when the LAPLUS algorithm is employed. Comparing with the results for SLAPLUS, we see that the magnitude of oscillations is much reduced. Queues at every switch except cc_5 are seen to be shorter. This is because SLAPLUS used a single measurement interval T = 3 ms. LAPLUS controls flows g1 through g7, which are bottlenecked at link cc_5 - cc_6, using the level one measurement interval $T_1$ = 4 ms and the rest of the flows using the level zero measurement interval $T_0$ = 1 ms. A smaller measurement interval makes the algorithm more responsive and better at preventing queue growth.

To study the performance of the LAPLUS algorithm when there are sudden changes in bandwidth demand or availability, a simulation was carried out with all but sources g1 through g7 starting transmission first and attaining the Max-Min fair rates for this reduced configuration, after which sources g1 through g7 begin transmission. We again see (Figures 8A and 8B) that all sources attain their correct Max-Min fair rates. It is known [10] that some algorithms (e.g. [9]) are unfair to flows starting after other flows. A recent version of [9] does not suffer from this unfairness problem.

[Figure 8B: Switch queues when some sources start after a delay. Number of cells versus time (sec) for switches cc_0 through cc_6.]
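The assignment of flows to the level zero and level one intervals in the LAPLUS runs follows the level rule of Eqn. (8). A minimal Python sketch of that rule is given below, written in its equivalent threshold form $RL_{k-1} \ge CCR > RL_k$ so that no floating-point logarithm is needed. The function names, the `max_level` cap and the default `nrm=32` (the ATM Forum default for Nrm) are assumptions made for illustration; CCR is taken in cells per second and T0 in seconds.

```python
def level_threshold(k, t0, m, nrm=32):
    """RL_k = Nrm / (T0 * M^k), the lowest rate still visible at level k."""
    return nrm / (t0 * m ** k)


def rate_level(ccr, t0, m, nrm=32, max_level=8):
    """Smallest k with CCR > RL_k, i.e. RL_{k-1} >= CCR > RL_k.

    Flows faster than RL_0 stay at level 0; max_level is a hypothetical
    cap that keeps the loop finite for idle flows (CCR close to zero).
    """
    k = 0
    while k < max_level and ccr <= level_threshold(k, t0, m, nrm):
        k += 1
    return k
```

With T0 = 1 ms and M = 4, the thresholds are 32000, 8000 and 2000 cells per second for levels 0, 1 and 2, so a flow at 20000 cells per second is handled by the 4 ms level one interval while a flow at 50000 cells per second is handled by the 1 ms level zero interval.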
[Figure 7A: Source ACRs when LAPLUS algorithm is used. ACR (bps) versus time (sec) for sources src_a* through src_h*.]

[Figure 7B: Switch queues when LAPLUS algorithm is used. Number of cells versus time (sec) for switches cc_0 through cc_6.]

[Figure 8A: Source ACRs when some start after a delay. ACR (bps) versus time (sec) for sources src_a* through src_h*.]

The bandwidth available to ABR flows was set to be 95 % of the bandwidth remaining after guaranteed flows are provided for. The sources g1 through g7 were modeled as ON-OFF processes. The ON-OFF periods were made short enough to not allow the network to reach a steady-state during them. To simplify interpretation of results, the amount of traffic offered by these sources, when they are ON, was set to their fair share. Hence if the flow-control algorithm works properly, when sources g1 through g7 are on, the other sources must receive the same bandwidth as specified in Table 1 and when sources g1 through g7 are off, the sources must be given their fair share for the reduced configuration (without sources g1 through g7). It can be seen from Figure 9A that it indeed is the case. Figure 9B shows that there is no uncontrolled growth of switch queues.

[Figure 9A: Source ACRs with frequent and sharp changes in the available bandwidth. ACR (bps) versus time (sec) for sources src_a* through src_h*.]

[Figure 9B: Switch queues with frequent and sharp changes in the available bandwidth. Number of cells versus time (sec) for switches cc_0 through cc_6.]
Finally to study the performance of LAPLUS algorithm in 6.4e+02
presence of frequent and sharp changes in CBR/VBR traffic,
Number of cells

5.8e+02

simulation was carried out with traffic offered by sources g1 5.1e+02

through g7 assigned to the guaranteed class whereas the traffic 4.5e+02


3.8e+02
offered by rest of the sources was assigned to the ABR class. 3.2e+02
The handling by the switches of the traffic classes was chosen to 2.6e+02

be the simplest possible as to offer the least help to the flow- 1.9e+02

control algorithm. The switches maintain two queues at each 1.3e+02


6.4e+01
output, one for each traffic class, and serve the queues in strict 0.0e+00
static priority. Hence the ABR queue is only served when the 0.000 0.125 0.250 0.375 0.500 0.625
Time (sec)
0.750 0.875 1.000 1.125 1.250

guaranteed queue is empty.
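The strict static priority service used in this last experiment can be illustrated with a minimal Python sketch. The class and method names (`StrictPriorityPort`, `enqueue`, `dequeue`) are hypothetical and are not taken from the simulation model; the sketch only shows the service rule that the ABR queue is served only when the guaranteed queue is empty.

```python
from collections import deque


class StrictPriorityPort:
    """Output port with two FIFO queues served in strict static priority.

    Hypothetical illustration: names and structure are assumptions,
    not the authors' simulation code.
    """

    def __init__(self):
        self.guaranteed = deque()  # CBR/VBR (guaranteed class) cells
        self.abr = deque()         # ABR class cells

    def enqueue(self, cell, guaranteed):
        """Append a cell to the queue of its traffic class."""
        if guaranteed:
            self.guaranteed.append(cell)
        else:
            self.abr.append(cell)

    def dequeue(self):
        """Return the next cell to transmit, or None if both queues are empty."""
        if self.guaranteed:
            return self.guaranteed.popleft()
        if self.abr:
            return self.abr.popleft()
        return None


port = StrictPriorityPort()
port.enqueue("a1", guaranteed=False)
port.enqueue("g1", guaranteed=True)
port.enqueue("a2", guaranteed=False)
port.enqueue("g2", guaranteed=True)
order = [port.dequeue() for _ in range(5)]
print(order)  # ['g1', 'g2', 'a1', 'a2', None]
```

Under this rule every burst of guaranteed traffic directly reduces the service seen by the ABR queue, which is why the configuration offers the least help to the flow-control algorithm.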

7. Conclusions
We described LAPLUS, a switch algorithm for flow control of the ABR ATM service. LAPLUS requires as few as four flag bits of memory per flow: two to store the rate level and two to store a modulo-4 time-stamp. The algorithm has an operational time complexity of O(1). This low operational time complexity is a consequence of the use of the novel largest-plus-surplus heuristic to estimate the bottleneck rate of a link. Note, though, that on account of the flags associated with each flow, the algorithm scales only as well as O(N).
We simulated LAPLUS on network configurations which are known to cause algorithms to be unfair. LAPLUS is seen to enable flows to attain their exact Max-Min fair rates in the steady state. LAPLUS uses nested measurement intervals. Using the outermost (and therefore large) measurement interval to determine the available bandwidth filters out short-lived, large-magnitude changes. Appropriate measurement intervals, chosen based on the rates of the flows, are used to reliably determine the number of flows and the bandwidth they use. The rate allocation is recomputed once every inner (and therefore short) interval. The specific interval used depends on the largest rate. Therefore LAPLUS is able to contain queue growth and keep link utilization high. In practice, though, only a small depth of nesting (e.g. three) may be used.
Hardware modules implementing LAPLUS in a 32-port switch handling 64k flows and having an aggregate bandwidth of 320 Gbps have been designed. Work aimed at analytically proving the convergence property and deriving an upper bound on the convergence time is being pursued.

Acknowledgments
The authors wish to thank Martin Izzard and Nick McKeown for discussions and feedback which made it possible to maintain the focus on minimizing the time and space requirements of the algorithm. They also thank Dave Scott and Bob Hewes.

Appendix
Pseudo-code for the LAPLUS algorithm

Design Parameters
K = Nesting depth of intervals
T0 = Interval base
M = Interval scale factor

Parameters
R = Bandwidth available to ABR flows
RL_k = Minimum rate for a flow to be considered level k

Variables
N_k = Number of level k flows
R_k = Aggregate constraint rate of the level k flows
r1 = Largest among flow rates
m1 = Number of flows with rate r1
SN_k = Number of level k flows computed incrementally
SR_k = Aggregate constraint rate of level k flows computed incrementally
TS_k = A modulo-4 counter incremented every level k interval
i = VCI of the RM cell
ratelev_i = Rate level of flow i
TSV_i = “Time-stamp” identifying when ratelev_i was updated

Functions
function prevseen(i)
    Let k_P = ratelev_i
    if TSV_i = TS_{k_P}
        error /* Information not available */
    else if TSV_i = TS_{k_P} − 1 /* Modulo-4 */
        return k_P
    else return k_P + 1
end

function seen(i)
    Let k_P = ratelev_i
    if TSV_i = TS_{k_P}
        return k_P
    else return unseen
end

Initialization
    ∀i : ratelev_i = unseen, TSV_i = 3
    ∀k : N_k = R_k = r_k = m_k = N_T = R_T = 0, TS_k = 0

Event : Forward RM cell arrival
    Let k be such that RL_{k−1} > CCR ≥ RL_k
    If seen(i) = unseen
        N_k = N_k + 1, R_k = R_k + CCR
        If CCR > r1
            r1 = CCR, m1 = 1
        else if CCR = r1
            m1 = m1 + 1
        Let k_P = prevseen(i)
        if k ≠ k_P
            SN_k = SN_k + 1, SR_k = SR_k + CCR
            If k_P ≠ unseen
                SN_{k_P} = SN_{k_P} + 1
                if k_P > k
                    SR_{k_P} = SR_{k_P} − RL_{k_P − 1}
                else
                    SR_{k_P} = SR_{k_P} − RL_{k_P}
        ratelev_i = k, TSV_i = TS_k

Event : Expiry of measurement interval
    Let k* be the lowest level where the interval has expired.
    R_T = \sum_{k \le k^*} R_k + \sum_{1 \le k \le k^*} (SN_k - N_k) RL_k + \sum_{k > k^*} SR_k
    N_T = \sum_{k} SN_k
    BR = \max\left( \frac{R}{N_T},\; r_{k^*} + \frac{R - R_T}{m_{k^*}} \right)
    For k = k* downto 0 do
        SR_k = R_k, SN_k = N_k
        N_k = R_k = r_k = m_k = r1 = m1 = 0
        TS_k = TS_k + 1 /* Modulo-4 */
    Initiate Sequential Re-initialization.

Event : Backward RM cell arrival
    If ER > BR
        ER = BR

Sequential Re-initialization
    Let k* be the lowest level where the interval has expired.
    For i = 1 to number of flows
        Let k_P = ratelev_i
        if k_P ≤ k*
            if TSV_i = TS_{k_P} − 2 /* Modulo-4 */
                ratelev_i = k_P + 1, TSV_i = TS_{k_P} − 1

Bibliography
[1] F. Bonomi and K. W. Fendick, “The Rate-Based Flow Control Framework for the Available Bit Rate ATM Service,” IEEE Network, March/April 1995, pp. 25-39.
[2] S. S. Sathaye, “ATM Forum Traffic Management Specification,” AFTM-0056, June 1996.
[3] D. Bertsekas and R. Gallager, Data Networks, Englewood Cliffs, NJ: Prentice Hall, 1992.
[4] N. Ghani and J. W. Mark, “Dynamic Rate-Based Control Algorithm for ABR Service in ATM Networks,” Proc. GLOBECOM’96, November 1996.
[5] G. Bianchi et al., “Congestion Control Algorithms for the ABR Service in ATM Networks,” Proc. GLOBECOM’96, November 1996.
[6] S. Muddu et al., “Max-Min Rate Control Algorithm for Available Bit Rate Service in ATM Networks,” Proc. GLOBECOM’96, November 1996.
[7] A. Barnhart, “Enhanced Switch Algorithm for Section 5.4 of TM Spec.,” AF-TM 95-0195, Feb 1995.
[8] L. Roberts, “Enhanced PRCA (Proportional Rate Control Algorithm),” AF-TM 94-0735R1, Aug 1994.
[9] R. Jain et al., “ABR Switch Algorithm Testing: A Case Study With ERICA,” AF-TM 96-1267, October 1996.
[10] F. M. Chiussi and A. Varma, “QOS and Congestion Control in ATM Networks,” IEEE Workshop on VLSI in Communications, 1996.
[11] A. Charny et al., “Time Scale Analysis and Scalability Issues for Explicit Rate Allocation in ATM Networks,” IEEE/ACM Trans. on Networking, pp. 569-581, August 1996.
[12] D. H. K. Tsang et al., “A New Rate-Based Switch Algorithm for ABR Traffic to Achieve Max-Min Fairness with Analytical Approximation and Delay Adjustment,” INFOCOM’96, Mar 1996.
[13] Y. Afek et al., “Phantom: A Simple and Effective Flow Control Scheme,” Proc. SIGCOMM’96, August 1996.
[14] Opnet Modeller, Volumes 1 - 8. MIL 3 Inc., Washington.
[15] R. Simcoe, “Test Configurations for Fairness and other Tests,” AF-TM 94-0557, Jul 1994.
[16] Y. Zhao et al., “Feedback Control of Multiloop ABR Traffic in presence of CBR/ABR Traffic Transmission,” Proc. ICC’96, June 1996.
[17] S. Prasad et al., “LAPLUS: A Provably Convergent Switch Algorithm for Flow Control of the Available Bit Rate ATM Service,” In preparation.
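
As an illustration of the largest-plus-surplus computation of BR and of the backward RM cell handling in the appendix pseudo-code, the following minimal Python sketch evaluates the two formulas for one link. The function names and the guard for an empty link are assumptions made for this example; the per-level bookkeeping (N_k, SN_k, SR_k, time-stamps) is deliberately left out, and the totals are passed in directly.

```python
def bottleneck_rate(R, N_T, R_T, r_max, m_max):
    """Largest-plus-surplus estimate of the bottleneck rate BR.

    R     : bandwidth available to ABR flows
    N_T   : total number of flows counted on the link
    R_T   : aggregate constraint rate of those flows
    r_max : largest flow rate seen (r_{k*} in the pseudo-code)
    m_max : number of flows at that largest rate (m_{k*})
    """
    if N_T == 0 or m_max == 0:
        # Assumption (not in the source): with no flows observed,
        # offer the whole available bandwidth.
        return R
    equal_share = R / N_T
    largest_plus_surplus = r_max + (R - R_T) / m_max
    return max(equal_share, largest_plus_surplus)


def stamp_er(ER, BR):
    """Backward RM cell arrival: reduce ER to BR if it is larger."""
    return min(ER, BR)


# Worked example: R = 100 Mbps shared by flows at 10, 30 and 30 Mbps.
BR = bottleneck_rate(R=100.0, N_T=3, R_T=70.0, r_max=30.0, m_max=2)
print(BR)                  # 45.0
print(stamp_er(60.0, BR))  # 45.0
print(stamp_er(20.0, BR))  # 20.0
```

In this example the 10 Mbps flow is taken to be constrained elsewhere, so the surplus R − R_T = 30 Mbps is divided between the two flows at the largest rate, giving each of them 45 Mbps.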

Figure 4 - The Generic Fairness Configuration II network
