Broadband Network Lab, Samsung Telecommunications America, Richardson, TX 75081
Dept of Elect. Engg., University of Texas at Dallas, P.O.Box 830688, Richardson, TX 75083
ABSTRACT

LAPLUS is a novel switch algorithm for flow control of the Available Bit Rate (ABR) Asynchronous Transfer Mode (ATM) service. It ensures a steady-state rate allocation satisfying the MCR-plus-equal-share criterion. It requires only constant-time processing and two tags to be stored per flow. It is naturally able to take Peak Cell Rates of flows into account. LAPLUS solves in a novel way the problem of selecting a measurement interval. The solution allows it to contain queue growth and keep utilization high on the one hand, and to control low speed flows and operate with stability on the other. We describe the results of a simulation study of LAPLUS. The results show it to be fair, responsive and stable.

1. Introduction

The ATM Forum has defined the Available Bit Rate (ABR) service for applications that require from the network, in addition to an optional minimum bandwidth, an amount of bandwidth that is difficult to specify precisely. The ABR service dynamically varies the bandwidth allowed to an application based on both the needs of the application and the congestion state of the network.

The members of the ATM Forum have reached an agreement to employ rate-based closed-loop feedback flow control for supporting the ABR service. The Forum has specified [1] the behavior required of end-stations and switches to be compliant with the ABR service. In this paper we focus our attention on the algorithm a switch may use to manage congestion, maximize the utilization of resources and meet the QOS guarantees applicable to ABR connections. There are two applicable guarantees: fair access to the available bandwidth and a specified low Cell Loss Ratio (CLR).

There are several possible fairness criteria [2]. The two we use here are the Max-Min [3] and the MCR-plus-equal-share criteria. The simple distributed algorithm [3] for Max-Min fair rate allocation, and many others reported in the literature [4, 5, 12], require the current rate of each flow to be stored. This calls for a large amount of high-speed memory in the switches. Some variations of the simple distributed algorithm update the rate allocation each time a forward RM cell arrives. Others recompute the allocation periodically. Both the update and the recomputation require time proportional to the square of the number of flows, making the simple distributed algorithm impractical. On the positive side, the algorithm quickly converges to the Max-Min fair allocation.

At the other end of the spectrum are algorithms which rely on approximations to reduce the computation time and memory size [6-9, 13]. Network configurations can be constructed which make algorithms of this class converge to an unfair rate allocation [10]. In the middle are clever implementations of the basic algorithm, which require a significantly smaller amount of memory and enable efficient computation of the rate allocation [11]. All algorithms that recompute the allocation periodically are prone to problems that arise from the period being too short or too long. A long period makes an algorithm less responsive. A short period, when low speed flows are present, leads to errors in the measurement of traffic load on a link. If a short period is used for measuring the available bandwidth, an unstable control results [16].

All switch algorithms for flow controlling the ABR ATM service have to deal with growth of switch queues during transients. Techniques for containing the queue growth include requiring source end-stations to delay rate increases but to effect rate reductions immediately upon being notified [11, 12], setting aside a fraction of the link bandwidth [7-9, 13], etc. Both techniques lower link bandwidth utilization. Delaying rate increases limits low utilization to periods of transients. But delaying rate increases requires mechanisms in the source end-station which have not been specified by the ATM Forum.

In this paper we present the algorithm named LAPLUS. LAPLUS approximates the bottleneck rate of a link as the LArgest among flow rates PLUs the Surplus bandwidth divided by the number of flows with the largest rate. This estimate, when bounded below at a suitable value, enables links to succeed in determining their bottleneck rates in the order of their bottleneck levels. While rates of flows differ greatly from their respective MCR-plus-fair-share values, this approximation causes flows to change their rates by amounts that have the correct sign but whose magnitude may be imprecise. When rates of flows satisfy a lock condition, LAPLUS estimates the correct bottleneck rate. The algorithm is naturally able to take Peak Cell Rates of flows into account. The algorithm only requires constant-time processing and two flags per flow to be stored while ensuring MCR-plus-equal-share rate allocation in the steady state. It solves in a novel way the problem of selecting a control or measurement interval. It uses nested measurement intervals. High rate flows are controlled using inner (therefore short) intervals to keep utilization high and to contain queue growth. Outer (therefore long) intervals are used to control low rate flows [...]
1. This work was performed at Texas Instruments, Incorporated.
$$BR_U = r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1} \qquad (4)$$

If the link is not under-subscribed, the inequality (2) is satisfied for some $l = l^*$ and the BR is given by

$$BR_O = \frac{R - \sum_{l=l^*}^{k} m_l r_l}{N - \sum_{l=l^*}^{k} m_l} \qquad (5)$$

where the subscript O stands for over-subscribed.

Table 1: Max-Min fair rate allocation for GFCII

Flow groups    Fair rate
a, f           10 Mbps
b, g           5 Mbps
c, d, e        35 Mbps
h              52.5 Mbps

[Figure 1: Nested intervals and rate levels. A Level 0 interval lasts 1 ms; a Level 1 interval lasts 4 ms.]

[...] queue at each output. Figure 5 shows the source ACRs and switch queues when the basic algorithm is employed.

A problem with the above procedure is that it requires a sequence of rates to be maintained. This requires a large amount of high-speed memory and $O(N)$ processing time, where $N$ is the total number of flows, as elements in the sequence of rates for the link have to be examined sequentially to find $l = l^*$ satisfying inequality (2).

[Figure 5A: Source ACRs for the basic algorithm. ACR (bps) versus time (sec) for sources src_a through src_h.]
Charny et al. have proven [11] that if the algorithm implied by Eqn. (1) is used, then once changes in $R$ and the bandwidth demands of sources cease, the convergence process begins and links succeed in determining their bottleneck rates in the order of their bottleneck levels. We assume, at this point in the discussion, that if $L$ is a level $l$ link, links at levels 1 through $l - 1$ have already determined their bottleneck rates. For the analysis in [11] to be applicable, a bottleneck level $l$ flow must at all instants have a rate greater than the level $l - 1$ bottleneck rate.

[...] ourselves with the case of $G = 1$, i.e. when only the largest rate is stored. In [17] we report simulation studies of the use of values of $G$ greater than one. It is seen that for the GFCII network configuration described earlier, the performance of the algorithm when $G = 3$ is indistinguishable from when the complete [...]

Maintaining only $r_1$, i.e. the largest rate, and assuming that flows with rates $r_l < r_1$ are constrained elsewhere in the network gives

$$BR' = \frac{R - \sum_{l=2}^{k} m_l r_l}{N - \sum_{l=2}^{k} m_l} = r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1}$$

$R/N$ is a lower bound on $BR$. $BR$ equals $R/N$ when all flows are bottlenecked at the link under consideration. $BR$ is larger than $R/N$ when one or more flows are constrained at other links in the network. Hence for an over-subscribed link,

$$BR'' = \max\left(\frac{R}{N},\; r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1}\right) \qquad (6)$$

4. SLAPLUS: Simple LArgest PLUs Surplus

We note from Eqn. (4) that to compute $BR_U$, it suffices to know $r_1$ (the largest rate), $m_1$ (the number of flows with rate equal to the largest rate) and the sum $\sum_{l=1}^{k} m_l r_l$. Knowledge of the rest of the rates individually is not required.

$$BR = \max\left(\frac{R}{N},\; r_1 + \frac{R - \sum_{l=1}^{k} m_l r_l}{m_1}\right)$$

The quantity $r_1$ is the largest among the rates of the flows and [...]

The aggregate rate of all the flows $\sum_{l=1}^{k} m_l r_l$, the total number of flows $N$, the largest rate $r_1$ and the number $m_1$ of flows with the largest rate are all computed incrementally. To prevent multiple accounting, a flag $seen_i$ is associated with each flow $F_i$ in $S$. The flows are observed over an interval $T_0$.
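As a rough illustration of the estimate discussed above, the following Python sketch accumulates the aggregate rate, N, r1 and m1 as forward RM cells arrive and then evaluates BR = max(R/N, r1 + (R - aggregate)/m1). The names `IntervalMeasurement`, `observe` and `bottleneck_rate` are hypothetical, a Python set stands in for the per-flow seen flag, and the fallback for an interval in which no flow is observed is an assumption that does not come from the paper.

```python
from dataclasses import dataclass, field


def bottleneck_rate(R, N, r1, m1, aggregate):
    """Largest rate plus surplus share, bounded below by the equal share R/N."""
    if N == 0 or m1 == 0:
        return R  # assumed fallback when no flow was observed (not from the paper)
    surplus = R - aggregate  # negative when the link is over-subscribed
    return max(R / N, r1 + surplus / m1)


@dataclass
class IntervalMeasurement:
    """State for one measurement interval; each update is O(1)."""
    aggregate: float = 0.0  # running sum of the rates of all observed flows
    N: int = 0              # number of distinct flows observed
    r1: float = 0.0         # largest rate observed so far
    m1: int = 0             # number of flows at the largest rate
    seen: set = field(default_factory=set)  # stands in for the seen_i flags

    def observe(self, flow_id, ccr):
        """Account for a forward RM cell carrying the flow's current cell rate."""
        if flow_id in self.seen:
            return  # already counted in this interval
        self.seen.add(flow_id)
        self.aggregate += ccr
        self.N += 1
        if ccr > self.r1:
            self.r1, self.m1 = ccr, 1
        elif ccr == self.r1:
            self.m1 += 1


m = IntervalMeasurement()
for flow_id, ccr in [("a", 40.0), ("b", 40.0), ("c", 10.0), ("a", 40.0)]:
    m.observe(flow_id, ccr)
br = bottleneck_rate(100.0, m.N, m.r1, m.m1, m.aggregate)
```

With R = 100 and observed rates 40, 40 and 10 (arbitrary units), the surplus of 10 is shared by the two flows at the largest rate, so `br` is 45. This agrees with BR' = (100 - 10)/(3 - 1), where the flow at rate 10 is taken to be constrained elsewhere.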
Each time a forward RM cell for a flow $F_i$ with $seen_i = false$ arrives, $\sum_{l=1}^{k} m_l r_l$, $N$, $r_1$ and $m_1$ are updated and $seen_i$ is set to true. Each update is a constant time operation. At the end of the interval, BR is computed and reinitialization is performed. Computing BR using Eqn. (7) also requires only $O(1)$ time.

Unfortunately, resetting the flags $seen_i$ in a straightforward manner requires $O(N)$ time. This is avoided by using two arrays instead of one. During a measurement interval, one array is in use while the other is being reset. At the start of a new measurement interval the roles of the two arrays are swapped.

5. LAPLUS: LArgest PLUs Surplus

A short measurement interval improves the responsiveness of an algorithm. On the other hand, too short a measurement interval has two undesirable effects. First, no RM cell for a flow may be seen during the interval $\Delta T$ if the flow has a cell rate less than $N_{rm}/\Delta T$. This will make the $m_i$'s smaller than they really are and the changes in BR larger than they need to be. Oscillations in cell rates result and queues in the switches may grow until buffers overflow.

Second, if too short an interval is used to measure the available bandwidth, the flow control tries to adapt to even short lived changes in the available bandwidth. An unstable control results [16]. In the presence of these conflicting considerations some researchers have chosen to use large measurement intervals and accept the reduced responsiveness [11]. Others use a short measurement interval but also use methods such as exponential averaging to deal with the errors [4, 9].

A better solution to the problem is to use nested intervals. Each of the innermost intervals is as small as is necessary to achieve the desired responsiveness. The length of every other interval is an integer multiple of the length of the next inner interval. For example, the innermost intervals may have a duration of $T_0$ and an interval of level $k$ may have a duration of $T_k = T_0 M^k$. The ATM Forum mandates that an active flow must send an RM cell every 100 ms. Hence the outermost interval must be larger than 100 ms to ensure that at least one RM cell from each active flow is seen during the outermost interval.

Nested intervals enable an outer and hence a large interval to be used for the determination of the available bandwidth. This has [...]

$$k = \left\lfloor \log_M \frac{N_{rm}}{CCR \times T_0} \right\rfloor + 1$$

where $N_{rm}/CCR$ is the inter-arrival time of RM cells. The above relation assigns to a flow the level $k$ such that $N_{rm}/CCR$ is strictly smaller than the duration of the level $k$ measurement interval given by $T_0 \times M^k$. Alternatively, a flow is said to be of level $k$ iff $RL_{k-1} \geq CCR > RL_k$, where $RL_k = N_{rm}/(T_0 M^k)$. A useful observation to make at this point is that when a level $k$ interval expires, intervals at levels $k - 1$, $k - 2$, .., 0 also expire.

Now a pair $\{R_k, N_k\}$ is maintained for each level $k$ in the nesting. The elements of the pair are intended to be the aggregate rate of level $k$ flows and their number.

At the start of every level $k$ measurement interval, $R_k$ and $N_k$ are initialized to zero. When a forward RM cell for a flow arrives, the CCR for the flow is used to assign the flow a rate-level $k$. If this is the first forward RM cell for the flow seen during the current level $k$ interval, CCR is added to $R_k$ and $N_k$ is incremented. When intervals at levels $k \leq k^*$ expire, BR needs to be computed. To compute BR, the aggregate rate $R_T$ of all flows and the total number $N_T$ of flows are required.

Figure 3 shows two levels deep nesting being used. The inner level is referred to as level zero and the outer level as level one. At $t = T_0$, a level zero interval has ended and $N_0$ gives the number of level zero flows. But $N_1$ is not equal to the number of level one flows at $t = T_0$ or any $t < T_1 = 4T_0$. Only at $t = T_1$, when a level one interval has expired, does $N_1$ equal the number of level one flows. To work around this problem, when an interval expires, the corresponding number of flows and the aggregate rate of flows are saved away for use during the following interval. Let $SN_k$ and $SR_k$ be the saved away $N_k$ and $R_k$. The question we seek to answer is whether, at any instant when intervals at levels $k \leq k^*$ have expired, $\sum_{k \leq k^*} N_k + \sum_{k > k^*} SN_k$ equals $N_T$.

When nested intervals, rather than a single interval, are being used, changes in rates of the flows may result in changes in their levels. Referring again to Figure 3, an RM cell for a flow A is seen at the point in time marked a1 with a value for CCR which classifies the flow as a level one flow. Rate allocation is re-computed at time $4T_0$. $N_0$ and $N_1$ are saved away in $SN_0$ and $SN_1$, respectively, and are initialized to zero. If the flow A then increases its rate sufficiently to move to level zero, an RM cell for the flow may arrive during the level zero interval ending at time $5T_0$. As the seen flag for the flow is clear, the flow is classified as a level zero flow and $N_0$ is incremented. At the expiry of the level zero interval ending at time $5T_0$, [...]
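The level assignment and the per-level bookkeeping described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `rate_level` and `NestedCounters` are hypothetical names, the per-level seen sets are a simplification of the seen flags, N_rm = 32 is assumed for the example, and the handling of flows that change level between intervals is not covered here.

```python
def rate_level(ccr, nrm, t0, m, max_level):
    """Smallest level k whose interval t0 * m**k strictly exceeds nrm / ccr."""
    inter_arrival = nrm / ccr  # time between RM cells of this flow
    for k in range(max_level + 1):
        if inter_arrival < t0 * m ** k:
            return k
    return max_level  # assumed clamp for very slow flows


class NestedCounters:
    """One (R_k, N_k) pair per level, plus the saved copies SR_k and SN_k."""

    def __init__(self, levels):
        self.R = [0.0] * levels
        self.N = [0] * levels
        self.SR = [0.0] * levels
        self.SN = [0] * levels
        self.seen = [set() for _ in range(levels)]  # assumed per-level seen flags

    def observe(self, flow_id, ccr, k):
        """Count a flow once per level-k interval."""
        if flow_id not in self.seen[k]:
            self.seen[k].add(flow_id)
            self.R[k] += ccr
            self.N[k] += 1

    def totals(self, k_star):
        """Fresh counts for levels <= k_star, saved counts for deeper levels."""
        r = sum(self.R[: k_star + 1]) + sum(self.SR[k_star + 1:])
        n = sum(self.N[: k_star + 1]) + sum(self.SN[k_star + 1:])
        return r, n

    def expire(self, k_star):
        """Save and reset every level whose interval has just expired."""
        for k in range(k_star + 1):
            self.SR[k], self.SN[k] = self.R[k], self.N[k]
            self.R[k], self.N[k] = 0.0, 0
            self.seen[k].clear()


# Two levels with T0 = 1 ms and M = 4; rates are in cells per second.
fast_level = rate_level(100000.0, 32, 0.001, 4, 1)
slow_level = rate_level(10000.0, 32, 0.001, 4, 1)
counters = NestedCounters(2)
counters.observe("x", 100000.0, fast_level)
counters.observe("y", 10000.0, slow_level)
at_level_one_expiry = counters.totals(1)   # both levels freshly measured
counters.expire(1)
counters.observe("x", 100000.0, fast_level)
at_next_level_zero_expiry = counters.totals(0)  # level one taken from saved values
```

In this run the fast flow falls in level zero and the slow flow in level one. After the level one interval expires, the slow flow is still accounted for at the next level zero expiry through the saved pair, so both totals report two flows.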
[Figure 7B: Switch queues when LAPLUS algorithm is used. Number of cells versus time (sec) for queues cc_0 through cc_6.]

[Figure 8A: Source ACRs when some start after a delay. ACR (bps) versus time (sec) for sources src_a through src_h.]

Finally, to study the performance of the LAPLUS algorithm in the presence of frequent and sharp changes in CBR/VBR traffic, [...] be the simplest possible as to offer the least help to the flow-[...]

The bandwidth available to ABR flows was set to be 95% of the bandwidth remaining after guaranteed flows are provided for. The sources g1 through g7 were modeled as ON-OFF processes. The ON-OFF periods were made short enough to not allow the network to reach a steady state during them. To simplify interpretation of results, the amount of traffic offered by these sources, when they are ON, was set to their fair share. Hence, if the flow-control algorithm works properly, when sources g1 through g7 are on, the other sources must receive the same bandwidth as specified in Table 1, and when sources g1 through g7 are off, the sources must be given their fair share for the reduced configuration (without sources g1 through g7). It can be seen from Figure 9A that this is indeed the case. Figure 9B shows that there is no uncontrolled growth of switch queues.

[Figure 9A: Source ACRs with frequent and sharp changes in the available bandwidth. ACR (bps) versus time (sec) for sources src_a through src_h.]

[Figure 9B: Switch queues with frequent and sharp changes in the available bandwidth. Number of cells versus time (sec) for queues cc_0 through cc_6.]
Parameters
  R = Bandwidth available to ABR flows
  RL_k = Minimum rate for a flow to be considered level k

Variables
  [...]

[...]
If k_P ≠ unseen
  SN_{k_P} = SN_{k_P} + 1
if k_P > k
  SR_{k_P} = SR_{k_P} − RL_{k_P − 1}