
Router Design:

Table Lookups and Packet Scheduling


EECS 122: Lecture 13

Department of Electrical Engineering and Computer Sciences

University of California Berkeley

Review: Switch Architectures


- Input Queued
  - Faster switch
  - Two congestion points
- Output Queued
  - Slower switch
  - Backplane speedup is N
  - One point of congestion

[Figure labels: input interface, HOL (head-of-line blocking), backplane, output interface]

March 4, 2003

Abhay Parekh, EE122 S2003

Today
- Fast routing table lookups
  - Exact match and longest prefix match (LPM)
- How to resolve router congestion?
  - Assume output queued
  - Focus on scheduling
- Already covered packet dropping
- Defer the end-to-end issues to next lecture

[Figure labels: input interface, backplane, output interface]

Route Lookup
- Exact Match
  - MPLS/ATM: Direct lookup possible because of labels
  - Ethernet: Global address space is flat and huge
  - Hashing is an option but we won't discuss it
- Longest Prefix Match
  - IP
- In Ethernet and IP routing it is useful to think of the search process as a traversal of a special kind of labeled tree called a Trie

Trees and Tries


[Figure: a binary search tree (each node compares with < or >; depth log2 N for N entries) next to a binary search trie (branches labeled 0 and 1; example leaves 010 and 111). Figure credit: Nick McKeown]

Simple Tries and Exact Matching in Ethernet


- Each address in the routing table entry is of fixed length
  - E.g. using a simple binary search trie it takes 48 steps to resolve a MAC address lookup
  - When the table is much smaller than 2^48 (!), this seems like a lot of steps
  - Instead of matching one bit per level, why not m bits per level?

Multiway Tries
16-ary Search Trie

[Figure: each node holds 16 entries of the form (4-bit value, ptr), from 0000 to 1111. Ptr = 0 means no children. Example 12-bit keys at the leaves: 000011110000 and 111111111111. Figure credit: Nick McKeown]
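The multiway idea can be sketched in a few lines of Python. This is a minimal illustration, not the layout a real switch would use: the class name `MultiwayTrie`, the `stride` parameter (the slides' m), and the use of `None` in place of "ptr = 0" are all assumptions made for the example.

```python
class MultiwayTrie:
    """Exact-match trie that consumes `stride` bits of the key per level."""

    def __init__(self, key_bits=48, stride=4):
        if key_bits % stride != 0:
            raise ValueError("this sketch assumes stride divides key_bits")
        self.key_bits = key_bits
        self.stride = stride
        self.fanout = 1 << stride              # 16 entries per node when stride is 4
        self.root = [None] * self.fanout       # None plays the role of ptr = 0

    def _chunks(self, key):
        # Yield the key's stride-bit groups, most significant group first.
        mask = self.fanout - 1
        for shift in range(self.key_bits - self.stride, -1, -self.stride):
            yield (key >> shift) & mask

    def insert(self, key, value):
        chunks = list(self._chunks(key))
        node = self.root
        for c in chunks[:-1]:
            if node[c] is None:
                node[c] = [None] * self.fanout  # allocate a child only when needed
            node = node[c]
        node[chunks[-1]] = value                # last level stores the result

    def lookup(self, key):
        """Return (value or None, number of levels visited)."""
        node = self.root
        steps = 0
        for c in self._chunks(key):
            steps += 1
            node = node[c]
            if node is None:                    # empty pointer: key is not in the table
                return None, steps
        return node, steps


trie = MultiwayTrie(key_bits=48, stride=4)
trie.insert(0x001122334455, 3)                  # hypothetical MAC address -> port 3
print(trie.lookup(0x001122334455))              # (3, 12): 48 bits / 4 bits per level
print(trie.lookup(0xFFFFFFFFFFFF))              # (None, 1): miss detected at the root
```

With `stride=1` the same class behaves like the binary trie and needs 48 steps per successful lookup.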

Why not just have one level?


- Memory required goes up with higher values of m
  - Too many nodes with 0 pointers
  - Having more levels in the tree allows potentially large subtrees of the balanced tree to be cut off and not stored
- The right number of levels is a function of memory constraints
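A rough way to see the tradeoff is to tabulate, for a 48-bit key, how many levels a lookup visits and how many pointers each node must hold as m grows. The helper name `trie_shape` is made up for this sketch, and it ignores how many nodes are actually allocated, which depends on the table contents.

```python
import math


def trie_shape(key_bits, m):
    """Levels visited per lookup and pointers per node when matching m bits per level."""
    levels = math.ceil(key_bits / m)
    fanout = 2 ** m
    return levels, fanout


for m in (1, 4, 8, 16, 48):
    levels, fanout = trie_shape(48, m)
    print(f"m={m:2d}: {levels:2d} levels, {fanout} pointers per node")
```

At m = 48 there is a single level, but the one node needs 2^48 pointers, almost all of them empty.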

Simple Tries and LPM


- The routing table entry is a variable length prefix
  - E.g. 10000000 00010111 00001001 00 for 128.23.9.0/26
  - A balanced tree won't work
  - Variable number of steps required

LPM in IP Routers
Binary tries

Example Prefixes
a) 00001
b) 00010
c) 00011
d) 001
e) 0101
f) 011
g) 100
h) 1010
i) 1100
j) 11110000

[Figure: binary trie with branches labeled 0 and 1; nodes a through j sit at the positions spelled out by the prefixes above. Figure credit: Nick McKeown]
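A small Python sketch of longest prefix match on a binary trie, loaded with the example prefixes a) through j). The names `Node`, `build` and `longest_prefix_match` are illustrative only, and addresses are given as bit strings to keep the example short.

```python
class Node:
    def __init__(self):
        self.child = [None, None]   # child[0] for bit 0, child[1] for bit 1
        self.label = None           # set when a prefix ends at this node


def build(prefixes):
    """Build a binary trie from a {label: bit-string prefix} mapping."""
    root = Node()
    for label, bits in prefixes.items():
        node = root
        for b in bits:
            i = int(b)
            if node.child[i] is None:
                node.child[i] = Node()
            node = node.child[i]
        node.label = label
    return root


def longest_prefix_match(root, address_bits):
    """Walk the trie bit by bit, remembering the last prefix seen on the way down."""
    node, best = root, None
    for b in address_bits:
        node = node.child[int(b)]
        if node is None:
            break
        if node.label is not None:
            best = node.label
    return best


PREFIXES = {
    "a": "00001", "b": "00010", "c": "00011", "d": "001", "e": "0101",
    "f": "011", "g": "100", "h": "1010", "i": "1100", "j": "11110000",
}
root = build(PREFIXES)
print(longest_prefix_match(root, "00011010"))   # c
print(longest_prefix_match(root, "10011111"))   # g
print(longest_prefix_match(root, "11110001"))   # None: no stored prefix matches
```

The number of steps varies with the address: a lookup stops as soon as it falls off the trie.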

LPM in IP Routers
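Patricia trie

Example Prefixes: a) 00001, b) 00010, c) 00011, d) 001, e) 0101, f) 011, g) 100, h) 1010, i) 1100, j) 11110000

[Figure: Patricia trie over these prefixes, with branches labeled 0 and 1; the path toward j carries the annotations "Skip 5" and "1000" in place of a chain of single-child nodes. Figure credit: Nick McKeown]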

Associative Memory
- Content addressable memory is expensive, consumes power and is still slow, but makes the lookup problem easy
  - For Ethernet, compares incoming data to all entries in parallel
  - For LPM, do 32 exact matches in parallel
    - Match prefixes of length 1
    - Match prefixes of length 2
    - Etc.
- Still not practical, but this may change some day
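The "one exact match per prefix length" idea can be imitated in software with one hash table per length. This is only an analogy: a CAM probes all lengths in parallel, whereas the sketch below tries them one after another, longest first. The function names and the tiny prefix table are assumptions for illustration.

```python
def build_tables(prefixes, max_len=32):
    """One exact-match table per prefix length; tables[k] holds the length-k prefixes."""
    tables = [dict() for _ in range(max_len + 1)]
    for bits, next_hop in prefixes.items():
        tables[len(bits)][bits] = next_hop
    return tables


def lpm(tables, address_bits):
    """Return the next hop of the longest matching prefix, or None if none matches."""
    longest = min(len(address_bits), len(tables) - 1)
    for length in range(longest, 0, -1):
        hit = tables[length].get(address_bits[:length])
        if hit is not None:
            return hit
    return None


tables = build_tables({"001": "d", "00011": "c", "0101": "e"})
print(lpm(tables, "00011010"))   # c
print(lpm(tables, "00100000"))   # d
print(lpm(tables, "11111111"))   # None
```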

Scheduling and Classification


- In a store and forward network, contention for router ports can result in considerable packet delay
  - How do we manage this delay so that
    - The network does not get swamped with retransmissions
    - Certain kinds of applications are not victimized by bearing an unfair share of the delay
    - Delay sensitive applications are protected
- Every packet is separated into one of M classes
- At each port, have a separate queue for each class
- Very similar (and complementary) to issues addressed by RED and other packet dropping mechanisms

Classification and Scheduling

[Figure, top: Router without any classification. Model of FIFO router (output) queue, with labels A(t), B(t) and D(t).]

[Figure, bottom: Router with M classes. Model of router queues: A(t) is split into A1(t), A2(t), ..., AM(t); the scheduling discipline serves the queues at rates R(f1), R(f2), ..., R(fM) to produce D(t). Figure credit: Nick McKeown]

Classification
- Packet Class
  - A way of receiving differentiated treatment from the output port scheduler
  - Delay Sensitivity
  - Loss Sensitivity
  - Etc.
  - More fields => harder to implement
- How many queues?
  - More queues => harder to implement
  - One for each priority
  - One for each flow (example, VCI)
  - Etc.

Motivating Example: FCFS


- A is an audio session
  - Steady (peak rate = avg. rate)
- B, C and D are image files
  - Bursty (high peak rate)
- What happens to A?

[Figure: rate versus time for the arrivals of B, C, D and A, and the order in which A, B, C and D are served under FCFS.]

Delays build up for the steady session


[Figure: rate-versus-time plots of the arrivals (steady session A together with bursts B and C) and of session A's traffic at Node 1 and at Node 2. Both links FCFS at rate 1.]

Hard to control performance with FCFS


- Bursty traffic can unfairly hold up steady traffic
  - Makes steady traffic bursty in the process!
- Very difficult to analyze how a FCFS network will perform under reasonable traffic scenarios
- Only way to control performance is to make buffers very small
  - Difficult for high speed traffic

Fairness
- Suppose there are N applications sharing the network
- Each application j can express its satisfaction with network performance in terms of Uj(r), where r = (r1, ..., rN) is the vector of rates allocated to the applications.
  - Most common utility function is Uj(r) = rj for all j
- What is a fair allocation of the throughputs?
  - Fairness is vague so there are many definitions
    - Maximize the sum of utilities: May penalize some apps
    - Max-Min Fair

Max-Min Fairness
- Let an assignment be (U1(r), ..., UN(r))
- Given an assignment A, let $\mathrm{MIN}_A$ be the smallest component of the vector A
- Pick an assignment $A^*$ such that for any other assignment A
  - $\mathrm{MIN}_{A^*} \ge \mathrm{MIN}_A$
- There may be many max-min allocations
- Need to be able to classify packets by flow in order to implement
- Also needs a scheduling policy

Simplest Case
- N flows, Uj(r) = rj
- Max-min fair rate computation:
  1. Pick the unassigned flow, j, with the smallest rate
  2. sj = min{rj, C/N}
  3. C = C - sj
  4. N = N - 1
  5. go to 1 (stop once every flow is assigned)

Example with C = 10:

Flow   Requested rate rj   Assigned rate sj
1      8                   4
2      6                   4
3      2                   2

- Why is this max-min fair?
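The rate computation above can be written as a short Python function. It is a direct transcription of the five steps under the slide's assumption Uj(r) = rj; the function name `max_min_fair` and the list-based interface are choices made for this sketch.

```python
def max_min_fair(demands, capacity):
    """Assign rates s_j to flows with requested rates r_j on a link of the given capacity."""
    n = len(demands)
    shares = [0.0] * n
    remaining = float(capacity)     # C in the slide
    unassigned = n                  # N in the slide
    # Step 1: visit flows in order of increasing requested rate.
    for j in sorted(range(n), key=lambda i: demands[i]):
        shares[j] = float(min(demands[j], remaining / unassigned))  # step 2
        remaining -= shares[j]                                      # step 3
        unassigned -= 1                                             # step 4
    return shares


print(max_min_fair([8, 6, 2], 10))   # [4.0, 4.0, 2.0]
```

Flow 3 asks for less than an equal share, so it gets all it wants; the 8 units left over are then split evenly between the two flows that want more.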


Scheduling Goals
- Flexibility
  - Must be able to accommodate a wide range of performance objectives
  - Must be fair
- Predictable/Analyzable
  - Must have some way to determine if the performance objectives are met
- Implementable
  - Cost
  - Performance

Fluid Tracking Scheduling Policies


- The time taken to serve a packet of size L on a link of speed C is L/C.
- IDEALIZATION: Assume that L/C is infinitesimal
  - In any positive interval can serve traffic from any subset of the traffic classes
  - Traffic behaves as a fluid
- Design a fluid service discipline that is flexible and fair
- Approximate this ideal policy by a Tracking Service discipline with the true value of L/C

Generalized Processor Sharing


- Each class j has a weight wj
- If there is traffic to be served from class j at time t we say that it is backlogged at time t.
  - B(t) is the set of backlogged classes at time t
  - W(t) is the total weight of the backlogged classes at time t
- If there is even one backlogged class, the server operates at rate C
  - This is called being work conserving
- Serve the classes in proportion to their weights
  - If class j is backlogged at time t, give it service rate Sj(t) = [wj/W(t)] * C
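As a quick check of the rate formula, the following Python sketch computes Sj(t) for whichever classes are backlogged at one instant. The function name `gps_rates` and the dictionary-based interface are assumptions for the example.

```python
def gps_rates(weights, backlogged, capacity):
    """Service rate S_j = (w_j / W) * C for each backlogged class j.

    weights:    {class: w_j}
    backlogged: the set B(t) of classes with traffic waiting
    capacity:   link rate C
    """
    total = sum(weights[j] for j in backlogged)          # W(t)
    return {j: weights[j] / total * capacity for j in backlogged}


print(gps_rates({1: 1, 2: 1}, {1, 2}, 1.0))   # each class gets 0.5
print(gps_rates({1: 1, 2: 3}, {1, 2}, 1.0))   # class 1 gets 0.25, class 2 gets 0.75
print(gps_rates({1: 1, 2: 3}, {2}, 1.0))      # {2: 1.0}: the lone class gets all of C
```

The last call shows work conservation: when only one class is backlogged it receives the full link rate, whatever its weight.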

Generalized Processor Sharing

[Figure: GPS example with two classes, Class 1 and Class 2, and w1 = w2. Delays of 5 and 4 are marked, with B(2) = {1,2} and B(4) = {2}.]

Adjust weights to change delay

[Figure: GPS example with two classes, Class 1 and Class 2, and 3w1 = w2. Delays of 4 and 5 are marked, with B(2) = {1,2} and B(4) = {1}.]

GPS Properties - I
- Let the sum of all the weights be F
  - F >= W(t)
- When j is backlogged, Sj(t) = [wj/W(t)] * C >= [wj/F] * C = gj
- gj is called the Backlog Clearing Rate (BCR)
- If an arriving class j bit sees a queue of q class j bits, it must leave within q/gj sec.
  - This is an example of a delay bound
- Suppose audio class comes in at 64 Kb/s and has BCR = 80 Kb/s
  - Set wj = (80000 * F)/C
  - Performance should be good regardless of how the other classes arrive
- Can't do this under FCFS
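To make the audio example concrete, here is a small Python calculation. The 80 Kb/s backlog clearing rate comes from the slide; the link speed (10 Mb/s), the total weight F = 100 and the 8000-bit queue are made-up numbers chosen only to give round results.

```python
def weight_for_bcr(target_bcr, total_weight, capacity):
    """Weight w_j that yields backlog clearing rate g_j = (w_j / F) * C."""
    return target_bcr * total_weight / capacity


def delay_bound(queue_bits, bcr):
    """Worst-case time for a bit that finds queue_bits of its own class ahead of it."""
    return queue_bits / bcr


C = 10_000_000        # assumed link speed in b/s (not from the slides)
F = 100               # assumed sum of all weights (not from the slides)

w_audio = weight_for_bcr(80_000, F, C)
g_audio = w_audio * C / F

print(w_audio)                        # weight needed for the audio class
print(g_audio)                        # resulting BCR in b/s, about 80000
print(delay_bound(8_000, g_audio))    # about 0.1 s for an 8000-bit class backlog
```

The bound depends only on the class's own queue and its guaranteed rate, which is why the other classes' arrivals do not enter the calculation.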

GPS Properties - II
- If a class has an average rate of rj and we give it a weight so that gj < rj, then does the class j queue blow up?
  - Not if the aggregate incoming rate (rate summed over all classes) is less than C
    - If it isn't, some queue has to blow up under any service discipline
    - Finding a delay bound is trickier
- Giving steady classes a BCR much higher than rj keeps them steady and doesn't hurt other classes

Treat Steady Sessions Well!


[Figure: two-class example shown first with w1 = w2 and then with w1 >>>> w2.]

Weighted Fair Queueing


- Work Conserving
- Tracks GPS
  - Simulates the arrivals in real-time in a GPS system.
  - Serves entire packets (only one class at a time)
  - Let Fp be the time that the last bit of packet p departs the simulated GPS system
  - WFQ attempts to serve packets in order of Fp
  - Also called Packetized Generalized Processor Sharing (PGPS)

WFQ/PGPS Example
[Figure: WFQ/PGPS example with w1 = w2.]

WFQ/PGPS Example: Fair Queueing


[Figure: example with w1 = w2.]

- No packet finishes later under PGPS

WFQ Example II: Fair Queueing


[Figure: example with w1 = w2.]

- Packets out of order
  - Packet from Class 1 finishes late under WFQ
  - Result: Maximum lateness is Lmax/C, where Lmax is the maximum packet size allowed. WFQ never falls further behind.
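Written as an inequality, the lateness result reads as follows. The symbol $\hat{F}_p$ for the WFQ departure time of packet p is introduced here for illustration; $F_p$ is its departure time from the simulated GPS system. The numbers in the second line are an assumed example (1500-byte maximum packet, 10 Mb/s link), not values from the lecture.

```latex
\hat{F}_p - F_p \;\le\; \frac{L_{\max}}{C}
\qquad\text{for every packet } p,
\qquad\text{e.g.}\quad
\frac{L_{\max}}{C} = \frac{12000\ \text{bits}}{10^{7}\ \text{bits/s}} = 1.2\ \text{ms}.
```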

Virtual Time Implementation (FQ)


- Measure service, instead of time
- V(t) slope: rate at which every active flow receives service
  - C: link capacity
  - N(t): number of active flows in fluid flow system at time t

$$\frac{\partial V(t)}{\partial t} = \frac{C}{N(t)}$$

[Figure: V(t) plotted against time, above a timeline of the service in the fluid flow system. Figure credit: Ion Stoica]

Fair Queueing Implementation


- Define
  - $F_i^k$: finishing time of packet k of flow i (in system virtual time reference system)
  - $a_i^k$: arrival time of packet k of flow i
  - $L_i^k$: length of packet k of flow i
- The finishing time of packet k+1 of flow i is

$$F_i^{k+1} = \max\bigl(V(a_i^{k+1}),\, F_i^k\bigr) + L_i^{k+1}$$

(Slide credit: Ion Stoica)
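The finishing-time recurrence can be sketched in Python for equal-weight fair queueing. The sketch tracks V(t) as a piecewise linear function with slope C/N(t), treats a flow as active while its latest finishing tag exceeds V(t), and freezes V(t) when no flow is active; that last choice, the function name `fq_finish_tags` and the tuple format are assumptions of this example, not something stated in the slides.

```python
def fq_finish_tags(packets, capacity):
    """Compute virtual finishing tags F_i^k for equal-weight fair queueing.

    packets:  list of (arrival_time, flow_id, length), sorted by arrival time
    capacity: link rate C, in length units per time unit
    """
    v = 0.0            # virtual time V(t)
    now = 0.0          # real time at which v was last brought up to date
    last_tag = {}      # flow_id -> finishing tag of that flow's latest packet
    tags = []
    for arrival, flow, length in packets:
        dt = arrival - now
        while dt > 0:
            active = [f for f in last_tag.values() if f > v]
            if not active:
                break                              # idle fluid system: V(t) stays put
            n = len(active)
            next_tag = min(active)                 # next flow to empty in the fluid system
            need = (next_tag - v) * n / capacity   # real time until that happens
            if need <= dt:
                v, dt = next_tag, dt - need        # slope changes once that flow empties
            else:
                v, dt = v + dt * capacity / n, 0.0
        now = arrival
        tag = max(v, last_tag.get(flow, 0.0)) + length   # the recurrence
        last_tag[flow] = tag
        tags.append(tag)
    return tags


# Flows A and B arrive at time 0; flow C arrives at time 3; link rate 1.
print(fq_finish_tags([(0, "A", 4), (0, "B", 1), (3, "C", 1)], 1.0))
# [4.0, 1.0, 3.0]
```

In the example, V grows at slope 1/2 until B empties at real time 2 (V = 1), then at slope 1, so V(3) = 2 and C's packet gets tag 3. A packet scheduler would then try to transmit waiting packets in increasing tag order.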

WFQ and Performance


- Since GPS is analyzable, WFQ can be analyzed as well using the relationships between the two
  - This allows for predictable performance in the network
- WFQ implemented on most routers
- For fast core routers, which are input queued, more complicated tracking policies for GPS must be devised
- Meaning of class:
  - Session
  - Type of traffic
  - CEO's traffic
  - Etc.

Scheduling disciplines abound!


- Many improve WFQ
  - E.g. WF2Q, Hierarchical FQ, etc.
- Some are much easier to implement
  - Deficit Round Robin
- Non-Work Conserving alternatives
- Earliest Deadline First
  - Each packet has a deadline, not a class
  - Most flexible service discipline

Summary
- Scheduling is an important mechanism to implement a policy on how to share the bandwidth of an output port
  - Mainly used to manage delay
- Scheduling is only effective if packet buffers are large enough to matter
  - Overprovisioning argument (popular but controversial)
  - Virtual queues in TCP may make this small
- Scheduling should be used in conjunction with packet dropping

Next Lecture
- The effect of scheduling on end-to-end performance
- Signaling
- QoS/CoS
