Chapter 3

Transport Layer

Computer Networking: A Top Down Approach, 4th edition.
Jim Kurose, Keith Ross, Addison-Wesley, July 2007.

Transport Layer 3-1


Chapter 3: Transport Layer
Our goals:
- understand principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- learn about transport layer protocols in the Internet:
  - UDP: connectionless transport
  - TCP: connection-oriented transport
  - TCP congestion control


Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
3.6 Principles of congestion control
3.7 TCP congestion control


Transport services and protocols
- provide logical communication between app processes running on different hosts
- transport protocols run in end systems
  - send side: breaks app messages into segments, passes to network layer
  - rcv side: reassembles segments into messages, passes to app layer
- more than one transport protocol available to apps
  - Internet: TCP and UDP
[Figure: application / transport / network / data link / physical stacks at the two end systems]


Internet transport-layer protocols
- reliable, in-order delivery to app: TCP
  - congestion control
  - flow control
  - connection setup
- unreliable, unordered delivery to app: UDP
  - no-frills extension of best-effort IP
- services not available:
  - delay guarantees
  - bandwidth guarantees
[Figure: protocol stacks at the end systems and at the routers between them]




Multiplexing/demultiplexing
Demultiplexing at rcv host: delivering received segments to correct socket
Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
[Figure: host 1, host 2 and host 3 with processes P1 to P4 and their sockets above the transport, network, link and physical layers]

How demultiplexing works:
General for TCP and UDP
- host receives IP datagrams
  - each datagram has source, destination IP addresses
  - each datagram carries 1 transport-layer segment
  - each segment has source, destination port numbers
- host uses IP addresses & port numbers to direct segment to appropriate socket, process, application
TCP/UDP segment format (32 bits wide): source port #, dest port #; other header fields; application data (message)


Connectionless demultiplexing
Create sockets with port numbers:
DatagramSocket mySocket1 = new DatagramSocket(12534);
DatagramSocket mySocket2 = new DatagramSocket(12535);
UDP socket identified by two-tuple:
(dest IP address, dest port number)
When host receives UDP segment:
- checks destination port number in segment
- directs UDP segment to socket with that port number
IP datagrams with different source IP addresses and/or source port numbers directed to same socket


Connectionless demux (cont)
DatagramSocket serverSocket = new DatagramSocket(6428);

[Figure: server at IP: C with a socket on port 6428, clients at IP: A and IP: B.
Client A sends SP: 9157, DP: 6428; the server replies with SP: 6428, DP: 9157.
Client B sends SP: 5775, DP: 6428; the server replies with SP: 6428, DP: 5775.]
SP provides return address


Connection-oriented demux
- TCP socket identified by 4-tuple:
  - source IP address
  - source port number
  - dest IP address
  - dest port number
- recv host uses all four values to direct segment to appropriate socket
- Server host may support many simultaneous TCP sockets:
  - each socket identified by its own 4-tuple
- Web servers have different sockets for each connecting client
  - non-persistent HTTP will have different socket for each request
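The difference between the UDP two-tuple and the TCP 4-tuple can be sketched as two lookup tables. This is only an illustration under stated assumptions: the class `DemuxSketch`, its method names, and the use of plain strings to stand in for sockets are hypothetical and not how any real operating system stores its sockets.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative demultiplexing tables; all names here are hypothetical. */
final class DemuxSketch {
    // UDP: keyed by (dest IP, dest port) only.
    private final Map<String, String> udpSockets = new HashMap<>();
    // TCP: keyed by (source IP, source port, dest IP, dest port).
    private final Map<String, String> tcpSockets = new HashMap<>();

    void bindUdp(String destIp, int destPort, String socketName) {
        udpSockets.put(destIp + ":" + destPort, socketName);
    }

    void addTcpConnection(String srcIp, int srcPort, String destIp, int destPort, String socketName) {
        tcpSockets.put(srcIp + ":" + srcPort + "->" + destIp + ":" + destPort, socketName);
    }

    /** The source fields are ignored for the lookup; they only serve as return address. */
    String demuxUdp(String srcIp, int srcPort, String destIp, int destPort) {
        return udpSockets.get(destIp + ":" + destPort);
    }

    /** All four values are needed to find the socket. */
    String demuxTcp(String srcIp, int srcPort, String destIp, int destPort) {
        return tcpSockets.get(srcIp + ":" + srcPort + "->" + destIp + ":" + destPort);
    }

    public static void main(String[] args) {
        DemuxSketch server = new DemuxSketch();
        server.bindUdp("C", 6428, "udpServerSocket");
        server.addTcpConnection("A", 9157, "C", 80, "tcpSocketForA");
        server.addTcpConnection("B", 9157, "C", 80, "tcpSocketForB");

        // Different UDP senders reach the same socket.
        System.out.println(server.demuxUdp("A", 9157, "C", 6428));
        System.out.println(server.demuxUdp("B", 5775, "C", 6428));
        // Same destination port 80, but different TCP sockets per client.
        System.out.println(server.demuxTcp("A", 9157, "C", 80));
        System.out.println(server.demuxTcp("B", 9157, "C", 80));
    }
}
```

The addresses A, B, C and the port numbers are the ones used in the demultiplexing figures of this chapter.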


Connection-oriented demux (cont)
[Figure: web server at IP: C, clients at IP: A and IP: B. Three segments arrive at the server, each directed to a different socket:
- S-IP: A, SP: 9157, D-IP: C, DP: 80
- S-IP: B, SP: 9157, D-IP: C, DP: 80
- S-IP: B, SP: 5775, D-IP: C, DP: 80]




UDP: User Datagram Protocol [RFC 768]
- no frills, bare bones transport protocol
- best effort service, UDP segments may be:
  - lost
  - delivered out of order to app
- connectionless:
  - no handshaking between UDP sender, receiver
  - each UDP segment handled independently
Why is there a UDP?
- no connection establishment (which can add delay)
- simple: no connection state at sender, receiver
- small segment header
- no congestion control: UDP can blast away as fast as desired (more later on interaction with TCP!)


UDP: more
- often used for streaming multimedia apps
  - loss tolerant
  - rate sensitive
- other UDP uses
  - DNS
  - SNMP (net mgmt)
- reliable transfer over UDP: add reliability at app layer
  - application-specific error recovery!
- used for multicast, broadcast in addition to unicast (point-point)
UDP segment format (32 bits wide): source port #, dest port #; length, checksum; application data (message)
- length: in bytes of UDP segment, including header



Principles of Reliable data transfer
- important in app., transport, link layers
- top-10 list of important networking topics!
- characteristics of unreliable channel will determine complexity of reliable data transfer protocol (rdt)

Reliable data transfer: getting started
send side:
- rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
- udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
- rdt_rcv(): called when packet arrives on rcv-side of channel
- deliver_data(): called by rdt to deliver data to upper layer


Flow Control
- End-to-end flow and Congestion control
study is complicated by:
- Heterogeneous resources (links, switches,
applications)
- Different delays due to network dynamics
- Effects of background traffic
We start with a simple case: hop-by-hop
flow control



Hop-by-hop flow control
Approaches/techniques for hop-by-hop
flow control
- Stop-and-wait
- sliding window
- Go back N
- Selective reject



Stop-and-wait: reliable transfer over a reliable channel
- underlying channel perfectly reliable
  - no bit errors, no loss of packets
- stop and wait: sender sends one packet, then waits for receiver response


channel with bit errors
- underlying channel may flip bits in packet
  - checksum to detect bit errors
- the question: how to recover from errors:
  - acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
  - negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
  - sender retransmits pkt on receipt of NAK
- new mechanisms for:
  - error detection
  - receiver feedback: control msgs (ACK,NAK) rcvr->sender
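As an illustration of checksum-based error detection, here is a minimal sketch of the 16-bit one's-complement Internet checksum that UDP and TCP use. The class and method names are hypothetical, and the sketch sums only the bytes it is given; a real UDP checksum also covers header fields, which this example leaves out.

```java
/** Sketch of the 16-bit one's-complement Internet checksum; names are illustrative. */
final class InternetChecksum {
    /** Sums the data as 16-bit words with end-around carry, then complements the result. */
    static int checksum(byte[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i += 2) {
            int hi = data[i] & 0xFF;
            int lo = (i + 1 < data.length) ? (data[i + 1] & 0xFF) : 0; // pad odd length with a zero byte
            sum += (hi << 8) | lo;
            sum = (sum & 0xFFFF) + (sum >> 16); // wrap the carry back into the low 16 bits
        }
        return (int) (~sum & 0xFFFF);
    }

    /** Receiver-side check: recompute and compare with the checksum carried in the packet. */
    static boolean looksIntact(byte[] data, int receivedChecksum) {
        return checksum(data) == receivedChecksum;
    }

    public static void main(String[] args) {
        byte[] data = {0x66, 0x60, 0x55, 0x55, (byte) 0x8F, 0x0C};
        int sum = checksum(data);
        System.out.printf("checksum = 0x%04X%n", sum);
        data[0] ^= 0x01; // simulate the channel flipping one bit
        System.out.println("intact after bit flip? " + looksIntact(data, sum));
    }
}
```

A mismatch tells the receiver that the packet had errors, which is the trigger for sending a NAK in the scheme above. A matching checksum does not prove the packet is error-free, since some multi-bit errors cancel out.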


Stop-and-wait: Corrupt ACK/NAK

What happens if ACK/NAK corrupted?
- sender doesn't know what happened at receiver!
- can't just retransmit: possible duplicate
Handling duplicates:
- sender retransmits current pkt if ACK/NAK garbled
- sender adds sequence number to each pkt
- receiver discards (doesn't deliver up) duplicate pkt


discussion
Sender:
- seq # added to pkt
- two seq. #s (0,1) will suffice. Why?
- must check if received ACK/NAK corrupted
Receiver:
- must check if received packet is duplicate
  - state indicates whether 0 or 1 is expected pkt seq #
- note: receiver can not know if its last ACK/NAK received OK at sender


channels with errors and loss

New assumption: underlying channel can also lose packets (data or ACKs)
- checksum, seq. #, ACKs, retransmissions will be of help, but not enough
Approach: sender waits reasonable amount of time for ACK
- retransmits if no ACK received in this time
- if pkt (or ACK) just delayed (not lost):
  - retransmission will be duplicate, but use of seq. #s already handles this
  - receiver must specify seq # of pkt being ACKed
- requires countdown timer


Stop-and-wait operation Summary
Stop and wait:
- sender waits for ACK before sending another frame
- sender uses a timer to re-transmit if no ACK arrives
- if ACK is lost:
  - A sends frame, B's ACK gets lost
  - A times out & re-transmits the frame, B receives duplicates
  - Sequence numbers are added (frame0,1 ACK0,1)
- timeout: should be related to round trip time estimates
  - if too small: unnecessary re-transmissions
  - if too large: long delays
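The lost-ACK case in the summary above can be sketched as a small deterministic simulation. Everything here is an assumption made for illustration: the class `StopAndWaitSketch`, the `ackLost` script that says which transmissions lose their ACK, and the choice to model a timeout simply as "no ACK came back, so send the same frame again".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Stop-and-wait with 1-bit sequence numbers; a hypothetical sketch, not a real protocol stack. */
final class StopAndWaitSketch {
    /** Receiver state: which sequence number (0 or 1) it expects next. */
    static final class Receiver {
        private int expectedSeq = 0;
        final List<String> delivered = new ArrayList<>();

        /** Handles a frame and returns the sequence number being ACKed. */
        int onFrame(int seq, String payload) {
            if (seq == expectedSeq) {
                delivered.add(payload);          // new data: deliver up
                expectedSeq = 1 - expectedSeq;
            }
            return seq;                          // a duplicate is not delivered again, but is still ACKed
        }
    }

    /**
     * Sends every message in order. ackLost[i] == true means the ACK for the
     * i-th transmission is lost, so the sender times out and retransmits.
     * Returns the total number of transmissions.
     */
    static int send(List<String> messages, boolean[] ackLost, Receiver receiver) {
        int seq = 0;
        int transmissions = 0;
        for (String msg : messages) {
            boolean acked = false;
            while (!acked) {
                int ackSeq = receiver.onFrame(seq, msg);
                boolean lost = transmissions < ackLost.length && ackLost[transmissions];
                transmissions++;
                acked = !lost && ackSeq == seq;  // otherwise: timeout, retransmit the same frame
            }
            seq = 1 - seq;                       // alternate between frame 0 and frame 1
        }
        return transmissions;
    }

    public static void main(String[] args) {
        Receiver b = new Receiver();
        // The ACK for the second transmission is lost.
        int sent = send(Arrays.asList("a", "b", "c"), new boolean[] {false, true, false, false}, b);
        System.out.println("transmissions: " + sent + ", delivered: " + b.delivered);
    }
}
```

In this run B receives the second frame twice, but the sequence number lets it discard the duplicate, so each message is delivered once.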


Stop-and-wait with lost packet/frame

Stop and wait performance
utilization: fraction of time sender busy sending
- ideal case (error free)
  - $U = T_{frame}/(T_{frame} + 2T_{prop}) = 1/(1+2a)$, where $a = T_{prop}/T_{frame}$


Performance of stop-and-wait

example: 1 Gbps link, 15 ms e-e prop. delay, 1 KB packet:

$$T_{transmit} = \frac{L \text{ (packet length in bits)}}{R \text{ (transmission rate, bps)}} = \frac{8 \text{ kb/pkt}}{10^{9} \text{ b/sec}} = 8 \text{ microsec}$$

$U_{sender}$: utilization, fraction of time sender busy sending

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008 \text{ ms}}{30.008 \text{ ms}} = 0.00027$$

- 1 KB pkt every 30 msec -> 33 kB/sec thruput over 1 Gbps link
- network protocol limits use of physical resources!


rdt3.0: stop-and-wait operation
[Timing diagram, sender and receiver:]
- first packet bit transmitted, t = 0
- last packet bit transmitted, t = L / R
- first packet bit arrives at receiver
- last packet bit arrives, receiver sends ACK
- ACK arrives, sender sends next packet, t = RTT + L / R

$$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008 \text{ ms}}{30.008 \text{ ms}} = 0.00027$$


- consider losses
- assume Timeout ~ $2T_{prop}$
- on average need $N_x$ attempts to get the frame through
- $p$ is the probability of frame being in error
- Pr[k attempts are made before the frame is transmitted correctly] $= p^{k-1}(1-p)$
- $N_x = \sum_{k} k \Pr[k] = 1/(1-p)$
- For stop-and-wait
  - $U = T_{frame}/[N_x (T_{frame} + 2T_{prop})] = 1/[N_x (1+2a)]$
  - $U = (1-p)/(1+2a)$
- stop and wait is a conservative approach to flow control but is wasteful

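The step from the attempt distribution to $N_x = 1/(1-p)$ is the mean of a geometric distribution. A short derivation, assuming frame errors are independent from one attempt to the next (an assumption the slide does not state explicitly):

```latex
\begin{align*}
N_x &= \sum_{k=1}^{\infty} k \, p^{k-1} (1-p)
     = (1-p) \, \frac{d}{dp} \sum_{k=0}^{\infty} p^{k}
     = (1-p) \, \frac{d}{dp} \frac{1}{1-p}
     = \frac{1-p}{(1-p)^{2}}
     = \frac{1}{1-p}, \\
U   &= \frac{T_{frame}}{N_x \, (T_{frame} + 2T_{prop})}
     = \frac{1}{N_x (1+2a)}
     = \frac{1-p}{1+2a},
     \qquad a = \frac{T_{prop}}{T_{frame}} .
\end{align*}
```

For example, with $p = 0.1$ each frame needs about 1.11 attempts on average, so utilization drops to 90% of the error-free value.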
Sliding window techniques
- TCP is a variant of sliding window
- Includes Go back N (GBN) and selective
repeat/reject
- Allows for outstanding packets without Ack
- More complex than stop and wait
- Need to buffer un-Acked packets & more
book-keeping than stop-and-wait



Pipelined (sliding window) protocols
Pipelining: sender allows multiple, in-flight, yet-to-be-acknowledged pkts
- range of sequence numbers must be increased
- buffering at sender and/or receiver
Two generic forms of pipelined protocols: go-Back-N, selective repeat

Pipelining: increased utilization
[Timing diagram, sender and receiver:]
- first packet bit transmitted, t = 0
- last bit transmitted, t = L / R
- first packet bit arrives
- last packet bit arrives, send ACK
- last bit of 2nd packet arrives, send ACK
- last bit of 3rd packet arrives, send ACK
- ACK arrives, send next packet, t = RTT + L / R

Increase utilization by a factor of 3!

$$U_{sender} = \frac{3 \cdot L/R}{RTT + L/R} = \frac{0.024 \text{ ms}}{30.008 \text{ ms}} = 0.0008$$

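The utilization figures for stop-and-wait and for a pipeline of 3 packets can be reproduced with a few lines of arithmetic. The class and method names below are made up for this sketch; the inputs are the example values from the slides (1 KB packet taken as 8000 bits, 1 Gbps link, RTT of 30 ms).

```java
/** Sender utilization for a window of packets per RTT; illustrative helper only. */
final class SenderUtilization {
    /** U = window * (L/R) / (RTT + L/R), capped at 1. L in bits, R in bits/sec, RTT in seconds. */
    static double utilization(int window, double packetBits, double rateBps, double rttSeconds) {
        double transmit = packetBits / rateBps;               // L/R in seconds
        double u = window * transmit / (rttSeconds + transmit);
        return Math.min(1.0, u);                              // sender cannot be busy more than all the time
    }

    /** Effective throughput in bytes per second for the given utilization. */
    static double throughputBytesPerSec(double utilization, double rateBps) {
        return utilization * rateBps / 8.0;
    }

    public static void main(String[] args) {
        double stopAndWait = utilization(1, 8000, 1e9, 0.030);
        double pipelined = utilization(3, 8000, 1e9, 0.030);
        System.out.printf("stop-and-wait U = %.5f%n", stopAndWait);
        System.out.printf("3-packet pipeline U = %.5f%n", pipelined);
        System.out.printf("stop-and-wait throughput = %.0f bytes/sec%n",
                throughputBytesPerSec(stopAndWait, 1e9));
    }
}
```

With a window of 1 the helper gives the stop-and-wait case; larger windows scale utilization linearly until the pipe is full.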
Go-Back-N
Sender:
- k-bit seq # in pkt header
- window of up to N, consecutive unacked pkts allowed
- ACK(n): ACKs all pkts up to, including seq # n - cumulative ACK
  - may receive duplicate ACKs (more later)
- timer for each in-flight pkt
- timeout(n): retransmit pkt n and all higher seq # pkts in window
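A minimal sketch of the Go-Back-N sender rules listed above. It is simplified on purpose: sequence numbers are unbounded integers rather than a k-bit field, timers are replaced by an explicit `onTimeout(n)` call, and the `transmitted` list is just a log for inspection. The class and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Go-Back-N sender bookkeeping; a simplified, hypothetical sketch. */
final class GoBackNSender {
    private final int windowSize;                        // N
    private int base = 0;                                // oldest unACKed seq #
    private int nextSeqNum = 0;                          // next seq # to use
    final List<Integer> transmitted = new ArrayList<>(); // log of every seq # put on the wire

    GoBackNSender(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Data from above: send only if the window has room. Returns false when refused. */
    boolean send() {
        if (nextSeqNum >= base + windowSize) {
            return false;
        }
        transmitted.add(nextSeqNum);
        nextSeqNum++;
        return true;
    }

    /** Cumulative ACK(n): everything up to and including n is acknowledged. */
    void onAck(int n) {
        if (n >= base && n < nextSeqNum) {
            base = n + 1;                                // duplicate or stale ACKs leave base unchanged
        }
    }

    /** timeout(n): retransmit pkt n and all higher seq # pkts already sent. */
    void onTimeout(int n) {
        for (int seq = Math.max(n, base); seq < nextSeqNum; seq++) {
            transmitted.add(seq);
        }
    }

    int base() {
        return base;
    }

    public static void main(String[] args) {
        GoBackNSender sender = new GoBackNSender(4);
        for (int i = 0; i < 5; i++) {
            System.out.println("send accepted? " + sender.send()); // the fifth call is refused
        }
        sender.onAck(1);      // pkts 0 and 1 acknowledged, window slides
        sender.send();        // pkt 4 now fits
        sender.onTimeout(2);  // pkt 2 timed out: resend 2, 3, 4
        System.out.println(sender.transmitted);
    }
}
```

The cost of the simple receiver is visible in the log: one timeout causes every in-flight packet from n onward to be sent again, even those that arrived correctly.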


GBN: receiver side

- ACK-only: always send ACK for correctly-received pkt with highest in-order seq #
  - may generate duplicate ACKs
  - need only remember expected seq num
- out-of-order pkt:
  - discard (don't buffer) -> no receiver buffering!
  - Re-ACK pkt with highest in-order seq #


GBN in action


Selective Repeat
- receiver individually acknowledges all correctly received pkts
  - buffers pkts, as needed, for eventual in-order delivery to upper layer
- sender only resends pkts for which ACK not received
  - sender timer for each unACKed pkt
- sender window
  - N consecutive seq #s
  - limits seq #s of sent, unACKed pkts


Selective repeat: sender, receiver windows



Selective repeat
sender
data from above:
- if next available seq # in window, send pkt
timeout(n):
- resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
- mark pkt n as received
- if n smallest unACKed pkt, advance window base to next unACKed seq #

receiver
pkt n in [rcvbase, rcvbase+N-1]:
- send ACK(n)
- out-of-order: buffer
- in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]:
- ACK(n)
otherwise:
- ignore
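The receiver rules above can be sketched as follows. As a simplification, sequence numbers are unbounded integers here, so the wrap-around issue of a finite sequence space does not arise. The class `SelectiveRepeatReceiver` and its members are hypothetical names for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Selective repeat receiver; a simplified, hypothetical sketch. */
final class SelectiveRepeatReceiver {
    private final int windowSize;                                   // N
    private int rcvBase = 0;                                        // next in-order seq # expected
    private final Map<Integer, String> buffer = new HashMap<>();    // out-of-order pkts held back
    final List<String> delivered = new ArrayList<>();               // data passed to the upper layer

    SelectiveRepeatReceiver(int windowSize) {
        this.windowSize = windowSize;
    }

    /** Returns true if ACK(n) is sent, false if the packet is ignored. */
    boolean onPacket(int n, String payload) {
        if (n >= rcvBase && n <= rcvBase + windowSize - 1) {
            buffer.put(n, payload);
            // Deliver pkt rcvBase and any buffered pkts that are now in order.
            while (buffer.containsKey(rcvBase)) {
                delivered.add(buffer.remove(rcvBase));
                rcvBase++;
            }
            return true;
        }
        if (n >= rcvBase - windowSize && n <= rcvBase - 1) {
            return true;                                            // already delivered: ACK again
        }
        return false;                                               // otherwise: ignore
    }

    int rcvBase() {
        return rcvBase;
    }

    public static void main(String[] args) {
        SelectiveRepeatReceiver rcvr = new SelectiveRepeatReceiver(4);
        rcvr.onPacket(1, "b");                                      // out of order: buffered, ACKed
        rcvr.onPacket(0, "a");                                      // in order: delivers a, then b
        System.out.println(rcvr.delivered + " rcvBase=" + rcvr.rcvBase());
    }
}
```

Re-ACKing packets below the window matters because the sender's window cannot advance until it sees those ACKs, even though the receiver has already moved on.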


Selective repeat in action



Selective repeat: dilemma
Example:
- seq #s: 0, 1, 2, 3
- window size=3
- receiver sees no difference in two scenarios!
- incorrectly passes duplicate data as new in (a)
Q: what relationship between seq # size and window size?
(check hwk), (try applet)

performance:
- selective repeat:
  - error-free case:
    - if the window is w such that the pipe is full: $U = 100\%$
    - otherwise $U = w \cdot U_{stop\text{-}and\text{-}wait} = w/(1+2a)$
  - in case of error:
    - if w fills the pipe: $U = 1-p$
    - otherwise $U = w \cdot U_{stop\text{-}and\text{-}wait} = w(1-p)/(1+2a)$




TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581

- point-to-point:
  - one sender, one receiver
- reliable, in-order byte stream:
  - no message boundaries
- pipelined:
  - TCP congestion and flow control set window size
- send & receive buffers
- full duplex data:
  - bi-directional data flow in same connection
  - MSS: maximum segment size
- connection-oriented:
  - handshaking (exchange of control msgs) inits sender, receiver state before data exchange
- flow controlled:
  - sender will not overwhelm receiver
[Figure: application writes data through the socket door into the TCP send buffer; segments carry it to the TCP receive buffer, from which the application reads data through its socket door]


TCP segment structure
Segment layout (32 bits wide), top to bottom:
- source port #, dest port #
- sequence number
- acknowledgement number
- head len, not used, flags U A P R S F, Receive window
- checksum, Urg data pnter
- Options (variable length)
- application data (variable length)
Notes:
- sequence and acknowledgement numbers: counting by bytes of data (not segments!)
- Receive window: # bytes rcvr willing to accept


TCP segment structure
Header fields and flags (segment is 32 bits wide):
- sequence number, acknowledgement number: counting by bytes of data (not segments!)
- URG: urgent data (generally not used)
- ACK: ACK # valid
- PSH: push data now (generally not used)
- RST, SYN, FIN: connection estab (setup, teardown commands)
- Receive window: # bytes rcvr willing to accept
- checksum: Internet checksum (as in UDP)
- Options (variable length)
- application data (variable length)


TCP seq. #s and ACKs
Seq. #s:
- byte stream number of first byte in segment's data
ACKs:
- seq # of next byte expected from other side
- cumulative ACK
Q: how receiver handles out-of-order segments
- A: TCP spec doesn't say, up to implementor
[Figure: simple telnet scenario over time. User at Host A types C; Host B ACKs receipt of C and echoes back C; Host A ACKs receipt of echoed C]

Reliability in TCP
Components of reliability
1. Sequence numbers
2. Retransmissions
3. Timeout Mechanism(s): function of the round
trip time (RTT) between the two hosts (is it
static?)



TCP Round Trip Time and Timeout
Q: how to set TCP timeout value?
- longer than RTT
  - but RTT varies
- too short: premature timeout
  - unnecessary retransmissions
- too long: slow reaction to segment loss
Q: how to estimate RTT?
- SampleRTT: measured time from segment transmission until ACK receipt
  - ignore retransmissions
- SampleRTT will vary, want estimated RTT smoother
  - average several recent measurements, not just current SampleRTT


TCP Round Trip Time and Timeout
$$EstimatedRTT = (1-\alpha) \cdot EstimatedRTT + \alpha \cdot SampleRTT$$

- Exponential weighted moving average
- influence of past sample decreases exponentially fast
- typical value: $\alpha = 0.125$


Example RTT estimation:
[Figure: RTT from gaia.cs.umass.edu to fantasia.eurecom.fr. SampleRTT and Estimated RTT in milliseconds (axis from 100 to 350) plotted against time in seconds]


TCP Round Trip Time and Timeout
Setting the timeout
- EstimatedRTT plus safety margin
  - large variation in EstimatedRTT -> larger safety margin
- first estimate of how much SampleRTT deviates from EstimatedRTT:

$$DevRTT = (1-\beta) \cdot DevRTT + \beta \cdot |SampleRTT - EstimatedRTT|$$

(typically, $\beta = 0.25$)

Then set timeout interval:

$$TimeoutInterval = EstimatedRTT + 4 \cdot DevRTT$$
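The three formulas for EstimatedRTT, DevRTT and TimeoutInterval can be combined into a small estimator. Two details are assumptions of this sketch rather than something the slides specify: the initial values are supplied by the caller, and DevRTT is updated before EstimatedRTT so that the deviation is measured against the previous estimate. The class name is hypothetical.

```java
/** EWMA-based RTT estimator following the formulas above; initialisation is an assumption. */
final class RttEstimator {
    static final double ALPHA = 0.125; // weight of the newest sample in EstimatedRTT
    static final double BETA = 0.25;   // weight of the newest deviation in DevRTT

    private double estimatedRtt;
    private double devRtt;

    RttEstimator(double initialEstimatedRtt, double initialDevRtt) {
        this.estimatedRtt = initialEstimatedRtt;
        this.devRtt = initialDevRtt;
    }

    /** Feeds one SampleRTT (samples from retransmitted segments should be skipped by the caller). */
    void addSample(double sampleRtt) {
        // Deviation is taken against the estimate from before this sample (sketch assumption).
        devRtt = (1 - BETA) * devRtt + BETA * Math.abs(sampleRtt - estimatedRtt);
        estimatedRtt = (1 - ALPHA) * estimatedRtt + ALPHA * sampleRtt;
    }

    double estimatedRtt() {
        return estimatedRtt;
    }

    double devRtt() {
        return devRtt;
    }

    /** EstimatedRTT plus a safety margin of four deviations. */
    double timeoutInterval() {
        return estimatedRtt + 4 * devRtt;
    }

    public static void main(String[] args) {
        RttEstimator est = new RttEstimator(100.0, 0.0); // milliseconds
        est.addSample(180.0);
        System.out.println("EstimatedRTT = " + est.estimatedRtt());
        System.out.println("DevRTT = " + est.devRtt());
        System.out.println("TimeoutInterval = " + est.timeoutInterval());
    }
}
```

A single outlier of 180 ms moves the estimate only from 100 ms to 110 ms, but it widens the safety margin considerably, which is the intended behaviour when RTT becomes more variable.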




TCP reliable data transfer
- TCP creates rdt service on top of IP's unreliable service
- Pipelined segments
- Cumulative acks
- TCP uses single retransmission timer
- Retransmissions are triggered by:
  - timeout events
  - duplicate acks
- Initially consider simplified TCP sender:
  - ignore duplicate acks
  - ignore flow control, congestion control


TCP sender events:
data rcvd from app:
- Create segment with seq #
- seq # is byte-stream number of first data byte in segment
- start timer if not already running (think of timer as for oldest unacked segment)
- expiration interval: TimeOutInterval
timeout:
- retransmit segment that caused timeout
- restart timer
Ack rcvd:
- If acknowledges previously unacked segments
  - update what is known to be acked
  - start timer if there are outstanding segments


TCP: retransmission scenarios
[Figure: two Host A / Host B timelines.
Left, lost ACK scenario: the ACK is lost (X), the Seq=92 timeout expires, SendBase = 100.
Right, premature timeout: the Seq=92 timeout expires before the ACKs arrive, SendBase moves from 100 to 120.]

TCP retransmission scenarios (more)
[Figure: Host A / Host B timeline, cumulative ACK scenario: one ACK is lost (X) but a later cumulative ACK arrives before the timeout, SendBase = 120]


Fast Retransmit
- Time-out period often relatively long:
  - long delay before resending lost packet
- Detect lost segments via duplicate ACKs.
  - Sender often sends many segments back-to-back
  - If segment is lost, there will likely be many duplicate ACKs.
- If sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:
  - fast retransmit: resend segment before timer expires




TCP Flow Control
- receive side of TCP connection has a receive buffer
- app process may be slow at reading from buffer
- flow control: sender won't overflow receiver's buffer by transmitting too much, too fast
- speed-matching service: matching the send rate to the receiving app's drain rate

TCP Flow control: how it works
(Suppose TCP receiver discards out-of-order segments)
- spare room in buffer
  = RcvWindow
  = RcvBuffer-[LastByteRcvd - LastByteRead]
- Rcvr advertises spare room by including value of RcvWindow in segments
- Sender limits unACKed data to RcvWindow
  - guarantees receive buffer doesn't overflow
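The bookkeeping on both sides is simple arithmetic. The sketch below uses the variable names from the slides; the class name `ReceiveWindow`, the helper `sendable` and the example numbers are made up for illustration.

```java
/** Flow-control arithmetic from the slide; helper names are illustrative. */
final class ReceiveWindow {
    /** Receiver side: spare room = RcvBuffer - [LastByteRcvd - LastByteRead]. */
    static long rcvWindow(long rcvBuffer, long lastByteRcvd, long lastByteRead) {
        return rcvBuffer - (lastByteRcvd - lastByteRead);
    }

    /** Sender side: how many more bytes may be put in flight without exceeding RcvWindow. */
    static long sendable(long rcvWindow, long lastByteSent, long lastByteAcked) {
        long unacked = lastByteSent - lastByteAcked;
        return Math.max(0, rcvWindow - unacked);
    }

    public static void main(String[] args) {
        // 4096-byte buffer, 3000 bytes received so far, application has read 1000 of them.
        long window = rcvWindow(4096, 3000, 1000);
        System.out.println("RcvWindow = " + window);
        // Sender already has 1000 unACKed bytes outstanding.
        System.out.println("sender may send " + sendable(window, 5000, 4000) + " more bytes");
    }
}
```

When the application stops reading, LastByteRead stays put, RcvWindow shrinks toward zero, and the sender is throttled without any segment being dropped.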




TCP Connection Management
Recall: TCP sender, receiver establish connection before exchanging data segments
- initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
- server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();

Three way handshake:
Step 1: client host sends TCP SYN segment to server
- specifies initial seq #
- no data
Step 2: server host receives SYN, replies with SYNACK segment
- server allocates buffers
- specifies server initial seq. #
Step 3: client receives SYNACK, replies with ACK segment, which may contain data


TCP Connection Management (cont.)

Closing a connection:
client closes socket:
clientSocket.close();

Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: client / server timeline with the events close (client), close (server), timed wait, closed]


TCP Connection Management (cont.)

Step 3: client receives FIN, replies with ACK.
- Enters timed wait - will respond with ACK to received FINs
Step 4: server, receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: client / server timeline with the events closing (client), closing (server), timed wait, closed]


TCP Connection Management (cont)

[Figures: TCP server lifecycle; TCP client lifecycle]




Principles of Congestion Control

Congestion:
- informally: too many sources sending too much data too fast for network to handle
- different from flow control!
- manifestations:
  - lost packets (buffer overflow at routers)
  - long delays (queueing in router buffers)
- a top-10 problem!


Causes/costs of congestion: scenario 1
- two senders, two receivers
- one router, infinite buffers
- no retransmission
- large delays when congested
- maximum achievable throughput
[Figure: Host A and Host B send $\lambda_{in}$ (original data) through a router with unlimited shared output link buffers; $\lambda_{out}$ is the output rate]

Causes/costs of congestion: scenario 2

- one router, finite buffers
- sender retransmission of lost packet
[Figure: Host A and Host B send through a router with finite shared output link buffers. $\lambda_{in}$: original data; $\lambda'_{in}$: original data, plus retransmitted data; $\lambda_{out}$: output rate]


Causes/costs of congestion: scenario 2
- always: $\lambda_{in} = \lambda_{out}$ (goodput)
- perfect retransmission only when loss: $\lambda'_{in} > \lambda_{out}$
- retransmission of delayed (not lost) packet makes $\lambda'_{in}$ larger (than perfect case) for same $\lambda_{out}$
[Figure: three plots (a, b, c) of $\lambda_{out}$ against the input rate, with both axes running to R/2; the levels R/3 and R/4 are marked]
costs of congestion:
- more work (retrans) for given goodput
- unneeded retransmissions: link carries multiple copies of pkt

Causes/costs of congestion: scenario 3
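- four senders
- multihop paths
- timeout/retransmit
Q: what happens as $\lambda_{in}$ and $\lambda'_{in}$ increase?
[Figure: Host A and Host B among senders on multihop paths through routers with finite shared output link buffers. $\lambda_{in}$: original data; $\lambda'_{in}$: original data, plus retransmitted data; $\lambda_{out}$: output rate]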


Causes/costs of congestion: scenario 3
[Figure: Host A, Host B and $\lambda_{out}$]
Another cost of congestion:
- when packet dropped, any upstream transmission capacity used for that packet was wasted!


Approaches towards congestion control
Two broad approaches towards congestion control:
End-end congestion control:
- no explicit feedback from network
- congestion inferred from end-system observed loss, delay
- approach taken by TCP
Network-assisted congestion control:
- routers provide feedback to end systems
  - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
  - explicit rate sender should send at


Case study: ATM ABR congestion control

ABR: available bit rate:
- elastic service
- if sender's path underloaded:
  - sender should use available bandwidth
- if sender's path congested:
  - sender throttled to minimum guaranteed rate
RM (resource management) cells:
- sent by sender, interspersed with data cells
- bits in RM cell set by switches (network-assisted)
  - NI bit: no increase in rate (mild congestion)
  - CI bit: congestion indication
- RM cells returned to sender by receiver, with bits intact


Case study: ATM ABR congestion control

- two-byte ER (explicit rate) field in RM cell
  - congested switch may lower ER value in cell
  - sender's send rate thus maximum supportable rate on path
- EFCI bit in data cells: set to 1 in congested switch
  - if data cell preceding RM cell has EFCI set, receiver sets CI bit in returned RM cell




TCP congestion control: additive increase,
multiplicative decrease
- Approach: increase transmission rate (window size), probing for usable bandwidth, until loss occurs
  - additive increase: increase CongWin by 1 MSS every RTT until loss detected
  - multiplicative decrease: cut CongWin in half after loss
[Figure: congestion window size (8, 16, 24 Kbytes) against time. Saw tooth behavior: probing for bandwidth]


TCP Congestion Control: details
- sender limits transmission:
  $LastByteSent - LastByteAcked \le CongWin$
- Roughly,
  $rate = \frac{CongWin}{RTT}$ Bytes/sec
- CongWin is dynamic, function of perceived network congestion
How does sender perceive congestion?
- loss event = timeout or 3 duplicate acks
- TCP sender reduces rate (CongWin) after loss event
three mechanisms:
- AIMD
- slow start
- conservative after timeout events

TCP Slow Start
- When connection begins, CongWin = 1 MSS
  - Example: MSS = 500 bytes & RTT = 200 msec
  - initial rate = 20 kbps
- available bandwidth may be >> MSS/RTT
  - desirable to quickly ramp up to respectable rate
- When connection begins, increase rate exponentially fast until first loss event


TCP Slow Start (more)
- When connection begins, increase rate exponentially until first loss event:
  - double CongWin every RTT
  - done by incrementing CongWin for every ACK received
- Summary: initial rate is slow but ramps up exponentially fast
[Figure: Host A / Host B timeline over successive RTTs]


Refinement
Q: When should the exponential increase switch to linear?
A: When CongWin gets to 1/2 of its value before timeout.

Implementation:
- Variable Threshold
- At loss event, Threshold is set to 1/2 of CongWin just before loss event


Refinement: inferring loss
- After 3 dup ACKs:
  - CongWin is cut in half
  - window then grows linearly
- But after timeout event:
  - CongWin instead set to 1 MSS;
  - window then grows exponentially to a threshold, then grows linearly
Philosophy:
- 3 dup ACKs indicates network capable of delivering some segments
- timeout indicates a more alarming congestion scenario


Summary: TCP Congestion Control

- When CongWin is below Threshold, sender in slow-start phase, window grows exponentially.
- When CongWin is above Threshold, sender is in congestion-avoidance phase, window grows linearly.
- When a triple duplicate ACK occurs, Threshold set to CongWin/2 and CongWin set to Threshold.
- When timeout occurs, Threshold set to CongWin/2 and CongWin is set to 1 MSS.


TCP sender congestion control
| State | Event | TCP Sender Action | Commentary |
| Slow Start (SS) | ACK receipt for previously unacked data | CongWin = CongWin + MSS, If (CongWin > Threshold) set state to Congestion Avoidance | Resulting in a doubling of CongWin every RTT |
| Congestion Avoidance (CA) | ACK receipt for previously unacked data | CongWin = CongWin+MSS * (MSS/CongWin) | Additive increase, resulting in increase of CongWin by 1 MSS every RTT |
| SS or CA | Loss event detected by triple duplicate ACK | Threshold = CongWin/2, CongWin = Threshold, Set state to Congestion Avoidance | Fast recovery, implementing multiplicative decrease. CongWin will not drop below 1 MSS. |
| SS or CA | Timeout | Threshold = CongWin/2, CongWin = 1 MSS, Set state to Slow Start | Enter slow start |
| SS or CA | Duplicate ACK | Increment duplicate ACK count for segment being acked | CongWin and Threshold not changed |
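The table above translates almost line for line into a small state machine. This is a sketch of the simplified sender described in these slides, not of any production TCP stack; the class name, the use of `double` for byte counts, and the caller-supplied initial Threshold are assumptions. Counting of duplicate ACKs is left to the caller, who invokes `onTripleDuplicateAck()` once the third duplicate arrives.

```java
/** Simplified TCP congestion control from the table above; a hypothetical sketch. */
final class TcpCongestionControl {
    enum State { SLOW_START, CONGESTION_AVOIDANCE }

    private final double mss;       // bytes
    private double congWin;         // bytes
    private double threshold;       // bytes
    private State state = State.SLOW_START;

    TcpCongestionControl(double mss, double initialThreshold) {
        this.mss = mss;
        this.congWin = mss;         // connection begins with CongWin = 1 MSS
        this.threshold = initialThreshold;
    }

    /** ACK receipt for previously unacked data. */
    void onNewAck() {
        if (state == State.SLOW_START) {
            congWin += mss;                       // doubles CongWin every RTT
            if (congWin > threshold) {
                state = State.CONGESTION_AVOIDANCE;
            }
        } else {
            congWin += mss * (mss / congWin);     // about 1 MSS per RTT
        }
    }

    /** Loss event detected by triple duplicate ACK. */
    void onTripleDuplicateAck() {
        threshold = congWin / 2;
        congWin = Math.max(threshold, mss);       // CongWin will not drop below 1 MSS
        state = State.CONGESTION_AVOIDANCE;
    }

    /** Timeout: the more alarming case, back to slow start. */
    void onTimeout() {
        threshold = congWin / 2;
        congWin = mss;
        state = State.SLOW_START;
    }

    double congWin() { return congWin; }
    double threshold() { return threshold; }
    State state() { return state; }

    public static void main(String[] args) {
        TcpCongestionControl cc = new TcpCongestionControl(1000, 4000);
        for (int i = 0; i < 5; i++) {
            cc.onNewAck();
            System.out.println(cc.state() + " CongWin=" + cc.congWin());
        }
        cc.onTripleDuplicateAck();
        System.out.println(cc.state() + " CongWin=" + cc.congWin() + " Threshold=" + cc.threshold());
        cc.onTimeout();
        System.out.println(cc.state() + " CongWin=" + cc.congWin() + " Threshold=" + cc.threshold());
    }
}
```

Keeping the two loss reactions as separate methods mirrors the philosophy stated earlier: duplicate ACKs show the network is still delivering segments, so the window is only halved, while a timeout resets it to 1 MSS.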


TCP throughput
- What's the average throughput of TCP as a function of window size and RTT?
  - Ignore slow start
- Let W be the window size when loss occurs.
- When window is W, throughput is W/RTT
- Just after loss, window drops to W/2, throughput to W/(2 RTT).
- Average throughput: 0.75 W/RTT
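The factor 0.75 follows from averaging over one sawtooth cycle. A short derivation, assuming the window grows linearly from $W/2$ back to $W$ between loss events and that RTT stays constant:

```latex
\begin{align*}
\text{average window} &= \frac{1}{2}\left(\frac{W}{2} + W\right) = \frac{3}{4}\,W, \\
\text{average throughput} &= \frac{\text{average window}}{RTT} = \frac{0.75\,W}{RTT} .
\end{align*}
```

Because the growth is linear, the time average of the window equals the midpoint of its smallest and largest values.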


TCP Futures: TCP over long, fat pipes

- Example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
- Requires window size W = 83,333 in-flight segments
- Throughput in terms of loss rate:

$$\frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

- -> $L = 2 \cdot 10^{-10}$ Wow
- New versions of TCP for high-speed
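Both numbers in this example can be checked by solving the two relations for the unknown. The class and method names are made up for this sketch; the inputs are the values from the slide.

```java
/** Arithmetic behind the long, fat pipe example; helper names are illustrative. */
final class LongFatPipe {
    /** Window (in segments) needed so that W * MSS / RTT equals the target throughput. */
    static double requiredWindowSegments(double throughputBps, double rttSeconds, int mssBytes) {
        return throughputBps * rttSeconds / (mssBytes * 8.0);
    }

    /** Solves throughput = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L. */
    static double requiredLossRate(double throughputBps, double rttSeconds, int mssBytes) {
        double root = 1.22 * mssBytes * 8.0 / (rttSeconds * throughputBps);
        return root * root;
    }

    public static void main(String[] args) {
        double w = requiredWindowSegments(10e9, 0.100, 1500);
        double l = requiredLossRate(10e9, 0.100, 1500);
        System.out.printf("W = %.0f segments%n", w);
        System.out.printf("L = %.2e%n", l);
    }
}
```

A loss rate on the order of $10^{-10}$ means at most about one lost segment in every five billion, which is why the slide reacts with "Wow".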


TCP Fairness
Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K
[Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]


Why is TCP fair?
Two competing sessions:
- Additive increase gives slope of 1, as throughput increases
- multiplicative decrease decreases throughput proportionally
[Figure: Connection 2 throughput plotted against Connection 1 throughput, both axes running to R. The trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2", moving toward the equal bandwidth share line]


Fairness (more)
Fairness and UDP
- Multimedia apps often do not use TCP
  - do not want rate throttled by congestion control
- Instead use UDP:
  - pump audio/video at constant rate, tolerate packet loss
- Research area: TCP friendly
Fairness and parallel TCP connections
- nothing prevents app from opening parallel connections between 2 hosts.
- Web browsers do this
- Example: link of rate R supporting 9 connections;
  - new app asks for 1 TCP, gets rate R/10
  - new app asks for 11 TCPs, gets R/2 !
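The two rates in the example follow from dividing the link equally among all connections, which assumes every connection gets the same per-connection share:

```latex
\begin{align*}
\text{1 new connection:} \quad & \frac{1}{9+1}\,R = \frac{R}{10}, \\
\text{11 new connections:} \quad & \frac{11}{9+11}\,R = \frac{11}{20}\,R \approx \frac{R}{2} .
\end{align*}
```

Fairness is per connection, not per application, so an application's share grows with the number of connections it opens.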


Chapter 3: Summary
- principles behind transport layer services:
  - multiplexing, demultiplexing
  - reliable data transfer
  - flow control
  - congestion control
- instantiation and implementation in the Internet
  - UDP
  - TCP
Next:
- leaving the network edge (application, transport layers)
- into the network core
