Computer Networks
Transport Layer
[Figure: layered protocol stacks (application, transport, network, link, physical) in the end hosts and in the network between them. The application layer is controlled by application software through the API; the layers below it are controlled by the OS and provide the Data Transport Services, with hop-to-hop link protocols at the bottom.]
Computer Network
Data Transport Services are provided through the links and routers (switches) of the network to the application process. The main services are:
Source-Destination (end to end): breaking down the messages in the source, and assembling the message in the destination.
Routing: finding the path.
Flow control: it makes it possible for a slow-running process to communicate with a fast-running process.
Error detection and correction.
processes = students,
port number = student's ID number,
application messages = letters in envelopes,
hosts = universities,
IP address = university's address,
transport protocol = post office of universities,
network-layer protocol = postal service of state
Chapter 3 outline
3. Introduction
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
    segment structure
    reliable data transfer
    flow control
    connection management
3.6 Principles of congestion control
3.7 TCP congestion control
3.8 Multimedia Stream & TCP
3.9 TCP fairness
3.10 TCP modeling
3.11 http modeling
end systems
sending side: breaks app messages into segments, passes to network layer
receiving side: reassembles segments into messages, passes to application layer
more than one transport protocol available to applications.
[Figure: protocol stack: application, transport, network, data link, physical]
Internet: TCP and UDP
bandwidth guarantees.
[Figure: three hosts with application processes P1 to P4, each process attached to a socket]
Jamali@iust.ac.ir ITransport Layer 3-13
Multiplexing/Demultiplexing
Demultiplexing at Receiving Host
delivering received segments
to correct socket
How Demultiplexing Works
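[Figure: client/server socket interaction. With a single serverSocket, the server writes the reply to serverSocket specifying client host address and port number; the client reads the reply from clientSocket and closes clientSocket. With a connection socket, the server writes the reply to connectionSocket and closes connectionSocket; the client reads the reply from clientSocket and closes clientSocket.]
[Figure: demultiplexing between a client (IP: A) and a server (IP: C). Each segment carries a source port (SP) and a destination port (DP), for example SP: 2549, DP: 1324 in one direction and SP: 1324, DP: 2549 in the other.]
[Figure: demultiplexing at server C (processes P4, P5, P6). Segments from client A (SP: 1807 and SP: 9157) and from client B (SP: 5775) all carry DP: 2053.]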
In contrast with UDP, two arriving TCP segments with different source IP
address or source port number will be directed to two different sockets.
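The difference between the two lookups can be sketched as follows. This is a minimal illustration, not how any particular OS implements it: the socket tables, the host names "A", "B", "C", and the socket labels are hypothetical, and the port numbers are borrowed from the figure.

```python
from typing import Dict, Optional, Tuple

# Hypothetical socket tables (illustration only).
# UDP sockets are keyed by the destination (IP, port) pair.
udp_sockets: Dict[Tuple[str, int], str] = {("C", 2053): "udp_socket"}
# TCP sockets are keyed by (source IP, source port, dest IP, dest port).
tcp_sockets: Dict[Tuple[str, int, str, int], str] = {
    ("A", 9157, "C", 2053): "connection_socket_1",
    ("B", 5775, "C", 2053): "connection_socket_2",
}


def demux_udp(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> Optional[str]:
    """UDP: only the destination address and port select the socket."""
    return udp_sockets.get((dst_ip, dst_port))


def demux_tcp(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> Optional[str]:
    """TCP: the full 4-tuple selects the socket."""
    return tcp_sockets.get((src_ip, src_port, dst_ip, dst_port))
```

With these tables, segments from A and B addressed to port 2053 reach the same UDP socket but two different TCP sockets.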
Threaded Server
[Figure: threaded server at host C. Same segments as in the previous figure: SP: 1807 and SP: 9157 from client A, SP: 5775 from client B, all with DP: 2053.]
Sender:
treat segment contents as sequence of 16-bit integers.
checksum: 1's complement of the 1's complement sum of segment contents.
sender puts checksum value into UDP checksum field.
Receiver:
compute checksum of received segment
check if computed checksum equals checksum field value:
NO - error detected
YES - no error detected.
Length:
                 0 0 0 0 1 1 1 1 0 0 0 0 1 1 1 1
                 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
Sum:             1 1 0 0 1 0 1 0 1 1 0 0 1 0 1 0
Checksum (1's complement of the sum):
                 0 0 1 1 0 1 0 1 0 0 1 1 0 1 0 1
Note
When adding numbers, a carryout from the most
significant bit needs to be added to the result
Example: add two 16-bit integers
                 1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound     1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
Sum:             1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
Checksum:        0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
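The wraparound rule can be checked with a short sketch. The function names are illustrative, and the two input words are the 16-bit integers 1110011001100110 and 1101010101010101.

```python
def ones_complement_sum16(words):
    """Add 16-bit words, folding any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # wraparound of the carry
    return total


def internet_checksum(words):
    """Checksum = 1's complement (bitwise NOT) of the 1's complement sum."""
    return ~ones_complement_sum16(words) & 0xFFFF


a = 0b1110011001100110
b = 0b1101010101010101
total = ones_complement_sum16([a, b])
checksum = internet_checksum([a, b])
print(format(total, "016b"))     # 1011101110111100
print(format(checksum, "016b"))  # 0100010001000011
```

A receiver that adds all words including the checksum gets sixteen 1 bits when no error is detected.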
[Figure: application layer above the send side and the receive side]
sender, state "Wait for call from above":
    rdt_send(data)
    sndpkt = make_pkt(data)
    udt_send(sndpkt)
receiver, state "Wait for call from below":
    rdt_rcv(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
On the rdt_send(data) event, the sender creates a packet containing the data via the action make_pkt(data) and sends the packet via the action udt_send(sndpkt).
On the rdt_rcv(rcvpkt) event, the receiver removes the data from the packet via the action extract(rcvpkt, data) and passes the data up to the upper layer via the action deliver_data(data).
Sender (two states)
state "Wait for call from above":
    rdt_send(data)
    sndpkt = make_pkt(data, checksum)
    udt_send(sndpkt)
state "Wait for ACK or NAK":
    NACK packet is received:
        udt_send(sndpkt)
    ACK packet is received, rdt_rcv(rcvpkt) && isACK(rcvpkt):
        back to "Wait for call from above"
Receiver (one state)
state "Wait for call from below":
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt)
    extract(rcvpkt,data)
    deliver_data(data)
    udt_send(ACK)
state "Wait for call 0 from above" (Seq. no=0):
    rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
state "Wait for ACK or NAK 0":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
        go to "Wait for call 1 from above"
state "Wait for call 1 from above" (Seq. no=1):
    rdt_send(data)
    sndpkt = make_pkt(1, data, checksum)
    udt_send(sndpkt)
state "Wait for ACK or NAK 1":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) )
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt)
        go to "Wait for call 0 from above"
The condition rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) means:
the packet received by SENDER is corrupted
or the packet received by SENDER is a NAK packet.
Dr. Analoui, 11/22/2002
rdt2.1: Receiver handles defected ACK/NAKs.
Sender:
seq # added to pkt
two seq. #’s (0,1) will suffice. Why?
must check if received ACK/NAK corrupted.
twice as many states.
state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:
must check if received packet is duplicate.
state indicates whether 0 or 1 is expected pkt seq #.
note: receiver can not know if its last ACK/NAK received OK at sender.
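The receiver side of this idea can be sketched in a few lines. This is a simplified model, not the slide's FSM verbatim: the class name and the `corrupt` flag are hypothetical stand-ins for the checksum test.

```python
class AlternatingBitReceiver:
    """Sketch of a receiver that uses a 1-bit sequence number to spot duplicates."""

    def __init__(self):
        self.expected_seq = 0   # state: which seq # (0 or 1) is expected next
        self.delivered = []     # data passed up to the application

    def on_packet(self, seq, data, corrupt=False):
        if corrupt:
            return "NAK"        # ask the sender to retransmit
        if seq == self.expected_seq:
            self.delivered.append(data)
            self.expected_seq ^= 1   # flip 0 <-> 1
        # A packet with the other seq # is a duplicate: it is not delivered
        # again, but it is still ACKed because the earlier ACK may have been lost.
        return "ACK"


rcv = AlternatingBitReceiver()
replies = [
    rcv.on_packet(0, "a"),
    rcv.on_packet(0, "a"),                # duplicate after a garbled ACK
    rcv.on_packet(1, "b", corrupt=True),  # corrupted packet
    rcv.on_packet(1, "b"),
]
```

Two sequence numbers are enough here because the sender never has more than one unacknowledged packet outstanding.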
sender FSM fragment
state "Wait for call 0 from above":
    rdt_send(data)
    sndpkt = make_pkt(0, data, checksum)
    udt_send(sndpkt)
state "Wait for ACK 0":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) )
        udt_send(sndpkt)
    rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0)
        …
receiver FSM fragment
state "Wait for 0 from below":
    rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || has_seq1(rcvpkt) )
        udt_send(sndpkt)
$$U_{\text{sender}} = \frac{L/R}{RTT + L/R} = \frac{0.008\ \text{ms}}{15\ \text{ms} + 15\ \text{ms} + 0.008\ \text{ms}} = \frac{0.008}{30.008} = 0.00027$$
[Figure: sender/receiver timing diagram. First packet bit transmitted, t = 0; last packet bit transmitted, t = L / R.]
Increase utilization by a factor of 3!
$$U_{\text{sender}} = \frac{3 \times L/R}{RTT + L/R} = \frac{0.024}{30.008} = 0.0008$$
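The two utilization figures can be reproduced with a small sketch. The function name and the `packets_in_flight` parameter are illustrative; the values L/R = 0.008 ms and RTT = 30 ms come from the formulas.

```python
def sender_utilization(l_over_r_ms, rtt_ms, packets_in_flight=1):
    """Fraction of time the sender is busy sending, per the formula U = n*(L/R) / (RTT + L/R)."""
    return packets_in_flight * l_over_r_ms / (rtt_ms + l_over_r_ms)


u_stop_and_wait = sender_utilization(0.008, 30.0)
u_pipelined = sender_utilization(0.008, 30.0, packets_in_flight=3)
print(round(u_stop_and_wait, 5), round(u_pipelined, 4))
```

Sending three packets back to back before waiting triples the utilization, which is still far below 1.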
[Figure: sender window of size N]
Receiver (one state, "Wait")
start:
    expectedseqnum=1
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && hasseqnum(rcvpkt,expectedseqnum):
    extract(rcvpkt,data)
    deliver_data(data)
    sndpkt = make_pkt(expectedseqnum,ACK,chksum)
    udt_send(sndpkt)
    expectedseqnum++
default:
    udt_send(sndpkt)
[Figure: Go-Back-N timeline]
sender
data from above:
    if next available seq # in window, send pkt
timeout(n):
    resend pkt n, restart timer
ACK(n) in [sendbase,sendbase+N-1]:
    mark pkt n as received
    if n smallest unACKed pkt, advance window base to next unACKed seq #
receiver
pkt n in [rcvbase, rcvbase+N-1]:
    send ACK(n)
    out-of-order: buffer
    in-order: deliver (also deliver buffered, in-order pkts), advance window to next not-yet-received pkt
pkt n in [rcvbase-N,rcvbase-1]:
    ACK(n)
otherwise:
    ignore
Example:
seq #’s: 0, 1, 2, 3
window size=3
receiver sees no
difference in two
scenarios!
incorrectly passes
duplicate data as new
in (a)
Q: what relationship
between seq # size
and window size?
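One way to explore the question is to compare the sequence numbers the sender may still retransmit with the ones the receiver already treats as new. The sketch below assumes the worst case in which a full window was received but every ACK was lost; the function names are illustrative. The standard answer, which the sketch agrees with, is that the window size must be at most half the sequence-number space.

```python
def ambiguous_seq_numbers(seq_space, window):
    """Seq #s that could be either an old retransmission or new data.

    Worst case: the receiver accepted packets 0..window-1 and advanced its
    window, but all ACKs were lost, so the sender still holds 0..window-1.
    """
    sender_window = {n % seq_space for n in range(0, window)}
    receiver_window = {n % seq_space for n in range(window, 2 * window)}
    return sorted(sender_window & receiver_window)


def window_is_safe(seq_space, window):
    """True when old and new packets can never share a sequence number."""
    return 2 * window <= seq_space


print(ambiguous_seq_numbers(4, 3))  # [0, 1]
print(ambiguous_seq_numbers(4, 2))  # []
```

With seq #'s 0, 1, 2, 3 and window size 3, numbers 0 and 1 are ambiguous, which is the situation in the example.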
[Figure: on each side a socket sits above TCP; segments travel from the TCP send buffer to the TCP receive buffer]
[Figure: TCP segment structure]
Header Length [in units of 4 Bytes]
Receive window: # bytes rcvr willing to accept
TCP checksum (as in UDP)
data
[Figure: (a) 6000 Byte data passed from the application to the transport layer (TCP) by rdt_send(data); the bytes are numbered 1, 2, …, 1001, …, 2001, …, 3001, …, 4001, …, 5001, …
(b) Data is broken into six 1000-Byte segments, each with a TCP header: Seq=1, Seq=1001, Seq=2001, Seq=3001, Seq=4001, Seq=5001.]
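The byte-stream numbering in the figure can be sketched as follows. The function name is illustrative, and numbering the first byte as 1 follows the figure rather than any real initial sequence number.

```python
def segment_stream(data, mss, first_seq=1):
    """Split a byte stream into (seq, payload) pairs.

    The seq # of each segment is the stream number of its first byte.
    """
    return [(first_seq + offset, data[offset:offset + mss])
            for offset in range(0, len(data), mss)]


segments = segment_stream(bytes(6000), mss=1000)
seq_numbers = [seq for seq, _ in segments]
# A cumulative ACK for everything carries the number of the next expected byte.
final_ack = segments[-1][0] + len(segments[-1][1])
print(seq_numbers, final_ack)
```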
TCP seq#s and Ack#s
Seq. #’s:
    byte stream “number” of first byte in segment’s data
ACKs:
    seq # of next byte expected from other side
    cumulative ACK
Q: how receiver handles out-of-order segments
A: TCP spec doesn’t say, it is up to implementor.
[Figure: simple telnet scenario between Host A and Host B over time. User types ‘C’; host ACKs receipt of ‘C’, echoes back ‘C’; host ACKs receipt of echoed ‘C’.]
unnecessary retransmissions
estimated RTT “smoother”
average several recent
[Figure: SampleRTT and EstimatedRTT, RTT (milliseconds, 100 to 350) versus time (seconds)]
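A smoothed estimate of this kind is commonly computed as an exponentially weighted moving average. The sketch below assumes the usual update EstimatedRTT = (1 - alpha) * EstimatedRTT + alpha * SampleRTT with alpha = 0.125; that formula and constant are the standard TCP choice, not something recoverable from the text here, and the sample values are made up.

```python
def smooth_rtt(samples, alpha=0.125):
    """Return the running EstimatedRTT for a list of SampleRTT values.

    Assumption: the first sample initializes the estimate.
    """
    estimate = samples[0]
    history = [estimate]
    for sample in samples[1:]:
        estimate = (1 - alpha) * estimate + alpha * sample
        history.append(estimate)
    return history


estimates = smooth_rtt([100.0, 200.0, 200.0])
print(estimates)
```

A single spike in SampleRTT moves the estimate only by one eighth of the difference, which is what makes the curve smoother.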
loop (forever) {
switch(event)
[Figure: Host A sends data segments 1 to 26 to Host B, starting at Seq=1; Host B returns ACKs (Ack=4001, Ack=12001, Ack=18001, Ack=20001, Ack=25001) together with window advertisements in bytes (Win values between 0 and 10000).]
[Figure: three retransmission scenarios between Host A and Host B over time, showing SendBase (1, then 4001), the send windows Win1, Win2, Win3, ACK intervals T<500ms, and retransmission of Seq=1 on RTO.]
[Figure: ACK generation between Host A and Host B over time. Annotations: all data up to seq=3001 are ACKed (Ack=3001); Seq=8001 is expected; when a later segment arrives (Seq=9001, Seq=12001) a gap is detected and duplicate ACKs (Acks=8001) are sent; the sender keeps transmitting based on Win & SendBase; when the received segment (Seq=8001) starts at the lower end of the gap, an ACK is sent immediately (Acks=9001); retransmission of Seq=1001 on RTO.]
spare rate.
App process may be slow at reading from buffer.
[Figure: receive buffer. Data from IP enters RcvBuffer; part of it holds data in buffer, the rest is RcvWindow.]
seq. #s no data
[Figure: three-way handshake between client and server over time: connection request, connection accepted, connection ack. Win = RcvWindow, ISN = Initial Sequence Number.]
Closing a connection:
Either of the two processes can close the connection.
Example: client closes socket.
[Figure: close exchange between client and server; TCP client lifecycle and TCP server lifecycle]
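[Figure: TCP connection-management state diagram. Transitions are labeled event/action.]
LISTEN -> SYN_RCVD: SYN/SYN + ACK (Step 2 of the 3-way handshake)
LISTEN -> SYN_SENT: Send/SYN
SYN_RCVD -> LISTEN: RST/-
SYN_SENT -> SYN_RCVD: SYN/SYN + ACK
SYN_RCVD -> ESTABLISHED: ACK/-
SYN_SENT -> ESTABLISHED: SYN + ACK/ACK (Step 3 of the 3-way handshake)
SYN_RCVD -> FIN_WAIT_1: Close/FIN
ESTABLISHED -> FIN_WAIT_1: Close/FIN (Active close)
ESTABLISHED -> CLOSE_WAIT: FIN/ACK (Passive close)
FIN_WAIT_1 -> FIN_WAIT_2: ACK/-
FIN_WAIT_1 -> CLOSING: FIN/ACK
FIN_WAIT_2 -> TIME_WAIT: FIN/ACK
CLOSING -> TIME_WAIT: ACK/-
CLOSE_WAIT -> LAST_ACK: Close/FIN
LAST_ACK -> CLOSED: ACK/-
TIME_WAIT -> CLOSED: Timeout/- (timed wait: 30, 60, or 120 sec)
CLOSED (back to start)
[Figure: the same state diagram with transitions written out in words, for example "receive SYN, send ACK", "applic. close, send FIN", and "applic. close or timeout, delete TCB".]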
[Figure: sources and sinks sharing a bottleneck link; link rates shown: 100Mbps, 10Mbps, 1.5Mbps.]
knee: point after which
    throughput increases very slowly
    delay increases fast
cliff: point after which
    throughput starts to decrease very fast to zero (congestion collapse)
    delay approaches infinity
Note (in an M/M/1 queue): delay = 1/(1 - utilization)
[Figure: throughput versus offered load and delay versus offered load, with the knee, the cliff, packet loss, and congestion collapse marked.]
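The M/M/1 note can be turned into a small sketch that shows how quickly delay grows near full load. The `service_time` parameter is an added assumption that scales the normalized formula; with its default of 1 the function is exactly delay = 1/(1 - utilization).

```python
def mm1_delay(utilization, service_time=1.0):
    """Mean delay in an M/M/1 queue, in units of the mean service time."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1.0 - utilization)


for load in (0.5, 0.9, 0.99):
    print(load, mm1_delay(load))
```

Going from 50% to 99% utilization raises the delay from 2 to about 100 service times, which matches the sharp rise to the right of the knee.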
Keeps network operating at full capacity, but minimizes packet loss
maximize “goodput”
Congestion avoidance goal: stay left of knee
Right of cliff: congestion collapse
[Figure: throughput versus offered load, with the congestion collapse region marked]
[Figure: hosts sending over a shared link of rate R [B/s] through a router with unlimited shared output link buffers. (a) Per-connection throughput [Byte/s] versus λin (offered load), with R/2 marked on both axes. (b) Per-connection delay (ms) versus λin (offered load), with R/2 marked.]
[Figure: two plots of λout (throughput) [Byte/s] versus λ’in (offered load) [Byte/s], with R/2, R/3, and R/4 marked on the throughput axis and R/2 on the load axis.]
[Figure: Hosts A, B, C, and D connected through routers R1, R2, R3, and R4; plot of λout [Byte/s] versus λ’in (offered load), with R/2 marked.]
A-C and B-D traffic compete at router R2 for the buffer. The A-C traffic that successfully gets through R2 becomes smaller and smaller as the offered load from B-D gets larger and larger.
[Figure: throughput versus offered load, controlled and uncontrolled]
$$\text{Power} = \frac{\text{Load}}{\text{Delay}}$$
[Figure: average packet delay and power versus load]
»Congestion collapse
»TCP’s congestion avoidance (Jacobson)
[Figure: two panels, each plotting load and goodput (Kbytes/sec) versus time]
[Figure: TCP source sending through an inbound link, a router queue, and an outbound link to the sink; the sink returns ACKs and congestion notification.]
3ACKs after eighth transmission (Tahoe behavior):
    New CongWin = 1
    New threshold = 12/2
    Window then grows exponentially
3ACKs after eighth transmission (Reno behavior):
    New CongWin = 12/2
    window then grows linearly
• 3ACKs indicates network capable of delivering some segments
• timeout before 3ACKs is “more alarming”
[Figure: CongWin versus number of transmission (1 to 15), with the 3ACKs event marked]
[Figure: on timeouts the rate is halved]
Slow start in operation: exponential “slow start” until it reaches half of previous cwnd.
[Figure: window versus time in RTT steps: 1, 2, 4, …, then W, W+1]
[Figure: cwnd (1 to 8) versus time with a packet loss: slow start, congestion avoidance, timeout, then slow start again.]
[Figure: cwnd versus time with a retransmission triggered by duplicate Acks: slow start, fast recovery, congestion avoidance.]
[Figure: CongWin of a long-lived TCP connection versus time, with axis marks at 8, 16, and 24 Kbytes. Annotations: queue empty, queue full, limitation of network / receive window, optimal (average) window size.]
TCP-Reno Slow Start (more)
TCP rate:
$$\text{Rate}(t) = \frac{MSS \times CongWin(t)}{RTT}\ \text{[Bytes/sec]}$$
When connection begins, CongWin = 1 MSS
Example: MSS = 500B (4000b), RTT = 200 msec
initial rate = 4000/200 = 20 kbps
When connection begins, increase rate exponentially until first loss event:
    double CongWin every RTT
    done by incrementing CongWin for every ACK received.
[Figure: Host A sends to Host B over time; the rate doubles each RTT: Rate=20kbps, Rate=40kbps, Rate=80kbps.]
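The doubling in the example can be reproduced with a short sketch. It assumes CongWin is counted in segments and that no loss occurs; the function name is illustrative.

```python
def slow_start_rates(mss_bits, rtt_seconds, rounds):
    """Sending rate (bits/sec) in each RTT while CongWin doubles every RTT."""
    cong_win = 1  # in segments; the connection begins with 1 MSS
    rates = []
    for _ in range(rounds):
        rates.append(mss_bits * cong_win / rtt_seconds)
        cong_win *= 2  # one increment per ACK received doubles the window per RTT
    return rates


rates_kbps = [rate / 1000 for rate in slow_start_rates(4000, 0.2, 3)]
print(rates_kbps)
```

For MSS = 4000 bits and RTT = 200 msec this gives approximately 20, 40, and 80 kbps.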
$$\text{Throughput} = \frac{1.22 \times MSS}{RTT\sqrt{e}}$$
For: 1Gbps throughput, RTT=100ms and MSS=1500 Byte
$$e = 2.14 \times 10^{-8}$$
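The loss rate quoted above follows from solving the throughput formula for e. The sketch below does that inversion; the function name is illustrative, and MSS is converted to bits so that the units match a throughput in bits per second.

```python
def required_loss_rate(throughput_bps, rtt_seconds, mss_bytes):
    """Solve throughput = 1.22 * MSS / (RTT * sqrt(e)) for the loss rate e."""
    mss_bits = mss_bytes * 8
    return (1.22 * mss_bits / (rtt_seconds * throughput_bps)) ** 2


e = required_loss_rate(throughput_bps=1e9, rtt_seconds=0.1, mss_bytes=1500)
print(f"{e:.3g}")
```

The result is about 2.14e-8, that is, roughly one loss in every 47 million segments.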
Upon receiving a duplicated ACK or an ACK for a retransmitted packet, Vegas checks the time interval since the previous packet of the just ACKed packet was sent.
If the time interval is greater than the timeout value, then the packet is retransmitted without waiting for triple duplicated ACKs.
Only decreasing CWND if the retransmitted packet was sent after the last decrease.
Example:
Received ACK for packet 10 (packets 11 and 12 are in transit)
Send packet 13 (which is lost)
(One RTT)
Received ACK for packet 11
Send packet 14
Received ACK for packet 12
Send packet 15 (which is also lost)
Should have gotten ACK for packet 13
Received dup ACK for packet 12 (due to packet 14)
Vegas checks timestamp of packet 13 and decides to transmit it
(Reno would need to wait for the 3rd duplicate ACK)
(One RTT)
Received ACK for packets 13 and 14
Since it is 1st or 2nd ACK after retransmission, Vegas checks timestamp of packet 15 and decides to transmit it
(Reno would need to wait for 3 new duplicate ACKs)
$$B = \frac{MSS}{RTT\sqrt{\frac{2be}{3}} + RTO \times 3\sqrt{\frac{3be}{8}} \times e \times (1 + 32e^{2})}\ \text{[Byte/sec]}$$
[Figure: TCP and non-TCP flows sharing the Internet]
[Figure: four sources (source1 to source4) and four sinks (sink1 to sink4) sharing a 1.5Mbps bottleneck; access link delays range from 5ms to 200ms.]
[Figure: plot with Connection 1 throughput on one axis, bounded by R]
Q: How long does it take to receive an object from a Web server after sending a request?
Ignoring congestion, delay is influenced by:
    TCP connection establishment
    data transmission delay
    slow start
Notation, assumptions:
    Assume one link between client and server of rate R
    S: Segment Size (bits)
    O: object (file) size (bits)
    no retransmissions (no loss, no corruption)
Window size:
    First assume: fixed congestion window, W segments
    Then dynamic window, modeling slow start
First case: ACK for first segment in window returns before window’s worth of data sent
    Delay = 2RTT + O/R
Second case: wait for ACK after sending window’s worth of data
    WS/R < RTT + S/R
    K = O/WS
[Figure: client/server timing diagrams for the two cases, each starting with a 2RTT interval]
$$\text{Delay} = 2RTT + \frac{O}{R} + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$
$$P = \min\{Q, K - 1\}$$
$\frac{S}{R} + RTT$ = the time from when the server begins to transmit the 1st segment until the time when the server receives an acknowledgment for the segment.
$2^{k-1}\frac{S}{R}$ = total transmission time for the kth window
[Figure: kth window including m segments, m = 2^{k-1}: 1st segment is sent, mth segment is sent, server receives 1st ack]
$$\text{delay} = \frac{O}{R} + 2RTT + \sum_{p=1}^{P} \text{idleTime}_p$$
$$= \frac{O}{R} + 2RTT + \sum_{k=1}^{P}\left[\frac{S}{R} + RTT - 2^{k-1}\frac{S}{R}\right]$$
$$= \frac{O}{R} + 2RTT + P\left[RTT + \frac{S}{R}\right] - (2^{P} - 1)\frac{S}{R}$$
$$Q = \max\left\{k : \frac{S}{R} + RTT - 2^{k-1}\frac{S}{R} \ge 0\right\}$$
$$= \max\left\{k : 2^{k-1} \le 1 + \frac{RTT}{S/R}\right\}$$
$$= \max\left\{k : k \le \log_2\left(1 + \frac{RTT}{S/R}\right) + 1\right\}$$
$$= \left\lfloor \log_2\left(1 + \frac{RTT}{S/R}\right) \right\rfloor + 1$$
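The closed form can be evaluated with a short sketch. Two things here are assumptions rather than part of the derivation: K is computed as the number of slow-start windows that cover the object, K = ceil(log2(O/S + 1)), and the link rate R = 1 Mbps is a hypothetical value picked for the example.

```python
import math


def slow_start_delay(O, S, R, RTT):
    """Latency of one object under slow start, per the closed form above.

    O, S in bits; R in bits/sec; RTT in seconds. Returns (K, Q, P, delay).
    """
    K = math.ceil(math.log2(O / S + 1))               # assumed: windows covering the object
    Q = math.floor(math.log2(1 + RTT / (S / R))) + 1  # times the server would idle, object permitting
    P = min(Q, K - 1)                                 # times the server actually idles
    delay = 2 * RTT + O / R + P * (RTT + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, delay


# S = 536 bytes, O = 100 kB, RTT = 100 msec, hypothetical R = 1 Mbps
K, Q, P, delay = slow_start_delay(O=100_000 * 8, S=536 * 8, R=1_000_000, RTT=0.1)
print(K, Q, P, round(delay, 3))
```

Without slow start the same transfer would take 2RTT + O/R = 1.0 s, so the extra delay in this setting comes entirely from the P idle periods.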
Assumptions for the three cases:
Case   S      RTT        O       K
1      536B   100msec    100kB   8
2      536B   100msec    5kB     4
3      536B   1000msec   5kB     4
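$$B(e) \cong \min\left(\frac{W_{max}}{RTT},\ \frac{1}{RTT\sqrt{\frac{2be}{3}} + RTO \times \min\left(1, 3\sqrt{\frac{3be}{8}}\right) \times e \times (1 + 32e^{2})}\right)\ \text{[segments/sec]}$$
For $e \le 0.05$:
$$B(e) \cong \frac{1}{RTT}\sqrt{\frac{3}{2be}} + o\left(\frac{1}{\sqrt{e}}\right) \approx \frac{1.22}{RTT\sqrt{e}} \quad (b = 1)$$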
The notation f = o(g) means that (g > 0 and) (f/g) -> 0. The
notation o(g) indicates that the term is of smaller order of
magnitude than g.
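The full expression and its small-loss approximation can be compared numerically. This is a sketch under stated assumptions: the function names are illustrative, and the parameter values RTT = 0.470, RTO = 3.2, Wmax = 12 are read off the plot legend, taken to be in seconds and segments.

```python
import math


def tcp_throughput(e, rtt, rto, w_max, b=1):
    """B(e) in segments/sec for a loss rate 0 < e <= 1, per the full formula."""
    denominator = (rtt * math.sqrt(2 * b * e / 3)
                   + rto * min(1.0, 3 * math.sqrt(3 * b * e / 8)) * e * (1 + 32 * e ** 2))
    return min(w_max / rtt, 1.0 / denominator)


def tcp_throughput_simple(e, rtt):
    """Small-loss approximation 1.22 / (RTT * sqrt(e)), valid for b = 1."""
    return 1.22 / (rtt * math.sqrt(e))


for loss in (0.001, 0.01, 0.1):
    print(loss, tcp_throughput(loss, 0.470, 3.2, 12), tcp_throughput_simple(loss, 0.470))
```

At very low loss the window limit Wmax/RTT dominates; at high loss the RTO term pulls the rate well below the simple approximation.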
[Figure: TCP throughput (send rate) and network throughput, in segments per 100 seconds (log scale, 1 to 10000), versus loss rate (0.001 to 1); RTT = 0.470, RTO = 3.2, Wmax = 12.]
Non-persistent HTTP: M+1 TCP connections in series
$$\text{delay} = (M + 1) \times 2RTT + (M + 1)\frac{O}{MSS \times B(e)}$$
Persistent HTTP with pipelining:
$$\text{delay} = 3RTT + (M + 1)\frac{O}{MSS \times B(e)}$$
Non-persistent HTTP with X parallel connections:
$$\text{delay} = \left(\frac{M}{X} + 1\right) \times 2RTT + (M + 1)\frac{O}{MSS \times B(e)}$$
For low bandwidth, connection & response time dominated by transmission time.
Persistent connections only give minor improvement over parallel connections.
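The three delay formulas can be compared side by side with a small sketch. The units are an assumption (O and MSS in bytes, B in segments/sec), and the input values in the example, including B = 20, are hypothetical.

```python
def http_delays(M, O, rtt, mss, B, X):
    """Delays of the three HTTP variants for M+1 objects of O bytes each."""
    transfer = (M + 1) * O / (mss * B)  # time to move all objects at rate MSS * B
    return {
        "non_persistent": (M + 1) * 2 * rtt + transfer,
        "persistent_pipelined": 3 * rtt + transfer,
        "non_persistent_parallel": (M / X + 1) * 2 * rtt + transfer,
    }


delays = http_delays(M=10, O=5000, rtt=0.1, mss=536, B=20, X=5)
for name, value in delays.items():
    print(name, round(value, 2))
```

The transfer term is the same in all three variants, so at low bandwidth, where it is large, the choice of variant changes the total only slightly.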
TCP timeline
1974: TCP described by Vint Cerf and Bob Kahn in IEEE Trans Comm
1975: Three-way handshake, Raymond Tomlinson, in SIGCOMM 75
1982: TCP & IP, RFC 793 & 791
1983: BSD Unix 4.2 supports TCP/IP
1984: Nagle’s algorithm to reduce overhead of small packets; predicts congestion collapse
1986: Congestion collapse observed
1987: Karn’s algorithm to better estimate round-trip time
1988: Van Jacobson’s algorithms: congestion avoidance and congestion control (most implemented in 4.3BSD Tahoe)
1990: 4.3BSD Reno: fast retransmit, delayed ACK’s
1994: T/TCP, rfc1644 (Braden): Transaction TCP
1996: SACK TCP (Floyd et al): Selective Acknowledgement
TCP REFERENCES:
[TCP:1] "Transmission Control Protocol," J. Postel, RFC-793,
September,1981.
[TCP:2] "Transmission Control Protocol," MIL-STD-1778, US
Department of Defense, August 1984.
This specification as amended by RFC-964 is intended to
describe the same protocol as RFC-793 [TCP:1]. If there is a
conflict, RFC-793 takes precedence, and the present document
is authoritative over both.
[TCP:3] "Some Problems with the Specification of the Military
Standard Transmission Control Protocol," D. Sidhu and T.
Blumer, RFC-964, November 1985.
[TCP:4] "The TCP Maximum Segment Size and Related Topics,"
J. Postel, RFC-879, November 1983.
[TCP:5] "Window and Acknowledgment Strategy in TCP," D.
Clark, RFC-813, July 1982.
[TCP:6] "Round Trip Time Estimation," P. Karn & C. Partridge,
ACM SIGCOMM-87, August 1987.
[TCP:7] "Congestion Avoidance and Control," V. Jacobson, ACM
SIGCOMM 88, August 1988.
D M Chiu And R Jain, "Analysis of Increase and Decrease Algorithms, Part III
of Congestion Avoidance in Computer Networks with a Connectionless Network
Layer", DEC Technical Report 509, August 1987.
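[Figure: User 1, User 2, …, User n send loads x1, x2, …, xn into the network; the sum Σxi is compared with Xgoal (Σxi > Xgoal) and a feedback signal y is returned to the users.]
Simple, yet powerful model
Explicit binary signal of congestion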
Possible Choices
$$x_i(t+1) = \begin{cases} a_I + b_I x_i(t) & \text{increase} \\ a_D + b_D x_i(t) & \text{decrease} \end{cases}$$
Multiplicative increase, additive decrease
aI=0, bI>1, aD<0, bD=1
Additive increase, additive decrease
aI>0, bI=1, aD<0, bD=1
Multiplicative increase, multiplicative decrease
aI=0, bI>1, aD=0, 0<bD<1
Additive increase, multiplicative decrease
aI>0, bI=1, aD=0, 0<bD<1
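The four choices can be explored with a tiny two-user simulation of the linear control rule. This is a sketch, not the analysis from the cited report: the capacity (playing the role of Xgoal), the starting allocations, and the parameter values are hypothetical, and the congestion signal is simply "total load exceeds capacity".

```python
def simulate(x1, x2, capacity, a_i, b_i, a_d, b_d, steps):
    """Apply x(t+1) = a + b*x(t) to both users, driven by a binary congestion signal."""
    for _ in range(steps):
        if x1 + x2 > capacity:   # congestion signalled: both users decrease
            x1, x2 = a_d + b_d * x1, a_d + b_d * x2
        else:                    # no congestion: both users increase
            x1, x2 = a_i + b_i * x1, a_i + b_i * x2
    return x1, x2


# Additive increase, multiplicative decrease
aimd = simulate(1.0, 9.0, 10.0, a_i=1.0, b_i=1.0, a_d=0.0, b_d=0.5, steps=200)
# Additive increase, additive decrease
aiad = simulate(1.0, 9.0, 10.0, a_i=1.0, b_i=1.0, a_d=-1.0, b_d=1.0, steps=200)
print(aimd, aiad)
```

In this run the gap between the two users shrinks toward zero under additive increase with multiplicative decrease, because every decrease halves it, while under additive increase with additive decrease the gap stays at its initial value of 8.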
Which one?
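[Figure: phase plot in the (User 1: x1, User 2: x2) plane with the efficiency line; a decrease moves (x1h,x2h) to (x1h+aD,x2h+aD). The fixed point at $x_{1h} = x_{2h} = \frac{b_I a_D}{1 - b_I}$ is unstable!]
[Figure: phase plot in the (User 1: x1, User 2: x2) plane with the efficiency line; a decrease moves (x1h,x2h) to (x1h+aD,x2h+aD); does not converge to fairness.]
[Figure: two phase plots in the (User 1: x1, User 2: x2) plane with the efficiency line; a decrease moves (x1h,x2h) to (bDx1h,bDx2h).]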
TCP: Slow Start
Slow Start Example
The congestion window size grows very rapidly
[Figure: segments exchanged with cwnd = 1, cwnd = 2, cwnd = 4]
ssthresh = 8
[Figure: Cwnd (in segments, 0 to 14) versus roundtrip times, with ssthresh marked; cwnd = 1, cwnd = 4, cwnd = 8, cwnd = 9, cwnd = 10.]
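The cwnd values in the example can be reproduced with a per-RTT sketch. It is a simplification with no loss events: cwnd doubles each RTT below ssthresh and grows by one segment per RTT from ssthresh on; the function name is illustrative.

```python
def cwnd_per_rtt(ssthresh, rounds, cwnd=1):
    """cwnd (in segments) at the start of each round trip, ignoring losses."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(2 * cwnd, ssthresh)  # slow start: double per RTT
        else:
            cwnd += 1                       # congestion avoidance: +1 per RTT
    return trace


trace = cwnd_per_rtt(ssthresh=8, rounds=6)
print(trace)  # [1, 2, 4, 8, 9, 10]
```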
The big picture
[Figure: cwnd versus time: slow start, congestion avoidance, timeout]
Fast Retransmit
Resend a segment after 3 duplicate ACKs
[Figure: segment exchange with cwnd = 4 in which 3 duplicate ACKs trigger the retransmission]
Lesson: avoid RTOs at all costs!
Fast Retransmit and Fast Recovery
[Figure: cwnd versus time: slow start, then congestion avoidance]
Retransmit after 3 duplicated acks
    prevent expensive timeouts
No need to slow start again
At steady state, cwnd oscillates around the optimal window size.
Engineering vs Science in CC
Tput ~ 1/sqrt(d)
Fairness:
Throughput depends on RTT
High speeds:
to reach 10 Gbps, packet losses must be as rare as one every
90 minutes!
Short flows:
How to set initial cwnd properly