
User Datagram Protocol (UDP)

Thin wrapper around IP services
Service Model
  Unreliable unordered datagram service
  Addresses multiplexing of multiple connections
Multiplexing
  16-bit port numbers (some are well-known)
Checksum
  Validate header
  Optional in IPv4
  Mandatory in IPv6

UDP Header Format

  0                 16                 31
  | Source Port      | Destination Port |
  | UDP Length       | UDP Checksum     |

Length includes 8-byte header and data
Checksum
  Uses IP checksum algorithm
  Computed on header, data and pseudo header:

  0                 16                 31
  | Source IP Address                   |
  | Destination IP Address              |
  | 0    | 17 (UDP)  | UDP Length       |
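
The checksum computation described above can be sketched as follows. This is a minimal illustration, assuming IPv4 addresses and the usual 16-bit one's-complement sum; the function names `internet_checksum` and `udp_checksum` are hypothetical and not taken from the slides.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement of the one's-complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_checksum(src_ip: str, dst_ip: str, src_port: int,
                 dst_port: int, payload: bytes) -> int:
    """Checksum over pseudo header, UDP header (checksum field 0) and data."""
    length = 8 + len(payload)  # UDP length includes the 8-byte header
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))  # 17 = UDP
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = internet_checksum(pseudo + header + payload)
    return csum or 0xFFFF  # a computed 0 is sent as all ones in UDP
```

A receiver can run the same sum over the pseudo header and the received datagram, checksum field included; an undamaged datagram yields 0.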

Transmission Control Protocol (TCP)

Guaranteed delivery:
  Messages delivered in the order they were sent
  Messages delivered at most once
No limit on message size
Synchronization between sender and receiver
Multiple connections per host
Flow control

TCP

Connection oriented
  Explicit setup and teardown required
Byte stream abstraction
  No boundaries in data
  App writes bytes, TCP sends segments, App receives bytes
Full duplex
  Data flows in both directions simultaneously
  Point-to-point connection
Implements congestion control
  Flow control: receiver controls sender rate
  Congestion control: network indirectly controls sender rate

TCP vs. Direct Link

Explicit connection setup required
RTT varies, depending on destination and network condition
  => adaptive approach to retransmission
Packets
  Delayed
  Reordered
  Late

TCP vs. Direct Link

Peer capabilities vary
  Minimum link speed on route
  Buffering capacity at destination
  => adaptive approach to window sizes
Network capacity varies
  Other traffic competes for most links
  Requires global congestion control strategy

TCP: Connection Stages

1. Connection setup
  3-way handshake
2. Data transport: Sender writes data, and TCP
  Breaks data into segments
  Sends segments in IP packets
  Retransmits, reorders and removes duplicates as necessary
  Delivers data to receiver
3. Teardown
  4 step exchange

TCP Segment Header Format

  0                 16                 31
  | Source Port      | Destination Port |
  | Sequence Number                     |
  | ACK Sequence Number                 |
  | Header Length | 0 | Flags | Advertised Window |
  | TCP Checksum     | Urgent Pointer   |
  | Options                             |

TCP Segment Header

16-bit source and destination ports
32-bit send and ACK sequence numbers
4-bit header length (unit = 32 bits)
  Minimum 5 (20 bytes)
  Used as offset to first data byte
6 1-bit flags
  URG: *Segment contains urgent data
  ACK: ACK sequence number is valid
  PSH: *Do not delay delivery of data
  RST: Reset connection (reject or abnormal termination)
  SYN: Synchronize segment for setup
  FIN: Final segment for teardown

Meta header (pseudo header)

  0                 16                 31
  | Source IP Address                   |
  | Destination IP Address              |
  | 0    | 6 (TCP)   | TCP Segment Length |
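
To make the field layout above concrete, here is a small parsing sketch for the fixed 20-byte part of the header. The names `TcpHeader`, `FLAG_BITS` and `parse_tcp_header` are hypothetical; the bit positions assume the standard layout (4-bit header length, 6 reserved bits, 6 flag bits in one 16-bit word).

```python
import struct
from typing import Dict, NamedTuple


class TcpHeader(NamedTuple):
    src_port: int
    dst_port: int
    seq: int
    ack: int
    header_len_bytes: int
    flags: Dict[str, bool]
    window: int
    checksum: int
    urgent_ptr: int


# Flag bits, least significant first, within the 16-bit offset/flags word.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> TcpHeader:
    """Decode the fixed part of a TCP header (options are not parsed)."""
    if len(segment) < 20:
        raise ValueError("segment shorter than minimum TCP header")
    (src, dst, seq, ack, off_flags,
     window, checksum, urgent) = struct.unpack("!HHIIHHHH", segment[:20])
    header_words = off_flags >> 12  # header length in 32-bit words
    if header_words < 5:
        raise ValueError("header length below minimum of 5 words")
    flags = {name: bool(off_flags & bit) for name, bit in FLAG_BITS.items()}
    return TcpHeader(src, dst, seq, ack, header_words * 4,
                     flags, window, checksum, urgent)
```

The value `header_len_bytes` is the offset of the first data byte, so `segment[header_len_bytes:]` would be the payload.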

TCP Segment Header (cont.)

16-bit advertised window
  Space remaining in receive window
16-bit checksum
  Uses IP checksum algorithm
  Computed on header, data and pseudo header
16-bit urgent data pointer
  If URG = 1
  Index of last byte of urgent data in segment

TCP Options

Negotiate maximum segment size (MSS)
  Each host suggests a value
  Minimum of two values is chosen
  Prevents IP fragmentation over first and last hops
Packet timestamp
  Allows RTT calculation for retransmitted packets
  Extends sequence number space for identification of stray packets
Negotiate advertised window scaling factor
  Allows larger windows: 64 KB too small for routes with large bandwidth-delay products

TCP: Data Transport

Data broken into segments
  Limited by maximum segment size (MSS)
  Negotiable during connection setup
  Typically set to MTU of directly connected network minus size of TCP and IP headers
Three events cause a segment to be sent
  At least MSS bytes of data ready to be sent
  Explicit PUSH operation by application
  Periodic timeout

TCP Byte Stream

[Figure: the sending application process writes bytes into TCP's send buffer; TCP transmits them as TCP segments; the receiving TCP places them in its recv buffer, from which the application process reads bytes.]

TCP SNs and ACKs

Seq. #s:
  Count bytes, not packets. First SN to avoid insertion
ACKs:
  SN of next byte expected from other side
  cumulative ACK
GBN: TCP spec doesn't say what to do with premature packets - up to implementation

[Figure: simple telnet scenario, time flowing down. Host A: user types 'C'. Host B: host ACKs receipt of 'C', echoes back 'C'. Host A: host ACKs receipt of echoed 'C'.]

TCP ACK rules

Event                                     TCP Receiver action
----------------------------------------  ----------------------------------------
in-order segment arrival, no gaps,        delayed ACK. Wait up to 500 ms for next
everything else already ACKed             segment. If no next segment, send ACK

in-order segment arrival, no gaps,        immediately send single cumulative ACK
one delayed ACK pending

out-of-order segment arrival,             send duplicate ACK, indicating seq. #
higher-than-expected seq. #,              of next expected byte
gap detected

arrival of segment that partially         immediate ACK if segment starts at
or completely fills gap                   lower end of gap

TCP: retransmission scenarios

[Figure: two Host A / Host B timelines. One shows the lost ACK scenario (loss, timeout). The other shows a premature timeout with cumulative ACKs (Seq=92 timeout, Seq=100 timeout).]

TCP: Retransmission and Timeouts

[Figure: Host A sends Data1 and Data2, Host B returns ACKs. Labels: Round-trip time (RTT), Estimated RTT, Guard Band, Retransmission TimeOut (RTO).]

TCP uses an adaptive retransmission timeout value
  Dynamic network (congestion, changes in routing) => RTT cannot be static

TCP: Retransmission and Timeouts

RTO value is important:
  too big: wait too long to retransmit a packet
  too small: unnecessarily retransmit packets.
Original algorithm for picking RTO:
  1. EstimatedRTT = α × EstimatedRTT + (1 - α) × SampleRTT
  2. RTO = 2 × EstimatedRTT
Characteristics of the original algorithm:
  Std. dev. implicitly assumed to be bounded by RTT.
  But if utilization = 75%, could have factor 16 between typical (mean ± 2 stdev) short and long RTTs

TCP: Retransmission and Timeouts (Jacobson/Karels alg.)

Newer algorithm estimates std. dev. of RTT:
  1. Diff = SampleRTT - EstimatedRTT
  2. EstimatedRTT = EstimatedRTT + δ × Diff   (for some 0 < δ < 1)
  3. Deviation = Deviation + δ × (|Diff| - Deviation)
  4. RTO = μ × EstimatedRTT + φ × Deviation, with μ ≈ 1 and φ ≈ 4
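
The Jacobson/Karels update can be sketched in a few lines. This is only an illustration under stated assumptions: the class name `RtoEstimator` is hypothetical, the gain of 1/8 is an assumed value (the slides only require a gain between 0 and 1), and the estimator is seeded with the first sample and a zero deviation.

```python
class RtoEstimator:
    """Tracks a smoothed RTT and its mean deviation to derive an RTO."""

    def __init__(self, first_sample: float, delta: float = 0.125,
                 mu: float = 1.0, phi: float = 4.0) -> None:
        self.estimated_rtt = first_sample  # assumption: seed with first sample
        self.deviation = 0.0               # assumption: start with no deviation
        self.delta = delta                 # gain, 0 < delta < 1 (assumed 1/8)
        self.mu = mu                       # weight of the RTT estimate
        self.phi = phi                     # weight of the deviation

    @property
    def rto(self) -> float:
        return self.mu * self.estimated_rtt + self.phi * self.deviation

    def update(self, sample_rtt: float) -> float:
        """Fold one RTT sample into the estimate and return the new RTO."""
        diff = sample_rtt - self.estimated_rtt
        self.estimated_rtt += self.delta * diff
        self.deviation += self.delta * (abs(diff) - self.deviation)
        return self.rto
```

For example, starting from a 100 ms estimate, a single 200 ms sample moves the estimate to 112.5 ms and the deviation to 12.5 ms, giving an RTO of 162.5 ms.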

TCP: Retransmission and Timeouts (Karn's Alg.)

[Figure: two Host A / Host B timelines, each with a retransmission; in both, pairing the ACK with the wrong transmission gives a wrong RTT sample.]

Problem: How to estimate RTT of retransmitted packets?
Solution: Don't! Also: double RTO.

TCP Sliding Window Protocol: Sender Side

LastByteAcked <= LastByteSent
LastByteSent <= LastByteWritten
Buffer bytes between LastByteAcked and LastByteWritten

[Figure: send buffer. Labels: Maximum buffer size, Advertised window, First unacknowledged byte, Last byte sent, Data available but outside window.]

TCP Sliding Window Protocol: Receiver Side

LastByteRead < NextByteExpected
NextByteExpected <= LastByteRcvd + 1
Buffer bytes between NextByteRead and LastByteRcvd

[Figure: receive buffer. Labels: Maximum buffer size, Advertised window, Buffered out-of-order data, Next byte expected (ACK value), Next byte to be read by application.]

TCP Flow Control

Receiving side
  Receive buffer size = MaxRcvBuffer
  LastByteRcvd - LastByteRead <= MaxRcvBuffer
  AdvertisedWindow = MaxRcvBuffer - (NextByteExpected - NextByteRead)
  Shrinks as data arrives and grows as the application consumes data
Sending side
  Send buffer size = MaxSendBuffer
  LastByteSent - LastByteAcked <= AdvertisedWindow
  EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
  EffectiveWindow > 0 to send data
  LastByteWritten - LastByteAcked <= MaxSendBuffer
  Block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer (y = number of bytes the application wants to write)
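
The window bookkeeping above reduces to a few subtractions. The sketch below mirrors the slide's variable names in snake_case; the function names are hypothetical, and clamping the effective window at zero is an added assumption for the case where the advertised window has shrunk below the data already in flight.

```python
def advertised_window(max_rcv_buffer: int, next_byte_expected: int,
                      next_byte_read: int) -> int:
    """Free space the receiver can still offer to the sender."""
    return max_rcv_buffer - (next_byte_expected - next_byte_read)


def effective_window(advertised: int, last_byte_sent: int,
                     last_byte_acked: int) -> int:
    """Bytes the sender may still put on the wire (never negative)."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, advertised - in_flight)


def must_block_writer(last_byte_written: int, last_byte_acked: int,
                      y: int, max_send_buffer: int) -> bool:
    """True if writing y more bytes would overflow the send buffer."""
    return (last_byte_written - last_byte_acked) + y > max_send_buffer
```

With a 4096-byte receive buffer holding 2000 unread bytes, the receiver advertises 2096 bytes; a sender with 1000 bytes in flight may then send 1096 more.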

TCP Flow Control

Problem: Slow receiver application
  Advertised window goes to 0
  Sender cannot send more data
  Receiver may not spontaneously generate update or update may be lost
  Sender gets stuck
Solution
  Sender periodically sends 1-byte segment, ignoring advertised window of 0
  Eventually window opens
  Sender learns of opening from next ACK of 1-byte segment

TCP Flow Control

Problem: Application delivers tiny pieces of data to TCP
  Example: telnet in character mode
  Each piece sent as a segment, returned as ACK
  Very inefficient
Solution
  Delay transmission to accumulate more data
  Nagle's algorithm
    Send first piece of data
    Accumulate data until first piece ACKed
    Send accumulated data and restart accumulation
    Not ideal for some traffic (e.g. mouse motion)

TCP Flow Control

Problem: Slow application reads data in tiny pieces
  Receiver advertises tiny window
  Sender fills tiny window
  Known as silly window syndrome
Solution
  Advertise window opening only when MSS or 1/2 of buffer is available
  Sender delays sending until window is MSS or 1/2 of receiver's buffer (estimated)

TCP Bit Allocation Limitations

Sequence numbers vs. packet lifetime
  Assumed that IP packets live less than 60 seconds
  Can we send 2^32 bytes in 60 seconds?
  approx. 573 Mbps: Less than an STS-12 line
Advertised window vs. delay-bandwidth
  Only 16 bits for advertised window
  coast-to-coast RTT = 100 ms
  Adequate for only 5.24 Mbps!
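
Both limits quoted above follow from short arithmetic, sketched here. The function names are hypothetical, and the sketch assumes 1 Mbps = 10^6 bits per second and a maximum window of 2^16 - 1 bytes with no window scaling.

```python
def required_rate_bps(lifetime_s: float, seq_bits: int = 32) -> float:
    """Rate at which the whole sequence space is consumed within lifetime_s."""
    return (2 ** seq_bits) * 8 / lifetime_s


def window_limited_rate_bps(rtt_s: float, window_bits: int = 16) -> float:
    """Best throughput when at most one full window is sent per RTT."""
    return (2 ** window_bits - 1) * 8 / rtt_s


if __name__ == "__main__":
    print(f"wrap within 60 s needs {required_rate_bps(60) / 1e6:.0f} Mbps")
    print(f"16-bit window at 100 ms RTT: "
          f"{window_limited_rate_bps(0.1) / 1e6:.2f} Mbps")
```

The first figure is why a 622 Mbps STS-12 line can wrap the sequence space inside the assumed packet lifetime; the second is why the window scaling option exists.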

TCP Sequence Numbers

32-bit

Bandwidth   Speed       Time until wrap around
---------   ---------   ----------------------
T1          1.5 Mbps    6.4 hours
Ethernet    10 Mbps     57 minutes
T3          45 Mbps     13 minutes
FDDI        100 Mbps    6 minutes
STS-3       155 Mbps    4 minutes
STS-12      622 Mbps    55 seconds
STS-24      1.2 Gbps    28 seconds

TCP Connection Establishment

3-Way Handshake
  Exchange initial sequence numbers (j,k)
Message Types
  Synchronize (SYN)
  Acknowledge (ACK): cumulative!
Passive Open
  Server listens for connection from client
Active Open
  Client initiates connection to server

[Figure: client and server timelines, time flows down; the server starts in listen.]

TCP: Connection Termination

Message Types
  Finished (FIN)
  Acknowledge (ACK)
Active Close
  Sends no more data
Passive close
  Accepts no more data
Connection can be half closed (one-way)

[Figure: client and server timelines, time flows down.]

TCP State Descriptions

State         Description
-----------   ----------------------------------------------------
CLOSED        Disconnected
LISTEN        Waiting for incoming connection
SYN_RCVD      Connection request received
SYN_SENT      Connection request sent
ESTABLISHED   Connection ready for data transport
CLOSE_WAIT    Connection closed by peer
LAST_ACK      Connection closed by peer, closed locally, await ACK
FIN_WAIT_1    Connection closed locally
FIN_WAIT_2    Connection closed locally and ACK'd
CLOSING       Connection closed by both sides simultaneously
TIME_WAIT     Wait for network to discard related packets

TCP State Transition Diagram

[Figure: TCP state transition diagram over the states CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT and LAST_ACK. Edges are labeled event/action: Passive open, Active open/SYN, Send/SYN, SYN/SYN + ACK, SYN + ACK/ACK, ACK, Close, Close/FIN, FIN/ACK, FIN + ACK/ACK, Timeout.]

Questions

State transitions
  Describe the path taken by a server under normal conditions
  Describe the path taken by a client under normal conditions
  Describe the path taken assuming the client closes the connection first
TIME_WAIT state
  What purpose does this state serve?
  Prove that at least one side of a connection enters this state
  Explain how both sides might enter this state
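
One way to explore the questions above is to encode the diagram as a lookup table and replay event sequences. The sketch below is an illustration, not a protocol implementation: the event names (`passive_open`, `recv_syn`, and so on) are hypothetical labels for the diagram's edge events, and each entry maps (state, event) to (next state, segment sent).

```python
# (state, event) -> (next_state, segment_sent or None)
TRANSITIONS = {
    ("CLOSED", "passive_open"): ("LISTEN", None),
    ("CLOSED", "active_open"): ("SYN_SENT", "SYN"),
    ("LISTEN", "recv_syn"): ("SYN_RCVD", "SYN+ACK"),
    ("LISTEN", "send"): ("SYN_SENT", "SYN"),
    ("LISTEN", "close"): ("CLOSED", None),
    ("SYN_RCVD", "recv_ack"): ("ESTABLISHED", None),
    ("SYN_RCVD", "close"): ("FIN_WAIT_1", "FIN"),
    ("SYN_SENT", "recv_syn_ack"): ("ESTABLISHED", "ACK"),
    ("SYN_SENT", "recv_syn"): ("SYN_RCVD", "SYN+ACK"),
    ("SYN_SENT", "close"): ("CLOSED", None),
    ("ESTABLISHED", "close"): ("FIN_WAIT_1", "FIN"),
    ("ESTABLISHED", "recv_fin"): ("CLOSE_WAIT", "ACK"),
    ("CLOSE_WAIT", "close"): ("LAST_ACK", "FIN"),
    ("LAST_ACK", "recv_ack"): ("CLOSED", None),
    ("FIN_WAIT_1", "recv_ack"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_1", "recv_fin"): ("CLOSING", "ACK"),
    ("FIN_WAIT_1", "recv_fin_ack"): ("TIME_WAIT", "ACK"),
    ("FIN_WAIT_2", "recv_fin"): ("TIME_WAIT", "ACK"),
    ("CLOSING", "recv_ack"): ("TIME_WAIT", None),
    ("TIME_WAIT", "timeout"): ("CLOSED", None),
}


def run(events, state="CLOSED"):
    """Replay events from the given state and return the visited states."""
    path = [state]
    for event in events:
        if (state, event) not in TRANSITIONS:
            raise ValueError(f"no transition from {state} on {event}")
        state, _sent = TRANSITIONS[(state, event)]
        path.append(state)
    return path
```

For instance, `run(["active_open", "recv_syn_ack", "close", "recv_ack", "recv_fin", "timeout"])` traces a client that opens and then closes first; it passes through TIME_WAIT before returning to CLOSED.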


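Congestion Control & Avoidance

Congestion

[Figure: hosts H1, H2 and H3 connected through router R1. Arrival processes A1(t) on a 10 Mb/s link and A2(t) on a 100 Mb/s link share a 1.5 Mb/s outgoing link with departure process D(t). Accompanying plots show cumulative bytes over time for A1(t), A2(t), A1(t)+A2(t) and D(t), with the queue occupancy X(t) as the gap between arrivals and departures.]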

TCP Congestion Control

Basic idea: control rate by window size.
  Average rate ≈ (window)/RTT
  Crude
Add notion of congestion window
  Effective window is minimum of
    Advertised window (flow control), and
    Congestion window (congestion control)

Ideal steady state: self-clocking

TCP Congestion Control

Start up phase: quickly find the correct rate
  Slow Start
Steady state: gently try to increase rate, back off quickly when congestion detected
  Congestion Avoidance
Phases are determined by the value of variable ssthresh

Slow Start

Objective: determine available capacity
Idea:
  Begin with cwnd = 1 packet
  Increment cwnd by 1 packet for each ACK
  Meaning: double every RTT!

[Figure: source and destination timelines.]

Slow Start Implementation

When starting or restarting after timeout, cwnd = 1.
On each ack for new segment, cwnd += segSize.

Slow Start Trace

[Figure: sequence number against time.] Each dot is a 512 B packet sent, y-axis is sequence number, x-axis is time, straight line is 20 KBps of available bandwidth.
without ss: ~7 KBps, with ss: ~19 KBps

Host Solutions

Q: How does the source determine whether or not the network is congested?
A: Timeout signals packet loss
  Packet loss is rarely due to transmission error (on wired networks)
  Lost packet implies congestion!

Congestion is good?

Empty buffers => low delay, low utilization
Full buffers => good utilization, but high delay, potential loss
Real question: how much congestion is too much?

Congestion Avoidance

Control vs. avoidance
  Control: minimize impact of congestion when it occurs
  Avoidance: avoid producing congestion
In terms of operating point limits

[Figure: idealized power curve, power against load, marking the optimal load and the avoidance and control operating points.]

How to get to steady-state?

If overusing link => packet loss => decrease rate
Why increase at all?
  Must check all the time in order not to leave dead bandwidth; only indication is dropped packets
Slow-start: multiplicative increase
Timeout: decrease to 1!
Symmetric multiplicative increase and decrease: strong oscillation, poor throughput. Rush-hour effect.

Rush Hour Effect

Easy to drive the network into saturation, but difficult for the network to recover.
Analogy to rush hour traffic

[Figure: rate of arrivals & departures, and queue size.]

Additive Increase/Multiplicative Decrease

Algorithm
  Increment cwnd by one packet per RTT
    Linear increase
  Divide CongestionWindow by two whenever a timeout occurs
    Multiplicative decrease

[Figure: source and destination timelines.]

AIMD: additive increase, multiplicative decrease
  increase window by 1 per RTT
  decrease window by factor of 2 on loss event
Fairness goal: if N TCP sessions share same bottleneck link, each should get 1/N of link capacity

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R.]

Why AIMD?

Model: Two sessions compete for R bandwidth

[Figure: Conn 1 throughput (x-axis, up to R) against Conn 2 throughput, with the full utilization line. Regions: underutilized & unfair to 1, underutilized & unfair to 2, overutilized & unfair to 1, overutilized & unfair to 2, and the desired region.]

Model assumptions
  Sessions know if link is overused (losses)
  Sessions don't know relative rates
  Simplification: Sessions respond simultaneously, and in the same direction (both increase or both decrease)

[Figure: Conn 1 throughput against the full utilization line.]

AIMD Convergence

Additive Increase: up at 45° angle (both connections add 1)
Multiplicative Decrease: down toward the origin

[Figure: Conn 1 throughput (up to R) with the full utilization line; the trajectory ends at the pt. of convergence, marked X.]
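
The convergence argument can be checked numerically under the model assumptions above (both sessions react simultaneously and in the same direction). This is a toy sketch: the function name `aimd`, the step size of 1 and the halving factor are illustrative choices, and rates are unitless.

```python
def aimd(x1: float, x2: float, capacity: float, steps: int,
         increase: float = 1.0, decrease: float = 0.5):
    """Simulate two AIMD sessions sharing one bottleneck; return the trace."""
    trace = [(x1, x2)]
    for _ in range(steps):
        if x1 + x2 > capacity:
            # Link overused: both sessions see loss and back off together.
            x1 *= decrease
            x2 *= decrease
        else:
            # No loss: both sessions probe for more bandwidth.
            x1 += increase
            x2 += increase
        trace.append((x1, x2))
    return trace


if __name__ == "__main__":
    final = aimd(10.0, 80.0, capacity=100.0, steps=500)[-1]
    print(final)
```

Additive steps leave the difference between the two rates unchanged, while every multiplicative decrease halves it, so the trajectory drifts toward the equal-share line regardless of the starting point.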

TCP Congestion Avoidance

When a new segment is acked, the sender does the following:
  If (cwnd < ssthresh) cwnd += segSize
  else cwnd += segSize * segSize / cwnd
  (What happens when an ACK arrives for x new segments?)
On timeout:
  ssthresh := cwnd/2
  cwnd := 1 (i.e., slow start)

Congestion Avoidance Typical Trace

Trace: sawtooth behavior

[Figure: sawtooth trace, y-axis in KB (10 to 70), x-axis Time (seconds) from 1.0 to 10.0.]
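
The per-ACK and per-timeout rules can be sketched as a small class. Assumptions, none of which come from the slides: `cwnd` and `ssthresh` are kept in bytes, so the congestion-avoidance increment is scaled to roughly one segment per RTT; "cwnd := 1" is read as one segment; and the class name and default values are hypothetical.

```python
class CongestionWindow:
    """Slow start below ssthresh, additive increase above it."""

    def __init__(self, seg_size: int = 512,
                 initial_ssthresh: float = 65535.0) -> None:
        self.seg_size = seg_size
        self.cwnd = float(seg_size)        # start with one segment
        self.ssthresh = initial_ssthresh   # assumed initial threshold

    def on_new_ack(self) -> None:
        if self.cwnd < self.ssthresh:
            # Slow start: one segment per ACK, doubling per RTT.
            self.cwnd += self.seg_size
        else:
            # Congestion avoidance: about one segment per RTT in total.
            self.cwnd += self.seg_size * self.seg_size / self.cwnd

    def on_timeout(self) -> None:
        self.ssthresh = self.cwnd / 2
        self.cwnd = float(self.seg_size)   # back to slow start
```

Feeding a long run of ACKs followed by occasional timeouts reproduces the sawtooth shape of the trace.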

Fast Retransmit and Fast Recovery

Problem: crude TCP timeouts lead to idle periods, slow start is not fast
Fast retransmit: use duplicate ACKs to trigger retransmission
Fast recovery: skip slow start, go directly to half the last successful cwnd (called ssthresh)

[Figure: sender and receiver timelines. Sender transmits Packet 1 to Packet 6; receiver returns ACK 1, then ACK 2 four times; labels: TIMEOUT!, Retransmit packet 3; receiver finally returns ACK 6.]

TCP Congestion Control: summary

Maintain threshold window size (last good estimate)
Threshold value
  Initially set to maximum window size
  Set to 1/2 of current window on timeout or 3 dup ACKs
Congestion window drops to 1 on timeout, drops by half on 3 dup ACKs
When congestion window smaller than threshold:
  Double window for each window ACK'd (multiplicative increase)
When congestion window larger than threshold:
  Increase window by one MSS for each window ACK'd
Try to avoid timeouts by fast retransmit

TCP Congestion Window Trace

[Figure: TCP Reno congestion window (0 to 70) and threshold against time (0 to 60), annotated with the slow start period, additive increase, fast retransmission and timeouts.]

TCP Dynamics: Rate

Sending rate: Congwin*MSS / RTT
Assume fixed RTT

[Figure: congestion window oscillating between W/2 and W.]

Actual Sending rate:
  between W*MSS / RTT and (1/2) W*MSS / RTT
  Average (3/4) W*MSS / RTT

TCP Dynamics: Loss

Loss rate (TCP Reno)
Consider a cycle

[Figure: one sawtooth cycle, window growing from W/2 to W.]

  Total packets sent: about (3/8) W^2 = O(W^2)
  One packet loss
Loss Probability: p = O(1/W^2) or W = O(1/sqrt(p))

Congestion Avoidance

TCP's strategy: increase load until congestion occurs, then back off
Alternative Strategy
  Predict when congestion is about to happen and reduce rate just before packets start being discarded
Two possibilities
  Some help from network: DECbit, RED
  Host-centric: TCP Vegas
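
The cycle argument for TCP Reno can be checked with a few lines of arithmetic. The sketch assumes the idealized sawtooth (window grows by one packet per RTT from W/2 to W, then exactly one loss), so p is about 8/(3 W^2); the function names are hypothetical.

```python
import math


def packets_per_cycle(w: int) -> int:
    """Packets sent while the window climbs from W/2 to W, one step per RTT."""
    return sum(range(w // 2, w + 1))


def window_from_loss(p: float) -> float:
    """Peak window W implied by one loss per (3/8) W^2 packets."""
    return math.sqrt(8.0 / (3.0 * p))


def average_rate(mss_bytes: float, rtt_s: float, p: float) -> float:
    """Average sending rate (3/4) W * MSS / RTT in bytes per second."""
    return 0.75 * window_from_loss(p) * mss_bytes / rtt_s
```

For W = 100 the exact per-cycle count is 3825 packets against the approximation (3/8) W^2 = 3750, and the resulting rate scales as 1/sqrt(p).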
