
Linux TCP/IP Stack

TCP / IP

vs.

Process

Socket layer

Protocol Layer (TCP / IP)

Interface Layer (Ethernet, etc.)

OSI model
7: Application
6: Presentation
5: Session

4: Transport
3: Network

2: Data Link

1: Physical

TCP/IP Stack Overview


Process

1: sosend (... )

5: recvfrom(.)

Socket Layer

2: tcp_output ( . )

4: tcp_input ( ... )

Protocol Layer (TCP Layer)

3: ip_output ( . )

3: ip_input ( ... )

Protocol Layer (IP Layer)

4: ethernet_output ( . )

2: ethernet_input ( .. )
Interface Layer (Ethernet Device Driver)

Physical Media

Output Queue

Input Queue

Process Layer

to

TCP Layer

send (int socket, const char *buf, int length, int flags)

Process

Kernel

sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)

sendit (struct proc *p, int socket, struct msghdr *mp, int flags, int *return_size)

sosend (struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *top, struct mbuf *control, int flags )

tcp_usrreq (struct socket *s, int request, struct mbuf *m, struct mbuf *nam, struct mbuf *control)

TCP Layer

tcp_output (struct tcpcb *tp)

uipc_syscalls.c

uipc_socket.c

tcp_usrreq.c

tcp_output.c

Socket Layer
sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)
MBUF Chain
m_next = NULL

m_next

28 Bytes

m_nextpkt = NULL

m_nextpkt = NULL

m_len = 100

m_len = 50

m_data

20 Bytes

m_type = MT_DATA

m_type = MT_DATA
data_buffer

m_flags = M_PKTHDR

m_flags = 0

m_pkthdr.len = 150

128 Bytes
mBuf

150 Bytes
Data

m_data

m_pkthdr.recvif =NULL

100 Bytes

50 Bytes

Data

58 Bytes

Unused Space

Data
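
The chain in the figure can be sketched in C as follows. This is a simplified stand-in, not the kernel's struct mbuf: the type name mbuf_sketch, the field pkthdr_len and the helper functions are assumptions made for illustration, and only the fields shown in the figure are modelled.

```c
#include <assert.h>
#include <stddef.h>

#define M_PKTHDR 0x0002  /* illustrative value: marks the first mbuf of a packet */

/* Simplified stand-in for an mbuf; only the fields from the figure. */
struct mbuf_sketch {
    struct mbuf_sketch *m_next;    /* next mbuf of the same packet */
    struct mbuf_sketch *m_nextpkt; /* next packet in a queue */
    int m_len;                     /* bytes of data held by this mbuf */
    int m_flags;                   /* M_PKTHDR on the first mbuf only */
    int pkthdr_len;                /* total packet length, valid with M_PKTHDR */
};

/* Walk m_next and add up m_len for every mbuf in the chain. */
int chain_length(const struct mbuf_sketch *m)
{
    int total = 0;
    for (; m != NULL; m = m->m_next) {
        assert(m->m_len >= 0);
        total += m->m_len;
    }
    return total;
}

/* Build the two-mbuf chain from the figure (100 + 50 bytes) and check
 * that the per-mbuf lengths add up to the packet header length. */
int figure_chain_is_consistent(void)
{
    struct mbuf_sketch second = { NULL, NULL, 50, 0, 0 };
    struct mbuf_sketch first = { &second, NULL, 100, M_PKTHDR, 150 };
    return chain_length(&first) == first.pkthdr_len;
}
```

The point of the sketch is the invariant: m_pkthdr.len on the first mbuf equals the sum of m_len over the chain.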

Socket Layer - sosend passes data and control information to the protocol layer
sosend(struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *data_buffer, struct mbuf *control, int flags )

Initialize a new memory buffer and variables to hold flags

no
Is there enough space
in the buffer
sbspace(&s->so_snd)
yes
Copy data_buffer

mbuf

int error = tcp_usrreq(s, flags, mbuf, addr, control)


yes

More buffers
to send?

1
error

no

Free the memory buffers received

Return value of error to sendto( )
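
The loop in this flowchart can be sketched roughly as below. It is a user-space simplification under stated assumptions: sosend_sketch, proto_send_fn, count_calls and calls_made are hypothetical names, the available buffer space is passed in as a plain number, and the sketch never blocks waiting for space as a real socket layer may.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback standing in for the protocol send request. */
typedef int (*proto_send_fn)(const char *chunk, size_t len);

/* Hand len bytes to the protocol in chunks of at most space bytes.
 * Stops at the first error and returns it; returns 0 on success. */
int sosend_sketch(const char *buf, size_t len, size_t space, proto_send_fn send)
{
    assert(space > 0);
    int error = 0;
    while (len > 0 && error == 0) {   /* "More buffers to send?" */
        size_t chunk = len < space ? len : space;
        error = send(buf, chunk);     /* pass one chunk to the protocol */
        buf += chunk;
        len -= chunk;
    }
    return error;                     /* value handed back to the caller */
}

/* Example callback that only counts how often it was invoked. */
size_t calls_made = 0;

int count_calls(const char *chunk, size_t len)
{
    (void)chunk;
    (void)len;
    calls_made++;
    return 0;
}
```

With 150 bytes of data and 100 bytes of space per pass, the callback is invoked twice (100 bytes, then 50).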

TCP Layer - tcp_usrreq(struct socket *s, int request, struct mbuf *data_buffer, struct mbuf *nam, struct mbuf *control)
Initialize internet protocol control block inp and
TCP control block tp
to store information useful for TCP

Convert Socket to
Internet Protocol Control Block
inp = sotoinpcb(so)

Convert the internet protocol control block
to a TCP control block
tp = intotcpcb(inp)

request

PRU_SEND
int error = tcp_output(tp)

return error
to sosend( )
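
A minimal sketch of this dispatch, assuming simplified stand-in structures: the *_sketch types and functions are hypothetical, the numeric value of PRU_SEND is illustrative, and appending the data to the send buffer before calling the output routine is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define PRU_SEND 9  /* illustrative request code */

/* Stand-ins for the TCP control block, the Internet PCB and the socket. */
struct tcpcb_sketch  { int output_calls; };
struct inpcb_sketch  { struct tcpcb_sketch *inp_ppcb; };
struct socket_sketch { struct inpcb_sketch *so_pcb; };

/* The two conversions are plain pointer hops. */
static struct inpcb_sketch *sotoinpcb_sketch(struct socket_sketch *so)
{
    return so->so_pcb;
}

static struct tcpcb_sketch *intotcpcb_sketch(struct inpcb_sketch *inp)
{
    return inp->inp_ppcb;
}

/* Placeholder for the output routine: records that it was called. */
static int tcp_output_sketch(struct tcpcb_sketch *tp)
{
    tp->output_calls++;
    return 0;
}

int tcp_usrreq_sketch(struct socket_sketch *so, int request)
{
    assert(so != NULL);
    struct inpcb_sketch *inp = sotoinpcb_sketch(so);
    if (inp == NULL)
        return EINVAL;                 /* socket has no protocol state */
    struct tcpcb_sketch *tp = intotcpcb_sketch(inp);

    switch (request) {
    case PRU_SEND:
        return tcp_output_sketch(tp);  /* error is passed back unchanged */
    default:
        return EOPNOTSUPP;             /* other requests are not modelled */
    }
}
```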

TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)


Called by tcp_usrreq for one of the following reasons:
To send the initial SYN
To send a FIN (finished sending)
To send data
To send a window update after data has been received.
tcp_output( ) functionality:
1. Determine whether TCP can send a segment or not depending on:
flags in the data sent by the socket layer to send an ACK, etc.
Size of the window advertised by the receiver's end.
Amount of data ready to send
whether unacknowledged data already exists for the connection
2. Calculate the amount of data to be sent depending on:
size of the receiver's window
number of bytes in the send buffer
3. Check for window shrink
4. Send a segment
Allocate a buffer for the TCP and IP header from the header template
Copy the TCP and IP header template into the buffer to be sent.
Fill the fields in the TCP header.
Decrement the number of buffers to be sent, so that the end can be checked.
Set the sequence number and acknowledgement fields.
Set three fields in the IP header - IP length, TTL and TOS.
Pass the datagram to IP

TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)


struct socket *so = tp -> t_inpcb -> inp_socket

Initialize a TCP header tcp_header

idle is true if the maximum sequence number sent
equals the oldest unacknowledged sequence number,
i.e. if an ACK is not expected from the other end.
int idle = (tp -> snd_max == tp -> snd_una)
false
idle
true

Check ACK Flag


Acknowledgement is
not expected, set the
congestion window to
one segment
tp -> snd_cwnd =
tp -> t_maxseg;

TCP Layer - tcp_output(struct tcpcb *tp)



off is the offset in bytes from the beginning of
the send buffer of the first data byte to send.
off bytes have already been sent and
acknowledgement of those is awaited.
int off = tp -> snd_nxt - tp -> snd_una

Determine the length of data that should
be transmitted and the flags to be used.
len is the minimum of the number of bytes in the
send buffer and win (the minimum of the receiver's
window and the congestion window), less off.
len = min(so -> so_snd.sb_cc, win) - off

Determine the flags like TH_ACK, TH_FIN,
TH_RST, TH_SYN
flags = tcp_outflags [ tp -> t_state ]

TCP Layer - tcp_output(struct tcpcb *tp)



tp -> t_flags &


TF_ACKNOW

true
Send acknowledgement

false

flags &
(TH_SYN | TH_RST)

true

Send sequence number
or reset

false

flags &
TH_FIN

false

true
Finished sending

Check flags to determine the type of message:
window probe
retransmission
normal data transmission

Allocate an mbuf for the TCP & IP header and data if possible.
MGETHDR ( m, M_DONTWAIT, MT_HEADER)
M_DONTWAIT indicates that if memory is not available for the
mbuf, the routine returns immediately with an error.

Length of data <= 44 bytes?
(100 - 40 - 16)

no

Create a new mbuf chain,
copy the surplus data and
point it to the first mbuf chain.

yes
Copy the data from the socket send buffer into the
new packet header mbuf

ip_output(m, tp -> t_inpcb -> inp_options, &tp -> t_inpcb -> inp_route,
so -> so_options & SO_DONTROUTE, 0)
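
The 44-byte test above follows from simple arithmetic, sketched here. The constant names are assumptions for illustration: a packet header mbuf with 100 bytes of data space, 40 bytes of TCP and IP headers, and 16 bytes reserved for the link-layer header. Treating exactly 44 bytes as fitting is also an assumption of this sketch.

```c
#include <assert.h>

/* Illustrative sizes taken from the slide's arithmetic. */
enum {
    MHLEN_SKETCH = 100,       /* data space in a packet header mbuf */
    TCPIP_HDR_LEN = 40,       /* 20-byte IP header + 20-byte TCP header */
    MAX_LINKHDR_SKETCH = 16   /* room kept free for the link-layer header */
};

/* Returns 1 if len bytes of data fit next to the headers in one mbuf,
 * 0 if the data has to go into a separate mbuf chain. */
int data_fits_in_header_mbuf(int len)
{
    assert(len >= 0);
    return len <= MHLEN_SKETCH - TCPIP_HDR_LEN - MAX_LINKHDR_SKETCH;
}
```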

ip_output.c
ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags, struct ip_moptions *imo)
1. Header initialization
2. Route Selection
3. Source address selection and Fragmentation
1. Header initialization
Packets
damaged?

ERROR

yes

Check if there were any errors while adding headers in higher
layers. Most of the fields of the IP header are predefined by
higher layer protocols.

no
if (flags & (IP_FORWARDING | IP_RAWOUTPUT))
yes
no

Save header length in hlen
for fragmentation algorithm

Construct and initialize IP header
set ip_v = 4, clear ip_off
assign unique identifier to ip_id
length, offset, TTL, protocol, TOS etc.
are set by higher layers.

The value of flags decides what is to be done with the data
IP_FORWARDING : Forward packet
IP_ROUTETOIF : Route directly to interface
IP_ALLOWBROADCAST : Allow broadcasting of packet
IP_RAWOUTPUT : Packet contains pre-constructed header
If the packet has to be forwarded to another host, i.e. if the
machine is acting as a router, then the IP header for forwarded
packets should not be modified by ip_output.

If the packet is not being forwarded and has to be sent to
another host, then initialize the IP header.

2. Route Selection

A cached route may be provided to ip_output as an
argument. UDP and TCP maintain a route cache
associated with each socket.

Verify Cached Route for
destination address

If (cached_route == destination)

no

Locate route: Call rtalloc(dst_ip) to
locate a route to the destination. Find
the interface on which the packet has
to be placed. ifp points to the
interface's ifnet structure. If
rtalloc(dst_ip) fails to find a route,
return a host unreachable error.

yes

Find the interface on which the
packet has to be placed. ifp points to
the interface's ifnet structure.

Check if the cached route is the correct destination. If a
route has not been provided, ip_output sets up a temporary
route structure called iproute.

If the cached route is provided, find the interface on
which the frame has to be sent.

If the packet is being routed, rtalloc locates a route to
the address specified by dst. If rtalloc fails, an
EHOSTUNREACH error is generated. If ip_forward called
ip_output, the error is converted to an ICMP error.
If the address is found, then ifp is made to point to the
ifnet structure for the interface. If the next hop is not the
packet's final destination, then dst is changed to point to
the next hop router.
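
Route selection with a cached route can be sketched as below. Everything here is a simplified assumption: the *_sketch types, the two-entry table and the exact-match lookup stand in for the real routing table, and addresses are plain 32-bit numbers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* One routing entry: gateway is 0 when the destination is on-link. */
struct rtentry_sketch { uint32_t dst; uint32_t gateway; int ifindex; };

/* Cached route: the destination it was looked up for and the entry. */
struct route_sketch { uint32_t ro_dst; const struct rtentry_sketch *ro_rt; };

/* Tiny illustrative table. */
static const struct rtentry_sketch route_table[] = {
    { 0x0A000002u, 0,           1 },  /* 10.0.0.2, directly reachable */
    { 0xC0A80105u, 0x0A000001u, 1 },  /* 192.168.1.5 via 10.0.0.1 */
};

/* Stand-in for rtalloc: exact-match linear search. */
static const struct rtentry_sketch *rtalloc_sketch(uint32_t dst)
{
    for (size_t i = 0; i < sizeof route_table / sizeof route_table[0]; i++)
        if (route_table[i].dst == dst)
            return &route_table[i];
    return NULL;
}

/* Use the cached route if it matches dst, otherwise look one up.
 * On success fills in the next hop and the outgoing interface. */
int select_route(struct route_sketch *ro, uint32_t dst,
                 uint32_t *next_hop, int *ifindex)
{
    assert(ro != NULL && next_hop != NULL && ifindex != NULL);
    if (ro->ro_rt == NULL || ro->ro_dst != dst) {
        ro->ro_dst = dst;
        ro->ro_rt = rtalloc_sketch(dst);
    }
    if (ro->ro_rt == NULL)
        return EHOSTUNREACH;

    *ifindex = ro->ro_rt->ifindex;
    /* If the route goes through a gateway, send to the gateway instead. */
    *next_hop = ro->ro_rt->gateway != 0 ? ro->ro_rt->gateway : dst;
    return 0;
}
```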

3. Source address selection and Fragmentation

Check if a valid source
address is specified.

no

Select the IP address of the outgoing
interface as the source address.

The final section of ip_output ensures that the
IP header has a valid source IP address. This
could not have been done earlier because the route
had not been selected yet. If there is no source IP, then
the IP address of the outgoing interface is used as the
source IP.

yes

Does the packet have
to be fragmented?

yes

Fragment the packet if its size is
greater than the MTU.

Larger packets (packets that exceed the MTU) must
be fragmented before they can be sent.
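
As a rough illustration of the fragmentation decision, the sketch below counts how many fragments a datagram needs. The function name is hypothetical; the only protocol facts it relies on are that every fragment repeats the IP header and that the data in every fragment except the last must be a multiple of 8 bytes.

```c
#include <assert.h>

/* ip_len: total datagram length including the header, in bytes.
 * hlen:   IP header length in bytes.
 * mtu:    MTU of the outgoing interface.
 * Returns the number of packets handed to the interface, or -1 if the
 * MTU is too small to carry any data at all. */
int fragment_count(int ip_len, int hlen, int mtu)
{
    assert(hlen >= 20 && ip_len >= hlen);
    if (ip_len <= mtu)
        return 1;                        /* fits: no fragmentation */

    int per_fragment = (mtu - hlen) & ~7;   /* round down to 8 bytes */
    if (per_fragment < 8)
        return -1;

    int data = ip_len - hlen;
    return (data + per_fragment - 1) / per_fragment;  /* round up */
}
```

For a 4020-byte datagram with a 20-byte header and an MTU of 1500, each fragment carries up to 1480 data bytes, so three fragments are needed.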

no

If there are no checksum errors, send
the data to the if_output function of the
selected interface.

In either case (fragmented or not) the checksum is
computed (in_cksum). If no errors are found, the
data is sent to the if_output function of the output
interface.
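
The checksum referred to here is the standard Internet checksum: the one's complement of the one's complement sum of the header taken as 16-bit words. The sketch below works on a flat byte array rather than an mbuf chain, and the function name in_cksum_sketch is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over len bytes, read as big-endian 16-bit words.
 * The checksum field itself must be zero while computing. */
uint16_t in_cksum_sketch(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)                       /* odd trailing byte, zero padded */
        sum += (uint32_t)data[0] << 8;

    while (sum >> 16)                   /* fold carries back into 16 bits */
        sum = (sum & 0xFFFF) + (sum >> 16);

    return (uint16_t)~sum;
}
```

For the 20-byte header 45 00 00 73 00 00 40 00 40 11 00 00 c0 a8 00 01 c0 a8 00 c7 the result is 0xB861. Running the same routine over the header with that value stored in the checksum field yields 0, which is how a receiver verifies it.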

Interface Layer (if_ethersubr.c)


ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *routing_entry)
1. Verification
2. Protocol-Specific Processing
3. Frame Construction
4. Interface Queuing.

1. Verification

Ethernet port
up and running?
ifp -> if_flags &
(IFF_UP | IFF_RUNNING)

yes

no
senderr (ENETDOWN)

Interface Layer (if_ethersubr.c) - ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *rt_entry)
Function: Takes the data portion of an Ethernet frame, encapsulates it with a 14-byte header and places it on the interface send queue.
Phases: Verification, Protocol-Specific Processing, Frame Construction, Interface Queuing.

Arguments: ifp points to the outgoing interface's ifnet structure
mbuf is the data to be sent
destination is the destination address
rt_entry points to the routing entry

Initialize Ethernet header - struct eth_header *eh


Verification
Ethernet port
up and running?
ifp -> if_flags &
(IFF_UP | IFF_RUNNING)

yes

no
senderr (ENETDOWN)
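
The up-and-running test checks that both flag bits are set, not just one of them. A minimal sketch, with illustrative flag values and a hypothetical function name:

```c
#include <assert.h>
#include <errno.h>

/* Illustrative bit values for the two interface flags. */
#define IFF_UP      0x1   /* interface is administratively up */
#define IFF_RUNNING 0x40  /* driver has resources allocated */

/* Returns 0 if the interface can transmit, ENETDOWN otherwise. */
int ether_verify_interface(int if_flags)
{
    const int needed = IFF_UP | IFF_RUNNING;
    return (if_flags & needed) == needed ? 0 : ENETDOWN;
}
```

Comparing the masked value against the full mask matters: a plain non-zero test would also accept an interface that is up but not running.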

0
Route valid ?
rt_entry = rtalloc1 (destination, 1)

senderr (EHOSTUNREACH)

Next hop a gateway ?


rt = rt -> rt_gwroute

1
Destination responding
to ARP requests?
If not then do not send more
packets to avoid flooding.
rt -> rt_flags &
RTF_REJECT

no

Verification

Protocol Specific Processing

Functionality: Finds Ethernet address corresponding to the IP address of the destination.

Protocol Specific Processing

destination -> sa_family

AF_INET

Send ARP broadcast to find the
Ethernet address corresponding to the
destination IP address

Use m_copy( ) to keep the packet until
an ACK is received.

Frame Preparation

Protocol Specific Processing

Frame Preparation
Make sure there is room for the 14 byte
Ethernet header
M_PREPEND ( m, sizeof(ethernet_header),
M_DONTWAIT)

Form the Ethernet header from
Ethernet frame type,
Ethernet MAC address,
unicast Ethernet address associated
with the output interface.
e.g. the default gateway for a host
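
The 14-byte header consists of a 6-byte destination address, a 6-byte source address and a 2-byte frame type in network byte order. The sketch below fills a plain byte array instead of prepending to an mbuf; the function and constant names are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum {
    ETHER_ADDR_LEN_SKETCH = 6,
    ETHER_HDR_LEN_SKETCH = 14,
    ETHERTYPE_IP_SKETCH = 0x0800   /* frame type for IPv4 */
};

/* Write destination, source and type into hdr (14 bytes). */
void ether_build_header(uint8_t *hdr, const uint8_t *dst,
                        const uint8_t *src, uint16_t type)
{
    assert(hdr != NULL && dst != NULL && src != NULL);
    memcpy(hdr, dst, ETHER_ADDR_LEN_SKETCH);
    memcpy(hdr + ETHER_ADDR_LEN_SKETCH, src, ETHER_ADDR_LEN_SKETCH);
    hdr[12] = (uint8_t)(type >> 8);     /* high byte first on the wire */
    hdr[13] = (uint8_t)(type & 0xFF);
}
```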

Frame Preparation

Interface Queuing
yes
Is the output queue full

Discard the frame
Free the memory buffer
senderr ( ENOBUFS )

no

Place the frame on the
interface's send queue

lestart ( ifp )

if_snd

lestart ( ifp )
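
The queuing step behaves like a bounded FIFO: a full queue drops the frame and reports ENOBUFS, otherwise the frame is appended and the driver's start routine later takes frames off the front. The sketch uses a small fixed-size ring; the type, the function names and the queue limit of 4 are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum { IFQ_MAXLEN_SKETCH = 4 };  /* deliberately tiny queue limit */

struct ifqueue_sketch {
    const void *frames[IFQ_MAXLEN_SKETCH];
    int head;    /* index of the oldest frame */
    int len;     /* frames currently queued */
    int drops;   /* frames discarded because the queue was full */
};

/* Append a frame, or drop it and return ENOBUFS if the queue is full. */
int ifq_enqueue(struct ifqueue_sketch *q, const void *frame)
{
    assert(q != NULL);
    if (q->len == IFQ_MAXLEN_SKETCH) {
        q->drops++;
        return ENOBUFS;
    }
    q->frames[(q->head + q->len) % IFQ_MAXLEN_SKETCH] = frame;
    q->len++;
    return 0;
}

/* Remove and return the oldest frame, or NULL if the queue is empty.
 * This is the operation a start routine would perform. */
const void *ifq_dequeue(struct ifqueue_sketch *q)
{
    assert(q != NULL);
    if (q->len == 0)
        return NULL;
    const void *frame = q->frames[q->head];
    q->head = (q->head + 1) % IFQ_MAXLEN_SKETCH;
    q->len--;
    return frame;
}
```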

Interface Layer(if_le.c) - lestart(struct ifnet *ifp)


Function: Dequeues frames from the interface output queue and arranges for them to be transmitted by the Ethernet Card.

struct le_softc *le = & le_softc [ ifp -> if_unit ]

0
le -> sc_if.if_flags &
IFF_RUNNING

1
Copy the frame in mbuf to the
hardware buffer

Set the IFF_OACTIVE flag to indicate that the
device is busy transmitting.

return error
