
Linux TCP/IP Stack

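TCP / IP vs. OSI model

OSI model (top to bottom):
7: Application
6: Presentation
5: Session
4: Transport
3: Network
2: Data Link
1: Physical

Kernel organization (top to bottom), drawn alongside the OSI layers:
Process (OSI layers 5 to 7)
Socket layer
Protocol Layer (TCP / IP)
Interface Layer (Ethernet, etc.)
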
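TCP/IP Stack Overview

Left column: output path, top to bottom. Right column: input path, bottom to top. The numbers give the order of the calls on each path.

Process
    1: sosend (...)             5: recvfrom (...)
Socket Layer
    2: tcp_output (...)         4: tcp_input (...)
Protocol Layer (TCP Layer)
    3: ip_output (...)          3: ip_input (...)
Protocol Layer (IP Layer)
    4: ethernet_output (...)    2: ethernet_input (...)
Interface Layer (Ethernet Device Driver)
Physical Media

Queues: Output Queue (output path), Input Queue (input path)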


Process Layer to TCP Layer

Each call hands the data to the next one, top to bottom.

Process:
send (int socket, const char *buf, int length, int flags)

Kernel:
sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)
sendit (struct proc *p, int socket, struct msghdr *mp, int flags, int *return_size) (uipc_syscalls.c)
sosend (struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *top, struct mbuf *control, int flags) (uipc_socket.c)
tcp_usrreq (struct socket *s, int request, struct mbuf *m, struct mbuf *nam, struct mbuf *control) (tcp_usrreq.c)

TCP Layer:
tcp_output (struct tcpcb *tp) (tcp_output.c)


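Socket Layer

sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)

MBUF Chain
The 150 bytes of data in data_buffer are copied into a chain of two 128-byte mbufs.

First mbuf (packet header mbuf):
    m_next = pointer to the second mbuf
    m_nextpkt = NULL
    m_len = 100
    m_data = pointer to the start of the data
    m_type = MT_DATA
    m_flags = M_PKTHDR
    m_pkthdr.len = 150
    m_pkthdr.recvif = NULL
    Layout: 28 bytes of header, 100 bytes of data

Second mbuf:
    m_next = NULL
    m_nextpkt = NULL
    m_len = 50
    m_data = pointer to the start of the data
    m_type = MT_DATA
    m_flags = 0
    Layout: 20 bytes of header, 50 bytes of data, 58 bytes of unused space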
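
The chain above can be modelled in a few lines of C. This is a minimal sketch, not the kernel's struct mbuf: the type sk_mbuf, the flag value SK_M_PKTHDR and both helper functions are hypothetical names introduced for illustration. It only shows that the packet header length of the first mbuf equals the sum of m_len over the chain.

```c
#include <stddef.h>

/* Illustrative flag value; the real M_PKTHDR constant lives in the kernel headers. */
#define SK_M_PKTHDR 0x0002

/* Simplified stand-in for an mbuf: only the fields named on the slide. */
struct sk_mbuf {
    struct sk_mbuf *m_next;     /* next mbuf of the same packet */
    struct sk_mbuf *m_nextpkt;  /* next packet in a queue (unused here) */
    int             m_len;      /* bytes of data stored in this mbuf */
    int             m_flags;    /* SK_M_PKTHDR on the first mbuf only */
    int             pkthdr_len; /* total packet length, first mbuf only */
};

/* Follow m_next and add up m_len for every mbuf in the chain. */
int sk_chain_len(const struct sk_mbuf *m)
{
    int total = 0;
    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}

/* Fill caller-provided storage with the two-mbuf chain from the slide. */
struct sk_mbuf *sk_build_example(struct sk_mbuf *first, struct sk_mbuf *second)
{
    second->m_next = NULL;
    second->m_nextpkt = NULL;
    second->m_len = 50;
    second->m_flags = 0;
    second->pkthdr_len = 0;

    first->m_next = second;
    first->m_nextpkt = NULL;
    first->m_len = 100;
    first->m_flags = SK_M_PKTHDR;
    first->pkthdr_len = sk_chain_len(first); /* 100 + 50 */
    return first;
}
```

Calling sk_build_example on two local sk_mbuf variables and then sk_chain_len on the result walks both mbufs and stops at the NULL m_next of the second one.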
Socket Layer - sosend passes data and control information to the protocol layer
sosend(struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *data_buffer, struct mbuf *control, int flags)

1. Initialize a new memory buffer and variables to hold flags.
2. Is there enough space in the buffer? sbspace(s->sb_snd)
    no: loop back and check again
    yes: continue
3. Copy data_buffer into an mbuf.
4. int error = tcp_usrreq(s, flags, mbuf, addr, control)
5. Error received?
    1: Free the memory buffers, then return the value of error to sendto ( ).
    0: More buffers to send?
        yes: go back to step 2
        no: Return the value of error to sendto ( ).
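
The loop in this flowchart can be sketched in user-space C. Everything here is an assumption made for illustration: sk_sockbuf, SK_CHUNK, sk_sbspace, sk_proto_send and sk_sosend are hypothetical names, the send buffer is a fixed 512-byte array, and the sketch returns -1 when space runs out, whereas the flowchart loops back to the space check.

```c
#include <stddef.h>
#include <string.h>

#define SK_CHUNK 100 /* hypothetical amount of data handed down per iteration */

/* Simplified send buffer: a byte array plus a count of queued bytes. */
struct sk_sockbuf {
    char   data[512];
    size_t cc; /* bytes currently queued */
};

/* Free space left in the send buffer (the role sbspace plays on the slide). */
size_t sk_sbspace(const struct sk_sockbuf *sb)
{
    return sizeof sb->data - sb->cc;
}

/* Stand-in for the protocol request: queue the chunk, return 0 on success. */
int sk_proto_send(struct sk_sockbuf *sb, const char *chunk, size_t n)
{
    memcpy(sb->data + sb->cc, chunk, n);
    sb->cc += n;
    return 0;
}

/* Hand the user data to the protocol layer one chunk at a time. */
int sk_sosend(struct sk_sockbuf *sb, const char *buf, size_t len)
{
    while (len > 0) { /* "more buffers to send?" */
        size_t n = len < SK_CHUNK ? len : SK_CHUNK;
        if (sk_sbspace(sb) < n)
            return -1; /* assumption: give up instead of waiting for space */
        int error = sk_proto_send(sb, buf, n);
        if (error)
            return error; /* error received: stop and report it */
        buf += n;
        len -= n;
    }
    return 0;
}
```

Returning early on the first error mirrors the "error received" branch: nothing further is queued once the protocol layer reports a failure.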
TCP Layer - tcp_usrreq(struct socket *s, int request, struct mbuf *data_buffer, struct mbuf *nam, struct mbuf *control)

1. Initialize the Internet protocol control block inp and the TCP control block tp, which store information useful for TCP.
2. Convert the socket to an Internet protocol control block:
    inp = sotoinpcb(so)
3. Convert the Internet protocol control block to a TCP control block:
    tp = intotcpcb(inp)
4. Switch on request. For PRU_SEND:
    int error = tcp_output(tp)
    tcp_output returns its error value to tcp_usrreq ( ).
5. return error
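
A minimal sketch of this dispatch, assuming heavily simplified control blocks. The sk_ types, the two-value request enum and the call counter are hypothetical and exist only to make the control flow visible; the real structures carry far more state and the real function handles many more requests.

```c
#include <stddef.h>

/* Hypothetical request codes; only the send request matters here. */
enum sk_request { SK_PRU_SEND, SK_PRU_OTHER };

struct sk_tcpcb  { int output_calls; };          /* TCP control block stand-in */
struct sk_inpcb  { struct sk_tcpcb *inp_ppcb; }; /* Internet PCB stand-in */
struct sk_socket { struct sk_inpcb *so_pcb; };   /* socket stand-in */

/* Pointer-following conversions in the spirit of sotoinpcb and intotcpcb. */
#define sk_sotoinpcb(so)  ((so)->so_pcb)
#define sk_intotcpcb(inp) ((inp)->inp_ppcb)

/* Stand-in for tcp_output: records the call and reports success. */
int sk_tcp_output(struct sk_tcpcb *tp)
{
    tp->output_calls++;
    return 0;
}

/* Resolve socket -> inp -> tp, then dispatch on the request code. */
int sk_tcp_usrreq(struct sk_socket *so, enum sk_request request)
{
    struct sk_inpcb *inp = sk_sotoinpcb(so);
    if (inp == NULL)
        return -1; /* no protocol control block attached */

    struct sk_tcpcb *tp = sk_intotcpcb(inp);
    int error;

    switch (request) {
    case SK_PRU_SEND:
        error = sk_tcp_output(tp);
        break;
    default:
        error = -1; /* other requests are not modelled in this sketch */
        break;
    }
    return error;
}
```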
TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)
Called by tcp_usrreq for one of the following reasons:
    To send the initial SYN
    To send a finished_sending message (FIN)
    To send data
    To send a window update after data has been received

tcp_output ( ) functionality:
1. Determine whether TCP can send a segment, depending on:
    flags in the data sent by the socket layer (to send an ACK, etc.)
    the size of the window advertised by the receiver's end
    the amount of data ready to send
    whether unacknowledged data already exists for the connection

2. Calculate the amount of data to be sent, depending on:
    the size of the receiver's window
    the number of bytes in the send buffer

3. Check for window shrink.

4. Send a segment:
    Allocate a buffer for the TCP and IP headers.
    Copy the TCP and IP header template into the buffer to be sent.
    Fill in the fields of the TCP header.
    Decrement the number of buffers to be sent, so that the end can be checked.
    Set the sequence number and acknowledgement fields.
    Set three fields in the IP header: IP length, TTL and TOS.
    Pass the datagram to IP.
TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)

struct socket *so = tp->t_inpcb->inp_socket

Initialize a tcp header tcp_header.

idle is true if the maximum sequence number sent equals the oldest unacknowledged sequence number, that is, if an ACK is not expected from the other end.
int idle = (tp->snd_max == tp->snd_una)

idle?
    false: Check ACK flag.
    true: Acknowledgement is not expected, so set the congestion window to one segment:
        tp->snd_cwnd = tp->t_maxseg;

off is the offset in bytes, from the beginning of the send buffer, of the first data byte to send. off bytes have already been sent and their acknowledgement is awaited.
int off = tp->snd_nxt - tp->snd_una

Determine the length of data that should be transmitted and the flags to be used. len is the minimum of the number of bytes in the send buffer and win (itself the minimum of the receiver's window and the congestion window), less off.
len = min(so->so_snd.sb_cc, win) - off

Determine the flags, such as TH_ACK, TH_FIN, TH_RST, TH_SYN:
flags = tcp_outflags[tp->t_state]
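
The arithmetic above can be checked with a small stand-alone function. This is a sketch under stated assumptions: sk_tcpcb holds only the fields used here, sk_send_len is a hypothetical helper, and the result is the total number of sendable bytes, without the later clamping to one segment that a real implementation performs.

```c
#include <stdint.h>

/* Only the control block fields that the length calculation reads. */
struct sk_tcpcb {
    uint32_t snd_una;  /* oldest unacknowledged sequence number */
    uint32_t snd_nxt;  /* next sequence number to send */
    uint32_t snd_max;  /* highest sequence number sent so far */
    uint32_t snd_wnd;  /* window advertised by the receiver */
    uint32_t snd_cwnd; /* congestion window */
    uint32_t t_maxseg; /* maximum segment size */
};

static long sk_min(long a, long b)
{
    return a < b ? a : b;
}

/* Number of bytes that may be sent now, given sb_cc bytes in the send buffer. */
long sk_send_len(struct sk_tcpcb *tp, long sb_cc)
{
    int idle = (tp->snd_max == tp->snd_una);
    if (idle)
        tp->snd_cwnd = tp->t_maxseg; /* nothing in flight: one segment */

    /* Bytes already sent and still waiting for an acknowledgement. */
    long off = (long)(int32_t)(tp->snd_nxt - tp->snd_una);

    /* win is the smaller of the receiver's window and the congestion window. */
    long win = sk_min((long)tp->snd_wnd, (long)tp->snd_cwnd);

    return sk_min(sb_cc, win) - off;
}
```

With snd_una = 1000, snd_nxt = snd_max = 1200, a receiver window of 4096, a congestion window of 1024 and 900 bytes in the send buffer, off is 200 and the result is min(900, 1024) - 200 = 700.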

Decide whether a segment must be sent. A test that is true leads to sending; if it is false, the next test is tried.

tp->t_flags & TF_ACKNOW
    true: Send acknowledgement
flags & (TH_SYN | TH_RST)
    true: Send sequence number or reset
flags & TH_FIN
    true: Finished sending
Check flags to determine the type of message:
    window probe
    retransmission
    normal data transmission

Allocate an mbuf for the TCP and IP headers, and for the data if possible.
MGETHDR (m, M_DONTWAIT, MT_HEADER)
M_DONTWAIT indicates that if memory is not available for the mbuf, the routine returns immediately with an error.

Length of data < 44 bytes? (44 = 100 - 40 - 16: the 100-byte data area of a packet header mbuf, less 40 bytes of TCP and IP header and 16 bytes reserved for the link-layer header)
    yes: Copy the data from the socket send buffer into the new packet header mbuf.
    no: Create a new mbuf chain, copy the surplus data into it, and link it to the first mbuf.

ip_output(m, tp->t_inpcb->inp_options, &tp->t_inpcb->inp_route, so->so_options & SO_DONTROUTE, 0)
ip_output.c
ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags, struct ip_moptions *imo)
1. Header initialization
2. Route Selection
3. Source address selection and Fragmentation

1. Header initialization

Flow:
Packets damaged?
    yes: ERROR
    no: continue
if ((flags == IP_FORWARDING) || (flags == IP_RAWOUTPUT))
    yes: Save the header length in hlen for the fragmentation algorithm.
    no: Construct and initialize the IP header: set ip_v = 4, clear ip_off, assign a unique identifier to ip_id. Length, offset, TTL, protocol, TOS etc. are set by higher layers.

Notes:
Check if there were any errors while adding headers in higher layers. Most of the fields of the IP header are predefined by higher layer protocols.

The value of flags decides what is to be done with the data:
    IP_FORWARDING : Forward packet
    IP_ROUTETOIF : Route directly to interface
    IP_ALLOWBROADCAST : Allow broadcasting of packet
    IP_RAWOUTPUT : Packet contains pre-constructed header

If the packet has to be forwarded to another host, i.e. if the machine is acting as a router, then the IP header of the forwarded packet should not be modified by ip_output.

If the packet is not being forwarded and has to be sent to another host, then initialize the IP header.
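
The header initialization branch might look as follows in C. This is a sketch with assumptions stated up front: sk_ip is a reduced header holding only four fields, the SK_ flag values are illustrative, a header without options (20 bytes) is assumed, and the flags are tested as a bit mask, which is a substitution for the equality comparison shown on the slide.

```c
#include <stdint.h>

/* Illustrative flag bits; the real values come from the kernel headers. */
#define SK_IP_FORWARDING 0x1
#define SK_IP_RAWOUTPUT  0x2

/* Reduced IP header: just the fields this step touches. */
struct sk_ip {
    uint8_t  ip_v;   /* version */
    uint8_t  ip_hl;  /* header length in 32-bit words */
    uint16_t ip_off; /* fragment offset and flags */
    uint16_t ip_id;  /* identification */
};

static uint16_t sk_ip_id_counter; /* source of datagram identifiers */

/* Returns hlen, the header length in bytes used later for fragmentation. */
int sk_ip_init_header(struct sk_ip *ip, int flags)
{
    int hlen;

    if ((flags & (SK_IP_FORWARDING | SK_IP_RAWOUTPUT)) == 0) {
        /* Locally generated packet: fill in the fields owned by IP. */
        ip->ip_v = 4;
        ip->ip_off = 0;
        ip->ip_id = sk_ip_id_counter++;
        ip->ip_hl = 5; /* assumption: no IP options, 5 words = 20 bytes */
        hlen = 20;
    } else {
        /* Forwarded or raw packet: leave the header alone, only read hlen. */
        hlen = ip->ip_hl << 2;
    }
    return hlen;
}
```

The else branch is the point made in the notes: a forwarded packet keeps its identifier and offset, and ip_output only derives hlen from it.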
2. Route Selection

Flow:
Verify the cached route for the destination address.
If (cached_route == destination)
    yes: Find the interface on which the packet has to be placed. ifp points to the interface's ifnet structure.
    no: Locate route: call rtalloc(dst_ip) to locate a route to the destination. Find the interface on which the packet has to be placed. ifp points to the interface's ifnet structure. If rtalloc(dst_ip) fails to find a route, return a host unreachable error.

Notes:
A cached route may be provided to ip_output as an argument. UDP and TCP maintain a route cache associated with each socket.

Check if the cached route is for the correct destination. If a route has not been provided, ip_output sets up a temporary route structure called iproute.

If the cached route is provided, find the interface on which the frame has to be sent.

If the packet is being routed, rtalloc locates a route to the address specified by dst. If rtalloc fails, an EHOSTUNREACH error is generated. If ip_forward called ip_output, the error is converted to an ICMP error.
If the address is found, then ifp is made to point to the ifnet structure for the interface. If the next hop is not the packet's final destination, then dst is changed to point to the next hop router.
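
A minimal sketch of the cached-route check and the fallback lookup. All types and names (sk_ifnet, sk_route, sk_rtentry, sk_select_route) are hypothetical, and the lookup is a linear exact-match scan over a small array, which is a deliberate simplification of what rtalloc does.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct sk_ifnet { const char *if_name; };

/* Cached route kept with the socket. */
struct sk_route {
    uint32_t         dst;     /* destination this route was resolved for */
    uint32_t         gateway; /* next hop router, 0 if directly reachable */
    struct sk_ifnet *ifp;     /* outgoing interface */
    int              valid;
};

/* One entry of a hypothetical routing table. */
struct sk_rtentry {
    uint32_t         dst;
    uint32_t         gateway;
    struct sk_ifnet *ifp;
};

/* Pick the outgoing interface and next hop for dst_ip; 0 on success. */
int sk_select_route(struct sk_route *ro, uint32_t dst_ip,
                    const struct sk_rtentry *table, size_t n,
                    struct sk_ifnet **ifp, uint32_t *next_hop)
{
    if (!(ro->valid && ro->dst == dst_ip)) {
        /* Cache miss: look the destination up and refill the cache. */
        ro->valid = 0;
        for (size_t i = 0; i < n; i++) {
            if (table[i].dst == dst_ip) {
                ro->dst = dst_ip;
                ro->gateway = table[i].gateway;
                ro->ifp = table[i].ifp;
                ro->valid = 1;
                break;
            }
        }
        if (!ro->valid)
            return EHOSTUNREACH; /* no route to the destination */
    }

    *ifp = ro->ifp;
    /* If a gateway is set, the frame goes to the router, not the final host. */
    *next_hop = ro->gateway != 0 ? ro->gateway : dst_ip;
    return 0;
}
```

A second call for the same destination is answered from the cache without touching the table, which is the benefit of keeping a route with each socket.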
3. Source address selection and Fragmentation

Flow:
Check if a valid source address is specified.
    no: Select the IP address of the outgoing interface as the source address.
    yes: continue
Does the packet have to be fragmented?
    yes: Fragment the packet if its size is greater than the MTU.
    no: continue
If there are no checksum errors, send the data to the if_output function of the selected interface.

Notes:
The final section of ip_output ensures that the IP header has a valid source IP address. This could not have been done earlier because the route had not been selected yet. If there is no source IP, then the IP address of the outgoing interface is used as the source IP.

Larger packets (packets that exceed the MTU) must be fragmented before they can be sent.

In either case (fragmented or not) the checksum is computed (in_cksum). If no errors are found, the data is sent to the if_output function of the output interface.
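
Both steps reduce to short calculations. The sketch below is an illustration over a plain byte array, not the kernel's in_cksum, which operates on an mbuf chain; sk_in_cksum and sk_fragment_count are hypothetical names. The checksum is the standard Internet checksum (one's complement of the one's complement sum of 16-bit words), and the fragment count assumes every fragment carries the same header length and a payload that is a multiple of 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over len bytes, treating the data as big-endian words. */
uint16_t sk_in_cksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8; /* pad the odd byte with zero */

    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16); /* fold the carries back in */

    return (uint16_t)~sum;
}

/* Number of fragments needed for a datagram of total_len bytes. */
unsigned sk_fragment_count(unsigned total_len, unsigned hlen, unsigned mtu)
{
    if (total_len <= mtu)
        return 1; /* fits in one frame, no fragmentation */

    unsigned per_frag = (mtu - hlen) & ~7u; /* payload per fragment */
    unsigned payload = total_len - hlen;
    return (payload + per_frag - 1) / per_frag;
}
```

For a 4020-byte datagram with a 20-byte header and an MTU of 1500, each fragment carries 1480 bytes of payload, so the 4000 payload bytes need three fragments. Running sk_in_cksum over a header that already contains its correct checksum yields 0, which is how a receiver verifies it.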
Interface Layer (if_ethersubr.c)
ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *routing_entry)
1. Verification
2. Protocol-Specific Processing
3. Frame Construction
4. Interface Queuing.

Interface Layer (if_ethersubr.c) - ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *rt_entry)

Function: Takes the data portion of an Ethernet frame, encapsulates it with a 14-byte header, and places it on the interface send queue.
Phases: Verification, Protocol-Specific Processing, Frame Construction, Interface Queuing.

Arguments -
    ifp points to the outgoing interface's ifnet structure
    mbuf is the data to be sent
    destination is the destination address
    rt_entry points to the routing entry

Initialize -
    Ethernet header - struct eth_header *eh

Verification

1. Ethernet port up and running?
    ifp->if_flags & (IFF_UP | IFF_RUNNING)
    no: senderr (ENETDOWN)
    yes: continue
2. Route valid?
    rt_entry = rtalloc1 (destination, 1)
    0: senderr (EHOSTUNREACH)
3. Next hop a gateway?
    rt = rt->rt_gwroute (also checked for 0)
4. Destination responding to ARP requests?
    rt->rt_flags & RTF_REJECT
    no: Do not send more packets, to avoid flooding.
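
The verification checks can be sketched as one function. The flag values and the function name sk_ether_verify are illustrative assumptions, the route lookup is reduced to a have_route argument, and returning EHOSTUNREACH for a rejected route is an assumption, since the slide does not name the error for that branch. The detail worth noting is the comparison against the full mask: a bare bitwise AND would already be non-zero with only one of the two flags set.

```c
#include <errno.h>

/* Illustrative flag bits; the real values come from the kernel headers. */
#define SK_IFF_UP      0x01
#define SK_IFF_RUNNING 0x40
#define SK_RTF_REJECT  0x08

/* Returns 0 if the frame may be sent, otherwise an errno value. */
int sk_ether_verify(int if_flags, int have_route, int rt_flags)
{
    const int need = SK_IFF_UP | SK_IFF_RUNNING;

    /* Both bits must be set, so compare against the full mask. */
    if ((if_flags & need) != need)
        return ENETDOWN;

    /* No usable route to the destination. */
    if (!have_route)
        return EHOSTUNREACH;

    /* Destination is not answering ARP: stop sending to avoid flooding. */
    if (rt_flags & SK_RTF_REJECT)
        return EHOSTUNREACH; /* assumption: error code not given on the slide */

    return 0;
}
```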


Protocol Specific Processing
Functionality: Finds the Ethernet address corresponding to the IP address of the destination.

Switch on destination->sa_family. For AF_INET:
1. Send an ARP broadcast to find the Ethernet address corresponding to the destination IP address.
2. Use m_copy ( ) to keep the packet until an ack is received.

Frame Preparation

Make sure there is room for the 14-byte Ethernet header:
M_PREPEND (m, sizeof(ethernet_header), M_DONTWAIT)

Form the Ethernet header from:
    the Ethernet frame type,
    the Ethernet MAC address,
    the unicast Ethernet address associated with the output interface.
    e.g. the default gateway for a host
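
The 14-byte header consists of a 6-byte destination address, a 6-byte source address and a 2-byte frame type in network byte order. The sketch below builds it into a flat buffer; sk_build_frame is a hypothetical helper, and copying the payload behind the header is a simplification, since M_PREPEND makes room in front of the existing data instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SK_ETHER_HDR_LEN 14
#define SK_ETHERTYPE_IP  0x0800 /* frame type for IPv4 */

/* Write header + payload into frame; returns the frame length, 0 if too small. */
size_t sk_build_frame(uint8_t *frame, size_t cap,
                      const uint8_t dst[6], const uint8_t src[6],
                      uint16_t type, const uint8_t *payload, size_t plen)
{
    if (cap < SK_ETHER_HDR_LEN + plen)
        return 0; /* no room for the header and the data */

    memcpy(frame, dst, 6);                /* destination MAC address */
    memcpy(frame + 6, src, 6);            /* address of the output interface */
    frame[12] = (uint8_t)(type >> 8);     /* frame type, high byte first */
    frame[13] = (uint8_t)(type & 0xFF);
    memcpy(frame + SK_ETHER_HDR_LEN, payload, plen);

    return SK_ETHER_HDR_LEN + plen;
}
```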

Interface Queuing

Is the output queue full?
    yes: Discard the frame, free the memory buffer, senderr (ENOBUFS).
    no: Place the frame on the interface's send queue if_snd, then call lestart (ifp).
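
A bounded queue with a drop counter captures the queuing step. This is a sketch only: sk_ifqueue, its capacity of 4 and the two functions are hypothetical, and frames are represented by opaque pointers. The dequeue function stands for what the driver start routine does when it takes the next frame off the queue.

```c
#include <errno.h>
#include <stddef.h>

#define SK_IFQ_MAXLEN 4 /* hypothetical queue capacity */

/* Fixed-size ring buffer of frame pointers. */
struct sk_ifqueue {
    const void *frames[SK_IFQ_MAXLEN];
    int head;  /* index of the oldest queued frame */
    int len;   /* number of frames currently queued */
    int drops; /* frames discarded because the queue was full */
};

/* Queue a frame for transmission, or count a drop and report ENOBUFS. */
int sk_if_enqueue(struct sk_ifqueue *q, const void *frame)
{
    if (q->len == SK_IFQ_MAXLEN) {
        q->drops++;
        return ENOBUFS;
    }
    q->frames[(q->head + q->len) % SK_IFQ_MAXLEN] = frame;
    q->len++;
    return 0;
}

/* Remove and return the oldest frame, or NULL if the queue is empty. */
const void *sk_if_dequeue(struct sk_ifqueue *q)
{
    if (q->len == 0)
        return NULL;
    const void *frame = q->frames[q->head];
    q->head = (q->head + 1) % SK_IFQ_MAXLEN;
    q->len--;
    return frame;
}
```

Frames leave the queue in the order they were queued, and a full queue rejects new frames rather than overwriting old ones.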


Interface Layer (if_le.c) - lestart(struct ifnet *ifp)

Function: Dequeues frames from the interface output queue and arranges for them to be transmitted by the Ethernet card.

struct le_softc *le = &le_softc[ifp->if_unit]

1. le->sc_if.if_flags & IFF_RUNNING
    0: return error
2. Copy the frame in the mbuf to the hardware buffer.
3. Set the IFF_OACTIVE flag to indicate that the device is busy transmitting.
