Linux TCP/IP Stack.


1 Linux TCP/IP Stack

2 TCP / IP vs. OSI model
OSI model                                     Kernel organization
7: Application, 6: Presentation, 5: Session   Process
                                              Socket layer
4: Transport, 3: Network                      Protocol Layer (TCP / IP)
2: Data Link                                  Interface Layer (Ethernet, etc.)
1: Physical Layer

3 TCP/IP Stack Overview
Output path (Process down to Physical Media):
  1: sosend ( … )            Socket Layer
  2: tcp_output ( … )        Protocol Layer (TCP Layer)
  3: ip_output ( … )         Protocol Layer (IP Layer)
  4: ethernet_output ( … )   Interface Layer (Ethernet Device Driver), onto the Output Queue
Input path (Physical Media up to Process):
  2: ethernet_input ( … )    Interface Layer (Ethernet Device Driver), from the Input Queue
  3: ip_input ( … )          Protocol Layer (IP Layer)
  4: tcp_input ( … )         Protocol Layer (TCP Layer)
  5: recvfrom ( … )          Socket Layer

4 Process Layer to TCP Layer
Process:
  send (int socket, const char *buf, int length, int flags)
Kernel:
  sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)
  sendit (struct proc *p, int socket, struct msghdr *mp, int flags, int *return_size)   [uipc_syscalls.c]
  sosend (struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *top, struct mbuf *control, int flags)   [uipc_socket.c]
TCP Layer:
  tcp_usrreq (struct socket *s, int request, struct mbuf *m, struct mbuf *nam, struct mbuf *control)   [tcp_usrreq.c]
  tcp_output (struct tcpcb *tp)   [tcp_output.c]

5 Socket Layer
sendto (int socket, const char *data_buffer, int length, int flags, struct sockaddr *destination, int destination_length)
MBUF Chain: the 150 bytes of data in data_buffer are held in a chain of two 128-byte mbufs.
First mbuf (28-byte header, 100 bytes of data): m_next points to the second mbuf, m_nextpkt = NULL, m_len = 100, m_data, m_type = MT_DATA, m_flags = M_PKTHDR, m_pkthdr.len = 150, m_pkthdr.recvif = NULL
Second mbuf (20-byte header, 50 bytes of data, 58 bytes of unused space): m_next = NULL, m_nextpkt = NULL, m_len = 50, m_data, m_type = MT_DATA, m_flags = 0
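The arithmetic on this slide (28 + 100 and 20 + 50 + 58 both equal 128) can be checked with a small model. The following is only a sketch: toy_mbuf, toy_fill_chain and toy_chain_len are hypothetical names, the structure is heavily simplified compared with a real BSD mbuf, and the flag value is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes taken from the slide; names mirror BSD but this is a toy model. */
#define TOY_MSIZE    128                      /* whole mbuf */
#define TOY_MHDR     20                       /* plain mbuf header */
#define TOY_PKTHDR   8                        /* extra packet header (28 - 20) */
#define TOY_MLEN     (TOY_MSIZE - TOY_MHDR)   /* 108 data bytes */
#define TOY_MHLEN    (TOY_MLEN - TOY_PKTHDR)  /* 100 data bytes */
#define TOY_M_PKTHDR 0x0002                   /* assumed flag value */

struct toy_mbuf {
    struct toy_mbuf *m_next;     /* next mbuf in this chain */
    struct toy_mbuf *m_nextpkt;  /* next packet in a queue (unused here) */
    int  m_len;                  /* bytes of data in this mbuf */
    int  m_flags;                /* TOY_M_PKTHDR on the first mbuf only */
    int  pkthdr_len;             /* stands in for m_pkthdr.len */
    char m_dat[TOY_MLEN];        /* data area */
};

/* Copy len bytes into the caller-supplied mbufs, linking them into a chain.
 * Returns the number of mbufs used, or -1 if nbufs is too small. */
int toy_fill_chain(struct toy_mbuf *bufs, int nbufs, const char *data, int len)
{
    int used = 0, copied = 0;

    assert(bufs != NULL && data != NULL && len >= 0);
    while (copied < len) {
        if (used == nbufs)
            return -1;
        struct toy_mbuf *m = &bufs[used];
        int cap = (used == 0) ? TOY_MHLEN : TOY_MLEN;
        int n = (len - copied < cap) ? len - copied : cap;

        memset(m, 0, sizeof *m);
        m->m_flags = (used == 0) ? TOY_M_PKTHDR : 0;
        m->m_len = n;
        memcpy(m->m_dat, data + copied, (size_t)n);
        if (used > 0)
            bufs[used - 1].m_next = m;   /* link previous mbuf to this one */
        copied += n;
        used++;
    }
    if (used > 0)
        bufs[0].pkthdr_len = len;        /* total length lives in the header */
    return used;
}

/* Sum m_len over the chain; should equal the packet header length. */
int toy_chain_len(const struct toy_mbuf *m)
{
    int total = 0;
    for (; m != NULL; m = m->m_next)
        total += m->m_len;
    return total;
}
```

Filling the chain with 150 bytes uses two mbufs holding 100 and 50 bytes, which leaves 58 bytes unused in the second one, as on the slide.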

6 sbspace(&s->so_snd)
Socket Layer - sosend passes data and control information to the protocol layer
sosend(struct socket *s, struct mbuf *addr, struct uio *uio, struct mbuf *data_buffer, struct mbuf *control, int flags)
1. Initialize a new memory buffer and variables to hold flags.
2. Is there enough space in the send buffer? Check sbspace(&s->so_snd).
3. If yes, copy data_buffer into the mbuf.
4. int error = tcp_usrreq(s, flags, mbuf, addr, control)
5. On error, free the memory buffers received.
6. More buffers to send? If yes, repeat from step 2.
7. If no, return the value of error to sendto( ).

7 TCP Layer - tcp_usrreq
tcp_usrreq(struct socket *s, int request, struct mbuf *data_buffer, struct mbuf *nam, struct mbuf *control)
1. Initialize the Internet protocol control block inp and the TCP control block tp, which store the information TCP needs.
2. Convert the socket to an Internet protocol control block: inp = sotoinpcb(so)
3. Convert the Internet protocol control block to a TCP control block: tp = intotcpcb(inp)
4. If request is PRU_SEND: int error = tcp_output(tp)
5. tcp_output( ) returns error to tcp_usrreq( ).

8 TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)
Called by tcp_usrreq for one of the following reasons:
  To send the initial SYN
  To send a finished_sending message (FIN)
  To send data
  To send a window update after data has been received
tcp_output( ) functionality:
1. Determine whether TCP can send a segment, depending on:
  flags in the data sent by the socket layer (to send an ACK, etc.)
  size of the window advertised by the receiver
  amount of data ready to send
  whether unacknowledged data already exists for the connection
2. Calculate the amount of data to be sent, depending on:
  size of the receiver’s window
  number of bytes in the send buffer
3. Check for window shrink.
4. Send a segment:
  Allocate a buffer for the TCP and IP headers from the header template.
  Copy the TCP and IP header template into the buffer to be sent.
  Fill in the fields of the TCP header.
  Decrement the number of buffers to be sent, so that the end can be checked.
  Set the sequence number and acknowledgement fields.
  Set three fields in the IP header: IP length, TTL and TOS.
  Pass the datagram to IP.

9 TCP Layer (tcp_output.c) - tcp_output(struct tcpcb *tp)
struct socket *so = tp -> t_inpcb -> inp_socket
Initialize a TCP header, tcp_header.
idle is true if the maximum sequence number sent equals the oldest unacknowledged sequence number, that is, if no ACK is expected from the other end.
int idle = (tp -> snd_max == tp -> snd_una)
If idle is true: no acknowledgement is expected, so set the congestion window to one segment: tp -> snd_cwnd = tp -> t_maxseg;
If idle is false: check the ACK flag.

10 TCP Layer - tcp_output(struct tcpcb *tp)
If no acknowledgement is expected, set the congestion window to one segment: tp -> snd_cwnd = tp -> t_maxseg;
off is the offset in bytes from the beginning of the send buffer of the first data byte to send. The first off bytes have already been sent and are awaiting acknowledgement.
int off = tp -> snd_nxt - tp -> snd_una
Determine the length of data that should be transmitted and the flags to be used. len is the smaller of the number of bytes in the send buffer and win (the minimum of the receiver’s window and the congestion window), minus off.
len = min(so -> so_snd.sb_cc, win) - off
Determine flags such as TH_ACK, TH_FIN, TH_RST, TH_SYN:
flags = tcp_outflags [ tp -> t_state ]

11 TCP Layer - tcp_output(struct tcpcb *tp)
Determine flags such as TH_ACK, TH_FIN, TH_RST, TH_SYN:
flags = tcp_outflags [ tp -> t_state ]
tp -> t_flags & TF_ACKNOW true: send an acknowledgement. If false, continue.
flags & (TH_SYN | TH_RST) true: send the sequence number (SYN) or a reset (RST). If false, continue.
flags & TH_FIN true: finished sending, send a FIN. If false, continue.
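The flag lookup and the three tests can be sketched as a table indexed by connection state. The table below is modelled on the 4.4BSD tcp_outflags array as far as I recall it, so treat the per-state values, the TF_ACKNOW value and the toy_ names as assumptions. The real code applies further conditions (for example on the FIN test) that are left out here.

```c
#include <assert.h>

/* TCP header flag bits (standard values). */
#define TH_FIN 0x01
#define TH_SYN 0x02
#define TH_RST 0x04
#define TH_ACK 0x10

#define TOY_TF_ACKNOW 0x0001   /* assumed value: "ACK peer immediately" */

enum toy_tcp_state {
    TOY_CLOSED, TOY_LISTEN, TOY_SYN_SENT, TOY_SYN_RECEIVED,
    TOY_ESTABLISHED, TOY_CLOSE_WAIT, TOY_FIN_WAIT_1, TOY_CLOSING,
    TOY_LAST_ACK, TOY_FIN_WAIT_2, TOY_TIME_WAIT, TOY_NSTATES
};

/* Flags to put in an outgoing segment, indexed by connection state. */
static const unsigned char toy_outflags[TOY_NSTATES] = {
    TH_RST | TH_ACK,  /* CLOSED */
    0,                /* LISTEN */
    TH_SYN,           /* SYN_SENT */
    TH_SYN | TH_ACK,  /* SYN_RECEIVED */
    TH_ACK,           /* ESTABLISHED */
    TH_ACK,           /* CLOSE_WAIT */
    TH_FIN | TH_ACK,  /* FIN_WAIT_1 */
    TH_FIN | TH_ACK,  /* CLOSING */
    TH_FIN | TH_ACK,  /* LAST_ACK */
    TH_ACK,           /* FIN_WAIT_2 */
    TH_ACK            /* TIME_WAIT */
};

/* flags = tcp_outflags[tp->t_state] */
unsigned char toy_flags_for(enum toy_tcp_state state)
{
    assert((int)state >= 0 && state < TOY_NSTATES);
    return toy_outflags[state];
}

/* Returns 1 if a segment must be sent even when there is no data. */
int toy_must_send(enum toy_tcp_state state, int t_flags)
{
    unsigned char flags = toy_flags_for(state);

    if (t_flags & TOY_TF_ACKNOW)
        return 1;                     /* an ACK is owed right away */
    if (flags & (TH_SYN | TH_RST))
        return 1;                     /* SYN or RST must go out */
    if (flags & TH_FIN)
        return 1;                     /* finished sending */
    return 0;
}
```

In this model an established connection with nothing pending does not force a segment, while SYN_SENT and FIN_WAIT_1 always do.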

12 Length of data < 44 Bytes
Check flags to determine the type of message: window probe, retransmission, or normal data transmission.
Allocate an mbuf for the TCP and IP headers and, if possible, the data: MGETHDR ( m, M_DONTWAIT, MT_HEADER )
M_DONTWAIT indicates that if memory is not available for the mbuf, the routine returns immediately with an error.
Length of data < 44 bytes?
  yes: Copy the data from the socket send buffer into the new packet header mbuf.
  no: Create a new mbuf chain, copy the surplus data into it, and link it to the first mbuf.
ip_output(m, tp -> t_inpcb -> inp_options, &tp -> t_inpcb -> inp_route, so -> so_options & SO_DONTROUTE, 0)

13 ip_output.c
ip_output(struct mbuf *m, struct mbuf *opt, struct route *ro, int flags, struct ip_moptions *imo)
Phases: 1. Header initialization 2. Route Selection 3. Source address selection and Fragmentation
1. Header initialization
Packets damaged? Check whether there were any errors while adding headers in higher layers. Most of the fields of the IP header are predefined by higher-layer protocols.
  yes: ERROR
  no: test if (flags & (IP_FORWARDING | IP_RAWOUTPUT))
The value of “flags” decides what is to be done with the data:
  IP_FORWARDING: Forward packet
  IP_ROUTETOIF: Route directly to interface
  IP_ALLOWBROADCAST: Allow broadcasting of packet
  IP_RAWOUTPUT: Packet contains pre-constructed header
If the test is true: the packet has to be forwarded to another host (the machine is acting as a router) or already carries a header, so the IP header must not be modified by ip_output.
If the test is false: the packet is not being forwarded and has to be sent to another host, so construct and initialize the IP header: set ip_v = 4, clear ip_off, and assign a unique identifier to ip_id. Length, offset, TTL, protocol, TOS etc. are set by higher layers.
Save the header length in hlen for the fragmentation algorithm.

14 Verify Cached Route for destination address
2. Route Selection
A cached route may be provided to ip_output as an argument. UDP and TCP maintain a route cache associated with each socket. If a route has not been provided, ip_output uses a temporary route structure called iproute.
Verify the cached route for the destination address: check whether the cached route is for the correct destination (cached_route == destination).
  yes: Find the interface on which the packet has to be placed. ifp points to the interface’s ifnet structure.
  no: Locate a route: call rtalloc(dst_ip) to locate a route to the destination, then find the interface on which the packet has to be placed. If rtalloc(dst_ip) fails to find a route, return a host unreachable error.
Notes:
  If a cached route is provided, find the interface on which the frame has to be sent.
  If the packet is being routed, rtalloc locates a route to the address specified by dst. If rtalloc fails, an EHOSTUNREACH error is generated. If ip_forward called ip_output, the error is converted to an ICMP error.
  If a route is found, ifp is made to point to the ifnet structure for the interface. If the next hop is not the packet’s final destination, dst is changed to point to the next-hop router.

15 3. Source address selection and Fragmentation
The final section of ip_output ensures that the IP header has a valid source IP address. This could not be done earlier because the route had not been selected yet.
Is a valid source address specified?
  no: Select the IP address of the outgoing interface as the source address.
  yes: Continue.
Does the packet have to be fragmented?
  yes: Fragment the packet. Packets whose size exceeds the MTU must be fragmented before they can be sent.
  no: Continue.
In either case (fragmented or not) the checksum is computed (in_cksum). If no errors are found, the data is passed to the if_output function of the selected output interface.
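Both the checksum and the fragmentation decision are simple enough to sketch. The code below is an illustration under stated assumptions, not the kernel's in_cksum: toy_in_cksum is a plain byte-wise Internet checksum (one's-complement sum of 16-bit words), and toy_fragment_count assumes every fragment except the last carries the largest multiple of 8 data bytes that fits in the MTU. Both names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Internet checksum over len bytes, returned in host order.
 * Run over a header whose checksum field is zero to get the value to
 * store; run over a header with a correct checksum to get 0. */
uint16_t toy_in_cksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    assert(buf != NULL);
    while (len > 1) {
        sum += ((uint32_t)buf[0] << 8) | buf[1];  /* big-endian 16-bit word */
        buf += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)buf[0] << 8;             /* pad odd byte with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);       /* fold carries back in */
    return (uint16_t)~sum;
}

/* Number of fragments needed to send a datagram of ip_len bytes
 * (header included) with an hlen-byte header over a link with this MTU. */
unsigned toy_fragment_count(unsigned ip_len, unsigned hlen, unsigned mtu)
{
    assert(hlen >= 20 && ip_len >= hlen && mtu >= hlen + 8);
    if (ip_len <= mtu)
        return 1;                                 /* fits, no fragmentation */

    unsigned per_frag = (mtu - hlen) & ~7u;       /* data bytes per fragment */
    unsigned payload = ip_len - hlen;
    return (payload + per_frag - 1) / per_frag;   /* round up */
}
```

For a 20-byte header and a 1500-byte MTU each fragment carries 1480 data bytes, so a datagram with 4000 bytes of payload needs three fragments.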

16 Interface Layer (if_ethersubr.c)
ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *routing_entry)
Phases: 1. Verification 2. Protocol-Specific Processing 3. Frame Construction 4. Interface Queuing
1. Verification
Ethernet port up and running? ifp -> if_flags & (IFF_UP | IFF_RUNNING)
  no: senderr (ENETDOWN)
  yes: continue

17 Interface Layer (if_ethersubr.c) - ether_output
ether_output(struct ifnet *ifp, struct mbuf *mbuf, struct sockaddr *destination, struct rtentry *rt_entry)
Function: Takes the data portion of an Ethernet frame, encapsulates it with a 14-byte header, and places it on the interface send queue.
Phases: Verification, Protocol-Specific Processing, Frame Construction, Interface Queuing.
Arguments:
  ifp points to the outgoing interface’s ifnet structure
  mbuf is the data to be sent
  destination is the destination address
  rt_entry points to the routing entry
Initialize the Ethernet header: struct eth_header *eh
Verification: Ethernet port up and running? ifp -> if_flags & (IFF_UP | IFF_RUNNING)
  no: senderr (ENETDOWN)
  yes: continue

18 rt_entry = rtalloc1 (destination, 1)
Verification (continued):
Route valid? If not, look one up with rt_entry = rtalloc1 (destination, 1). If no route is found: senderr (EHOSTUNREACH)
Next hop a gateway? If so, use the gateway route: rt = rt -> rt_gwroute
Destination responding to ARP requests? Test rt -> rt_flags & RTF_REJECT. If the destination is not responding, do not send more packets, to avoid flooding.
  no: continue to Protocol Specific Processing

19 destination -> sa_family
Protocol Specific Processing
Functionality: Finds the Ethernet address corresponding to the IP address of the destination.
Switch on destination -> sa_family:
  AF_INET: Send an ARP broadcast to find the Ethernet address corresponding to the destination IP address. Use m_copy( ) to keep the packet until an acknowledgement is received.
Next phase: Frame Preparation

20 Protocol Specific Processing
Frame Preparation
Make sure there is room for the 14-byte Ethernet header: M_PREPEND ( m, sizeof(ethernet_header), M_DONTWAIT )
Form the Ethernet header from the Ethernet frame type, the destination Ethernet (MAC) address (e.g. that of the default gateway for a host), and the unicast Ethernet address associated with the output interface.
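The 14-byte header is made of a 6-byte destination address, a 6-byte source address and a 2-byte frame type in network byte order. A minimal sketch of the encapsulation step follows; toy_ether_encap is a hypothetical helper that writes into a flat buffer rather than prepending to an mbuf chain as M_PREPEND would.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_ADDR_LEN 6
#define TOY_ETHER_HDR_LEN  14       /* 6 + 6 + 2 */
#define TOY_ETHERTYPE_IP   0x0800   /* frame type for IPv4 */

/* Build header + payload in frame. Returns the frame length, or -1 if
 * the buffer has no room for the header plus payload. */
int toy_ether_encap(uint8_t *frame, size_t cap,
                    const uint8_t *dst, const uint8_t *src, uint16_t type,
                    const uint8_t *payload, size_t payload_len)
{
    assert(frame != NULL && dst != NULL && src != NULL && payload != NULL);
    if (cap < TOY_ETHER_HDR_LEN + payload_len)
        return -1;

    memcpy(frame, dst, TOY_ETHER_ADDR_LEN);                       /* bytes 0-5  */
    memcpy(frame + TOY_ETHER_ADDR_LEN, src, TOY_ETHER_ADDR_LEN);  /* bytes 6-11 */
    frame[12] = (uint8_t)(type >> 8);                             /* type, big-endian */
    frame[13] = (uint8_t)(type & 0xff);
    memcpy(frame + TOY_ETHER_HDR_LEN, payload, payload_len);
    return (int)(TOY_ETHER_HDR_LEN + payload_len);
}
```

Writing the type field byte by byte avoids depending on the host's byte order.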

21 Is the output queue full
Interface Queuing (follows Frame Preparation)
Is the output queue (if_snd) full?
  yes: Discard the frame, free the memory buffer, and senderr ( ENOBUFS ).
  no: Place the frame on the interface’s send queue (if_snd), then call lestart ( ifp ).
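The queue-full test and the enqueue can be modelled with a small bounded FIFO. This is a sketch only: toy_ifqueue, its fixed capacity of 4 and the helper names are assumptions for illustration, whereas the real interface queue is a linked list of mbufs with a configurable maximum length.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_IFQ_MAXLEN 4   /* hypothetical capacity, kept small for testing */

struct toy_ifqueue {
    const void *slots[TOY_IFQ_MAXLEN];  /* queued frames (ring buffer) */
    int head;                           /* index of the oldest frame */
    int len;                            /* frames currently queued */
    int drops;                          /* frames discarded because full */
};

/* Place a frame on the send queue. Returns 0, or ENOBUFS if the queue is
 * full, in which case the frame is counted as dropped. */
int toy_if_enqueue(struct toy_ifqueue *q, const void *frame)
{
    assert(q != NULL);
    if (q->len == TOY_IFQ_MAXLEN) {
        q->drops++;
        return ENOBUFS;
    }
    q->slots[(q->head + q->len) % TOY_IFQ_MAXLEN] = frame;
    q->len++;
    return 0;
}

/* Take the oldest frame off the queue, as the driver's start routine
 * would. Returns NULL when the queue is empty. */
const void *toy_if_dequeue(struct toy_ifqueue *q)
{
    assert(q != NULL);
    if (q->len == 0)
        return NULL;
    const void *frame = q->slots[q->head];
    q->head = (q->head + 1) % TOY_IFQ_MAXLEN;
    q->len--;
    return frame;
}
```

The fifth enqueue on a full queue fails and bumps the drop counter; the driver side then drains frames in the order they were queued.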

22 Interface Layer(if_le.c) - lestart(struct ifnet *ifp)
Function: Dequeues frames from the interface output queue and arranges for them to be transmitted by the Ethernet card.
struct le_softc *le = & le_softc [ ifp -> if_unit ]
If le -> sc_if.if_flags & IFF_RUNNING is not set, return an error.
Otherwise, copy the frame in the mbuf to the hardware buffer and set IFF_OACTIVE to indicate that the device is busy transmitting.

23 ip_input.c void ipintr( )
Phases: 1. Verification of incoming packets 2. Option processing and forwarding 3. Packet reassembly 4. Demultiplexing
Storing IP packets: IP packets are stored in a chain of mbuf structs in a linked list. The header must be stored in one mbuf.
Dequeue packets: get the IP header from ipintrq in the first mbuf.
Verification: packets damaged? If yes, discard.
  1. If no IP addresses are set yet but the interfaces are receiving, nothing can be done with incoming packets yet. This occurs during system initialization when interfaces have not been configured.
  2. If the length of the packet in the mbuf < length of struct ip, increment ipstat.ips_toosmall.
  3. Check the IP version.
  4. Check the header length.
  5. ip_sum = in_cksum() (the result should be 0). in_cksum is used by all protocols, although on different parts of the packet.
  6. Convert from network byte order to host byte order.
  7. ip_len > m_pkthdr.len indicates that some bytes are missing.
  8. Trim buffers if longer than expected.
  9. Drop if shorter than expected.
Option processing: if the packet is not damaged, call ip_dooptions(). If ip_dooptions() == 1, send an ICMP error message and go to the next buffer.
Forwarding:
  1. Is ip_dst a local address? Look for ip_dst in in_ifaddr (the list of configured addresses).
  ip_dst found?
    yes: host in the same subnet
    no: if ip_forwarding == 0, discard the packet and free the memory; otherwise call ip_forward ( ).
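The verification checks can be sketched against a raw byte buffer. The following is a simplified illustration, not the ipintr code: toy_ip_verify and the verdict names are hypothetical, the packet is a flat array instead of an mbuf chain, and the checksum helper is a plain one's-complement sum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_verdict {
    TOY_OK, TOY_TOOSMALL, TOY_BADVERS, TOY_BADHLEN, TOY_BADSUM, TOY_TOOSHORT
};

/* One's-complement sum of 16-bit words; 0 means the header is intact. */
static uint16_t toy_hdr_cksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Apply the checks in the order listed on the slide. On success,
 * *trimmed_len is the datagram length with any trailing bytes cut off. */
enum toy_verdict toy_ip_verify(const uint8_t *pkt, size_t pkt_len,
                               size_t *trimmed_len)
{
    assert(pkt != NULL && trimmed_len != NULL);

    if (pkt_len < 20)
        return TOY_TOOSMALL;             /* shorter than a basic IP header */
    if ((pkt[0] >> 4) != 4)
        return TOY_BADVERS;              /* not IPv4 */

    size_t hlen = (size_t)(pkt[0] & 0x0f) * 4;
    if (hlen < 20 || hlen > pkt_len)
        return TOY_BADHLEN;              /* header length out of range */
    if (toy_hdr_cksum(pkt, hlen) != 0)
        return TOY_BADSUM;               /* header checksum failed */

    /* Total length is stored in network (big-endian) byte order. */
    size_t ip_len = ((size_t)pkt[2] << 8) | pkt[3];
    if (ip_len < hlen)
        return TOY_BADHLEN;              /* length smaller than the header */
    if (ip_len > pkt_len)
        return TOY_TOOSHORT;             /* some bytes are missing: drop */

    *trimmed_len = ip_len;               /* longer than expected: trim */
    return TOY_OK;
}
```

A 120-byte buffer holding a datagram whose header says 115 bytes passes and is trimmed to 115; the same header in a 100-byte buffer is reported as too short.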

24 ip_forward (struct mbuf *m, int srcrt )
Phase I: Is the packet eligible for forwarding?
  Multicast packet? If yes, call ip_mforward ( ).
  Is the packet a link-level broadcast packet, a loopback packet, or addressed to network 0 or a class E address? If yes, discard.
  TTL == 1? If yes, send an ICMP error message and discard.
Locate next hop:
  m points to the packet to be forwarded. srcrt is nonzero if the packet is being forwarded because of a source route option.
  struct route {
      struct rtentry *ro_rt;    // pointer to struct with route information
      struct sockaddr ro_dst;   // destination associated with the route entry pointed to by ro_rt
  };
  Cache the most recent route, since consecutive packets usually have the same destination.
  Decrement the TTL.
  Save at most 64 bytes of the packet in case an ICMP message has to be sent.

25 Transport Demultiplexing
ip == null?
  yes: Unable to reassemble a complete datagram. Go to the next packet.
  no: ip points to a full datagram.
Map ip_p to a protocol number in the ip_protox array.
ip_protox [ ] entries: 1 UDP, 2 TCP, 3 IP (raw), 4 ICMP, 5 IGMP
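The ip_protox lookup is a 256-entry table that maps the ip_p field of the header to an index into the protocol switch. Here is a minimal sketch using the index numbering from the slide. The toy_ names are hypothetical, and defaulting unknown protocols to the raw IP entry is my recollection of the BSD behaviour, so treat it as an assumption.

```c
#include <assert.h>
#include <string.h>

/* Protocol switch indices, numbered as on the slide. */
enum {
    TOY_SW_IP, TOY_SW_UDP, TOY_SW_TCP, TOY_SW_RAW,
    TOY_SW_ICMP, TOY_SW_IGMP, TOY_SW_COUNT
};

struct toy_protosw {
    unsigned char pr_protocol;   /* IP protocol number from the header */
    const char   *name;
};

/* IANA protocol numbers: ICMP 1, IGMP 2, TCP 6, UDP 17, raw 255. */
static const struct toy_protosw toy_inetsw[TOY_SW_COUNT] = {
    {   0, "IP"      },
    {  17, "UDP"     },
    {   6, "TCP"     },
    { 255, "IP(raw)" },
    {   1, "ICMP"    },
    {   2, "IGMP"    }
};

unsigned char toy_ip_protox[256];

/* Point every protocol number at raw IP, then override known protocols. */
void toy_protox_init(void)
{
    for (int p = 0; p < 256; p++)
        toy_ip_protox[p] = TOY_SW_RAW;
    for (int i = 0; i < TOY_SW_COUNT; i++) {
        unsigned char proto = toy_inetsw[i].pr_protocol;
        if (proto != 0 && proto != 255)
            toy_ip_protox[proto] = (unsigned char)i;
    }
}

/* Name of the protocol whose input routine would receive the datagram. */
const char *toy_demux(unsigned char ip_p)
{
    assert(toy_ip_protox[ip_p] < TOY_SW_COUNT);
    return toy_inetsw[toy_ip_protox[ip_p]].name;
}
```

After toy_protox_init(), a header with ip_p = 6 resolves to index 2 (TCP) and an unrecognized value such as 89 falls through to index 3 (raw IP).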

