Transport Layer Outline: 3.1 Transport-layer services, 3.2 Multiplexing and demultiplexing, 3.3 Connectionless transport: UDP, 3.4.


1 Transport Layer3-1 Transport Layer Outline
- 3.1 Transport-layer services
- 3.2 Multiplexing and demultiplexing
- 3.3 Connectionless transport: UDP
- 3.4 Principles of reliable data transfer
- 3.5 Connection-oriented transport: TCP
  - segment structure
  - reliable data transfer
  - flow control
  - connection management
- 3.6 Principles of congestion control
- 3.7 TCP congestion control

2 Transport Layer3-2 Recap: rdt3.0 sender (Stop-and-wait)
[FSM with four states; Λ means "no action"]
- Wait for call 0 from above:
  - rdt_send(data): sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer; go to Wait for ACK0
  - rdt_rcv(rcvpkt): Λ
- Wait for ACK0:
  - rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ): Λ
  - timeout: udt_send(sndpkt); start_timer
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0): stop_timer; go to Wait for call 1 from above
- Wait for call 1 from above:
  - rdt_send(data): sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer; go to Wait for ACK1
  - rdt_rcv(rcvpkt): Λ
- Wait for ACK1:
  - rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) ): Λ
  - timeout: udt_send(sndpkt); start_timer
  - rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1): stop_timer; go to Wait for call 0 from above

3 Transport Layer3-3 Recap: rdt3.0: stop-and-wait operation
[Timing diagram between sender and receiver]
- t = 0: first packet bit transmitted
- t = L / R: last packet bit transmitted
- first packet bit arrives at the receiver
- last packet bit arrives, receiver sends ACK
- t = RTT + L / R: ACK arrives, sender sends next packet

4 Transport Layer3-4 Recap: Pipelining: increased utilization
[Timing diagram between sender and receiver, three packets in flight]
- t = 0: first packet bit transmitted
- t = L / R: last bit transmitted
- first packet bit arrives at the receiver
- last packet bit arrives, receiver sends ACK
- last bit of 2nd packet arrives, receiver sends ACK
- last bit of 3rd packet arrives, receiver sends ACK
- t = RTT + L / R: ACK arrives, sender sends next packet
Increase utilization by a factor of 3!
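
The factor of 3 follows from the timing labels in the two diagrams: the sender is busy for L / R out of every RTT + L / R seconds, and pipelining three packets triples the busy time. A minimal sketch of that arithmetic, assuming illustrative link numbers (8000-bit packets, a 1 Gbps link and a 30 ms RTT) that do not come from the slides:

```python
def utilization(packet_bits, rate_bps, rtt_s, window=1):
    """Fraction of time the sender spends transmitting.

    window is the number of packets sent back-to-back before waiting
    for the first ACK (1 = stop-and-wait).
    """
    transmit_time = packet_bits / rate_bps      # L / R
    cycle_time = rtt_s + transmit_time          # RTT + L / R
    return min(1.0, window * transmit_time / cycle_time)


# Assumed example values, not taken from the slides.
L_BITS = 8000
R_BPS = 1e9
RTT_S = 0.030

stop_and_wait = utilization(L_BITS, R_BPS, RTT_S)
pipelined = utilization(L_BITS, R_BPS, RTT_S, window=3)
print(f"stop-and-wait: {stop_and_wait:.6f}, 3-packet pipeline: {pipelined:.6f}")
```

With these numbers the link is idle almost all of the time in both cases; pipelining only helps in proportion to the window size.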

5 Transport Layer3-5 Recap: GBN for Pipelined Error Recovery
Sender:
- There is a k-bit sequence # in the packet header.
- A “window” of up to N consecutive unacknowledged sent/can-be-sent packets is allowed.
- The window moves by 1 packet at a time when its 1st sent pkt is acknowledged (standard behavior).
- The window cannot contain acknowledged pkts.
Sender must respond to three types of events:
- 1- Invocation from above: the application layer tries to send a packet; if the window is full the packet is returned, otherwise the packet is accepted and sent.
- 2- Receipt of an ACK: ACK(n) indicates that all pkts up to and including seq # n have been received (“cumulative ACK”).
  - may receive duplicate ACKs (when the receiver receives out-of-order packets)
- 3- A timeout event (the only cause of retransmission):
  - timer for each in-flight pkt
  - if a timeout occurs: retransmit the packets that have not been acknowledged

6 Transport Layer3-6 Recap: Selective repeat for error recovery Window may contain acknowledged pkts (unlike GBN)

7 Transport Layer3-7 TCP: Overview
RFCs: 793, 1122, 1323, 2018, 2581
- full duplex data:
  - bi-directional data flow in same connection at the same time
- flow controlled:
  - sender will not overwhelm receiver
- point-to-point:
  - one sender, one receiver
  - no one-to-many multicasts
- connection-oriented:
  - processes must handshake before sending data
  - three-way handshake (exchange of control msgs) initializes sender, receiver state before data exchange
- pipelined:
  - TCP congestion and flow control set window size
- send & receive buffers:
  - set aside during the 3-way handshake

8 Transport Layer3-8 TCP: Overview - cont
- Maximum Segment Size (MSS):
  - Defined as the maximum amount of application-layer data in a TCP segment.
  - TCP grabs data in chunks from the send buffer, where the maximum chunk size is the MSS. A TCP segment contains the TCP header plus up to MSS bytes of data.
  - MSS is set by determining the largest link layer frame (Maximum Transmission Unit or MTU) that can be sent by the local host.
  - MSS is set so that an MSS-sized chunk put into an IP datagram will fit into a single link layer frame. Common values of MSS are 1460 bytes, 536 bytes and 512 bytes.
- TCP sequence #s:
  - both sides randomly choose initial seq #s (other than 0) so that segments from older connections that used the same ports are not mistaken for segments of the new connection.
  - TCP views data as an unstructured but ordered stream of bytes, so seq #s are over the stream of bytes.
  - file size of 500,000 bytes and MSS = 1,000 bytes: segment seq #s are 0, 1000, 2000, etc.
- TCP acknowledgement #s:
  - uses cumulative acks: TCP only acks bytes up to the first missing byte in the stream. The TCP RFCs do not address how to handle out-of-order segments.
  - the ACK # field holds the number of the next byte that the sender of the ACK is expecting

9 Transport Layer3-9 TCP segment structure
[Figure: TCP segment layout, 32 bits wide]
- source port #, dest port #
- sequence number, acknowledgement number: counting by bytes of data (not segments!)
  - a 32-bit sequence number covers 2^32 bytes (4 GB) before it wraps around
  - total # segments = file size / MSS
- header length: 4 bits, counted in 32-bit words
- not used
- flag bits U, A, P, R, S, F:
  - URG: urgent data (generally not used)
  - ACK: ACK # valid
  - PSH: push data now to upper layer
  - RST, SYN, FIN: connection setup and close; RST=1 is used in response when a client tries to connect to a non-open server port
- Receive window: 16 bits = # bytes receiver is willing to accept (RcvWindow size)
- checksum: Internet checksum (as in UDP)
- Urgent data pointer
- Options (variable length): used to negotiate MSS
- application data (variable length)
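
The fixed 20-byte part of this layout can be unpacked with Python's `struct` module. This is a minimal sketch, not a full parser: options are ignored, the checksum is not verified, and the function name `parse_tcp_header` and the sample values (ports and sequence number borrowed from the connection setup trace later in the deck) are illustrative choices.

```python
import struct

# Flag bit positions in the low six bits of the 14th header byte.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte TCP header (options are not parsed)."""
    (src_port, dst_port, seq, ack, offset_byte, flag_byte,
     window, checksum, urgent) = struct.unpack("!HHIIBBHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        # Header length is stored in 32-bit words in the top 4 bits.
        "header_len": (offset_byte >> 4) * 4,
        "flags": {name: bool(flag_byte & bit) for name, bit in FLAG_BITS.items()},
        "window": window,
        "checksum": checksum,
        "urgent_ptr": urgent,
    }


# Hand-built SYN header: 5 words long, SYN flag set, checksum left at 0.
sample = struct.pack("!HHIIBBHHH", 3123, 80, 4019802004, 0,
                     5 << 4, FLAG_BITS["SYN"], 65535, 0, 0)
hdr = parse_tcp_header(sample)
print(hdr["src_port"], hdr["dst_port"], hdr["seq"], hdr["flags"]["SYN"])
```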

10 Transport Layer3-10 Seq Numbers and Ack Numbers
- Suppose a data stream of size 500,000 bytes and an MSS of 1,000 bytes; the first byte of the data stream is numbered zero.
  - Seq number of the segments: 1st seg: 0; 2nd seg: 1000; 3rd seg: 2000, …
- Ack number:
  - Assume host A is sending segments to host B. Because TCP is full-duplex, A may be receiving data from B simultaneously.
  - The ack number that host B puts in its segment is the seq number of the next byte B is expecting from A.
  - Example: B has received all bytes numbered 0 through 535 from A. If B is about to send a segment to host A, the ack number in that segment should be 536.
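
The numbering in this example can be reproduced in a few lines. A minimal sketch; the helper names `segment_seq_numbers` and `ack_number` are illustrative, and sequence-number wraparound is ignored.

```python
def segment_seq_numbers(stream_bytes, mss, first_byte=0):
    """Sequence number of each segment: the number of its first data byte."""
    return list(range(first_byte, first_byte + stream_bytes, mss))


def ack_number(last_in_order_byte):
    """Cumulative ACK: the number of the next byte the receiver expects."""
    return last_in_order_byte + 1


seqs = segment_seq_numbers(500_000, 1_000)
print(seqs[:3], "...", seqs[-1], f"({len(seqs)} segments)")
print("ACK after receiving bytes 0..535:", ack_number(535))
```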

11 Transport Layer3-11 TCP seq. #’s and ACKs - Telnet example
- Telnet uses “echo back” to ensure that characters seen by the user have already been received and processed at the server.
- Assume starting seq #s are 42 and 79 for client and server respectively.
- After the connection is established, the client is waiting for byte 79 and the server for byte 42.
Seq. #’s:
- byte stream “number” of first byte in segment’s data
ACKs:
- seq # of next byte expected from other side
- cumulative ACK
Simple telnet scenario (Host A = client, Host B = server), in time order:
- User types ‘C’. A -> B: Seq=42, ACK=79, data = ‘C’
- Host B ACKs receipt of ‘C’, echoes back ‘C’. B -> A: Seq=79, ACK=43, data = ‘C’
- Host A ACKs receipt of echoed ‘C’. A -> B: Seq=43, ACK=80

12 Transport Layer3-12 TCP Round Trip Time and Timeout
Q: how to set TCP timeout value? (timer management)
- based on RTT
- longer than RTT
  - but RTT varies
- too short: premature timeout
  - unnecessary retransmissions
- too long: slow reaction to segment loss
Q: how to estimate RTT?
- SampleRTT: measured time from segment transmission (handing the segment to IP) until ACK receipt
  - ignore retransmissions (why?)
- SampleRTT will vary from segment to segment; want estimated RTT “smoother”
  - average several recent measurements, not just current SampleRTT
- TCP maintains an average called EstimatedRTT and uses it to calculate the timeout value

13 Transport Layer3-13 TCP Round Trip Time (RTT) and Timeout
EstimatedRTT = (1 - α) * priorEstimatedRTT + α * currentSampleRTT
- Exponential Weighted Moving Average (EWMA)
- Puts more weight on recent samples than on old ones
- influence of a past sample decreases exponentially fast
- typical value: α = 0.125
- Formula becomes: EstimatedRTT = 0.875 * priorEstimatedRTT + 0.125 * currentSampleRTT
Why TCP ignores retransmissions when calculating SampleRTT: Suppose the source sends packet P1, the timer for P1 expires, and the source then sends P2, a new copy of the same packet. Further suppose the source measures SampleRTT for P2 (the retransmitted packet) and that shortly after transmitting P2 an acknowledgment for P1 arrives. The source will mistakenly take this acknowledgment as an acknowledgment for P2 and calculate an incorrect value of SampleRTT.
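
The EWMA update is a one-line function. The sketch below applies it to a short list of made-up samples (the millisecond values are assumptions for illustration) to show how a single outlier moves the estimate only slightly.

```python
ALPHA = 0.125  # typical value from the slide


def update_estimated_rtt(prior_estimate, sample, alpha=ALPHA):
    """EstimatedRTT = (1 - alpha) * priorEstimatedRTT + alpha * currentSampleRTT."""
    return (1 - alpha) * prior_estimate + alpha * sample


# Assumed SampleRTT values in milliseconds; 180.0 is a deliberate outlier.
estimate = 100.0
history = []
for sample_rtt in [100.0, 180.0, 100.0, 100.0]:
    estimate = update_estimated_rtt(estimate, sample_rtt)
    history.append(estimate)
print(history)
```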

14 Transport Layer3-14 RTT Sample Ambiguity
- Karn’s RTT Estimator
  - If a segment has been retransmitted:
    - Don’t count RTT sample on ACKs for this segment
    - Keep backed off time-out for next packet
    - Reuse RTT estimate only after one successful transmission
[Figure: two A/B timelines, each with an original transmission, a retransmission, an ACK and the resulting Sample RTT]

15 Transport Layer3-15 Example RTT estimation:

16 Transport Layer3-16 TCP Round Trip Time and Timeout
Setting the timeout
- EstimatedRTT plus “safety margin”
  - large variation in EstimatedRTT -> larger safety margin
- first estimate how much SampleRTT deviates from EstimatedRTT:
  DevRTT = (1 - β) * DevRTT + β * |SampleRTT - EstimatedRTT| (typically, β = 0.25)
- then set the timeout interval:
  TimeoutInterval = EstimatedRTT + 4 * DevRTT
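
A minimal sketch of one timer update that combines the two formulas. The slide does not say whether DevRTT is computed with the old or the new EstimatedRTT; this sketch assumes the ordering used in RFC 6298 (deviation first, using the previous estimate). The function name and the sample values are illustrative.

```python
def update_timeout(estimated_rtt, dev_rtt, sample_rtt, alpha=0.125, beta=0.25):
    """Return (EstimatedRTT, DevRTT, TimeoutInterval) after one new sample.

    Assumption: DevRTT is updated first, against the previous EstimatedRTT.
    """
    dev_rtt = (1 - beta) * dev_rtt + beta * abs(sample_rtt - estimated_rtt)
    estimated_rtt = (1 - alpha) * estimated_rtt + alpha * sample_rtt
    timeout_interval = estimated_rtt + 4 * dev_rtt
    return estimated_rtt, dev_rtt, timeout_interval


# Assumed starting state and sample, in milliseconds.
est, dev, timeout = update_timeout(estimated_rtt=100.0, dev_rtt=10.0, sample_rtt=140.0)
print(est, dev, timeout)
```

A sample far from the current estimate raises DevRTT, so the safety margin widens much faster than EstimatedRTT itself moves.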

17 Transport Layer3-17 TCP: conn-oriented transport
- segment structure
- RTT Estimation and Timeout
- reliable data transfer
- flow control
- connection management

18 Transport Layer3-18 TCP reliable data transfer
- TCP creates rdt service on top of IP’s unreliable service
- Pipelined segments
- Cumulative acks
- TCP uses a single retransmission timer, as multiple timers require considerable overhead
- Retransmissions are triggered by:
  - timeout events
  - duplicate acks
- Initially consider simplified TCP sender:
  - ignore duplicate acks
  - ignore flow control, congestion control

19 Transport Layer3-19 TCP sender events:
data rcvd from app:
- Create segment with seq #
- seq # is byte-stream number of first data byte in segment
- start timer if not already running for some other segment (think of timer as for oldest unacknowledged segment)
- expiration interval: TimeOutInterval
timeout:
- retransmit segment that caused timeout
- restart timer
Ack rcvd:
- a valid ACK field (cumulative ACK) acknowledges previously unacknowledged segments:
  - update expected ACK #
  - restart timer if there are currently unacknowledged segments

20 Transport Layer3-20 TCP sender (simplified)
NextSeqNum = InitialSeqNum
SendBase = InitialSeqNum
loop (forever) {
  switch(event)
  event: data received from application above
    create TCP segment with sequence number NextSeqNum
    if (timer currently not running)
      start timer
    pass segment to IP
    NextSeqNum = NextSeqNum + length(data)
  event: timer timeout
    retransmit not-yet-acknowledged segment with smallest sequence number
    start timer
  event: ACK received, with ACK field value of y
    if (y > SendBase) {
      SendBase = y
      if (there are currently not-yet-acknowledged segments)
        start timer
    }
} /* end of loop forever */
Comment:
- SendBase-1: last cumulatively ack’ed byte
- Example: SendBase-1 = 71; y = 73, so the rcvr wants 73+; y > SendBase, so that new data is acked
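
The three events of the simplified sender can be exercised without a network. The sketch below is an assumption-laden model: the class name, the `wire` list standing in for "pass segment to IP", and the boolean timer are illustrative, and sequence-number wraparound is ignored. The demo replays a premature-timeout exchange (Seq=92 with 8 bytes, Seq=100 with 20 bytes).

```python
class SimplifiedTcpSender:
    """In-memory model of the simplified sender: no real timer, no network."""

    def __init__(self, initial_seq):
        self.next_seq = initial_seq
        self.send_base = initial_seq
        self.unacked = {}           # seq number -> data not yet acknowledged
        self.timer_running = False
        self.wire = []              # (seq, data) pairs handed to IP

    def on_app_data(self, data):
        seq = self.next_seq
        self.unacked[seq] = data
        if not self.timer_running:
            self.timer_running = True   # start timer
        self.wire.append((seq, data))   # pass segment to IP
        self.next_seq += len(data)

    def on_timeout(self):
        if not self.unacked:
            return
        seq = min(self.unacked)         # smallest not-yet-acked seq number
        self.wire.append((seq, self.unacked[seq]))
        self.timer_running = True       # restart timer

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]   # cumulatively acknowledged
            self.timer_running = bool(self.unacked)


sender = SimplifiedTcpSender(initial_seq=92)
sender.on_app_data(b"A" * 8)    # Seq=92, 8 bytes
sender.on_app_data(b"B" * 20)   # Seq=100, 20 bytes
sender.on_timeout()             # only Seq=92 is retransmitted
sender.on_ack(120)              # cumulative ACK covers both segments
print([seq for seq, _ in sender.wire], sender.send_base)
```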

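21 Transport Layer3-21 TCP: retransmission scenarios
Lost ACK scenario (Host A sender, Host B receiver):
- A -> B: Seq=92, 8 bytes data
- B -> A: ACK=100 (lost, X)
- Seq=92 timeout at A
- A -> B: Seq=92, 8 bytes data (retransmission)
- B -> A: ACK=100
- SendBase = 100
Premature timeout scenario (Host A sender, Host B receiver):
- A -> B: Seq=92, 8 bytes data
- A -> B: Seq=100, 20 bytes data
- B -> A: ACK=100 and ACK=120
- Seq=92 timeout at A before the ACKs arrive
- A -> B: Seq=92, 8 bytes data (transmit not-yet-ack segment with smallest seq #)
- SendBase = 100, then SendBase = 120 as the ACKs arrive
- B -> A: ACK=120
- SendBase = 120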

22 Transport Layer3-22 TCP retransmission scenarios (more)
Cumulative ACK scenario (Host A sender, Host B receiver):
- A -> B: Seq=92, 8 bytes data
- A -> B: Seq=100, 20 bytes data
- B -> A: ACK=100 (lost, X)
- B -> A: ACK=120 arrives before the timeout
- SendBase = 120
Doubling the timeout value:
- The technique is used in TCP implementations. The timeout value is doubled for every retransmission, since the timeout could have occurred because the network is congested.
- The intervals grow exponentially after each retransmission and are reset after either of the two other events (data received from the application, ACK received).
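
A minimal sketch of the doubling rule, listing the timer interval used before each successive retransmission of the same segment. The starting value of 0.75 s and the function name are assumptions for illustration; real implementations also cap the interval, which is omitted here.

```python
def backoff_intervals(base_timeout, timeouts):
    """Timer interval in effect for each of `timeouts` consecutive expirations."""
    intervals = []
    current = base_timeout
    for _ in range(timeouts):
        intervals.append(current)
        current *= 2          # doubled after every retransmission
    return intervals


intervals = backoff_intervals(0.75, 4)
print(intervals)
```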

23 Transport Layer3-23 TCP ACK generation policy [RFC 1122, RFC 2581]
Event at receiver -> TCP receiver action
- Arrival of in-order segment with expected seq #; all data up to expected seq # already ACKed -> Delayed ACK: wait up to 500 ms for next segment; if no next segment, send ACK
- Arrival of in-order segment with expected seq #; one other segment has ACK pending -> Immediately send single cumulative ACK, ACKing both in-order segments
- Arrival of out-of-order segment with higher-than-expected seq. #; gap detected -> Immediately send duplicate ACK, indicating seq. # of next expected byte
- Arrival of segment that partially or completely fills gap -> Immediately send ACK, provided that segment starts at lower end of gap
Note: this policy leaves the buffering of out-of-order segments open.

24 Transport Layer3-24 Fast Retransmit
- Time-out period often relatively long:
  - long delay before resending lost packet
- Detect lost segments via duplicate ACKs.
  - A dup ACK is an ACK that reacknowledges the receipt of an already acknowledged segment
  - Sender often sends many segments back-to-back
  - If a segment is lost, there will likely be many duplicate ACKs.
- If the sender receives 3 duplicate ACKs for the same data, it supposes that the segment after the last ACKed segment was lost:
  - sender performs fast retransmit: resend the segment before that segment’s timer expires
  - the algorithm comes as a result of 15 years of TCP experience!

25 Transport Layer3-25 Fast retransmit algorithm:
event: ACK received, with ACK field value of y
  if (y > SendBase) {
    SendBase = y
    if (there are currently not-yet-acknowledged segments)
      start timer
  }
  else {   /* a duplicate ACK for already ACKed segment */
    increment count of dup ACKs received for y
    if (count of dup ACKs received for y == 3) {
      resend segment with sequence number y   /* fast retransmit */
    }
  }
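
The duplicate-ACK branch can be checked with a small model that only tracks SendBase and the duplicate counter. The class name and the `retransmitted` list are illustrative stand-ins; timers and actual segment data are left out.

```python
class FastRetransmitSender:
    """Tracks SendBase and duplicate ACKs; records which seq # gets resent."""

    DUP_ACK_THRESHOLD = 3

    def __init__(self, send_base):
        self.send_base = send_base
        self.dup_acks = 0
        self.retransmitted = []

    def on_ack(self, y):
        if y > self.send_base:
            self.send_base = y      # new data acknowledged
            self.dup_acks = 0
        else:
            self.dup_acks += 1      # duplicate ACK for already ACKed data
            if self.dup_acks == self.DUP_ACK_THRESHOLD:
                self.retransmitted.append(y)   # fast retransmit of segment y


sender = FastRetransmitSender(send_base=100)
sender.on_ack(100)
sender.on_ack(100)
after_two = list(sender.retransmitted)
sender.on_ack(100)                  # third duplicate triggers the resend
after_three = list(sender.retransmitted)
sender.on_ack(120)                  # new cumulative ACK resets the counter
print(after_two, after_three, sender.send_base)
```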

26 Transport Layer3-26 Is TCP a GBN or SR protocol?
- TCP can buffer out-of-order segments (like SR).
- TCP has a proposed RFC called selective acknowledgement to selectively acknowledge out-of-order segments and save on retransmissions (like SR).
- TCP sender need only maintain the smallest seq # of a transmitted but unacknowledged byte and the seq # of the next byte to be sent (like GBN).
- TCP is a hybrid between GBN and SR.

27 Transport Layer3-27 TCP: conn-oriented transport
- segment structure
- RTT Estimation and Timeout
- reliable data transfer
- flow control
- connection management

28 Transport Layer3-28 TCP Flow Control
- receive side of TCP connection has a receive buffer
- speed-matching service: matching the send rate to the receiving app’s drain rate
- app process may be slow at reading from buffer
Flow control: sender won’t overflow receiver’s buffer by transmitting too much, too fast

29 Transport Layer3-29 TCP Flow control: how it works
(Suppose TCP receiver discards out-of-order segments)
- sender maintains a variable called the receive window
- spare room in buffer = RcvWindow = RcvBuffer - [LastByteRcvd - LastByteRead]
- TCP is not allowed to overflow the allocated buffer (LastByteRcvd - LastByteRead <= RcvBuffer)
- Rcvr advertises spare room by including the value of RcvWindow in segments
- RcvWindow = RcvBuffer at the start of transmission
- Sender limits unACKed data to RcvWindow
- sender keeps track of UnAcked data size = (LastByteSent - LastByteAcked)
- UnAcked data size <= RcvWindow
- When the receiver's RcvWindow = 0, the sender does not block but rather sends 1-byte segments that are acked by the receiver until RcvWindow becomes bigger.
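
The bookkeeping on both sides reduces to two subtractions. A minimal sketch with assumed buffer sizes and byte counters; the function names are illustrative, and the zero-window case follows the 1-byte probing rule stated on the slide.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room advertised to the sender."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)


def next_send_size(rcv_win, last_byte_sent, last_byte_acked, pending):
    """Sender side: how many of `pending` bytes may be sent right now."""
    if rcv_win == 0:
        return min(1, pending)      # 1-byte probe while the window is closed
    unacked = last_byte_sent - last_byte_acked
    return min(pending, max(0, rcv_win - unacked))


# Assumed numbers: 4096-byte buffer, app has read 1000 of 3000 received bytes.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=1000)
can_send = next_send_size(window, last_byte_sent=5000, last_byte_acked=4000, pending=5000)
probe = next_send_size(0, last_byte_sent=5000, last_byte_acked=5000, pending=500)
print(window, can_send, probe)
```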

30 Transport Layer3-30 TCP: conn-oriented transport
- segment structure
- RTT Estimation and Timeout
- reliable data transfer
- flow control
- connection management

31 Transport Layer3-31 Recap: TCP socket interaction
Server (running on hostid):
- create socket, port = x, for incoming request: welcomeSocket = ServerSocket()
- wait for incoming connection request: connectionSocket = welcomeSocket.accept()
- read request from connectionSocket
- write reply to connectionSocket
- close connectionSocket
Client:
- create socket, connect to hostid, port = x: clientSocket = Socket()
- send request using clientSocket
- read reply from clientSocket
- close clientSocket
TCP connection setup takes place between the client's Socket() call and the server's accept() return.

32 Transport Layer3-32 TCP Connection Management
Recall: TCP sender, receiver establish “connection” before exchanging data segments
- initialize TCP variables:
  - seq. #s
  - buffers, flow control info (e.g. RcvWindow)
- client: connection initiator
  Socket clientSocket = new Socket("hostname", portNumber);
- server: contacted by client
  Socket connectionSocket = welcomeSocket.accept();
[Figure: TCP segment structure, as on slide 9; Options (variable length) used to negotiate MSS]

33 Transport Layer3-33 TCP Connection Management - connecting
[Timeline between client and server]
- conn request, client -> server: SYN=1, seq=client_isn
- conn granted, server -> client: SYN=1, seq=server_isn, ack=client_isn+1
- ACK, client -> server: SYN=0, seq=client_isn+1, ack=server_isn+1
Three way handshake:
- Step 1: client host sends TCP SYN segment (SYN bit=1) to server
  - specifies initial seq # (client_isn)
  - no data
- Step 2: server host receives SYN, replies with SYNACK segment
  - server allocates buffers
  - specifies server initial seq. # (server_isn), with ACK # = client_isn+1
- Step 3: client receives SYNACK, replies with ACK # = server_isn+1, which may contain data
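
The seq/ack arithmetic of the three segments can be written out directly. A minimal sketch: the segments are plain dictionaries rather than real packets, the function name is illustrative, and the two initial sequence numbers are borrowed from the connection setup trace in this deck.

```python
SEQ_SPACE = 2 ** 32  # sequence numbers are 32-bit and wrap around


def three_way_handshake(client_isn, server_isn):
    """Return the SYN, SYNACK and ACK segments as dictionaries."""
    syn = {"SYN": 1, "seq": client_isn}
    synack = {"SYN": 1, "seq": server_isn,
              "ack": (client_isn + 1) % SEQ_SPACE}
    ack = {"SYN": 0, "seq": (client_isn + 1) % SEQ_SPACE,
           "ack": (server_isn + 1) % SEQ_SPACE}
    return syn, synack, ack


syn, synack, ack = three_way_handshake(client_isn=4019802004,
                                       server_isn=3428951569)
for segment in (syn, synack, ack):
    print(segment)
```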

34 Transport Layer3-34 TCP Connection Setup Example
- Client SYN
  - SeqC: Seq. #4019802004, window 65535, max. seg. 1260
- Server SYN-ACK+SYN
  - Receive: #4019802005 (= SeqC+1)
  - SeqS: Seq. #3428951569, window 5840, max. seg. 1460
- Client SYN-ACK
  - Receive: #3428951570 (= SeqS+1)
09:23:33.042318 IP 128.2.222.198.3123 > 192.216.219.96.80: S 4019802004:4019802004(0) win 65535
09:23:33.118329 IP 192.216.219.96.80 > 128.2.222.198.3123: S 3428951569:3428951569(0) ack 4019802005 win 5840
09:23:33.118405 IP 128.2.222.198.3123 > 192.216.219.96.80: . ack 3428951570 win 65535
sackOK: selective acknowledge

35 Transport Layer3-35 TCP Connection Management - disconnecting
Closing a connection: client closes socket: clientSocket.close();
- Step 1: client end system sends TCP FIN control segment (FIN bit=1) to server
- Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN=1.
[Timeline: client sends FIN (close); server sends ACK, then FIN (close); client enters timed wait, then closed]

36 Transport Layer3-36 TCP Connection Management (cont.)
- Step 3: client receives FIN, replies with ACK.
  - Enters “timed wait”: while waiting, the client responds with an ACK to any received FIN. A typical wait is 30 sec, after which all resources and ports are released.
- Step 4: server receives ACK. Connection closed.
[Timeline: client sends FIN (closing); server sends ACK, then FIN (closing); client sends ACK and enters timed wait; server closed; client closed]

37 Transport Layer3-37 TCP Conn. Teardown Example
- Session
  - Echo client on 128.2.222.198, server on 128.2.210.194
- Client FIN
  - SeqC: 1489294581
- Server ACK + FIN
  - Ack: 1489294582 (= SeqC+1)
  - SeqS: 1909787689
- Client ACK
  - Ack: 1909787690 (= SeqS+1)
09:54:17.585396 IP 128.2.222.198.4474 > 128.2.210.194.6616: F 1489294581:1489294581(0) ack 1909787689 win 65434
09:54:17.585732 IP 128.2.210.194.6616 > 128.2.222.198.4474: F 1909787689:1909787689(0) ack 1489294582 win 5840
09:54:17.585764 IP 128.2.222.198.4474 > 128.2.210.194.6616: . ack 1909787690 win 65434

38 Transport Layer3-38 TCP Connection Management (cont)
[Figures: TCP client lifecycle; TCP server lifecycle]

39 Transport Layer3-39 Queue Management r Two queues for each listening socket

40 Transport Layer3-40 Concurrent Server
pid_t pid;
int listenfd, connfd;
listenfd = Socket( ... );
/* fill in sockaddr_in{} with server's well-known port */
Bind(listenfd, ... );
Listen(listenfd, LISTENQ);
for ( ; ; ) {
    connfd = Accept(listenfd, ... );   /* probably blocks */
    if ( (pid = Fork()) == 0) {
        Close(listenfd);   /* child closes listening socket */
        doit(connfd);      /* process the request */
        Close(connfd);     /* done with this client */
        exit(0);           /* child terminates */
    }
    Close(connfd);         /* parent closes connected socket */
}

41 Transport Layer3-41 Concurrent Server (Cont’)
- (a) Status before the call to accept returns
- (b) Status after return from accept
- (c) Status after spawning a process
- (d) Status after parent/child close the appropriate sockets

42 Transport Layer3-42 TCP Summary
- TCP properties:
  - point to point, connection-oriented, full-duplex, reliable
- TCP segment structure
- How TCP sequence and acknowledgement #s are assigned
- How TCP calculates the timeout value needed for retransmissions using EstimatedRTT and DevRTT
- TCP retransmission scenarios, ACK generation and fast retransmit
- How TCP flow control works
- TCP connection management: 3 segments exchanged to connect and 4 segments exchanged to disconnect

