Presentation is loading. Please wait.

Presentation is loading. Please wait.

Advanced Networks 20021 Transport Layer Michalis Faloutsos Many slides from Kurose-Ross.

Similar presentations


Presentation on theme: "Advanced Networks 20021 Transport Layer Michalis Faloutsos Many slides from Kurose-Ross."— Presentation transcript:

1

2 Advanced Networks Transport Layer Michalis Faloutsos Many slides from Kurose-Ross

3 Advanced Networks Transport Layer Functionality Hide network from application layer Transport layer resides at end points Sees the network as a black box

4 Advanced Networks Transport Layers of the Internet TCP: reliable protocol Guarantees end-to-end deliveryGuarantees end-to-end delivery Self-controls rate: congestion and flow controlSelf-controls rate: congestion and flow control Connection oriented: handshake, stateConnection oriented: handshake, state Ordered delivery of packets to applicationOrdered delivery of packets to application UDP: unreliable protocol Non-regulated sending rateNon-regulated sending rate Multiplexing-demultiplexingMultiplexing-demultiplexing

5 Advanced Networks TCP overview

6 Advanced Networks TCP: What and How For more: RFCs: 793, 1122, 1323, 2018, 2581 full duplex data: bi-directional data flow in same connectionbi-directional data flow in same connection MSS: maximum segment sizeMSS: maximum segment sizeconnection-oriented: handshaking (exchange of control msgs) init’s sender, receiver state before data exchangehandshaking (exchange of control msgs) init’s sender, receiver state before data exchange flow controlled: sender will not overwhelm receiversender will not overwhelm receiver point-to-point: one sender, one receiver reliable, in-order byte steam: no “message boundaries” pipelined: TCP congestion and flow control set window size send & receive buffers

7 Advanced Networks TCP segment structure source port # dest port # 32 bits application data (variable length) sequence number acknowledgement number rcvr window size ptr urgent data checksum F SR PAU head len not used Options (variable length) URG: urgent data (generally not used) ACK: ACK # valid PSH: push data now (generally not used) RST, SYN, FIN: connection estab (setup, teardown commands) # bytes rcvr willing to accept counting by bytes of data (not segments!) Internet checksum (as in UDP)

8 Advanced Networks TCP overview TCP is a sliding window protocol Sender can have (Window) bytes in flight Operates with cumulative ACKs It includes control for the sending rate Flow control: receiver-set sending rate Congestion control: network-aware sending rate Congwin

9 Advanced Networks TCP seq. #’s and ACKs Seq. #’s: byte stream “number” of first byte in segment’s databyte stream “number” of first byte in segment’s dataACKs: seq # of next byte expected from other sideseq # of next byte expected from other side cumulative ACKcumulative ACK Q: how receiver handles out-of-order segments A: TCP spec doesn’t say, - up to implementorA: TCP spec doesn’t say, - up to implementor Host A Host B Seq=42, ACK=79, data = ‘C’ Seq=79, ACK=43, data = ‘C’ Seq=43, ACK=80 User types ‘C’ host ACKs receipt of echoed ‘C’ host ACKs receipt of ‘C’, echoes back ‘C’ time simple telnet scenario

10 Advanced Networks TCP in a nutshell I. Slow start phase (actually this is fast increase) Start with a window of 1 (or 2)Start with a window of 1 (or 2) Successful ACK: Increase window by one 1 max size segmentSuccessful ACK: Increase window by one 1 max size segment Do this up to a threshold: sshthreshDo this up to a threshold: sshthresh II. Congestion control phase Increase window by 1 max size segment every RTTIncrease window by 1 max size segment every RTT Drop window in half, if there is congestionDrop window in half, if there is congestion  Packet loss: duplicate ACKs  Time expiration

11 Advanced Networks TCP Congestion Control end-end control (no network assistance) transmission rate limited by congestion window size, Congwin, over segments: w segments, each with MSS bytes sent in one RTT: throughput = w * MSS RTT Bytes/sec Congwin

12 Advanced Networks TCP congestion control: Intuition TCP is “probing” for usable bandwidth: ideally: transmit as fast as possible ( Congwin as large as possible) without loss increase Congwin until loss (congestion) loss: decrease Congwin, then begin probing (increasing) again

13 Advanced Networks TCP congestion control: TCP has two “phases” slow start:  start from small, increase quickly congestion avoidance:  Additive Increase Multiplicative Decrease important variables: Congwin threshold: defines threshold between two slow start phase, congestion control phase

14 Advanced Networks TCP Slowstart exponential increase (per RTT) in window size loss event: timeout (Tahoe TCP) and/or or three duplicate ACKs (Reno TCP) initialize: Congwin = 1 for (each segment ACKed) Congwin++ until (loss event OR CongWin > threshold) Slowstart algorithm Host A one segment RTT Host B time two segments four segments

15 Advanced Networks Why Call it Slow Start ? The original version of TCP suggested that the sender transmit as much as the Advertised Window permitted. Routers may not be able to cope with this “burst” of transmissions. Slow start is slower than the above version -- ensures that a transmission burst does not happen at once.

16 Advanced Networks TCP Congestion Avoidance /* slowstart is over */ /* Congwin > threshold */ Until (loss event) { every w segments ACKed: Congwin++ } threshold = Congwin/2 Congwin = 1 perform slowstart Congestion avoidance 1 1: TCP Reno skips slowstart (fast recovery) after three duplicate ACKs

17 Advanced Networks TCP Congestion: Real Life is Hairy! Remember: bytes vs packets! CW += MSS * MSS/CW Thres = Max( 2* MSS, InFlightData/2) MSS: max segment size InFlighData: un-ACK-ed data /* slowstart is over */ /* Congwin > threshold */ Until (loss event) { every w segments ACKed: Congwin++ } threshold = Congwin/2 Congwin = 1 perform slowstart Congestion avoidance 1 RFC 2581: TCP Congestion Control

18 Advanced Networks Fairness goal: if N TCP sessions share same bottleneck link, each should get 1/N of link capacity TCP congestion avoidance: AIMD: additive increase, multiplicative decrease increase window by 1 per RTT decrease “window” by factor of 2 on loss event TCP Fairness and AIMD TCP connection 1 bottleneck router capacity R TCP connection 2

19 Advanced Networks Why is TCP fair? Two competing sessions: Additive increase gives slope of 1, as throughout increases multiplicative decrease decreases throughput proportionally R R equal bandwidth share Connection 1 throughput Connection 2 throughput congestion avoidance: additive increase loss: decrease window by factor of 2 congestion avoidance: additive increase loss: decrease window by factor of 2

20 Advanced Networks Macroscopic Description of Throughput Assume window toggling: W/2 to W High rate: W * MSS / RTT Low rate: W * MSS / 2 RTT Rate increase is linearly between two extremes Average throughput: 0.75 * W * MSS / RTT0.75 * W * MSS / RTT

21 Advanced Networks TCP: reliable data transfer Simplified sender, assuming wait for event wait for event event: data received from application above event: timer timeout for segment with seq # y event: ACK received, with ACK # y create, send segment retransmit segment ACK processing one way data transfer no flow, congestion control

22 Advanced Networks TCP sender 00 sendbase = initial_sequence number 01 nextseqnum = initial_sequence number loop (forever) { 04 switch(event) 05 event: data received from application above 06 create TCP segment with sequence number nextseqnum 07 start timer for segment nextseqnum 08 pass segment to IP 09 nextseqnum = nextseqnum + length(data) 10 event: timer timeout for segment with sequence number y 11 retransmit segment with sequence number y 12 compute new timeout interval for segment y 13 restart timer for sequence number y 14 event: ACK received, with ACK field value of y 15 if (y > sendbase) { /* cumulative ACK of all data up to y */ 16 cancel all timers for segments with sequence numbers < y 17 sendbase = y 18 } 19 else { /* a duplicate ACK for already ACKed segment */ 20 increment number of duplicate ACKs received for y 21 if (number of duplicate ACKS received for y == 3) { 22 /* TCP fast retransmit */ 23 resend segment with sequence number y 24 restart timer for segment y 25 } 26 } /* end of loop forever */ Simplified TCP sender

23 Advanced Networks TCP Receiver: ACK generation [RFC 1122, RFC 2581] Event in-order segment arrival, no gaps, everything else already ACKed in-order segment arrival, no gaps, one delayed ACK pending out-of-order segment arrival higher-than-expect seq. # gap detected arrival of segment that partially or completely fills gap TCP Receiver action delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK immediately send single cumulative ACK send duplicate ACK, indicating seq. # of next expected byte immediate ACK if segment starts at lower end of gap

24 Advanced Networks TCP: retransmission scenarios Host A Seq=92, 8 bytes data ACK=100 loss timeout time lost ACK scenario Host B X Seq=92, 8 bytes data ACK=100 Host A Seq=100, 20 bytes data ACK=100 Seq=92 timeout time premature timeout, cumulative ACKs Host B Seq=92, 8 bytes data ACK=120 Seq=92, 8 bytes data Seq=100 timeout ACK=120

25 Advanced Networks TCP Round Trip Time and Timeout Q: how to set TCP timeout value? longer than RTT note: RTT will varynote: RTT will vary too short: premature timeout unnecessary retransmissionsunnecessary retransmissions too long: slow reaction to segment loss Q: how to estimate RTT? SampleRTT : measured time from segment transmission until ACK receipt ignore retransmissions, cumulatively ACKed segments SampleRTT will vary, want estimated RTT “smoother” use several recent measurements, not just current SampleRTT

26 Advanced Networks TCP Round Trip Time and Timeout EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT Exponential weighted moving average influence of given sample decreases exponentially fast typical value of x: 0.1 Setting the timeout EstimtedRTT plus “safety margin” large variation in EstimatedRTT -> larger safety margin Timeout = EstimatedRTT + 4*Deviation Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|

27 Advanced Networks A problem When there are retransmissions, it is unclear if the ACK is for the original transmission or for a retransmission. How do we overcome this ?

28 Advanced Networks The Karn Patridge Algorithm Take SampleRTT measurements only for segments that have been sent once ! This eliminates the possibility that wrong RTT estimates are factored into the estimation. Another change -- Each time TCP retransmits, it sets the next timeout to 2 X Last timeout --> This is called the Exponential Back-off (primarily for avoiding congestion).

29 Advanced Networks Jacobson Karels Algorithm An issue with the Karn/Patridge scheme is that it does not take into account the variation between RTT samples. New method proposed -- the Jacobson Karels Algorithm. Estimated RTT = Estimated RTT +  X Difference Difference = Sample RTT - Estimated RTT Difference = Sample RTT - Estimated RTT Deviation = Deviation +  (|Difference| - deviation) Timeout =  Estimated RTT +  deviation. The values of  and  are computed based on experience -- Typically  = 1 and  = 4.

30 Advanced Networks Silly Window Syndrome Suppose a MSS worth of data is collected and advertised window is MSS/2. What should the sender do ? -- transmit half full segments or wait to send a full MSS when window opens ? Early implementations were aggressive -- transmit MSS/2. Aggressively doing this, would consistently result in small segment sizes -- called the Silly Window Syndrome.

31 Advanced Networks Issues.. We cannot eliminate the possibility of small segments being sent. However, we can introduce methods to coalesce small chunks. Delaying ACKs -- receiver does not send ACKs as soon as it receives segments. Delaying ACKs -- receiver does not send ACKs as soon as it receives segments.  How long to delay ? Not very clear. Ultimate solution falls to the sender -- when should I transmit ?Ultimate solution falls to the sender -- when should I transmit ?

32 Advanced Networks Nagle’s Algorithm If sender waits too long --> bad for interactive connections. If it does not wait long enough -- silly window syndrome. How do we solve this? Timer -- clock based If both available data and Window ≥ MSS, send full segment. If both available data and Window ≥ MSS, send full segment. Else, if there is unACKed data in flight, buffer new data until ACK returns.Else, if there is unACKed data in flight, buffer new data until ACK returns. Else, send new data now.Else, send new data now. Note -- Socket interface allows some applications to turn off Nagle’s algorithm by setting the TCP-NODELAY option.

33 Advanced Networks TCP Connection Management Recall: TCP sender, receiver establish “connection” before exchanging data segments initialize TCP variables: seq. #sseq. #s buffers, flow control info (e.g. RcvWindow )buffers, flow control info (e.g. RcvWindow ) client: connection initiator Socket clientSocket = new Socket("hostname","port number"); Socket clientSocket = new Socket("hostname","port number"); server: contacted by client Socket connectionSocket = welcomeSocket.accept(); Socket connectionSocket = welcomeSocket.accept();

34 Advanced Networks TCP Set-up Three way handshake: Step 1: client end system sends TCP SYN control segment to server specifies initial seq #specifies initial seq # Step 2: server end system receives SYN, replies with SYNACK control segment ACKs received SYNACKs received SYN allocates buffersallocates buffers specifies server-> receiver initial seq. #specifies server-> receiver initial seq. # Step 3: Client replies with an ACK (using servers seq number)

35 Advanced Networks TCP Connection Management (cont.) Closing a connection: client closes socket: clientSocket.close(); Step 1: client end system sends TCP FIN control segment to server Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN. Last ACK is never ACK-ed!! client FIN server ACK FIN close closed timed wait

36 Advanced Networks TCP Connection Management (cont.) Step 3: client receives FIN, replies with ACK. Enters “timed wait” - will respond with ACK to received FINs Step 4: server, receives ACK. Connection closed. Sends FIN. Last ACK is never ACK-ed client FIN server ACK FIN closing closed timed wait closed

37 Advanced Networks TCP Connection Management (cont) TCP client lifecycle TCP server lifecycle


Download ppt "Advanced Networks 20021 Transport Layer Michalis Faloutsos Many slides from Kurose-Ross."

Similar presentations


Ads by Google