Advanced Networks 2002: Transport Layer. Michalis Faloutsos. Many slides from Kurose-Ross.


2 Advanced Networks 2002: Transport Layer. Michalis Faloutsos. Many slides from Kurose-Ross.

3 Transport Layer Functionality
  - Hide network from application layer
  - Transport layer resides at end points
  - Sees the network as a black box

4 Transport Layers of the Internet
  TCP: reliable protocol
    - Guarantees end-to-end delivery
    - Self-controls rate: congestion and flow control
    - Connection oriented: handshake, state
    - Ordered delivery of packets to application
  UDP: unreliable protocol
    - Non-regulated sending rate
    - Multiplexing-demultiplexing

5 TCP overview

6 TCP: What and How
  For more, see RFCs 793, 1122, 1323, 2018, 2581
  - point-to-point: one sender, one receiver
  - reliable, in-order byte stream: no “message boundaries”
  - pipelined: TCP congestion and flow control set window size
  - send & receive buffers
  - full duplex data:
    - bi-directional data flow in same connection
    - MSS: maximum segment size
  - connection-oriented:
    - handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
  - flow controlled:
    - sender will not overwhelm receiver

7 TCP segment structure
  Header layout (rows are 32 bits wide):
    - source port #, dest port #
    - sequence number
    - acknowledgement number
    - head len, not used, flags (U A P R S F), rcvr window size
    - checksum, ptr urgent data
    - Options (variable length)
    - application data (variable length)
  Notes:
    - sequence and acknowledgement numbers: counting by bytes of data (not segments!)
    - URG: urgent data (generally not used)
    - ACK: ACK # valid
    - PSH: push data now (generally not used)
    - RST, SYN, FIN: connection estab (setup, teardown commands)
    - rcvr window size: # bytes rcvr willing to accept
    - checksum: Internet checksum (as in UDP)
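To make the field layout concrete, here is a minimal Python sketch that decodes the fixed 20-byte part of a TCP header with the standard `struct` module. The function name `parse_tcp_header` and the hand-built `example` segment are illustrative assumptions, not part of the slides.

```python
import struct

def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options are skipped)."""
    (src_port, dst_port, seq, ack,
     offset_flags, window, checksum, urgent_ptr) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    header_len = (offset_flags >> 12) * 4  # head len is counted in 32-bit words
    flag_bits = offset_flags & 0x3F        # low six bits hold the flags
    names = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]  # bit 0 up to bit 5
    return {
        "src_port": src_port, "dst_port": dst_port,
        "seq": seq, "ack": ack,
        "header_len": header_len,
        "flags": [n for i, n in enumerate(names) if flag_bits & (1 << i)],
        "window": window, "checksum": checksum, "urgent_ptr": urgent_ptr,
        "payload": segment[header_len:],
    }

# Hypothetical segment: SYN+ACK from port 80 to port 12345, no options, no data.
example = struct.pack("!HHIIHHHH", 80, 12345, 79, 43, (5 << 12) | 0x12, 4096, 0, 0)
print(parse_tcp_header(example))
```

The checksum is left at zero in the example, so this only illustrates the layout, not a segment a real stack would accept.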

8 TCP overview
  - TCP is a sliding window protocol
  - Sender can have (Window) bytes in flight
  - Operates with cumulative ACKs
  - It includes control for the sending rate
    - Flow control: receiver-set sending rate
    - Congestion control: network-aware sending rate (Congwin)

9 TCP seq. #’s and ACKs
  Seq. #’s:
    - byte stream “number” of first byte in segment’s data
  ACKs:
    - seq # of next byte expected from other side
    - cumulative ACK
  Q: how does the receiver handle out-of-order segments?
  A: TCP spec doesn’t say; it is up to the implementor
  Simple telnet scenario (Host A, Host B, time runs downward):
    1. User types ‘C’: A sends Seq=42, ACK=79, data = ‘C’
    2. Host B ACKs receipt of ‘C’ and echoes back ‘C’: Seq=79, ACK=43, data = ‘C’
    3. Host A ACKs receipt of echoed ‘C’: Seq=43, ACK=80

10 TCP in a nutshell
  I. Slow start phase (actually this is fast increase)
    - Start with a window of 1 (or 2)
    - Successful ACK: increase window by 1 max-size segment
    - Do this up to a threshold: ssthresh
  II. Congestion control phase
    - Increase window by 1 max-size segment every RTT
    - Drop window in half if there is congestion
      - Packet loss: duplicate ACKs
      - Timer expiration

11 TCP Congestion Control
  - end-end control (no network assistance)
  - transmission rate limited by congestion window size, Congwin, over segments
  - w segments, each with MSS bytes, sent in one RTT:
      throughput = (w * MSS) / RTT bytes/sec

12 TCP congestion control: Intuition
  TCP is “probing” for usable bandwidth:
    - ideally: transmit as fast as possible (Congwin as large as possible) without loss
    - increase Congwin until loss (congestion)
    - loss: decrease Congwin, then begin probing (increasing) again

13 TCP congestion control
  TCP has two “phases”:
    - slow start: start from small, increase quickly
    - congestion avoidance: Additive Increase Multiplicative Decrease
  Important variables:
    - Congwin
    - threshold: defines the boundary between the two phases (slow start phase, congestion control phase)

14 TCP Slowstart
  Slowstart algorithm:
    initialize: Congwin = 1
    for (each segment ACKed)
        Congwin++
    until (loss event OR CongWin > threshold)
  - exponential increase (per RTT) in window size
  - loss event: timeout (Tahoe TCP) and/or three duplicate ACKs (Reno TCP)
  [Figure: Host A sends one segment to Host B; one RTT later two segments; then four segments]
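The per-ACK increment above doubles the window once per RTT. A small Python sketch of that growth, assuming no loss and that every segment in a window is ACKed within one RTT (the function name `slow_start_windows` is an illustrative choice):

```python
def slow_start_windows(threshold, initial=1):
    """Congwin (in segments) at the start of each RTT during slow start."""
    windows = []
    congwin = initial
    while congwin <= threshold:
        windows.append(congwin)
        # Each of the congwin segments is ACKed during this RTT, and every
        # ACK adds one segment, so the window doubles.
        congwin += congwin
    windows.append(congwin)  # first window above the threshold: slow start ends
    return windows

print(slow_start_windows(threshold=8))  # [1, 2, 4, 8, 16]
```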

15 Why Call it Slow Start?
  - The original version of TCP suggested that the sender transmit as much as the Advertised Window permitted.
  - Routers may not be able to cope with this “burst” of transmissions.
  - Slow start is slower than the above version: it ensures that a transmission burst does not happen at once.

16 TCP Congestion Avoidance
  Congestion avoidance:
    /* slowstart is over */
    /* Congwin > threshold */
    Until (loss event) {
        every w segments ACKed:
            Congwin++
    }
    threshold = Congwin/2
    Congwin = 1
    perform slowstart (1)
  (1) TCP Reno skips slowstart (fast recovery) after three duplicate ACKs

17 TCP Congestion: Real Life is Hairy!
  Remember: bytes vs packets!
    CW += MSS * MSS/CW
    Thres = Max(2 * MSS, InFlightData/2)
    - MSS: max segment size
    - InFlightData: un-ACK-ed data
  Congestion avoidance, counted in segments:
    /* slowstart is over */
    /* Congwin > threshold */
    Until (loss event) {
        every w segments ACKed:
            Congwin++
    }
    threshold = Congwin/2
    Congwin = 1
    perform slowstart (1)
  RFC 2581: TCP Congestion Control
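The byte-based updates can be sketched in Python as follows. This is a simplified Tahoe-style model under stated assumptions: the class name `TahoeWindow`, the default values, and the choice to stay in slow start while `congwin <= threshold` are illustrative, and every loss is treated the same way (back to one MSS).

```python
class TahoeWindow:
    """Congestion window tracked in bytes (illustrative sketch, not a full TCP)."""

    def __init__(self, mss=1460, threshold=64 * 1024):
        self.mss = mss
        self.congwin = mss          # start with one segment
        self.threshold = threshold

    def on_ack(self):
        if self.congwin <= self.threshold:
            # Slow start: one MSS per ACK, i.e. doubling per RTT.
            self.congwin += self.mss
        else:
            # Congestion avoidance: CW += MSS * MSS / CW, about one MSS per RTT.
            self.congwin += self.mss * self.mss / self.congwin

    def on_loss(self, in_flight):
        # Thres = Max(2 * MSS, InFlightData / 2), then restart from one segment.
        self.threshold = max(2 * self.mss, in_flight / 2)
        self.congwin = self.mss

w = TahoeWindow(mss=1000, threshold=4000)
for _ in range(5):
    w.on_ack()
print(w.congwin)  # four slow-start steps, then one congestion-avoidance step
```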

18 TCP Fairness and AIMD
  Fairness goal: if N TCP sessions share the same bottleneck link, each should get 1/N of link capacity
  TCP congestion avoidance is AIMD: additive increase, multiplicative decrease
    - increase window by 1 per RTT
    - decrease “window” by factor of 2 on loss event
  [Figure: TCP connection 1 and TCP connection 2 share a bottleneck router of capacity R]

19 Why is TCP fair?
  Two competing sessions:
    - Additive increase gives slope of 1, as throughput increases
    - Multiplicative decrease decreases throughput proportionally
  [Figure: Connection 2 throughput versus Connection 1 throughput, both axes up to R, with the equal bandwidth share line; the trajectory alternates between "congestion avoidance: additive increase" and "loss: decrease window by factor of 2"]
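The convergence argument can be checked with a toy simulation. The sketch below assumes two flows with equal RTTs that both detect loss in the same round whenever their combined rate exceeds the bottleneck capacity; the function name `aimd_two_flows` and the starting rates are illustrative.

```python
def aimd_two_flows(x1, x2, capacity, rounds):
    """Return the two flows' rates after the given number of RTT rounds."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            # Loss: both flows halve, which also halves the gap between them.
            x1, x2 = x1 / 2, x2 / 2
        else:
            # Additive increase: both grow by 1, so the gap stays the same.
            x1, x2 = x1 + 1, x2 + 1
    return x1, x2

a, b = aimd_two_flows(1.0, 60.0, capacity=100.0, rounds=500)
print(a, b, abs(a - b))
```

Because the gap only shrinks on loss events and never grows, the two rates drift toward the equal-share line.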

20 Macroscopic Description of Throughput
  - Assume window toggling: W/2 to W
  - High rate: W * MSS / RTT
  - Low rate: W * MSS / (2 * RTT)
  - Rate increases linearly between the two extremes
  - Average throughput: 0.75 * W * MSS / RTT
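As a quick numerical check of the 0.75 factor, the following sketch averages the window over one sawtooth cycle, assuming the window steps by one segment per RTT from W/2 up to W (the function name and the sample numbers are illustrative):

```python
def sawtooth_average_rate(w_max, mss, rtt):
    """Average rate (bytes/sec) over one cycle from W/2 to W, W even."""
    windows = range(w_max // 2, w_max + 1)      # one window value per RTT
    avg_window = sum(windows) / len(windows)    # equals 0.75 * W
    return avg_window * mss / rtt

# W = 20 segments, MSS = 1000 bytes, RTT = 100 ms
print(sawtooth_average_rate(20, 1000, 0.1))
```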

21 TCP: reliable data transfer
  Simplified sender, assuming:
    - one way data transfer
    - no flow, congestion control
  Sender state machine (single state: wait for event):
    - event: data received from application above -> create, send segment
    - event: timer timeout for segment with seq # y -> retransmit segment
    - event: ACK received, with ACK # y -> ACK processing

22 TCP sender
  Simplified TCP sender:
    sendbase = initial_sequence_number
    nextseqnum = initial_sequence_number

    loop (forever) {
      switch (event)

      event: data received from application above
        create TCP segment with sequence number nextseqnum
        start timer for segment nextseqnum
        pass segment to IP
        nextseqnum = nextseqnum + length(data)

      event: timer timeout for segment with sequence number y
        retransmit segment with sequence number y
        compute new timeout interval for segment y
        restart timer for sequence number y

      event: ACK received, with ACK field value of y
        if (y > sendbase) { /* cumulative ACK of all data up to y */
          cancel all timers for segments with sequence numbers < y
          sendbase = y
        }
        else { /* a duplicate ACK for already ACKed segment */
          increment number of duplicate ACKs received for y
          if (number of duplicate ACKs received for y == 3) {
            /* TCP fast retransmit */
            resend segment with sequence number y
            restart timer for segment y
          }
        }
    } /* end of loop forever */

23 TCP Receiver: ACK generation [RFC 1122, RFC 2581]
  Event -> TCP Receiver action
    - in-order segment arrival, no gaps, everything else already ACKed
        -> delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
    - in-order segment arrival, no gaps, one delayed ACK pending
        -> immediately send single cumulative ACK
    - out-of-order segment arrival, higher-than-expected seq. #, gap detected
        -> send duplicate ACK, indicating seq. # of next expected byte
    - arrival of segment that partially or completely fills gap
        -> immediate ACK if segment starts at lower end of gap

24 TCP: retransmission scenarios
  Lost ACK scenario (Host A, Host B):
    1. A sends Seq=92, 8 bytes data
    2. B replies ACK=100; the ACK is lost (X)
    3. A's timeout expires; A resends Seq=92, 8 bytes data
    4. B replies ACK=100
  Premature timeout, cumulative ACKs (Host A, Host B):
    1. A sends Seq=92, 8 bytes data, then Seq=100, 20 bytes data
    2. B replies ACK=100, then ACK=120
    3. The Seq=92 timeout expires before the ACK arrives; A resends Seq=92, 8 bytes data
    4. B replies with the cumulative ACK=120

25 TCP Round Trip Time and Timeout
  Q: how to set TCP timeout value?
    - longer than RTT
      - note: RTT will vary
    - too short: premature timeout
      - unnecessary retransmissions
    - too long: slow reaction to segment loss
  Q: how to estimate RTT?
    - SampleRTT: measured time from segment transmission until ACK receipt
      - ignore retransmissions, cumulatively ACKed segments
    - SampleRTT will vary, want estimated RTT “smoother”
      - use several recent measurements, not just current SampleRTT

26 TCP Round Trip Time and Timeout
  EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
    - Exponential weighted moving average
    - influence of given sample decreases exponentially fast
    - typical value of x: 0.1
  Setting the timeout:
    - EstimatedRTT plus “safety margin”
    - large variation in EstimatedRTT -> larger safety margin
    Timeout = EstimatedRTT + 4*Deviation
    Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|
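A minimal Python sketch of these two moving averages follows. The class name `RttEstimator`, the initialization from the first sample with zero deviation, and the choice to update EstimatedRTT before Deviation are assumptions made for illustration; the slide does not fix them.

```python
class RttEstimator:
    """EWMA-based timeout estimator following the formulas above."""

    def __init__(self, first_sample, x=0.1):
        self.x = x
        self.estimated_rtt = first_sample  # assumption: seed with first sample
        self.deviation = 0.0               # assumption: start with no deviation

    def update(self, sample_rtt):
        x = self.x
        # EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
        self.estimated_rtt = (1 - x) * self.estimated_rtt + x * sample_rtt
        # Deviation = (1-x)*Deviation + x*|SampleRTT - EstimatedRTT|
        self.deviation = ((1 - x) * self.deviation
                          + x * abs(sample_rtt - self.estimated_rtt))

    @property
    def timeout(self):
        # Timeout = EstimatedRTT + 4*Deviation
        return self.estimated_rtt + 4 * self.deviation

est = RttEstimator(first_sample=0.100)
est.update(0.200)   # one slow sample raises both the estimate and the margin
print(est.estimated_rtt, est.deviation, est.timeout)
```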

27 TCP Connection Management
  Recall: TCP sender, receiver establish “connection” before exchanging data segments
  - initialize TCP variables:
    - seq. #s
    - buffers, flow control info (e.g. RcvWindow)
  - client: connection initiator
      Socket clientSocket = new Socket("hostname","port number");
  - server: contacted by client
      Socket connectionSocket = welcomeSocket.accept();
  Three way handshake:
    Step 1: client end system sends TCP SYN control segment to server
      - specifies initial seq #
    Step 2: server end system receives SYN, replies with SYNACK control segment
      - ACKs received SYN
      - allocates buffers
      - specifies server-> receiver initial seq. #
    Step 3: client replies with an ACK (using the server's seq number)

28 TCP Connection Management (cont.)
  Closing a connection:
    client closes socket: clientSocket.close();
    Step 1: client end system sends TCP FIN control segment to server
    Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
  Note: the last ACK is never ACK-ed!
  [Figure: client sends FIN (close); server replies with ACK, then FIN (close); client enters timed wait, then closed]

29 TCP Connection Management (cont.)
    Step 3: client receives FIN, replies with ACK.
      - Enters “timed wait”: will respond with ACK to received FINs
    Step 4: server receives ACK. Connection closed.
  Note: the last ACK is never ACK-ed
  [Figure: client sends FIN (closing); server replies with ACK, then FIN (closing); client enters timed wait, then closed; server closed]

30 TCP Connection Management (cont.)
  [Figures: TCP client lifecycle; TCP server lifecycle]

31 Principles of Congestion Control
  Congestion:
    - informally: “too many sources sending too much data too fast for network to handle”
    - different from flow control!
    - manifestations:
      - lost packets (buffer overflow at routers)
      - long delays (queueing in router buffers)
    - Major research issue

32 Consequences of Congestion
  - Large delays: throughput vs delay trade-off
    - We don’t want to operate near capacity
  - Finite buffers: lost packets
  - Resending of packets causes:
    - More packets for the same goodput
    - Wasted bandwidth of the packet that gets dropped

33 Causes/costs of congestion: scenario 1
  - two senders, two receivers
  - one router, infinite buffers
  - no retransmission
  Costs:
    - large delays when congested
    - maximum achievable throughput

34 Causes/costs of congestion: scenario 2
  - one router, finite buffers
  - sender retransmission of lost packet

35 Causes/costs of congestion: scenario 2
  - Always: λ_in = λ_out (goodput)
  - If packets are dropped: λ'_in > λ_out

36 Causes/costs of congestion: scenario 3
  - Four senders, multihop paths, timeout/retransmit
  - Congestion in one link -> retransmits -> congestion in other links

37 Causes/costs of congestion: scenario 3
  Another “cost” of congestion: when a packet is dropped, any “upstream” transmission capacity used for that packet was wasted!

38 Approaches towards congestion control
  Two broad approaches towards congestion control:
  End-end congestion control:
    - no explicit feedback from network
    - congestion inferred from end-system observed loss, delay
    - approach taken by TCP
  Network-assisted congestion control:
    - routers provide feedback to end systems
      - single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)
      - explicit rate sender should send at

39 Case study: ATM ABR congestion control
  ABR: available bit rate:
    - “elastic service”
    - if sender’s path “underloaded”:
      - sender should use available bandwidth
    - if sender’s path congested:
      - sender throttled to minimum guaranteed rate
  RM (resource management) cells:
    - sent by sender, interspersed with data cells
    - bits in RM cell set by switches (“network-assisted”)
      - NI bit: no increase in rate (mild congestion)
      - CI bit: congestion indication
    - RM cells returned to sender by receiver, with bits intact

40 Case study: ATM ABR congestion control
  - two-byte ER (explicit rate) field in RM cell
    - congested switch may lower ER value in cell
    - sender’s send rate is thus the minimum supportable rate on the path
  - EFCI bit in data cells: set to 1 in congested switch
    - if data cell preceding RM cell has EFCI set, receiver sets CI bit in returned RM cell

41 TCP latency modeling
  Q: How long does it take to receive an object from a Web server after sending a request?
    - TCP connection establishment
    - data transfer delay
  Notation, assumptions:
    - Assume one link between client and server of rate R
    - Assume: fixed congestion window, W segments
    - S: MSS (bits)
    - O: object size (bits)
    - no retransmissions (no loss, no corruption)
  Two cases to consider:
    - WS/R > RTT + S/R: ACK for first segment in window returns before window’s worth of data sent
    - WS/R < RTT + S/R: wait for ACK after sending window’s worth of data

42 TCP latency Modeling
  Case 1: latency = 2RTT + O/R
  Case 2: latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R]
    - K = O/WS, the number of windows that cover the object
    - the bracketed term is the stall after each window (the green lag in the figure)
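Both cases can be folded into one small function. This sketch assumes K is rounded up to a whole number of windows and treats the boundary case (stall of exactly zero) as Case 1; the function name and the sample numbers are illustrative.

```python
import math

def static_window_latency(O, S, R, W, rtt):
    """Latency (seconds) for an object of O bits with a fixed window of W segments.

    S is the segment size in bits, R the link rate in bits/sec.
    """
    base = 2 * rtt + O / R
    stall = S / R + rtt - W * S / R   # idle time after each window, if positive
    if stall <= 0:
        return base                   # Case 1: ACKs return before the window drains
    K = math.ceil(O / (W * S))        # number of windows that cover the object
    return base + (K - 1) * stall     # Case 2: the server stalls K-1 times

# 100 kbit object, 10 kbit segments, 1 Mbit/s link, RTT = 100 ms
print(static_window_latency(100_000, 10_000, 1_000_000, W=2, rtt=0.1))
print(static_window_latency(100_000, 10_000, 1_000_000, W=20, rtt=0.1))
```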

43 TCP Latency Modeling: Slow Start
  Now suppose the window grows according to slow start.
  Will show that the latency of one object of size O is:
    [equation not preserved in this transcript]
  where P is the number of times TCP stalls at the server:
    [equation not preserved in this transcript]
    - Q is the number of times the server would stall if the object were of infinite size
    - K is the number of windows that cover the object

44 TCP Latency Modeling: Slow Start (cont.)
  Example:
    - O/S = 15 segments
    - K = 4 windows
    - Q = 2
    - P = min{K-1,Q} = 2
  Server stalls P=2 times.
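The example's numbers can be reproduced with a short Python sketch. It assumes the closed forms from the Kurose-Ross textbook treatment of this model, which these slides do not reproduce in text: K = ceil(log2(O/S + 1)), Q = floor(log2(1 + RTT/(S/R))) + 1, and latency = 2RTT + O/R + P(RTT + S/R) - (2^P - 1)S/R. The link rate, segment size, and RTT below are hypothetical values chosen so that Q = 2.

```python
import math

def slow_start_latency(O, S, R, rtt):
    """Return (K, Q, P, latency) for an object of O bits sent under slow start."""
    K = math.ceil(math.log2(O / S + 1))               # windows covering the object
    Q = math.floor(math.log2(1 + rtt / (S / R))) + 1  # stalls for an infinite object
    P = min(K - 1, Q)                                 # stalls that actually happen
    latency = 2 * rtt + O / R + P * (rtt + S / R) - (2 ** P - 1) * S / R
    return K, Q, P, latency

# O/S = 15 segments; S/R = 1 ms and RTT = 2 ms are assumed values
print(slow_start_latency(O=15_000, S=1_000, R=1_000_000, rtt=0.002))
```

With these values the windows are 1, 2, 4, and 8 segments, and the server idles after the first two windows only, matching P = 2.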

45 TCP Latency Modeling: Slow Start (cont.)
  [Slide content not preserved in this transcript]

46 Current TCP Versions
  - TCP specs can be implemented in different ways
  - TCP versions:
    - Tahoe (basic as described)
    - Reno
    - Vegas

47 TCP Reno
  - Most popular TCP implementation
  - Fast retransmit on 3 duplicate ACKs
  - Fast recovery: cancel slow start after fast retransmission
    - Set the threshold to half the congestion window, then start with the congestion window equal to the threshold
    - Go to congestion avoidance phase
  - Optimistic rationale:
    - I hope there was only one packet lost
    - Since I resent it, I hope it arrives this time

48 TCP Vegas
  Idea: infer problems from RTT delay
    - Reduce rate before you have loss
  What is a “sign” of congestion:
    - When RTT increases above a threshold
    - Sending rate flattens
  Decrease sending rate linearly
  Issues:
    - Estimate RTT
    - Set appropriate threshold

49 Intuition: Driving on Ice
  [Figure: three plots against time (seconds): congestion window (KB), average send rate at source, average queue length in router]

50 TCP Vegas Details
  - Value of throughput with no congestion is compared to current throughput
  - If current difference is small, increase window size linearly
  - If current difference is large, decrease window size linearly
  - The change in the Slow Start mechanism consists of doubling the window every other RTT, rather than every RTT, and of using a boundary in the difference between throughputs to exit the Slow Start phase, rather than a window size value.

51 The TCP Vegas: Algorithm
  - Let BaseRTT be the minimum of all measured RTTs (commonly the RTT of the first packet)
  - If not overflowing the connection, then
      ExpectedRate = CongestionWindow / BaseRTT
  - Source calculates current sending rate (ActualRate) once per RTT
  - Source compares ActualRate with ExpectedRate
      Diff = ExpectedRate – ActualRate
      if Diff < α
          --> increase CongestionWindow linearly
      else if Diff > β
          --> decrease CongestionWindow linearly
      else
          --> leave CongestionWindow unchanged

52 Vegas Parameters
  Parameters:
    - α: 1 packet
    - β: 3 packets
  Even faster retransmit:
    - keep fine-grained timestamps for each packet
    - check for timeout on first duplicate ACK
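The Vegas window adjustment can be sketched as below. Since the slides give the thresholds in packets while Diff is a rate, this sketch makes the assumption of converting Diff to packets by multiplying by BaseRTT before comparing; that conversion, the function name `vegas_update`, and the step of one packet per RTT are illustrative choices, not taken from the slides.

```python
def vegas_update(cwnd, base_rtt, actual_rate, alpha=1, beta=3):
    """Return the new congestion window (packets) after one RTT.

    actual_rate is in packets/sec, base_rtt in seconds.
    """
    expected_rate = cwnd / base_rtt          # ExpectedRate = CongestionWindow / BaseRTT
    diff = expected_rate - actual_rate       # Diff = ExpectedRate - ActualRate
    extra_packets = diff * base_rtt          # assumption: compare in packets
    if extra_packets < alpha:
        return cwnd + 1                      # linear increase
    if extra_packets > beta:
        return cwnd - 1                      # linear decrease
    return cwnd                              # leave unchanged

print(vegas_update(cwnd=10, base_rtt=0.1, actual_rate=95))  # small Diff
print(vegas_update(cwnd=10, base_rtt=0.1, actual_rate=80))  # Diff between thresholds
print(vegas_update(cwnd=10, base_rtt=0.1, actual_rate=60))  # large Diff
```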

53 Example TCP Vegas
  [Figure: expected throughput and actual throughput]

54 Router Assisted Congestion Control
  - Random Early Detection
  - Explicit Congestion Notification
  Note: often this is referred to as Active Networking, i.e. routers are involved in performance.
    - Active Nets is a much more general idea

55 RED: Random Early Detection
  Idea: routers start dropping packets before they are congested
  Benefits: make behavior smoother
  How:
    - When queue is above a threshold (thres-1): drop packets with probability p
  Issues:
    - setting the parameters
    - Estimating the queue size

56 Thresholds
  Two queue length thresholds:
    - if AvgLen ≤ MinThreshold then
        enqueue the packet
    - if MinThreshold < AvgLen < MaxThreshold then
        calculate probability P
        drop arriving packet with probability P
    - if MaxThreshold ≤ AvgLen then
        drop arriving packet

57 RED: probability P
  - Not fixed
  - Function of AvgLen and how long since last drop (count)
  - count keeps track of new packets that have been queued while AvgLen has been between the two thresholds
      TempP = MaxP * (AvgLen - MinThreshold) / (MaxThreshold - MinThreshold)
      P = TempP / (1 - count * TempP)
  - MaxP is often set to 0.02, meaning that the gateway drops 1 out of 50 packets when queue size is halfway between MinThreshold and MaxThreshold
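The threshold rule and the two formulas combine into one small function. In this sketch the function name `red_drop_probability` is illustrative, and the guard that returns 1.0 when `count * TempP` reaches 1 is an added assumption to avoid dividing by zero.

```python
def red_drop_probability(avg_len, min_th, max_th, max_p=0.02, count=0):
    """Drop probability for an arriving packet under RED."""
    if avg_len <= min_th:
        return 0.0                      # below MinThreshold: always enqueue
    if avg_len >= max_th:
        return 1.0                      # above MaxThreshold: always drop
    temp_p = max_p * (avg_len - min_th) / (max_th - min_th)
    if count * temp_p >= 1:
        return 1.0                      # added guard (assumption), avoids division by zero
    return temp_p / (1 - count * temp_p)

# Halfway between the thresholds, right after a drop (count = 0)
print(red_drop_probability(avg_len=15, min_th=10, max_th=20))
# Same queue length after 50 packets were queued without a drop
print(red_drop_probability(avg_len=15, min_th=10, max_th=20, count=50))
```

The probability grows with count, which spreads drops out over time instead of clustering them.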

58 Comments on RED
  - Probability of dropping a particular flow's packet(s) is roughly proportional to the share of the bandwidth that flow is currently getting
  - MaxP is typically set to 0.02, meaning that when the average queue size is halfway between the two thresholds, the gateway drops roughly one out of 50 packets.

59 RED: Dropping probability
  [Figure: P(drop) versus AvgLen; vertical axis marks MaxP and 1.0, horizontal axis marks MinThresh and MaxThresh]

60 Selecting Parameters
  - If traffic is bursty, then MinThreshold should be sufficiently large to allow link utilization to be maintained at an acceptably high level
  - The difference between the two thresholds should be larger than the typical increase in the calculated average queue length in one RTT; setting MaxThreshold to twice MinThreshold is reasonable for traffic on today's Internet

61 Explicit Congestion Notification
  Dropping packets = Warn of congestion
  Idea: mark packets to notify congestion
  How:
    - Congested router marks packet (sets a bit)
    - Receiver “copies” bit in the ACK
    - Sender reduces its window
  Benefit: proactive without losing packets
  Problem: sender can ignore it

62 Current Beliefs
  - RED + ECN are considered to be good
  - RED alone has problems

63 Chapter 3: Summary
  Principles behind transport layer services:
    - multiplexing/demultiplexing
    - reliable data transfer
    - flow control
    - congestion control
  Instantiation and implementation in the Internet:
    - UDP
    - TCP
  Next: leaving the network “edge” (application, transport layer) into the network “core”

64 TCP Flow Control
  Flow control: sender won’t overrun receiver’s buffers by transmitting too much, too fast
  - receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
    - RcvWindow field in TCP segment
  - sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow
  Receiver buffering:
    - RcvBuffer = size of TCP Receive Buffer
    - RcvWindow = amount of spare room in Buffer
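The two sides of flow control can be sketched with simple byte counters. The counter names (`last_byte_rcvd`, `last_byte_read`, `last_byte_sent`, `last_byte_acked`) and function names are hypothetical bookkeeping variables introduced for this sketch; the slides only name RcvBuffer and RcvWindow.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room in the buffer, advertised as RcvWindow."""
    buffered = last_byte_rcvd - last_byte_read   # received but not yet read by the app
    return rcv_buffer - buffered

def bytes_sender_may_send(last_byte_sent, last_byte_acked, advertised_window):
    """Sender side: keep transmitted, unACKed data within RcvWindow."""
    in_flight = last_byte_sent - last_byte_acked
    return max(0, advertised_window - in_flight)

# Receiver has a 4096-byte buffer holding 1000 unread bytes.
window = rcv_window(rcv_buffer=4096, last_byte_rcvd=3000, last_byte_read=2000)
# Sender has 2000 bytes outstanding, so it may send the remainder of the window.
print(window, bytes_sender_may_send(5000, 3000, window))
```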

