1 NS-2 LAB: TCP/IP Congestion control 2005-2006 - Semester A Miriam Allalouf.

1 NS-2 LAB: TCP/IP Congestion control 2005-2006 - Semester A Miriam Allalouf

2 Networks Design Course Datagram Networks Datagram Networks Transport Layer - TCP Internet Congestion control

3 Networks Design Course Datagram Networks Network Layer Service Model The transport layer relies on the services of the network layer : communication services between hosts. Router’s role is to “switch” packets between input port to the output port. Which services can be expected from the Network Layer? Is it reliable? Can the transport layer count on the network layer to deliver the packets to the destination? When multiple packets are sent, will they be delivered to the transport layer in the receiving host in the same order they were sent? Amount time between two sends equal to the amount time between two receives? Will be any congestion indication of the network? What is the abstract view of the channel connecting the transport layer in the sending and the receiving end points?

4 Networks Design Course Datagram Networks Best-effort service is the only service model provided by Internet. r No guarantee for preserving timing between packets r No guarantee of order keeping r No guarantee of transmitted packets eventual delivery Network that delivered no packets to the destination would satisfy the definition of best-effort delivery service. Is Best-effort service equivalent to no service at all?

5 Networks Design Course Datagram Networks No service at all? Or What the advantages? r Minimal requirements on the network layer  easier to interconnect networks with very different link-layer technologies: satellite, Ethernet, fiber, or radio use different transmission rates and loss characteristics. r The Internet grew out of the need to connect computers together. r Sophisticated end-system devices implement any additional functionality and new application-level network services at a higher layer. applications such as e-mail, the Web, and even a network-layer-centric service such as the DNS are implemented in hosts (servers) at the edge of the network. Simple new services addition: by attaching a host to the network and defining a new higher-layer protocol (such as HTTP)  Fast adoption of WWW to the Internet.

6 TCP: Overview RFCs: 793, 1122, 1323, 2018, 2581 r full duplex data: m bi-directional data flow in same connection m MSS: maximum segment size r connection-oriented: m handshaking (exchange of control msgs) init’s sender, receiver state before data exchange r flow controlled: m sender will not overwhelm receiver r point-to-point: m one sender, one receiver r reliable, in-order byte stream: m no “message boundaries” r pipelined: m TCP congestion and flow control set window size

7 TCP segment structure source port # dest port # 32 bits application data (variable length) sequence number acknowledgement number rcvr window size ptr urgent data checksum F SR PAU head len not used Options (variable length) URG: urgent data (generally not used) ACK: ACK # valid PSH: push data now (generally not used) RST, SYN, FIN: connection estab (setup, teardown commands) # bytes rcvr willing to accept counting by bytes of data (not segments!) Internet checksum

8 TCP seq. #’s and ACKs Seq. #’s: m byte stream “number” of first byte in segment’s data ACKs: m seq # of next byte expected from other side m cumulative ACK Q: how receiver handles out- of-order segments m A: TCP spec doesn’t say, - up to implementor Host A Host B Seq=42, ACK=79, data = ‘C’ Seq=79, ACK=43, data = ‘C’ Seq=43, ACK=80 User types ‘C’ host ACKs receipt of echoed ‘C’ host ACKs receipt of ‘C’, echoes back ‘C’ time simple telnet scenario

9 At TCP Receiver: TCP ACK generation [RFC 1122, RFC 2581] Event in-order segment arrival, no gaps, everything else already ACKed in-order segment arrival, no gaps, one delayed ACK pending out-of-order segment arrival higher-than-expect seq. # gap detected arrival of segment that partially or completely fills gap TCP Receiver action delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK immediately send single cumulative ACK send duplicate ACK, indicating seq. # of next expected byte immediate ACK if segment starts at lower end of gap

10 TCP: retransmission scenarios Host A Seq=92, 8 bytes data ACK=100 loss timeout time lost ACK scenario Host B X Seq=92, 8 bytes data ACK=100 Host A Seq=100, 20 bytes data ACK=100 Seq=92 timeout time premature timeout, cumulative ACKs Host B Seq=92, 8 bytes data ACK=120 Seq=92, 8 bytes data Seq=100 timeout ACK=120

11 TCP Round Trip Time and Timeout Q: how to set TCP timeout value? r longer than RTT m note: RTT will vary r too short: premature timeout m unnecessary retransmissions r too long: slow reaction to segment loss Q: how to estimate RTT?  SampleRTT : measured time from segment transmission until ACK receipt m ignore retransmissions, cumulatively ACKed segments  SampleRTT will vary, want estimated RTT “smoother”  use several recent measurements, not just current SampleRTT

12 TCP Round Trip Time and Timeout EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT r Exponential weighted moving average r influence of given sample decreases exponentially fast r typical value of x: 0.125 (1/8) Setting the timeout  EstimtedRTT plus “safety margin”  large variation in EstimatedRTT -> larger safety margin Timeout = EstimatedRTT + 4*Deviation Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|

13 Principles of Congestion Control Congestion: r informally: “too many sources sending too much data too fast for network to handle” r different from flow control! r manifestations: m lost packets (buffer overflow at routers) m long delays (queueing in router buffers) r a top-10 problem!

14 Causes/costs of congestion: scenario 1 r two senders, two receivers r one router, infinite buffers r no retransmission r large delays when congested r maximum achievable throughput

15 Causes/costs of congestion: scenario 2 r one router, finite buffers r sender retransmission of lost packet

16 Causes/costs of congestion: scenario 2 r always: (goodput) r “perfect” retransmission only when loss: r retransmission of delayed (not lost) packet makes larger (than perfect case) for same in out = in out > in out “costs” of congestion: r more work (retrans) for given “goodput” r unneeded retransmissions: link carries multiple copies of pkt

17 Causes/costs of congestion: scenario 3 r four senders r multihop paths r timeout/retransmit in Q: what happens as and increase ? in

18 Causes/costs of congestion: scenario 3 Another “cost” of congestion: r when packet dropped, any “upstream transmission capacity used for that packet was wasted!

19 Approaches towards congestion control End-end congestion control: r no explicit feedback from network r congestion inferred from end-system observed loss, delay r approach taken by TCP Network-assisted congestion control: r routers provide feedback to end systems m single bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM) m explicit rate sender should send at Two broad approaches towards congestion control:

20 TCP Congestion Control r end-end control (no network assistance)  transmission rate limited by congestion window size, Congwin, over segments: r w segments, each with MSS bytes sent in one RTT: throughput = w * MSS RTT Bytes/sec Congwin

21 TCP Self-clocking r Adjust Congestion window  sending rate adjustment r Slow Acks (low link or high delay)  Slow cwnd Increase r Fast Acks  Fast cwnd Increase

22 TCP congestion control: r two “phases” m slow start m congestion avoidance r important variables:  Congwin based on the perceived level of congestion. The Host receives indications of internal congestion  threshold: defines threshold between two slow start phase, congestion control phase r “probing” for usable bandwidth:  ideally: transmit as fast as possible ( Congwin as large as possible) without loss  increase Congwin until loss (congestion)  loss: decrease Congwin, then begin probing (increasing) again

23 TCP Slowstart r exponential increase (per RTT) in window size (not so slow!) r loss event: timeout (Tahoe TCP) and/or or three duplicate ACKs (Reno TCP) initialize: Congwin = 1 for (each segment ACKed) Congwin++ until (loss event OR CongWin > threshold) Slowstart algorithm Host A one segment RTT Host B time two segments four segments

24 TCP Congestion Avoidance /* slowstart is over */ /* Congwin > threshold */ Until (loss event) { every w segments ACKed: Congwin++ } threshold = Congwin/2 Congwin = 1 perform slowstart Congestion avoidance 1 1: TCP Reno skips slowstart (fast recovery) after three duplicate ACKs Tahoe Reno

25 Additive Increase r Additive Increase is a reaction to perceived available capacity. r Linear Increase basic idea: For each “cwnd’s worth” of packets sent, increase cwnd by 1 packet. r In practice, cwnd is incremented fractionally for each arriving ACK. TCP congestion avoidance: r AIMD: additive increase, multiplicative decrease m increase window by 1 per RTT m decrease window by factor of 2 on loss event AIMD increment = MSS x (MSS /cwnd) cwnd = cwnd + increment

27 Fast Retransmit r Coarse timeouts remained a problem, and Fast retransmit was added with TCP Tahoe. r Since the receiver responds every time a packet arrives, this implies the sender will see duplicate ACKs. Basic Idea:: use duplicate ACKs to signal lost packet. Fast Retransmit Upon receipt of three duplicate ACKs, the TCP Sender retransmits the lost packet.

28 Fast Retransmit r Generally, fast retransmit eliminates about half the coarse-grain timeouts. r This yields roughly a 20% improvement in throughput. r Note – fast retransmit does not eliminate all the timeouts due to small window sizes at the source.

29 Fast Recovery r Fast recovery was added with TCP Reno. r Basic idea:: When fast retransmit detects three duplicate ACKs, start the recovery process from congestion avoidance region and use ACKs in the pipe to pace the sending of packets. Fast Recovery After Fast Retransmit, half cwnd and commence recovery from this point using linear additive increase ‘primed’ by left over ACKs in pipe.

30 TCP Fairness Fairness goal: if N TCP sessions share same bottleneck link, each should get 1/N of link capacity TCP connection 1 bottleneck router capacity R TCP connection 2

31 Why is TCP fair? Two competing sessions: r Additive increase gives slope of 1, as throughout increases r multiplicative decrease decreases throughput proportionally R R equal bandwidth share Connection 1 throughput Connection 2 throughput congestion avoidance: additive increase loss: decrease window by factor of 2 congestion avoidance: additive increase loss: decrease window by factor of 2

32 TCP strains Tahoe Reno Vegas

33 Vegas

34 From Reno to Vegas RenoVegas RTT measurem-tcoarsefine Fast Retrans3 dup ackDup ack + timer CongWin decrease For each Fast Retrans. For the 1 st Fast Retrans. in RTT Congestion detection Loss detectionAlso based on measured throughput

35 Some more examples

36 TCP latency modeling Q: How long does it take to receive an object from a Web server after sending a request? r TCP connection establishment r data transfer delay Notation, assumptions: r Assume one link between client and server of rate R r Assume: fixed congestion window, W segments r S: MSS (bits) r O: object size (bits) r no retransmissions (no loss, no corruption) Two cases to consider: r WS/R > RTT + S/R: ACK for first segment in window returns before window’s worth of data sent r WS/R < RTT + S/R: wait for ACK after sending window’s worth of data sent

37 TCP latency Modeling Case 1: latency = 2RTT + O/R Case 2: latency = 2RTT + O/R + (K-1)[S/R + RTT - WS/R] K:= O/WS

38 TCP Latency Modeling: Slow Start r Now suppose window grows according to slow start. r Will show that the latency of one object of size O is: where P is the number of times TCP stalls at server: - where Q is the number of times the server would stall if the object were of infinite size. - and K is the number of windows that cover the object.

39 TCP Latency Modeling: Slow Start (cont.) Example: O/S = 15 segments K = 4 windows Q = 2 P = min{K-1,Q} = 2 Server stalls P=2 times.

40 TCP Latency Modeling: Slow Start (cont.)

41 TCP Flow Control receiver: explicitly informs sender of (dynamically changing) amount of free buffer space  RcvWindow field in TCP segment sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow sender won’t overrun receiver’s buffers by transmitting too much, too fast flow control receiver buffering RcvBuffer = size or TCP Receive Buffer RcvWindow = amount of spare room in Buffer

42 TCP Connection Management Recall: TCP sender, receiver establish “connection” before exchanging data segments r initialize TCP variables: m seq. #s m buffers, flow control info (e.g. RcvWindow) r client: connection initiator Socket clientSocket = new Socket("hostname","port number"); r server: contacted by client Socket connectionSocket = welcomeSocket.accept(); Three way handshake: Step 1: client sends TCP SYN control segment to server m specifies initial seq # Step 2: server receives SYN, replies with SYNACK control segment  ACKs received SYN m allocates buffers m specifies server-to-receiver initial seq. # Step 3: client sends ACK and data.

43 TCP Connection Management (cont.) Closing a connection: client closes socket: clientSocket.close(); Step 1: client end system sends TCP FIN control segment to server. Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN. client FIN server ACK FIN close closed timed wait

44 TCP Connection Management (cont.) Step 3: client receives FIN, replies with ACK.  Enters “timed wait” - will respond with ACK to received FIN s Step 4: server, receives ACK. Connection closed. Note: with small modification, can handly simultaneous FIN s. client FIN server ACK FIN closing closed timed wait closed

45 TCP Connection Management (cont) TCP client lifecycle TCP server lifecycle

46 Summary r principles behind transport layer services: m multiplexing/demultiplexing m reliable data transfer m flow control m congestion control r instantiation and implementation in the Internet m UDP m TCP

47 Networks Design Course Datagram Networks QoS Extension: NW Layer congestion control Integrated services IntServ Differentiated service – DiffServ RSVP-TE

48 Q mng : Packet Dropping : Tail Drop  Tail Drop – packets are dropped when the queue is full causes the Global Synch. problem with TCP Queue Utilization 100% Time Tail Drop

49 Packet Dropping : RED  Proposed by Sally Floyd and Van Jacobson in the early 1990s r packets are dropped randomly prior to periods of high congestion, which signals the packet source to decrease the transmission rate r distributes losses over time

50 RED - Implementation r Drop probability is based on min_threshold, max_threshold, and mark probability denominator. r When the average queue depth is above the minimum threshold, RED starts dropping packets. The rate of packet drop increases linearly as the average queue size increases until the average queue size reaches the maximum threshold. r When the average queue size is above the maximum threshold, all packets are dropped.

51 RED (cont.) Buffer occupancy calculation Average Queue Size – Weighted Exponential Moving Average 0 1 minmax av. queue size drop prob. … Max Prob

52 GRED Gentle RED Buffer occupancy calculation Average Queue Size – Weighted Exponential Moving Average 0 1 minmax av. queue size drop prob. … Max Prob 2*max

1 NS-2 LAB: TCP/IP Congestion control 2005-2006 - Semester A Miriam Allalouf.

Similar presentations

Presentation on theme: "1 NS-2 LAB: TCP/IP Congestion control 2005-2006 - Semester A Miriam Allalouf."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

1 NS-2 LAB: TCP/IP Congestion control 2005-2006 - Semester A Miriam Allalouf.

Similar presentations

Presentation on theme: "1 NS-2 LAB: TCP/IP Congestion control 2005-2006 - Semester A Miriam Allalouf."— Presentation transcript:

Similar presentations

About project

Feedback