
1 Computer Networks: Transport Layer (Part 3)

2 Transport Layer
Last class
  –CIDR exam question
  –Specific transport layers
    UDP
This class
  –TCP

3 TL: TCP and Transport Layer Functions
Demux to upper layer
Quality of service
Security
Delivery semantics
Flow control
Congestion control
Reliable data transfer

4 TL: TCP Overview
RFCs: 793, 1122, 1323, 2018, 2581
full duplex data:
  –bi-directional data flow in same connection
  –MSS: maximum segment size
connection-oriented:
  –handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
  –protocol implemented at ends (“fate-sharing”)
flow and congestion controlled:
  –sender will not overwhelm receiver or network
point-to-point:
  –one sender, one receiver
reliable, in-order byte stream:
  –no “message boundaries”
pipelined:
  –TCP congestion and flow control set window size
send & receive buffers

5 TL: TCP header
Segment layout (each row is 32 bits wide):
  source port # | dest port #
  sequence number
  acknowledgement number
  head len | not used | U A P R S F | rcvr window size
  checksum | ptr urgent data
  Options (variable length)
  application data (variable length)
Field notes:
  sequence number, acknowledgement number: counting by bytes of data (not segments!)
  URG: urgent data (generally not used)
  ACK: ACK # valid
  PSH: push data now (generally not used)
  RST, SYN, FIN: connection estab (setup, teardown commands)
  rcvr window size: # bytes rcvr willing to accept
  checksum: Internet checksum (as in UDP)

6 TL: TCP connections
TCP sender, receiver establish “connection” before exchanging data segments
  –initialize TCP variables:
    Initial sequence #s
    Buffers, flow control info (e.g. RcvWindow)
    Window scaling
client: connection initiator
  Java API: Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
  Java API: Socket connectionSocket = welcomeSocket.accept();
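
The client and server roles above can be sketched in Python on a single machine. This is only an illustration: the loopback address, the OS-chosen port, the helper thread and the function name run_echo_once are assumptions, not part of the slides. The operating system performs the three-way handshake inside connect() and accept().

```python
import socket
import threading


def run_echo_once(payload: bytes) -> bytes:
    """Open one TCP connection over loopback and echo payload back (illustrative)."""
    # Server side: passive open on a "welcome" (listening) socket.
    welcome = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    welcome.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    welcome.listen(1)
    port = welcome.getsockname()[1]

    def serve() -> None:
        # accept() hands back a per-connection socket for an established connection.
        conn, _addr = welcome.accept()
        with conn:
            conn.sendall(conn.recv(1024))

    server_thread = threading.Thread(target=serve)
    server_thread.start()

    # Client side: active open; the kernel sends SYN, receives SYNACK, sends ACK.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(payload)
        reply = client.recv(1024)

    server_thread.join()
    welcome.close()
    return reply
```

Calling run_echo_once(b"C") should return the same byte after one connection setup, one exchange and one teardown.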

7 TL: TCP connections
Three way handshake:
  –Step 1: client end system sends TCP SYN control segment to server
    specifies initial seq #
    should be random to prevent spoofing (http://www.rfc-)
  –Step 2: server end system receives SYN, replies with SYNACK control segment
    ACKs received SYN
    allocates buffers
    specifies server-to-client initial seq. #
  –Step 3: client receives SYNACK control segment, replies with ACK and potentially data
    ACKs received SYNACK
    goes to established state

8 TL: TCP Connection Establishment
A and B must agree on initial sequence number selection
3-way handshake
  A -> B: SYN + Seq A
  B -> A: SYN+ACK-A + Seq B
  A -> B: ACK-B

9 TL: TCP Sequence Number Selection
Why not simply choose 0?
Must avoid overlap with earlier incarnation
Client machine seq #0, initiates connection to server with seq #0.
  –Client sends one byte and machine crashes
  –Client reboots and initiates connection again
  –Server thinks new incarnation is the same as old connection

10 TL: TCP Sequence Number Selection
Why is selecting a random ISN important?
Suppose machine X selects ISN based on predictable sequence
Fred has .rhosts to allow login to X from Y
Evil Ed attacks
  –Disables host Y (denial of service attack)
  –Makes a bunch of connections to host X
  –Determines ISN pattern and guesses next ISN
  –Fake pkt1: [, guessed ISN]
  –Fake pkt2: desired command
  –Attack popularized by K. Mitnick

11 TL: TCP ISN selection and spoofing attacks
Hosts: Ed (attacker), Y (trusted host), X (target; its .rhosts lists Y)
  1. Ed floods Y continuously
  2. Ed spoofs TCP SYN from Y to X, with spoofed Y ISN
  3. X sends TCP SYNACK (ACK of spoofed Y ISN, plus X ISN) to Y: PACKET DROPPED!
  4. Ed sends ACK with guess of X’s ISN as if he had received the TCP SYNACK
  5. Ed sends pre-canned rlogin/rsh messages (rsh echo “Ed” >> .rhosts) and spoofs acknowledgements
  6. Real acks dropped so Y does not reset connection
  7. Door now open, rlogin to X from Ed directly

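12 TL: TCP connection setup
State diagram, written as: state, event / action -> next state
  CLOSED, passive OPEN / create TCB -> LISTEN
  CLOSED, active OPEN / create TCB, snd SYN -> SYN SENT
  LISTEN, rcv SYN / snd SYN ACK -> SYN RCVD
  LISTEN, APP SEND / snd SYN -> SYN SENT
  LISTEN, CLOSE / delete TCB -> CLOSED
  SYN SENT, rcv SYN, ACK / snd ACK -> ESTAB
  SYN SENT, rcv SYN / snd ACK -> SYN RCVD
  SYN SENT, CLOSE / delete TCB -> CLOSED
  SYN RCVD, rcv ACK of SYN -> ESTAB
  SYN RCVD, CLOSE / send FIN -> (tear-down states)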

13 TL: TCP connections
Data transfer for established connections using sequence numbers and sliding windows with cumulative ACKs
Seq. #’s:
  –byte stream “number” of first byte in segment’s data
ACKs:
  –seq # of next byte expected from other side
  –cumulative ACK
  –duplicate acks sent when out-of-order packet received
See web trace
Java API: connectionSocket.receive(); clientSocket.send();
simple telnet scenario (time runs downward):
  Host A -> Host B: Seq=42, ACK=79, data = ‘C’ (user types ‘C’)
  Host B -> Host A: Seq=79, ACK=43, data = ‘C’ (host ACKs receipt of ‘C’, echoes back ‘C’)
  Host A -> Host B: Seq=43, ACK=80 (host ACKs receipt of echoed ‘C’)
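
The telnet scenario can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; the function name telnet_exchange and the tuple layout are assumptions made for this example.

```python
def telnet_exchange(seq_a: int, seq_b: int, data_a: bytes, data_b: bytes):
    """Return (sender, Seq, ACK, data) for a three-segment echo exchange."""
    trace = []
    # A -> B: carries A's data; the ACK field names the next byte A expects from B.
    trace.append(("A", seq_a, seq_b, data_a))
    next_a = seq_a + len(data_a)      # sequence numbers count bytes, not segments
    # B -> A: the echo; its ACK field acknowledges A's data cumulatively.
    trace.append(("B", seq_b, next_a, data_b))
    next_b = seq_b + len(data_b)
    # A -> B: pure ACK of the echoed data, no payload.
    trace.append(("A", next_a, next_b, b""))
    return trace
```

With seq_a=42, seq_b=79 and one byte in each direction, the three segments carry (Seq=42, ACK=79), (Seq=79, ACK=43) and (Seq=43, ACK=80), matching the slide.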

14 TL: TCP connections
Closing a connection: client-initiated close (reverse process for server-initiated close)
Java API: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: client/server timeline with labels close, FIN, ACK, close, FIN, timed wait, closed]

15 TL: TCP connections
Step 3: client receives FIN, replies with ACK.
  –Enters “timed wait”: will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: client/server timeline with labels closing, FIN, ACK, closing, FIN, timed wait, closed, closed]

16 TL: TCP Connection Tear-down
[Figure: Sender/Receiver timeline with labels FIN, FIN-ACK, Data write, Data ack, FIN, FIN-ACK]

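17 TL: TCP Connection Tear-down
State diagram, written as: state, event / action -> next state
  ESTAB, CLOSE / send FIN -> FIN WAIT-1
  ESTAB, rcv FIN / snd ACK -> CLOSE WAIT
  FIN WAIT-1, rcv ACK of FIN -> FIN WAIT-2
  FIN WAIT-1, rcv FIN / snd ACK -> CLOSING
  FIN WAIT-1, rcv FIN+ACK / snd ACK -> TIME WAIT
  FIN WAIT-2, rcv FIN / snd ACK -> TIME WAIT
  CLOSING, rcv ACK of FIN -> TIME WAIT
  CLOSE WAIT, CLOSE / send FIN -> LAST-ACK
  LAST-ACK, rcv ACK -> CLOSED
  TIME WAIT, Timeout=2msl / delete TCB -> CLOSED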
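
The tear-down states named on this slide can be exercised with a small transition table. This is a sketch under assumptions: the transitions follow the standard TCP state diagram (RFC 793), and the event strings and the helper name run_teardown are made up for the example.

```python
# (state, event) -> (next state, action); action is None when nothing is sent.
TEARDOWN = {
    ("ESTAB", "CLOSE"): ("FIN WAIT-1", "snd FIN"),
    ("ESTAB", "rcv FIN"): ("CLOSE WAIT", "snd ACK"),
    ("FIN WAIT-1", "rcv ACK of FIN"): ("FIN WAIT-2", None),
    ("FIN WAIT-1", "rcv FIN"): ("CLOSING", "snd ACK"),
    ("FIN WAIT-1", "rcv FIN+ACK"): ("TIME WAIT", "snd ACK"),
    ("FIN WAIT-2", "rcv FIN"): ("TIME WAIT", "snd ACK"),
    ("CLOSING", "rcv ACK of FIN"): ("TIME WAIT", None),
    ("CLOSE WAIT", "CLOSE"): ("LAST-ACK", "snd FIN"),
    ("LAST-ACK", "rcv ACK"): ("CLOSED", None),
    ("TIME WAIT", "timeout 2MSL"): ("CLOSED", "delete TCB"),
}


def run_teardown(state, events):
    """Apply events in order and return the list of states visited."""
    visited = [state]
    for event in events:
        state, _action = TEARDOWN[(state, event)]
        visited.append(state)
    return visited
```

The side that closes first passes through TIME WAIT; the side that receives the first FIN goes through CLOSE WAIT and LAST-ACK and never enters TIME WAIT.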

18 TL: Time Wait Issues
Cannot close connection immediately after receiving FIN
  –What if a new connection restarts and uses same sequence number?
Web servers, not clients, close connection first
  –Established -> Fin-Waits -> Time-Wait -> Closed
  –Why would this be a problem?
Time-Wait state lasts for 2 * MSL
  –MSL should be 120 seconds (is often 60s)
  –Servers often have an order of magnitude more connections in Time-Wait

19 TL: TCP connections
[Figures: TCP client lifecycle; TCP server lifecycle]

20 TL: TCP Demux to upper layer
Multiplexing: gathering data from multiple app processes, enveloping data with header (later used for demultiplexing)
multiplexing/demultiplexing: based on sender, receiver port numbers, IP addresses
  –source, dest port #s in each segment
  –recall: well-known port numbers for specific applications
  –Servers wait on well known ports (/etc/services)
TCP/UDP segment format (each row is 32 bits wide):
  source port # | dest port #
  other header fields
  application data (message)

21 TL: TCP Demux to upper layer
port use: simple telnet app
  host A -> server B: source port: x, dest. port: 23
  server B -> host A: source port: 23, dest. port: x
port use: Web server
  Web client host A -> Web server B: Source IP: A, Dest IP: B, source port: x, dest. port: 80
  Web client host C -> Web server B: Source IP: C, Dest IP: B, source port: x, dest. port: 80
  Web client host C -> Web server B: Source IP: C, Dest IP: B, source port: y, dest. port: 80

22 TL: TCP Flow control
TCP is a sliding window protocol
  –For window size n, can send up to n bytes without receiving an acknowledgement
  –When the data is acknowledged then the window slides forward
Each packet advertises a window size
  –Indicates number of bytes the receiver has space for
Original TCP always sent entire window
  –Congestion control now limits this

23 TL: TCP Flow control
flow control: sender won’t overrun receiver’s buffers by transmitting too much, too fast
receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
  –RcvWindow field in TCP segment
sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow
receiver buffering:
  RcvBuffer = size of TCP Receive Buffer
  RcvWindow = amount of spare room in Buffer
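
A minimal sketch of the bookkeeping on both sides, assuming the receiver tracks how many bytes are buffered but not yet read and the sender tracks the last byte sent and the last byte ACKed. The function and parameter names are assumptions; the check uses "no more than RcvWindow" as the limit.

```python
def rcv_window(rcv_buffer: int, bytes_buffered: int) -> int:
    """Spare room in the receive buffer, as advertised to the sender."""
    return rcv_buffer - bytes_buffered


def can_send(window: int, last_byte_sent: int, last_byte_acked: int, nbytes: int) -> bool:
    """True if nbytes more can be sent without exceeding the advertised window."""
    in_flight = last_byte_sent - last_byte_acked   # transmitted but unACKed data
    return in_flight + nbytes <= window
```

For example, a 4096-byte buffer holding 1024 unread bytes advertises 3072 bytes; a sender with 2000 bytes in flight may then send 1000 more, but not 1100.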

24 TL: TCP Flow control
What happens if window is 0?
  –Receiver updates window when application reads data
  –What if this update is lost?
    Deadlock
TCP Persist timer
  –Sender periodically sends window probe packets
  –Receiver responds with ACK and up-to-date window advertisement

25 TL: TCP flow control enhancements
Problem: (Clark, 1982)
  –If receiver advertises small increases in the receive window then the sender may waste time sending lots of small packets
What happens if window is small?
  –Small packet problem known as “Silly window syndrome”
    Receiver advertises one byte window
    Sender sends one byte packet (1 byte data, 40 byte header = 4000% overhead)

26 TL: TCP flow control enhancements
Solutions to silly window syndrome
Clark (1982)
  –receiver avoidance
  –prevent receiver from advertising small windows
  –advertise a larger receiver window only once it can grow by at least min(MSS, RecvBuffer/2)
Nagle’s algorithm (1984)
  –sender avoidance
  –prevent sender from unnecessarily sending small packets
  –“Inhibit the sending of new TCP segments when new outgoing data arrives from the user if any previously transmitted data on the connection remains unacknowledged”
    Allow only one outstanding small (not full sized) segment that has not yet been acknowledged
    Works for idle connections (no deadlock)
    Works for telnet (send one-byte packets immediately)
    Works for bulk data transfer (delay sending)
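
Nagle's rule can be sketched as a tiny sender model. This is an illustration under simplifying assumptions: the class name NagleSender is made up, ACKs always acknowledge everything outstanding, and flow and congestion windows are ignored.

```python
class NagleSender:
    """Toy model: full-sized segments go out at once, small ones wait for ACKs."""

    def __init__(self, mss: int):
        self.mss = mss
        self.pending = b""   # data written by the application, not yet sent
        self.unacked = 0     # bytes sent but not yet acknowledged
        self.sent = []       # segments handed to IP, in order

    def write(self, data: bytes) -> None:
        self.pending += data
        self._try_send()

    def ack_all(self) -> None:
        # Assumption: one ACK covers all outstanding data.
        self.unacked = 0
        self._try_send()

    def _try_send(self) -> None:
        # Full-sized segments are never delayed.
        while len(self.pending) >= self.mss:
            self._emit(self.pending[:self.mss])
            self.pending = self.pending[self.mss:]
        # A small segment is sent only when nothing is outstanding.
        if self.pending and self.unacked == 0:
            self._emit(self.pending)
            self.pending = b""

    def _emit(self, segment: bytes) -> None:
        self.sent.append(segment)
        self.unacked += len(segment)
```

On an idle connection the first keystroke is sent immediately; later keystrokes are coalesced until the ACK arrives, which is the behavior the slide describes for telnet and bulk transfer.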

27 TL: TCP reliable data transfer
Segment integrity
Acknowledgement generation
Retransmission

28 TL: TCP RDT segment integrity
Checksum included in header
Is it sufficient to just checksum the packet contents?
No, need to ensure correct source/destination
  –Pseudoheader: portions of the IP hdr that are critical
  –Checksum covers pseudoheader, transport hdr, and packet body
  –Layer violation, redundant with parts of IP checksum
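
A sketch of the checksum computation, assuming IPv4 and the usual pseudoheader layout (source address, destination address, a zero byte, protocol number 6 for TCP, and the TCP length). The function names are assumptions for this example.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip: bytes, dst_ip: bytes, segment: bytes) -> int:
    """Checksum over pseudoheader + TCP header + body (IPv4 addresses as 4 bytes)."""
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, 6, len(segment))
    return internet_checksum(pseudo + segment)
```

A receiver that recomputes the checksum over a segment whose checksum field is filled in gets 0 when nothing was corrupted. Because the addresses are part of the sum, a segment delivered to the wrong host fails the check.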

29 TL: TCP RDT acks and timeouts
TCP’s reliable data transfer approach
  –Cumulative acknowledgements
    Receiver sends back the byte number it expects to receive next
    Out of order packets generate duplicate acknowledgements
      –Receive 1, Ack 2
      –Receive 4, Ack 2
      –Receive 3, Ack 2
      –Receive 2, Ack 5
  –Retransmissions
    Sender sends segment and sets a timer
    Waits for an acknowledgement indicating segment was received
      –Send 1
      –Wait for Ack 2
      –No Ack 2 and timer expires
      –Send 1 again
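
The receive/ack example can be checked with a short receiver model. It numbers whole packets as the slide's example does (real TCP counts bytes); the function name cumulative_acks is an assumption.

```python
def cumulative_acks(arrivals, first_expected=1):
    """Return the ACK number sent after each arriving packet."""
    expected = first_expected
    buffered = set()          # out-of-order packets held until the gap fills
    acks = []
    for packet in arrivals:
        if packet >= expected:
            buffered.add(packet)
        # Advance past every packet that is now in order.
        while expected in buffered:
            buffered.remove(expected)
            expected += 1
        acks.append(expected)  # always the next packet the receiver expects
    return acks
```

For the arrival order 1, 4, 3, 2 the function returns 2, 2, 2, 5: two duplicate ACKs while packet 2 is missing, then one cumulative ACK covering everything through 4.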

30 TL: TCP RDT acks and timeouts
simplified sender, assuming
  –one way data transfer
  –no flow, congestion control
wait for event
  event: data received from application above -> create, send segment
  event: timer timeout for segment with seq # y -> retransmit segment
  event: ACK received, with ACK # y -> ACK processing

31 TL: TCP RDT acks and timeouts
Simplified TCP sender

sendbase = initial_sequence_number
nextseqnum = initial_sequence_number

loop (forever) {
  switch(event)
    event: data received from application above
      create TCP segment with sequence number nextseqnum
      start timer for segment nextseqnum
      pass segment to IP
      nextseqnum = nextseqnum + length(data)
    event: timer timeout for segment with sequence number y
      retransmit segment with sequence number y
      compute new timeout interval for segment y
      restart timer for sequence number y
    event: ACK received, with ACK field value of y
      if (y > sendbase) { /* cumulative ACK of all data up to y */
        cancel all timers for segments with sequence numbers < y
        sendbase = y
      }
      else { /* a duplicate ACK for already ACKed segment */
        increment number of duplicate ACKs received for y
        if (number of duplicate ACKs received for y == 3) {
          /* TCP fast retransmit */
          resend segment with sequence number y
          restart timer for segment y
        }
      }
} /* end of loop forever */
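
The same event handling can be sketched as a small Python class. It is a simplification, not a TCP implementation: timers are represented only by membership in a dictionary, sequence number wraparound is ignored, and the class and attribute names are assumptions.

```python
class SimplifiedSender:
    """Event handlers mirroring the three events of the simplified sender."""

    def __init__(self, initial_sequence_number: int):
        self.sendbase = initial_sequence_number
        self.nextseqnum = initial_sequence_number
        self.unacked = {}    # seq -> data; an entry stands for a running timer
        self.dup_acks = 0    # duplicate ACKs seen for the current sendbase
        self.wire = []       # (seq, data) segments passed to IP

    def on_data(self, data: bytes) -> None:
        seq = self.nextseqnum
        self.unacked[seq] = data            # "start timer" for this segment
        self.wire.append((seq, data))
        self.nextseqnum += len(data)

    def on_timeout(self, seq: int) -> None:
        self.wire.append((seq, self.unacked[seq]))   # retransmit, timer restarts

    def on_ack(self, y: int) -> None:
        if y > self.sendbase:
            # Cumulative ACK: everything below y is delivered.
            for seq in [s for s in self.unacked if s < y]:
                del self.unacked[seq]
            self.sendbase = y
            self.dup_acks = 0
        else:
            self.dup_acks += 1
            if self.dup_acks == 3 and y in self.unacked:
                self.wire.append((y, self.unacked[y]))   # fast retransmit
```

After sending 8 bytes at Seq=92 and 20 bytes at Seq=100, an ACK of 100 followed by three duplicate ACKs of 100 triggers one fast retransmit of the segment starting at 100.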

32 TL: TCP delayed acknowledgements
Problem:
  –In request/response programs, you send separate ACK and Data packets for each transaction
Solution:
  –Delay ACK in order to send ACK back along with data
  –Don’t ACK data immediately
    Wait 200ms (must be less than 500ms: why?)
    Must ACK every other packet
    Must not delay duplicate ACKs
  –Without delayed ACK: 40 byte ack + data packet
  –With delayed ACK: data packet includes ACK
  –See web trace example
  –Extensions for asymmetric links
    See later part of lecture

33 TL: TCP ACK generation [RFC 1122, RFC 2581]
Event: in-order segment arrival, no gaps, everything else already ACKed
  TCP Receiver action: delayed ACK. Wait up to 500ms for next segment. If no next segment, send ACK
Event: in-order segment arrival, no gaps, one delayed ACK pending
  TCP Receiver action: immediately send single cumulative ACK
Event: out-of-order segment arrival, higher-than-expected seq. #, gap detected
  TCP Receiver action: send duplicate ACK, indicating seq. # of next expected byte
Event: arrival of segment that partially or completely fills gap
  TCP Receiver action: immediate ACK if segment starts at lower end of gap

34 TL: TCP retransmission
Wait at least one RTT before retransmitting packet
Importance of accurate RTT estimators:
  –Estimator too low -> unneeded retransmissions
  –Estimator too high -> poor throughput, slow reaction to segment loss
RTT estimator must adapt to change in RTT
  –But not too fast, or too slow!
Backing off the retransmission timeout
  –Exponential backoff
  –Double retransmission timer interval after every loss until successful retransmission
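
Exponential backoff of the timer is a one-line computation. In this sketch the upper bound max_rto and the function name are assumptions added for the example; the slide only states the doubling.

```python
def backed_off_rto(initial_rto: float, consecutive_losses: int, max_rto: float = 64.0) -> float:
    """Double the retransmission timeout once per consecutive loss, up to a cap."""
    return min(initial_rto * 2 ** consecutive_losses, max_rto)
```

Starting from a 1-second timeout, successive losses give 1, 2, 4, 8 seconds and so on until the assumed cap is reached.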

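35 TL: TCP retransmission scenarios
lost ACK scenario (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100 (X: loss)
  Host A: timeout
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=100
premature timeout, cumulative ACKs (time runs downward):
  Host A -> Host B: Seq=92, 8 bytes data
  Host A -> Host B: Seq=100, 20 bytes data
  Host B -> Host A: ACK=100
  Host B -> Host A: ACK=120
  Host A: Seq=92 timeout (before ACK=100 arrives)
  Host A -> Host B: Seq=92, 8 bytes data
  Host B -> Host A: ACK=120
  Host A: Seq=100 timeout (does not expire before ACK=120 arrives)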

36 TL: Initial Round-trip Estimator
Round trip times exponentially averaged:
  EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
  –Recommended value for x: 0.1-0.2
    0.125 for most TCP’s
  –Influence of given sample decreases exponentially fast
Retransmit timer set to β * RTT, where β = 2
  –Every time timer expires, RTO exponentially backed-off
  –Like Ethernet
Not good at preventing spurious timeouts

37 TL: Jacobson’s Retransmission Timeout
Key observation:
  –At high loads round trip variance is high
  –Need larger safety margin with larger variations in RTT
Solution:
  –Base RTO value on RTT and standard deviation of RTT

38 TL: Jacobson’s Retransmission Timeout
EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|
Setting the timeout
  –EstimatedRTT plus “safety margin”
  –large variation in EstimatedRTT -> larger safety margin
Timeout = EstimatedRTT + 4*Deviation
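
The three formulas can be combined into a small estimator. This is a sketch with assumptions: the class name, the starting deviation, and the choice to update Deviation from the EstimatedRTT held before the new sample is folded in are not specified on the slide.

```python
class RttEstimator:
    """Exponentially averaged RTT and deviation, with Timeout = RTT + 4*Deviation."""

    def __init__(self, estimated_rtt: float, deviation: float = 0.0, x: float = 0.125):
        self.estimated_rtt = estimated_rtt
        self.deviation = deviation
        self.x = x

    def update(self, sample_rtt: float) -> float:
        x = self.x
        # Assumption: deviation uses the estimate from before this sample.
        error = abs(sample_rtt - self.estimated_rtt)
        self.deviation = (1 - x) * self.deviation + x * error
        self.estimated_rtt = (1 - x) * self.estimated_rtt + x * sample_rtt
        return self.timeout()

    def timeout(self) -> float:
        return self.estimated_rtt + 4 * self.deviation
```

Starting at EstimatedRTT = 100 ms with zero deviation, one sample of 180 ms moves the estimate to 110 ms and the deviation to 10 ms, so the timeout becomes 150 ms. The safety margin grows as soon as samples start to vary.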

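39 TL: Retransmission Ambiguity
[Figure: two A/B timelines, each showing Original transmission, RTO, retransmission, ACK and the measured Sample RTT; in one of them a transmission is lost (X). After a retransmission, the ACK cannot be attributed to the original transmission or to the retransmission.]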

40 TL: Karn’s algorithm
Accounts for retransmission ambiguity
If a segment has been retransmitted:
  –Don’t count RTT sample on ACKs for this segment
  –Keep backed off time-out for next packet
  –Reuse RTT estimate only after one successful transmission
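
Karn's rules can be sketched as follows. The class name and the default RTO function (twice the sample, echoing the β = 2 timer from the earlier slide) are assumptions made to keep the example short.

```python
class KarnTimer:
    """Ignore RTT samples from retransmitted segments; keep the backed-off RTO."""

    def __init__(self, rto: float, compute_rto=lambda rtt: 2 * rtt):
        self.rto = rto
        self.compute_rto = compute_rto   # hypothetical hook for an RTT estimator
        self.samples = []                # RTT samples that were actually used

    def on_timeout(self) -> None:
        self.rto *= 2                    # exponential backoff

    def on_ack(self, measured_rtt: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            return                       # ambiguous sample: discard, keep backed-off RTO
        self.samples.append(measured_rtt)
        self.rto = self.compute_rto(measured_rtt)
```

The backed-off timeout survives until an ACK arrives for a segment that was sent only once; only then is the RTO recomputed from a fresh sample.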

41 TL: Timer Granularity
Many TCP implementations set RTO in multiples of 200, 500, or 1000 ms
Why?
  –Avoid spurious timeouts: RTTs can vary quickly due to cross traffic
  –Make timer interrupts efficient

42 TL: TCP Congestion Control
Motivated by ARPANET congestion collapse
  –Flow control, but no congestion control
  –Sender sends as much as the receiver resources will allow
  –Go-back-N on loss, burst out advertised window
Congestion control
  –Extending control to network resources
  –Underlying design principle: packet conservation
    At equilibrium, inject packet into network only when one is removed
    Basis for stability of physical systems (fluid model)
Why was this not working before?
  –No equilibrium
    Solved by self-clocking and congestion window
  –Spurious retransmissions
    Solved by accurate RTO estimation (see earlier discussion)
  –Resource limitations prevent equilibrium
    Solved by congestion window and congestion avoidance algorithms

43 TL: TCP congestion control basics
Keep a congestion window, cwnd
  –Book calls this “Congwin”
  –Denotes how much network is able to absorb
  –Sender’s maximum window: Min (receiver’s advertised window, cwnd)
  –Sender’s actual window: Max window - unacknowledged segments
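
The two window definitions translate directly into code. The function names are assumptions; all quantities are in the same unit (segments or bytes).

```python
def max_window(advertised_window: int, cwnd: int) -> int:
    """Sender's maximum window: the tighter of the receiver and network limits."""
    return min(advertised_window, cwnd)


def actual_window(advertised_window: int, cwnd: int, unacknowledged: int) -> int:
    """How much more the sender may put in flight right now (never negative)."""
    return max(0, max_window(advertised_window, cwnd) - unacknowledged)
```

With an advertised window of 16 and cwnd of 8, the sender is limited by the network; with 5 units already unacknowledged it may send 3 more.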

44 TL: TCP Congestion Control
end-end control (no network assistance)
transmission rate limited by congestion window size, cwnd, over segments
w segments, each with MSS bytes, sent in one RTT:
  throughput = (w * MSS) / RTT Bytes/sec
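
A worked instance of the throughput formula, with made-up numbers (10 segments, an MSS of 1460 bytes, an RTT of 0.5 seconds) chosen only for illustration:

```python
def throughput_bytes_per_sec(w: int, mss: int, rtt: float) -> float:
    """w segments of mss bytes each, delivered once per round-trip time."""
    return w * mss / rtt
```

Ten 1460-byte segments per 0.5-second round trip give 29200 bytes/sec; doubling the window or halving the RTT doubles the rate.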

45 TL: TCP congestion control
“probing” for usable bandwidth:
  –ideally: transmit as fast as possible (cwnd as large as possible) without loss
  –increase cwnd until loss (congestion)
  –loss: decrease cwnd, then begin probing (increasing) again
two “phases”
  –slow start
  –congestion avoidance
important variables:
  –cwnd
  –ssthresh: defines threshold between the slow start phase and the congestion avoidance phase (Book calls this threshold)
useful reference
  –ers/

46 TL: TCP slow start
Start the self-clocking behavior of TCP
  –Use acks to clock sending new data
  –Do not send entire advertised window in one shot
[Figure: self-clocking between Sender and Receiver, with labels Pr, Pb, Ar, Ab, As]

47 TL: TCP slow start
exponential increase (per RTT) in window size
  –Window actually increases to W in RTT * log2(W)
  –Can overshoot window and cause packet loss
Slowstart algorithm
  initialize: cwnd = 1
  for (each segment ACKed)
    cwnd++
  until (loss event OR cwnd > ssthresh)
[Figure: Host A/Host B timeline: one segment in the first RTT, then two segments, then four segments]
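
The per-ACK increment and the resulting per-RTT doubling can be seen in a short simulation. It assumes no loss and that every segment sent in a round is ACKed within that round; the function name slow_start is an assumption.

```python
def slow_start(ssthresh: int, rounds: int):
    """Return cwnd (in segments) at the start and after each round trip."""
    cwnd = 1
    per_round = [cwnd]
    for _ in range(rounds):
        acks_this_round = cwnd          # one ACK per segment sent this round
        for _ in range(acks_this_round):
            if cwnd > ssthresh:         # slow start ends once past the threshold
                break
            cwnd += 1                   # cwnd++ for each segment ACKed
        per_round.append(cwnd)
    return per_round
```

With a large threshold the window goes 1, 2, 4, 8 over three round trips; with ssthresh = 4 the growth stops as soon as cwnd exceeds 4.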

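48 TL: TCP slow start example
[Figure: packets sent per round trip (one RTT per row, one pkt time per slot)]
  0R: packet 1
  1R: packets 2, 3 (after ACK 1)
  2R: packets 4, 5, 6, 7 (after ACKs 2, 3)
  3R: packets 8 to 15 (after ACKs 4, 5, 6, 7)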

49 TL: TCP slow start sequence plot
[Figure: plot of Sequence No. against Time]
