Published by Osborn Banks. Modified over 8 years ago.
1
1 CSE 524: Lecture 12 Transport Layer (Part 2)
2
2 Administrative Exam –Still being graded –Will be returned on Wednesday guaranteed
3
3 Transport Layer Last class –Transport layer functions This class –Specific transport layers
4
4 Specific transport layers
UDP
–unreliable (“best-effort”)
–unordered
–unicast or multicast delivery
TCP
–reliable
–in-order
–unicast
SCTP (will not cover in class)
–See http://www.ietf.org/rfc/rfc2960.txt
–reliable
–optional ordering
–unicast
5
5 TL: UDP and Transport Layer Functions Demux to upper layer –UDP port field Quality of service –none Security –none Delivery semantics –Unordered –Unicast or multicast Flow control –none Congestion control –none Reliable data transfer –none, but data integrity provided by checksum
6
6 TL: UDP: User Datagram Protocol http://www.rfc-editor.org/rfc/rfc768.txt “no frills,” “bare bones” Internet transport protocol “best effort” service, UDP segments may be: –lost –delivered out of order to app connectionless: –no handshaking between UDP sender, receiver –each UDP segment handled independently of others Why is there a UDP? no connection establishment (which can add delay) simple: no connection state at sender, receiver small segment header no congestion control: UDP can blast away as fast as desired
7
7 TL: UDP: more
often used for streaming multimedia apps
–loss tolerant
–rate sensitive
other UDP uses (why?):
–DNS
–SNMP
reliable transfer over UDP: add reliability at application layer
–application-specific error recovery!
–many applications re-implement reliability over UDP to bypass TCP
–new transport protocols?
UDP segment format (each row is 32 bits wide):
–source port # | dest port #
–length | checksum
–application data (message)
length: length, in bytes, of UDP segment, including header
8
8 TL: UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
–treat segment contents as sequence of 16-bit integers
–checksum: 1’s complement of the 1’s complement sum of segment contents
–sender puts checksum value into UDP checksum field
–similar to IP’s header checksum
Receiver:
–compute checksum of received segment
–check if computed checksum equals checksum field value:
  NO - error detected
  YES - no error detected. But maybe errors nonetheless? More later ….
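The checksum steps on this slide can be sketched in a few lines of Python. This is a minimal illustration of the 16-bit 1's complement arithmetic only: the function names are made up for this example, and the pseudoheader that real UDP and TCP also cover is left out.

```python
def ones_complement_sum16(data: bytes) -> int:
    """Add the data as big-endian 16-bit words, wrapping carries back in."""
    if len(data) % 2:                      # pad odd-length input with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return total


def internet_checksum(data: bytes) -> int:
    """Value for the checksum field: 1's complement of the 1's complement sum."""
    return ~ones_complement_sum16(data) & 0xFFFF


def looks_intact(data_with_checksum: bytes) -> bool:
    """Receiver check: summing everything, checksum included, gives 0xFFFF."""
    return ones_complement_sum16(data_with_checksum) == 0xFFFF
```

A single flipped bit changes the sum and is caught, but swapping two 16-bit words leaves the sum unchanged, which is one way a segment can pass the check and still contain errors.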
9
9 TL: TCP and Transport Layer Functions Demux to upper layer Quality of service Security Delivery semantics Flow control Congestion control Reliable data transfer
10
10 TL: TCP Overview
RFCs: 793, 1122, 1323, 2018, 2581
point-to-point:
–one sender, one receiver
reliable, in-order byte stream:
–no “message boundaries”
pipelined:
–TCP congestion and flow control set window size
send & receive buffers
full duplex data:
–bi-directional data flow in same connection
–MSS: maximum segment size
connection-oriented:
–handshaking (exchange of control msgs) initializes sender, receiver state before data exchange
–protocol implemented at ends (“fate-sharing”)
flow and congestion controlled:
–sender will not overwhelm receiver or network
11
11 TL: TCP header
TCP segment structure (each row is 32 bits wide):
–source port # | dest port #
–sequence number
–acknowledgement number
–head len | not used | U A P R S F | rcvr window size
–checksum | ptr urgent data
–Options (variable length)
–application data (variable length)
Field notes:
–sequence number, acknowledgement number: counting by bytes of data (not segments!)
–URG: urgent data (generally not used)
–ACK: ACK # valid
–PSH: push data now (generally not used)
–RST, SYN, FIN: connection estab (setup, teardown commands)
–rcvr window size: # bytes rcvr willing to accept
–checksum: Internet checksum (as in UDP)
12
12 TL: TCP connections
TCP sender, receiver establish “connection” before exchanging data segments
–initialize TCP variables:
  Initial sequence #s
  Buffers, flow control info (e.g. RcvWindow)
  Window scaling
client: connection initiator
–Java API: Socket clientSocket = new Socket("hostname", portNumber);
server: contacted by client
–Java API: Socket connectionSocket = welcomeSocket.accept();
13
13 TL: TCP connections
Three way handshake:
–Step 1: client end system sends TCP SYN control segment to server
  specifies initial seq #
  should be random to prevent spoofing (http://www.rfc-editor.org/rfc/rfc1948.txt)
–Step 2: server end system receives SYN, replies with SYNACK control segment
  ACKs received SYN
  allocates buffers
  specifies server-to-client initial seq. #
–Step 3: client receives SYNACK control segment, replies with ACK and potentially data
  ACKs received SYNACK
  goes to established state
14
14 TL: TCP Connection Establishment
A and B must agree on initial sequence number selection
3-way handshake
–A → B: SYN + Seq A
–B → A: SYN+ACK-A + Seq B
–A → B: ACK-B
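The sequence and acknowledgement numbers exchanged in the handshake can be traced with a small sketch. It assumes the standard rule that a SYN consumes one sequence number; the tuple layout and the function names are illustrative only, not a real socket API.

```python
import secrets

SEQ_MOD = 2 ** 32  # sequence numbers are 32 bits wide and wrap around


def random_isn() -> int:
    """Pick an unpredictable 32-bit initial sequence number."""
    return secrets.randbits(32)


def three_way_handshake(isn_a: int, isn_b: int):
    """Return the three segments as (sender, flags, seq, ack) tuples."""
    syn = ("A", "SYN", isn_a, None)
    # B acknowledges A's SYN (which consumed one sequence number) and
    # announces its own initial sequence number.
    syn_ack = ("B", "SYN+ACK", isn_b, (isn_a + 1) % SEQ_MOD)
    # A acknowledges B's SYN; both sides now agree on both ISNs.
    ack = ("A", "ACK", (isn_a + 1) % SEQ_MOD, (isn_b + 1) % SEQ_MOD)
    return [syn, syn_ack, ack]
```

For example, `three_way_handshake(random_isn(), random_isn())` produces one possible exchange; with fixed ISNs the numbers can be checked by hand.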
15
15 TL: TCP Sequence Number Selection
Why not simply choose 0?
Must avoid overlap with earlier incarnation
Client machine seq #0, initiates connection to server with seq #0.
–Client sends one byte and machine crashes
–Client reboots and initiates connection again
–Server thinks new incarnation is the same as old connection
16
16 TL: TCP Sequence Number Selection
Why is selecting a random ISN important?
Suppose machine X selects ISN based on predictable sequence
Fred has .rhosts to allow login to X from Y
Evil Ed attacks
–Disables host Y – denial of service attack
–Makes a bunch of connections to host X
–Determines ISN pattern and guesses next ISN
–Fake pkt1: [, guessed ISN]
–Fake pkt2: desired command
–Attack popularized by K. Mitnick
17
17 TL: TCP ISN selection and spoofing attacks
Hosts: Ed (attacker), Y (trusted host), X (target; X’s .rhosts lists Y)
1. Ed floods Y continuously
2. Ed spoofs TCP SYN from Y to X, with spoofed Y ISN
3. X replies to Y with TCP SYNACK (ACKs spoofed Y ISN, sends X ISN): PACKET DROPPED!
4. Ed sends ACK with guess of X’s ISN, as if he had received the TCP SYNACK
5. Ed sends pre-canned rlogin/rsh messages (rsh echo “Ed” >> .rhosts) and spoofs acknowledgements
6. Real acks dropped so Y does not reset connection
7. Door now open, rlogin to X from Ed directly
18
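18 TL: TCP connection setup
[State diagram: connection setup. States: CLOSED, LISTEN, SYN SENT, SYN RCVD, ESTAB. Transitions, written as event / action:]
–CLOSED → LISTEN: passive OPEN / create TCB
–CLOSED → SYN SENT: active OPEN / create TCB, snd SYN
–LISTEN → SYN RCVD: rcv SYN / snd SYN, ACK
–LISTEN → SYN SENT: app SEND / snd SYN
–LISTEN → CLOSED: CLOSE / delete TCB
–SYN SENT → CLOSED: CLOSE / delete TCB
–SYN SENT → SYN RCVD: rcv SYN / snd ACK
–SYN SENT → ESTAB: rcv SYN, ACK / snd ACK
–SYN RCVD → ESTAB: rcv ACK of SYN
–SYN RCVD: CLOSE / send FIN (begins tear-down)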
19
19 TL: TCP connections
Data transfer for established connections using sequence numbers and sliding windows with cumulative ACKs
Seq. #’s:
–byte stream “number” of first byte in segment’s data
ACKs:
–seq # of next byte expected from other side
–cumulative ACK
–duplicate acks sent when out-of-order packet received
See web trace
Java API: connectionSocket.getInputStream().read(); clientSocket.getOutputStream().write(data);
simple telnet scenario (time runs downward):
–Host A → Host B: Seq=42, ACK=79, data = ‘C’ (user types ‘C’)
–Host B → Host A: Seq=79, ACK=43, data = ‘C’ (host ACKs receipt of ‘C’, echoes back ‘C’)
–Host A → Host B: Seq=43, ACK=80 (host ACKs receipt of echoed ‘C’)
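The numbers in the telnet scenario follow from two rules: a host's next sequence number is whatever the peer said it expects next, and its ACK number is the peer's sequence number advanced past the bytes just received. A minimal sketch, with a hypothetical helper name and 32-bit wraparound ignored for brevity:

```python
def reply_numbers(seq: int, ack: int, data: bytes):
    """Given a received segment, return (seq, ack) for the reply segment."""
    reply_seq = ack               # peer expects this byte from us next
    reply_ack = seq + len(data)   # next byte we expect from the peer
    return reply_seq, reply_ack
```

Feeding in Host A's first segment (Seq=42, ACK=79, one byte of data) yields Host B's reply numbers, and feeding in that reply yields Host A's final ACK.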
20
20 TL: TCP connections
Closing a connection: client-initiated close (reverse process for server-initiated close)
Java API: clientSocket.close();
Step 1: client end system sends TCP FIN control segment to server
Step 2: server receives FIN, replies with ACK. Closes connection, sends FIN.
[Figure: client/server timeline: client closes and sends FIN; server replies with ACK, closes, and sends FIN; client enters timed wait, then closed]
21
21 TL: TCP connections
Step 3: client receives FIN, replies with ACK.
–Enters “timed wait” - will respond with ACK to received FINs
Step 4: server receives ACK. Connection closed.
Note: with small modification, can handle simultaneous FINs.
[Figure: client/server timeline: client closing, FIN; server ACK, FIN; client ACK and timed wait; both ends closed]
22
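22 TL: TCP Half-Close
[Figure: Sender/Receiver timeline]
–Sender → Receiver: FIN
–Receiver → Sender: FIN-ACK
–Receiver → Sender: Data write
–Sender → Receiver: Data ack
–Receiver → Sender: FIN
–Sender → Receiver: FIN-ACK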
23
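23 TL: TCP Connection Tear-down
[State diagram: connection tear-down. States: ESTAB, FIN WAIT-1, FIN WAIT-2, CLOSING, TIME WAIT, CLOSE WAIT, LAST-ACK, CLOSED. Transitions, written as event / action:]
–ESTAB → FIN WAIT-1: CLOSE / send FIN
–(from connection setup, SYN RCVD) → FIN WAIT-1: CLOSE / send FIN
–ESTAB → CLOSE WAIT: rcv FIN / snd ACK
–FIN WAIT-1 → FIN WAIT-2: rcv ACK of FIN
–FIN WAIT-1 → CLOSING: rcv FIN / snd ACK
–FIN WAIT-1 → TIME WAIT: rcv FIN+ACK / snd ACK
–FIN WAIT-2 → TIME WAIT: rcv FIN / snd ACK
–CLOSING → TIME WAIT: rcv ACK of FIN
–CLOSE WAIT → LAST-ACK: CLOSE / send FIN
–LAST-ACK → CLOSED: rcv ACK
–TIME WAIT → CLOSED: Timeout=2msl / delete TCB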
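The tear-down states can be walked through with a small table-driven sketch. The transitions follow the RFC 793 state diagram; the event strings, the `TEARDOWN` table name, and the `run` helper are illustrative choices for this example, not part of any TCP API.

```python
# (state, event) -> (action, next state)
TEARDOWN = {
    ("ESTAB", "CLOSE"): ("snd FIN", "FIN WAIT-1"),
    ("ESTAB", "rcv FIN"): ("snd ACK", "CLOSE WAIT"),
    ("FIN WAIT-1", "rcv ACK of FIN"): (None, "FIN WAIT-2"),
    ("FIN WAIT-1", "rcv FIN"): ("snd ACK", "CLOSING"),
    ("FIN WAIT-1", "rcv FIN+ACK"): ("snd ACK", "TIME WAIT"),
    ("FIN WAIT-2", "rcv FIN"): ("snd ACK", "TIME WAIT"),
    ("CLOSING", "rcv ACK of FIN"): (None, "TIME WAIT"),
    ("CLOSE WAIT", "CLOSE"): ("snd FIN", "LAST-ACK"),
    ("LAST-ACK", "rcv ACK of FIN"): (None, "CLOSED"),
    ("TIME WAIT", "timeout 2*MSL"): ("delete TCB", "CLOSED"),
}


def run(state, events):
    """Apply events in order and return the list of states visited."""
    trace = [state]
    for event in events:
        _action, state = TEARDOWN[(state, event)]
        trace.append(state)
    return trace
```

Running it with the events of an active close shows the side that closes first passing through TIME WAIT, while the passive side goes through CLOSE WAIT and LAST-ACK and never enters TIME WAIT.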
24
24 TL: Time Wait Issues
Cannot close connection immediately after receiving FIN
–What if a new connection restarts and uses same sequence number?
Web servers, not clients, close connection first
–Established → Fin-Waits → Time-Wait → Closed
–Why would this be a problem?
Time-Wait state lasts for 2 * MSL
–MSL should be 120 seconds (is often 60s)
–Servers often have order of magnitude more connections in Time-Wait
25
25 TL: TCP connections
[Figures: TCP client lifecycle; TCP server lifecycle]
26
26 TL: TCP Demux to upper layer
Multiplexing: gathering data from multiple app processes, enveloping data with header (later used for demultiplexing)
multiplexing/demultiplexing: based on sender, receiver port numbers, IP addresses
–source, dest port #s in each segment
–recall: well-known port numbers for specific applications
–Servers wait on well known ports (/etc/services)
TCP/UDP segment format (each row is 32 bits wide):
–source port # | dest port #
–other header fields
–application data (message)
27
27 TL: TCP Demux to upper layer
port use: simple telnet app
–host A → server B: source port: x, dest. port: 23
–server B → host A: source port: 23, dest. port: x
port use: Web server
–Web client host A → Web server B: Source IP: A, Dest IP: B, source port: x, dest. port: 80
–Web client host C → Web server B: Source IP: C, Dest IP: B, source port: x, dest. port: 80
–Web client host C → Web server B: Source IP: C, Dest IP: B, source port: y, dest. port: 80
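The Web server case shows why TCP demultiplexes on the full (source IP, source port, dest IP, dest port) tuple: three connections share dest port 80 yet must reach different sockets. A minimal sketch, where the port values 5001 and 5002 stand in for the slide's symbolic x and y and the connection labels are made up:

```python
# Connection table on Web server B, keyed by the full 4-tuple.
connections = {
    ("A", 5001, "B", 80): "connection from host A",
    ("C", 5001, "B", 80): "first connection from host C",
    ("C", 5002, "B", 80): "second connection from host C",
}


def demux(table, src_ip, src_port, dst_ip, dst_port):
    """Return the connection a segment belongs to, or None if there is none."""
    return table.get((src_ip, src_port, dst_ip, dst_port))
```

Hosts A and C can both pick the same source port without conflict because the source IP differs, and host C's two connections are told apart by source port alone.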
28
28 TL: TCP Flow control TCP is a sliding window protocol –For window size n, can send up to n bytes without receiving an acknowledgement –When the data is acknowledged then the window slides forward Each packet advertises a window size –Indicates number of bytes the receiver has space for Original TCP always sent entire window –Congestion control now limits this
29
29 TL: TCP Flow control
flow control: sender won’t overrun receiver’s buffers by transmitting too much, too fast
receiver: explicitly informs sender of (dynamically changing) amount of free buffer space
–RcvWindow field in TCP segment
sender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindow
receiver buffering
–RcvBuffer = size of TCP Receive Buffer
–RcvWindow = amount of spare room in Buffer
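The bookkeeping on both sides reduces to two subtractions. The sketch below assumes byte counters named in the usual textbook style (last byte received, read, sent, and ACKed); those names and both function names are assumptions for illustration and do not appear on the slide.

```python
def advertised_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: RcvWindow is the spare room left in the receive buffer."""
    buffered = last_byte_rcvd - last_byte_read   # received but not yet read by the app
    return rcv_buffer - buffered


def usable_window(last_byte_sent, last_byte_acked, rcv_window):
    """Sender side: how many more bytes may be sent without overrunning the receiver."""
    in_flight = last_byte_sent - last_byte_acked  # transmitted, unACKed data
    return max(0, rcv_window - in_flight)
```

With a 4096-byte buffer holding 2000 unread bytes, the receiver advertises 2096; a sender with 1000 bytes in flight may then send 1096 more.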
30
30 TL: TCP Flow control What happens if window is 0? –Receiver updates window when application reads data –What if this update is lost? Deadlock TCP Persist timer –Sender periodically sends window probe packets –Receiver responds with ACK and up-to-date window advertisement
31
31 TL: TCP flow control enhancements Problem: (Clark, 1982) –If receiver advertises small increases in the receive window then the sender may waste time sending lots of small packets What happens if window is small? –Small packet problem known as “Silly window syndrome” Receiver advertises one byte window Sender sends one byte packet (1 byte data, 40 byte header = 4000% overhead)
32
32 TL: TCP flow control enhancements
Solutions to silly window syndrome
Clark (1982)
–receiver avoidance
–prevent receiver from advertising small windows
–increase advertised receiver window by min(MSS, RecvBuffer/2)
Nagle’s algorithm (1984)
–sender avoidance
–prevent sender from unnecessarily sending small packets
–http://www.rfc-editor.org/rfc/rfc896.txt
  “Inhibit the sending of new TCP segments when new outgoing data arrives from the user if any previously transmitted data on the connection remains unacknowledged”
  Allow only one outstanding small (not full sized) segment that has not yet been acknowledged
–Works for idle connections (no deadlock)
–Works for telnet (send one-byte packets immediately)
–Works for bulk data transfer (delay sending)
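The sender-side rule can be written as a single decision function. This is a simplified sketch of the "one outstanding small segment" reading of Nagle's algorithm: it ignores the receiver window and congestion window, and the function and parameter names are hypothetical.

```python
def nagle_can_send(pending_bytes: int, mss: int, unacked_bytes: int) -> bool:
    """Decide whether the sender may transmit its buffered data now."""
    if pending_bytes <= 0:
        return False                 # nothing to send
    if pending_bytes >= mss:
        return True                  # a full-sized segment is never held back
    # A small segment goes out only if nothing is outstanding; otherwise
    # the data waits and coalesces until the ACK arrives.
    return unacked_bytes == 0
```

On an idle telnet connection the first keystroke goes out immediately; further keystrokes typed before the ACK returns are held and sent together, and bulk transfers keep sending full-sized segments.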
33
33 TL: TCP reliable data transfer Segment integrity Acknowledgement generation Retransmission
34
34 TL: TCP RDT segment integrity
Checksum included in header
Is it sufficient to just checksum the packet contents?
No, need to ensure correct source/destination
–Pseudoheader – portions of IP hdr that are critical
–Checksum covers Pseudoheader, transport hdr, and packet body
–Layer violation, redundant with parts of IP checksum
35
35 TL: TCP RDT acks and timeouts TCP’s reliable data transfer approach –Cumulative acknowledgements Receiver sends back the byte number it expects to receive next Out of order packets generate duplicate acknowledgements –Receive 1, Ack 2 –Receive 4, Ack 2 –Receive 3, Ack 2 –Receive 2, Ack 5 –Retransmissions Sender sends segment and sets a timer Waits for an acknowledgement indicating segment was received –Send 1 –Wait for Ack 2 –No Ack 2 and timer expires –Send 1 again
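The cumulative ACK example on this slide can be reproduced with a short receiver sketch. Like the slide, it numbers whole segments starting at 1 rather than bytes; the function name is made up for this example.

```python
def cumulative_acks(arrivals):
    """Return the ACK sent after each arriving segment number."""
    received = set()
    expected = 1            # next in-order segment the receiver is waiting for
    acks = []
    for seg in arrivals:
        received.add(seg)
        # Slide past every segment that is now contiguous with the data held.
        while expected in received:
            expected += 1
        acks.append(expected)
    return acks
```

Calling `cumulative_acks([1, 4, 3, 2])` gives the slide's sequence: segments 4 and 3 each trigger a duplicate ACK 2, and the arrival of 2 fills the gap so the ACK jumps to 5.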
36
36 TL: TCP RDT acks and timeouts
simplified sender, assuming
–one way data transfer
–no flow, congestion control
wait for event, then:
–event: data received from application above → create, send segment
–event: timer timeout for segment with seq # y → retransmit segment
–event: ACK received, with ACK # y → ACK processing
37
37 TL: TCP RDT acks and timeouts
Simplified TCP sender
sendbase = initial_sequence number
nextseqnum = initial_sequence number

loop (forever) {
  switch(event)
  event: data received from application above
    create TCP segment with sequence number nextseqnum
    start timer for segment nextseqnum
    pass segment to IP
    nextseqnum = nextseqnum + length(data)
  event: timer timeout for segment with sequence number y
    retransmit segment with sequence number y
    compute new timeout interval for segment y
    restart timer for sequence number y
  event: ACK received, with ACK field value of y
    if (y > sendbase) { /* cumulative ACK of all data up to y */
      cancel all timers for segments with sequence numbers < y
      sendbase = y
    }
    else { /* a duplicate ACK for already ACKed segment */
      increment number of duplicate ACKs received for y
      if (number of duplicate ACKs received for y == 3) {
        /* TCP fast retransmit */
        resend segment with sequence number y
        restart timer for segment y
      }
    }
} /* end of loop forever */
38
38 TL: TCP delayed acknowledgements Problem: –In request/response programs, you send separate ACK and Data packets for each transaction Delay ACK in order to send ACK back along with data Solution: –Don’t ACK data immediately Wait 200ms (must be less than 500ms – why?) Must ACK every other packet Must not delay duplicate ACKs –Without delayed ACK: 40 byte ack + data packet –With delayed ACK: data packet includes ACK –See web trace example –Extensions for asymmetric links See later part of lecture
39
39 TL: TCP ACK generation [RFC 1122, RFC 2581]
Event: in-order segment arrival, no gaps, everything else already ACKed
–TCP Receiver action: delayed ACK. Wait up to 200ms for next segment. If no next segment, send ACK
Event: in-order segment arrival, no gaps, one delayed ACK pending
–TCP Receiver action: immediately send single cumulative ACK
Event: out-of-order segment arrival, higher-than-expected seq. #, gap detected
–TCP Receiver action: send duplicate ACK, indicating seq. # of next expected byte
Event: arrival of segment that partially or completely fills gap
–TCP Receiver action: immediate ACK if segment starts at lower end of gap
40
40 TL: TCP retransmission
Wait at least one RTT before retransmitting packet
Importance of accurate RTT estimators:
–Estimator too low → unneeded retransmissions
–Estimator too high → poor throughput, slow reaction to segment loss
RTT estimator must adapt to change in RTT
–But not too fast, or too slow!
Backing off the retransmission timeout
–Exponential backoff
–Double retransmission timer interval after every loss until successful retransmission
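Exponential backoff of the retransmission timer is easy to tabulate. In this sketch the upper bound `max_rto` is an assumption added for illustration (implementations typically cap the timer, but the slide gives no value), and the function name is hypothetical.

```python
def backed_off_timeouts(initial_rto: float, losses: int, max_rto: float = 64.0):
    """Timeout used for the first transmission and for each retransmission."""
    # The interval doubles after every loss, up to the assumed cap.
    return [min(initial_rto * 2 ** i, max_rto) for i in range(losses + 1)]
```

Starting from a 1-second timeout, three consecutive losses give intervals of 1, 2, 4 and 8 seconds.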
41
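41 TL: TCP retransmission scenarios
lost ACK scenario
–Host A → Host B: Seq=92, 8 bytes data
–Host B → Host A: ACK=100, lost (X)
–Host A: timeout
–Host A → Host B: Seq=92, 8 bytes data (retransmission)
–Host B → Host A: ACK=100
premature timeout, cumulative ACKs
–Host A → Host B: Seq=92, 8 bytes data
–Host A → Host B: Seq=100, 20 bytes data
–Host B → Host A: ACK=100, then ACK=120
–Host A: Seq=92 timeout fires before ACK=100 arrives
–Host A → Host B: Seq=92, 8 bytes data (retransmission)
–Host B → Host A: ACK=120 (cumulative)
–Host A: Seq=100 timeout interval: ACK=120 arrives before it expires, so Seq=100 is not retransmitted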
42
42 TL: Initial Round-trip Estimator
Round trip times exponentially averaged:
–EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
–Recommended value for x: 0.1-0.2 (0.125 for most TCP’s)
–Influence of given sample decreases exponentially fast
Retransmit timer set to β*RTT, where β = 2
–Every time timer expires, RTO exponentially backed-off
–Like Ethernet
Not good at preventing spurious timeouts
43
43 TL: Jacobson’s Retransmission Timeout
Key observation:
–At high loads round trip variance is high
–Need larger safety margin with larger variations in RTT
Solution:
–Base RTO value on RTT and standard deviation of RTT
44
44 TL: Jacobson’s Retransmission Timeout
EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTT
Deviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|
Setting the timeout
–EstimatedRTT plus “safety margin”
–large variation in EstimatedRTT -> larger safety margin
Timeout = EstimatedRTT + 4*Deviation
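The three formulas on this slide translate directly into an update step. The sketch applies them in the order EstimatedRTT, Deviation, Timeout and uses the already-updated EstimatedRTT inside the deviation term; that ordering is an assumption, since the slide does not say which estimate the deviation uses, and real implementations differ on this point. The function name and the default x = 0.125 are likewise illustrative.

```python
def update_rtt(estimated_rtt, deviation, sample_rtt, x=0.125):
    """Fold one RTT sample into the estimates and return the new timeout."""
    estimated_rtt = (1 - x) * estimated_rtt + x * sample_rtt
    deviation = (1 - x) * deviation + x * abs(sample_rtt - estimated_rtt)
    timeout = estimated_rtt + 4 * deviation
    return estimated_rtt, deviation, timeout
```

Worked by hand in milliseconds: with EstimatedRTT = 100, Deviation = 0 and a sample of 200, the estimate becomes 112.5, the deviation 0.125 * 87.5 = 10.9375, and the timeout 112.5 + 4 * 10.9375 = 156.25.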
45
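45 TL: Retransmission Ambiguity
[Figure: two A–B timelines, each showing an original transmission, an RTO expiry, a retransmission, and an ACK; in one of them a transmission is lost (X). The Sample RTT differs depending on whether the ACK is matched to the original transmission or to the retransmission.]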
46
46 TL: Karn’s algorithm Accounts for retransmission ambiguity If a segment has been retransmitted: –Don’t count RTT sample on ACKs for this segment –Keep backed off time-out for next packet –Reuse RTT estimate only after one successful transmission
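Karn's rule amounts to one extra branch in the ACK handler. In this sketch the RTO for a clean sample is recomputed as beta times the estimate, which borrows the simple estimator from the earlier slide; that choice, the parameter defaults, and the function name are assumptions for illustration.

```python
def on_ack(estimated_rtt, rto, sample_rtt, was_retransmitted, x=0.125, beta=2.0):
    """Return the (estimated_rtt, rto) pair to use after an ACK arrives."""
    if was_retransmitted:
        # Ambiguous sample: the ACK may belong to either transmission.
        # Discard it and keep the backed-off timeout for the next packet.
        return estimated_rtt, rto
    # Clean sample from a segment sent exactly once: update the estimate
    # and derive the timeout from it again.
    estimated_rtt = (1 - x) * estimated_rtt + x * sample_rtt
    return estimated_rtt, beta * estimated_rtt
```

After a retransmission the estimate stays at its old value and the backed-off RTO persists; the first ACK for a segment that was sent only once resets the RTO from the refreshed estimate.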
47
47 TL: Timer Granularity
Many TCP implementations set RTO in multiples of 200, 500, or 1000 ms
Why?
–Avoid spurious timeouts – RTTs can vary quickly due to cross traffic
–Make timer interrupts efficient