INFO 330 Computer Networking Technology I


1 INFO 330 Computer Networking Technology I
Chapter 3: The Transport Layer. Dr. Jennifer Booker. INFO 330 Chapter 3

2 Transport Layer The transport layer handles logical communication between processes. It is the last layer not used for routing between hosts, so it is the last thing a client process and the first thing a server process sees of a packet. By logical communication, we recognize that the means used to get between processes, and the distance covered, are irrelevant.

3 Transport vs Network Notice we didn't say 'hosts' in the previous slide... that's because the network layer provides logical communication between hosts. Mail analogy: let's assume cousins (processes) want to send letters to each other between their houses (hosts). They use their parents (transport layer) to mail the letters, and to sort the mail when it arrives.

4 Transport vs Network The letters travel through the postal system (network layer) to get from house to house. The transport layer doesn't participate in the network layer's activities (e.g. most parents don't work in the mail distribution centers); the transport layer protocols are localized in the hosts. Routing isn't affected by anything the transport layer added to the messages.

5 Transport vs Network Following the analogy, different people might pick up and sort the mail; they're like different transport layer protocols. The transport layer protocols (parents) are often at the mercy of whatever services the network layer (postal system) provides. Still, some services can be provided at the transport layer even when the network layer doesn't provide them (e.g. reliable data transfer or encryption).

6 Two Choices Here we choose between TCP and UDP
In the transport layer, a packet is called a segment; in the network layer, a packet is called a datagram. The network layer is home to the Internet Protocol (IP). IP provides logical communication between hosts. IP makes a "best effort" to get datagrams where they belong – no guarantees of delivery, delivery sequence, or delivery integrity.

7 IP Each host has an IP address
The common purpose of UDP and TCP is to extend IP's delivery between hosts to delivery between the hosts' processes. This is called transport-layer multiplexing and demultiplexing. Both UDP and TCP also provide error checking. That's it for UDP – data delivery and error checking!

8 TCP TCP also provides reliable data transfer (not just data delivery)
Uses flow control, sequence numbers, acknowledgements, and timers to ensure data is delivered correctly and in order. TCP also provides congestion control: TCP applications share the available bandwidth (they watched Sesame Street!), while UDP takes whatever it can get (greedy little protocol).

9 Multiplexing & Demultiplexing
At the destination host, the transport layer gets segments from the network layer, and needs to deliver these segments to the correct process on that host. It does so via sockets, which connect processes to the network. Each socket has a unique identifier, whose format varies between UDP and TCP.

10 Multiplexing & Demultiplexing
Demultiplexing is getting a transport layer segment into the correct socket. Multiplexing is taking data from various sockets, applying header info, breaking it into segments, and delivering them to the network layer. Multiplexing and demultiplexing are used in any kind of network, not just in the Internet protocols.

11 Multiplexing & Demultiplexing
[Figure: protocol stacks (application, transport, network, link, physical) on three hosts, with processes P1–P4 and their sockets. Demultiplexing at the receiving host: delivering received segments to the correct socket. Multiplexing at the sending host: gathering data from multiple sockets and enveloping it with header info (later used for demultiplexing).]

12 Mail Analogy Multiplexing is when a parent collects letters from the cousins and puts them into the mail. Demultiplexing is getting the mail and handing the correct mail to each cousin. Here we need unique socket identifiers, and some place in the header for the socket identifier information.

13 Segment Header Hence the segment header starts with the source and destination port numbers. Each port number is a 16-bit (2 byte) value (0 to 65,535). Well-known port numbers are 0 to 1023 (2^10 − 1). After the port numbers come other headers, specific to TCP or UDP, then the message.

14 UDP Multiplexing UDP assigns a port number from 1024 to 65,535 to each socket, unless the developer specifies otherwise. UDP identifies a socket only by destination IP address and destination port number. The port numbers for source and destination are switched (inverted) when a reply is sent, so a reply to a segment that came from source port 19157 is addressed back to destination port 19157.

15 TCP Multiplexing TCP is messier, of course
TCP identifies a socket by four values: source IP address, source port number, destination IP address, and destination port number. Hence if UDP gets two segments with the same destination IP and port number, they'll both go to the same process; TCP tells such segments apart via the source IP and port.

16 TCP Multiplexing So if you have two HTTP sessions going to the same web server and page, how can TCP tell them apart? Even though the destination IP and port (80) are the same, and the two sessions (processes) have the same source IP address, they have different source port numbers.
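The 4-tuple lookup described above can be sketched in a few lines of Python (a toy illustration, not real TCP stack code; the names sockets, register, and demultiplex are invented for this sketch):

```python
# Sketch: demultiplexing TCP segments by the connection 4-tuple.
# Each established socket is registered under its full 4-tuple, so two
# HTTP sessions to the same server (same dst IP and port 80, same src
# IP) still land in different entries because their src ports differ.
sockets = {}

def register(src_ip, src_port, dst_ip, dst_port, handler):
    sockets[(src_ip, src_port, dst_ip, dst_port)] = handler

def demultiplex(src_ip, src_port, dst_ip, dst_port, payload):
    handler = sockets.get((src_ip, src_port, dst_ip, dst_port))
    if handler is None:
        raise KeyError("no matching socket")
    return handler(payload)
```
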

17 Port scanning Apps called port scanners (e.g. nmap) can scan the ports on a computer and see which are open. This tells us what apps are running on that host, so attacks can then be targeted at those apps. A big security vulnerability is leaving ports open that you aren't using – they could accept hostile TCP connections.
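As a hedged illustration of what a connect-style scan does, here is a minimal Python sketch (real tools like nmap use raw SYN probes and many other techniques; port_open is a name invented here):

```python
import socket

def port_open(host, port, timeout=0.5):
    """Toy TCP connect scan of a single port: a completed connect
    means some app is listening there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising
        return s.connect_ex((host, port)) == 0
```
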

18 Web Servers & TCP Each new client connection often uses a new process and socket to send HTTP requests and get responses. But threads (lightweight processes) can be used instead, so a single process can have multiple sockets, one for each thread.

19 UDP The most minimal transport layer has to do multiplexing and demultiplexing. UDP does this and a little error checking and, well, um, that's about it! UDP was defined in RFC 768. An app that uses UDP talks almost directly to IP: UDP adds only two small data fields to the header, after the requisite source/destination port numbers. There's no handshaking; UDP is connectionless.

20 UDP for DNS DNS uses UDP. A DNS query is packaged into a segment and passed to the network layer. The DNS app waits for a response; if it doesn't get one soon enough (times out), it tries another server or reports no reply. Hence the app must allow for the unreliability of UDP by planning what to do if no response comes back.
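The retry logic described for the DNS app can be sketched generically (a hypothetical send_query callable stands in for the actual socket send/receive; assume it raises TimeoutError on a lost reply, as socket.recvfrom does after settimeout):

```python
def query_with_retry(send_query, servers, timeout_s=2.0, attempts=2):
    """Sketch of the DNS-over-UDP pattern: try each server a few
    times, treat a timeout as 'no reply', and fall through to the
    next server."""
    for server in servers:
        for _ in range(attempts):
            try:
                return send_query(server, timeout_s)
            except TimeoutError:
                continue  # UDP gives no delivery guarantee: retry
    return None  # the app must plan for total silence
```
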

21 UDP Advantages Still UDP is good when:
You want the app to have detailed control over what is sent across the network (UDP changes it little). No connection establishment delay. No connection state data in the end hosts; hence a server can support more UDP clients than TCP clients. Small packet header overhead per segment: TCP uses 20 bytes of header data, UDP only 8 bytes.

22 UDP Apps Other than DNS, UDP is also used for
Network management (SNMP), routing (RIP), multimedia & telephony (proprietary protocols), and remote file service (NFS). The lack of congestion control in UDP can be a problem when lots of large UDP messages are being sent – they can crowd out TCP apps.

23 UDP Header The UDP header has four two-byte fields in two 32-bit rows (8 bytes total): source port number and destination port number, then length and checksum. Length is the total length of the segment, including headers, in bytes. The checksum is used by the receiver to see if errors occurred.
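Those four fields can be packed and parsed with Python's struct module (a sketch of the RFC 768 layout; build_udp and parse_udp are names invented here, and the checksum is left at 0 for simplicity):

```python
import struct

# The four 16-bit UDP header fields (RFC 768), packed big-endian:
# source port, destination port, length, checksum.
UDP_HDR = struct.Struct("!HHHH")

def build_udp(src, dst, payload, checksum=0):
    length = UDP_HDR.size + len(payload)  # length covers header + data
    return UDP_HDR.pack(src, dst, length, checksum) + payload

def parse_udp(segment):
    src, dst, length, checksum = UDP_HDR.unpack_from(segment)
    return src, dst, length, checksum, segment[UDP_HDR.size:]
```
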

24 Checksum Noise in the transmission lines can alter bits of data or rearrange them in transit. Checksums are a common method to detect errors (RFC 1071). To create a checksum: find the sum of the 16-bit words of the message (wrapping any carry back around); the checksum is the 1s (ones) complement of that sum. If the message is uncorrupted, the sum of the message plus the checksum is all ones...

25 1s Complement? The 1s complement is a mirror image of a binary number – change all the zeros to ones, and ones to zeros; e.g. the 1s complement of 0110 is 1001. UDP does error checking because not all lower layer protocols do error checking. This provides end-to-end error checking, which is more efficient than checking at every step along the way.
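The checksum procedure on the last two slides can be written out directly (a sketch of the RFC 1071 method over 16-bit words):

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 checksum: sum 16-bit words with end-around carry,
    then take the ones' complement of the sum."""
    if len(data) % 2:            # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # end-around carry
    return ~total & 0xFFFF       # ones' complement
```

If the message is uncorrupted, appending the checksum and re-running the sum gives all ones, i.e. a recomputed checksum of zero.
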

26 UDP That’s it for UDP! The port addresses, the message length, and a checksum to see if it got there intact Now see what happens when we want reliable data transfer INFO 330 Chapter 3

27 Reliable Data Transfer
Distinguish between the service model and how it's really implemented. Service model: from the app perspective, it just wants a reliable transport layer to connect sending and receiving processes. Service implementation: in reality, the transport layer has to use an unreliable network layer (IP), so transport has to make up for the unreliability below it.

28 Reliable Data Transfer
The sending process will give the transport layer a message via rdt_send (rdt = reliable data transfer). The transport protocol will call udt_send (udt = unreliable data transfer; Fig 3.8 has a typo) to give it to the network layer. At the receiving end, the protocol gets the packet via rdt_rcv from the network layer, then calls deliver_data to give it to the receiving application process.

29 Reliable Data Transfer
[Figure: the app sees the reliable "service model" on top; underneath, our transport protocol has to implement it over the unreliable network layer.]

30 Reliable Data Transfer
Here we'll refer to the data as packets, rather than distinguish segments, etc. Also, we'll pretend we only have to send data in one direction (unidirectional data transfer). Bidirectional data transfer is what really occurs, with the sending and receiving sides switched as needed. Time to build a reliable data transfer protocol, one piece at a time.

31 Reliable Data Transfer v1.0
For the simplest case, called rdt1.0, assume the network is completely reliable. Finite state machines (FSMs) for the sender and receiver each have one state – waiting for a call. The sending side (rdt_send) makes a packet (make_pkt) and sends it (udt_send). The receiving side (rdt_rcv) extracts data from the packet (extract) and delivers it to the receiving app (deliver_data).

32 Reliable Data Transfer v1.0
Here a packet is the only unit of data. No feedback to the sender is needed to confirm receipt of data, and no control over transmission rate is needed.

33 Reliable Data Transfer v2.0
Now allow bit errors in transmission, but assume all packets are received, in the correct order. We need acknowledgements to know when a packet was correct (OK, 10-4) versus when it wasn't (please repeat); these are called positive and negative acknowledgements, respectively. These types of messages are typical for any Automatic Repeat reQuest (ARQ) protocol.

34 Reliable Data Transfer v2.0
So allowing for bit errors requires three capabilities: error detection to know if a bit error occurred; receiver feedback, both positive (ACK) and negative (NAK) acknowledgements; and retransmission of incorrect packets.

35 Reliable Data Transfer v2.0
[Figure: rdt2.0 FSMs.]

36 Reliable Data Transfer v2.0
Sending FSM (cont.): The left state waits for a packet from the sending app, and makes a packet with a checksum (make_pkt). Then it sends the packet (udt_send) and moves to the other state (waiting for ACK/NAK). If it gets a NAK response (errors detected), it resends the packet (udt_send) until it gets through correctly. If it gets an ACK response (no errors), it goes back to the first state to wait for the next packet from the app.

37 Reliable Data Transfer v2.0
Notice this model does nothing until it gets the NAK/ACK, so it's a stop-and-wait protocol. Receiving FSM: the receiving side uses the checksum to see if the packet was corrupted. If it was (&& corrupt), send a NAK response. If it wasn't (&& notcorrupt), extract and deliver the data, and send an ACK response. But what if the NAK/ACK is corrupted?

38 Reliable Data Transfer v2.0
Three possible ways to handle NAK/ACK errors: (1) Add another type of response to ask for the NAK/ACK to be repeated; but what if that response got corrupted? This leads to a long string of messages... (2) Add checksum data to the NAK/ACK, plus enough data to recover from the error. (3) Resend the packet if the NAK/ACK is garbled; but this introduces possible duplicate packets.
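The stop-and-wait ACK/NAK exchange can be simulated in a few lines (a toy sketch: the one-byte checksum, the channel callback, and the deterministic corrupt-the-first-copy behavior are all inventions for illustration):

```python
def checksum(data: bytes) -> int:
    return sum(data) & 0xFF  # toy checksum, not the real Internet one

def rdt20_send(messages, channel):
    """Stop-and-wait sender sketch (rdt2.0): send one packet, block
    for the ACK/NAK; on NAK, retransmit the same packet."""
    delivered = []
    for data in messages:
        pkt = (checksum(data), data)
        while channel(pkt, delivered) == "NAK":
            pass  # retransmit until the receiver reports no errors
    return delivered

def make_channel(corrupt_first=True):
    """Channel + receiver in one: corrupts the first copy of each
    packet, then NAKs corrupt packets and ACKs/delivers good ones."""
    seen = set()
    def channel(pkt, delivered):
        chk, data = pkt
        if corrupt_first and data not in seen:
            seen.add(data)
            chk ^= 0xFF          # simulate bit errors in transit
        if chk != checksum(data):
            return "NAK"         # receiver detected corruption
        delivered.append(data)   # extract + deliver_data
        return "ACK"
    return channel
```
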

39 Reliable Data Transfer v2.1
TCP and most reliable protocols add a sequence number to the data from the sender. Since packets can't be lost yet (by assumption), a one-bit number is adequate to tell if this is a new packet or a repeat of the previous one. This gives our new model, rdt version 2.1.

40 Reliable Data Transfer v2.1
[Figure: rdt2.1 sender FSM.]

41 Reliable Data Transfer v2.1
Now the number of states is doubled, since we have sequence numbers 0 and 1. So in make_pkt(1, data, checksum), the 1 is the sequence number. The sequence number alternates if everything works; if a packet is corrupted, the same sequence number is expected two or more times. Start at the 'Wait for call 0' state; when a packet comes from the app, send it to the network with sequence 0, then wait for an ACK or NAK for sequence 0.

42 Reliable Data Transfer v2.1
If the packet was corrupt, or a NAK came back, resend that packet (upper right loop). Otherwise wait for a call with sequence 1 from the app. When call 1 is received, make and send the packet with sequence 1 (desired outcome), then wait for a NAK/ACK for sequence 1. If corrupt or a NAK, resend (lower left loop); otherwise go back to waiting for a sequence 0 call from the app. Repeat the cycle.

43 Reliable Data Transfer v2.1
[Figure: rdt2.1 receiver FSM.]

44 Reliable Data Transfer v2.1
The receiver side also doubles its number of states. When in the 'waiting for seq 0' state: if the packet has sequence 0 and isn't corrupt, extract and deliver the data, send an ACK, and go to the 'wait for seq 1' state. If the packet was corrupt, reply with a NAK. If the packet has sequence 1 and was not corrupt (a duplicate of the previous packet), send an ACK and keep waiting for a seq 0 packet. Mirror the above when starting from the 'wait for seq 1' state.

45 Reliable Data Transfer v2.2
We could achieve the same effect without a NAK (for a corrupt packet) if we only ACK the last correctly received packet. Two ACKs for the same packet (duplicate ACKs) mean the packet after the ACKed one wasn't received correctly. The NAK-free protocol is called rdt2.2.

46 Reliable Data Transfer v2.2
[Figure: rdt2.2 sender and receiver FSMs.]

47 Reliable Data Transfer v2.2
Again, the send and receive FSMs are symmetric for sequences 0 and 1. The sender must now check the sequence number of the packet being ACK'd (see the isACK message). The receiver must include the sequence number in the make_pkt message. The FSM on page 211 also has a oncethru variable to help avoid duplicate ACKs.

48 Reliable Data Transfer v3.0
Now account for the possibility of lost packets. We need to detect packet loss, and decide what to do about it. The latter is easy with the tools we have (ACK, checksum, sequence #, and retransmission), but we need a new detection mechanism. Many loss detection approaches are possible; we focus on making the sender responsible for it.

49 Reliable Data Transfer v3.0
The sender thinks a packet is lost when the packet doesn't get to the receiver, or the ACK gets lost. We can't wait for the worst-case transmission time, so pick a reasonable time before error recovery is started. This could result in duplicate packets if the original was still on the way, but rdt2.2 can already handle that. For the sender, retransmission is the ultimate solution – whether the packet or the ACK was lost.

50 Reliable Data Transfer v3.0
Knowing when to retransmit needs a countdown timer: count the time from sending a packet until its ACK arrives; if the timeout is exceeded, retransmit that packet. This works the same whether the packet or the ACK is lost. Since the packet sequence numbers alternate between 0 and 1, this is called an alternating-bit protocol.

51 Reliable Data Transfer v3.0
[Figure: rdt3.0 sender FSM.]

52 Reliable Data Transfer v3.0
How does the receiver FSM differ from rdt2.2? It doesn't. The sender is responsible for loss detection. Notice that, even allowing for lost packets, we still assume only one packet is sent completely and correctly at a time. But rdt3.0 still stops to wait for the timeout of each packet – we fix that with pipelining.

53 Pipelined RDT Suppose we implemented rdt3.0 between NYC and LA
Distance of 3000 miles gives an RTT of about 30 ms. If the transmission rate is 1 Gbps and packets are 1 kB (8,000 bits), the transmission time is only 8,000 b / 1E9 b/s = 8 microseconds (0.008 ms). Even if ACK messages are very small (transmission time about zero), the time for one packet to be sent and ACKed is 30.008 ms.

54 Pipelined RDT Hence we're transmitting for 0.008 ms out of the 30.008 ms round trip, which equals about 0.027% utilization. How a protocol is implemented drastically affects its usefulness! It makes sense to send multiple packets and keep track of the ACKs for each. Methods to do so are Go-Back-N (GBN) and Selective Repeat (SR).

55 Go-Back-N In this protocol, the sender can send up to N packets without getting an ACK*. N is also called a window size, and the protocol is a.k.a. a sliding-window protocol. Let base be the number of the first packet in a window (the window size, N, is already defined). Then all packets from 0 to base-1 have already been sent and ACKed. * Why a limit at all? We'll need one for flow and congestion control later.

56 Go-Back-N The window currently covers packets base to base+N-1; these packets can be sent before their ACKs are received. Packet sequence numbers need a maximum value: if k bits are in the sequence number, the range of sequence numbers is 0 to 2^k − 1. The sequence numbers are used in a circle, so after 2^k − 1 you use 0 again, then 1, etc.

57 Go-Back-N In the FSMs for Go-Back-N (GBN)
rdt3.0 only had sequence numbers 0 and 1; TCP has a 32-bit sequence number range for the bytes in a byte stream. In the FSMs for Go-Back-N (GBN), the sender must respond to: a call from above (i.e. the app); receipt of an ACK for any of the packets outstanding, providing cumulative acknowledgement; and a timeout, which causes all un-ACKed packets to be re-sent.

58 Go-Back-N The GBN receiver does:
If a packet is correct and in order, send an ACK. The sender moves its window up with each correct, in-order packet ACKed – this minimizes resending later. In all other cases, throw away the packet and resend the ACK for the most recent correct packet. Hence we throw away correct but out-of-order packets – this makes receiver buffering easier.
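The GBN receiver rules above fit in one small function (a sketch; gbn_receiver and its in-memory packet list are inventions for illustration, with -1 standing for "nothing ACKed yet"):

```python
def gbn_receiver(arrivals):
    """GBN receiver sketch: accept only the next expected sequence
    number; discard everything else (even correct, out-of-order
    packets) and re-ACK the highest in-order packet so far."""
    expected = 0
    delivered, acks = [], []
    for seq, data in arrivals:
        if seq == expected:      # correct and in order: deliver it
            delivered.append(data)
            expected += 1
        # Cumulative ACK: always acknowledge the last in-order packet
        # (-1 means nothing has been received yet).
        acks.append(expected - 1)
    return delivered, acks
```

Note how the out-of-order arrival of packet 2 below is discarded and the previous ACK repeated, so the sender eventually goes back and resends from the gap.
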

59 Go-Back-N GBN can be implemented in event-based programming; the events here are: the app invokes rdt_send; the receiver protocol receives rdt_rcv; and timer interrupts. In contrast, consider the selective repeat (SR) approach to pipelining.

60 Selective Repeat A large window size and bandwidth-delay product can put a lot of packets in the pipeline under GBN, which can cause a lot of retransmission when a packet is lost. Selective repeat only retransmits packets believed to be in error – so retransmission is on a more individual basis. To do this, buffer out-of-order packets until the missing packets are filled in.

61 Selective Repeat SR still uses a window of size N packets
The SR sender responds to: data from the app above it (it finds the next available sequence number, and sends as soon as possible); a timeout, which is kept per packet; and an ACK from the receiver, in which case the sender marks off that packet, moves the window forward, and can transmit packets inside the new window.

62 Selective Repeat The SR receiver responds to
A packet within the current window: send an ACK; deliver packets at the bottom of the window, but buffer higher-numbered (out of order) packets. Packets that were previously ACKed are ACKed again. Otherwise ignore the packet. Notice the sender and receiver windows are generally not the same!!

63 Selective Repeat It's possible for the sequence number range and window size to be too close, producing confusing signals: a retransmission becomes indistinguishable from a new packet. To prevent this, the window size must be at most half of the sequence number range.
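That constraint is easy to state in code (a sketch: with k sequence-number bits the space has 2^k values, so the window may be at most half of that):

```python
def sr_window_ok(seq_bits, window):
    """Selective Repeat is only unambiguous when the window is at
    most half the sequence-number space: window <= 2**seq_bits / 2.
    E.g. with 2 bits (space 0..3) and window 3, a retransmitted
    seq 0 looks the same as a brand-new seq 0 to the receiver."""
    return window <= 2 ** seq_bits // 2
```
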

64 Packet Reordering Our last assumption was that packets arrive in order, if at all. What if they arrive out of order? Out-of-order packets could have sequence numbers outside of either window (sender's or receiver's). Handle this by not allowing packets older than some maximum time; TCP typically uses 3 minutes.

65 Reliable Data Transfer Mechanisms
Checksum, to detect bit errors in a packet. Timer, to know when a packet or its ACK was lost. Sequence number, to detect lost or duplicate packets. Acknowledgement, to know a packet got to the receiver correctly. Negative acknowledgement, to tell a packet was received but corrupted. Window, to pipeline many packets at once before an ACK is received for any of them.

66 TCP Intro Now see how all this applies to TCP
First in RFC 793, now RFC 2581. Invented circa 1974 by Vint Cerf and Robert Kahn. TCP starts with a handshake protocol, which defines many connection variables. The connection exists only at the hosts, not in between – routers are oblivious to whether TCP is used! TCP is a full duplex service – data can flow in both directions at once – and is connection-oriented.

67 TCP Intro TCP is point-to-point – between a single sender and a single receiver – in contrast with multipoint technologies. TCP is client/server based: the client needs to establish a socket to the server's hostname and port (recall default port numbers are app-specific). Special segments are sent by client, server, and client again to make the three-way handshake.

68 TCP Intro Once the connection exists, processes can send data back and forth. The sending process sends data through its socket to the TCP send buffer; TCP sends data from the send buffer when it feels like it. The Max Segment Size (MSS) is based on the max frame size, or Max Transmission Unit (MTU): we want 1 TCP segment to eventually fit in the MTU.

69 TCP Intro Typical MTU values are 512 – 1460 bytes. The MSS is the max app data that can fit in a segment, not the total segment size (which includes headers). TCP adds headers to the data, creating TCP segments. Segments are passed to the network layer to become IP datagrams, and so on into the network.

70 TCP Intro At the server side, the segment's data is placed in the receive buffer. So a TCP connection consists of two buffers (send and receive), some variables, and a socket on each of the corresponding processes.

71 TCP Segment Structure A TCP segment consists of header fields and a data field. The data field size is limited by the MSS. The typical header size is 20 bytes. The header is 32 bits (4 bytes) wide, so it has five rows at a minimum.

72 TCP Header Structure The header lines are
Source and destination port numbers (16 bits each). Sequence number (32 bits). ACK number (32 bits). A bunch of little stuff (header length, URG, ACK, PSH, RST, SYN, and FIN bits), then the receive window (16 bits). Internet checksum and urgent data pointer (16 bits each). And possibly several options.

73 TCP Segment Structure We've seen the port numbers (16 bits each) and the sequence and ACK numbers (32 bits each). The 'bunch of little stuff' includes the header length (4 bits) and a flag field with six one-bit fields: ACK, RST, SYN, FIN, PSH, and URG. The URG bit marks urgent data later on that row. The receive window is used for flow control.

74 TCP Segment Structure The checksum is used for bit error detection, as with UDP. The urgent data pointer tells where the urgent data is located. The options include negotiating the MSS, scaling the window size, or time stamping.

75 TCP Sequence Numbers The sequence numbers are important for TCP's reliability. TCP views data as an unstructured but ordered stream of bytes. Hence the sequence number for a segment is the byte-stream number of the first byte in the segment. Yes, each byte is counted!

76 TCP Sequence Numbers So if the MSS is 1000 bytes, the first segment will be number 0, and cover bytes 0 to 999. The second segment is number 1000, and covers bytes 1000 to 1999; the third is number 2000, covering bytes 2000 to 2999, etc. Typically both sides start their sequences at random numbers, to avoid accidental overlap with previously used numbers.
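The byte-stream numbering can be sketched as a helper that splits a send buffer into (sequence number, size) pairs (an illustration only; segment_sequence_numbers is a name invented here):

```python
def segment_sequence_numbers(isn, data_len, mss):
    """Byte-stream numbering sketch: each segment's sequence number
    is the byte-stream number of its first byte, starting from the
    initial sequence number isn."""
    seqs = []
    offset = 0
    while offset < data_len:
        seqs.append((isn + offset, min(mss, data_len - offset)))
        offset += mss
    return seqs  # list of (sequence number, segment payload size)
```
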

77 TCP Acknowledgement No.
TCP acknowledgement numbers are weird: the number used is the next byte number expected from the sender. So if host B sends bytes 0 through 535 to host A, host A expects byte 536 to be the start of the next segment, so 536 is the Ack number. This is a cumulative acknowledgement, since it only counts up to the first missing byte in the byte-stream.

78 TCP Out-of-Order Segments
What does TCP do when segments arrive out of order? That's up to the TCP implementer: TCP can either discard out-of-order segments, or keep the strays in a buffer and wait for the missing pieces to be filled in. The former is easier to implement; the latter is more efficient and commonly used.

79 Telnet Example Telnet (RFC 854) is an old app for remote login via TCP
Telnet interactively echoes whatever was typed, to show it got to the other side. Host A is the client and starts a session with host B, the server. Suppose the client starts with sequence number 42, and the server with 79.

80 Telnet Example User types a single letter, ‘c’
Notice how the seq and Ack numbers mirror or "piggyback" on each other.

81 Timeout Calculation TCP needs a timeout interval, as discussed in the rdt example, but how long? Longer than the RTT, but by how much? A week? Measure a SampleRTT for segments here and there (not every one). This SampleRTT value will fluctuate; its average value, called EstimatedRTT, is a moving average updated with each measurement.

82 Timeout Calculation Naturally, EstimatedRTT is a smoother curve than the SampleRTT values: EstimatedRTT = 0.875*EstimatedRTT + 0.125*SampleRTT. The variability of RTT is measured by DevRTT, a moving average of the magnitude of the difference between SampleRTT and EstimatedRTT: DevRTT = 0.75*DevRTT + 0.25*|SampleRTT - EstimatedRTT|.

83 Timeout Calculation We want the timeout interval larger than EstimatedRTT, but not huge; use TimeoutInterval = EstimatedRTT + 4*DevRTT. This is analogous to control charts, where a measurement exceeds (mean + 3*standard deviation) only about ¼% of the time. DevRTT isn't a standard deviation, but the idea is similar.
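The three update rules from the last two slides can be collected into a small class (a sketch using the standard weights 0.125 and 0.25, plus the RFC 6298 initialization DevRTT = SampleRTT/2 for the first sample):

```python
class RttEstimator:
    """EWMA timeout estimator sketch for TCP's TimeoutInterval."""
    def __init__(self, first_sample):
        self.estimated = first_sample       # EstimatedRTT
        self.dev = first_sample / 2         # DevRTT, per RFC 6298

    def update(self, sample):
        # DevRTT must be updated first, using the old EstimatedRTT
        self.dev = 0.75 * self.dev + 0.25 * abs(sample - self.estimated)
        self.estimated = 0.875 * self.estimated + 0.125 * sample

    @property
    def timeout(self):
        # TimeoutInterval = EstimatedRTT + 4*DevRTT
        return self.estimated + 4 * self.dev
```
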

84 Timeout Calculation Notice this means the timeout interval is constantly being recalculated, which requires frequent measurement of SampleRTT to find current values for EstimatedRTT, DevRTT, and TimeoutInterval.

85 Reliable Data Transfer
IP is not a reliable datagram service: it doesn't guarantee delivery, in-order delivery, or intact delivery. In theory we saw that separate timers for each segment would be nice; in reality TCP uses one retransmission timer for several segments (RFC 2988). For the next example, assume host A is sending a big file to host B.

86 Simplified TCP Here the sender responds to three events:
Receive data from application: make segments of the data, each with a sequence number, pass them to the IP layer, and start the timer if it isn't already running. Timer times out: re-send the segment that timed out. ACK received: compare the received ACK value with SendBase (the sequence number of the oldest unacknowledged byte), and restart the timer if any un-ACKed segments are left.

87 Simplified TCP Even this version of TCP can successfully handle lost ACKs by ignoring duplicate segments (Fig 3.34, p. 256). If a segment times out, later segments don't get re-sent (Fig 3.35, p. 257). A lost ACK can still be deduced not to be a lost segment (Fig 3.36, p. 258).

88 Doubling Timeout After a timeout event, many TCP implementations double the timeout interval. This helps with congestion control, since a timeout is often due to congestion, and retransmitting often just makes it worse!

89 Fast Retransmit Waiting for the timeout can be too slow
We might know to retransmit sooner if we get duplicate ACKs. A duplicate ACK for a given byte number means a gap was noted in the segment sequence (since TCP has no negative acknowledgements). Getting three duplicate ACKs typically forces a fast retransmit of the segment after that byte number.

90 Go-Back-N vs. Selective Repeat?
TCP partly looks like Go-Back-N (GBN): it tracks the oldest sequence number transmitted but not ACKed (SendBase) and the sequence number of the next byte to send (NextSeqNum). TCP partly looks like Selective Repeat (SR): it often buffers out-of-order segments to limit the range of segments retransmitted. TCP can also use selective acknowledgment (RFC 2018) to specify which segments arrived out of order.

91 Flow Control TCP connection hosts maintain a receive buffer for bytes received correctly and in order. Apps might not read from the buffer for a while, so it can overflow. Flow control focuses on preventing overflow of the receive buffer, so it also depends on how fast the receiving app is reading the data!

92 Flow Control Hence each TCP host maintains a receive window (RcvWindow) variable – how much room is left in the receive buffer; the receiver computes it and advertises it to the sender. The receive buffer has size RcvBuffer. The last byte number read by the receiving app is LastByteRead; the last byte put in the receive buffer is LastByteRcvd. RcvWindow = RcvBuffer – (LastByteRcvd – LastByteRead) = rwnd.

93 Flow Control So the amount of room in RcvWindow varies with time, and is returned to the sender in the receive window field of every segment (see slide 73). The sender also keeps track of LastByteSent and LastByteAcked; the difference between them is the amount of data in flight between sender and receiver. Keep that difference no more than RcvWindow to make sure the receive buffer isn't overflowed: LastByteSent – LastByteAcked <= RcvWindow.
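The two flow-control formulas can be written out directly (a sketch; the function names are invented here):

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """rwnd = RcvBuffer - (LastByteRcvd - LastByteRead):
    the spare room left in the receive buffer."""
    return rcv_buffer - (last_byte_rcvd - last_byte_read)

def sender_may_send(last_byte_sent, last_byte_acked, rwnd, nbytes):
    """Flow-control check: keep unacknowledged (in-flight) data
    within the receiver's advertised window."""
    return (last_byte_sent + nbytes) - last_byte_acked <= rwnd
```
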

94 Flow Control If RcvWindow goes to zero, the sender must stop sending data to the receiver – and with no data flowing, no new ACKs would ever tell it the buffer has drained! To prevent this, TCP makes the sender transmit one-byte segments when RcvWindow is zero, so that the receiver can indicate when the buffer is no longer full.

95 UDP Flow Control There ain’t none (sic!)
UDP adds newly arrived segments to a buffer in front of the receiving socket. If the buffer gets full, segments are dropped. Bye-bye data!

96 TCP Connection Management
Now look at the TCP handshake in detail; this is important since many security threats exploit it. Recall the client process wants to establish a connection with a server process. Step 1 – the client sends a segment with SYN=1 and an initial sequence number (client_isn) to the server. Choosing a random client_isn is key for security.

97 TCP Connection Management
Step 2 – The server allocates the variables needed for the connection, and sends a connection-granted segment, the SYNACK, to the client. This SYNACK segment has SYN=1, the ack field set to client_isn+1, and the server's choice of initial sequence number (server_isn). Step 3 – The client gets the SYNACK segment and allocates its buffers and variables, then sends a segment with ack value server_isn+1 and SYN=0.

98 TCP Connection Management
The SYN bit stays 0 while the connection is open. Why is a three-way handshake used? Why isn't two-way enough? (Hint: without the third message, an old duplicate SYN could trick the server into opening a connection the client never wanted) Now look at closing the connection Either client or server can close the connection INFO 330 Chapter 3

99 TCP Connection Management
One host, let’s say the client, sends a segment with the FIN bit set to 1 The server acknowledges this with a return segment, then sends a separate shutdown segment (also with FIN=1) Client acknowledges the shutdown from the server, and resources in both hosts are deallocated INFO 330 Chapter 3

100 TCP State Cycle Another way to view the history of a TCP connection is through its state changes (Fig 3.41, 3.42) The connection starts Closed After the handshake is completed it's Established Then the processes communicate Sending or receiving a FIN=1 starts the closing process, until both sides get back to Closed Whoever sent the first FIN waits some period (the TIME_WAIT state, often about 30 s) after ACKing the other host's FIN before closing their connection INFO 330 Chapter 3

101 Stray Segments Receiving a segment with SYN trying to open an unknown or closed port results in: Server sends a reset message; RST=1, meaning “go away, that port isn’t open” Similarly, a UDP packet with unknown socket results in sending a special ICMP datagram (see next chapter) INFO 330 Chapter 3

102 Stray Segments So mapping ports on a system could yield three responses Get a TCP SYNACK, implying the port is open and some app is using it Get a TCP RST segment, meaning the port is closed No response, implying the port could be blocked by a firewall INFO 330 Chapter 3

103 SYN Flood Attacks The TCP handshake is the basis for an attack called the SYN flood Have one or more computers send lots of SYN messages to a server – but spoof the return IP address so the connection is never finished Makes the server waste resources waiting for you; can crash it if done fast enough One defense against this is the SYN cookie INFO 330 Chapter 3

104 SYN cookie When a SYN segment is received, the server creates a sequence number that is a hash function of the source and destination IP addresses and port numbers It sets up nothing else! When it receives the ACK response, it uses the cookie to recover the original info INFO 330 Chapter 3
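The cookie idea can be sketched as a keyed hash of the connection 4-tuple. This is a simplified illustration: real SYN cookies also encode a timestamp and MSS bits, and the secret key here is hypothetical:

```python
import hashlib

SECRET = b"server-only-secret"   # hypothetical secret known only to the server

def syn_cookie(src_ip, src_port, dst_ip, dst_port):
    """Derive a 32-bit initial sequence number from the 4-tuple."""
    data = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(SECRET + data).digest()
    return int.from_bytes(digest[:4], "big")

# On SYN: reply with SYNACK using seq = cookie, storing NO per-connection state.
cookie = syn_cookie("198.51.100.7", 51514, "203.0.113.1", 80)

# On the final ACK: recompute the cookie from the arriving segment's
# addresses/ports and compare against (ack - 1) to validate the handshake.
assert syn_cookie("198.51.100.7", 51514, "203.0.113.1", 80) == cookie
```

Because the server allocates nothing until the cookie checks out, spoofed SYNs from a flood consume no connection state.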

105 Congestion Control Now address congestion control issues
Congestion is a traffic jam in the middle of the network somewhere Most common cause is too many sources sending data too fast into the network INFO 330 Chapter 3

106 Congestion Control Key lessons from cases b and c are:
A congested network forces retransmissions for packets lost due to buffer overflow, which adds to the congestion A congested network can waste its bandwidth by sending duplicate packets which weren’t lost in the first place INFO 330 Chapter 3

107 Congestion Control (skipping the big messy example)
The lesson is: dropping a packet wastes the transmission capacity of every upstream link that packet saw So what are our approaches for dealing with congestion? INFO 330 Chapter 3

108 Congestion Control Approaches
Either the network provides explicit support for congestion control, or it doesn’t End-to-end congestion control is when the network doesn’t provide explicit support Presence of congestion is inferred from packet loss, delays, etc. Since TCP uses IP, this is our only option right now INFO 330 Chapter 3

109 Congestion Control Approaches
Network-assisted congestion control is when network components (e.g. routers) provide congestion feedback explicitly IBM SNA, DECnet, and ATM use this, and proposals for improving TCP/IP have been made Network equipment may provide various levels of feedback Send a choke packet to tell sender they’re full Flag existing packets to indicate congestion Tell what transmission rate the router can support at the moment INFO 330 Chapter 3

110 ATM ABR Congestion Control
ATM Available Bit-Rate (ABR) is one method of network-assisted congestion control It uses a combination of virtual circuits (VC) and resource management (RM) cells (packets) to convey congestion information along the VC Data cells (packets) contain a congestion bit to prompt sending an RM cell back to the sender Other bits convey whether the congestion is mild (don't increase traffic) or severe (back off) or tell the max rate supported along the circuit INFO 330 Chapter 3

111 TCP Congestion Control
As noted, TCP uses end-to-end congestion control, since IP provides no congestion feedback to the end systems In TCP, each sender limits its send rate based on its perceived amount of congestion Each side of a TCP connection has a send buffer, receive buffer, and several variables Each side also has a congestion window variable, CongWin (or cwnd) INFO 330 Chapter 3

112 TCP Congestion Control
The max send rate for a sender is the minimum of CongWin and the RcvWindow LastByteSent – LastByteAcked <= min(CongWin, RcvWindow) Assume for the moment that the RcvWindow is large, so we can focus on CongWin If loss and transmission delay are small, CongWin bytes of data can be sent every RTT, for a send rate of CongWin/RTT INFO 330 Chapter 3
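The two relationships on this slide can be sketched directly; the numbers below are illustrative:

```python
def effective_window(cong_win, rcv_window):
    """The sender is limited by min(CongWin, RcvWindow)."""
    return min(cong_win, rcv_window)

def send_rate_bps(cong_win_bytes, rtt_seconds):
    """With small loss and delay, rate ~ CongWin / RTT (here in bits/s)."""
    return cong_win_bytes * 8 / rtt_seconds

print(effective_window(16_000, 64_000))  # 16000 (congestion-limited)
print(effective_window(64_000, 16_000))  # 16000 (flow-control-limited)
print(send_rate_bps(16_000, 0.1))        # 1280000.0 bits/s = 1.28 Mbps
```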

113 TCP Congestion Control
Now address how to detect congestion Call a “loss event” when a timeout occurs or three duplicate ACKs are received Congestion causes loss events in the network If there’s no congestion, lots of happy ACKs tell TCP to increase CongWin quickly, and hence transmission rate Conversely, slow ACK receipt slows CongWin increase INFO 330 Chapter 3

114 TCP Congestion Control
TCP is self-clocking, since it measures its own feedback (ACK receipt) to determine changes in CongWin Now look at how TCP defines its congestion control algorithm in three parts Additive-increase, multiplicative-decrease Slow start Reaction to timeout events INFO 330 Chapter 3

115 Additive-increase, Multiplicative-decrease
When a loss event occurs, CongWin is halved, though it is never cut below 1 MSS – a process called multiplicative-decrease When there's no perceived congestion, TCP increases CongWin slowly, adding 1 MSS each RTT – this is additive-increase Collectively they are the AIMD algorithm Recall MSS = maximum segment size INFO 330 Chapter 3

116 AIMD Algorithm Over a long TCP connection, when there's little congestion, AIMD will result in slow rises in CongWin, followed by a cut in half when a loss event occurs; repeated, that produces a grumpy sawtooth wave INFO 330 Chapter 3

117 Slow Start The initial send rate is typically 1 MSS/RTT, which is really slow To avoid a really long ramp up to a fast rate, an exponential increase in CongWin is used until the first loss event occurs CongWin doubles every RTT during slow start Then the AIMD algorithm takes over INFO 330 Chapter 3
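The doubling can be seen in a minimal trace (each ACK adds 1 MSS to CongWin, which works out to doubling per RTT; values in units of MSS):

```python
MSS = 1
cwnd = 1 * MSS
trace = []
for rtt in range(6):   # six round-trip times with no loss event
    trace.append(cwnd)
    cwnd *= 2          # exponential growth during slow start
print(trace)           # [1, 2, 4, 8, 16, 32]
```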

118 Reaction to Timeout Timeouts are not handled the same as triple duplicate ACKs Triple duplicate ACKs are followed by: halve CongWin, then use AIMD approach But true timeout events are handled differently The TCP sender returns to slow start, and if no problems occur, ramps up to half of the CongWin value before the timeout occurred A variable Threshold stores the 0.5*CongWin value when a loss event occurs INFO 330 Chapter 3

119 Reaction to Timeout Once CongWin gets back to the Threshold value, it is allowed to increase linearly per AIMD So after a triple duplicate ACK, CongWin recovers faster (called a fast recovery, oddly enough) than after a timeout Why do this? Because the triple duplicate ACK proves that several other packets got there successfully, even if one was lost A timeout is a more severe congestion indicator, hence the slower recovery of CongWin INFO 330 Chapter 3

120 TCP Tahoe & Reno TCP Tahoe follows the timeout recovery pattern after any loss event Go back to CongWin = 1 MSS, ramp up exponentially until reaching Threshold, then follow AIMD TCP Reno introduced the fast recovery from a triple duplicate ACK (use this) After a loss event, cut CongWin in half, and resume linear increase until the next loss event; repeat INFO 330 Chapter 3
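A toy simulation of the difference, in units of MSS per transmission round. This is a sketch of the slides' description, not a faithful TCP implementation (real Reno's fast recovery has more detail):

```python
def next_cwnd(cwnd, threshold):
    """One RTT of growth: exponential below Threshold, linear above."""
    return min(cwnd * 2, threshold) if cwnd < threshold else cwnd + 1

def simulate(flavor, cwnd, rounds, loss_round):
    """Trace CongWin; a triple-duplicate-ACK loss occurs at loss_round."""
    threshold, trace = 8, []
    for r in range(rounds):
        trace.append(cwnd)
        if r == loss_round:                       # loss event
            threshold = cwnd // 2                 # Threshold = CongWin / 2
            cwnd = 1 if flavor == "tahoe" else threshold
        else:
            cwnd = next_cwnd(cwnd, threshold)
    return trace

print(simulate("tahoe", 1, 10, 5))  # [1, 2, 4, 8, 9, 10, 1, 2, 4, 5]
print(simulate("reno", 1, 10, 5))   # [1, 2, 4, 8, 9, 10, 5, 6, 7, 8]
```

After the loss at round 5 (CongWin = 10), Tahoe restarts from 1 MSS and slow-starts back to the new Threshold of 5, while Reno resumes at 5 and grows linearly: the fast recovery.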

121 TCP Tahoe & Reno
Assumes a loss event at transmission round 8, where CongWin = 12 MSS; shows how Tahoe and Reno respond differently. New Threshold is 12/2 = 6 MSS INFO 330 Chapter 3

122 TCP Throughput Other variations exist, e.g. TCP Vegas
If the sawtooth pattern continues, with a loss event occurring at the same congestion window size consistently, then the average throughput (rate) is Average throughput = 0.75*W/RTT where W is the CongWin size when the loss event occurs INFO 330 Chapter 3
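Plugging illustrative numbers into the formula (W in bytes, result in bits per second):

```python
def avg_throughput_bps(w_bytes, rtt_seconds):
    """Average rate of the AIMD sawtooth: 0.75 * W / RTT."""
    return 0.75 * w_bytes * 8 / rtt_seconds

# W = 64 KB window at loss, RTT = 100 ms:
print(avg_throughput_bps(64_000, 0.1))  # 3840000.0 bits/s = 3.84 Mbps
```

The 0.75 factor comes from the sawtooth oscillating between W/2 and W, so the window averages about three-quarters of W.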

123 TCP Future TCP will keep changing to meet the needs of the Internet
Obviously, many critical Internet apps depend on TCP, so there are always changes being proposed See RFC Index for current ideas For example, many want to support very high data rates (e.g. 10+ Gbps) INFO 330 Chapter 3

124 TCP Future In order to support that rate, the congestion window would have to be 83,333 segments And not lose any of them! If we have the loss rate (L) and MSS, we can derive Average throughput = 1.22*MSS/(RTT*sqrt(L)) For 10 Gbps throughput, we need L about 2×10^-10, or lose one segment in five billion! INFO 330 Chapter 3
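Both numbers on this slide follow from the throughput formula, using the common textbook assumptions of MSS = 1500 bytes and RTT = 100 ms:

```python
MSS = 1500 * 8     # segment size in bits
RTT = 0.1          # round-trip time in seconds
T = 10e9           # target throughput: 10 Gbps

# Segments that must be in flight each RTT to sustain T:
window_segments = T * RTT / MSS

# Solve T = 1.22 * MSS / (RTT * sqrt(L)) for the loss rate L:
L = (1.22 * MSS / (RTT * T)) ** 2

print(round(window_segments))  # 83333
print(L)                       # ~2.1e-10, i.e. about one loss in 5 billion
```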

125 Fairness If a router has multiple connections competing for bandwidth, is it fair in sharing? If two TCP connections of equal MSS and RTT are sharing a router, and both are primarily in AIMD mode, the throughput for each connection will tend to balance fairly, with cyclical changes in throughput due to changes in CongWin after packet drops INFO 330 Chapter 3
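A toy model shows why AIMD tends toward a fair split: both flows add 1 each RTT, and when their sum exceeds the link capacity a loss event halves both, shrinking the gap between them every cycle. The starting rates and capacity here are made up:

```python
def aimd_share(x, y, capacity, rounds):
    """Simulate two AIMD flows sharing one bottleneck link."""
    for _ in range(rounds):
        if x + y > capacity:      # congestion: both halve (multiplicative)
            x, y = x / 2, y / 2
        else:                     # both increase additively
            x, y = x + 1, y + 1
    return x, y

# Start wildly unfair (1 vs 30 units) on a capacity-40 link:
x, y = aimd_share(1, 30, 40, 200)
print(abs(x - y))  # the gap is tiny after many sawtooth cycles
```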

126 Fairness More realistically, unequal connections are less fair
Lower RTT gets more bandwidth (CongWin increases faster) UDP traffic can force out the more polite TCP traffic Multiple TCP connections from a single host (e.g. from downloading many parts of a Web page at once) get more bandwidth INFO 330 Chapter 3

127 Are We Done Yet? So we’ve covered transport layer protocols from the terribly simple UDP to a seemingly exhaustive study of TCP Key features along the way include multiplexing/demultiplexing, error detection, acknowledgements, timers, retransmissions, sequence numbers, connection management, flow control, end-to-end congestion control So much for the “edge” of the Internet; next is the network layer, to start looking at the core INFO 330 Chapter 3

