
End-to-End Protocols Outline Simple Demultiplexer Reliable Byte-Stream Remote Procedure Call Performance.



1 End-to-End Protocols Outline Simple Demultiplexer Reliable Byte-Stream Remote Procedure Call Performance

2 End-to-End Protocols
Common end-to-end services
–guarantee message delivery
–deliver messages in the same order they are sent
–deliver at most one copy of each message
–support arbitrarily large messages
–support synchronization
–allow the receiver to flow control the sender
–support multiple application processes on each host
Underlying best-effort network
–drops messages
–reorders messages
–delivers duplicate copies of a given message
–limits messages to some finite size
–delivers messages after an arbitrarily long delay

3 Simple Demultiplexer (UDP)
User Datagram Protocol (UDP): an unreliable, unordered datagram service
Adds multiplexing so that multiple application processes on each host can share the network
A port is the abstraction of a communication endpoint.
–Use a <port, host> pair to identify a process
–Endpoints are identified by ports
–Servers have well-known ports, e.g. DNS: 53, talk: 517
–See /etc/services on Unix

4 Simple Demultiplexer (UDP)
A port is implemented by a message queue.
UDP has no flow control.
UDP header format
–Optional checksum: pseudo header + UDP header + data
–Pseudo header: protocol number, source IP address, destination IP address, and UDP length field
–The pseudo header lets the receiver verify that the message has been delivered between the correct two endpoints.
Header layout (each row is 32 bits wide, bit offsets 0, 16, 31):
  0          16         31
  SrcPort    DstPort
  Length     Checksum
  Data
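The checksum named on this slide is the standard Internet checksum: a 16-bit ones' complement sum, complemented. The sketch below shows only that arithmetic. The function name inet_checksum is hypothetical, and the caller is assumed to have already laid out pseudo header + UDP header + data in one contiguous buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: Internet checksum over a byte buffer.
 * Returns the complement of the 16-bit ones' complement sum. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    assert(buf != NULL || len == 0);

    /* Add the buffer as big-endian 16-bit words. */
    while (len > 1) {
        sum += ((uint32_t)buf[0] << 8) | buf[1];
        buf += 2;
        len -= 2;
    }
    /* An odd trailing byte is padded with a zero byte on the right. */
    if (len == 1)
        sum += (uint32_t)buf[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can run the same function over the received bytes with the transmitted checksum included; a result of 0 means the sum checked out.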

5 Reliable Byte-Stream (TCP) Outline Connection Establishment/Termination Sliding Window Revisited Flow Control Adaptive Timeout

6 TCP Overview Transmission Control Protocol (TCP) is a reliable, connection-oriented, and byte-stream service. A byte-stream service –application writes bytes –TCP sends segments –application reads bytes TCP is a full-duplex protocol. TCP supports a demultiplexing mechanism.

7 TCP Overview
[Figure: the sending application process writes bytes into TCP's send buffer; TCP transmits segments; the receiving TCP places them in its receive buffer, from which the receiving application process reads bytes.]
Flow control: keep sender from overrunning receiver
Congestion control: keep sender from overrunning network
TCP uses the sliding window algorithm.

8 Data Link Versus Transport Potentially have many connections between different hosts –need explicit connection establishment and termination Potentially different RTT –need adaptive timeout mechanism Potentially long delay in network –need to be prepared for arrival of very old packets Potentially different capacity at destination –need to accommodate different node capacity Potentially different network capacity –need to be prepared for network congestion

9 TCP Segment Format
The packets exchanged between TCP peers are called segments.
How does TCP decide that it has enough bytes to send a segment?
–TCP maintains a variable, called the maximum segment size (MSS), and it sends a segment as soon as it has collected MSS bytes from the sending process.
–TCP supports a push operation, and the sending process invokes this operation to effectively flush the buffer of unsent bytes.
–The final trigger is a timer that periodically fires.
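The three triggers on this slide can be read as a single predicate. The following is a minimal sketch under that reading; tcp_should_send and its parameters are hypothetical names, and a real TCP also consults the send window before transmitting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical predicate: should the buffered bytes go out as a segment now?
 * buffered    - bytes written by the application but not yet sent
 * mss         - maximum segment size
 * push        - the application invoked the push operation
 * timer_fired - the periodic timer has expired */
bool tcp_should_send(size_t buffered, size_t mss, bool push, bool timer_fired)
{
    assert(mss > 0);

    if (buffered == 0)
        return false;               /* nothing to send, whatever the trigger */
    return buffered >= mss || push || timer_fired;
}
```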

10 Segment Format

11 TCP Header Format
SrcPort: source port; DstPort: destination port
The Acknowledgment, SequenceNum, and AdvertisedWindow fields are all involved in TCP’s sliding window algorithm.
The 6-bit Flags field is used to relay control information between TCP peers:
–SYN, FIN: establish and terminate a TCP connection
–RESET: abort the connection
–PUSH: the sender invoked the push operation
–URG: urgent data up to UrgPtr bytes
–ACK: the Acknowledgment field is valid

12 Segment Format (cont)
Each connection identified with 4-tuple:
–(SrcPort, SrcIPAddr, DstPort, DstIPAddr)
Sliding window + flow control
–Acknowledgment, SequenceNum, AdvertisedWindow
Flags
–SYN, FIN, RESET, PUSH, URG, ACK
Checksum
–pseudo header + TCP header + data
[Figure: the sender transmits Data (SequenceNum); the receiver returns Acknowledgment + AdvertisedWindow.]

13 Three-Way Handshake
The algorithm used by TCP to establish and terminate a connection is called a three-way handshake.
–A timer is scheduled for each of the first two segments.
–The client and server each select an initial sequence number at random and exchange these numbers at connection setup time.
–This protects against the chance that a segment from an earlier connection might interfere with a later one.
TCP can be specified by a state-transition diagram.

14 Connection Establishment and Termination
Three-way handshake between the active participant (client) and the passive participant (server):
–Client to server: SYN, SequenceNum = x
–Server to client: SYN + ACK, SequenceNum = y, Acknowledgment = x + 1
–Client to server: ACK, Acknowledgment = y + 1

15 State Transition Diagram
[Figure: TCP state-transition diagram. States: CLOSED, LISTEN, SYN_RCVD, SYN_SENT, ESTABLISHED, FIN_WAIT_1, FIN_WAIT_2, CLOSING, TIME_WAIT, CLOSE_WAIT, LAST_ACK. Arcs are labeled event/action, for example Active open/SYN, Passive open, SYN/SYN + ACK, SYN + ACK/ACK, Close/FIN, and FIN/ACK; TIME_WAIT returns to CLOSED after a timeout of two segment lifetimes.]
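As an illustration of how such a diagram turns into code, here is a sketch of the connection-establishment part only. It is a deliberately reduced subset: teardown states, simultaneous open, and timeouts are left out, and the event names and tcp_next_state are hypothetical.

```c
#include <assert.h>

enum tcp_state { CLOSED, LISTEN, SYN_SENT, SYN_RCVD, ESTABLISHED };
enum tcp_event { PASSIVE_OPEN, ACTIVE_OPEN, RCV_SYN, RCV_SYN_ACK, RCV_ACK };

/* Return the state reached from s on event e; stay in s when the
 * (subset of the) diagram has no arc for that pair. */
enum tcp_state tcp_next_state(enum tcp_state s, enum tcp_event e)
{
    assert(s <= ESTABLISHED);

    switch (s) {
    case CLOSED:
        if (e == PASSIVE_OPEN) return LISTEN;
        if (e == ACTIVE_OPEN)  return SYN_SENT;     /* action: send SYN */
        break;
    case LISTEN:
        if (e == RCV_SYN)      return SYN_RCVD;     /* action: send SYN + ACK */
        break;
    case SYN_SENT:
        if (e == RCV_SYN_ACK)  return ESTABLISHED;  /* action: send ACK */
        break;
    case SYN_RCVD:
        if (e == RCV_ACK)      return ESTABLISHED;
        break;
    case ESTABLISHED:
        break;
    }
    return s;
}
```

Driving the client side through ACTIVE_OPEN then RCV_SYN_ACK, and the server side through PASSIVE_OPEN, RCV_SYN, RCV_ACK, leaves both ends in ESTABLISHED, which mirrors the three-way handshake.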

16 Sliding Window TCP’s sliding window algorithm serves several purposes: –It guarantees the reliable delivery of data. –It ensures that data is delivered in order. –It enforces flow control between the sender and the receiver. In order to keep the sender from overrunning the receiver’s buffer, the receiver advertises a window size to the sender by specifying the AdvertisedWindow field in the TCP header.

17 Sliding Window Revisited
[Figure: the sending application writes up to LastByteWritten while the sending TCP tracks LastByteAcked and LastByteSent; the receiving application reads up to LastByteRead while the receiving TCP tracks NextByteExpected and LastByteRcvd.]
Sending side
–LastByteAcked <= LastByteSent
–LastByteSent <= LastByteWritten
–buffer bytes between LastByteAcked and LastByteWritten
Receiving side
–LastByteRead < NextByteExpected
–NextByteExpected <= LastByteRcvd + 1
–buffer bytes between LastByteRead and LastByteRcvd

18 Flow Control
Send buffer size: MaxSendBuffer
Receive buffer size: MaxRcvBuffer
Receiving side
–LastByteRcvd - LastByteRead <= MaxRcvBuffer
–AdvertisedWindow = MaxRcvBuffer - (LastByteRcvd - LastByteRead)
Sending side
–LastByteSent - LastByteAcked <= AdvertisedWindow
–EffectiveWindow = AdvertisedWindow - (LastByteSent - LastByteAcked)
–LastByteWritten - LastByteAcked <= MaxSendBuffer
–block sender if (LastByteWritten - LastByteAcked) + y > MaxSendBuffer, where y is the number of bytes the application is trying to write
Always send ACK in response to arriving data segment
Persist when AdvertisedWindow = 0
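The flow-control relations on this slide are plain arithmetic, so a small sketch may help. It assumes the receive buffer occupancy is LastByteRcvd - LastByteRead, ignores sequence-number wrap-around, and uses hypothetical struct and function names throughout.

```c
#include <assert.h>
#include <stdint.h>

struct rcv_state {
    uint32_t last_byte_read, next_byte_expected, last_byte_rcvd;
    uint32_t max_rcv_buffer;
};

struct snd_state {
    uint32_t last_byte_acked, last_byte_sent, last_byte_written;
    uint32_t max_send_buffer;
};

/* Receiver: free space left in the receive buffer. */
uint32_t advertised_window(const struct rcv_state *r)
{
    uint32_t used = r->last_byte_rcvd - r->last_byte_read;

    assert(used <= r->max_rcv_buffer);
    return r->max_rcv_buffer - used;
}

/* Sender: how much more may be sent beyond what is already in flight. */
uint32_t effective_window(const struct snd_state *s, uint32_t advertised)
{
    uint32_t in_flight = s->last_byte_sent - s->last_byte_acked;

    return in_flight >= advertised ? 0 : advertised - in_flight;
}

/* Sender: would writing y more bytes overflow the send buffer? */
int write_would_block(const struct snd_state *s, uint32_t y)
{
    return (s->last_byte_written - s->last_byte_acked) + y > s->max_send_buffer;
}
```

Clamping the effective window at zero covers the case where the receiver shrinks its advertised window below what is already in flight.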

19 Protection Against Wrap Around
32-bit SequenceNum
Bandwidth             Time Until Wrap Around
T1 (1.5 Mbps)         6.4 hours
Ethernet (10 Mbps)    57 minutes
T3 (45 Mbps)          13 minutes
FDDI (100 Mbps)       6 minutes
STS-3 (155 Mbps)      4 minutes
STS-12 (622 Mbps)     55 seconds
STS-24 (1.2 Gbps)     28 seconds
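The wrap-around times follow from dividing the 2^32-byte sequence space by the link rate. The sketch below reproduces that arithmetic and adds a wrap-tolerant comparison of two sequence numbers. Both function names are hypothetical, and the comparison relies on the usual two's complement conversion from uint32_t to int32_t.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds until a 32-bit byte sequence number wraps at the given rate. */
double seq_wrap_seconds(double bits_per_second)
{
    assert(bits_per_second > 0.0);
    return 4294967296.0 * 8.0 / bits_per_second;    /* 2^32 bytes, 8 bits each */
}

/* True if sequence number a comes before b, treating the 32-bit space
 * as circular (valid while a and b are less than 2^31 apart). */
int seq_before(uint32_t a, uint32_t b)
{
    return (int32_t)(a - b) < 0;
}
```

For example, seq_wrap_seconds(10e6) is about 3436 seconds, which matches the 57 minutes listed for 10 Mbps Ethernet.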

20 Keeping the Pipe Full
16-bit AdvertisedWindow
Bandwidth             Delay x Bandwidth Product
T1 (1.5 Mbps)         18KB
Ethernet (10 Mbps)    122KB
T3 (45 Mbps)          549KB
FDDI (100 Mbps)       1.2MB
STS-3 (155 Mbps)      1.8MB
STS-12 (622 Mbps)     7.4MB
STS-24 (1.2 Gbps)     14.8MB
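The products in this table are consistent with a round-trip time of roughly 100 ms; that RTT is an assumption here, since the slide does not state it. The sketch computes the delay x bandwidth product and the smallest window-scale shift that lets a 16-bit AdvertisedWindow cover it. The function names are hypothetical; the cap of 14 is the maximum shift the TCP window scale option allows.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed in flight to keep a link of the given rate and RTT full. */
uint64_t pipe_bytes(double bits_per_second, double rtt_seconds)
{
    assert(bits_per_second >= 0.0 && rtt_seconds >= 0.0);
    return (uint64_t)(bits_per_second * rtt_seconds / 8.0);
}

/* Smallest left shift s such that a 16-bit window scaled by 2^s can
 * express at least the given number of bytes (capped at 14). */
unsigned window_shift_needed(uint64_t bytes)
{
    unsigned shift = 0;

    while (shift < 14 && ((uint64_t)65535 << shift) < bytes)
        shift++;
    return shift;
}
```

With a 100 ms RTT, a 1.5 Mbps link needs under 64 KB and therefore no scaling, while a 1.2 Gbps link needs about 15 million bytes and a shift of 8.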

21 Adaptive Retransmission (Original Algorithm)
Measure SampleRTT for each segment/ACK pair
Compute weighted average of RTT
–EstRTT = α x EstRTT + β x SampleRTT
–where α + β = 1
–α between 0.8 and 0.9
–β between 0.1 and 0.2
Set timeout based on EstRTT
–TimeOut = 2 x EstRTT
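A minimal sketch of this original estimator follows. The smoothing weight is passed in and the second weight is derived as one minus the first, so the two always sum to 1; struct rtt_est and rtt_update are hypothetical names.

```c
#include <assert.h>

struct rtt_est {
    double est_rtt;     /* running weighted average of the RTT */
};

/* Fold one SampleRTT into the estimate and return the new TimeOut. */
double rtt_update(struct rtt_est *e, double sample_rtt, double alpha)
{
    assert(alpha >= 0.0 && alpha <= 1.0);

    e->est_rtt = alpha * e->est_rtt + (1.0 - alpha) * sample_rtt;
    return 2.0 * e->est_rtt;    /* TimeOut = 2 x EstRTT */
}
```

Starting from an estimate of 100 ms, a 200 ms sample with a weight of 0.875 moves the estimate to 112.5 ms and the timeout to 225 ms.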

22 Karn/Partridge Algorithm
Do not sample RTT when retransmitting
Double timeout after each retransmission
[Figure: two sender/receiver timelines, each showing an original transmission, a retransmission, an ACK, and the SampleRTT that would be measured.]

23 Jacobson/Karels Algorithm
New calculations for average RTT
–Diff = SampleRTT - EstRTT
–EstRTT = EstRTT + (δ x Diff)
–Dev = Dev + δ (|Diff| - Dev)
–where δ is a factor between 0 and 1
Consider variance when setting timeout value
–TimeOut = μ x EstRTT + φ x Dev
–where μ = 1 and φ = 4
Notes
–algorithm only as good as granularity of clock (500ms on Unix)
–accurate timeout mechanism important to congestion control (later)
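The update rules on this slide can be sketched as one function, using weights of 1 and 4 on the estimate and the deviation when forming the timeout. The names struct jk_est and jk_update are hypothetical, and the gain is left as a parameter because the slide only bounds it between 0 and 1.

```c
#include <assert.h>

struct jk_est {
    double est_rtt;     /* smoothed RTT estimate */
    double dev;         /* smoothed mean deviation */
};

/* Fold one SampleRTT into the estimate and deviation; return the TimeOut. */
double jk_update(struct jk_est *s, double sample_rtt, double delta)
{
    double diff = sample_rtt - s->est_rtt;
    double abs_diff = diff < 0.0 ? -diff : diff;

    assert(delta > 0.0 && delta < 1.0);

    s->est_rtt += delta * diff;
    s->dev += delta * (abs_diff - s->dev);
    return 1.0 * s->est_rtt + 4.0 * s->dev;
}
```

With an estimate of 100 ms, a deviation of 10 ms, and a gain of 0.125, a 140 ms sample gives an estimate of 105 ms, a deviation of 13.75 ms, and a timeout of 160 ms.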

24 TCP Extensions Implemented as header options Store timestamp in outgoing segments Extend sequence space with 32-bit timestamp (PAWS) Shift (scale) advertised window

25 Remote Procedure Call Outline Basics Protocol Stack Presentation Formatting

26 Remote Procedure Call Basics
Problems with sockets
–Socket programming uses the read/write (input/output) mechanism.
–Socket programming differs from the procedure calls we usually use.
–Input/output is not the best way to make computing location-transparent.

27 Remote Procedure Call Basics
A procedure call is a standard abstraction in local computation.
Remote Procedure Call (RPC) extends procedure calls to distributed computation, as shown in the figure.
–A caller invokes execution of a procedure in the callee via the local stub procedure.
–The implicit network programming hides all network I/O code from the programmer.
–Objectives are simplicity and ease of use.

28 Remote Procedure Call Basics The concept is to provide a transparent mechanism that enables the user to utilize remote services through standard procedure calls. Client sends request, then blocks until a remote server sends a response (reply). Advantages: user may be unaware of remote implementation (handled in a stub in library); uses standard mechanism. Disadvantages: prone to failure of components and network; different address spaces; separate process lifetimes.

29 RPC Components
Protocol Stack
–BLAST: fragments and reassembles large messages
–CHAN: synchronizes request and reply messages
–SELECT: dispatches request to the correct process
Stubs
[Figure: the caller (client) passes arguments to the client stub, which sends a request through the RPC protocol; on the callee (server) side the RPC protocol hands the request to the server stub, which passes the arguments to the callee. The return value travels back the same way as a reply.]

30 RPC Timeline
[Figure: timeline in which the client sends a request and stays blocked while the server is computing, until the reply arrives.]

31 SunRPC IP implements BLAST-equivalent –except no selective retransmit SunRPC implements CHAN-equivalent –except not at-most-once UDP + SunRPC implement SELECT-equivalent –UDP dispatches to program (ports bound to programs) –SunRPC dispatches to procedure within program

32 Sun RPC
It was designed for client-server communication in the Sun Network File System (NFS).
UDP or TCP can be used. If UDP is used, the message length is restricted to 64 KB, but KB in practice.
Sun XDR was originally intended for external data representation.
Valid data types supported by XDR include int, unsigned int, long, structure, fixed array, string (null-terminated char *), and binary encoded data (for other data types such as lists).
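To make the wire format concrete, here is a hand-rolled sketch of how XDR lays out two of these types: a 32-bit integer as four big-endian bytes, and a string as a 32-bit length followed by the bytes, zero-padded to a multiple of four. The helpers put_u32_be and put_xdr_string are hypothetical and are not part of the Sun RPC library, which provides its own XDR routines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Write v as four big-endian bytes; returns the number of bytes written. */
size_t put_u32_be(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
    return 4;
}

/* Encode a C string as length + bytes + zero padding to a 4-byte boundary.
 * Returns the encoded size, or 0 if the output buffer is too small. */
size_t put_xdr_string(uint8_t *out, size_t cap, const char *s)
{
    size_t len, padded;

    assert(out != NULL && s != NULL);
    len = strlen(s);
    padded = (len + 3) & ~(size_t)3;        /* round up to a multiple of 4 */
    if (cap < 4 + padded)
        return 0;

    put_u32_be(out, (uint32_t)len);
    memcpy(out + 4, s, len);                /* the terminating NUL is not sent */
    memset(out + 4 + len, 0, padded - len);
    return 4 + padded;
}
```

Encoding the five-character string hello therefore takes 12 bytes: four for the length, five for the characters, and three of padding.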

33 Sun XDR A program number and a version number are supplied. The procedure number is used as a procedure definition. Single input parameter and output result are being passed.

34 Files interface in Sun XDR
    const MAX = 1000;
    typedef int FileIdentifier;
    typedef int FilePointer;
    typedef int Length;
    struct Data {
        int length;
        char buffer[MAX];
    };
    struct writeargs {
        FileIdentifier f;
        FilePointer position;
        Data data;
    };
    struct readargs {
        FileIdentifier f;
        FilePointer position;
        Length length;
    };
    program FILEREADWRITE {
        version VERSION {
            void WRITE(writeargs) = 1;
            Data READ(readargs) = 2;
        } = 2;
    } = 9999;

35 Sun RPC
The interface compiler rpcgen is used to generate the following from the interface definition:
–client stub procedures
–server main procedure, dispatcher, and server stub procedures
–XDR marshalling and unmarshalling procedures used by the dispatcher and by the client and server stub procedures
Binding:
–The portmapper records program number, version number, and port number.
–If there are multiple instances running on different machines, clients make multicast remote procedure calls by broadcasting them to all the port mappers.

36 RPC Interface Compiler

37 Example (Sun RPC)
long sum(long) example
–client localhost 10
–result: 55
Need RPC specification file (sum.x)
–defines procedure name, arguments & results
Run (interface compiler) rpcgen sum.x
–generates sum.h, sum_clnt.c, sum_xdr.c, sum_svc.c
–sum_clnt.c & sum_svc.c: stub routines for client & server
–sum_xdr.c: XDR (External Data Representation) code takes care of data type conversions

38 RPC XDR File (sum.x)
    struct sum_in {
        long arg1;
    };
    struct sum_out {
        long res1;
    };
    program SUM_PROG {
        version SUM_VERS {
            sum_out SUMPROC(sum_in) = 1;    /* procedure number = 1 */
        } = 1;                              /* version number = 1 */
    } = 0x ;                                /* program number */

39 Example (Sun RPC)
Program numbers are usually assigned as follows:
–0x00000000 - 0x1fffffff defined by Sun
–0x20000000 - 0x3fffffff defined by user
–0x40000000 - 0x5fffffff transient
–0x60000000 - 0xffffffff reserved

40 RPC Client Code (rsum.c)
    #include <stdio.h>
    #include <stdlib.h>
    #include "sum.h"    /* generated by rpcgen from sum.x */

    int main(int argc, char *argv[])
    {
        CLIENT *cl;
        sum_in in;
        sum_out *outp;

        /* create RPC client handle; need to know server's address */
        cl = clnt_create(argv[1], SUM_PROG, SUM_VERS, "tcp");
        in.arg1 = atol(argv[2]);    /* upper bound of the sum */
        /* call RPC; note convention of RPC function naming */
        if ((outp = sumproc_1(&in, cl)) == NULL) {
            clnt_perror(cl, argv[1]);
            exit(1);
        }
        printf("result: %ld\n", outp->res1);
        return 0;
    }

41 RPC Server Code (sum_serv.c)
    #include "sum.h"

    /* server function has a different name than the client call */
    sum_out *sumproc_1_svc(sum_in *inp, struct svc_req *rqstp)
    {
        /* static: the server stub reads the result after this function returns */
        static sum_out out;
        long i;

        out.res1 = inp->arg1;
        for (i = inp->arg1 - 1; i > 0; i--)
            out.res1 += i;
        return &out;
    }
    /* server's main() is generated by rpcgen */

42 Compilation Linking
    rpcgen sum.x
    cc -c rsum.c -o rsum.o
    cc -c sum_clnt.c -o sum_clnt.o
    cc -c sum_xdr.c -o sum_xdr.o
    cc -o client rsum.o sum_clnt.o sum_xdr.o
    cc -c sum_serv.c -o sum_serv.o
    cc -c sum_svc.c -o sum_svc.o
    cc -o server sum_serv.o sum_svc.o sum_xdr.o

43 Internal Details of Sun RPC
Initialization
–Server runs: registers the RPC program with the port mapper on the server host (check with rpcinfo -p)
–Client runs: clnt_create contacts the server's port mapper and establishes a TCP connection with the server (or a UDP socket)
Client
–Client calls a local procedure (client stub: sumproc_1) that is generated by rpcgen. The client stub packages the arguments, puts them in standard format (XDR), and prepares network messages (marshaling).
–Network messages are sent to the remote system by the client stub.
–Network transfer is accomplished with TCP or UDP.

44 Internal Details of Sun RPC
Server
–Server stub (generated by rpcgen) unmarshals arguments from network messages. The server stub executes the local procedure (sumproc_1_svc), passing the arguments received from the network messages.
–When the server procedure is finished, it returns to the server stub with return values.
–Server stub converts the return values (XDR), marshals them into network messages, and sends them back to the client.
Back to Client
–Client stub reads network messages from the kernel.
–Client stub returns results to the client function.

45 Details of RPC

46 SunRPC Header Format
XID (transaction id) is similar to CHAN’s MID
Server does not remember last XID it serviced
Problem if client retransmits request while reply is in transit
Call message (32-bit words, bits 0 to 31): XID, MsgType = CALL, RPCVersion = 2, Program, Version, Procedure, Credentials (variable), Verifier (variable), Data
Reply message (32-bit words, bits 0 to 31): XID, MsgType = REPLY, Status = ACCEPTED, Data

