Reliable Client-Server Communication

Reliable Communication
So far: Concentrated on process resilience (by means of process groups). What about reliable communication channels?
Error detection:
–Framing of packets to allow for bit error detection
–Use of frame numbering to detect packet loss
Error correction:
–Add so much redundancy that corrupted packets can be automatically corrected
–Request retransmission of lost packets, or of the last N packets
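The two detection techniques can be illustrated with a small sketch. It assumes a made-up frame layout (a 4-byte sequence number and a CRC-32 of the payload in front of the data); the names `make_frame`, `parse_frame` and `receive` are hypothetical and not taken from any real protocol.

```python
import struct
import zlib

# Assumed header layout: sequence number, then CRC-32 of the payload.
HEADER = struct.Struct("!II")


def make_frame(seq: int, payload: bytes) -> bytes:
    """Prefix the payload with its sequence number and checksum."""
    return HEADER.pack(seq, zlib.crc32(payload)) + payload


def parse_frame(frame: bytes):
    """Return (seq, payload), or None if the checksum does not match."""
    seq, crc = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:]
    if zlib.crc32(payload) != crc:
        return None  # bit error detected
    return seq, payload


def receive(frames, expected_seq=0):
    """Collect good payloads; count corrupted frames and list sequence gaps."""
    delivered, corrupted, missing = [], 0, []
    for frame in frames:
        parsed = parse_frame(frame)
        if parsed is None:
            corrupted += 1
            continue
        seq, payload = parsed
        # Any skipped sequence numbers indicate lost (or discarded) frames.
        missing.extend(range(expected_seq, seq))
        delivered.append(payload)
        expected_seq = max(expected_seq, seq + 1)
    return delivered, corrupted, missing
```

A receiver that sees frames 0 and 2 would report sequence number 1 as missing, which is the point at which it could request retransmission.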

Reliable Communication
Observation: Most of this work assumes point-to-point communication.
–TCP provides a reliable channel
–It masks omission failures (loss of messages)
–What if the TCP connection breaks?
High-level communication facilities.

Traditional RPC Principle of RPC between a client and server program.

Remote Procedure Calls
A remote procedure call occurs in the following steps:
1. The client procedure calls the client stub in the normal way.
2. The client stub builds a message and calls the local operating system.
3. The client’s OS sends the message to the remote OS.
4. The remote OS gives the message to the server stub.
5. The server stub unpacks the parameters and calls the server.
6. The server does the work and returns the result to the stub.
7. The server stub packs it in a message and calls its local OS.
8. The server’s OS sends the message to the client’s OS.
9. The client’s OS gives the message to the client stub.
10. The stub unpacks the result and returns to the client.
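The ten steps can be traced in a single-process sketch. Everything here is an assumption made for illustration: JSON stands in for parameter marshalling, the `transport` function stands in for both operating systems and the network, and the procedure `add` is a hypothetical server routine.

```python
import json


def add(a, b):
    """The server procedure that actually does the work (step 6)."""
    return a + b


# Hypothetical registry of procedures the server exports.
PROCEDURES = {"add": add}


def server_stub(request: bytes) -> bytes:
    """Steps 5-7: unpack the parameters, call the server, pack the result."""
    message = json.loads(request)
    result = PROCEDURES[message["proc"]](*message["args"])
    return json.dumps({"result": result}).encode()


def transport(request: bytes) -> bytes:
    """Stand-in for steps 3-4 and 8-9: the two OSes and the network."""
    return server_stub(request)


def client_stub(proc: str, *args):
    """Steps 2 and 10: build the request message, then unpack the reply."""
    request = json.dumps({"proc": proc, "args": list(args)}).encode()
    reply = transport(request)
    return json.loads(reply)["result"]


def remote_add(a, b):
    """What the client calls in the normal way (step 1)."""
    return client_stub("add", a, b)
```

To the caller, `remote_add(2, 3)` looks like a local call; the message building and unpacking stay hidden in the stubs.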

RPC Failures
Five different classes of failures:
1. Can’t find server.
2. Request message lost.
3. Server crashes after receiving request.
4. Reply message is lost.
5. Client crashes after sending request.

Methods
1: No server -- report back to client
–Raise an exception.
–Transparency is lost.
2: Lost request -- resend message
–Start a timer; if it expires before a reply arrives, send the request again.
–But was the request lost, or is the server down?
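A minimal sketch of the timer-and-resend approach, combined with raising an exception once the client gives up. The expired timer is modelled by `send` returning `None`; `ServerUnreachable`, `call_with_retry` and `lossy_channel` are hypothetical names introduced for this example.

```python
class ServerUnreachable(Exception):
    """Raised when no reply arrives after all attempts (assumed name)."""


def call_with_retry(send, request, max_attempts=3):
    """Resend the request each time the timer expires, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        reply = send(request)  # None models an expired timer
        if reply is not None:
            return reply, attempt
    # The client cannot tell a lost request from a dead server.
    raise ServerUnreachable(f"no reply after {max_attempts} attempts")


def lossy_channel(drop_first: int):
    """Build a send function that loses the first `drop_first` requests."""
    state = {"sent": 0}

    def send(request):
        state["sent"] += 1
        if state["sent"] <= drop_first:
            return None  # simulated timeout
        return f"reply to {request}"

    return send
```

With two lost requests the third attempt succeeds; if every attempt times out, the exception surfaces the failure to the caller, which is exactly where transparency is lost.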

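3: Server Crashes
Harder issue: The server can crash at two different points: after executing the request but before replying, or before executing it.
–The client could treat the two cases differently if it knew which one occurred.
–But the client only knows that no reply arrived, so how can it tell them apart and act accordingly?
Solution?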

Server Crashes
At least once: The server guarantees it will carry out an operation at least once, no matter what. So keep trying until a reply comes back.
At most once: The server guarantees it will carry out an operation at most once. So report failure immediately.
No general solution for exactly once.
Consider a print server that crashes and comes back up.
–Client sends a message, gets an ack.
–Server sends a completion message either right before or right after printing.
–After a crash, the client can never reissue, always reissue, reissue only if there was no ack, or reissue only if there was an ack.
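The difference between the two semantics shows up in how often the operation runs when replies go missing. This is a rough sketch under simplifying assumptions: `FlakyServer` is a hypothetical server that always executes the request but drops its first few replies, and a missing reply is modelled as `None`.

```python
class FlakyServer:
    """Hypothetical server: executes every request, loses the first replies."""

    def __init__(self, lose_replies: int):
        self.lose_replies = lose_replies
        self.executions = 0

    def handle(self):
        self.executions += 1  # the operation is carried out
        if self.lose_replies > 0:
            self.lose_replies -= 1
            return None  # the client never sees this reply
        return "done"


def at_least_once(server, max_attempts=10):
    """Keep trying until a reply comes back."""
    for _ in range(max_attempts):
        reply = server.handle()
        if reply is not None:
            return reply
    raise TimeoutError("no reply")


def at_most_once(server):
    """Try once and report failure immediately."""
    reply = server.handle()
    if reply is None:
        raise TimeoutError("no reply; not retrying")
    return reply
```

Under at-least-once the operation may run several times; under at-most-once it runs no more than once, but the client may be told it failed even though it ran.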

Print Server
Three events that can happen at the server:
1. Send the completion message (M).
2. Print the text (P).
3. Crash (C).

Server Crashes
These events can occur in six different orderings:
1. M →P →C: A crash occurs after sending the completion message and printing the text.
2. M →C (→P): A crash happens after sending the completion message, but before the text could be printed.
3. P →M →C: A crash occurs after printing the text and sending the completion message.
4. P →C (→M): The text is printed, after which a crash occurs before the completion message could be sent.
5. C (→P →M): A crash happens before the server could do anything.
6. C (→M →P): A crash happens before the server could do anything.

Server Crashes Server crashes and comes back up.
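The combinations of client reissue strategy and server ordering can be enumerated mechanically. The sketch below assumes that the completion message M is the acknowledgement the client sees, and that a reissued request is printed exactly once by the recovered server; the strategy names are illustrative.

```python
# Events the server completed before the crash, for each server ordering.
SCENARIOS = {
    "M->P": ["MP", "M", ""],
    "P->M": ["PM", "P", ""],
}

# Each client strategy decides whether to reissue, given whether M arrived.
CLIENT_STRATEGIES = {
    "always": lambda acked: True,
    "never": lambda acked: False,
    "only when acked": lambda acked: acked,
    "only when not acked": lambda acked: not acked,
}


def outcome(done: str, reissue) -> str:
    """Count prints: one if P happened before the crash, one more on reissue."""
    prints = int("P" in done) + int(reissue("M" in done))
    return ["ZERO", "OK", "DUP"][prints]


def table():
    """Map (client strategy, server ordering, completed events) to a result."""
    rows = {}
    for client, reissue in CLIENT_STRATEGIES.items():
        for server, scenarios in SCENARIOS.items():
            for done in scenarios:
                rows[(client, server, done or "-")] = outcome(done, reissue)
    return rows
```

Every client strategy ends up with at least one ZERO or DUP entry, which is the sense in which there is no general solution for exactly-once.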

4: Reply lost
Detecting lost replies can be hard, because it can also be that the server has crashed. You don’t know whether the server has carried out the operation.
Solution:
–None, except that you can try to make your operations idempotent: repeatable without any harm done if the operation happened to be carried out before.
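A small illustration of why idempotence helps. The `Account` class and the `deliver_twice` helper are hypothetical; the helper models a retransmission after a lost reply, so the server ends up running the same request twice.

```python
class Account:
    """Toy server-side state used only for this illustration."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        """Not idempotent: every repetition changes the result."""
        self.balance += amount

    def set_balance(self, amount):
        """Idempotent: repeating the call leaves the same state."""
        self.balance = amount


def deliver_twice(operation, *args):
    """Model a retransmitted request: the server executes it twice."""
    operation(*args)
    operation(*args)
```

Repeating `deposit(50)` on a balance of 100 yields 200 instead of 150, whereas repeating `set_balance(150)` is harmless.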

5: Client crashes
Problem: The server is doing work and holding resources for nothing (called doing an orphan computation).
–The orphan is killed (or rolled back) by the client when it reboots
–The client broadcasts a new epoch number when recovering ⇒ servers kill orphans
–Require computations to complete within T time units. Old ones are simply removed.
Question: What’s the rolling back for?
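The epoch-number approach might look roughly like this on the server side. The sketch assumes a single client whose epoch number increases by one on every reboot; the `Server` class and its method names are hypothetical.

```python
class Server:
    """Tracks which client epoch started each running computation."""

    def __init__(self):
        self.current_epoch = 0
        self.computations = {}  # job id -> epoch of the client that started it

    def start(self, job_id, client_epoch):
        """Accept a request unless it comes from an epoch before the last reboot."""
        if client_epoch < self.current_epoch:
            return False  # stale request from before the crash
        self.computations[job_id] = client_epoch
        return True

    def on_new_epoch(self, epoch):
        """Handle the client's broadcast: kill everything from older epochs."""
        self.current_epoch = max(self.current_epoch, epoch)
        orphans = sorted(
            job for job, started in self.computations.items()
            if started < self.current_epoch
        )
        for job in orphans:
            del self.computations[job]
        return orphans
```

After the client reboots and broadcasts epoch 1, any computation started in epoch 0 is treated as an orphan and removed.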