Remote Procedure Call (RPC)


1 Remote Procedure Call (RPC)
UNIT - 2

2 Motivation A request/reply protocol model naturally fits the client/server model, and hence is appropriate for distributed systems. RPC (Remote Procedure Call) emerged as an IPC protocol for designing distributed applications in 1984. RPC is a mechanism through which control and data are transferred from one program to another.

3 Mechanism The caller places the arguments to the procedure (located at a remote place) in a specified location and format. Control is then transferred to the sequence of instructions that constitutes the body of the remote procedure. The procedure is executed. After execution, control (and the data making up the result) is returned to the caller.

4 Complexity The remote procedure does not reside in the address space of the calling process. The remote procedure may be on the same computer or on a different one; passing parameters and results is therefore complicated. Machines can crash and the network may fail.

5 Design Issues UNIT - 2

6 Parameter Passing
Call by value: Parameters are copied into a message. Suitable for simple, compact types such as integers. Passing large structures can increase transmission costs.
Call by reference: Highly difficult in the absence of shared memory and the presence of disjoint address spaces. Copy-in/copy-out may help but is language dependent.
Call by object reference: The Emerald designers proposed moving the parameter object along with its reference to the callee's node. Depending on whether the object is moved back to the caller's node after the call or not, this is interpreted as call-by-visit or call-by-move respectively.

7 Data Representation
Different byte ordering: little endian or big endian.
Different sizes of integers and other types: 16-bit or 32-bit, 1's or 2's complement.
Different floating-point representations.
Different character sets: ASCII, EBCDIC, Unicode.
A simple solution is to do conversions on the fly. Either implicit typing (only values are transmitted) or explicit typing (both type and value are transmitted) can be employed.
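As a concrete illustration of on-the-fly conversion, the standard POSIX byte-order routines put integers into big-endian network order before transmission. A minimal, self-contained C sketch, not tied to any particular RPC package:

#include <stdio.h>
#include <stdint.h>
#include <arpa/inet.h>   /* htonl(), ntohl() */

int main(void) {
    uint32_t host_value = 0x12345678;

    /* Sender: convert to network byte order (big-endian) before
       placing the value in the call message. */
    uint32_t wire_value = htonl(host_value);

    /* Receiver: convert back to the local host's byte order. */
    uint32_t received = ntohl(wire_value);

    /* On a little-endian host the wire value prints as 0x78563412,
       showing that the bytes really were reordered for the wire. */
    printf("host 0x%08x -> wire 0x%08x -> host 0x%08x\n",
           (unsigned)host_value, (unsigned)wire_value, (unsigned)received);
    return 0;
}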

8 Call Semantics The normal functioning of an RPC may be disrupted because
the call or the response is lost due to network failure, or
the caller or callee node crashes and is restarted.
Therefore, call semantics need to be standardized.
Possibly/maybe call semantics: the weakest semantics (included for completeness only). To prevent the caller from waiting indefinitely for a response, a timeout-based mechanism is employed. It guarantees nothing about receipt of the call message or about execution. Suitable for some applications distributed over a highly reliable LAN.

9 Call Semantics Last-one call semantics
Call messages are retransmitted, based on timeouts, until a response is received by the caller. The results of the last executed call are used by the caller.
Last-one semantics are easy to achieve when only 2 nodes are involved. But suppose node N1 calls procedure R1 on node N2, which in turn calls R2 on node N3; if N1 crashes and restarts, it calls R1 again, which in turn calls R2 again. The abandoned earlier executions are orphan calls; they tend to create problems, and their extermination is a difficult and costly solution.

10 Call Semantics Last-of-many call semantics
Similar to last-one call semantics, except that the results of orphan calls are avoided or discarded. Calls are given unique identifiers, and each response message carries the corresponding identifier. The caller accepts a response only if its identifier matches that of the most recently repeated call. Unfortunately, the caller has to wait for the last response.

11 Call Semantics At-least-once call semantics
Timeout-based retransmission without caring about orphan calls. For nested calls, the first response message is taken. Weaker call semantics.
Exactly-once call semantics
A feature of LPCs (local procedure calls) and thus the most desirable. The disadvantage of the previous call semantics is that they do not guarantee the same results for the same parameters if the procedure is executed more than once, e.g. readNextRecord(filename) or malloc(10). Timeouts, retransmissions, call identifiers, a reply cache, and duplicate filtering are employed to achieve exactly-once behavior.
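A minimal C sketch of how a reply cache with duplicate filtering might look; all names here (reply_cache, handle_call) are illustrative, not from any real RPC package:

#include <stdio.h>

#define CACHE_SIZE 64

/* Hypothetical per-server reply cache: remembers the last result per call id. */
struct cache_entry { unsigned long call_id; int result; int valid; };
static struct cache_entry reply_cache[CACHE_SIZE];

static int square(int x) { return x * x; }   /* stand-in remote procedure */

/* Execute at most once per call id; a retransmission replays the cached reply. */
static int handle_call(unsigned long call_id, int arg, int (*proc)(int)) {
    struct cache_entry *e = &reply_cache[call_id % CACHE_SIZE];
    if (e->valid && e->call_id == call_id)
        return e->result;                    /* duplicate filtered */
    e->call_id = call_id;
    e->result = proc(arg);                   /* first arrival: run once */
    e->valid = 1;
    return e->result;
}

int main(void) {
    printf("%d\n", handle_call(1, 7, square));  /* executes: 49 */
    printf("%d\n", handle_call(1, 7, square));  /* retransmission: cached 49 */
    return 0;
}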

12 Server Creation Semantics
Server processes may be created either before clients invoke RPCs or on demand. Based on the duration for which servers survive, they can be:
Instance-per-call server: the server is created on demand and then terminated. Any state information must be maintained by the OS or by the client. If the OS maintains it, RPCs become expensive; if the client does, they lose transparency. Also, multiple requests to the same type of server are expensive.

13 Server Creation Semantics
Instance-per-session server: the server exists for the entire session initiated by the client. Normally it is dedicated to one client, and hence maintains that client's state information until the client declares end-of-session. Thus it can be used by only a single client and is not scalable.
Persistent server: exists indefinitely and is shared by many clients; created before any client requests. It has to service concurrent requests, and RPCs need to be designed accordingly (see pop-up threads). Reliability can be achieved by replication, along with load balancing.

14 Binding The client needs to locate the server before the call.
The process by which a client process becomes associated with the server process, so that calls can take place, is called BINDING.

15 Considerations in binding
Server naming
Server locating
Binding time
Changing bindings
Multiple simultaneous bindings

16 Server Naming An interface name is used by the client to specify the server. It has 2 parts:
Type – specifies the interface itself (e.g. FAT_FS_SVC)
Instance – specifies one of the several instances of the same server
In general, the type is enough. Version numbers can be associated with the type field to provide new as well as old servers (e.g. FAT_FS_SVC_1_0 and FAT_FS_SVC_1_1). Interface names are created by programmers and are not dictated by RPC packages.

17 Server Locating Two common methods: Broadcasting
The client sends a message to all nodes asking for the interface type. If the server is replicated, many response messages are received; choose the best one (by load and network path). Good for small networks, but the method generates expensive network traffic in large ones, and the decision-making criteria (number of servers, workload, best path) are incomplete and out of date.

18 Server Locating Binding Agent
A name server (naming agent) is used to bind a client to a server by providing the client with the server's location. In addition, it maintains complete and up-to-date decision-making criteria. Its binding table maps a server's interface to its location; additional information can include instances, versions, load, best path, etc.

19 Server Locating Binding Agent
The binding agent can poll servers periodically to check that they still exist. Its address is implementation specific; a client may use broadcasting, and caching, to locate the binding agent. If the binding agent is relocated, it can use broadcasting to inform every node.

20 Server Locating 3 primitives: Register, De-register, Look-up
Register: when a server comes up, it registers itself with the binding agent (which can itself be located by broadcasting).
De-register: when a server goes down, it de-registers itself, though it may cache the location of the binding agent.
Look-up: used by the client to find the location of a server.
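The three primitives might look like the following in C; the in-memory binding table and every name and signature below are hypothetical, a sketch of the idea rather than any real package's API:

#include <stdio.h>
#include <string.h>

#define MAX_BINDINGS 32

/* Hypothetical binding table kept by the binding agent. */
struct binding {
    char interface_name[32];   /* e.g. "FAT_FS_SVC_1_0" */
    char host[32];
    unsigned short port;
    int in_use;
};
static struct binding table[MAX_BINDINGS];

/* Register: a server announces itself when it comes up. */
int register_server(const char *name, const char *host, unsigned short port) {
    for (int i = 0; i < MAX_BINDINGS; i++)
        if (!table[i].in_use) {
            snprintf(table[i].interface_name, sizeof table[i].interface_name, "%s", name);
            snprintf(table[i].host, sizeof table[i].host, "%s", host);
            table[i].port = port;
            table[i].in_use = 1;
            return 0;
        }
    return -1;                               /* table full */
}

/* De-register: a server withdraws itself when going down. */
int deregister_server(const char *name) {
    for (int i = 0; i < MAX_BINDINGS; i++)
        if (table[i].in_use && strcmp(table[i].interface_name, name) == 0) {
            table[i].in_use = 0;
            return 0;
        }
    return -1;                               /* not registered */
}

/* Look-up: a client resolves an interface name to a server location. */
int lookup_server(const char *name, char *host_out, unsigned short *port_out) {
    for (int i = 0; i < MAX_BINDINGS; i++)
        if (table[i].in_use && strcmp(table[i].interface_name, name) == 0) {
            strcpy(host_out, table[i].host);
            *port_out = table[i].port;
            return 0;
        }
    return -1;                               /* no such server */
}

int main(void) {
    char host[32]; unsigned short port;
    register_server("FAT_FS_SVC_1_0", "node7", 4321);
    if (lookup_server("FAT_FS_SVC_1_0", host, &port) == 0)
        printf("FAT_FS_SVC_1_0 -> %s:%u\n", host, port);
    deregister_server("FAT_FS_SVC_1_0");
    return 0;
}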

21 Binding Time Compile time
Hardcodes values in the code. Inflexible if the server is moved or replicated, or if the interface is changed. Cannot exploit the runtime characteristics of the system for efficient decisions.
Link time
The client contacts the agent for the interface location; the agent returns a handle, which the client caches; the client then calls the RPC. Good for situations where a client calls a specific RPC multiple times.

22 Binding Time Call time
Server-client binding takes place when the client calls the server for the first time. Indirect call method: the client passes the interface name and arguments to the agent; the agent calls the RPC on the client's behalf and returns the handle along with the result. The next time, a direct call can be made.

23 Changing Bindings When a call to the server fails, the client contacts the agent again. This happens when the server has been moved to another node or a new version has been installed. Migration of state information may also be required.

24 Multiple Simultaneous Bindings
A client may be bound to many servers of the same type, for reliability and fault tolerance. Multicast communication can be established at the binding agent, e.g. for an update to a file replicated at several nodes.

25 Server Locating Binding agent: advantages & disadvantages
Advantages: fault tolerant, as multiple servers of the same interface type are possible; load balancing; best-path selection; filtering of clients; location transparency; low network bandwidth consumption.
Disadvantages: single point of failure (replicating agents can help, but synchronization must be ensured); performance bottleneck (agents holding binding information for a specific class of services can be used); binding overhead if many short-lived clients exist.

26 Reading Assignment Distributed Systems: Principles and Paradigms
by AST & MV Steen, Chapter 2
2.4 Remote Procedure Call: Basic RPC Operation; Parameter Passing; Dynamic Binding; RPC Semantics in the Presence of Failures (Client Cannot Locate the Server, Lost Request Messages, Lost Reply Messages, Server Crashes, Client Crashes); Implementation Issues

27 Implementation of RPC UNIT - 2

28 Implementation of RPC Transparency is the main goal
Both syntactic and semantic transparency. RPCs achieve this goal by exploiting the concept of stubs: "Every problem in computer science can be solved by adding a layer of abstraction."
RPC packages contain 3 entities: the client/server process, the client/server stub, and the RPC runtime.

29 Implementation of RPC
Client stub: packs the specification of the target RPC and the arguments into a message, and unpacks the message carrying the result.
Server stub: unpacks the call message and packs the result.
RPC runtime: handles transmission; interacts with the binding agent; handles retransmission, call semantics, etc.

30 Implementation of RPC
Figure: the client calls into the client stub, which packs the call and hands it to the RPC runtime for sending; the server-side RPC runtime receives it, the server stub unpacks it, and the body of the RPC executes on the server. The result returns along the reverse path (pack, send, receive, unpack, return).

31 Implementation of RPC Stub generation can be done in 2 ways: Manually
The RPC programmer provides a set of translation functions from which a user can construct his or her own stubs. Easy to implement, and can handle complex parameter types.

32 Implementation of RPC Automatically
Uses an IDL (Interface Definition Language) to define the interface. An interface definition is a list of procedure signatures with their argument and result types. It also contains constants, enumerated types, and so on, to be used by both client and server, and specifies whether each argument is of input type, output type, or both:
Input arguments are copied from client to server.
Output arguments are copied from server to client.
The server exports the interface while the client imports it. Hence compile-time type checking is possible.

33 Implementation of RPC IDL Compiler
Uses the interface definition to create (automatically) the client and server stubs, the routines for argument marshaling and unmarshaling (marshaling means converting data into a form suitable for transmission), and other supporting files. For example, an interface definition might look like:
interface sum_svc {
    int sum([in] int x, [in] int y);
};

34 CLASSES of RPC UNIT - 2

35 Classes of RPC Callback RPC
The client-server relationship fits RPCs, but some applications require a peer-to-peer relationship. Example: a remote interactive application may need the user to input some data periodically.
Callback RPCs:
1) The RPC is called by the client.
2) The server executes part of the RPC and calls the client back.
3) The client processes the callback and returns the requested data to the server.
Steps 2 and 3 can happen multiple times. Finally, the server returns the result.

36 Classes of RPC
Figure: the client issues the RPC to the server; the server issues a callback to the client; the client returns the callback result; finally the server returns the overall result.

37 Classes of RPC 3 issues in Callback RPC
Providing the server with the client's handle: a client that uses callback RPC should use a transient but unique identifier for the callback service, and hence should register with the binding agent. This identifier is passed to the server during the RPC call, and the server uses it to invoke the callback RPC on the client (a peer-to-peer relationship).
Making the client process wait: the callback primitive should be synchronous/blocking.
Handling deadlocks, e.g. a cycle of callbacks among processes P1, P2 and P3.

38 Classes of RPC Broadcast RPC
The 1-to-1 relationship fits RPCs, but some applications require a 1-to-many relationship. Example: an update to a file replicated at n nodes. 2 ways:
Use a special broadcast primitive that is processed by the binding agent, which calls the RPC in multiple servers.
Use a special broadcast port to which all nodes are connected.

39 Classes of RPC Batch-Mode RPC
Normally RPCs are not called frequently, but some applications do call them frequently. To reduce the overhead of sending every individual RPC independently, and the individual waiting times, calls can be buffered at the client and sent to the server in a batch. The prime requisite of this mode is that the client must not require a reply for the sequence of requests. When to flush the queue? After a predefined interval, after a predefined number of calls, or when the buffer space is exhausted.
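A minimal C sketch of client-side batching, assuming a hypothetical flush-on-count policy (a real package might also flush on a timer or when the buffer fills):

#include <stdio.h>

/* Batch-mode buffering on the client: reply-less calls are queued and
   the whole batch is sent in one message once BATCH_MAX calls accumulate. */
#define BATCH_MAX 3

struct call { char proc[16]; int arg; };
static struct call batch[BATCH_MAX];
static int batched = 0;

static void flush_batch(void) {
    printf("sending batch of %d calls in one message\n", batched);
    batched = 0;                        /* one message carries them all */
}

/* Queue a reply-less call; flush once the batch is full. */
static void batch_call(const char *proc, int arg) {
    snprintf(batch[batched].proc, sizeof batch[batched].proc, "%s", proc);
    batch[batched].arg = arg;
    if (++batched == BATCH_MAX)
        flush_batch();
}

int main(void) {
    batch_call("append", 1);
    batch_call("append", 2);
    batch_call("append", 3);            /* triggers the flush */
    return 0;
}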

40 Classes of RPC Complicated RPC: Long-Duration Calls
Some mechanism must be established to keep the parties in sync:
1) Periodically, the client sends a probe packet to the server, which is acknowledged immediately. The packet contains the message identifier of the last call; the acknowledgement may indicate "processing" or "failed".
2) Periodically, the server generates an acknowledgement telling the client "I am processing the request". If the acknowledgement is not received, the client assumes the server has crashed or the network has failed.

41 Classes of RPC Complicated RPC: Long Message Calls
Some mechanism must be established if the arguments do not fit in a single packet:
1) Use several physical RPCs for one logical RPC (there is a fixed overhead in each individual RPC).
2) Fragment at a lower level in the protocol hierarchy.

42 RPC in LINUX UNIT - 2

43 RPC in Linux
Stub generation: both automatic and manual.
Procedure arguments & result: a procedure accepts only one argument and returns one result. Multiple arguments can be packed into a single one (such as a struct in the C language) and then sent. UNIX RPC stubs take 2 arguments – a pointer to the single argument struct and the client handle.
Marshaling: the RPC runtime library has procedures, used by the stubs, for marshaling some basic data types.
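A small C illustration of the single-argument convention: multiple logical arguments are packed into one struct, which an rpcgen-generated stub would then marshal. The sum function below is a local stand-in for the generated stub, not real Sun RPC code:

#include <stdio.h>

/* Sun RPC style: a remote procedure takes exactly one argument, so
   multiple logical arguments are packed into a single struct. */
struct sum_args { int x; int y; };

/* Stand-in for the generated client stub; a real stub would marshal
   the struct and hand it to the RPC runtime for transmission. */
static int sum(const struct sum_args *args) { return args->x + args->y; }

int main(void) {
    struct sum_args a = { 3, 4 };   /* pack both arguments into one struct */
    printf("sum = %d\n", sum(&a));
    return 0;
}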

44 RPC in Linux
Call semantics: supports at-least-once call semantics (timeout = 5 seconds, retries = 5 times).
Exception handling: error strings or the global stderr variable.
Binding: there is no network-wide client-server binding. Each server node has a local binding agent called the portmapper, which maintains a database mapping each service (identified by its program number and version number) to a port number. Clients have to mention the server's hostname explicitly, so location transparency is compromised.

45 RPC in Linux Security
No authentication, UNIX style (using UID and GID), or DES style (each user has a unique netname which is sent in encrypted form).

46 RPC in Linux Classes of RPC
Asynchronous RPC: set the timeout to zero.
Callback RPC: register the client process as a server with the local portmapper.
Broadcast RPC: the call is directed to all portmappers.
Batch mode: using queuing.

47 Synchronization in Distributed Systems
UNIT - 2

48 Synchronization Certain rules must be followed in an OS for sharing resources among concurrently executing programs so that correct results are obtained – synchronization mechanisms. Synchronization is harder to achieve in distributed systems because of disjoint address spaces, a physically unreliable network, and relevant information scattered over multiple machines.

49 Clock Synchronization
Temporal ordering of the events produced by concurrent processes is mandatory. On a centralized system all processes see the same clock, so ordering is easy to achieve. In a distributed system there are multiple clocks, and if they are not synchronized, senders and receivers will be out of sync, and serialization of concurrent accesses to shared objects cannot be guaranteed. Clock synchronization can be achieved by synchronizing physical clocks or by using logical clocks.

50 Physical clock Synchronization
UNIT - 2

51 Physical Clock Synchronization
Can we simply set and start all clocks at the same time? Computer clocks are realized with quartz crystals, which oscillate at a certain frequency when a voltage is applied, generating clock ticks at specific intervals. However, the frequency also depends on physical characteristics such as voltage, humidity, temperature, and the cut and quality of the crystal. This means that even if two (or more) clocks are set and started at the same time, they drift from an ideal clock and hence from each other. What is the solution?
Attach a UTC receiver (atomic-clock time source) to each machine – economically not feasible.
Attach a UTC receiver to one machine and periodically synchronize all clocks to it.

52 Physical Clock Synchronization
When to synchronize? The drift rate is the rate at which a clock drifts away from real time (typically on the order of one second in several days). Clock skew is the difference between 2 clocks at any instant of time. Depending on the nature and criticality of the system, any 2 clocks are said to be synchronized if the clock skew is less than some specified constant.

53 Physical Clock Synchronization
Figure: clock time C plotted against UTC time t. dC/dt = 1: perfect clock; dC/dt > 1: fast clock; dC/dt < 1: slow clock.

54 Physical Clock Synchronization
UTC (1 tick/sec)   Slow clock (0.5 ticks/sec)   Fast clock (1.5 ticks/sec)
 2                  1  (skew 1-2 = -1)           3  (skew 3-2 = +1)
 4                  2  (skew 2-4 = -2)           6  (skew 6-4 = +2)
 6                  3  (skew 3-6 = -3)           9  (skew 9-6 = +3)
 8                  4  (skew 4-8 = -4)          12  (skew 12-8 = +4)
10                  5  (skew 5-10 = -5)         15  (skew 15-10 = +5)
12                  6  (skew 6-12 = -6)         18  (skew 18-12 = +6)

55 Physical Clock Synchronization
In the worst case, 2 clocks drift in opposite directions at the maximum drift rate d; after ∆t of UTC time they are 2d∆t of clock time apart. If the maximum affordable skew is S, then S = 2d∆t, i.e. ∆t = S/2d. Thus the clocks must be resynchronized every S/2d so that the skew never exceeds S. For example, with d = 10^-5 (roughly a second a day) and S = 10 ms, resynchronization is needed every 0.01/(2 × 10^-5) = 500 seconds.

56 Physical Clock Synchronization
What if the 2 clocks drift in opposite directions at different rates d1 and d2? Then S = d1∆t + d2∆t, so ∆t = S/(d1 + d2). If d1 > d2, then S/2d1 < S/(d1 + d2): using the single worst-case rate makes us synchronize early, which is safe.
What if the 2 clocks drift in the same direction? Then S = |d1 – d2|∆t, so ∆t = S/|d1 – d2| (if d1 = d2 they never drift apart). Since S/2d ≤ S/|d1 – d2| for d the maximum drift rate, we again synchronize early.

57 Physical Clock Synchronization
Figure: a fast and a slow clock plotted against UTC time; after an interval ∆t of UTC time they are 2d∆t apart in clock time.

58 Physical Clock Synchronization
Based on this proposition, there are 2 types of clock synchronization algorithms:
Centralized algorithms: the passive time server algorithm, the active time server algorithm, and the Berkeley algorithm.
Distributed algorithms: the global averaging algorithm and the localized averaging algorithm.

59 Passive-time server Algorithm
Steps: a time-server node has a UTC receiver. Periodically (before S/2d time is over), every node sends a message to this time server asking for its time, and synchronizes accordingly. The time server responds immediately with its current time t.
Issue: because of the propagation delay incurred, the received time needs to be adjusted. Assuming a symmetric delay, with T0 the time the request was sent and T1 the time the reply arrived:
Current time = t + (T1 – T0)/2

60 Passive-time server Algorithm
Issue: The measure doesn’t take into consideration the elimination of request processing time for accurate measure. Considering symmetric delay; Current time = t + (T1 – T0 - I)/2 This way only the time taken by message to reach the client is used for adjustment. T0 T1 C S I t

61 Passive-time server Algorithm
The accuracy can be improved by making a series of calls, yielding a number of (T1 – T0) measurements, and using the minimum (or the average) of them. For a fault-tolerant average, (T1 – T0) values greater than some threshold are discarded as victims of network congestion. This is Cristian's algorithm, used in NTP.
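A minimal C sketch of the Cristian-style adjustment formula above; the numbers in main are made up purely for illustration:

#include <stdio.h>

/* Cristian-style adjustment: t is the server's reported time, T0/T1 the
   client's send/receive times, I the server's processing time (all in
   the same units). Assumes symmetric network delay. */
static double adjusted_time(double t, double T0, double T1, double I) {
    return t + (T1 - T0 - I) / 2.0;
}

int main(void) {
    /* Illustrative numbers: request sent at T0 = 100.0, reply received at
       T1 = 100.8, server reported t = 200.0 and spent I = 0.2 processing. */
    double now = adjusted_time(200.0, 100.0, 100.8, 0.2);
    printf("estimated server time on receipt: %.2f\n", now);  /* 200.30 */
    return 0;
}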

62 ACTIVE-time server Algorithm
Steps: a time-server node has a UTC receiver and periodically broadcasts its current time T. All nodes have some prior estimate td of the minimum network delay, and use it to set: Correct time = T + td
Issue: not fault tolerant if the actual network delay exceeds td.

63 Berkeley Algorithm Steps (no UTC is used):
The time server asks every node for its current time.
The time server has prior knowledge of the network delay between every node and itself, and uses it to estimate each node's correct time.
A fault-tolerant average of all the values (including its own) is calculated.
The adjustments are then propagated to all nodes.

64 Berkeley Algorithm Example (network delay Td = 1 tick each way):
The server reads 15, client A reads 18, client B reads 12. Compensating for the one-tick delay, the server sees the three values as 19, 16 and 13; their average is (19 + 16 + 13)/3 = 16.
Adjustments: server 16 – 16 = 0; client A 16 – 19 = –3; client B 16 – 13 = +3. After the adjustment messages arrive (one further tick later), all three clocks read 17.
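A minimal C sketch of one Berkeley round using the numbers from the example above; the outlier threshold and overall structure are illustrative assumptions:

#include <stdio.h>
#include <math.h>

/* Berkeley-style round: delay-compensated clock readings, a fault-tolerant
   average (readings too far from the server's own are discarded), then
   per-node adjustments. Compile with -lm for fabs(). */
#define N 3
#define THRESHOLD 10.0

int main(void) {
    /* readings after adding the known one-tick delay: A=19, server=16, B=13 */
    double clock[N] = { 19.0, 16.0, 13.0 };
    double server_clock = clock[1];

    double sum = 0.0; int count = 0;
    for (int i = 0; i < N; i++)
        if (fabs(clock[i] - server_clock) <= THRESHOLD) {  /* drop outliers */
            sum += clock[i]; count++;
        }
    double avg = sum / count;                /* (19 + 16 + 13)/3 = 16 */

    for (int i = 0; i < N; i++)              /* -3.0, +0.0, +3.0 */
        printf("node %d adjustment: %+.1f\n", i, avg - clock[i]);
    return 0;
}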

65 Centralized Algorithms
Drawbacks:
Subject to single-point failure.
Not scalable.
Positive adjustments pose no problem, but negative adjustments (setting a clock back) can create chaos.

66 Distributed Global Averaging Algorithm
Steps: every node periodically broadcasts its local time, then waits for a specified time T, during which it collects the same kind of messages from the other nodes, recording each message's arrival time according to its own clock. After T has elapsed, the node estimates the skew of its clock with respect to each of the other nodes, computes a fault-tolerant average, and uses this skew to adjust its clock.
When to resynchronize? At times T0 + iR, where T0 is a fixed time in the past agreed upon by all nodes and R is a system parameter.

67 Distributed Localized Averaging Algorithm
The global averaging algorithm puts a heavy load on the network. In this algorithm, pairs of neighboring nodes exchange their clock times, average them, and re-adjust their clocks. The load on the network is reduced, and with time all the clocks in the system become (and remain) synchronized. However, it requires some ordering of the nodes.

68 Logical clock UNIT - 2

69 Logical Clock It is sufficient to ensure that all events are totally ordered in a manner consistent with observed behavior – Lamport. Let us define time in terms of the order in which events occur, not in terms of physical clock time. Getting all our events marked by unique numbers in sequence gives us a logical clock.

70 Happened-before relation
Denoted by ->.
If a and b are two events in the same process and a occurs before b, then a -> b.
If two processes exchange a message, with a the send event and b the receive event, then a -> b.
Transitivity (the law of causality): if a -> b and b -> c, then a -> c.

71 Happened-before relation
a -> a is never true.
If a and b are two events in 2 processes that do not exchange messages (directly or indirectly), then neither a -> b nor b -> a: they are concurrent, and nothing can be said about their order. In other words, neither can causally affect the other.
Figure (causal ordering): events a and c occur in P1 and event b in P2; a -> c holds, but b is concurrent with both.

72 Happened-before relation
Figure (partial ordering): events a, b in P1; c, d in P2; e, f in P3. Messages give a -> b -> c -> d -> e, but f is concurrent with the others – the relation is only a partial order.

73 Implementation of Logical Clock
A logical clock is a way to associate a timestamp (a number) with each event so that events related to each other by the happened-before relation (i.e. non-concurrent events) are properly ordered: if a -> b, then clock(a) < clock(b). The clock must always go forward, and is incremented between any two successive events (related or not). It can be implemented using counters or physical clocks, and can be global or local – but a global clock makes the system centralized and hence vulnerable to the usual problems.

74 Local Logical Clock Every process gets its own logical clock.
If event a is the sending of a message by process P1 (clock C1) with timestamp t1 to process P2 (clock C2), then on receipt P2 updates C2 as follows:
If t1 < C2 + 1, set C2 = C2 + 1;
else (t1 >= C2 + 1), set C2 = t1 + 1.
Equivalently, C2 = max(C2, t1) + 1.
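A minimal C sketch of the update rule, written in its equivalent max form:

#include <stdio.h>

/* Lamport logical clock, one counter per process. */
static long clock_tick(long *c) { return ++*c; }   /* local event or send */

/* On receive: jump past the sender's timestamp if it is ahead,
   i.e. C = max(C, t) + 1 (equivalent to the two cases above). */
static long clock_receive(long *c, long t) {
    *c = (t > *c ? t : *c) + 1;
    return *c;
}

int main(void) {
    long c1 = 0, c2 = 40;                 /* two processes' clocks */
    long t = clock_tick(&c1);             /* P1 sends with timestamp 1 */
    printf("P2 after receive: %ld\n", clock_receive(&c2, t));   /* 41 */
    t = 60;                               /* message from a faster process */
    printf("P2 after receive: %ld\n", clock_receive(&c2, t));   /* 61 */
    return 0;
}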

75 Example: three processes whose clocks tick 4, 7 and 10 time units per interval; a message receipt pushes a lagging clock forward (C = max(C, t) + 1, with t the incoming timestamp):
Time   P1 (4 ticks)   P2 (7 ticks)   P3 (10 ticks)
 1          4               7             10
 2          8              14             20
 3         12              21             30
 4         16              28             40
 5         20              35             50
 6         24         42 => 51            60
 7         28              58             70
 8    32 => 59              65             80
 9         63              72             90
10         67              79            100

76 Reading Assignment Distributed Systems: Principles and Paradigms
by AST & MV Steen, Chapter 3
3.1 Clock Synchronization: Logical Clocks; Physical Clocks; Clock Synchronization Algorithms (Cristian's Algorithm, The Berkeley Algorithm, Averaging Algorithms)

77 Mutual Exclusion in Distributed OS
Unit 2

78 Mutual Exclusion? A way to ensure that two or more processes access a shared resource in a serialized way; in other words, exclusive access is given to one process at a time to update a shared resource. The region within the process that is given exclusive access is the critical region.

79 How to achieve it?
Centralized algorithms
Distributed algorithms:
Contention based – timestamp based or voting based
Token based

80 Centralized Algorithm
A coordinator process coordinates exclusive access to shared resources.
Figure: process 1 sends REQUEST and receives GRANTED, entering the critical region; process 2's REQUEST arrives meanwhile and is put in the queue (no reply yet); when process 1 sends RELEASE, the coordinator pulls the next request from the queue and sends GRANTED to process 2.

81 Advantages & Problems Advantages Problems
Grants permission in the order of requests – A Fair Algorithm. Easy to implement Problems Single point failure can bring down the system. Not Scalable Confusion in dealing with Denial & Dead. Can be solved by adding a message for Denial.

82 Distributed Algorithms
Contention based
Token based

83 Contention Based Contention-based algorithms allow multiple processes to request a shared resource simultaneously, and resolve the contention based on timestamps or voting.

84 Timestamp Based When a process p wants to access a shared resource, it sends a message to all other nodes containing the resource ID and its timestamp. A receiver takes one of the following actions:
If it is not in the critical region and does not want to enter it, it sends a reply immediately.
If it is in the critical region, it does not reply and queues the request.
If it wants to enter the critical region itself, it compares the two timestamps – the sender's and its own; the earlier request (lowest value) wins. If the receiver wins, it queues the request and does not reply; if the sender wins, the receiver replies.

85 Timestamp Based The sender takes the following actions:
It waits until it receives replies (say, OK) from every node.
After getting permission from all nodes, it executes the critical section.
After the critical section, it replies to all request messages queued previously (if any).
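A minimal C sketch of the receiver's decision, in the style of Ricart and Agrawala; process ids are used to break timestamp ties (an assumption the slides leave implicit), and message transport is abstracted away:

#include <stdio.h>

/* Receiver side of the timestamp-based scheme: decide whether to reply
   OK now or queue the request until we leave the critical section. */
enum state { RELEASED, WANTED, HELD };

/* Returns 1 if an OK reply should be sent immediately,
   0 if the request must be queued. */
static int should_reply(enum state s,
                        long my_ts, int my_id,
                        long req_ts, int req_id) {
    if (s == HELD)
        return 0;                               /* in CS: defer */
    if (s == WANTED) {
        /* Both want it: earlier timestamp wins; ids break ties. */
        if (req_ts < my_ts || (req_ts == my_ts && req_id < my_id))
            return 1;                           /* sender wins */
        return 0;                               /* we win: defer */
    }
    return 1;                                   /* not interested: OK */
}

int main(void) {
    /* We want the CS with timestamp 5; a request stamped 3 arrives. */
    printf("%s\n", should_reply(WANTED, 5, 2, 3, 1) ? "reply OK" : "queue");
    return 0;
}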

86 Timestamp Based
Figure: process 1 (timestamp 3) and process 2 (timestamp 5) both request; the earlier timestamp wins, so process 1 is inside the critical section while process 2 waits.

87 Advantages & Problems Advantages Problems
No Starvation & Guaranteed Mutual Exclusion Problems 2(n-1) messages are send/received for a single request. Number of point-failures is n – Higher single point failure probability. Confusion in dealing with Denial & Dead. Can be solved by adding a message for Denial (Tanenbaum95)

88 Voting Based The same as the timestamp-based scheme, except that the decision is made as soon as replies (say, OK) are received from a majority of the nodes. In this approach a process may give its permission to only one process at a time, so the requesting process must inform the others when it is done.

89 Token Based Nodes are arranged in a logical ring.
Some process p0 initializes the token, which then circulates around the ring. A process that needs to access a shared resource waits for the token to arrive, holds it while executing the critical section, and then passes it to the next process in the ring. A process that receives the token but does not need any shared resource simply passes it on.
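A minimal C simulation of the token ring, collapsing all nodes into one process; the token simply travels the ring, and a node holding it may enter its critical section:

#include <stdio.h>

/* Token-ring mutual exclusion, simulated: the token travels
   node 0 -> 1 -> ... -> N-1 -> 0; a node holding the token may enter
   its critical section, then must pass the token on. */
#define N 4

int main(void) {
    int wants_cs[N] = { 0, 1, 0, 1 };    /* nodes 1 and 3 want the CS */
    int token_at = 0;

    for (int step = 0; step < 2 * N; step++) {
        if (wants_cs[token_at]) {
            printf("node %d enters critical section\n", token_at);
            wants_cs[token_at] = 0;       /* done with the resource */
        }
        token_at = (token_at + 1) % N;    /* pass token to successor */
    }
    return 0;
}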

90 Token Based
Figure: process 1 holds the token and executes its critical section; process 3 needs to access a shared resource and waits for the token to come around; process 2 merely passes the token on.

91 Advantages & Problems Advantages
Guarantees Mutual Exclusion & avoids starvation. Problems Lost tokens can create confusion Denial or Dead Lost Process

92 Comparison of the 3 Algorithms
Algorithm                  Messages per critical section   Delay before permission (in message times)   Problems
Centralized                3                               2                                             Coordinator crash
Distributed (contention)   2(n-1)                          2(n-1)                                        Crash of any process
Token ring                 1 to infinity                   0 to n-1                                      Lost token, crashed process

93 Reading Assignment Distributed Systems: Principles and Paradigms
by AST & MV Steen, Chapter 3
3.2 Mutual Exclusion: A Centralized Algorithm; A Distributed Algorithm; A Token Ring Algorithm; A Comparison of the Three Algorithms

94 ELECTION ALGORITHMS in Distributed OS
Unit 2

95 Election Algorithms Failures are inevitable. Two strategies: don't care, or reorganize.
Most distributed algorithms rely on the existence of a coordinator, a sequencer, an initiator, or some other special process, generically called the coordinator process. What do we do if such a process goes down? We need to dynamically elect a new coordinator process.

96 Election Algorithms Election algorithms are meant for electing a coordinator process from among the currently running processes, in such a manner that at any instant of time there is a single coordinator for all processes. All election algorithms make certain assumptions:
Every process has a unique priority number.
Every process knows the priority numbers of the other processes.
The process with the highest priority number is elected.
On recovery, a failed coordinator takes appropriate steps to rejoin.

97 Election Algorithms
Bully algorithm
Invitation algorithm
Ring algorithm

98 Bully Algorithm The Bully algorithm assumes that:
Each process stores its state on permanent storage.
There are no transmission errors.
The communication subsystem does not fail.

99 Bully Algorithm When a process p asks the coordinator for some service and the coordinator does not respond, p assumes the coordinator is down. What to do? Announce an election: the node that found the coordinator down sends an ELECTION message to every node whose priority is higher than its own.

100 Bully Algorithm There are 3 possible outcomes:
Nobody replies: the initiating process becomes the coordinator.
A single (higher-priority) process replies: that process takes over the election.
Multiple replies: the initiating process is relieved of its duties, and the replying processes carry the election further until a single coordinator is found.
After a coordinator is found, every process is informed via a COORDINATOR message.

101 Bully Algorithm
Figure: coordinator 5 has crashed (X); processes 1-4 remain and must elect a new coordinator.

102 Bully Algorithm
Figure: process 5 recovers and announces an election ("Let's go for an election" – Garcia-Molina); having the highest priority, it bullies the current coordinator, 4, out of the job and becomes coordinator again.

103 Bully Algorithm Assuming n processes:
Worst case: if the initiator is the lowest-priority process and every other process is interested, O(n^2) messages are sent.
Best case: if the initiator is the highest-priority process, n-1 messages are sent.
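A minimal C sketch that collapses the bully message exchange to its outcome – the highest-priority live process wins; the alive[] array and process count are illustrative assumptions:

#include <stdio.h>

/* Bully election, simulated in one process. alive[i] says whether
   process i is up; a higher index means higher priority. */
#define N 5

static int alive[N] = { 1, 1, 1, 1, 0 };   /* coordinator 4 is down */

/* Process p holds an election: it messages every higher-priority
   process; if none answers, p wins; otherwise the highest live
   process ends up winning the election. */
static int hold_election(int p) {
    int winner = p;
    for (int q = p + 1; q < N; q++)
        if (alive[q])
            winner = q;          /* a higher process replies and takes over */
    return winner;               /* announced via a COORDINATOR message */
}

int main(void) {
    printf("process 1 starts an election -> new coordinator: %d\n",
           hold_election(1));    /* 3: the highest live priority */
    return 0;
}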

104 Invitation Algorithm The Bully algorithm fails for asynchronous systems: it works only if processes respond in a timely fashion, which is an unrealistic assumption. A synchronous DS is one in which:
The time to execute each step of a process has known lower and upper bounds, and
Each message transmitted over a channel is received within a known bounded time.

105 Invitation Algorithm The Invitation algorithm works in the presence of timing failures – it makes no timing assumptions. The algorithm works even if some router fails, making communication between 2 subsets of the processes impossible. In that case, how can a single global coordinator exist? It makes sense to think in terms of a coordinator for each sub-group of processes.

106 Invitation Algorithm Initially there is a single group with a global coordinator. When it fails, the node(s) sensing the failure start creating singleton groups, each with itself as coordinator. Every such group is given a unique group number. The coordinators of these singleton groups periodically send invitations to the other processes of the old group to join them in forming a larger group.

107 Invitation Algorithm As the group structure changes, the group is assigned a new unique group number. The unification is done as follows: a coordinator sends messages to every other node asking whether that node is itself a coordinator. If they reply that they are, it waits for a period based on its own group number before issuing the invitation; processes with lower priority defer sending invitations for a longer period, which avoids everyone sending invitations to all processes. A coordinator (or ordinary process) that receives an invitation from a higher-priority process accepts the proposal.

108 Invitation Algorithm When a coordinator of another group receives an invitation, it forwards it to all members of its group. Any process receiving an invitation accepts it by sending ACCEPT to the inviting coordinator, which acknowledges with ANSWER. The process (coordinator) that initiated the merger becomes the coordinator of the new group. This is confirmed by sending READY to each member, which responds with ANSWER.

109 Invitation Algorithm
Figure: the INVITATION / ACCEPT / ANSWER / READY message exchange among processes 1 (top priority) through 4 (least priority).

110 Invitation Algorithm In the Bully algorithm, the low-priority processes are 'bullied' into submission by the high-priority processes. In the Invitation algorithm, a process invites the other processes to join its group and to agree on it being the leader.

111 Ring Algorithm There is a logical ring of processes.
When a process senses that the coordinator has failed, it initiates an election by passing an ELECTION message, containing its priority, to its successor in the ring. Each next process appends its own priority to the message and passes it on; if a successor is down, the sender skips it until an alive successor is found. Finally, the initiator receives the message back: the highest priority in the list identifies the coordinator.
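A minimal C simulation of one circulation of the ELECTION message; the alive[] array and the initiator are illustrative assumptions:

#include <stdio.h>

/* Ring election, simulated: the ELECTION message circulates once,
   collecting the priorities of live processes; the highest priority
   collected becomes the coordinator. Message passing is abstracted. */
#define N 5

int main(void) {
    int alive[N] = { 1, 1, 1, 1, 0 };   /* old coordinator 4 is down */
    int initiator = 3;

    int best = -1;
    int p = initiator;
    do {
        if (alive[p] && p > best)
            best = p;                   /* append own priority; track max */
        p = (p + 1) % N;                /* dead successors are skipped */
    } while (p != initiator);

    printf("new coordinator: %d\n", best);   /* 3 */
    return 0;
}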

112 Ring Algorithm After the coordinator is identified, every node is informed by a COORDINATOR message, sent around the ring in the same fashion as the ELECTION message; when it returns to the initiator, it is removed. Two or more processes might initiate elections simultaneously; in that case extra messages circulate, BUT the same new coordinator is still elected.

113 Ring Algorithm
Figure: the coordinator has crashed (X). Starting at process 3, the ELECTION message accumulates priorities around the ring (3; then 3, 4; then 3, 4, 2); on its return, the highest entry wins: 4 is the coordinator.

114 Reading Assignment Distributed Systems: Principles and Paradigms
by AST & MV Steen, Chapter 3
3.3 Election Algorithms: The Bully Algorithm; A Ring Algorithm

115 References
Books:
Distributed Systems: Principles and Paradigms by A. S. Tanenbaum & M. van Steen
Distributed OS (Concepts & Design) by P. K. Sinha
Papers:
"Implementing Remote Procedure Calls" by Andrew D. Birrell and Bruce J. Nelson

