Algorithm for Virtually Synchronous Group Communication
  Idit Keidar, Roger Khazan
  MIT Lab for Computer Science
  Theory of Distributed Systems Group

Talk Outline
  Motivating applications
  Group Communication
  Virtual Synchrony
  Existing solutions
  Problems with existing solutions
  Our idea
  Our contributions
  Implementation

Motivation: Modern Distributed Applications
  Highly available servers
    –Web
    –Video-on-Demand
  Collaborative computing
    –Military command and control
    –Shared white-board, shared editor, etc.
    –Online strategy games
  Stock Market

Common, Important Issues in Building Distributed Applications
  Reliable communication
  Consistency
    –same picture of game, same shared file
  Fault-tolerance, high-availability
    –failures, recoveries, partitions, merges
  Scalability
  Performance

Group Communication -- Useful “Building Block”
  Group Abstraction
    –processes interact in a group
    –dynamic: fail/join/partition/merge
  Reliable Group Multicast
  Group Membership -- generates “views”
    –tell each process who it is connected to
  Systems: Ensemble, Horus, Isis, Newtop, Psync, Sphynx, Relacs, Totem, Transis

Example: Group Communication

Virtual Synchrony
  Synchronization of Messages and Views
    –Procs that go together through the same views deliver the same sets of messages
  Powerful abstraction for replication
  Semantics: VS [Birman, Joseph 87], EVS, SVS

Example: Virtual Synchrony
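The property stated above can be checked mechanically on recorded executions. The following is a minimal sketch, not the authors' specification: it assumes each process history is a list of `("view", id)` and `("deliver", msg)` events, that view ids are unique, and the function names are hypothetical.

```python
def delivered_per_transition(history):
    """Map (old_view, new_view) -> set of messages delivered in old_view.

    `history` is a list of ("view", view_id) and ("deliver", msg) events,
    an assumed encoding used only for this illustration.
    """
    result = {}
    current, delivered = None, set()
    for kind, value in history:
        if kind == "view":
            if current is not None:
                result[(current, value)] = frozenset(delivered)
            current, delivered = value, set()
        elif kind == "deliver":
            delivered.add(value)
    return result


def satisfies_virtual_synchrony(histories):
    """True if processes that move together from one view to the next
    delivered the same set of messages in the earlier view."""
    seen = {}
    for history in histories.values():
        for transition, msgs in delivered_per_transition(history).items():
            # The first process to make this transition fixes the expected set.
            if seen.setdefault(transition, msgs) != msgs:
                return False
    return True


histories = {
    "p": [("view", "v1"), ("deliver", "m1"), ("deliver", "m2"), ("view", "v2")],
    "q": [("view", "v1"), ("deliver", "m2"), ("deliver", "m1"), ("view", "v2")],
    # r partitions away: it moves v1 -> v3, so it is not constrained by p and q.
    "r": [("view", "v1"), ("deliver", "m1"), ("view", "v3")],
}
print(satisfies_virtual_synchrony(histories))
```

Note that the check compares sets, not delivery order, and constrains only processes that make the same view transition.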

Virtual Synchrony: How To?
  Before moving into new view:
    –Exchange synch messages (“flush”) to agree which msgs to deliver in old view
  Need to know which synch msgs to use, since there may be several view proposals

Example: Synchronization Msgs
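As a rough illustration of the flush step, the sketch below assumes each synch message carries a "cut": a map from sender to the number of that sender's messages the process commits to deliver in the old view. Taking the per-sender maximum over the processes that move together is one plausible agreement rule, used here as an assumption; the names `agreed_cut` and `missing_msgs` are hypothetical.

```python
def agreed_cut(synch_msgs, moving_together):
    """Combine the cuts of the processes that move together into the new view.

    synch_msgs: process -> cut, where a cut maps sender -> message count.
    Returns the per-sender maximum, so every message that any of these
    processes committed to is delivered by all of them in the old view.
    """
    cut = {}
    for proc in moving_together:
        for sender, count in synch_msgs[proc].items():
            cut[sender] = max(cut.get(sender, 0), count)
    return cut


def missing_msgs(local_cut, target_cut):
    """Per-sender message indices (1-based) still to be delivered locally."""
    return {
        sender: list(range(local_cut.get(sender, 0) + 1, count + 1))
        for sender, count in target_cut.items()
        if count > local_cut.get(sender, 0)
    }


synch_msgs = {
    "p": {"p": 3, "q": 1},
    "q": {"p": 2, "q": 2},
    "r": {"p": 5},  # r is not moving with p and q, so its cut is ignored
}
target = agreed_cut(synch_msgs, {"p", "q"})
print(target)
print(missing_msgs(synch_msgs["q"], target))
```

Here q learns that it must still deliver p's third message before installing the new view.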
Problematic Scenario

Existing Solutions
  Limit Reconfiguration
    –Do not allow joins during reconfiguration
    –When someone wants to join: first, deliver view without joiner; then, start new reconfiguration
  Use common id to identify synch msgs for same view proposal

Limited Reconfiguration

Problems with Existing Solutions
  Limited Reconfiguration
    –Obsolete views delivered to application
    –Creates overhead
    –Limits usefulness of virtual synchrony
  Use of common id to identify synch msgs
    –Pre-agreement or dissemination is required
    –Costly, especially in WANs

Our Idea
  Don’t limit reconfiguration
  Issue locally unique id per process for each view proposal
  Tag synch msgs with these local ids
  View includes vector of latest local ids
    –View is a triple: e.g.,
  Procs use synch msgs identified by view
  Hence, procs use right synch msgs

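The selection rule on this slide can be sketched as follows. This is an illustration under assumptions, not the authors' code: the field names `view_id`, `members`, and `start_ids`, the `Process` class, and the string payloads are all hypothetical. Each process numbers its own view proposals, tags its synch messages with that number, and on receiving a view picks from each member the synch message whose tag matches the view's vector.

```python
from dataclasses import dataclass


@dataclass
class View:
    view_id: int
    members: frozenset
    start_ids: dict  # member -> latest local id known to the membership service


class Process:
    def __init__(self, name):
        self.name = name
        self.local_id = 0   # locally unique, no agreement with others needed
        self.received = {}  # (sender, local id) -> synch payload

    def start_proposal(self, payload):
        """Issue a fresh local id and build a synch message tagged with it."""
        self.local_id += 1
        return (self.name, self.local_id, payload)

    def on_synch(self, msg):
        sender, local_id, payload = msg
        self.received[(sender, local_id)] = payload

    def synch_msgs_for(self, view):
        """Pick, per member, the synch message identified by the view.

        A value of None means that synch message has not arrived yet.
        """
        return {
            m: self.received.get((m, view.start_ids[m])) for m in view.members
        }


p, q = Process("p"), Process("q")
# p sees two view proposals (for example, a joiner appears mid-reconfiguration).
msgs = [p.start_proposal("cut-A"), p.start_proposal("cut-B"),
        q.start_proposal("cut-C")]
for m in msgs:
    q.on_synch(m)

view = View(view_id=7, members=frozenset({"p", "q"}),
            start_ids={"p": 2, "q": 1})
print(q.synch_msgs_for(view))
```

Because the view names p's second local id, q ignores the synch message from p's earlier, superseded proposal without any common proposal id having been agreed on.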
Our Algorithm Allows Joiners
No Common Sync Ids Required

Contributions
  Algorithm for Virtual Synchrony
    –useful, commonly provided semantics
    –processes can join during reconfiguration; hence, more benefit from Virtual Synchrony
    –one round, in parallel with view reconfiguration
  Architecture
    –external membership service ([KSMD, 00])
  Formal Treatment
    –specs, algorithm, safety and liveness proofs

Implementation
  VS library (C++), linked with application
  Uses [KSMD, 00] membership service
    –implemented in C++, socket interface with members
  Reliable FIFO layer (developed at the Hebrew University)
    –uses IP multicast and recovers lost messages
    –library, linked with VS