Lecture 7: Data distribution. Epidemic protocols. EECE 411: Design of Distributed Software Applications

1 Lecture 7: Data distribution. Epidemic protocols

2 EECE 411: Design of Distributed Software Applications. Epidemic algorithms: Basic idea
Update operations are initially performed at one node.
A node passes its updated state to a limited number of ‘peers’, which in turn pass the update to other peers.
Eventually, each update will reach every node.
Update propagation is lazy, i.e., not immediate.
[Assumption: there are no write-write conflicts.]

3 Preventing an incident like the Amazon S3 incident
Verify message and state correctness: all kinds of corruption errors may occur. Add checksums to detect corruption of system state messages, and verify invariants before processing state.
Engineer protocols to control the number of messages they generate. Add rate limiters.
Add monitoring and alarming for gossip rates and failures.
An emergency procedure that restores a clean state in your system may be the solution of last resort. Make it work quickly.
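Two of the measures above, checksums on state messages and rate limiting, can be sketched as follows. This is a minimal illustration and not the mechanism used in any real system: the message layout (`body` plus `sha256`), the helper names `wrap` and `unwrap`, and the `TokenBucket` class are all assumptions made for the example.

```python
import hashlib
import json
import time


def wrap(state: dict) -> dict:
    """Attach a SHA-256 checksum to a gossip payload (hypothetical message format)."""
    body = json.dumps(state, sort_keys=True)
    return {"body": body, "sha256": hashlib.sha256(body.encode()).hexdigest()}


def unwrap(message: dict) -> dict:
    """Return the payload, or raise ValueError if the checksum does not match."""
    digest = hashlib.sha256(message["body"].encode()).hexdigest()
    if digest != message["sha256"]:
        raise ValueError("corrupted gossip message")
    return json.loads(message["body"])


class TokenBucket:
    """Allow `rate` messages per second on average, with bursts of up to `burst`."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate, self.burst, self.clock = rate, burst, clock
        self.tokens = float(burst)
        self.last = clock()

    def allow(self) -> bool:
        # Refill tokens for the elapsed time, capped at the burst size.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A sender would call `allow()` before each gossip message and drop or delay the message when it returns `False`; a receiver would call `unwrap()` and discard anything that raises.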

4 Epidemic algorithms: Principle
Basic idea: A node passes its updated state to a limited number of other peers (generally randomly chosen); these peers, in turn, pass the update to other peers. Update propagation is lazy, i.e., not immediate. Eventually, each update should reach every node.
Anti-entropy: Each node regularly chooses another node at random and exchanges state differences, leading to identical states at both afterwards.
[Variation] Gossiping: A replica that has just been updated (i.e., has been contaminated) tells a number of other replicas about its update (contaminating them as well).
Advantages: reliability, asynchronous communication, autonomous nodes.

5 Anti-Entropy Protocols
Each node P selects another node Q from the system at random.
Push: P only sends its updates to Q.
Pull: P only retrieves updates from Q.
Push-Pull: P and Q exchange mutual updates (after which they hold the same information).

6 Anti-entropy: Push and Pull
[Figure: push and pull exchanges. Legend: susceptible (clean) node, infected node, rumor.]

7 Anti-Entropy Protocols
Each node P selects another node Q from the system at random.
Push: P only sends its updates to Q.
Pull: P only retrieves updates from Q.
Push-Pull: P and Q exchange mutual updates (after which they hold the same information).
Observation: with push-pull it takes O(log(N)) rounds to disseminate updates to all N nodes (one round = every node takes the initiative to start one exchange).
Main properties:
Reliability: node failures do not impact the protocol.
Dissemination time and effort scale well with the number of nodes.
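The three exchange modes and the round count can be explored with a small simulation. This is only a sketch under simplifying assumptions: a single update, no failures, synchronous rounds, and peers chosen uniformly at random. The function name `rounds_to_spread` and its parameters are made up for the example.

```python
import random


def rounds_to_spread(n: int, mode: str = "push-pull", seed: int = 0) -> int:
    """Count rounds until an update starting at node 0 reaches all n nodes (n >= 2)."""
    rng = random.Random(seed)
    infected = [False] * n
    infected[0] = True
    rounds = 0
    while not all(infected):
        rounds += 1
        snapshot = infected[:]          # who had the update when the round began
        for p in range(n):              # every node initiates one exchange
            q = rng.randrange(n - 1)
            if q >= p:
                q += 1                  # uniform choice among peers other than p
            if mode in ("push", "push-pull") and snapshot[p]:
                infected[q] = True      # P sends its update to Q
            if mode in ("pull", "push-pull") and snapshot[q]:
                infected[p] = True      # P retrieves the update from Q
    return rounds


if __name__ == "__main__":
    for size in (100, 1000, 10000):
        print(size, rounds_to_spread(size))
```

Running it for increasing `n` gives round counts that grow slowly compared to `n`, which is consistent with the O(log(N)) observation.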

8 Gossiping
Basic model: A node S that is ‘infected’ (i.e., has an update to report) contacts other randomly chosen nodes and ‘infects’ them. Newly infected nodes proceed similarly.
Termination decision: If the contacted node already has the update, S stops contacting other nodes with probability 1/k.
Let P be the share of nodes that have not been reached. Then $P = e^{-(k+1)(1-P)}$.
k    P
1    20.0%
2    6.0%
4    0.7%
[Plot: ln(P) as a function of k.]
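The share of unreached nodes satisfies P = exp(-(k+1)(1-P)), which has no closed-form solution but is easy to solve numerically. The sketch below uses plain fixed-point iteration starting from 0, so that it converges to the root below 1 rather than the trivial root P = 1. The function name `unreached_share` is an assumption for the example.

```python
import math


def unreached_share(k: int, iterations: int = 200) -> float:
    """Solve P = exp(-(k + 1) * (1 - P)) for the root below 1 by fixed-point iteration."""
    p = 0.0  # start away from the trivial root P = 1
    for _ in range(iterations):
        p = math.exp(-(k + 1) * (1 - p))
    return p


if __name__ == "__main__":
    for k in (1, 2, 4):
        print(k, f"{unreached_share(k):.1%}")
```

The values it produces for k = 1, 2, and 4 agree with the 20.0%, 6.0%, and 0.7% quoted on the slide to within rounding.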

9 Deletion and Death Certificates
The absence of an item does not spread. On the contrary, a deleted item can get resurrected by a replica that still holds an old copy!
Use death certificates (DCs): when a node receives a DC, the old copy of the data is deleted.
How long should a DC be maintained? Simple strategy: hold the DC for a fixed amount of time.
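A minimal sketch of death certificates with the fixed-time retention strategy is shown below. It assumes timestamps from a shared clock decide which entry is newer, and it uses `None` as the tombstone marker (so `None` cannot be stored as a real value). The `Replica` class and its method names are hypothetical.

```python
import time


class Replica:
    """Key-value replica that records deletions as death certificates (DCs)."""

    def __init__(self, dc_ttl: float, clock=time.time):
        self.dc_ttl = dc_ttl      # how long a DC is kept (fixed, by assumption)
        self.clock = clock
        self.items = {}           # key -> (timestamp, value); value None marks a DC

    def put(self, key, value):
        self.items[key] = (self.clock(), value)

    def delete(self, key):
        # Keep a timestamped DC instead of dropping the key outright.
        self.items[key] = (self.clock(), None)

    def get(self, key):
        entry = self.items.get(key)
        return None if entry is None else entry[1]

    def merge_from(self, other: "Replica"):
        """Pull-style anti-entropy step: keep the newer entry for every key."""
        for key, entry in other.items.items():
            if key not in self.items or entry[0] > self.items[key][0]:
                self.items[key] = entry

    def expire_certificates(self):
        """Drop DCs older than dc_ttl; live items are never expired here."""
        now = self.clock()
        self.items = {k: e for k, e in self.items.items()
                      if e[1] is not None or now - e[0] < self.dc_ttl}
```

Because the DC carries a newer timestamp than the stale copy, merging with a replica that still holds the old value does not resurrect it. Once the DC has expired, that protection is gone, which is the tradeoff behind the "how long" question.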

10 Example applications (I)
Data dissemination in P2P systems, wireless sensor networks, and clusters (lots of scenarios).
Distributing updates, e.g., disconnected replicated list maintenance: Demers et al., Epidemic algorithms for replicated database maintenance, PODC’87.
Membership protocols, e.g., the Amazon Dynamo service: DeCandia et al., Dynamo: Amazon’s Highly Available Key-value Store, SOSP’07. Also various P2P networks (e.g., Tribler).

11 Example applications (II)
Data aggregation. The problem: compute the average value for a large set of sensors.
Each sensor (node) $i$ maintains a variable $x_i$. When two nodes $i$ and $k$ gossip, they both reset their variables to the mean of the two: $x_i, x_k \leftarrow (x_i + x_k)/2$.
Result: in the end each node will have computed the average $\mathrm{avg} = \left(\sum_i x_i\right)/N$.
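The averaging rule can be checked with a short simulation. This sketch assumes pairs are chosen uniformly at random and exchanges happen one at a time; the function name `gossip_average` and the `rounds` parameter are made up for the example.

```python
import random


def gossip_average(values, rounds: int = 50, seed: int = 0):
    """Repeatedly pick two random nodes and replace both values with their mean."""
    rng = random.Random(seed)
    x = list(values)
    n = len(x)
    for _ in range(rounds * n):
        i, k = rng.sample(range(n), 2)      # two distinct nodes gossip
        x[i] = x[k] = (x[i] + x[k]) / 2     # both adopt the pairwise mean
    return x


if __name__ == "__main__":
    print(gossip_average([0.0, 10.0, 20.0, 30.0]))
```

Each exchange leaves the sum of all values unchanged while shrinking the spread between them, which is why every node ends up close to the global average.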

12 Quiz-like questions
Design an epidemic-style protocol to calculate the number of sensors in a sensor network.
Discuss the tradeoffs between a multicast overlay and an epidemic protocol.

13 Advantages of epidemic techniques
Probabilistic model. Rigorous mathematical underpinnings. Good framework for reasoning about the spread of information through a system over time.
Asynchronous communication pattern. Nodes operate in a 'fire-and-forget' mode where, even if the initial sender fails, surviving nodes will receive the update.
Autonomous actions. Nodes can act on the data they receive without additional communication to reach agreement with partners; nodes can make decisions autonomously.
Robust with respect to message loss and node failures. Once a message has been received by at least one of your peers, it is almost impossible to prevent the spread of the information through the system.

14 Roadmap
Recap the differences between processes and threads, and the advantages and drawbacks of using one or the other.
Reasons why clients and servers in distributed applications may use multithreaded designs.
Tradeoffs between multithreaded, single-threaded, and finite-state machine designs for servers.
Other client and server design issues.

15 Context switching (I)
Context for ‘context switching’:
Processor level: The minimal collection of values stored in the registers of a processor used for the execution of a series of instructions (e.g., stack pointer, addressing registers, program counter).
Thread level: The minimal collection of values stored in registers and memory, used for the execution of a series of instructions (i.e., processor context, state).
Process level: The minimal collection of values stored in registers and memory, used for the execution of a thread (i.e., thread context, but now also at least MMU register values).

16 Threads vs. Processes: Context switching (II)
Observation 1: Threads share the same address space. Thread context switching can be done entirely independently of the operating system.
Observation 2: Process switching is generally more expensive as it involves getting the OS in the loop, i.e., trapping to the kernel.
Observation 3: Creating and destroying threads is much cheaper than doing so for processes.
Threading support can be implemented either by the OS or at the process level. Q: What are the tradeoffs?

17 Threads & distributed systems: Server side issues
Multithreaded servers: The main issues are performance and structure.
Improve performance: Starting a thread to handle an incoming request is much cheaper than starting a new process. Having a single-threaded server prohibits simply scaling the server to a multiprocessor system. As with clients: reduce latency by reacting to the next request while the previous one is being processed.
Better structure: Most servers have a high I/O load. Using simple, well-understood blocking calls may simplify the overall structure. Multithreaded programs tend to be smaller and easier to understand due to simplified flow of control.
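The latency argument can be illustrated with a toy comparison between an iterative server loop and a pool of threads that each use an ordinary blocking call. This is a sketch only: `handle` is a stand-in for a real request handler, and the blocking I/O is simulated with `time.sleep`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def handle(request: str) -> str:
    """Stand-in for a request handler that blocks on I/O (simulated with sleep)."""
    time.sleep(0.05)
    return request.upper()


def serve_iteratively(requests):
    # One request at a time: each blocking call delays every later request.
    return [handle(r) for r in requests]


def serve_concurrently(requests, workers: int = 8):
    # Each request runs on a pool thread, so the blocking calls overlap.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(handle, requests))


if __name__ == "__main__":
    reqs = [f"req{i}" for i in range(8)]
    for serve in (serve_iteratively, serve_concurrently):
        start = time.perf_counter()
        serve(reqs)
        print(serve.__name__, round(time.perf_counter() - start, 2), "s")
```

The handler code is identical in both versions, which is the structural point of the slide: threads allow simple blocking calls to stay in place while the waiting time overlaps.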

18 How to handle incoming requests? (iteratively vs. concurrently)
Why can multiple threads be a good idea?
[Figure: multithreaded file server example.]

19 How to handle incoming requests? (iteratively vs. concurrently)
Main choices:
Iterative vs. concurrent.
Blocking vs. non-blocking I/O.
[Concurrent server with blocking I/O]: Processes vs. threads.
[Concurrent server with non-blocking I/O]: Finite-state machine based design, event-driven programming.
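The finite-state machine option can be sketched as a per-connection object that is advanced by whatever bytes happen to arrive, so a single thread can interleave many connections without blocking on any of them. The wire format used here (a decimal length, a newline, then the body) is invented for the example, as are the names `Connection` and `run`; a real server would feed `on_data` from a readiness API such as `select` instead of a list of events.

```python
class Connection:
    """Per-connection state machine: READING_HEADER -> READING_BODY -> DONE."""

    def __init__(self):
        self.state = "READING_HEADER"
        self.buffer = b""
        self.length = 0
        self.body = b""

    def on_data(self, chunk: bytes) -> None:
        """Advance as far as the bytes received so far allow, never waiting."""
        self.buffer += chunk
        if self.state == "READING_HEADER" and b"\n" in self.buffer:
            header, self.buffer = self.buffer.split(b"\n", 1)
            self.length = int(header)       # hypothetical length-prefixed format
            self.state = "READING_BODY"
        if self.state == "READING_BODY" and len(self.buffer) >= self.length:
            self.body = self.buffer[:self.length]
            self.state = "DONE"


def run(events):
    """Single-threaded event loop over (connection id, chunk) events."""
    connections = {}
    for conn_id, chunk in events:
        connections.setdefault(conn_id, Connection()).on_data(chunk)
    return connections


if __name__ == "__main__":
    done = run([(1, b"5\nhe"), (2, b"2"), (1, b"llo"), (2, b"\nok")])
    print({cid: c.body for cid, c in done.items()})
```

Compared with the threaded design, the control flow of one request is split across several `on_data` calls and the state must be kept explicitly, which is the usual price of this approach.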

20 Summary so far
Client and server design (process focus):
Sequential vs. concurrent.
Concurrent: processes vs. threads.
Concurrent: blocking vs. non-blocking I/O.

