EECE 411: Design of Distributed Software Applications Lecture 6 [Last time] Distributed object systems Java RMI Assignment 2 Garbage collection Data distribution.

Summary for last time
- Push vs. pull design
- Distributed garbage collection
  - Solutions are much more complex than in the non-distributed case
  - No perfect solution: depending on the assumptions you make about your platform, one approach or another might offer the best tradeoffs
  - Lease-based (soft-state) approaches: often practical and scalable in distributed environments

Assignment 2 discussion: push vs. pull design
- Server initiates communication (pushes data)
  - Advantage: possibly lower load on the server
  - Drawback: the server needs to maintain state (list of clients)
- Client initiates communication (pulls data)
  - Advantage: no client registration needed, the server does not maintain per-client data, more flexibility for clients
  - Drawback: load on the server, DoS attacks

Assignment 2 discussion: server initiates communication (pushes data)
Two subsequent problems:
- When to initiate communication (when to push the data)?
- Where/how to push it (how to find the clients)?

Assignment 2 discussion: chat system using RMI and callbacks
A possible implementation:
- The server has a Multicaster object with a method send(String).
- Each client has a Display object with a method show(String).
- Both methods are remote: clients invoke send and the server invokes show.
- Sending a string means showing it on all displays.
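
The Multicaster/Display design above could be sketched roughly as follows. This is only an illustration, not the assignment's reference solution: the register method, the MulticasterImpl and RecordingDisplay classes, and the policy of dropping unreachable clients are assumptions that the slide does not specify. The objects are called in-process here; a real deployment would export them through the RMI runtime.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Client-side callback object: the server invokes show remotely. */
interface Display extends Remote {
    void show(String message) throws RemoteException;
}

/** Server-side object: clients invoke send remotely. */
interface Multicaster extends Remote {
    // Assumption: the slide names only send and show; register is hypothetical.
    void register(Display display) throws RemoteException;

    void send(String message) throws RemoteException;
}

/** Pushes every message it receives to all registered displays. */
final class MulticasterImpl implements Multicaster {
    private final List<Display> displays = new CopyOnWriteArrayList<>();

    @Override
    public void register(Display display) {
        displays.add(display);
    }

    @Override
    public void send(String message) {
        for (Display d : displays) {
            try {
                d.show(message);
            } catch (RemoteException e) {
                // Assumption: a client that cannot be reached is simply dropped.
                displays.remove(d);
            }
        }
    }
}

/** Stand-in for a GUI display: it only records what it was asked to show. */
final class RecordingDisplay implements Display {
    final List<String> shown = new ArrayList<>();

    @Override
    public void show(String message) {
        shown.add(message);
    }
}
```

The list of displays is exactly the per-client state that the push design forces the server to maintain.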

import java.util.LinkedList;

/* a synchronized queue */
public class MessageQueue {
    /* the actual queue */
    private final LinkedList<String> _queue;

    /* the constructor - it simply creates the LinkedList to store queue elements */
    public MessageQueue() {
        _queue = new LinkedList<>();
    }

    /* gets the first element of the queue or blocks if the queue is empty */
    public synchronized String dequeue() throws InterruptedException {
        while (_queue.isEmpty()) {
            wait();
        }
        return _queue.removeFirst();
    }

    /* add a new element to the queue */
    public synchronized void enqueue(String m) {
        _queue.addLast(m);
        notify();
    }
}

public class Main {
    // volatile: written by the GUI thread and polled by the main thread below
    static volatile GUI gui;
    static MessageQueue _queue;

    public static void main(String[] args) {
        // create a shared buffer where the GUI adds the messages that need to
        // be sent out by the main thread. The main thread stays in a loop and
        // when a new message shows up in the buffer it sends it out to the server
        _queue = new MessageQueue();

        // instantiate the GUI - in a new thread
        javax.swing.SwingUtilities.invokeLater(new Runnable() {
            public void run() {
                gui = GUI.createAndShowGUI(_queue);
            }
        });

        // hack: make sure the GUI instantiation is completed by the GUI thread
        // before the next call
        while (gui == null)
            Thread.yield();

        // calling the GUI method that updates the text area of the GUI
        // you might want to call the same method when a new chat message arrives
        gui.addToTextArea("RemoteUser:> Sample of displaying remote message");

        // The code below serves as an example to show how to share messages
        // between the GUI and the main thread.
        // You will probably want to replace the code below with code that sits in a loop,
        // waits for new messages to be entered by the user, and sends them to the
        // chat server (using an RMI call)
        //
        // In addition you may want to add code that
        //   * connects to the chat server and provides an object for callbacks (so
        //     that the server has a way to send messages generated by other users)
        //   * implements the callback object which is called by the server remotely
        //     and, in turn, updates the local GUI
        while (true) {
            String s;
            try {
                // wait until the user enters a new chat message
                s = _queue.dequeue();
            } catch (InterruptedException ie) {
                break;
            }
            // update the GUI with the message entered by the user
            gui.addToTextArea("Me:> " + s);
            // print it to System.out (or send it to the RMI server)
            System.out.println("User entered: " + s + " -- now sending it to chat server");
        } // end while loop
    }
}

public static void main(String[] args) {
    // …… continued ….

    // example to show how to share messages between the GUI and the main thread.
    // You will probably want to replace the code below with code that sits in a loop,
    // waits for new messages to be entered by the user, and sends them to the
    // chat server
    // In addition you may want to add code that:
    //   * connects to the chat server and provides an object for callbacks (so
    //     that the server has a way to send messages generated by other users)
    //   * implements the callback object which is called by the server remotely
    //     and, in turn, updates the local GUI
    while (true) {
        String s;
        try {
            // wait until the user enters a new chat message
            s = _queue.dequeue();
        } catch (InterruptedException ie) {
            break;
        }
        // update the GUI with the message entered by the user
        gui.addToTextArea("Me:> " + s);
    } // end while loop
}

Design exercise
Imagine a two-level p2p network (e.g., Skype):
- Each normal peer registers with one super-peer.
- Super-peers provide additional functionality: directory search, call routing, etc.
- There are some central servers (e.g., that support the www.skype.com domain, register new users, etc.).
Skype would like to present on its webpage an estimate of the number of participating nodes. Design a protocol.

Soft-state
- The producer sends state to the receiver(s) over a (lossy) channel.
- Receivers keep the state and associated timeouts.
Advantages:
- Decouples state producer and consumer: no explicit failure detection or state removal messages
- ‘Eventual’ state
- Works well in practice: RSVP, RIP, tons of other systems.
[Figure: state producer sending to state consumer]
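
A minimal sketch of the receiver side of soft-state, assuming string keys and values and a caller-supplied clock in milliseconds (the class and method names are made up for illustration). Each arriving state message refreshes the timeout; state whose producer has gone silent disappears without any explicit removal message.

```java
import java.util.HashMap;
import java.util.Map;

/** Receiver-side soft state: every entry lives only as long as it keeps being refreshed. */
final class SoftStateTable {
    private static final class Timed {
        final String value;
        final long expiresAt;

        Timed(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final long timeoutMillis;
    private final Map<String, Timed> entries = new HashMap<>();

    SoftStateTable(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Called for each state message that arrives; a lost message just means no refresh. */
    void refresh(String key, String value, long nowMillis) {
        entries.put(key, new Timed(value, nowMillis + timeoutMillis));
    }

    /** Returns the stored value, or null if it was never received or has timed out. */
    String get(String key, long nowMillis) {
        Timed t = entries.get(key);
        if (t == null) {
            return null;
        }
        if (nowMillis >= t.expiresAt) {
            entries.remove(key); // silent producer: discard without any removal message
            return null;
        }
        return t.value;
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real receiver would read a clock or run a timer.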

Garbage collection in single-box systems
Solutions:
- Reference counting
- Tracing-based solutions (mark and sweep)

Garbage collection in distributed systems
Why is it different?
- References are distributed across multiple address spaces
Why a solution may be hard to design:
- Unreliable communication
- Unannounced failures
- Overheads

Reference Counting
- The problem: maintaining a proper reference count in the presence of unreliable communication.
- Key: the ability to detect duplicate messages
[A note on terminology: for the next few slides I’ll use proxy for client stub and skeleton for server stub.]

Reference Counting (cont): passing remote object references
a) Copy the reference and let the destination increment the counter
   Problems? What if P1 deletes its reference before P2 increments the counter?
b) Signal the copy first to the server
   Problems? Overheads, coupling (what if P2 fails?)

Advanced Solutions: Weighted Reference Counting
a) Initial assignment of weights (lives)
b) New weight (life) assignment when creating a new reference

Advanced Solutions: Weighted Reference Counting (II)
Weight (life) assignment when copying a reference.
Pros/cons?
+ Create new references without contacting the server!
- Client machine failures
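
One common formulation of weighted reference counting can be sketched as below. The class names, the halving rule, and the test "total weight equals the skeleton's own partial weight" are assumptions made for this illustration; the slides only show the weight assignments in figures. The invariant is that the total weight equals the skeleton's partial weight plus the weights held by all proxies.

```java
/** Server side: holds the object's total weight and its own partial weight. */
final class WeightedSkeleton {
    private int totalWeight;
    private int partialWeight;

    WeightedSkeleton(int initialWeight) {
        this.totalWeight = initialWeight;
        this.partialWeight = initialWeight;
    }

    /** Creating a reference: hand half of the skeleton's partial weight to the new proxy. */
    WeightedProxy createReference() {
        if (partialWeight < 2) {
            throw new IllegalStateException("skeleton weight exhausted");
        }
        int half = partialWeight / 2;
        partialWeight -= half;
        return new WeightedProxy(this, half);
    }

    /** Decrement message sent when a proxy is deleted. */
    void decrement(int weight) {
        totalWeight -= weight;
    }

    /** No proxy holds any weight once the total equals the skeleton's own share. */
    boolean hasRemoteReferences() {
        return totalWeight != partialWeight;
    }
}

/** Client side: a remote reference carrying part of the weight. */
final class WeightedProxy {
    private final WeightedSkeleton skeleton;
    private int weight;

    WeightedProxy(WeightedSkeleton skeleton, int weight) {
        this.skeleton = skeleton;
        this.weight = weight;
    }

    /** Copying splits the local weight in half; no message goes to the server. */
    WeightedProxy copy() {
        if (weight < 2) {
            throw new IllegalStateException("proxy weight exhausted");
        }
        int half = weight / 2;
        weight -= half;
        return new WeightedProxy(skeleton, half);
    }

    /** Deleting returns this proxy's weight to the skeleton. */
    void delete() {
        skeleton.decrement(weight);
        weight = 0;
    }

    int weight() {
        return weight;
    }
}
```

The sketch also makes the drawback visible: a client that crashes never calls delete, so its weight is never returned and the object is never reclaimed.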

Reference Listing (Java RMI’s solution)
The skeleton maintains a list of client proxies.
Creating a remote reference (assume P attempts to create a remote reference to O):
- P sends its identification to O's skeleton
- O acknowledges and stores P's identity
- P creates the proxy
Copying a remote reference (P1 attempts to pass to P2 a remote reference to O)
Advantages:
- add/delete are idempotent, i.e., duplicate operations have no effect
- no reliable communication required
Drawbacks:
- overheads/scalability: the list of proxies can grow large
- handling unannounced client failures (may lead to resource leaks)

Reference Listing (Java RMI’s solution): handling failures
Lease-based approach: the skeleton promises to keep info on a client only for a limited time. If the info is not renewed, the skeleton discards it.
Pros/cons?
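
As an illustration of reference listing with leases, here is a small sketch of the skeleton's bookkeeping. It is not Java RMI's actual implementation: the class name, the string client identifiers and the explicit clock parameter are assumptions. Note how duplicate add or remove messages leave the list unchanged, and how a crashed client is forgotten once its lease runs out.

```java
import java.util.HashMap;
import java.util.Map;

/** Skeleton-side list of clients that hold a proxy, each entry guarded by a lease. */
final class ReferenceList {
    private final long leaseMillis;
    private final Map<String, Long> leaseExpiry = new HashMap<>();

    ReferenceList(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Idempotent: a duplicate add (or a renewal) only extends the lease. */
    void add(String clientId, long nowMillis) {
        leaseExpiry.put(clientId, nowMillis + leaseMillis);
    }

    /** Idempotent: removing a client that is not listed has no effect. */
    void remove(String clientId) {
        leaseExpiry.remove(clientId);
    }

    /** Drops every client whose lease was not renewed in time. */
    void expire(long nowMillis) {
        leaseExpiry.values().removeIf(expiresAt -> expiresAt <= nowMillis);
    }

    /** The object may be collected once no client holds a valid lease. */
    boolean isReferenced(long nowMillis) {
        expire(nowMillis);
        return !leaseExpiry.isEmpty();
    }

    int size() {
        return leaseExpiry.size();
    }
}
```

The lease length is the tradeoff to discuss: short leases reclaim resources quickly after a client failure but generate more renewal traffic.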

Roadmap
Distributed system: a collection of independent components that appears to its users as a single coherent system
- Components need to communicate
  - Shared memory
  - Message exchange
So far we talked about point-to-point (generally synchronous, non-persistent) communication:
- Socket programming: message-based, generally synchronous, non-persistent
- Client-server infrastructures: RPC, RMI
Data distribution:
- Multicast
- Epidemic algorithms

Multicast Communication
Two categories of solutions:
- Based on support from the network: IP multicast
- Without network support: application-layer multicast
[Figure: two panels, "IP Multicast" (end systems, routers, IP multicast flow) and "Overlay" (end systems, overlay tunnels), each connecting Calgary, Chicago, MIT1, MIT2 and UBC]

Discussion
Deployment of IP multicast is limited. Why?

Application Layer Multicast
What should be the success metrics?
[Figure: two panels, "IP Multicast" (end systems, routers, IP multicast flow) and "Overlay" (end systems, overlay tunnels), each connecting Calgary, Chicago, MIT1, MIT2 and UBC]

Application-level multicast success metrics: Relative Delay Penalty and Link Stress
Overheads compared to IP multicast:
- Relative Delay Penalty (RDP): overlay delay vs. IP delay
- Stress: number of duplicate packets on each physical link
[Figure: "IP Multicast" and "Overlay" panels connecting MIT1, MIT2, Chicago, UBC, Calg1 and Calg2; plots of the relative delay penalty distribution (90th-percentile RDP) and of the link stress distribution (maximum link stress)]
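
The two metrics can be computed along these lines. This is a sketch under stated assumptions: delays are given in milliseconds, each overlay tunnel is described by the list of physical links it crosses, and the link names in the test data are invented.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OverlayMetrics {
    /** RDP: delay between two hosts along the overlay divided by the direct IP delay. */
    static double relativeDelayPenalty(double overlayDelayMs, double ipDelayMs) {
        if (ipDelayMs <= 0) {
            throw new IllegalArgumentException("IP delay must be positive");
        }
        return overlayDelayMs / ipDelayMs;
    }

    /**
     * Stress of each physical link: how many copies of the same packet cross it,
     * i.e. how many overlay tunnels are routed over that link.
     */
    static Map<String, Integer> linkStress(List<List<String>> tunnelPaths) {
        Map<String, Integer> stress = new HashMap<>();
        for (List<String> path : tunnelPaths) {
            for (String link : path) {
                stress.merge(link, 1, Integer::sum);
            }
        }
        return stress;
    }
}
```

With IP multicast every link carries a packet at most once (stress 1) and the RDP is 1 by definition, so both numbers measure how far an overlay is from that baseline.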

Roadmap
…
Data distribution:
- Multicast
- Epidemic algorithms

Epidemic algorithms: Principle
Basic idea:
- Assume there are no write–write conflicts (e.g., update operations are initially performed at one node)
- A node passes its updated state to a limited number of ‘neighbors’; neighbors, in turn, pass the update to their neighbors
- Update propagation is lazy, i.e., not immediate
- Eventually, each update should reach every node
Anti-entropy: each node regularly chooses another node at random and exchanges state differences, leading to identical states at both afterwards.
[Variation] Gossiping: a replica which has just been updated (i.e., has been contaminated) tells a number of other replicas about its update (contaminating them as well).
What are the advantages?

Amazon S3 incident on Sunday, July 20th, 2008
Amazon S3 service:
- Provides a simple web services interface to store and retrieve any amount of data.
- Intends to be a highly scalable, reliable, fast, and inexpensive data storage infrastructure…
- S3 serves a large number of customers. Amazon itself uses S3 to run its own global network of web sites.
- Lots of objects stored: 4 billion (Q4’06) → 40 billion (Q4’08) → 100 billion (Q2’10)

Amazon S3 incident on Sunday, July 20th, 2008
- 8:40am PDT: error rates began to quickly climb
- 10 min: error rates significantly elevated and very few requests complete successfully
- 15 min: multiple engineers investigating the issue. Alarms pointed at problems within the systems and across multiple data centers. Trying to restore system health by reducing system load in several stages. No impact.

Amazon S3 incident on Sunday, July 20th, 2008
- 1h01min: engineers detect that servers within Amazon S3 have problems communicating with each other
  - Amazon S3 uses a gossip protocol to spread servers’ state info in order to quickly route around failed or unreachable servers
  - Afterwards, engineers determine that a large number of servers were spending almost all of their time gossiping
- 1h52min: unable to determine and solve the problem, they decide to shut down all components, clear the system's state, and then reactivate the request processing components. Restart the system!

Amazon S3 incident on Sunday, July 20th, 2008
- 2h29min: the system's state cleared
- 5h49min: internal communication restored and began reactivating request processing components in the US and EU.
- 7h37min: EU was ok and US location began to process requests successfully.
- 8h33min: request rates and error rates had returned to normal in US.

Post-event investigation
- Message corruption was the cause of the server-to-server communication problems
- Many messages on Sunday morning had a single bit corrupted
- MD5 checksums are used in the system, but Amazon did not apply them to detect errors in this particular internal state
- The corruption spread wrong states throughout the system and increased the system load

Preventing the problem
- Change the gossip algorithm in order to control/reduce the number of messages. Add rate limiters.
- Put additional monitoring and alarming in place for gossip rates and failures
- Add checksums to detect corruption of system state messages
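
To make the last point concrete, here is a sketch of attaching an MD5 checksum to a state message and verifying it on receipt. The ChecksummedMessage class and the message contents are hypothetical and say nothing about how S3 actually frames its gossip messages; only the standard java.security.MessageDigest API is used.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** A state message that travels together with a digest of its payload. */
final class ChecksummedMessage {
    final byte[] payload;
    final byte[] digest;

    ChecksummedMessage(byte[] payload, byte[] digest) {
        this.payload = payload;
        this.digest = digest;
    }

    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Sender side: compute the digest before the message leaves the node. */
    static ChecksummedMessage wrap(byte[] payload) {
        return new ChecksummedMessage(payload.clone(), md5(payload));
    }

    /** Receiver side: recompute the digest and drop the message if it does not match. */
    boolean isIntact() {
        return MessageDigest.isEqual(digest, md5(payload));
    }
}
```

A receiver that checks isIntact before merging the message into its own state would discard a message with a single flipped bit instead of gossiping it onwards.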

Lessons learned
- You get a big hammer … use it wisely!
- Verify message and state correctness: all kinds of corruption errors may occur
- An emergency procedure to restore a clear state in your system may be the solution of last resort. Make it work quickly!

Amazon's report on the incident: http://status.aws.amazon.com/s3-20080720.html
Current status of Amazon services: http://status.aws.amazon.com/

Back to epidemic communication

Anti-Entropy Protocols
A node P selects another node Q from the system at random.
- Push: P only sends its updates to Q
- Pull: P only retrieves updates from Q
- Push-Pull: P and Q exchange mutual updates (after which they hold the same information).
Observation: for push-pull it takes O(log(N)) rounds to disseminate updates to all N nodes (one round = every node has taken the initiative to start one exchange).
Main properties:
- Reliability: node failures do not impact the protocol
- Dissemination time and effort scale well with the number of nodes
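
The O(log(N)) observation can be explored with a small simulation. This is a sketch under simplifying assumptions: a single update, no failures or message loss, and nodes taking their turns one after another within a round (so a node informed early in a round can already pass the update on in that round). The class and method names are invented.

```java
import java.util.Random;

final class AntiEntropySim {
    /**
     * Simulates push-pull anti-entropy for one update that starts at node 0.
     * Returns the number of rounds until all n nodes hold the update.
     */
    static int roundsUntilAllInformed(int n, long seed) {
        if (n < 1) {
            throw new IllegalArgumentException("need at least one node");
        }
        Random rng = new Random(seed);
        boolean[] informed = new boolean[n];
        informed[0] = true;
        int informedCount = 1;
        int rounds = 0;
        while (informedCount < n) {
            rounds++;
            // One round: every node P takes the initiative to start one exchange.
            for (int p = 0; p < n; p++) {
                int q = rng.nextInt(n - 1);
                if (q >= p) {
                    q++; // uniform choice among the nodes other than p
                }
                // Push-pull: whichever side has the update hands it to the other.
                if (informed[p] != informed[q]) {
                    informed[p] = true;
                    informed[q] = true;
                    informedCount++;
                }
            }
        }
        return rounds;
    }
}
```

Calling roundsUntilAllInformed for growing values of n (for example 1000, 10000, 100000) and comparing the results with log(n) is one way to check the claim empirically; the exact counts depend on the seed.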

Gossiping
Basic model: a node S having an update to report contacts other randomly chosen servers.
Termination decision: if the contacted node already has the update, S stops contacting other nodes with probability 1/k.
With P the share of nodes that have not been reached: P = e^{-(k+1)(1-P)}

k    P
1    20.0%
2    6.0%
4    0.7%

[Figure: plot of ln(P)]
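
The equation for P has no closed-form solution, but it can be solved numerically. The sketch below uses plain fixed-point iteration starting from 0, which converges to the non-trivial solution (P = 1 also satisfies the equation); the class name and the iteration count are arbitrary choices for this illustration.

```java
final class GossipResidue {
    /** Solves P = exp(-(k + 1) * (1 - P)) for the share of nodes never reached. */
    static double uninformedShare(int k) {
        double p = 0.0;
        // Starting from 0 the iterates increase monotonically towards the smallest solution.
        for (int i = 0; i < 1000; i++) {
            p = Math.exp(-(k + 1) * (1.0 - p));
        }
        return p;
    }

    public static void main(String[] args) {
        for (int k = 1; k <= 4; k++) {
            System.out.printf("k=%d  P=%.1f%%%n", k, 100.0 * uninformedShare(k));
        }
    }
}
```

For k = 1, 2 and 4 this gives roughly 20.3%, 6.0% and 0.7%, in line with the values in the table: increasing k quickly shrinks the share of nodes that gossiping alone leaves uninformed.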

Example applications (I)
- Data dissemination: in p2p, wireless sensor networks, clusters
- Spreading updates: e.g., disconnected replicated list maintenance. Demers et al., Epidemic algorithms for replicated database maintenance, PODC’87
- Membership protocols: e.g., the Amazon Dynamo service: DeCandia et al., Dynamo: Amazon’s Highly Available Key-value Store, SOSP’07
- Various p2p networks (e.g., Tribler)

Example applications (II): data aggregation
The problem: compute the average value for a large set of sensors.
- Let every node i maintain a variable x_i.
- When two nodes i and k gossip, they each reset their variable: x_i, x_k ← (x_i + x_k)/2
- Result: in the end each node will have computed the average avg = (sum_i x_i)/N.
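
A small simulation of the averaging rule, assuming that gossip pairs are picked uniformly at random and that exchanges happen one at a time (names and parameters are illustrative). Every exchange preserves the sum of the two values involved, so the global sum never changes while the values are pulled towards each other.

```java
import java.util.Random;

final class GossipAverage {
    /** Runs the given number of pairwise exchanges and returns the final node values. */
    static double[] run(double[] initial, int exchanges, long seed) {
        double[] x = initial.clone();
        int n = x.length;
        Random rng = new Random(seed);
        for (int e = 0; e < exchanges && n > 1; e++) {
            int i = rng.nextInt(n);
            int k = rng.nextInt(n - 1);
            if (k >= i) {
                k++; // pick a partner different from i
            }
            // Both nodes reset their variable to the mean of the two values.
            double mean = (x[i] + x[k]) / 2.0;
            x[i] = mean;
            x[k] = mean;
        }
        return x;
    }
}
```

Starting from the sensor readings 10, 20, 30 and 40, for example, all four values approach 25 after enough exchanges, without any node ever seeing the whole data set.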

Advantages of epidemic techniques
- Probabilistic model. Rigorous mathematical underpinnings. Good framework for reasoning about the spread of information through a system over time.
- Asynchronous communication pattern. Operate in a 'fire-and-forget' mode, where, even if the initial sender fails, surviving nodes will receive the update.
- Autonomous actions. Enable nodes to take actions based on the data received without the need for additional communication to reach agreement with partners; nodes can take decisions autonomously.
- Robust with respect to message loss and node failures. Once a message has been received by at least one of your peers it is almost impossible to prevent the spread of the information through the system.
