Gossip Algorithms and Implementing a Cluster/Grid Information service MsSys Course Amar Lior and Barak Amnon.

Presentation transcript:

Gossip Algorithms and Implementing a Cluster/Grid Information service MsSys Course Amar Lior and Barak Amnon

2 Agenda
A short introduction to gossip algorithms
Cluster/Grid information service requirements
–How good is old information?
The distributed bulletin board model
Implementation

3 A Problem
In an n-node system, assume that every pair of nodes can communicate directly.
Node i wishes to send a message (rumor, color) to all other nodes.
Possible deterministic solutions:
–Broadcast (only in a broadcast medium)
–Define a static tree over the nodes and send the message along the edges of this tree

4 A Gossip-Style Solution
Starting with the round in which a rumor is generated, each node that holds the rumor selects another node independently and uniformly at random and sends the rumor to that node.
The distribution of the rumor terminates after a fixed number of O(ln n) rounds.
At this point all nodes (players) are informed with high probability.
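The push scheme on this slide can be tried out with a small simulation. This is a minimal sketch, not the authors' implementation: the function name `push_gossip_rounds`, the synchronous round model, and the cap of roughly 10·ln n rounds are assumptions made for illustration.

```python
import math
import random


def push_gossip_rounds(n, seed=0, max_rounds=None):
    """Simulate uniform push gossip starting from node 0.

    Returns the number of rounds after which all n nodes hold the rumor,
    or None if that does not happen within max_rounds.
    """
    if n <= 1:
        return 0
    rng = random.Random(seed)
    if max_rounds is None:
        # Assumed generous cap: a constant multiple of ln n.
        max_rounds = 10 * (int(math.log(n)) + 1)
    informed = {0}
    for rnd in range(1, max_rounds + 1):
        newly_informed = set()
        for node in informed:
            # Pick another node uniformly at random (never the sender itself).
            target = rng.randrange(n - 1)
            if target >= node:
                target += 1
            newly_informed.add(target)
        informed |= newly_informed
        if len(informed) == n:
            return rnd
    return None


if __name__ == "__main__":
    for size in (10, 100, 1000):
        print(size, push_gossip_rounds(size, seed=1))
```

Because every informed node contacts exactly one node per round, the informed set can at most double each round, so at least log2 n rounds are always needed.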

5 Uniform Gossip Example
[Figure: animation of uniform gossip over rounds t = 1 to 5 (slides 5 to 9)]

10 Gossip benefits
Robustness to node failures
–Messages continue to propagate because destinations are selected at random
–F node failures result in only O(F) uninformed players
Simplicity
–All nodes run the same algorithm
Scalability
–The number of messages each node sends (and possibly receives) in each round is fixed

11 Gossip taxonomy
Other names:
–Epidemic algorithms (Demers et al.)
–Randomized communication (Karp et al.)
Propagation can be done by:
–Push: the node sends the information to the selected node
–Pull: the node fetches the information from the selected node
–Push&Pull: both ways
We distinguish between 2 conceptual layers:
–A basic gossip algorithm
»By which nodes choose other nodes for communication
–A gossip-based protocol
»Built on top of a gossip algorithm
»Determines the content of the messages that are sent
»Determines how received messages cause nodes to update their internal state
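The three propagation modes can be compared side by side. The sketch below assumes a synchronous model in which every node contacts one uniformly chosen peer per round; the names `gossip_round` and `rounds_to_inform_all` and the `max_rounds` cap are hypothetical.

```python
import random


def gossip_round(informed, n, mode, rng):
    """Return the informed set after one synchronous round.

    mode is 'push', 'pull' or 'push-pull'. Decisions are based on the
    state at the start of the round.
    """
    new_informed = set(informed)
    for node in range(n):
        # Every node picks one other node uniformly at random.
        peer = rng.randrange(n - 1)
        if peer >= node:
            peer += 1
        if mode in ("push", "push-pull") and node in informed:
            new_informed.add(peer)   # node sends the rumor to peer
        if mode in ("pull", "push-pull") and peer in informed:
            new_informed.add(node)   # node fetches the rumor from peer
    return new_informed


def rounds_to_inform_all(n, mode, seed=0, max_rounds=1000):
    """Rounds until all n >= 2 nodes are informed, or None if the cap is hit."""
    rng = random.Random(seed)
    informed = {0}
    for rounds in range(1, max_rounds + 1):
        informed = gossip_round(informed, n, mode, rng)
        if len(informed) == n:
            return rounds
    return None


if __name__ == "__main__":
    for mode in ("push", "pull", "push-pull"):
        print(mode, rounds_to_inform_all(500, mode, seed=2))
```

Keeping the peer selection (`gossip_round`) separate from what is exchanged mirrors the two conceptual layers on the slide: the same selection rule can carry a different payload.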

12 Rumor spreading bounds
From a single node to all
Time complexity:
Message complexity (Karp et al.), lower bound on the number of messages:
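As a reference point, the bounds usually quoted for uniform rumor spreading on n nodes are sketched below. They are taken from the general literature on randomized rumor spreading and are an assumption here; they are not necessarily the exact statements shown on the slide.

```latex
% Push only, one initial rumor holder: rounds until all n nodes are informed
% (holds with high probability)
T_{\mathrm{push}}(n) = \log_2 n + \ln n + O(1)

% Push&pull (Karp et al.): O(\ln n) rounds suffice, using
M_{\mathrm{push\&pull}}(n) = O(n \ln \ln n) \ \text{messages}

% Lower bound for address-oblivious algorithms (Karp et al.)
M(n) = \Omega(n \ln \ln n) \ \text{messages}
```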

13 Spatial Gossip (Kempe et al.)
New information is most interesting to nodes that are nearby
Combines the benefits of
–Uniform gossip
–Deterministic flooding
The gossip algorithm chooses the nodes according to
New information is spread to nodes at distance d, with high probability, in:

14 Aggregating values
Gossip can also be used to aggregate a value over all nodes
–Average, maximum, minimum …
In this case the question is how fast the local value in each node converges to the desired value
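One simple way to see this convergence is pairwise averaging, where two nodes that gossip replace their values with the mean of the two. This is a sketch of one well-known averaging scheme, not necessarily the one the authors had in mind; `gossip_average` is a hypothetical name.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Pairwise-averaging gossip over len(values) >= 2 nodes.

    In every round each node picks a random peer and both replace their
    local value with the mean of the two. The global sum is preserved,
    so every local value drifts toward the global average.
    """
    rng = random.Random(seed)
    x = [float(v) for v in values]
    n = len(x)
    for _ in range(rounds):
        for i in range(n):
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            mean = (x[i] + x[j]) / 2.0
            x[i] = x[j] = mean
    return x


if __name__ == "__main__":
    start = list(range(100))            # true average is 49.5
    for r in (1, 5, 20, 60):
        est = gossip_average(start, r, seed=1)
        print(r, max(abs(v - 49.5) for v in est))
```

Printing the maximum deviation for a growing number of rounds shows how quickly the local values approach the average.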

15 Cluster/Grid Information services
Basic properties of a Grid environment
–Information sources are distributed
–Individual sources are subject to failure
–The total number of information providers is large
–Both the types of information sources and the ways they are used can vary
We cannot in general provide users with accurate information: any information delivered to a user is “old”
–How useful is old information? (Mitzenmacher)
–How to build an information service with guaranteed age properties?

16 Distributed Bulletin Board
The system
–Consists of N nodes (or clusters)
–Is distributed
–Has nodes that are subject to failure
Each node maintains a data structure that holds an entry for selected (or all) nodes in the system
We refer to this data structure as “the vector”
Each vector entry holds:
–The state of the resources (static and dynamic) of the corresponding node
–The age of the information (tuned to the local clock)
The vector is a distributed bulletin board that serves information requests locally

17 Algorithm 1 – Information dissemination
Each time unit
–Update the local information
–Find all vector entries whose age is at most t
–Choose a random node
–Send the above entries to that node
Upon receiving a message
–Compute the age of the received entries
–Update the entries for which the newly received information is fresher
[Figure: example vectors and messages as node:age pairs: A:1 B:12 C:2 D:4 E:11 | A:1 C:2 D:4 | A:4 B:12 C:2 D:4 E:11 | B:1 C:3 E:3]
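The steps of Algorithm 1 can be sketched as follows. This is an illustrative model under stated assumptions, not the authors' code: rounds are synchronous, messages arrive with a `delay` of zero unless specified, and the names `Entry`, `Node`, `window` and `simulate` as well as the `{"load": ...}` state are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Entry:
    state: dict   # resource state of the described node (opaque here)
    age: int      # time units since the information was produced


class Node:
    def __init__(self, name, all_names):
        self.name = name
        self.peers = [p for p in all_names if p != name]
        self.vector = {name: Entry(state={}, age=0)}

    def tick(self, local_state):
        """One time unit passes: age all entries, then refresh the local one."""
        for entry in self.vector.values():
            entry.age += 1
        self.vector[self.name] = Entry(state=dict(local_state), age=0)

    def window(self, t):
        """Copies of all entries whose age is at most t."""
        return {k: Entry(dict(e.state), e.age)
                for k, e in self.vector.items() if e.age <= t}

    def receive(self, window, delay=0):
        """Keep a received entry only if it is fresher than the local one."""
        for k, e in window.items():
            age = e.age + delay          # age on arrival (assumed transit delay)
            mine = self.vector.get(k)
            if mine is None or age < mine.age:
                self.vector[k] = Entry(dict(e.state), age)


def simulate(names, t, rounds, seed=0):
    """Run synchronous rounds; messages are delivered at the end of each round."""
    rng = random.Random(seed)
    nodes = {n: Node(n, names) for n in names}
    for r in range(rounds):
        outbox = []
        for node in nodes.values():
            node.tick({"load": r})       # hypothetical local state
            outbox.append((rng.choice(node.peers), node.window(t)))
        for dest, win in outbox:
            nodes[dest].receive(win)
    return nodes


if __name__ == "__main__":
    result = simulate(list("ABCDE"), t=2, rounds=50, seed=3)
    for name, node in result.items():
        print(name, {k: e.age for k, e in sorted(node.vector.items())})
```

With the vector A:1 B:12 C:2 D:4 E:11 and t = 4, `window(4)` returns the entries for A, C and D, which matches the first message in the slide's example.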

18 Algorithm 1: t=2
[Figure: animation of Algorithm 1 with t=2 over rounds 1 to 5 (slides 18 to 22)]

23 Bounds and Approximations
We want to know how old the information in the vector is
First we find E(Xt) (for the asynchronous case)
–The expected number of nodes that hold information about node i that is at most t time units old
Synchronous case

24 Bounds and Approximations An approximation for the expected age of the vector

25 Real results

26 Approximating the age distribution Ak is a random variable describing the number of nodes which are up to age k

27 Age distribution

28 Handling inactive nodes
The presence of inactive nodes causes problems
–The age quality of the information deteriorates
–The number of ARP broadcasts increases linearly
Using a fixed-size window improves the age quality, but the number of ARP broadcasts stays the same

29 Algorithm 2
Algorithm 2 solves the above 2 issues
It works basically the same as Algorithm 1, with the following difference when sending a message:
–Calculate l, the number of active nodes (from the local vector)
–Generate a random number k in the range 0…l
–If k=0, choose the destination of the window from all nodes
–Else choose the destination of the window only from the active nodes
Using Algorithm 2, the maximal expected number of messages to inactive nodes is ≤ 1
–From all nodes together, in each round
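The destination rule can be sketched and checked empirically. The reading assumed here is that k = 0 means "pick the destination from the full node list" and any other k means "pick it from the active nodes only"; under that reading each of the l active nodes addresses the full list with probability 1/(l+1), so at most l/(l+1) < 1 messages per round can be expected to reach inactive nodes. The names `choose_destination` and `messages_to_inactive` are hypothetical.

```python
import random


def choose_destination(me, all_nodes, active_nodes, rng):
    """Pick the destination of the window for one send (assumed reading)."""
    others = [n for n in all_nodes if n != me]
    active_others = [n for n in active_nodes if n != me]
    l = len(active_nodes)            # number of active nodes in the local vector
    k = rng.randint(0, l)            # uniform integer in 0..l, inclusive
    if k == 0 or not active_others:  # fallback when no active peer is known (assumption)
        return rng.choice(others)
    return rng.choice(active_others)


def messages_to_inactive(n_active, n_inactive, rounds, seed=0):
    """Average number of messages per round that hit an inactive node."""
    rng = random.Random(seed)
    active = list(range(n_active))
    everyone = list(range(n_active + n_inactive))
    active_set = set(active)
    hits = 0
    for _ in range(rounds):
        for me in active:            # only active nodes send
            if choose_destination(me, everyone, active, rng) not in active_set:
                hits += 1
    return hits / rounds


if __name__ == "__main__":
    print(messages_to_inactive(50, 50, rounds=2000, seed=1))
```

With 50 active and 50 inactive nodes the expected value under this model is (50/51)·(50/99), roughly 0.5 messages per round, independent of how many rounds are run.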

30 Algorithm 2 – Age performance

31 Algorithm 2 – minimizing messages to inactive nodes
[Figure: animation of Algorithm 2 over rounds t = 1 to 4 (slides 31 to 34)]

35 Supporting Urgent information
In the previous algorithms, information is propagated from all nodes constantly
In some cases we wish to send an important message urgently to all nodes
–For example, the detection of a newly dead node
–In this case the source node gives the message a high priority of 2*log(n)
When a node assembles the window it is about to send, it first takes the entries with the highest priority and only then the younger entries
The priority of an entry is decremented every time unit
The result is that urgent messages are disseminated in O(log(n)) steps
Regular information is disseminated a bit slower
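The priority rule for assembling a window can be sketched as a sort on (priority, age). This is an illustration under assumptions: the logarithm base (2), the rounding up, the fixed window `size`, and the names `Entry`, `urgent_priority`, `assemble_window` and `tick` are not given by the slides.

```python
import math
from dataclasses import dataclass


@dataclass
class Entry:
    node: str
    age: int
    priority: int = 0   # 0 means regular information


def urgent_priority(n):
    """Initial priority of an urgent entry: 2*log(n), base 2 assumed."""
    return math.ceil(2 * math.log2(n))


def assemble_window(entries, size):
    """Highest priority first; among equal priorities, youngest first."""
    ranked = sorted(entries, key=lambda e: (-e.priority, e.age))
    return ranked[:size]


def tick(entries):
    """One time unit passes: entries age and priorities decay toward 0."""
    for e in entries:
        e.age += 1
        e.priority = max(0, e.priority - 1)


if __name__ == "__main__":
    vec = [Entry("A", 1), Entry("B", 12), Entry("C", 2),
           Entry("D", 9, priority=urgent_priority(16))]
    print([e.node for e in assemble_window(vec, 2)])   # urgent D goes first
    for _ in range(urgent_priority(16)):
        tick(vec)
    print([e.node for e in assemble_window(vec, 2)])   # D no longer preferred
```

Once the priority has decayed to zero, the entry competes by age alone, which is why regular information is only slightly delayed.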

36 Information service clients
MOSIX
–Load balancing
»Fresh information is used by the load-balancing algorithm when it considers migrating processes
–mmon, the MOSIX monitoring tool
»Presents the vector of a specific node
»mmon –h xil-10
MPICH
–Improved assignment of processes to nodes
»No assignment to “dead” nodes
»Assignment to the least loaded ones
Nagios
–Collecting information about clusters over time (history)
–Periodically retrieving a vector from a machine and keeping it
Decision algorithms at the cluster level
–Leader election (queue fault tolerance)
–Node reservation

37 Conclusions
Constructed a distributed bulletin board
–Age properties are guaranteed
–The administrator can configure it to the desired properties
–No two nodes have the same view of the system
–Information requests are served locally
–The noise level (messages to inactive nodes) is constant
–Urgent messages are propagated quickly

38 Future Work
Investigating other gossip models
–Push and Pull-Push
Using only a partial view of the system