1 RUMOR SPREADING “Randomized Rumor Spreading”, Karp et al. Presented by Michael Kuperstein Advanced Topics in Distributed Algorithms

2 Agenda  Background  “What else can we steal from the DB world?”  Simple “Rumor Mongering”  Push  Pull  More advanced protocols  Push-Pull  Median-Counter  Proof outline only  Dessert  The protocols presented are (in a sense) optimal

3 Background  Original application: Replicated DBs  “Epidemic Algorithms for Replicated Database Management”, Demers et al.  DB updates originate at some node  Need to be distributed to the other nodes  Efficiently  Robustly  Remind you of anything?

4 Dissemination concepts  Deterministic  Either efficient or robust  But not both  “Direct Mail”  Another term for Broadcast…  Robust, but very inefficient  Epidemic algorithms  “Anti-Entropy”  “Rumor Mongering”

5 Anti-Entropy  Every once in a while:  A node randomly selects another node  The nodes compare their DBs  Differences are resolved  In more modern contexts:  Aggregation  “Gossip-Based Aggregation in Large Dynamic Networks”, Jelasity et al.  “Gossip-Based Computation of Aggregate Information”, Kempe et al.  Failure detection  “A Gossip-Based Failure Detection Service”, van Renesse et al.

6 Rumor Mongering  Every DB update is a rumor  A new update is a “hot rumor”  If a node knows the update was seen by many other nodes  It’s no longer hot…  So we ought to stop spreading it!

7 Epidemic Metaphors  Nodes can be categorized like patients:  Susceptible  Uninformed, hasn’t heard the rumor yet  Infective  Heard the rumor, is spreading it around  Removed  Heard the rumor, no longer spreading it

8 Random Phone Calls Model  Synchronous – works in rounds  Every round  Each node u randomly picks another node v  u makes a phone call to v  u and v exchange rumors

11 Random Phone Calls Model  Formally: every round t induces a graph G_t = (V, E_t)  For each vertex u, E_t contains the edge (u, v) for a uniformly random node v ≠ u  Information is exchanged along the edges of G_t  At each round t, define I_t to be the set of informed nodes and U_t the set of uninformed nodes (and s_t = |I_t|)
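
To make the model concrete, here is a minimal Python sketch of a single round; it is my own illustration (helper names such as random_partner and one_round are made up here), not code from the paper. Rumors travel in both directions along the induced edges, as on slide 8; the push and pull schemes below each restrict the direction.

```python
import random

def random_partner(u, n):
    """Uniformly random node in {0, ..., n-1} other than u."""
    v = random.randrange(n - 1)
    return v + 1 if v >= u else v

def one_round(n, informed):
    """One synchronous round: every node's call induces the edge set E_t,
    and rumors are exchanged along those edges in both directions."""
    edges = [(u, random_partner(u, n)) for u in range(n)]   # E_t
    new_informed = set(informed)                            # I_t
    for u, v in edges:
        if u in informed or v in informed:
            new_informed.update((u, v))
    return new_informed
```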

12 Assumptions  The underlying network is a complete graph  Uniform cost for all edges  Membership  Some results assume knowledge of full membership  Others have a weaker requirement  Each rumor is dealt with separately  Fan-in isn’t an issue  A node receiving many messages in the same round can handle them all

13 Complexity Measures  Number of rounds  What you would expect…  Number of messages sent  Rumor messages only  A phone call without a rumor sent doesn’t count!

14 Push Scheme  Assume u calls v  If u has the rumor, u sends it to v  Nodes are never removed  The algorithm is stopped “externally” after O(log n) rounds  (Figure: “Did you hear? Alice is dating Bob!” / “Hello?”)
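
A small simulation of the push scheme, as a sketch under the model above (my own code, not the paper's; the starting node, seed and test sizes are arbitrary). It counts rounds until everyone is informed, which comes out roughly logarithmic in n.

```python
import random

def random_partner(u, n):
    """Uniformly random node other than u."""
    v = random.randrange(n - 1)
    return v + 1 if v >= u else v

def push_rounds(n, seed=0):
    """Push scheme: each round, every informed node calls a random node and
    sends it the rumor (calls by uninformed nodes transmit nothing, so they
    are skipped).  Returns the number of rounds until all n nodes know."""
    random.seed(seed)
    informed = {0}                 # the rumor starts at a single node
    rounds = 0
    while len(informed) < n:
        informed |= {random_partner(u, n) for u in informed}
        rounds += 1
    return rounds

if __name__ == "__main__":
    for n in (1_000, 10_000, 100_000):
        print(n, push_rounds(n))   # grows roughly logarithmically in n
```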

15 Push Scheme – Analysis  When few nodes are informed, everything’s great  Exponential growth  O(log n) rounds until n/2 nodes are informed

16–19 Push Scheme – Example (figures only)

20 Push Scheme – Analysis  When “a lot” of nodes are informed  Even when almost all nodes are informed, the probability a given uninformed node won’t get called in a round is roughly 1/e  Only a constant fraction (about 1 − 1/e) of the remaining uninformed nodes is informed in each round  O(log n) rounds until every node is informed w.h.p.
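
As a back-of-the-envelope check of the constants this slide alludes to (my own calculation, not a bound quoted from the paper):

```latex
% With s informed nodes, each calling a uniformly random other node, a fixed
% uninformed node receives no push call in the round with probability
\Pr[\text{not called}] \;=\; \Big(1 - \tfrac{1}{n-1}\Big)^{s} \;\approx\; e^{-s/n}
  \;\approx\; \tfrac{1}{e} \qquad \text{when } s \approx n,
% so only about a (1 - 1/e) fraction of the remaining uninformed nodes is
% informed per round, and O(log n) further rounds suffice w.h.p.
```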

21 Push Scheme – Example II (figure only)

22 Pull Scheme  Assume u calls v  The opposite of push: if v knows the rumor, it sends it to u  Nodes are never removed  (Figure: “Hi, what’s up?” / “Alice is dating Bob!”)
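
The matching sketch for the pull scheme (again my own illustration, not the paper's code): each node calls a random partner and learns the rumor only if the callee already knows it. With a single informed node the startup is visibly unpredictable, since several rounds may pass before anyone happens to call it.

```python
import random

def random_partner(u, n):
    """Uniformly random node other than u."""
    v = random.randrange(n - 1)
    return v + 1 if v >= u else v

def pull_rounds(n, seed=0):
    """Pull scheme: each round, every node calls a random node and receives
    the rumor iff the callee already knows it.  Returns the round count
    until all n nodes are informed."""
    random.seed(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n:
        # evaluate all calls against the start-of-round informed set
        informed |= {u for u in range(n) if random_partner(u, n) in informed}
        rounds += 1
    return rounds
```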

23 Pull Scheme – Analysis  When few nodes are informed  Startup is unpredictable  An informed node might not get called  But still experiences exponential growth  O(log n) rounds until n/2 nodes are informed  When most nodes are informed  If a fraction p is still uninformed in this round, then only a fraction p² will remain uninformed in the next  O(log log n) rounds until every node is informed w.h.p.
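
The quadratic-shrinking step spelled out, in expectation (my restatement of the slide's claim):

```latex
% An uninformed node stays uninformed in a round only if the node it calls is
% also uninformed, so if a fraction p_t of the nodes is uninformed,
\Pr[\text{stays uninformed}] \;\approx\; p_t
  \quad\Longrightarrow\quad \mathbb{E}[p_{t+1}] \;\approx\; p_t^{2},
% and starting from p = 1/2 the fraction drops below 1/n within O(\log\log n)
% rounds, since (1/2)^{2^t} \le 1/n as soon as 2^t \ge \log_2 n.
```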

24–27 Pull Scheme – Example (figures only)

28 Push-Pull Scheme  We want predictability  But also fast convergence  Solution: “Push-Pull”  u calls v  If u has the rumor, it pushes it to v  Concurrently, u tries to pull it from v

29 Push-Pull Scheme  Rumors have an age  Start out at age = 0  Every round the age is incremented  A node stops forwarding the rumor once it’s too old  The maximum age is t = log₃ n + O(log log n)  This means the algorithm terminates after exactly t rounds
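
A runnable sketch of push-pull with the age cap from slides 28-29 (my own code; the constant c in t_max is an assumption made for the sketch, since the slide only specifies the O(log log n) term). Because every copy of the rumor has the same age, capping the age is the same as running for exactly t_max rounds.

```python
import math
import random

def random_partner(u, n):
    """Uniformly random node other than u."""
    v = random.randrange(n - 1)
    return v + 1 if v >= u else v

def push_pull(n, c=4, seed=0):
    """Push-pull with an age cap: every node calls a random partner and the
    rumor travels in both directions; nodes stop forwarding it once its age
    exceeds t_max, i.e. after exactly t_max rounds."""
    random.seed(seed)
    t_max = int(math.log(n, 3) + c * math.log(math.log(n)))  # log3(n) + O(log log n)
    informed = {0}
    for _ in range(t_max):
        newly = set()
        for u in range(n):
            v = random_partner(u, n)
            if u in informed or v in informed:   # push or pull succeeds
                newly.update((u, v))
        informed |= newly
    return len(informed)

if __name__ == "__main__":
    n = 100_000
    print(push_pull(n), "of", n, "nodes informed")
```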

30 Push-Pull Scheme - Complexity  Theorem: Push-pull informs all nodes w.h.p.  In log₃ n + O(log log n) rounds  Using O(n log log n) messages  Proof: Break the run into 4 phases  Startup  Exponential Growth  Quadratic Shrinking  Final

31 Push-Pull Scheme - Startup  Starts with 1 informed node  Stops when there are at least ln⁴ n informed nodes  Only a constant number of rounds is needed to double the number of informed nodes  Even if only the pushes are taken into account  So the phase takes O(log log n) rounds

32 Push-Pull Scheme – Exponential Growth I  Stops when there are at least n/ln n informed nodes  In each round, the expected number of messages sent is 2s_{t-1}  With high probability, the number of messages actually sent is (1 ± o(1))·2s_{t-1}  Using the Chernoff bound…  This is why we needed at least ln⁴ n informed nodes

33 Push-Pull Scheme – Exponential Growth II  Some messages are wasted  A message might be sent (with probability s_{t-1}/n) to an already informed node  A message might be sent (with probability at most m/n, where m is the number of messages this round) to a node that receives another message in this round  The probability a given message is wasted is at most s_{t-1}/n + m/n = O(s_{t-1}/n)

34 Push-Pull Scheme – Exponential Growth III  So the expected number of informed nodes is E[s_t] ≈ 3s_{t-1}·(1 − O(s_{t-1}/n))  And with high probability s_t = (3 − o(1))·s_{t-1}  Thus, the total number of rounds this phase takes is log₃ n ± O(log log n)  Uses at most O(n) messages

35 Push-Pull Scheme – Quadratic Shrinking  Starts when at least n/ln n nodes are informed; stops when only a vanishingly small number of nodes is still uninformed  Even accounting only for the pull transmissions, the uninformed fraction shrinks quadratically (p → p²)  So it takes O(log log n) rounds to drop from the initial fraction of uninformed nodes to a vanishingly small one

36 Push-Pull Scheme – Final Phase  Starts when all but a vanishingly small number of nodes are informed  Taking into account only pulls, each remaining uninformed node gets informed in a round with probability close to 1  Takes a constant number of rounds to inform all nodes with high probability.

37 Push-Pull Scheme – Wrap Up  The Exponential Growth phase:  log₃ n + O(log log n) rounds  O(n) messages overall  Everything else  O(log log n) rounds  At most 2n messages in every round  So, overall:  log₃ n + O(log log n) rounds  O(n log log n) messages

38 Push-Pull Scheme – Perfect?  Not quite!  Requires running for exactly log₃ n + O(log log n) rounds  (1+ε)·log₃ n rounds: O(n log n) messages  (1−ε)·log₃ n rounds: not enough informed nodes  A more robust protocol is needed

39 Median-Counter Algorithm  Problem: In push-pull, nodes must stop after an exact number of rounds  We want a node to become removed when it feels the rumor is already well-spread  Solution: Explicitly define the algorithm’s phases  A node can be in one of four states:  State A: uninformed  States B, C: informed  State D: removed  Send current state together with rumor

40 Median-Counter – State B  State B is a collection of sub-states B-m  m runs from 1 to t_max, where t_max = O(log log n)  Initial state when a node receives a rumor: B-1  If in a certain round a node is in state B-m and the median “m-level” of the received rumors is at least m  Move to state B-(m+1)  For calculating the median, A counts as B-0

41 Median-Counter – State C  Whenever a node should move to state B-t_max  Moves to state C instead  State C is “infectious”  If a node receives the rumor from someone in state C, it moves to state C  Limited lifetime:  After moving to state C, keep spreading the rumor for O(ln ln n) rounds, then move to state D
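
To see how states A, B-m, C and D interact, here is a rough, self-contained Python sketch of the state machine described on slides 39-41. It is my own reading, not the paper's definition: the way each node collects the levels of its call partners, the handling of ties in the median rule, and the constants behind t_max, the C lifetime and the safety cap are all assumptions of this sketch.

```python
import math
import random
from statistics import median

A, C, D = "A", "C", "D"            # state B-m is represented by the integer m

def random_partner(u, n):
    v = random.randrange(n - 1)
    return v + 1 if v >= u else v

def level(s):
    """Level used by the median rule: A counts as B-0, B-m as m, C as +inf."""
    if s == A:
        return 0
    if s == C:
        return float("inf")
    return s                        # an integer m stands for state B-m

def median_counter(n, seed=0):
    random.seed(seed)
    t_max = int(math.log(math.log(n))) + 3     # O(log log n); the "+3" is an assumption
    c_life = t_max                             # rounds spent in state C before moving to D
    state = {u: A for u in range(n)}
    state[0] = 1                               # the source starts informed, in state B-1
    rounds_in_c = {}
    rounds, max_rounds = 0, 40 * int(math.log2(n)) + 100   # safety cap for the sketch
    while rounds < max_rounds and any(s not in (A, D) for s in state.values()):
        rounds += 1
        levels = {u: [] for u in range(n)}     # levels of the partners each node talks to
        for u in range(n):
            if state[u] == D:                  # removed nodes no longer call or answer
                continue
            v = random_partner(u, n)
            if state[v] != D:
                levels[u].append(level(state[v]))    # what u learns from its callee
            levels[v].append(level(state[u]))        # what the callee learns from u
        new_state = dict(state)
        for u, got in levels.items():
            s = state[u]
            if s == A and any(l >= 1 for l in got):              # received the rumor
                new_state[u] = C if float("inf") in got else 1   # from C => C, else B-1
            elif isinstance(s, int):                             # state B-m
                if float("inf") in got:
                    new_state[u] = C                             # C is "infectious"
                elif got and median(got) >= s:
                    new_state[u] = s + 1 if s + 1 < t_max else C # advance, or jump to C
            elif s == C:
                rounds_in_c[u] = rounds_in_c.get(u, 0) + 1
                if rounds_in_c[u] >= c_life:
                    new_state[u] = D
        state = new_state
    informed = sum(1 for s in state.values() if s != A)
    return rounds, informed

if __name__ == "__main__":
    print(median_counter(10_000))   # (rounds until nothing is spreading, #informed)
```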

42–43 Median-Counter – Examples (figures: example nodes annotated with their states, e.g. A, B-2, B-3, B-4, C)

44 Assumption Relaxation  For the analysis of push-pull, we assumed that  Nodes choose call partners from a uniform distribution  There are no node failures  The assumptions can be relaxed  The distribution may be non-uniform as long as it’s the same for all nodes  Up to F nodes may fail

45 Median-Counter – Some Definitions  D – the distribution used by nodes to select a call partner  w_i – the probability that node i is chosen under D  The “weight” of a node  Reduces to 1/n if the distribution is uniform  g_t – the sum of the weights of all informed nodes in round t  Reduces to s_t/n if the distribution is uniform
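
In symbols (my notation, reusing I_t and s_t = |I_t| for the informed set and its size):

```latex
w_i = \Pr_{D}[\text{node } i \text{ is chosen}], \qquad
g_t = \sum_{i \in I_t} w_i, \qquad
\text{uniform case: } w_i = \tfrac{1}{n} \;\Rightarrow\; g_t = \tfrac{s_t}{n}.
```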

46 Median-Counter – Analysis  Theorem: Median-Counter informs all nodes w.h.p.  In O(log n) rounds  Using O(n log log n) messages  Intuition:  Prove that push-pull works with an arbitrary distribution  Proof similar to the previous one in spirit, but more complex  Concurrently, show that the median-counter scheme provides a correct termination condition  “State B” lasts long enough to support exponential growth  But once enough nodes are informed, counters increase rapidly and things wrap up quickly (in O(log log n) rounds)

47 Median-Counter – Warning  Vigorous hand-waving ahead!

48 Median-Counter – Startup  We want to start exponential growth with sufficiently many informed nodes of sufficient total weight  We need O(log log n) rounds of “push” to achieve the required weight  Then O(log log n) rounds of “pull” to get the s_t we want

49 Median-Counter – Exponential Growth  Stops when g_t ≥ 1/ln n  Equivalent to s_t ≥ n/ln n under the uniform distribution  Idea:  For every round, define a set of “heavy” uninformed nodes  If that set is heavy, it draws push transmissions from informed nodes, so the weight of the informed nodes increases by a constant factor  If that set is light (there are few heavy nodes), then push transmissions are “well-distributed” and the number of informed nodes increases by a constant factor  Either quantity can grow by a constant factor only O(log n) times, so that’s enough

50 Median-Counter – Exponential Growth  What happens to the counters?  We want to show that during exponential growth, nodes remain in state B  If a node v is heavy, its counter does not increase  In this phase most of the weight still belongs to uninformed nodes  So more uninformed nodes call v than informed ones do  This means the median v sees is always 0

51 Median-Counter – Exponential Growth II  What happens to the counters?  If a node is lighter:  Once its counter stops increasing, the same argument applies  The counter may increase in every round  But there are only O(log log n) such rounds, because s_t increases exponentially  We can show that the number of nodes that increase their counter up to level j falls rapidly with j  Each such node gets called rarely enough and calls mostly other uninformed nodes, so increases are rare

52 Median-Counter – Quadratic Shrinking I  Goes on until all nodes are at least in state C  Using only pull transmissions, a node will stay uninformed with probability at most 1 − g_{t-1} (the total weight of the uninformed nodes)  “Some easy calculations show” that w.h.p. 1 − g_t is roughly (1 − g_{t-1})²  So the weight of the uninformed nodes shrinks quadratically  We only need to make sure the algorithm terminates

53 Median-Counter – Quadratic Shrinking II  From the point all nodes are informed, it takes O(log log n) rounds until termination  If in some round all nodes are in state (at least) B-m, then in the next round all nodes are in state (at least) B-(m+1)  So O(log log n) rounds until every node is in state C, then O(log log n) until every node is in state D

54 Median-Counter – Summary  So, what did we get?  O(log n) rounds of exponential growth that use O(n) messages overall  O(log log n) rounds for the rest, at most O(n) messages per round  A grand total of O(log n) rounds and O(n log log n) messages  But O(n log n) phone calls  Improved by “Randomized Rumor Spreading with Fewer Phone Calls”, Hildrum et al.

55 Median-Counter – Failures  Without going into details: we can allow up to F node failures, as long as the total weight of the failed nodes is suitably bounded  Start-up and exponential growth are unaffected  Quadratic shrinking is almost unaffected, still converges in O(log log n) rounds  All but O(F) nodes are informed w.h.p.  The number of transmissions is also unaffected

56 Trivial Lower Bounds  Ω(n) messages  Every node must be informed…  Ω(log n) rounds  The number of informed nodes grows by at most a constant factor every round

57 Some More Definitions  Distributed Algorithm  A node’s behavior depends only on the node’s state  Address-Oblivious Algorithm  A node’s state does not depend on the names of its call partners in the current round  May depend on the names of its past call partners

58 And Some Rationale  Both Push-Pull and Median-Counter are distributed and address-oblivious  If we remove the address-obliviousness constraint  We can easily build algorithms that use O(n) messages  But we might need many rounds

59 Lower Bound  Theorem:  Any address-oblivious rumor spreading algorithm informing all but a fraction f of the players with constant probability needs to perform Ω(n log log(1/f)) transmissions  Corollary:  If we want to inform all nodes with high probability, f = 1/n, so Ω(n log log n) messages are required

60 Lower Bound - Outline  Split the algorithm’s run into phases  Make sure phases use, on average, Ω(n) transmissions  Show that the number of uninformed nodes after phase i is (w.h.p.) still large: it shrinks at most doubly exponentially in i  This shows the algorithm needs at least log log n phases

61 Phases  Construct the phases so that the first i phases contain Ω(i·n) transmissions in total  Sparse phase:  A maximal set of rounds that together contain fewer than n/2 transmissions  Dense phase:  A single round that contains more than n/2 transmissions  Any two consecutive phases contain at least n/2 transmissions, so the condition holds

62 Proof by induction  Assume the bound holds after phase i − 1  For each uninformed node k, consider the probability that k does not get informed during phase i  Lemma:  The probability node k will not get informed is “large”  Proof a bit later

63 Proof by induction - II  Which nodes are still uninformed after phase i?  From the lemma, each of them stays uninformed with “large” probability, so in expectation many remain uninformed  Another Chernoff bound turns this into a w.h.p. statement

64 Infection probability - Sparse  During the phase there are at most n/2 messages  The probability of a given message being sent to a given node k is 1/n
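
The conclusion this slide is heading for, spelled out (my own arithmetic, glossing over dependencies between messages):

```latex
% At most n/2 messages during the phase, each reaching a fixed node k with
% probability 1/n, so
\Pr[k \text{ stays uninformed during a sparse phase}]
  \;\gtrsim\; \Big(1 - \tfrac{1}{n}\Big)^{n/2} \;\approx\; e^{-1/2},
% i.e. a constant, which is the kind of "large" probability the lemma needs.
```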

65 Infection probability - Dense  Only one round  In that round:  The probability of not getting informed by pull is at least some constant  The probability of not getting informed by push is at least some constant  So the overall probability of not getting informed is at least a constant

66 General Lower Bound  Theorem:  Any distributed rumor spreading algorithm guaranteeing that all but a fraction o(1) of the players receive the rumor within O(log n) rounds with constant probability needs to perform ω(n) transmissions in expectation  Proof not presented.

67 Questions?

