Published by Harriet Brown. Modified about 1 year ago.

1
RUMOR SPREADING
“Randomized Rumor Spreading”, Karp et al.
Presented by Michael Kuperstein
Advanced Topics in Distributed Algorithms

2
Agenda
Background
  “What else can we steal from the DB world?”
Simple “Rumor Mongering”
  Push
  Pull
More advanced protocols
  Push-Pull
  Median-Counter
  Proof outline only
Dessert
  The protocols presented are (in a sense) optimal

3
Background
Original application: Replicated DBs
  “Epidemic Algorithms for Replicated Database Management”, Demers et al.
DB updates originate at some node
Need to be distributed to the other nodes
  Efficiently
  Robustly
Remind you of anything?

4
Dissemination concepts
Deterministic
  Either efficient or robust
  But not both
“Direct Mail”
  Another term for broadcast
  Robust, but very inefficient
Epidemic algorithms
  “Anti-Entropy”
  “Rumor Mongering”

5
Anti-Entropy
Every once in a while:
  A node randomly selects another node
  The nodes compare their DBs
  Differences are resolved
In more modern contexts
  Aggregation
    “Gossip-Based Aggregation in Large Dynamic Networks”, Jelasity et al.
    “Gossip-Based Computation of Aggregate Information”, Kempe et al.
  Failure detection
    “A Gossip-Based Failure Detection Service”, van Renesse et al.
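The anti-entropy step can be sketched in a few lines. This is only an illustration under stated assumptions: each replica is modeled as a dict mapping a key to a `(version, value)` pair, and a difference is resolved by keeping the higher version. The function name and the conflict rule are hypothetical and are not taken from the slides.

```python
import random

def anti_entropy_round(replicas, rng):
    """One anti-entropy step: a random node syncs with a random partner."""
    u = rng.randrange(len(replicas))
    # Pick a partner different from u.
    v = rng.choice([i for i in range(len(replicas)) if i != u])
    a, b = replicas[u], replicas[v]
    for key in set(a) | set(b):
        # ASSUMPTION: resolve a difference by keeping the higher version.
        newest = max(a.get(key, (-1, None)), b.get(key, (-1, None)),
                     key=lambda entry: entry[0])
        a[key] = b[key] = newest
    return u, v
```

With only two replicas, a single round is enough to make them identical, whichever node initiates.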

6
Rumor Mongering
Every DB update is a rumor
A new update is a “hot rumor”
If a node knows the update was seen by many other nodes
  It’s no longer hot…
  So we ought to stop spreading it!

7
Epidemic Metaphors
Nodes can be categorized like patients:
Susceptible
  Uninformed, hasn’t heard the rumor yet
Infective
  Heard the rumor, is spreading it around
Removed
  Heard the rumor, no longer spreading it

8
Random Phone Calls Model
Synchronous – works in rounds
Every round
  Each node u randomly picks another node v
  u makes a phone call to v
  u and v exchange rumors

11
Random Phone Calls Model
Formally:
  Every round t induces a directed graph $G_t = (V, E_t)$
  For all vertices, the out-degree in $G_t$ is 1
  Information is exchanged along the edges of $G_t$
  At each round define $I_t$ to be the set of informed nodes, and $U_t$ the set of uninformed nodes
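As a small illustration of the model, the calls of one round can be generated as an edge list in which every node has exactly one outgoing call. The function name `call_graph` and the integer node labels are assumptions made for this sketch.

```python
import random

def call_graph(n, rng):
    """Return the calls of one round: each node u calls one random v != u."""
    edges = []
    for u in range(n):
        v = rng.randrange(n - 1)
        if v >= u:
            v += 1  # skip u itself, so v is uniform over the other n - 1 nodes
        edges.append((u, v))
    return edges
```

Each pair `(u, v)` means "u calls v"; the rumor may then travel in either direction along that edge, depending on the scheme.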

12
Assumptions
The underlying network is a complete graph
  Uniform cost for all edges
Membership
  Some results assume knowledge of full membership
  Others have a weaker requirement
Each rumor is dealt with separately
Fan-in isn’t an issue
  A node receiving many messages in the same round can handle them all

13
Complexity Measures
Number of rounds
  What you would expect…
Number of messages sent
  Rumor messages only
  A phone call without a rumor sent doesn’t count!

14
Push Scheme
Assume u calls v
If u has the rumor
  u sends it to v
Nodes are never removed (in a sense, […])
The algorithm is stopped “externally” after O(log n) rounds
Illustration: “Did you hear? Alice is dating Bob!” “Hello?”
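A minimal simulation of the push scheme, assuming a complete graph with nodes labeled 0..n-1 and node 0 as the rumor's origin. Unlike the slide, which stops the algorithm externally after O(log n) rounds, this sketch runs until everyone is informed so that the round count can be observed. The name `push_rounds` is hypothetical.

```python
import random

def push_rounds(n, rng):
    """Run push until every node is informed; return the number of rounds used."""
    informed = {0}  # node 0 starts with the rumor
    rounds = 0
    while len(informed) < n:
        newly = set()
        for u in informed:  # only informed callers transmit anything
            v = rng.randrange(n - 1)
            if v >= u:
                v += 1  # uniform over the other n - 1 nodes
            newly.add(v)
        informed |= newly
        rounds += 1
    return rounds
```

Because each informed node informs at most one new node per round, the informed set can at most double, so at least log2(n) rounds are always needed.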

15
Push Scheme – Analysis
When few nodes are informed, everything’s great
  Exponential growth
  O(log n) rounds until a constant fraction of the nodes is informed

16
Push Scheme – Example (figure-only animation frames, slides 16–19)

20
Push Scheme – Analysis
When “a lot” of nodes are informed
  Even when almost all nodes are informed, the probability that a given uninformed node won’t get called is roughly $(1 - 1/n)^n \approx 1/e$
  Only a constant fraction ($1 - 1/e$) of the remaining nodes is informed in each round
  O(log n) rounds until every node is informed w.h.p.
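The tail of the push scheme can be worked out explicitly. The following is a back-of-the-envelope version, assuming that close to n informed nodes each make an independent uniform call in every round; the symbol k (number of additional rounds) is introduced here for illustration.

```latex
\Pr[\text{a given uninformed node is not called in one round}]
  \approx \left(1 - \frac{1}{n}\right)^{n} \approx \frac{1}{e},
\qquad
\Pr[\text{it is still uninformed after } k \text{ rounds}] \approx e^{-k}.
```

Choosing k = 2 ln n makes this probability about 1/n², and a union bound over at most n uninformed nodes leaves a failure probability of about 1/n. This is where the O(log n) tail comes from.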

21
Push Scheme – Example II

22
Pull Scheme
Assume u calls v
The opposite of push:
  If v knows the rumor, it sends it to u
Nodes are never removed
Illustration: “Hi, what’s up?” “Alice is dating Bob!”
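One pull round can be sketched as follows, assuming nodes labeled 0..n-1 on a complete graph. Every node makes a call, but only calls made by uninformed nodes can change anything, so informed callers are skipped. The name `pull_round` is hypothetical.

```python
import random

def pull_round(informed, n, rng):
    """One pull round: each uninformed u calls a random v and learns the rumor if v has it."""
    newly = set()
    for u in range(n):
        if u in informed:
            continue  # an informed caller has nothing to gain from pulling
        v = rng.randrange(n - 1)
        if v >= u:
            v += 1  # uniform over the other n - 1 nodes
        if v in informed:
            newly.add(u)
    return informed | newly
```

If nobody is informed, nothing happens; if everybody except one node is informed, that node is certain to learn the rumor in the next round.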

23
Pull Scheme – Analysis
When a few nodes are informed
  Startup is unpredictable
    An informed node might not get called
  But still experiences exponential growth
  O(log n) rounds until n/2 nodes are informed
When most nodes are informed
  If a fraction p is still uninformed in this round, then $p^2$ will remain uninformed in the next
  O(log log n) rounds until every node is informed w.h.p.
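The O(log log n) bound follows from iterating the squaring step. As a rough calculation that treats the expected fraction as exact and starts from the point where at most half of the nodes are uninformed (the symbols $p_t$ and k are introduced here for illustration):

```latex
p_{t+1} = p_t^{2}
\;\Longrightarrow\;
p_{t+k} = p_t^{2^{k}} \le \left(\tfrac{1}{2}\right)^{2^{k}} < \frac{1}{n}
\quad\text{once } 2^{k} > \log_2 n,\ \text{i.e. } k > \log_2 \log_2 n .
```

Once the uninformed fraction drops below 1/n, fewer than one node is expected to remain uninformed.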

24
Pull Scheme – Example (figure-only animation frames, slides 24–27)

28
Push-Pull Scheme
We want predictability
But also fast convergence
Solution: “Push-Pull”
  u calls v
  If u has the rumor, it pushes it to v
  Concurrently, u tries to pull it from v

29
Push-Pull Scheme
Rumors have an age
  Start out at age = 0
  Every round the age is incremented
A node stops forwarding the rumor once it’s too old
  The maximum age is $t = \log_3 n + O(\log\log n)$
  This means the algorithm terminates after exactly t rounds
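The age-limited push-pull scheme can be simulated as below. This is a sketch under assumptions: nodes are labeled 0..n-1, node 0 originates the rumor, and a message is counted whenever an informed party of a call sends the rumor, whether or not the other side already has it. The name `push_pull` and the `max_age` parameter are hypothetical; the slide only gives the asymptotic value of the cutoff.

```python
import random

def push_pull(n, max_age, rng):
    """Simulate push-pull with an age cutoff; return (informed count, messages sent)."""
    informed = {0}
    messages = 0
    for _age in range(max_age):  # the rumor is dropped once its age reaches max_age
        newly = set()
        for u in range(n):
            v = rng.randrange(n - 1)
            if v >= u:
                v += 1  # uniform over the other n - 1 nodes
            if u in informed:  # push: the caller sends the rumor
                messages += 1
                newly.add(v)
            if v in informed:  # pull: the callee sends the rumor back
                messages += 1
                newly.add(u)
        informed |= newly  # rumors received in a round are forwarded from the next round on
    return len(informed), messages
```

A caller might try `max_age = ceil(log3 n) + c * ceil(ln ln n)` for a small constant c and watch how the message count reacts to making the cutoff slightly too large or too small.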

30
Push-Pull Scheme – Complexity
Theorem: Push-pull informs all nodes w.h.p.
  In $\log_3 n + O(\log\log n)$ rounds
  Using O(n log log n) messages
Proof: Break the run into 4 phases
  Startup
  Exponential Growth
  Quadratic Shrinking
  Final

31
Push-Pull Scheme – Startup
Starts with 1 informed node
Stops when there are at least $\ln^4 n$ informed nodes
Only a constant number of rounds is needed to double the number of informed nodes
  Even if only the pushes are taken into account
So the phase takes O(log log n) rounds
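The O(log log n) figure is just a count of doublings. Reading the stopping threshold as $\ln^4 n$ informed nodes and writing c for the constant number of rounds per doubling (c is a symbol introduced here), the arithmetic is:

```latex
\#\text{doublings} = \log_2\!\left(\ln^{4} n\right) = 4 \log_2 \ln n ,
\qquad
\#\text{rounds} \le c \cdot 4 \log_2 \ln n = O(\log\log n).
```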

32
Push-Pull Scheme – Exponential Growth I
Stops when there are at least n/ln n informed nodes
In each round, the expected number of messages sent is $2 s_{t-1}$
With high probability, the number of messages actually sent is […]
  Using the Chernoff bound…
  This is why we needed […]

33
Push-Pull Scheme – Exponential Growth II
Some messages are wasted
  A message might be sent (with probability $s_{t-1}/n$) to an already informed node
  A message might be sent (with probability at most m/n) to a node that receives another message in this round
The probability a given message is wasted is at most $(s_{t-1} + m)/n$, by the union bound over the two cases

34
Push-Pull Scheme – Exponential Growth III
So the expected number of informed nodes is […]
And with high probability […]
Thus, the total number of rounds this phase takes is $\log_3 n \pm O(\log\log n)$
Uses at most O(n) messages

35
Push-Pull Scheme – Quadratic Shrinking
Stops when there are at most […] uninformed nodes
Even accounting only for the pull transmissions, the fraction of uninformed nodes shrinks quadratically
Takes O(log log n) rounds to drop from […] uninformed nodes to […]

36
Push-Pull Scheme – Final Phase
Starts when there are at least […] informed nodes
Taking into account only pulls, the probability of a node being informed is at least […]
Takes a constant number of rounds to inform all nodes with high probability

37
Push-Pull Scheme – Wrap Up
The Exponential Growth phase:
  $\log_3 n + O(\log\log n)$ rounds
  O(n) messages overall
Everything else
  O(log log n) rounds
  At most 2n messages in every round
So, overall:
  $\log_3 n + O(\log\log n)$ rounds
  O(n log log n) messages

38
Push-Pull Scheme – Perfect?
Not quite!
Requires exactly $\log_3 n + O(\log\log n)$ rounds
  $(1+\varepsilon) \log_3 n$ rounds: O(n log n) messages
  $(1-\varepsilon) \log_3 n$ rounds: Not enough informed nodes
A more robust protocol is needed

39
Median-Counter Algorithm
Problem: In push-pull, nodes must stop after an exact number of rounds
We want a node to become removed when it feels the rumor is already well-spread
Solution: Explicitly define the algorithm’s phases
A node can be in one of four states:
  State A: uninformed
  States B, C: informed
  State D: removed
Send current state together with rumor

40
Median-Counter – State B
State B is a collection of sub-states B-m
  m runs from 1 to $t_{max}$, where $t_{max} = O(\log\log n)$
Initial state when a node receives a rumor: B-1
If in a certain round a node is in state B-m and the median “m-level” of received rumors is […]
  Move to state B-(m+1)
  For calculating the median, A counts as B-0

41
Median-Counter – State C
Whenever a node should move to state B-$t_{max}$
  Moves to state C instead
State C is “infectious”
  If a node receives the rumor from someone in state C, it moves to state C
Limited lifetime:
  After moving to state C, keep spreading the rumor for O(ln ln n) rounds, then move to state D
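The state transitions of a node in state B-m can be sketched as follows. Important caveat: the exact advancing condition is missing from the slide text, so this sketch ASSUMES the rule "advance when more call partners are at level m or higher than below it", which is one way of reading "the median level is at least m". The tuple encoding of states, the function names, and the omission of state C's limited lifetime are all simplifications made for illustration.

```python
def level(state):
    """Counter level of a partner's state; A counts as B-0 (hypothetical encoding)."""
    if state == "A":
        return 0
    if state[0] == "B":
        return state[1]
    return None  # C and D carry no B-level in this sketch

def step_from_b(m, partner_states, t_max):
    """Next state of a node currently in B-m, given its call partners' states."""
    if any(s[0] == "C" for s in partner_states):
        return ("C",)  # state C is infectious
    levels = [level(s) for s in partner_states if level(s) is not None]
    at_least = sum(1 for x in levels if x >= m)
    below = len(levels) - at_least
    if at_least > below:  # ASSUMED rule: the median level is at least m
        # Reaching B-t_max is replaced by a move to state C.
        return ("C",) if m + 1 >= t_max else ("B", m + 1)
    return ("B", m)
```

States are written as `"A"`, `("B", m)`, `("C",)` and `"D"`; for example, a node in B-1 that talks to B-2, B-3 and A sees two partners at or above its level and one below, so it advances to B-2.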

42
Median-Counter – Examples
Node labels in the figure: B-4, A, C, B-2, B-9, B-6, B-3, B-2, B-8, B-3, B-2, A

43
Median-Counter – Examples
Node labels in the figure: B-4, A, C, B-2, B-9, B-3, B-6, B-8, B-3, B-2, A, B-1, C, B-4, C

44
Assumption Relaxation
For the analysis of push-pull, we assumed that
  Nodes choose call partners out of a uniform distribution
  There are no node failures
The assumptions can be relaxed
  The distribution may be non-uniform as long as it’s the same for all nodes
  Up to F nodes may fail

45
Median-Counter – Some Definitions
D – the distribution used by nodes to select a call partner
$w_i$ – the probability i is chosen under D
  “Weight” of a node
  Reduces to 1/n if the distribution is uniform
$g_t$ – the sum of the weights of all informed nodes in round t
  Reduces to $s_t/n$ if the distribution is uniform
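A tiny sketch of these definitions, assuming the distribution D is given as a list `weights` indexed by node and the informed set is a set of node indices (both representations are assumptions made for illustration):

```python
def informed_weight(weights, informed):
    """g_t: total probability, under D, of calling some currently informed node."""
    return sum(weights[i] for i in informed)
```

With the uniform distribution on 8 nodes and 3 informed nodes, this returns 3/8, matching the "reduces to $s_t/n$" remark.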

46
Median-Counter – Analysis
Theorem: Median-Counter informs all nodes w.h.p.
  In O(log n) rounds
  Using O(n log log n) messages
Intuition:
  Prove that push-pull works with an arbitrary distribution
    Proof similar to the previous one in spirit, but more complex
  Concurrently, show that the median-counter scheme provides a correct termination condition
    “Stage B” is long enough to support exponential growth
    But once enough nodes are informed, counters increase rapidly and things wrap up quickly (in O(log log n) rounds)

47
Median-Counter – Warning
Vigorous hand-waving ahead!

48
Median-Counter – Startup
We want to start exponential growth with […]
We need O(log log n) rounds of “push” to achieve […]
Then O(log log n) rounds of “pull” to get the $s_t$ we want

49
Median-Counter – Exponential Growth
Stops when […]
  Equivalent to […] under the uniform distribution
Idea: For every round, define a set of “heavy” uninformed nodes
  If that set is heavy, it draws push transmissions from informed nodes, so the weight of the informed nodes increases by a constant factor
  If that set is light (there are few heavy nodes), then push transmissions are “well-distributed” and the number of informed nodes increases by a constant factor
  If […] then […], so that’s enough

50
Median-Counter – Exponential Growth
What happens to the counters?
We want to show that during exponential growth, nodes remain in state B
If a node v is heavier than […], its counter does not increase
  In this phase […]
  So at least […] uninformed nodes call v
  And at most […] informed nodes do
  This means the median is always 0

51
Median-Counter – Exponential Growth II
What happens to the counters?
If a node is lighter
  Once […], the counter stops increasing
    Same argument
  While […], the counter may increase in every round
    But there are only O(log log n) such rounds, because $s_t$ increases exponentially
  If […], we can show that the number of nodes that increase their counter to j before […] falls rapidly
    Each such node gets called rarely enough and calls mostly other uninformed nodes, so increases are rare

52
Median-Counter – Quadratic Shrinking I
Goes on until all nodes are at least in state C
Using only pull transmissions, a node will stay uninformed with probability at most […]
“Some easy calculations show” that w.h.p. […]
So the weight of uninformed nodes shrinks quadratically
We only need to make sure the algorithm terminates

53
Median-Counter – Quadratic Shrinking II
From the point all nodes are informed, it takes O(log log n) rounds until termination
If in some round all nodes are in state (at least) B-m, then in the next round all nodes are in state (at least) B-(m+1)
So O(log log n) rounds until every node is in state C, then O(log log n) until every node is in state D

54
Median-Counter – Summary
So, what did we get?
  O(log n) rounds of exponential growth that use O(n) messages overall
  O(log log n) rounds for the rest, at most O(n) messages per round
  A grand total of O(log n) rounds and O(n log log n) messages
But O(n log n) phone calls
  Improved by “Randomized Rumor Spreading with Fewer Phone Calls”, Hildrum et al.

55
Median-Counter – Failures
Without going into details: we can assume there are up to F failures and the total weight of the failed nodes isn’t above […]
Start-up and exponential growth are unaffected
Quadratic shrinking is almost unaffected, still converges in O(log log n) rounds
  All but O(F) nodes are informed w.h.p.
The number of transmissions is also unaffected

56
Trivial Lower Bounds
Ω(n) messages
  Every node must be informed…
Ω(log n) rounds
  The number of informed nodes grows at most by a constant factor every round

57
Some More Definitions
Distributed Algorithm
  A node’s behavior depends only on the node’s state
Address-Oblivious Algorithm
  A node’s state does not depend on the names of its call partners in the current round
  May depend on the names of its past call partners

58
And Some Rationale
Both Push-Pull and Median-Counter are distributed and address-oblivious
If we remove the address-obliviousness constraint
  We can easily build algorithms that use O(n) messages
  But we might need many rounds

59
Lower Bound
Theorem: Any address-oblivious rumor spreading algorithm informing all but a fraction f of the players with constant probability needs to perform Ω(n log log(1/f)) transmissions
Corollary: If we want to inform all nodes with high probability, f = 1/n, so Ω(n log log n) messages are required

60
Lower Bound – Outline
Split the algorithm’s run into phases
Make sure phases use, on average, Ω(n) transmissions
Show the number of uninformed nodes after phase i is (w.h.p.) at least […]
This shows the algorithm needs at least log log n phases

61
Phases
Construct the phases so that in the first i phases, there are at least […] transmissions
  Sparse phase: A maximal set of rounds that does not contain n/2 transmissions
  Dense phase: A single round that contains more than n/2 transmissions
Any two consecutive phases contain at least n/2 transmissions, so the condition holds
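The phase construction can be sketched as a greedy pass over the per-round transmission counts. Two assumptions are made here because the slide leaves them open: a sparse phase is a run of consecutive rounds, and it may hold up to and including n/2 transmissions in total. The function name and the output format are hypothetical.

```python
def split_into_phases(transmissions, n):
    """Group per-round transmission counts into sparse and dense phases."""
    phases, current, total = [], [], 0
    for t, count in enumerate(transmissions):
        if count > n / 2:  # dense: a single heavy round forms its own phase
            if current:
                phases.append(("sparse", current))
                current, total = [], 0
            phases.append(("dense", [t]))
        elif total + count > n / 2:  # this round would overflow the sparse phase
            phases.append(("sparse", current))
            current, total = [t], count
        else:  # the round still fits into the current sparse phase
            current.append(t)
            total += count
    if current:
        phases.append(("sparse", current))
    return phases
```

Because a sparse phase is only closed when the next round would push it past n/2, two adjacent sparse phases together always carry more than n/2 transmissions, which is the property the slide relies on.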

62
Proof by induction
Assume […]
For each uninformed node k, define: […]
Lemma: The probability node k will not get informed is “large”
  Proof a bit later

63
Proof by induction – II
Which nodes are still uninformed after round i? […]
And from the lemma we have: […]
  Another Chernoff bound

64
Infection probability – Sparse
During the phase there are at most n/2 messages
The probability of a given message being sent to a given node k is 1/n
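One way to turn these two facts into a bound is a plain union bound. This is an illustrative calculation based only on the figures on the slide, not necessarily the exact argument used in the talk:

```latex
\Pr[\text{node } k \text{ receives at least one of the messages}]
  \le \frac{n}{2} \cdot \frac{1}{n} = \frac{1}{2}
\quad\Longrightarrow\quad
\Pr[\text{node } k \text{ stays uninformed during the phase}] \ge \frac{1}{2}.
```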

65
Infection probability – Dense
Only one round
In that round:
  The probability of not getting informed by pull is at least […]
  The probability of not getting informed by push is at least […]
So the overall probability of not getting informed is at least […]

66
General Lower Bound
Theorem: Any distributed rumor spreading algorithm guaranteeing that all but a fraction o(1) of the players receive the rumor within O(log n) rounds with constant probability needs to perform ω(n) transmissions in expectation
Proof not presented.

67
Questions?
