Updates in Highly Unreliable, Replicated Peer-to-Peer Systems

1 Updates in Highly Unreliable, Replicated Peer-to-Peer Systems
Anwitaman Datta, Manfred Hauswirth, Karl Aberer (EPFL) Presented by Zhiyuan “Troy” Zhan 1/18/2019

2 Outline
Motivation; System Model & Algorithm; Analytical Model & Analysis; Related Work; Conclusion

3 Motivation
Peer-to-peer systems are not just about file sharing. Data items can be added, deleted, and updated frequently, for example in peer commerce, shared calendars and address books, trust management, and medical information sharing. Replication is used to improve fault tolerance and response time.

4 Motivation – cont’d
The target problem is how to disseminate updates to other peers. Considerations: consistency guarantees, scalability, underlying system assumptions, and resource consumption. Challenges: a huge number of peers; peers can go online or offline at any time; global knowledge is often lacking.

5 Motivation – cont’d
Contributions: Addresses the update dissemination problem under low peer online probabilities (<30%) and without global knowledge. Presents a “fully decentralized, efficient and robust communication scheme” based on rumor spreading. Provides a generic analytical model of the combined push/pull technique.

6 Motivation – Problem Statement
Assumptions: Only a low percentage of peers is online, so achieving any kind of quorum is impossible. Transactional consistency is not required; eventual consistency is desirable in most applications. Update conflicts are very rare, and the paper does not handle them. A probabilistic guarantee of successful search is sufficient. The total number of replicas is substantially lower than the total number of peers (i.e. vs 1,000,000). Consecutive updates can be distributed sparsely over time. Communication overhead is the major performance measure.

8 System Model
A peer-to-peer overlay network. Each peer has only its own local knowledge, i.e. routing table, replica list, etc. Peers can go offline at any time. A communication channel can be established between any two online peers; if it cannot, each peer assumes the other is offline.

9 Algorithm – Push Phase
Executed when disseminating updates. At replica “p”, upon receiving message Push(U,V,Rf,t):
IF Push(U,V,Rf,t) not processed THEN
  Select a random subset Rp of replicas with |Rp|=R*fr;
  With probability PF(t), send Push(U, V, Rf+Rp+{p}, t+1) to Rp-Rf;
  Set Push(U,V,Rf,t) as processed;

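10 Algorithm – Push Phase, cont’d
U: the actual updated data item. V: version vector; contains global version identifiers (GUID, can be computed locally). Altered data items are treated as distinct, coexisting versions. R: the replicas of the data item (in the formulas, their total number). Rf: the partial list of replicas the update has already been forwarded to, carried in the message. Rp: the random subset of replicas selected by p. fr: the fraction of replicas selected in each push, so that |Rp|=R*fr. PF(t): forwarding probability, a function of the push round t.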
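
The push step can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `handle_push`, the tuple layout of the message, keying the "processed" check on the version identifier, and excluding p from its own random selection are all assumptions made for the example.

```python
import random

def handle_push(p, msg, replicas, f_r, pf, processed, rng=random):
    """Handle Push(U, V, Rf, t) at replica p.

    Returns a list of (target, message) pairs that p would send.
    Assumptions (not from the slides): the version identifier V is used
    as the "already processed" key, and p never selects itself.
    """
    update, version, r_f, t = msg
    if version in processed:
        return []                      # message already handled, do nothing
    processed.add(version)             # mark as processed

    # Select a random subset Rp with |Rp| = R * fr.
    k = round(len(replicas) * f_r)
    candidates = [r for r in replicas if r != p]
    r_p = set(rng.sample(candidates, min(k, len(candidates))))

    # With probability PF(t), forward to Rp - Rf with the extended list.
    if rng.random() < pf(t):
        new_list = frozenset(r_f) | r_p | {p}
        return [(target, (update, version, new_list, t + 1))
                for target in sorted(r_p - set(r_f))]
    return []

# Example: replica 0 receives an update already sent to replicas 1 and 2.
seen = set()
outgoing = handle_push(0, ("data", "v1", frozenset({1, 2}), 0),
                       list(range(10)), 0.3, lambda t: 1.0, seen)
```

Carrying the list Rf+Rp+{p} in the message is what lets later receivers skip replicas that were already targeted, at the cost of a longer message.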

11 Algorithm – Pull Phase
Executed when a peer recovers from a failure, reconnects, receives no updates for a while, or receives a pull message and is not sure whether it is itself in sync. Steps: contact online replicas; inquire for missed updates based on version vectors.
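
A rough sketch of the pull step is given below. It is only an illustration under simplifying assumptions: each replica's state is modeled as a dictionary from version identifier to update payload (a stand-in for real version vectors), and the names `pull` and `max_contacts` are hypothetical.

```python
def pull(local_store, online_replicas, max_contacts=3):
    """Fetch updates that are missing locally from a few online replicas.

    local_store:     dict mapping version id -> update payload (modified in place)
    online_replicas: list of such dicts, one per reachable replica
    Returns the sorted list of version ids that were newly obtained.
    """
    received = {}
    for peer_store in online_replicas[:max_contacts]:
        # Ask the peer only for versions not seen locally or in this pull.
        missing = set(peer_store) - set(local_store) - set(received)
        for version_id in missing:
            received[version_id] = peer_store[version_id]
    local_store.update(received)
    return sorted(received)

# Example: a reconnecting replica that only knows version "v1".
mine = {"v1": "a"}
fetched = pull(mine, [{"v1": "a", "v2": "b"}, {"v3": "c"}])
```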

13 General Assumptions
Assume an update U is initiated for R online replicas. In general, the online population in push round t is Ron(t)=Ron(t-1)*x+[R-Ron(t-1)]*y, where x=1-p and y=q. p: probability that an online peer goes offline in one push round; q: probability that an offline peer comes online in one push round. p and q are typically small and may vary between rounds. ASSUME p is constant and ignore peers coming online: Ron(t)=Ron(t-1)*x. ASSUME fr is constant.
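
The recurrence for the online population can be evaluated directly. The sketch below is illustrative only; the function name and the sample numbers are assumptions, and setting q=0 reproduces the simplified model Ron(t)=Ron(t-1)*x.

```python
def online_population(r_total, r_on0, p, q, rounds):
    """Return [Ron(0), Ron(1), ..., Ron(rounds)] for constant p and q."""
    x, y = 1 - p, q
    series = [r_on0]
    for _ in range(rounds):
        prev = series[-1]
        # Online peers stay online with probability x;
        # offline peers come online with probability y.
        series.append(prev * x + (r_total - prev) * y)
    return series

# Example with hypothetical values: 1000 replicas, 200 online, p = 0.05.
general = online_population(1000, 200, p=0.05, q=0.01, rounds=5)
simplified = online_population(1000, 200, p=0.05, q=0.0, rounds=5)
```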

14 Analysis of Push Phase – Round 0
Total number of messages: msg(0)=R*fr. New replicas that receive the update: newreplicas(0)=Ron(0)*fr. Online replicas that do not receive the update: Ron(0)*(1-fr). Message length: ML(0)=U+R*fr*B, where U is the size of the update and B is the size of the metadata required to describe one replica; only U and Rf are considered.

15 Analysis of Push Phase – Round 1
Number of messages in round 1: msg(1)=Ron(0)*fr*x*PF(1) * R*fr*(1-fr). Number of replicas newly reached by the update in round 1: newreplicas(1)=Ron(0)*x*(1-fr) * [1-(1-fr)^(Ron(0)*fr*x*PF(1))]. Message length: ML(1)=U+R*B*(fr+fr*(1-fr)) = U+R*B*(1-(1-fr)^2).
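
As a worked illustration, the round-0 and round-1 expressions can be evaluated numerically. The parameter values below are hypothetical and chosen only to make the arithmetic easy to follow; the function names are not from the paper.

```python
def push_round0(r_total, r_on0, f_r, u, b):
    """Round 0: the initiator pushes to a fraction fr of all replicas."""
    msg0 = r_total * f_r                  # messages sent
    new0 = r_on0 * f_r                    # online replicas reached
    not_reached0 = r_on0 * (1 - f_r)      # online replicas still unaware
    ml0 = u + r_total * f_r * b           # update plus list of R*fr replicas
    return msg0, new0, not_reached0, ml0

def push_round1(r_total, r_on0, f_r, x, pf1, u, b):
    """Round 1: replicas reached in round 0 forward with probability PF(1)."""
    forwarders = r_on0 * f_r * x * pf1    # still online and willing to forward
    msg1 = forwarders * r_total * f_r * (1 - f_r)
    # An unaware replica is missed by one forwarder with probability (1 - fr).
    new1 = r_on0 * x * (1 - f_r) * (1 - (1 - f_r) ** forwarders)
    ml1 = u + r_total * b * (1 - (1 - f_r) ** 2)
    return msg1, new1, ml1

# Hypothetical setting: R = 1000, Ron(0) = 200, fr = 0.01, x = 0.95, PF(1) = 1.
round0 = push_round0(1000, 200, 0.01, u=1000, b=10)
round1 = push_round1(1000, 200, 0.01, 0.95, 1.0, u=1000, b=10)
```

With these numbers msg(0) is 10 and msg(1) is about 18.8, since 1.9 expected forwarders each send to 9.9 replicas not already on the list.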

16 Analysis of Push Phase – Round t>=2
Define fd_aware(t) and faware(t). fd_aware(t): the increment in the fraction of online replicas that are aware of the update after round t. faware(t): the total fraction of online replicas that are aware of the update at the beginning of round t. faware(t)=faware(t-1)+fd_aware(t-1).

17 Analysis of Push Phase – Round t>=2, cont’d
In the paper: newreplicas(t)=Ron(t-1)*(1-faware(t-1))*x * [1-(1-fr)^(Ron(t-1)*fd_aware(t-1)*x*PF(t))]. The presenter believes it should instead read: newreplicas(t)=Ron(t-1)*(1-faware(t))*x * [1-(1-fr)^(Ron(t-1)*fd_aware(t-1)*x*PF(t))]. Given fd_aware(t)=(1-faware(t))*[1-(1-fr)^(Ron(t-1)*fd_aware(t-1)*x*PF(t))], we have: faware(t)=faware(t-1)+fd_aware(t-1) = 1-(1-faware(t-1))*(1-fr)^(Ron(t-2)*fd_aware(t-2)*x*PF(t-1)) = …; faware(t) rapidly grows to 1.
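
The awareness recurrence can be iterated numerically to see how faware(t) evolves. The sketch below follows the fd_aware(t) expression as given on the slide and assumes faware(0)=0 and fd_aware(0)=fr (from round 0), Ron(t)=Ron(t-1)*x, and hypothetical parameter values; the function name is illustrative.

```python
def awareness_series(r_on0, f_r, x, pf, rounds):
    """Iterate faware(t) = faware(t-1) + fd_aware(t-1) for t = 1..rounds.

    Returns (f_aware, fd_aware), both indexed by round number.
    Assumed start: nobody aware before round 0, fraction fr reached in round 0.
    """
    f_aware = [0.0]     # fraction aware at the beginning of round t
    fd_aware = [f_r]    # increment in aware fraction after round t
    r_on = [r_on0]      # online population, shrinking by factor x per round
    for t in range(1, rounds + 1):
        f_aware.append(f_aware[t - 1] + fd_aware[t - 1])
        r_on.append(r_on[t - 1] * x)
        # Expected number of replicas forwarding in round t.
        forwarders = r_on[t - 1] * fd_aware[t - 1] * x * pf(t)
        fd_aware.append((1 - f_aware[t]) * (1 - (1 - f_r) ** forwarders))
    return f_aware, fd_aware

# Hypothetical setting: Ron(0) = 200, fr = 0.01, x = 0.95, PF(t) = 1.
aware, delta = awareness_series(200, 0.01, 0.95, lambda t: 1.0, rounds=15)
```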

18 Analysis of Push Phase – Round t>=2, cont’d
If the partial list is ignored: msg(t)=Ron(t-1)*fd_aware(t-1)*x*PF(t) * R*fr. If the partial list is considered: msg(t)=Ron(t-1)*fd_aware(t-1)*x*PF(t) * R*fr*(1-fr)^t (1); ML(t)=U+R*B*(1-(1-fr)^(t+1)) (2). Both (1) and (2) are proved in the paper by induction on t.
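
The per-round message count and message length can be written as two small functions. This is a sketch of the formulas only, with hypothetical function and parameter names; it does not reproduce the paper's induction proof.

```python
def messages_in_round(t, r_on_prev, fd_aware_prev, x, pf_t, r_total, f_r,
                      use_partial_list=True):
    """msg(t): expected number of push messages sent in round t."""
    senders = r_on_prev * fd_aware_prev * x * pf_t
    fanout = r_total * f_r
    if use_partial_list:
        # Replicas already on the partial list are skipped.
        fanout *= (1 - f_r) ** t
    return senders * fanout

def message_length(t, u, r_total, b, f_r):
    """ML(t): update size plus the partial list carried in round t."""
    return u + r_total * b * (1 - (1 - f_r) ** (t + 1))

# Hypothetical values: R = 1000, fr = 0.01, U = 1000, B = 10.
lengths = [message_length(t, 1000, 1000, 10, 0.01) for t in range(5)]
```

Each round the list grows by R*fr*(1-fr)^t entries, which is exactly the per-sender fan-out in the partial-list case, so the two formulas are consistent with each other.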

19 Analysis of Pull Phase
Case 1: a replica “p” comes online after the push phase is over. Trivial: assume the other online replicas have already received the update. Case 2: “p” comes online during the push phase. Suppose a fraction faware of the Ron online replicas is aware of the update; the probability that “p” gets the update within m attempts is 1-[1-(Ron*faware/R)]^m (3). Query: similar to pull, but may need majority logic, a version scheme, or a hybrid of the two to identify the latest update.
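
Expression (3) is easy to evaluate, and it also gives the number of pull attempts needed for a target success probability. The sketch below uses hypothetical function names and sample values.

```python
def pull_success_probability(r_on, f_aware, r_total, m):
    """Probability that p obtains the update within m pull attempts.

    Each attempt contacts one of the R replicas; it succeeds if that
    replica is online and aware, i.e. with probability Ron*faware/R.
    """
    hit = r_on * f_aware / r_total
    return 1 - (1 - hit) ** m

def attempts_needed(r_on, f_aware, r_total, target, max_attempts=10_000):
    """Smallest m whose success probability reaches target (None if not found)."""
    for m in range(max_attempts + 1):
        if pull_success_probability(r_on, f_aware, r_total, m) >= target:
            return m
    return None

# Hypothetical values: 200 of 1000 replicas online, all of them aware.
prob_after_5 = pull_success_probability(200, 1.0, 1000, 5)
needed_for_90 = attempts_needed(200, 1.0, 1000, 0.9)
```

In this example each attempt succeeds with probability 0.2, so 11 attempts are needed to reach a 90% chance of obtaining the update.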

20 Analytical Results Varying initial online population Ron(0), 1%

21 Analytical Results – cont’d
Varying initial online population Ron(0), >5%

22 Analytical Results – cont’d

23 Analytical Results – cont’d

24 Analytical Results – cont’d, Parameter tuning

25 Analytical Results – cont’d, scalability

26 Discussions: Comparison with Gnutella
Parameter self-tuning (Optimization)

28 Related Work
Replication and updates in databases: iAnywhere Solutions: server-based approach. Bayou: assumes significantly fewer replicas and fewer updates, and that disconnections are short. Some other approaches assume the availability of resources and replicas in general.

29 Related Work – cont’d
Group communication and lazy epidemic schemes: Similar work: bimodal multicast, epidemic updates. None has done a “special case study of bimodal behavior and the utility of epidemic algorithms in a highly unreliable environment”.

31 Conclusion
This paper provides “an analytical model to demonstrate the significant reduction of message overhead” using combined push and pull techniques. The solution is totally decentralized; no global knowledge is needed. The paper is available at CiteSeer and will appear in ICDCS 2003.

