Coterie availability in sites. Flavio Junqueira and Keith Marzullo, University of California, San Diego. DISC, Krakow, Poland, September 2005.


1 Coterie availability in sites Flavio Junqueira and Keith Marzullo University of California, San Diego DISC, Krakow, Poland, September 2005

2 Multi-site systems
- Emerging class of distributed systems
- Collection of sites across a WAN
- Multiple nodes in each site
- Share resources
  - Data sets
  - Computational power
- E.g. BIRN, Geon, TeraGrid, PlanetLab
- Site failure
  - All the nodes in a site simultaneously unavailable

3 Site availability: BIRN
- 10 sites experience at least one outage
- One site under 97%

4 Improving availability
- Better availability through replication
- Coteries
  - Set system of processes: a set of subsets of processes
  - Each subset is called a quorum
  - Minimal sets, pairwise intersect
- Coteries are useful
  - Distributed mutual exclusion
  - Distributed registers
  - Consensus through Paxos
- Coterie availability in multi-site systems
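The coterie definition on this slide (quorums that are minimal and pairwise intersect) can be checked mechanically. The following is a minimal sketch, not taken from the talk; the function names `is_coterie` and `majority_coterie` are hypothetical helpers introduced only for illustration.

```python
from itertools import combinations


def is_coterie(quorums):
    """Return True if `quorums` is a coterie: non-empty sets that pairwise
    intersect and where no quorum contains another (minimality)."""
    qs = [frozenset(q) for q in quorums]
    if not qs or any(len(q) == 0 for q in qs):
        return False
    for a, b in combinations(qs, 2):
        if not (a & b):        # intersection property violated
            return False
        if a <= b or b <= a:   # one quorum contains (or duplicates) another
            return False
    return True


def majority_coterie(processes):
    """All subsets of size floor(n/2) + 1 (illustrative helper)."""
    procs = sorted(processes)
    k = len(procs) // 2 + 1
    return [frozenset(c) for c in combinations(procs, k)]


if __name__ == "__main__":
    print(is_coterie(majority_coterie(range(5))))  # majorities of 5 processes
    print(is_coterie([{1, 2}, {3, 4}]))            # disjoint quorums
```

Majority quorums pass both checks, while two disjoint sets fail the intersection check and a quorum nested inside another fails minimality.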

5 Roadmap
- System model
- Availability metrics
  - Previous deterministic metrics not necessarily good
  - A new metric
- Failure model
  - Characterize failures using survivor sets
  - Survivor sets: more expressive
- Quorum construction
  - Multi-site hierarchical construction
- Practical issues
  - Failure model in practice
  - PlanetLab experiment
- Conclusions

6 System model
- Set $P$ of processes
  - Pairwise connected by quasi-reliable asynchronous channels
  - Process failure: crash
  - Processes can recover
- Set $B$ of sites
  - Partition of the set of processes
  - Site failure: simultaneous failure of all the processes in the site
  - Process failures are not independent
- Execution
  - Sequence of steps of processes
  - $\mathcal{E}$: set of all executions
- In a step $s$
  - Available process in $s$
  - $p \in P$ is available if $p \notin F(s)$

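7 Survivor sets
- A set $S \subseteq P$ is a survivor set iff [condition not preserved in this transcript]
- Example (figure): processes and sites, with $\mathcal{E} = \{E_1, E_2, E_3, E_4\}$
  - $E_1, E_2$: steps $s_1$, $s_2$
  - $E_3$: step $s_1$
  - $E_4$: step $s_1$
  - The figure marks $NF(s_i)$ and the resulting survivor sets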

8 Availability metrics
- Traditional deterministic metrics
  - Undirected graph: nodes = processes, edges = communication links
  - Node vulnerability: minimal number of nodes whose failure halts the system
  - Edge vulnerability: minimal number of edges whose failure halts the system
- Majority is optimal [Barbara and Garcia-Molina'86]
  - Complete graphs

9 A counterexample
- Figure: processes, sites, survivor sets
- Majority
  - Quorum: 5 processes
  - In some step, no quorum can be formed
- Using $S_P$ as quorums
  - In every step, at least one quorum can be formed
- Majority is not optimal

10 Availability metrics
- Traditional deterministic metrics
  - Undirected graph: nodes = processes, edges = communication links
  - Node vulnerability: minimal number of nodes whose failure halts the system
  - Edge vulnerability: minimal number of edges whose failure halts the system
- Majority is optimal [Barbara and Garcia-Molina'86]
  - Complete graphs
- A new metric $A(Q)$, where $Q$ is a coterie
  - Number of covered survivor sets in $Q$
  - A survivor set $S$ is covered in $Q$ if: [condition not preserved in this transcript]
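To make the metric concrete, here is a small sketch that counts covered survivor sets for two coteries. It rests on assumptions that are not in the transcript: "covered" is taken to mean that some quorum is a subset of the survivor set, and the system is a hypothetical one with 3 sites of 3 processes each, where one whole site plus one process in each remaining site may be down. The function name `availability` is illustrative.

```python
from itertools import combinations, product


def availability(coterie, survivor_sets):
    """A(Q): number of survivor sets that contain at least one quorum.
    ASSUMPTION: 'covered' means some quorum is a subset of the survivor set."""
    return sum(1 for s in survivor_sets if any(q <= s for q in coterie))


# Hypothetical system: 3 sites x 3 processes, each process named (site, index).
sites = {b: [(b, i) for i in range(3)] for b in range(3)}
processes = [p for procs in sites.values() for p in procs]

# Assumed failure pattern: one whole site down, plus one process down in
# each remaining site, so every survivor set has 2 + 2 = 4 processes.
survivor_sets = []
for down in sites:
    up = [b for b in sites if b != down]
    per_site = [list(combinations(sites[b], 2)) for b in up]
    for choice in product(*per_site):
        survivor_sets.append(frozenset(p for part in choice for p in part))

# Majority coterie over 9 processes: every subset of size 5.
majority = [frozenset(c) for c in combinations(processes, 5)]

a_majority = availability(majority, survivor_sets)
a_survivor = availability(survivor_sets, survivor_sets)

if __name__ == "__main__":
    print(len(survivor_sets), a_majority, a_survivor)
```

Under these assumptions no survivor set can hold a 5-process majority quorum, because each survivor set has only 4 processes, whereas using the survivor sets themselves as quorums covers all of them.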

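11 Failure model
- Multi-site hierarchical model
  - A set $F_s$ of subsets of $B$
    - Subsets of simultaneously faulty sites
  - An array $F_p$
    - One entry per site
    - Each entry: subsets of processes in the site
    - Subsets of simultaneously faulty processes at a site
  - A survivor set $S$ satisfies, for some $FS \in F_s$: $\forall B_i \notin FS : \exists FP \in F_p[i] : P \setminus FP \subseteq S$, and $\forall B_i \in FS : B_i \cap S = \emptyset$
- Example (figure): processes ($P$) numbered 1, 2, 3 in each of the sites ($B$) $B_1$, $B_2$, $B_3$
  - $F_s = \{\{B_1\}, \{B_2\}, \{B_3\}\}$
  - $F_p[1] = \{\{i\} : i \in \{1,2,3\}\}$ (processes of $B_1$)
  - $F_p[2] = \{\{i\} : i \in \{1,2,3\}\}$ (processes of $B_2$)
  - $F_p[3] = \{\{i\} : i \in \{1,2,3\}\}$ (processes of $B_3$)
  - $S_p$: the union, over pairs of sites, of the sets $\{\{i, j, k, l\} : i, j, k, l \in \{1,2,3\} \wedge i \neq j \wedge k \neq l\}$, with $i, j$ drawn from one site and $k, l$ from the other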
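The survivor sets of this model can be enumerated directly from $F_s$ and $F_p$. The sketch below is one possible reading, not the authors' code: it interprets the per-site condition as "the processes of a surviving site $B_i$ that are not in the chosen $FP$ belong to $S$", and the process names such as `p11` are hypothetical labels standing in for the icons of the original figure.

```python
from itertools import product


def survivor_sets(sites, F_s, F_p):
    """Enumerate survivor sets of the multi-site hierarchical model.

    sites: dict mapping site name -> set of processes in that site
    F_s:   iterable of sets of simultaneously faulty sites
    F_p:   dict mapping site name -> iterable of sets of simultaneously
           faulty processes at that site
    ASSUMED READING: a survivor set is the union, over the sites outside
    the chosen FS, of (B_i minus one FP in F_p[i]); failed sites contribute
    no processes.
    """
    result = set()
    for FS in F_s:
        up = [b for b in sites if b not in FS]
        options = [
            [frozenset(sites[b]) - frozenset(FP) for FP in F_p[b]] for b in up
        ]
        for choice in product(*options):
            result.add(frozenset().union(*choice))
    return result


# The slide's example: three sites, three processes each, at most one
# faulty site and at most one faulty process per surviving site.
sites = {f"B{b}": {f"p{b}{i}" for i in (1, 2, 3)} for b in (1, 2, 3)}
F_s = [{b} for b in sites]
F_p = {b: [{p} for p in procs] for b, procs in sites.items()}
S_p = survivor_sets(sites, F_s, F_p)

if __name__ == "__main__":
    print(len(S_p), sorted(sorted(S_p, key=sorted)[0]))
```

With these inputs every survivor set has four processes, two from each of two sites, which matches the shape of the $S_p$ given in the example.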

12 Quorum construction
- Optimal availability with respect to $A$
- Coterie $Q$: $S_p = Q$ or $Q$ dominates $S_p$
  - Survivor sets in $S_p$ pairwise intersect
  - If not, then optimally discarding survivor sets is NP-complete
- A special case: Qsite
  - All subsets of $B$ of size $f_s$ in $F_s$
  - All subsets of size $t$ of $B_i$ in $F_p[i]$, for every $i$
- E.g.: $f_s = 1$, $t = 1$ (figure: quorums over Site 1, Site 2, Site 3)
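For the Qsite special case, the quorums can be generated from the two thresholds alone. This is a sketch under assumptions: it takes a quorum to consist of all but $t$ processes from each of all but $f_s$ sites (that is, the survivor sets of the threshold model), and the names `qsite` and `pairwise_intersect` are hypothetical.

```python
from itertools import combinations, product


def qsite(sites, f_s, t):
    """Build Qsite-style quorums: pick len(sites) - f_s sites and, within
    each picked site, len(site) - t processes (assumed construction)."""
    quorums = []
    names = sorted(sites)
    for up in combinations(names, len(names) - f_s):
        per_site = [
            combinations(sorted(sites[b]), len(sites[b]) - t) for b in up
        ]
        for choice in product(*per_site):
            quorums.append(frozenset(p for part in choice for p in part))
    return quorums


def pairwise_intersect(quorums):
    """True if every two quorums share at least one process."""
    return all(a & b for a, b in combinations(quorums, 2))


# The slide's example: three sites of three processes, f_s = 1, t = 1.
sites = {b: [(b, i) for i in range(3)] for b in range(3)}
quorums = qsite(sites, f_s=1, t=1)

if __name__ == "__main__":
    print(len(quorums), pairwise_intersect(quorums))
```

In this example any two quorums share a site, because two sites out of three always overlap, and within that shared site two processes out of three always overlap, so the intersection property holds. With thresholds that are too large, for instance $f_s = 2$ with three sites, the same construction yields disjoint sets and is no longer a coterie.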

13 Model in practice
- Qsite
  - $f_s$: threshold on site failures
  - Data on site availability
  - $t$: threshold on process failures
  - Markov chains
  - One Markov chain for each site
  - Transitions
  - Failure transitions: same probability, homogeneous processes
  - Repair transitions: variable probability, amount of resources used
- Figure: failure transitions, repair transitions

14 PlanetLab experiment
- Toy application
  - Paxos: quorums of acceptors
  - Client accessing quorums
- Hosts used
  - Three sites: three from each site
  - One UCSD host: proposer, learner
- Three settings
  - 3Sites: one acceptor per site
    - Quorum: two hosts
  - 3SitesMaj: all hosts
    - Quorum: four hosts, majority from each of two sites
  - SimpleMaj: all hosts
    - Quorum: any five processes
- Sites (figure): UC Davis, UT Austin, Duke, UC San Diego
- SimpleMaj has worse availability
- 3SitesMaj has better availability
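The three quorum settings can be written as predicates over the set of acceptor hosts that are currently reachable. This is an illustrative sketch only: the host names are hypothetical, the choice of host 0 as the single acceptor per site in 3Sites is arbitrary, and nothing here reproduces the measurements of the experiment.

```python
SITES = ("UC Davis", "UT Austin", "Duke")
# Hypothetical host names: three acceptor hosts per site.
HOSTS = {s: [f"{s}#{i}" for i in range(3)] for s in SITES}


def three_sites(alive):
    """3Sites: one acceptor per site (host 0, chosen arbitrarily here);
    a quorum is any two of those three acceptors."""
    acceptors = [HOSTS[s][0] for s in SITES]
    return sum(a in alive for a in acceptors) >= 2


def three_sites_maj(alive):
    """3SitesMaj: four hosts, a majority (two of three) from each of two sites."""
    sites_with_majority = sum(
        sum(h in alive for h in HOSTS[s]) >= 2 for s in SITES
    )
    return sites_with_majority >= 2


def simple_maj(alive):
    """SimpleMaj: any five of the nine acceptor hosts."""
    return sum(h in alive for hosts in HOSTS.values() for h in hosts) >= 5


# Scenario: the whole UC Davis site is down, and one host is down at each
# of the other two sites.
alive = {"UT Austin#0", "UT Austin#1", "Duke#1", "Duke#2"}

if __name__ == "__main__":
    print(three_sites(alive), three_sites_maj(alive), simple_maj(alive))
```

In this scenario only four hosts are reachable, so SimpleMaj cannot form a quorum, while 3SitesMaj still can because each surviving site retains a local majority.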

15 The Bimodal model
- Sites are survivor sets
- $S_p$ is not a coterie
  - "Throw out" survivor sets
  - In general, optimal solution is NP-complete
  - Simple solution for this model
- Practical issues
  - Practical for two sites
  - More than two sites: open problem

16 Conclusions
- Coteries for multi-site systems
  - Site failures: process failures not independent
- A new metric
  - Counts covered survivor sets
- Multi-site hierarchical construction
  - Practical
  - Illustrated with Markov model
  - Experiment shows better availability
- Using majority quorums is not a good idea
  - Not optimal
  - Poor performance
- Future work
  - More experiments, more constructions, real deployment

17 END

18 Backup Slides

19 Failure models
- The multi-site hierarchical model
  - A set $F_s$ of subsets of $B$
  - An array $F_p$
    - One entry per site
    - Each entry: subsets of processes in the site
  - A survivor set $S$ satisfies, for some $FS \in F_s$: $\forall B_i \notin FS : \exists FP \in F_p[i] : P \setminus FP \subseteq S$, and $\forall B_i \in FS : B_i \cap S = \emptyset$
- The bimodal model
  - A set $F_s$ of subsets of $B$
    - There is one site that is in no element of $F_s$
  - An array $F_p$
  - A survivor set $S$
    - As in the previous model, or $\exists B_i \in B : S = B_i$
- Example (figure): processes numbered 1, 2, 3 in each of the sites $B_1$, $B_2$
  - $F_s = \emptyset$
  - $F_p[1] = \{\{i\} : i \in \{1,2,3\}\}$ (processes of $B_1$)
  - $F_p[2] = \{\{i\} : i \in \{1,2,3\}\}$ (processes of $B_2$)
  - MSH: $S_p = \{\{i, j, k, l\} : i, j, k, l \in \{1,2,3\} \wedge i \neq j \wedge k \neq l\}$
  - B: $S_p = \{\{i, j, k, l\} : i, j, k, l \in \{1,2,3\} \wedge i \neq j \wedge k \neq l\} \cup B$

20 Bimodal construction
- Bimodal model
- By construction: not all pairs of survivor sets intersect
  - Discard survivor sets until the remaining ones intersect
  - Selecting optimally is NP-complete
- Solution: remove $|B| - 1$ survivor sets
  - Survivor sets containing processes from multiple sites pairwise intersect
  - Construction is also optimal with respect to metric $A$
- A special case: Bsite
  - All elements of $F_s$ have size $f_s$
  - All elements of $F_p[i]$ have the same size $t$, for every $i$
- E.g.: $f_s = 1$, $t = 1$ (figure: quorums over sites $B_1$, $B_2$)

21 Site availability
- Goals
  - Show that sites are unavailable frequently enough
- BIRN: Biomedical Informatics Research Network
  - Test bed projects centered around brain imaging
  - Currently: 19 universities, 26 research groups
- Availability
  - Monthly basis
  - Pings (BIRN-CC)
  - Storage broker logs
- Site availability
  - Jan/04-Aug/04
  - Availability under 100%
  - On average in 5 out of the 8 months

22 Causes of site failures
- Misconfigured software
- Shared resources
  1. Storage
  2. Power circuits
  3. Cooling pipes
  4. Air conditioning
  5. Network

