
1 Scalable Computing on Open Distributed Systems Jon Weissman University of Minnesota National E-Science Center CLADE 2008

2 What is the Problem?
Open distributed systems
– tasks are submitted to the "system" for execution
– workers do the computing: execute a task, return an answer
The challenge
– computations that are erroneous or late are less useful
– workers may fail, err, be hacked, or be misconfigured
– time to return an answer is unpredictable
Applies to both local- and wide-area systems
– focus here on volunteer wide-area systems

3 Shape of the Solution
Replication
– works for all sources of unreliability: computation and data
How to do this intelligently and scalably?

4 Replication Challenges
How many replicas?
– too many: waste of resources
– too few: the application suffers
Most approaches assume ad-hoc replication
– under-replicate: task re-execution (higher latency)
– over-replicate: wasted resources (lower throughput)
Using information about a node's past behavior, we can intelligently size the amount of redundancy
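The sizing intuition above can be sketched in code. This is an illustrative model, not the paper's exact algorithm: it assumes independent failures and an acceptance rule where one correct answer suffices, so a group's likelihood of success is 1 - ∏(1 - r_i).

```python
from math import prod

def replicas_needed(reliabilities, target=0.99):
    """Greedily add workers (most reliable first) until the group's
    likelihood of at least one correct answer meets `target`.
    Returns the chosen group's reliabilities (best effort if the
    target is unreachable with the nodes available)."""
    group = []
    for r in sorted(reliabilities, reverse=True):
        group.append(r)
        if 1 - prod(1 - x for x in group) >= target:
            break
    return group
```

With reliable nodes (r = 0.9) two replicas already reach a 0.99 target, whereas a pool of r = 0.5 nodes needs seven, which is exactly the waste that ad-hoc fixed-size replication cannot avoid.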

5 Problems with ad-hoc replication
[Figure: fixed-size groups mixing reliable and unreliable nodes; task x sent to group A, task y sent to group B]

6 System Model
[Figure: worker nodes annotated with reputation ratings, e.g. 0.9, 0.4, 0.8, 0.7, 0.3]
Reputation rating r_i – degree of node reliability
Dynamically size the redundancy based on r_i
Note: variable-sized groups
Assume no correlated errors for now; relaxed later

7 Smart Replication
Rating based on past interactions with clients
– probability r_i over a window: correct/total or timely/total
– extend to a worker group (assuming no collusion) => likelihood of correctness (LOC)
Smarter redundancy
– variable-sized worker groups
– intuition: more reliable clients => smaller groups
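The windowed rating can be sketched as follows. The class and its default prior for unknown workers are our illustrative choices, not the paper's; the rating itself is the slide's correct/total (or timely/total) fraction over a sliding window.

```python
from collections import deque

class Reputation:
    """Sliding-window reputation: the fraction of correct (or timely)
    answers over the last `window` interactions."""

    def __init__(self, window=100):
        # deque with maxlen drops the oldest outcome automatically
        self.history = deque(maxlen=window)

    def record(self, ok: bool):
        """Record one interaction: True = correct/timely answer."""
        self.history.append(ok)

    @property
    def rating(self) -> float:
        if not self.history:
            return 0.5  # assumed neutral prior for an unknown worker
        return sum(self.history) / len(self.history)
```

The window matters: it lets a node's rating track recent behavior, so a worker that becomes misconfigured (or recovers) is re-rated within one window of interactions.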

8 Terms
LOC (Likelihood of Correctness) of a group g
– the 'actual' probability of getting a correct or timely answer from a group g of clients
Target LOC
– the success rate the system tries to ensure while forming client groups
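Under majority voting, the LOC of a group can be computed exactly from the individual ratings, assuming independent errors (the no-correlation assumption the talk relaxes later). A brute-force sketch, fine for the small groups involved:

```python
from itertools import product as outcomes

def loc_majority(reliabilities):
    """Probability that a strict majority of the group returns the
    correct answer, given independent per-node reliabilities.
    Enumerates all 2^n correct/incorrect outcomes."""
    n = len(reliabilities)
    total = 0.0
    for outcome in outcomes([True, False], repeat=n):
        p = 1.0
        for ok, r in zip(outcome, reliabilities):
            p *= r if ok else (1 - r)
        if sum(outcome) > n / 2:
            total += p
    return total
```

For three r = 0.8 nodes this gives 0.8³ + 3·(0.8²·0.2) = 0.896, i.e. voting lifts the group above any single member, and the same formula shows why a group of reliable nodes can stay small while an unreliable pool needs more members to hit the same target.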

9 Scheduling Metrics
Guiding metrics
– throughput: the number of successfully completed tasks in an interval
– success rate s: ratio of throughput to the number of tasks attempted

10 Algorithm Space
How many replicas?
– algorithms compute how many replicas are needed to meet a success threshold
How to reach consensus?
– Majority (better for byzantine threats)
– M-1 (better for timeliness)
– M-2 (2 matching answers)
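A sketch of the three consensus rules, under our reading of the slide's labels (an assumption: M-1 accepts the first answer returned, M-2 accepts once two answers match):

```python
from collections import Counter

def consensus(answers, scheme="majority"):
    """Pick an answer from a replica group.  `answers` is in arrival
    order.  Returns None if the scheme cannot decide."""
    if not answers:
        return None
    if scheme == "m1":
        # M-1: accept the first answer back -- fastest, no cross-check
        return answers[0]
    if scheme == "m2":
        # M-2: accept as soon as any two answers agree
        seen = Counter()
        for a in answers:
            seen[a] += 1
            if seen[a] == 2:
                return a
        return None
    # majority: a strict majority of all answers must agree
    best, votes = Counter(answers).most_common(1)[0]
    return best if votes > len(answers) / 2 else None
```

The trade-off is visible directly: M-1 decides after one answer (good for deadlines, no byzantine protection), M-2 needs two matching answers, and majority waits for the whole group but tolerates the most bad workers.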

11 One Scheduling Algorithm

12 Evaluation
Baselines
– Fixed algorithm: statically sized, equal groups; uses no reliability information
– Random algorithm: forms groups by randomly assigning nodes until the target LOC is reached
Simulated a wide variety of node reliability distributions

13 Experimental Results: correctness
Simulation: byzantine behavior only; majority voting

14 Role of the target LOC
Key parameter – hard to specify
Too large – groups will be too large (low throughput)
Too small – groups will be too small (low success rate)
Instead, adaptively learn it
– bias toward throughput, success rate, or both
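One adaptation step might look like the following. The concrete rule (fixed step size, the bias conditions, the clamping bounds) is our illustrative policy, not the paper's: the point is only the feedback direction the slide describes, i.e. raise the target when tasks fail too often, lower it when success is comfortable but throughput is suffering from oversized groups.

```python
def adapt_target(target, success_rate, throughput_ratio,
                 desired_success=0.95, step=0.01):
    """One feedback step on the target LOC.
    success_rate     -- observed s over the last interval
    throughput_ratio -- observed / desired throughput (assumed metric)
    """
    if success_rate < desired_success:
        # too many failed tasks: demand more redundancy
        return min(0.999, target + step)
    if throughput_ratio < 1.0:
        # success is fine but groups are eating throughput: relax
        return max(0.5, target - step)
    return target
```

Biasing toward success rate or throughput is then just a matter of which condition is checked first, or of using different step sizes for the two directions.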

15 Adaptive Algorithm

16 What about time? Timeliness
A result returned after time T is less (or not) useful
– (1) soft deadlines: user interaction, visualization of output from a computation
– (2) hard deadlines: need X results done before the HPDC/NSDI/… deadline
Live experimentation on PlanetLab
Real application: BLAST

17 Some PL data
Computation – varies both across and within nodes
Communication – varies both across and within nodes
Temporal variability

18 PL Environment
RIDGE is our live system that implements reputation
120 wide-area nodes, fully correct, M-1 consensus
3 timeliness environments based on deadlines: D = 120 s, D = 180 s, D = 240 s

19 Experimental Results: timeliness
Best BOINC (BOINC*) and conservative BOINC (BOINC-) vs. RIDGE

20 Makespan Comparison

21 Collusion
What if errors are correlated? How could that happen?
– widespread bug (hardware or software)
– misconfiguration
– virus
– Sybil attack
– malicious group
With Emmanuel Jeannot (Inria)

22 Key Ideas
Executing a task yields answer groups
– A_1, A_2, …, A_k
– each A_i has associated workers W_i1, W_i2, …, W_in
– P_collusion(workers in A_i)
Learn the probability of correlated errors
– P_collusion(W_1, W_2)
Estimate the probability of group correlated errors
– P_collusion(G), G = [W_1, W_2, W_3, …], via f{P_collusion(W_i, W_j) for all i, j}
Rank and select an answer
– based on P_collusion(G) and |G|
– update the matrix: P_collusion(W_1, W_2)
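These steps can be sketched as follows. The aggregation f over pairwise probabilities and the ranking score are our illustrative choices (here f = max over pairs, and a score that favors large groups with low estimated collusion); the slide leaves both as design parameters.

```python
from itertools import combinations

def group_collusion(pair_prob, group):
    """Estimate P_collusion(G) from the learned pairwise matrix,
    using max over all worker pairs as the aggregation f.
    pair_prob maps frozenset({w_a, w_b}) -> learned probability."""
    if len(group) < 2:
        return 0.0
    return max(pair_prob.get(frozenset(pair), 0.0)
               for pair in combinations(group, 2))

def select_answer(answer_groups, pair_prob):
    """Rank answer groups by size discounted by estimated collusion,
    and return the best group's workers."""
    def score(group):
        return len(group) * (1 - group_collusion(pair_prob, group))
    return max(answer_groups, key=score)
```

With this scoring, a three-worker group whose members have a learned 0.9 pairwise collusion probability scores 0.3 and loses to an unsuspicious two-worker group scoring 2.0; after the selected answer is validated, the loop closes by updating the pairwise matrix for the workers that disagreed with it.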

23 Bootstrap Problem
Building the collusion matrix
Must first "bait" colluders
– over-replicate such that the majority group is still correct, exposing the colluders
– probability of worker collusion
– probability that colluders fool the system
Given these probabilities, choose the group size k

24 Correctness results
– Scenario 4: 1 group, 30% colluders, always collude
– Scenario 5: same group, colludes 30% of the time
– Scenario 7: 2 groups (40% and 30% colluders)

25 Throughput results

26 Summary
Reliable, scalable computing
– correctness and timeliness
Future work
– combined models and metrics
– workflows: coupling data and computation reliability
Visit ridge.cs.umn.edu to learn more

