1 Evaluating a Defragmented DHT Filesystem
Jeff Pang, Phil Gibbons, Michael Kaminsky, Haifeng Yu, Srinivasan Seshan
Intel Research Pittsburgh, CMU

2 Problem Summary
TRADITIONAL DISTRIBUTED HASH TABLE (DHT)
- Each server is responsible for a pseudo-random range of the ID space (e.g., 150-210, 211-400, 401-513, 800-999)
- Objects are given pseudo-random IDs (e.g., 324, 987, 160), so they scatter across servers

3 Problem Summary
DEFRAGMENTED DHT
- Each server is responsible for a dynamically balanced range of the ID space (e.g., 150-210, 211-400, 401-513, 800-999)
- Objects are given contiguous IDs (e.g., 320, 321, 322), so related objects stay together
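The contrast between the two placement schemes on slides 2-3 can be sketched as follows. This is a minimal illustration, not the paper's implementation: the server ranges, file names, and the SHA-1-based ID function are all made-up assumptions.

```python
import hashlib

# Hypothetical server ID ranges covering the space 0-999 (illustrative only).
SERVERS = [(0, 149), (150, 210), (211, 400), (401, 513), (514, 799), (800, 999)]

def owner(obj_id):
    """Return the (lo, hi) range of the server responsible for obj_id."""
    for lo, hi in SERVERS:
        if lo <= obj_id <= hi:
            return (lo, hi)
    raise ValueError(obj_id)

def traditional_id(name):
    """Traditional DHT: pseudo-random ID derived by hashing the name."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % 1000

def defragmented_ids(names, start):
    """Defragmented DHT: contiguous IDs (dynamic balancing omitted here)."""
    return {name: start + i for i, name in enumerate(names)}

files = ["a.txt", "b.txt", "c.txt"]
trad = {f: owner(traditional_id(f)) for f in files}
defrag = {f: owner(i) for f, i in defragmented_ids(files, 320).items()}
# Contiguous IDs 320..322 all fall in the 211-400 range (one server);
# hashed IDs land wherever the hash sends them.
print(len(set(defrag.values())), len(set(trad.values())))
```

Accessing one user's files touches a single server under contiguous placement, which is the source of the availability and lookup savings claimed in the talk.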

4 Motivation
- Better availability: you depend on fewer servers when accessing your files
- Better end-to-end performance: you perform fewer DHT lookups when accessing your files

5 Availability Setup
Evaluated via simulation: ~250 nodes with 1.5 Mbps each
- Faultload: PlanetLab failure trace (2003), including one 40-node failure event
- Workload: Harvard NFS trace (2003), primarily home directories used by researchers
Compare:
- Traditional DHT: data placed using consistent hashing
- Defragmented DHT: data placed contiguously and load balanced dynamically (via Mercury)

6 Availability Setup
Metric: failure rate of user "tasks"
- Task(i, m) = a sequence of accesses with an interarrival threshold of i and a maximum duration of m
- Example: Task(1sec, 5min) = a sequence of accesses spaced no more than 1 second apart and lasting no more than 5 minutes
- Idea: capture a notion of a "useful unit of work"
- It is not clear which parameter values are right, so we evaluated many variations
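The Task(i, m) definition above can be made concrete with a small segmentation routine. This is a sketch of the stated definition only; the timestamps in the example are invented, not from the Harvard trace.

```python
def tasks(times, i, m):
    """Split sorted access timestamps (seconds) into Task(i, m) units:
    consecutive accesses are at most i apart, and a task spans at most m."""
    out, cur = [], []
    for t in times:
        # Start a new task if the gap exceeds i or the span would exceed m.
        if cur and (t - cur[-1] > i or t - cur[0] > m):
            out.append(cur)
            cur = []
        cur.append(t)
    if cur:
        out.append(cur)
    return out

# Task(1sec, 5min) over a toy access trace (timestamps assumed):
print(tasks([0.0, 0.5, 1.2, 100.0, 100.5, 500.0], i=1.0, m=300.0))
```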

7 Availability Results
- Failure rate over 5 trials; lower is better (note log scale)
- Missing bars have 0 failures
- Explanation: user tasks access 10-20x fewer nodes in the defragmented design

8 Performance Setup
Deployed the real implementation:
- 200-1000 virtual nodes with 1.5 Mbps each (Emulab)
- Measured global end-to-end latencies (MIT King)
- Workload: Harvard NFS trace
Compare: Traditional vs. Defragmented implementation
- They use the Symphony and Mercury DHTs, respectively
- Both use TCP for data transport
- Both employ a lookup cache: remembers recently contacted nodes and their DHT ranges
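The lookup cache mentioned above can be sketched as a small range-to-node map; a hit lets a client skip multi-hop DHT routing entirely. This is an assumed minimal design, not the actual implementation, and the node names are illustrative.

```python
class LookupCache:
    """Caches (ID range -> node) mappings learned from prior lookups."""

    def __init__(self):
        self.ranges = []  # list of ((lo, hi), node)

    def get(self, obj_id):
        # A hit on any cached range lets us contact the node directly.
        for (lo, hi), node in self.ranges:
            if lo <= obj_id <= hi:
                return node
        return None  # miss: fall back to a full multi-hop DHT lookup

    def put(self, lo, hi, node):
        # A real implementation would evict stale or overlapping entries.
        self.ranges.append(((lo, hi), node))

cache = LookupCache()
cache.put(211, 400, "node-7")
print(cache.get(320), cache.get(987))
```

Note that contiguous placement makes such a cache far more effective: after one lookup, a user's neighboring files tend to fall in the same cached range.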

9 Performance Setup
Metric: Task(1sec, infinity) speedup
- Example: task t takes 200 ms in Traditional and 100 ms in Defragmented, so speedup(t) = 200/100 = 2
- Idea: capture the speedup for each unit of work, independent of user think time
- Note: a 1-second interarrival threshold is conservative because it yields longer tasks; Defragmented does better with shorter tasks (next slide)

10 Performance Setup
Accesses within a task may or may not be inter-dependent. For Task = (A, B, ...):
- The app may read A, then, depending on the contents of A, read B
- The app may read A and B regardless of their contents
We replay the trace to capture both extremes:
- Sequential: each access must complete before the next starts (best case for Defragmented)
- Parallel: all accesses in a task are submitted in parallel, limited to 15 outstanding (best case for Traditional)
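The two replay modes above can be sketched with a thread pool capping the number of in-flight requests at 15. This is an assumed illustration: `do_access` is a stand-in for a real DHT read, not the paper's replay harness.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def do_access(obj_id):
    time.sleep(0.005)  # placeholder for DHT lookup + transfer latency
    return obj_id

def replay_sequential(task):
    # Sequential mode: each access completes before the next begins.
    return [do_access(o) for o in task]

def replay_parallel(task, max_outstanding=15):
    # Parallel mode: submit everything, but cap concurrency at 15
    # outstanding requests, as in the caveat above.
    with ThreadPoolExecutor(max_workers=max_outstanding) as pool:
        return list(pool.map(do_access, task))  # map preserves order

task = list(range(30))
assert replay_sequential(task) == replay_parallel(task)
```

Sequential replay serializes latency (favoring the single-server defragmented layout), while parallel replay overlaps lookups across servers (favoring the traditional layout).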

11 Performance Results

12 Other factors
- TCP slow start
- Most tasks are small

13 Overhead
- The defragmented design is not free: we want to maintain load balance
- Dynamic load balance => data migration
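The migration cost implied above can be illustrated with a toy rebalancing step: an overloaded node sheds objects at the edge of its range to its lighter neighbor, and every shed object costs bytes on the wire. This is a simplified assumption-laden sketch, not Mercury's actual load-balancing algorithm; the thresholds and object sizes are invented.

```python
def rebalance(heavy, light, threshold=2.0):
    """Move boundary objects from `heavy` to its adjacent `light` node
    until loads are within `threshold`x of each other.
    Each node is {"objs": {obj_id: size_bytes}}. Returns bytes migrated."""
    moved = 0
    while heavy["objs"] and len(heavy["objs"]) > threshold * len(light["objs"]):
        oid = max(heavy["objs"])          # hand off the boundary object
        size = heavy["objs"].pop(oid)
        light["objs"][oid] = size
        moved += size                      # migration traffic is the overhead
    return moved

heavy = {"objs": {i: 100 for i in range(320, 330)}}  # 10 objects, 100 B each
light = {"objs": {i: 100 for i in range(330, 332)}}  # 2 objects
print(rebalance(heavy, light))
```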

14 Conclusions
Defragmented DHT Filesystem benefits:
- Reduces task failures by an order of magnitude
- Speeds up tasks by 50-100%
- Overhead might be reasonable: 1 byte written = 1.5 bytes transferred
Key assumptions:
- Most tasks are small to medium sized (file systems, web, etc. -- not streaming)
- Wide-area end-to-end latencies are tolerable

15 Tommy Maddox Slides

16 Load Balance

17 Lookup Traffic

18 Availability Breakdown

19 Performance Breakdown

20 Performance Breakdown 2
- With parallel playback, Defragmented suffers on the small number of very long tasks
- (ignore the outlier: it is due to topology)

21 Maximum Overhead

22 Other Workloads
