Evaluating a Defragmented DHT Filesystem. Jeff Pang, Phil Gibbons, Michael Kaminsky, Haifeng Yu, Srinivasan Seshan. Intel Research Pittsburgh, CMU.

Presentation transcript:

Evaluating a Defragmented DHT Filesystem
Jeff Pang, Phil Gibbons, Michael Kaminsky, Haifeng Yu, Srinivasan Seshan
Intel Research Pittsburgh, CMU

Problem Summary
TRADITIONAL DISTRIBUTED HASH TABLE (DHT)
- Each server is responsible for a pseudo-random range of the ID space
- Objects are given pseudo-random IDs

Problem Summary
DEFRAGMENTED DHT
- Each server is responsible for a dynamically balanced range of the ID space
- Objects are given contiguous IDs
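The difference between the two placements can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the slides: a 32-bit ID space, 16 servers with equal-sized ranges, SHA-1 as the hash, the file names, and the base ID `ALICE_BASE`. It only counts how many servers one user's files land on under each scheme.

```python
import bisect
import hashlib

ID_BITS = 32
ID_SPACE = 2 ** ID_BITS
NUM_SERVERS = 16  # hypothetical cluster size

# Simplification: server k owns an equal-sized range ending at BOUNDARIES[k].
BOUNDARIES = [(k + 1) * ID_SPACE // NUM_SERVERS - 1 for k in range(NUM_SERVERS)]


def owner(object_id):
    """Return the index of the server whose ID range contains object_id."""
    return bisect.bisect_left(BOUNDARIES, object_id)


def hashed_id(name):
    """Traditional placement: pseudo-random ID taken from a SHA-1 digest."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:ID_BITS // 8], "big")


def contiguous_id(base, index):
    """Defragmented placement: the i-th file of a user sits at base + i."""
    return (base + index) % ID_SPACE


files = [f"/home/alice/file{i}" for i in range(100)]  # hypothetical names
ALICE_BASE = 1_000_000  # hypothetical start of this user's contiguous block

hashed_servers = {owner(hashed_id(name)) for name in files}
contiguous_servers = {owner(contiguous_id(ALICE_BASE, i))
                      for i in range(len(files))}

print("servers touched (hashed):    ", len(hashed_servers))
print("servers touched (contiguous):", len(contiguous_servers))
```

With contiguous IDs the whole directory falls inside one server's range in this setup, while hashed IDs scatter it across many servers.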

Motivation
- Better availability
  - You depend on fewer servers when accessing your files
- Better end-to-end performance
  - You don't have to perform as many DHT lookups when accessing your files

Availability Setup
- Evaluated via simulation
  - ~250 nodes with 1.5Mbps each
  - Faultload: PlanetLab failure trace (2003)
    - included one 40 node failure event
  - Workload: Harvard NFS trace (2003)
    - primarily home directories used by researchers
- Compare:
  - Traditional DHT: data placed using consistent hashing
  - Defragmented DHT: data placed contiguously and load balanced dynamically (via Mercury)

Availability Setup
- Metric: failure rate of user "tasks"
  - Task(i,m) = sequence of accesses with an interarrival threshold of i and max time of m
  - Task(1sec,5min) = sequence of accesses that are spaced no more than 1 sec apart and last no more than 5 minutes
- Idea: capture notion of "useful unit of work"
  - Not clear what values are right
  - Therefore we evaluated many variations
[Figure: timeline of accesses grouped into Task(1sec,…) and Task(1sec,5min); labels: <1sec, 5min]
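The Task(i,m) definition can be read as a segmentation rule over a stream of access timestamps. The sketch below is one possible reading, not the authors' code: the function name `split_into_tasks` is hypothetical, and the choice to start a new task when the max time would be exceeded is an assumption, since the slide does not say how over-long sequences are cut.

```python
def split_into_tasks(timestamps, interarrival, max_duration):
    """Group access timestamps (in seconds) into tasks.

    A new task starts when the gap to the previous access exceeds
    `interarrival`, or (assumption) when adding the access would make
    the task last longer than `max_duration`.
    """
    tasks = []
    current = []
    for t in sorted(timestamps):
        gap_too_long = bool(current) and t - current[-1] > interarrival
        task_too_long = bool(current) and t - current[0] > max_duration
        if gap_too_long or task_too_long:
            tasks.append(current)
            current = []
        current.append(t)
    if current:
        tasks.append(current)
    return tasks


# Task(1sec,5min) over a made-up access sequence.
accesses = [0.0, 0.5, 1.2, 5.0, 5.4, 400.0]
tasks = split_into_tasks(accesses, interarrival=1.0, max_duration=300.0)
print(tasks)
```

For the made-up sequence above, the accesses at 0.0, 0.5 and 1.2 form one task, 5.0 and 5.4 a second, and 400.0 a third.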

Availability Results
- Failure rate of 5 trials
  - Lower is better
  - Note log scale
  - Missing bars have 0 failures
- Explanation
  - User tasks access 10-20x fewer nodes in the defragmented design

Performance Setup
- Deploy real implementation
  - virtual nodes with 1.5Mbps (Emulab)
  - Measured global e2e latencies (MIT King)
  - Workload: Harvard NFS
- Compare: Traditional vs Defragmented
- Implementation
  - Uses Symphony/Mercury DHTs, respectively
  - Both use TCP for data transport
  - Both employ a Lookup Cache: remembers recently contacted nodes and their DHT ranges
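A lookup cache of the kind described, one that remembers contacted nodes together with their DHT ranges, could look roughly like the sketch below. The class name `LookupCache` and its methods are hypothetical, and the sketch leaves out eviction of old entries and handling of overlapping or stale ranges, which a real implementation would need.

```python
import bisect


class LookupCache:
    """Remember (range, node) pairs so that a key inside a known range
    can be routed without a fresh DHT lookup."""

    def __init__(self):
        self._starts = []   # sorted range start IDs
        self._entries = []  # parallel list of (start, end, node)

    def remember(self, start, end, node):
        """Record that `node` is responsible for IDs start..end inclusive."""
        idx = bisect.bisect_left(self._starts, start)
        if idx < len(self._starts) and self._starts[idx] == start:
            self._entries[idx] = (start, end, node)
        else:
            self._starts.insert(idx, start)
            self._entries.insert(idx, (start, end, node))

    def find(self, key):
        """Return the cached node for `key`, or None on a cache miss."""
        idx = bisect.bisect_right(self._starts, key) - 1
        if idx >= 0:
            start, end, node = self._entries[idx]
            if start <= key <= end:
                return node
        return None


cache = LookupCache()
cache.remember(100, 199, "nodeA")  # hypothetical range and node name
print(cache.find(150))  # hit: key lies inside the remembered range
print(cache.find(250))  # miss: a full DHT lookup would be needed
```

When a user's objects have contiguous IDs, consecutive accesses tend to fall inside a range that is already cached, which is one plausible reason the defragmented design needs fewer lookups.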

Performance Setup
- Metric: task(1sec,infinity) speedup
  - Task t takes 200msec in Traditional
  - Task t takes 100msec in Defragmented
  - speedup(t) = 200/100 = 2
- Idea: capture speedup for each unit of work that is independent of user think time
- Note: 1 second interarrival threshold is conservative => tasks are longer
  - Defragmented does better with shorter tasks (next slide)
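The speedup metric is a per-task ratio, so it can be computed task by task and then summarized. In the sketch below only the first pair (200 msec vs 100 msec) comes from the slide; the other latencies and the task names are made up for illustration, and the use of the median as a summary is an assumption.

```python
from statistics import median


def speedup(traditional_ms, defragmented_ms):
    """Speedup of one task: time in Traditional over time in Defragmented."""
    return traditional_ms / defragmented_ms


# (traditional, defragmented) latency in msec per task.
task_latencies_ms = {
    "t1": (200.0, 100.0),  # the example from the slide
    "t2": (90.0, 90.0),    # made-up values
    "t3": (300.0, 120.0),  # made-up values
}

speedups = {task: speedup(trad, defrag)
            for task, (trad, defrag) in task_latencies_ms.items()}
print(speedups)
print("median speedup:", median(speedups.values()))
```

A speedup above 1 means the task finished faster in the defragmented design, and a value of exactly 1 means no difference.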

Performance Setup
- Accesses within a task may or may not be interdependent
  - Task = (A,B,…)
  - App. may read A, then depending on contents of A, read B
  - App. may read A and B regardless of contents
- Replay trace to capture both extremes
  - Sequential - Each access must complete before starting the next (best for Defragmented)
  - Parallel - All accesses in a task can be submitted in parallel (best for Traditional) [caveat: limited to 15 outstanding]
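The two replay modes bound the task completion time from both sides. The following sketch models them under simplifying assumptions that are not in the slides: each access has a fixed, known latency, and accesses do not compete for bandwidth. The function names are hypothetical; the limit of 15 outstanding accesses is taken from the caveat above.

```python
import heapq


def sequential_time(latencies):
    """Each access must complete before the next one starts."""
    return sum(latencies)


def parallel_time(latencies, max_outstanding=15):
    """Accesses are issued in order, with at most `max_outstanding`
    in flight; a new access starts as soon as a slot frees up."""
    in_flight = []  # min-heap of finish times of outstanding accesses
    end = 0.0
    for latency in latencies:
        if len(in_flight) == max_outstanding:
            start = heapq.heappop(in_flight)  # wait for the earliest finisher
        else:
            start = 0.0
        finish = start + latency
        heapq.heappush(in_flight, finish)
        end = max(end, finish)
    return end


latencies_ms = [100.0] * 30  # made-up task: 30 accesses of 100 msec each
print("sequential:", sequential_time(latencies_ms))
print("parallel:  ", parallel_time(latencies_ms))
```

In this model the sequential replay pays every access latency in full, while the parallel replay pays roughly one latency per batch of 15.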

Performance Results

Other factors:
- TCP slow start
- Most tasks are small

Overhead
- Defragmented design is not free
- We want to maintain load balance
- Dynamic load balance => data migration
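Why load balancing implies data migration can be seen in a toy model of two adjacent servers sharing a range boundary. This is not Mercury's algorithm. It is a hypothetical greedy rule, assumed here for illustration, that shifts the boundary one object at a time while doing so reduces the load difference, and counts the bytes that have to move.

```python
def rebalance(left, right):
    """Shift the boundary between two adjacent servers.

    `left` and `right` are lists of object sizes in ID order. Objects
    move across the boundary only while a move reduces the imbalance.
    Returns (new_left, new_right, bytes_migrated).
    """
    left, right = list(left), list(right)
    migrated = 0
    while True:
        diff = sum(left) - sum(right)
        if diff > 0 and left and abs(diff - 2 * left[-1]) < abs(diff):
            obj = left.pop()        # highest-ID object of the left server
            right.insert(0, obj)
        elif diff < 0 and right and abs(diff + 2 * right[0]) < abs(diff):
            obj = right.pop(0)      # lowest-ID object of the right server
            left.append(obj)
        else:
            return left, right, migrated
        migrated += obj


new_left, new_right, moved = rebalance([10, 10, 10, 10], [10])
print(new_left, new_right, "bytes migrated:", moved)
```

Because objects stay in ID order, the only way to even out load is to move the objects next to the boundary, and every such move is extra transfer on top of the original write.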

Conclusions
- Defragmented DHT Filesystem benefits:
  - Reduces task failures by an order of magnitude
  - Speeds up tasks by %
- Overhead might be reasonable: 1 byte written = 1.5 bytes transferred
- Key assumptions:
  - Most tasks are small to medium sized (file systems, web, etc. -- not streaming)
  - Wide area e2e latencies are tolerable

Tommy Maddox Slides

Load Balance

Lookup Traffic

Availability Breakdown

Performance Breakdown

Performance Breakdown 2
- With parallel playback, the Defragmented design suffers on the small number of very long tasks
- ignore - due to topology

Maximum Overhead

Other Workloads