Paper Survey of DHT (Distributed Hash Table)

Usages
- Directory service
  - Small amounts of information per entry, such as URIs, metadata, …
- Storage
  - Data, such as files, …
  - Immutable, for download only
- Database
  - Each entry is small, but there are many entries
  - Mutable
  - Special operations for queries

Challenges
- Immutable
  - Latency
  - Availability
  - Query consistency
- Mutable
  - Object consistency

Latency
- Query
  - Different routing architectures: Chord, Tapestry, Pastry, Kademlia, CAN, …
  - Recursive vs. iterative
  - Proximity neighbor routing
  - Parallel queries
  - Routing table size
- Fetch
  - Transport protocol
  - Proximity neighbor selection
  - Cache
  - Distributed object

Query: Routing Architectures
- Routing complexity
  - O(log n), O(d), O(1), …
- Principle
  - Each peer has a unique digest
  - Each object has a digest
  - Put the object on the peer with the closest digest
- The best-known designs are O(log n)
- O(1) -> cache
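
The placement principle on this slide can be sketched in a few lines. This is a minimal illustration, not any particular DHT's algorithm: it assumes 160-bit SHA-1 digests and uses XOR as the closeness metric (as Kademlia does; Chord would use clockwise ring distance instead). The names `digest` and `responsible_peer` are hypothetical.

```python
import hashlib


def digest(name: str) -> int:
    """Map a peer address or object name to a 160-bit integer ID (SHA-1, assumed)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def responsible_peer(key: int, peer_ids: list[int]) -> int:
    """Return the peer ID closest to the key under the XOR metric (assumed metric)."""
    return min(peer_ids, key=lambda pid: pid ^ key)


# Usage: place an object among eight hypothetical peers.
peers = [digest(f"peer-{i}") for i in range(8)]
owner = responsible_peer(digest("some-file.txt"), peers)
```

A real DHT never sees the full peer list; routing exists precisely to find this peer while knowing only O(log n) neighbors.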

Query: Recursive or Iterative
- Recursive: the query is forwarded from hop to hop
  - In theory, about 2 times faster than iterative
  - Primary parameters
    - Base
    - Number of successors
  - Persistence problem: a failure at an intermediate hop can lose the query

Query: Recursive or Iterative
- Iterative: the querying node contacts each hop itself
  - Not much slower in practice
  - Primary parameters
    - Number of parallel queries
    - Routing table tree
  - Learns new neighbors easily
  - Exchanges information with other peers
  - Flexible
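
The "2 times faster in theory" claim can be checked with simple arithmetic. The sketch below counts one-way message delays under two assumptions that are not stated on the slides: every link has the same one-way delay, and in the recursive case the final node replies directly to the origin. The function names are hypothetical.

```python
def recursive_delay(hops: int, one_way: float = 1.0) -> float:
    """Query is forwarded hop by hop; the last node answers the origin directly."""
    return (hops + 1) * one_way


def iterative_delay(hops: int, one_way: float = 1.0) -> float:
    """The origin performs one request/response round trip per hop."""
    return 2 * hops * one_way


# The ratio approaches 2 as the number of hops grows.
for hops in (1, 5, 20):
    print(hops, iterative_delay(hops) / recursive_delay(hops))
```

Under these assumptions the ratio is 2h / (h + 1), so the factor of 2 is a limit for long routes rather than a constant.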

Query: Proximity Neighbor Route
- Route through the node with the smaller delay
- Small delay -> small timeout
  - Timeout estimation: TCP-style > Vivaldi > fixed
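
A TCP-style timeout keeps a smoothed round-trip time and a variance estimate per neighbor. The sketch below follows the standard TCP retransmission-timer smoothing (gains 1/8 and 1/4, timeout = SRTT + 4 * RTTVAR); the class name `RttEstimator` and the default timeout are assumptions, and the 1-second minimum that TCP itself applies is deliberately omitted because a DHT wants short timeouts.

```python
class RttEstimator:
    """Per-neighbor timeout estimate modeled on TCP's RTT smoothing (a sketch)."""

    def __init__(self, alpha: float = 0.125, beta: float = 0.25, k: float = 4.0):
        self.alpha = alpha    # gain for the smoothed RTT
        self.beta = beta      # gain for the RTT variance
        self.k = k            # variance multiplier in the timeout
        self.srtt = None      # smoothed RTT, unknown until the first sample
        self.rttvar = None    # smoothed mean deviation of the RTT

    def observe(self, rtt: float) -> None:
        """Fold one measured round-trip time (seconds) into the estimate."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variance is updated first, using the previous smoothed RTT.
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - rtt)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt

    def timeout(self, default: float = 1.0) -> float:
        """Return the current timeout, or a fixed default before any sample."""
        if self.srtt is None:
            return default
        return self.srtt + self.k * self.rttvar
```

A nearby neighbor with stable delay converges to a timeout only slightly above its RTT, which is why a low-delay route also gives fast failure detection.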

Query: Proximity Neighbor Route
- Measurement methods
  - Global sampling
  - Neighbor's neighbors
  - Neighbor's inverse
  - Recursive sampling

Query: Others
- Parallel query
  - Faster
  - Has a partial PNS property
  - Persistent
  - More traffic
- Large routing table
  - Easy to find a closer node locally

Fetch: Cache
- Cache objects on nodes close to the primary one
- The number of caching nodes depends on the popularity of the object
- Average query hops can be reduced to a constant number (O(1))
- Hard to apply to mutable objects
- Under churn -> more bandwidth consumption

Fetch: Distributed Object
- Split the object into small pieces and put them on different nodes
- Faster recovery
- Faster download
- Hard to maintain
- Only for immutable data
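
One way to picture the splitting step: cut the object into fixed-size pieces and store each piece under the hash of its content. This is a sketch under assumptions (SHA-1 keys, a plain dictionary standing in for the DHT, hypothetical function names). Because each key is derived from the piece's bytes, a piece cannot change without changing its key, which matches the "only for immutable data" point above.

```python
import hashlib
from typing import Callable


def split_object(data: bytes, piece_size: int) -> tuple[list[str], dict[str, bytes]]:
    """Cut data into pieces; return the ordered key list and a key -> piece map."""
    keys: list[str] = []
    store: dict[str, bytes] = {}
    for offset in range(0, len(data), piece_size):
        piece = data[offset:offset + piece_size]
        key = hashlib.sha1(piece).hexdigest()  # content-derived key (assumed scheme)
        keys.append(key)
        store[key] = piece                     # in a DHT: put(key, piece) on some node
    return keys, store


def reassemble(keys: list[str], fetch: Callable[[str], bytes]) -> bytes:
    """Fetch every piece by key, verify it against its key, and join in order."""
    parts = []
    for key in keys:
        piece = fetch(key)
        if hashlib.sha1(piece).hexdigest() != key:
            raise ValueError(f"piece {key} failed its integrity check")
        parts.append(piece)
    return b"".join(parts)


# Usage: the dictionary lookup stands in for a DHT get().
keys, store = split_object(b"abcdefghij", 4)
original = reassemble(keys, store.__getitem__)
```

Since the pieces are independent, a downloader can fetch them from different nodes in parallel, which is where the faster download comes from.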

Fetch: Transport Protocol
- Striped Transport Protocol
  - UDP
  - Window control
  - Retransmission

Availability
- Replication
  - Reactive / proactive
  - Eager / lazy repair
- Erasure coding
- Load balance is broken
  - High correlation between uptime and storage
- Maintenance traffic problem

Availability: Replication
- Reactive
  - Creates a new copy when a copy is lost
  - Consumes a lot of bandwidth in a short time
  - Better when churn is low
- Proactive
  - Creates copies continually
  - Consumes a small, constant amount of bandwidth
  - Needs availability prediction and redundancy management
  - Bandwidth usage is predictable

Availability: Replication
- Temporary / permanent churn
  - Availability
  - Durability
  - Can 100% availability and/or durability be achieved?
- Eager repair
  - Duplicate immediately
- Lazy repair
  - Duplicate after a timeout
  - Needs a good choice of timeout
- Reintegrating returning replicas
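
The difference between eager and lazy repair comes down to how long a silent replica holder is still counted as alive. The sketch below is an illustration under assumptions: `last_seen` maps each replica holder to the time it last responded, and the function name and parameters are hypothetical. A timeout of 0 behaves like eager repair.

```python
def replicas_to_create(last_seen: dict[str, float], now: float,
                       target: int, timeout: float) -> int:
    """Count how many new copies are needed to get back to `target` replicas.

    A holder counts as live if it responded within `timeout` seconds of `now`.
    """
    live = [node for node, seen in last_seen.items() if now - seen <= timeout]
    return max(0, target - len(live))


# Usage: three holders, one silent for 5 s and one silent for 60 s.
last_seen = {"a": 100.0, "b": 95.0, "c": 40.0}
eager = replicas_to_create(last_seen, now=100.0, target=3, timeout=0.0)   # 2 new copies
lazy = replicas_to_create(last_seen, now=100.0, target=3, timeout=30.0)   # 1 new copy
```

If node "c" comes back and its `last_seen` entry is refreshed, it counts as a replica again without any copying, which is the reintegration case; a timeout that is too short pays for copies that a returning node would have made unnecessary.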

Availability: Erasure Coding
- Matters more for larger objects
- Saves storage and bandwidth
- Under high churn, the bandwidth consumption is still not acceptable
- Complex maintenance
- Download latency is heterogeneous
- Only for immutable data
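
The storage saving can be made concrete with a simple availability model. The sketch assumes that each node is up independently with probability `p` (a strong assumption that real churn traces often violate) and that an erasure-coded object is split into `n` fragments of which any `m` suffice to rebuild it. The function names are hypothetical.

```python
from math import comb


def replication_availability(p: float, copies: int) -> float:
    """Probability that at least one of `copies` full replicas is reachable."""
    return 1.0 - (1.0 - p) ** copies


def erasure_availability(p: float, m: int, n: int) -> float:
    """Probability that at least m of n fragments are reachable (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m, n + 1))


# Usage: both schemes below store 2x the object size.
two_copies = replication_availability(0.9, 2)
four_of_eight = erasure_availability(0.9, 4, 8)
```

With the same 2x storage overhead, the 4-of-8 code comes out more available than two full copies in this model; the costs listed above (maintenance complexity, fetching from several nodes) are what the model leaves out.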

Query Consistency
- If a digest-object mapping exists, a query for that digest must return that object
- Weakly consistent KBR (key-based routing)
  - Eventual consistency
  - Most existing DHTs
- Strongly consistent KBR
  - Causal consistency
  - Strong consistency
- Solution
  - Route by W-KBR to a group
  - Use S-KBR within the group

Mutable DHT
- Objects stored in the DHT are mutable
  - Insert, update, delete
- Churn -> replicas
- New challenge …

Object Consistency
- For immutable data
  - May still be an issue for security reasons
  - Merkle tree
- For mutable data
  - Consensus algorithm
    - Distributed algorithm for data consistency
  - Quorum algorithm
    - Read / write locks
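
For the immutable case, a Merkle tree lets a reader verify a multi-block object against a single trusted root hash. The sketch below is one common construction, not a layout taken from the surveyed papers: it assumes SHA-256, duplicates the last node on odd levels, and prefixes leaf and interior hashes with different bytes so that the two cannot be confused. The function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(blocks: list[bytes]) -> bytes:
    """Compute the Merkle root of a non-empty list of data blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    level = [_h(b"\x00" + block) for block in blocks]      # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])                        # pad odd levels (assumed rule)
        level = [_h(b"\x01" + level[i] + level[i + 1])     # interior hashes
                 for i in range(0, len(level), 2)]
    return level[0]


# Usage: any change to any block changes the root.
root = merkle_root([b"block-0", b"block-1", b"block-2"])
tampered = merkle_root([b"block-0", b"block-X", b"block-2"])
```

A node that trusts only the root can detect a corrupted or substituted block returned by an untrusted peer, because `root != tampered`.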

Pitfalls
- Different kinds of P2P systems have different properties
- Lack of new real-world traces
- Lack of a standard simulation platform
