DHTs and Peer-to-Peer Systems Supplemental Slides Aditya Akella 03/21/2007

Peer-to-Peer Networks
– Typically each member stores/provides access to content
– Has quickly grown in popularity
– Basically a replication system for files
  – Always a tradeoff between possible locations of files and searching difficulty
  – Peer-to-peer allows files to be anywhere → searching is the challenge
  – A dynamic member list makes it more difficult
– What other systems have similar goals? Routing, DNS

The Lookup Problem
[Figure: nodes N1…N6 connected across the Internet. A publisher stores (key="title", value=MP3 data…) at one node; elsewhere a client issues Lookup("title"). How does the query find the node holding the key?]

Centralized Lookup (Napster)
[Figure: publisher N4 holds (key="title", value=MP3 data…) and registers it with a central DB via SetLoc("title", N4); the client's Lookup("title") goes to the DB, which returns N4]
– Simple, but O(N) state at the directory and a single point of failure
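A minimal Python sketch of this directory idea (class and method names are made up, not Napster's actual protocol): peers register content with one server, and every lookup goes through that same server, which is exactly the O(N) state and single point of failure noted above.

    # Hypothetical central directory in the style of Napster (illustrative, not the real protocol)
    class CentralDirectory:
        def __init__(self):
            self.locations = {}                   # key -> set of peers holding it: O(N) state

        def set_loc(self, key, peer):             # a publisher announces "I have <key>"
            self.locations.setdefault(key, set()).add(peer)

        def lookup(self, key):                    # a client asks "who has <key>?"
            return self.locations.get(key, set())

    directory = CentralDirectory()
    directory.set_loc("title", "N4")              # N4 publishes the MP3
    print(directory.lookup("title"))              # -> {'N4'}: fetch the file directly from N4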

Flooded Queries (Gnutella)
[Figure: the client floods Lookup("title") to its neighbors, which forward it peer to peer until it reaches N4, the node holding (key="title", value=MP3 data…)]
– Robust, but worst case O(N) messages per lookup
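A toy flood, assuming a hypothetical neighbor map and a TTL to bound the search: content is found wherever it happens to be, but in the worst case a query touches every edge, hence the O(N) message cost.

    # Toy Gnutella-style flooded lookup (breadth-first with a TTL); names are illustrative.
    from collections import deque

    def flooded_lookup(neighbors, content, start, key, ttl=7):
        """neighbors: node -> list of peers; content: node -> set of keys stored there."""
        frontier, seen, messages = deque([(start, ttl)]), {start}, 0
        while frontier:
            node, hops = frontier.popleft()
            if key in content.get(node, set()):
                return node, messages             # a peer holding the key was found
            if hops == 0:
                continue
            for nbr in neighbors.get(node, []):
                messages += 1                     # one query message per edge traversed
                if nbr not in seen:
                    seen.add(nbr)
                    frontier.append((nbr, hops - 1))
        return None, messages                     # not reachable within the TTL

    neighbors = {"N1": ["N2", "N3"], "N2": ["N4"], "N3": ["N4"], "N4": []}
    content = {"N4": {"title"}}
    print(flooded_lookup(neighbors, content, "N1", "title"))   # ('N4', 4)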

Routed Queries (Chord, etc.)
[Figure: the publisher stores (key="title", value=MP3 data…) at a node; the client's Lookup("title") is routed through a few intermediate nodes directly to that node]

Routing: Structured Approaches
– Goal: make sure that an identified item (file) is always found in a reasonable number of steps
– Abstraction: a distributed hash table (DHT) data structure
  – insert(id, item);
  – item = query(id);
  – Note: item can be anything: a data object, document, file, pointer to a file…
– Proposals
  – CAN (ICIR/Berkeley)
  – Chord (MIT/Berkeley)
  – Pastry (Rice)
  – Tapestry (Berkeley)
  – …
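The two calls above are the entire application-visible interface; a minimal in-memory stand-in (one dict playing the role of all nodes, SHA-1 chosen here just as an example hash) makes the abstraction concrete before any routing is introduced.

    import hashlib

    # Minimal stand-in for the DHT abstraction: insert(id, item) / item = query(id).
    # A real DHT routes each call to the node responsible for the id; here one dict holds everything.
    class ToyDHT:
        def __init__(self, bits=160):
            self.bits = bits
            self.store = {}

        def id_of(self, key):
            # map any name into the flat identifier space, e.g. by hashing it
            return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big") % (2 ** self.bits)

        def insert(self, key, item):
            self.store[self.id_of(key)] = item

        def query(self, key):
            return self.store.get(self.id_of(key))

    dht = ToyDHT()
    dht.insert("song.mp3", "pointer to the file on N4")
    print(dht.query("song.mp3"))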

Routing: Chord
– Associate with each node and each item a unique ID in a one-dimensional space
– Properties
  – Routing table size O(log N), where N is the total number of nodes
  – Guarantees that a file is found in O(log N) steps

Aside: Consistent Hashing [Karger 97]
[Figure: a circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80 placed on the ring; a key (e.g., key 5) and a node (e.g., node 105) are just points in the same space]
– A key is stored at its successor: the node with the next-higher ID
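A sketch of the successor rule on the 7-bit ring from the figure: node IDs are kept sorted, and a key maps to the first node ID at or after it, wrapping past the top of the ID space.

    import bisect

    def successor(node_ids, key_id, bits=7):
        """The node responsible for key_id: the next node at or clockwise of key_id."""
        ring = sorted(node_ids)
        i = bisect.bisect_left(ring, key_id % (2 ** bits))
        return ring[i % len(ring)]                # wrap around past the highest node ID

    nodes = [32, 90, 105]                         # N32, N90, N105 from the figure
    for key in (5, 20, 80):                       # K5, K20, K80
        print(f"K{key} is stored at N{successor(nodes, key)}")
    # K5 -> N32, K20 -> N32, K80 -> N90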

Routing: Chord Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120. The question "Where is key 80?" is passed from successor to successor around the ring until the answer "N90 has K80" comes back]

Routing: Finger table - Faster Lookups
[Figure: node N80 keeps fingers to nodes ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring, so each hop can at least halve the remaining distance to the target]

Routing: Chord Summary
– Assume the identifier space is 0…2^m
– Each node maintains
  – A finger table: entry i in the finger table of node n is the first node that succeeds or equals n + 2^i
  – Its predecessor node
– An item identified by id is stored on the successor node of id
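The rule above fully determines each node's routing state; a short sketch builds the finger (successor) table for every node, and applied to the 8-ID example on the following slides (nodes 0, 1, 3, 6) it reproduces the tables shown there.

    def successor(nodes, ident, m):
        """First node ID at or clockwise of ident on a ring of size 2**m."""
        ident %= 2 ** m
        ring = sorted(nodes)
        return next((n for n in ring if n >= ident), ring[0])

    def finger_table(n, nodes, m):
        """Entry i points to the first node that succeeds or equals n + 2**i."""
        return [(i, (n + 2 ** i) % 2 ** m, successor(nodes, n + 2 ** i, m))
                for i in range(m)]

    nodes, m = [0, 1, 3, 6], 3                    # the example ring used on the next slides
    for n in nodes:
        print(f"node {n}: {finger_table(n, nodes, m)}")
    # node 1: [(0, 2, 3), (1, 3, 3), (2, 5, 6)], etc.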

Routing: Chord Example
– Assume an identifier space of size 8 (IDs 0…7)
– Node n1 (ID 1) joins → all entries in its succ. (finger) table are initialized to itself
[Succ. table of node 1: i=0: 2 → 1, i=1: 3 → 1, i=2: 5 → 1]

Routing: Chord Example
– Node n2 (ID 3) joins
[Succ. tables: node 1: 2 → 3, 3 → 3, 5 → 1;  node 3: 4 → 1, 5 → 1, 7 → 1]

Routing: Chord Example
– Nodes n3 (ID 0) and n4 (ID 6) join
[Succ. tables: node 0: 1 → 1, 2 → 3, 4 → 6;  node 1: 2 → 3, 3 → 3, 5 → 6;  node 3: 4 → 6, 5 → 6, 7 → 0;  node 6: 7 → 0, 0 → 0, 2 → 3]

Routing: Chord Example
– Nodes: n1 (ID 1), n2 (ID 3), n3 (ID 0), n4 (ID 6); items: f1 (ID 7), f2 (ID 2)
– Each item is stored at the successor of its ID: item 7 at node 0, item 2 at node 3
[Figure: the ring with the four nodes, their succ. tables as above, and each item attached to the node that stores it]

Routing: Query
– Upon receiving a query for item id, a node:
  – checks whether it stores the item locally
  – if not, forwards the query to the largest node in its successor table that does not exceed id
[Figure: query(7) issued at node 1 is forwarded to node 6 and then to node 0, which stores item 7]
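A sketch of this forwarding rule over the example ring, with the succ. tables from the previous slides hard-coded; one extra case (falling back to the immediate successor when no table entry lies between the current node and the target) is needed to finish a lookup such as query(7) at node 6, where the responsible node has a numerically smaller ID.

    # Succ. tables for the 0..7 example: node -> list of (target id, successor node).
    TABLES = {
        0: [(1, 1), (2, 3), (4, 6)],
        1: [(2, 3), (3, 3), (5, 6)],
        3: [(4, 6), (5, 6), (7, 0)],
        6: [(7, 0), (0, 0), (2, 3)],
    }
    ITEMS = {0: {7}, 3: {2}}                      # each item sits at the successor of its ID

    def between(x, lo, hi, m=3):
        """True if x lies in the circular interval (lo, hi] on a 2**m ring."""
        x, lo, hi = x % 2**m, lo % 2**m, hi % 2**m
        return (lo < x <= hi) if lo < hi else (x > lo or x <= hi)

    def query(node, item_id, m=3):
        path = [node]
        while item_id not in ITEMS.get(node, set()):
            # take the table entry that gets closest to item_id without passing it;
            # if none lies between us and the target, hop to our immediate successor
            options = [s for _, s in TABLES[node] if between(s, node, item_id, m)]
            node = min(options, key=lambda s: (item_id - s) % 2**m) if options else TABLES[node][0][1]
            path.append(node)
        return node, path

    print(query(1, 7))                            # (0, [1, 6, 0]): forwarded 1 -> 6 -> 0
    print(query(0, 2))                            # (3, [0, 1, 3])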

What can DHTs do for us?
– Distributed object lookup
  – Based on object ID
– Decentralized file systems
  – CFS, PAST, Ivy
– Application-layer multicast
  – Scribe, Bayeux, SplitStream
– Databases
  – PIER

Comparison
– Many proposals for DHTs
  – Tapestry (UCB) -- Symphony (Stanford) -- 1hop (MIT)
  – Pastry (MSR, Rice) -- Tangle (UCB) -- conChord (MIT)
  – Chord (MIT, UCB) -- SkipNet (MSR, UW) -- Apocrypha (Stanford)
  – CAN (UCB, ICSI) -- Bamboo (UCB) -- LAND (Hebrew Univ.)
  – Viceroy (Technion) -- Hieras (U. Cinn.) -- ODRI (Texas A&M)
  – Kademlia (NYU) -- Sprout (Stanford)
  – Kelips (Cornell) -- Calot (Rochester)
  – Koorde (MIT) -- JXTA’s (Sun)
– What are the right design choices? Effect on performance?

Deconstructing DHTs
Two observations:
1. Common approach
  – N nodes, each labeled with a virtual identifier (128 bits)
  – define a “distance” function on the identifiers
  – routing works to reduce the distance to the destination
2. DHTs differ primarily in their definition of “distance”
  – typically derived from a (loose) notion of a routing geometry

DHT Routing Geometries
– Geometries:
  – Tree (Plaxton, Tapestry)
  – Ring (Chord)
  – Hypercube (CAN)
  – XOR (Kademlia)
  – Hybrid (Pastry)
– What is the impact of geometry on routing?

Tree (Plaxton, Tapestry)
Geometry
– nodes are leaves in a binary tree
– distance = height of the smallest common subtree
– log N neighbors, in subtrees at distance 1, 2, …, log N

Hypercube (CAN)
Geometry
– nodes are the corners of a hypercube
– distance = number of bits in which the IDs of two nodes differ
– log N neighbors per node, each at distance 1

Ring (Chord)
Geometry
– nodes are points on a ring
– distance = numeric distance between two node IDs
– log N neighbors, exponentially spaced over 0…N

Hybrid (Pastry)
Geometry:
– combination of a tree and a ring
– two distance metrics
– default routing uses the tree; falls back to the ring under failures
– neighbors picked as on the tree
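Each geometry's notion of distance can be written down directly; a sketch (assuming IDs are plain Python ints, shown with 8-bit values for readability) of the metrics described on the preceding slides, plus Kademlia's XOR metric.

    def ring_distance(a, b, bits=8):
        # Chord: clockwise numeric distance on the ring
        return (b - a) % (2 ** bits)

    def hypercube_distance(a, b):
        # CAN-style hypercube: number of bits in which the two IDs differ (Hamming distance)
        return bin(a ^ b).count("1")

    def tree_distance(a, b):
        # Plaxton / Tapestry: height of the smallest common subtree,
        # i.e. the position of the highest differing bit
        return 0 if a == b else (a ^ b).bit_length()

    def xor_distance(a, b):
        # Kademlia: the XOR of the two IDs, interpreted as an integer
        return a ^ b

    a, b = 0b10110010, 0b10010010
    print(ring_distance(a, b), hypercube_distance(a, b), tree_distance(a, b), xor_distance(a, b))
    # 224 1 6 32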

Geometry’s Impact on Routing
– Routing
  – Neighbor selection: how a node picks its routing entries
  – Route selection: how a node picks the next hop
– Proposed metric: flexibility
  – the amount of freedom to choose neighbors and next-hop paths
    – FNS: flexibility in neighbor selection
    – FRS: flexibility in route selection
  – intuition: captures the ability to “tune” DHT performance
  – a single predictor metric that depends only on routing issues

FRS for Ring Geometry
– The Chord algorithm picks the neighbor closest to the destination
– A different algorithm picks the best of alternate paths

FNS for Ring Geometry
– The Chord algorithm picks the i-th neighbor at distance 2^i
– A different algorithm picks the i-th neighbor from the range [2^i, 2^(i+1))
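A sketch contrasting the two choices on this slide, using a small ring with 4-bit IDs and a hypothetical latency map: Chord's rule leaves exactly one legal i-th neighbor, while the flexible rule may pick any node in [n + 2^i, n + 2^(i+1)), for example the lowest-latency one.

    import bisect

    def successor(ring, ident, m):
        ring = sorted(ring)
        return ring[bisect.bisect_left(ring, ident % 2**m) % len(ring)]

    def chord_neighbors(n, ring, m):
        """No flexibility: the i-th neighbor must be successor(n + 2**i)."""
        return [successor(ring, n + 2**i, m) for i in range(m)]

    def in_span(v, lo, hi):
        """True if v lies in the circular interval [lo, hi)."""
        return (lo <= v < hi) if lo < hi else (v >= lo or v < hi)

    def flexible_neighbors(n, ring, m, latency):
        """FNS: any node in [n + 2**i, n + 2**(i+1)) may serve as the i-th neighbor."""
        picks = []
        for i in range(m):
            lo, hi = (n + 2**i) % 2**m, (n + 2**(i + 1)) % 2**m
            span = [v for v in ring if in_span(v, lo, hi)]
            picks.append(min(span, key=lambda v: latency[v]) if span
                         else successor(ring, lo, m))    # empty span: keep the strict successor
        return picks

    ring, m = [1, 3, 5, 6, 9, 12, 14], 4
    latency = {1: 0, 3: 30, 5: 120, 6: 20, 9: 200, 12: 25, 14: 60}   # hypothetical RTTs (ms)
    print(chord_neighbors(1, ring, m))                   # [3, 3, 5, 9]
    print(flexible_neighbors(1, ring, m, latency))       # [3, 3, 6, 12]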

Flexibility: at a Glance
  Flexibility       Ordering of geometries
  Neighbors (FNS)   Hypercube (1)  <<  Tree, XOR, Ring, Hybrid (2^(i-1))
  Routes (FRS)      Tree (1)  <<  XOR, Hybrid (log N / 2)  <  Hypercube (log N / 2)  <  Ring (log N)

Geometry → Flexibility → Performance?
– Validate over three performance metrics:
  1. resilience
  2. path latency
  3. path convergence
– The metrics address two typical concerns:
  – ability to handle node failure
  – ability to incorporate proximity into overlay routing

Analysis of Static Resilience
– Two aspects of robust routing
  – Dynamic recovery: how quickly routing state is recovered after failures
  – Static resilience: how well the network routes before recovery finishes
    – captures how quickly recovery algorithms need to work
    – depends on FRS
– Evaluation: fail a fraction of nodes, without recovering any state
– Metric: % of paths failed
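A small experiment in this spirit, under simplifying assumptions (a Chord-like ring with finger tables, greedy clockwise routing, and no recovery of any kind): kill a fraction of the nodes, then measure the percentage of live-source to live-destination paths that can no longer be completed.

    import random

    def clockwise(a, b, size):
        return (b - a) % size

    def build_fingers(nodes, m):
        size, ring = 2 ** m, sorted(nodes)
        def succ(ident):
            ident %= size
            return next((v for v in ring if v >= ident), ring[0])
        return {n: sorted({succ(n + 2 ** i) for i in range(m)}) for n in ring}

    def route(src, dst, fingers, live, size):
        """Greedy routing over surviving finger entries; False means the path is stuck."""
        node = src
        for _ in range(len(fingers) + 1):         # strict progress => at most one visit per node
            if node == dst:
                return True
            options = [f for f in fingers[node]
                       if f != node and f in live
                       and clockwise(node, f, size) <= clockwise(node, dst, size)]
            if not options:
                return False                      # static failure: no recovery is allowed
            node = min(options, key=lambda f: clockwise(f, dst, size))
        return False

    def static_resilience(m=10, n_nodes=200, fail_frac=0.3, trials=2000, seed=1):
        random.seed(seed)
        size = 2 ** m
        nodes = random.sample(range(size), n_nodes)
        fingers = build_fingers(nodes, m)
        live = sorted(random.sample(nodes, int(n_nodes * (1 - fail_frac))))
        live_set, failed = set(live), 0
        for _ in range(trials):
            src, dst = random.sample(live, 2)
            failed += not route(src, dst, fingers, live_set, size)
        return 100.0 * failed / trials

    print(f"{static_resilience():.1f}% of paths failed")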

Does flexibility affect static resilience?
– Tree << XOR ≈ Hybrid < Hypercube < Ring
– Flexibility in route selection matters for static resilience

Which is more effective, FNS or FRS?
– Plain << FRS << FNS ≈ FNS+FRS
– Neighbor selection is much better than route selection