LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS

Key Idea Survey paper Discusses how to access data in a P2P system Covers four solutions –CAN –Chord –Pastry –Tapestry

INTRODUCTION P2P systems are popular due to –Low startup cost – High scalability at very low cost –Use of resources that would otherwise remain unused –Potential for greater robustness Fully decentralized and distributed

The lookup problem How do we locate data in large P2P systems? One solution –Distributed hash tables (DHT)

Previous solutions (I) Centralized database –Napster Not scalable Vulnerable to attacks on database

Previous solutions (II) Broadcasting –Nodes broadcast their requests to their neighbors, which forward them to their own neighbors, and so on –Gnutella –Does not scale either Broadcast messages consume too much bandwidth

Previous solutions (III) Internet DNS –Organizes network nodes into a hierarchy –All searches start at the top of the hierarchy Propagate down –Used by KaZaA, Grokster and others –Nodes higher in the tree do much more work than lower nodes –Solution is vulnerable to the loss of the root node(s)

Previous solutions (IV) Freenet –Forwards queries from node to node until requested data are found –Emphasis is on anonymity Not performance Unpopular documents may become inaccessible –Nobody cares!

DISTRIBUTED HASH TABLES Implement the primitive lookup(key) –Produces a path going from a node n0 to the node holding the key The big tradeoff is between –Keeping paths short –Minimizing the state information kept by nodes
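
The lookup primitive can be pictured as a tiny per-node interface. A minimal sketch, with class and method names that are illustrative assumptions rather than anything from the paper:

```python
class DHTNode:
    """Sketch of the interface a DHT node exposes."""

    def lookup(self, key: int) -> str:
        """Return the address of the node responsible for `key`.
        Internally this follows a path n0 -> n1 -> ... -> nk through the
        overlay, each hop moving closer to the node that holds the key."""
        raise NotImplementedError

    def put(self, key: int, value: bytes) -> None:
        """Store `value` on the node found by lookup(key)."""
        raise NotImplementedError

    def get(self, key: int) -> bytes:
        """Fetch the value stored under `key` from that node."""
        raise NotImplementedError
```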

Main design issues Mapping keys to nodes in a balanced way –Use a hash function Forwarding a lookup for a key to the appropriate node –Find at each step a node closer to the node holding the key Building routing tables –Each node should have a successor
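
For the first issue, keys and node addresses can be hashed into one large shared identifier space; the sketch below assumes SHA-1 and 160-bit IDs (what Chord uses), but the exact hash is not essential:

```python
import hashlib

def to_id(data: bytes) -> int:
    """Map a key or a node address into the [0, 2^160) identifier space.
    A good hash spreads keys and nodes roughly uniformly, balancing load."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")

node_id = to_id(b"18.26.4.9:4000")    # a node, hashed from its address
key_id  = to_id(b"some-file-name")    # a key, hashed from its name
```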

CAN Uses a d-dimensional key space –Partitioned into hyper-rectangles ("zones") –Each node manages a zone Responsible for all keys in its zone

Neighbors Each node keeps track of addresses of all its neighbors –Routing table Neighbors are defined as nodes sharing a (d-1) dimensional hyper-plane –Contacts with fewer dimensions in common do not count
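
A hedged sketch of the neighbor test: it represents a zone as a list of (low, high) spans, one per dimension, and ignores the wrap-around of the key space:

```python
def are_neighbors(zone_a, zone_b):
    """True if the zones' spans overlap in d-1 dimensions and merely touch
    (share only a boundary) in the remaining one; wrap-around is ignored."""
    overlaps = abuts = 0
    for (a_lo, a_hi), (b_lo, b_hi) in zip(zone_a, zone_b):
        if a_lo < b_hi and b_lo < a_hi:      # spans properly overlap
            overlaps += 1
        elif a_hi == b_lo or b_hi == a_lo:   # spans only touch at a boundary
            abuts += 1
    return overlaps == len(zone_a) - 1 and abuts == 1

# Two zones from the two-dimensional example that follows:
# (0, 0; 0.5, 0.5) and (0.5, 0; 0.75, 0.25)
print(are_neighbors([(0, 0.5), (0, 0.5)], [(0.5, 0.75), (0, 0.25)]))   # True
```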

A two-dimensional example (I)

A two-dimensional example (II) [Figure: the unit key space (0, 0)–(1, 1) partitioned into six zones: (0, 0; 0.5, 0.5), (0, 0.5; 0.5, 1), (0.5, 0.5; 1, 1), (0.5, 0.25; 0.75, 0.5), (0.5, 0; 0.75, 0.25), (0.75, 0; 1, 0.5). In reality the space wraps around.]

A path from (0.25, 0.3) to (0.8, 0.8) [Figure: the same six-zone partition, with the lookup path crossing the zones that lie between the two points. In reality the space wraps around.]

Lookup Routing tries to approximate the straight path between the current zone and the zone holding the key Various optimizations attempt to reduce lookup latency
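
A sketch of the greedy forwarding step under simplifying assumptions: zones are (low, high) spans per dimension, distance is plain Euclidean distance to each neighbor's zone centre, and wrap-around and CAN's latency optimizations are ignored:

```python
def next_hop(neighbors, target):
    """Pick the neighbor whose zone centre is closest to the point the key
    hashes to. `neighbors` maps a neighbor's address to its zone."""
    def centre(zone):
        return [(lo + hi) / 2 for lo, hi in zone]

    def dist2(point_a, point_b):
        return sum((a - b) ** 2 for a, b in zip(point_a, point_b))

    return min(neighbors, key=lambda addr: dist2(centre(neighbors[addr]), target))
```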

Dynamic behavior When a node joins the network –It picks a random point in the key space –Finds the node managing the zone containing that point –Splits that zone with it When a node departs –Its zone must be merged with a neighbor's More complex process
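
A hedged sketch of the zone split performed at join time; real CAN cycles through the dimensions in a fixed order, while this version simply halves the widest span and gives the newcomer the half containing its random point:

```python
def split_zone(zone, point):
    """Halve `zone` along its widest dimension. Returns (old_half, new_half):
    the newcomer takes the half containing `point`, the old owner keeps the rest.
    A zone is a list of (low, high) spans, one per dimension."""
    dim = max(range(len(zone)), key=lambda i: zone[i][1] - zone[i][0])
    lo, hi = zone[dim]
    mid = (lo + hi) / 2
    lower = [s if i != dim else (lo, mid) for i, s in enumerate(zone)]
    upper = [s if i != dim else (mid, hi) for i, s in enumerate(zone)]
    new_half = lower if point[dim] < mid else upper
    old_half = upper if new_half is lower else lower
    return old_half, new_half

# A newcomer picking point (0.6, 0.1) inside zone (0.5, 0; 0.75, 0.25):
print(split_zone([(0.5, 0.75), (0, 0.25)], (0.6, 0.1)))
```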

Fault-tolerance When a node fails, the neighbor with the smallest zone takes over –Multiple failures may leave some nodes handling too many zones

CHORD Assigns IDs to keys and nodes in the same ID space IDs are organized in a ring –ID 0 follows the highest ID Each node is responsible for all keys that immediately precede it in the ID space

Example [Figure: a Chord ring with nodes N4, N12, N20, N24 and keys K1, K6, K10, K15. Each key is stored at the first node whose ID follows it: N4 holds K1, N12 holds K6 and K10, N20 holds K15.]
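
Under this rule, the assignment of the example's keys to nodes is a short computation; a sketch with small integer IDs (real Chord uses 160-bit hashed IDs):

```python
def successor(key_id, node_ids):
    """The first node whose ID is >= the key's ID, wrapping past the
    highest ID back to the lowest one (ID 0 follows the highest ID)."""
    ring = sorted(node_ids)
    return next((n for n in ring if n >= key_id), ring[0])

nodes = [4, 12, 20, 24]                       # N4, N12, N20, N24
for k in (1, 6, 10, 15):                      # K1, K6, K10, K15
    print(f"K{k} -> N{successor(k, nodes)}")  # K1->N4, K6->N12, K10->N12, K15->N20
```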

Finger table Each node keeps a table containing the IP addresses of nodes –Halfway around the key space –A quarter of the way around –… The table has log N entries –Allows O(log N) searches
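
A sketch of how the finger targets are chosen, using a toy 6-bit ID space so the example ring fits (real Chord uses 160 bits); finger[i] is the successor of n + 2^i, which is what bounds the table at log N entries:

```python
ID_BITS = 6                                    # toy ring with 2^6 = 64 IDs

def successor(key_id, node_ids):
    """First node ID >= key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    return next((n for n in ring if n >= key_id), ring[0])

def build_fingers(node_id, node_ids):
    """finger[i] = successor(node_id + 2^i): targets 1, 2, 4, ..., half of
    the ring ahead of the node, giving log N entries; following the closest
    preceding finger roughly halves the distance left to the key each hop."""
    return [successor((node_id + 2 ** i) % 2 ** ID_BITS, node_ids)
            for i in range(ID_BITS)]

print(build_fingers(4, [4, 12, 20, 24]))       # [12, 12, 12, 12, 20, 4]
```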

Partial example [Figure: the ring with nodes N4, N12, N20, N24 and some of their finger pointers.]

Fault-tolerance Each node has a successor list –Contains the IP addresses of its next r successors Guarantees routing progress as long as not all r successors are down

Dynamic behavior A new node n learns its place in the Chord ring by asking any extant node to do a lookup(n) Must also –Update the successor list of its predecessor –Create its own successor list

PASTRY Scalable, self-organizing routing and object-location infrastructure Each node has a node ID –IDs are uniformly distributed in the ID space Includes a proximity metric to measure the distance between pairs of nodes

Pastry Nodes Each node maintains three sets of nodes –Leaf set Closest nodes in terms of node IDs Same function as Chord's successor list –Routing table Used for prefix routing (the big idea) –Neighborhood set Closest nodes in terms of the proximity metric

Dynamic behavior Pastry is self-organizing –Nodes come and go –Includes a seed discovery protocol

Prefix Routing At each step, a node forwards an incoming request to a node whose node ID shares the longest common prefix with the destination ID Example –Destination ID: 1230 –Node ID: 1023 –Next hop: 12--

Routing table for node 1023 [Figure: the table's rows group the known nodes by the length of the prefix they share with 1023 – no common prefix, one common digit, two common digits, three common digits.]

Routing a request for node 1230 [Figure: node 1023's routing table, rows grouped by shared prefix length – no common prefix, one common digit, two common digits, three common digits.] The request is always sent to a node having at least one more common prefix digit. Here it is node 1223.

At node 1223 [Figure: node 1223's routing table, rows grouped by shared prefix length.] The node with at least one more common prefix digit is node 1230, the destination itself.
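
A hedged sketch of the next-hop choice shown in this walkthrough, treating IDs as digit strings; it ignores Pastry's leaf set and its fall-back to a numerically closer node, so it captures only the prefix-routing idea:

```python
def shared_prefix(a: str, b: str) -> int:
    """Number of leading digits the two IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def next_hop(current_id, dest_id, known_ids):
    """Forward to a known node whose ID shares a strictly longer prefix with
    the destination than the current node's does; longest such prefix wins."""
    here = shared_prefix(current_id, dest_id)
    better = [n for n in known_ids if shared_prefix(n, dest_id) > here]
    return max(better, key=lambda n: shared_prefix(n, dest_id), default=None)

# The walkthrough above, toward destination 1230
# (the other IDs stand in for arbitrary routing-table entries):
print(next_hop("1023", "1230", ["1223", "3001", "2130"]))   # 1223 (shares "12")
print(next_hop("1223", "1230", ["1230", "1203", "0123"]))   # 1230 itself
```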

TAPESTRY Interprets keys as sequences of digits Incremental prefix routing –Similar to Pastry Main contribution is its emphasis on proximity –In the underlying physical network Reduces query latency Makes the system much more complex

CONCLUSIONS Major issues include –Operational costs: lookups take O(log N) hops in Chord, Pastry and Tapestry and O(d·N^(1/d)) in CAN; storage costs vary –Fault-tolerance and concurrent changes: only Chord and Tapestry can handle them –Proximity routing: CAN, Pastry and Tapestry have heuristics –Malicious nodes: Pastry checks node IDs

Summary of costs

              CAN                Chord     Pastry    Tapestry
Node state¹   d                  log N     log N     log N
Lookup²       d·N^(1/d)          log N     log N     log N
Join²         d·N^(1/d) + d      log N     log² N    log² N

¹ Number of other nodes known by a given node
² Number of messages