Fault-tolerant Routing in Peer-to-Peer Systems James Aspnes Zoë Diamadi Gauri Shah Yale University PODC 2002.

Presentation transcript:

P2P network
A bunch of peers that store resources identified by keys.
Peers are subject to crash failures.
Goal: locate resources "efficiently".
[Figure: peers, resources, and a key.]

Properties of ideal network
Data availability
Decentralization
Fault-tolerance
Scalability
Load balancing
Maintaining network
Dynamic node addition/deletion
Self-stabilization
Efficient searching
Incorporating geography
Incorporating locality

Early P2P systems
Napster: central server bottleneck.
Gnutella: inefficient flooding.
[Figure: query diagrams for Napster and Gnutella.]

Tapestry [JKZ’01]
Uses Plaxton’s algorithm: correct one digit at a time to reach the target.
Node xyz links to *XX, x*X and xy* [* = all digits, X = any digit].
Pastry [DR’01] is similar.
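
The digit-correcting idea can be sketched as follows. This is a simplified illustration, not Tapestry's or Pastry's actual protocol: it assumes fixed-length digit-string IDs, assumes every intermediate ID exists as a live node, and fixes digits left to right to match the xyz link pattern on the slide. The name `route_by_digits` is hypothetical.

```python
def route_by_digits(source: str, target: str) -> list[str]:
    """Return the IDs visited when one digit is corrected per hop."""
    if len(source) != len(target):
        raise ValueError("IDs must have the same length")
    path = [source]
    current = source
    for i, digit in enumerate(target):
        if current[i] != digit:
            # Hop to a node that shares the first i digits with the current
            # node and has the target's digit at position i.
            # Assumption: the remaining digits stay unchanged; a real node
            # only guarantees *some* node matching the corrected prefix.
            current = current[:i] + digit + current[i + 1:]
            path.append(current)
    return path
```

Under these assumptions the number of hops is at most the number of digits in an ID.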

CAN [RFHKS’01]
Partition a d-dimensional coordinate space into zones.
Nodes own zones, and keys are hashed to them.
Greedy routing: forward to the neighbor closest to the target.
[Figure: d = 2 coordinate space with corners (0,0), (1,0), (0,1), (1,1), divided into zones.]
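
Greedy routing over zones can be sketched for d = 2 as below. This is an illustrative simplification, not CAN's actual protocol: it assumes equal-sized zones forming a `side` x `side` grid, identifies each zone by its (column, row) cell, and ignores any wrap-around of the coordinate space. The name `can_route` is hypothetical.

```python
def can_route(start: tuple[int, int], target: tuple[int, int], side: int):
    """Greedy routing between grid zones: always step to the closest neighbor."""
    path = [start]
    current = start
    while current != target:
        x, y = current
        # Neighbors are the zones sharing an edge with the current zone.
        neighbors = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < side and 0 <= y + dy < side
        ]
        # Forward to the neighbor closest (squared Euclidean) to the target.
        current = min(
            neighbors,
            key=lambda z: (z[0] - target[0]) ** 2 + (z[1] - target[1]) ** 2,
        )
        path.append(current)
    return path
```

On this regular grid the hop count equals the Manhattan distance between the two zones.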

Chord [SMKKB’01]
Nodes and resources are mapped to an identifier circle.
Routing table: successor nodes at distances 1, 2, 4, ... (powers of two).
Greedy routing: forward to the node in the routing table closest to the target.
[Figure: identifier circle (n = 8) with successors.]
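
A minimal sketch of greedy routing on an identifier circle, assuming every one of the 2^m identifiers is occupied by a live node and each node keeps links at clockwise distances 1, 2, 4, ..., 2^(m-1). Real Chord handles sparse identifier spaces through successor lookups, which this sketch omits. The name `chord_route` is hypothetical.

```python
def chord_route(start: int, target: int, m: int = 3) -> list[int]:
    """Route clockwise on a circle of 2**m identifiers using power-of-two jumps."""
    size = 2 ** m
    current = start % size
    target %= size
    path = [current]
    while current != target:
        remaining = (target - current) % size  # clockwise distance to target
        # Take the largest routing-table jump that does not overshoot.
        jump = 1 << (remaining.bit_length() - 1)
        current = (current + jump) % size
        path.append(current)
    return path
```

Each hop clears the highest set bit of the remaining distance, so at most m hops are needed in this idealized setting.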

Common underlying structure
Underlying metric space.
Nodes embedded in the metric space.
Location determined by key.
Hashing to balance load.
Greedy routing.
O(log n) space at each node.
O(log n) routing time.

Unifying approach
[Figure: keys are hashed to nodes v1 to v4 of a virtual overlay network; virtual links and virtual routes in the overlay correspond to physical links and actual routes in the physical network.]

Link distribution
Each node independently selects k long-hop links according to some distribution.
[Figure: nodes on a line; node x has long-hop links to $x - d_1$ and $x - d_2$.]

Abstract model
Simple metric space: a 1D line.
Hash(key) = metric space location.
Short-hop links: immediate neighbors.
Long-hop links: inverse-distance distribution,
$$\Pr[\text{edge}(u,v)] = \frac{1/d(u,v)}{\sum_{v' \ne u} 1/d(u,v')}$$
Greedy routing: forward the message to the neighbor closest to the target in the metric space.
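
The abstract model can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' simulator: nodes sit at integer positions 0..n-1 on a line, each node draws k long-hop links with probability proportional to 1/d (sampling with replacement, so duplicates may collapse), and the helper names `build_links` and `greedy_route` are hypothetical.

```python
import random


def build_links(n: int, k: int, seed: int = 0) -> dict[int, set[int]]:
    """Give every node its short-hop neighbors plus k inverse-distance long hops."""
    rng = random.Random(seed)
    links = {}
    for u in range(n):
        others = [v for v in range(n) if v != u]
        weights = [1.0 / abs(u - v) for v in others]  # Pr proportional to 1/d(u,v)
        neighbors = set(rng.choices(others, weights=weights, k=k))
        if u > 0:
            neighbors.add(u - 1)  # short-hop link to the left neighbor
        if u < n - 1:
            neighbors.add(u + 1)  # short-hop link to the right neighbor
        links[u] = neighbors
    return links


def greedy_route(links: dict[int, set[int]], source: int, target: int) -> list[int]:
    """Forward to the neighbor closest to the target until it is reached."""
    path = [source]
    current = source
    while current != target:
        current = min(links[current], key=lambda v: abs(v - target))
        path.append(current)
    return path
```

Because every node keeps its immediate neighbors, some link always moves strictly closer to the target, so the route terminates.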

What do we care about?
Do we get similar upper bounds on routing time with failures?
Is it possible to design a link distribution that beats the $O(\log^2 n)$ bound for routing given by the 1/d distribution?
Can we dynamically construct such a network?

Greedy routing with failures
Analyze message delivery in phases [Kleinberg ’99].
A message at node n is in phase i when $2^i \le d(n, t) < 2^{i+1}$, where t is the target.
There are at most $(\log n + 1)$ such phases.
[Figure: target t surrounded by Phase 0, Phase 1, Phase 2.]
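
The phase of a message depends only on its current distance to the target. A small sketch, assuming integer distances of at least 1 and base-2 logarithms; the function name `phase` is hypothetical.

```python
def phase(distance: int) -> int:
    """Return i such that 2**i <= distance < 2**(i + 1)."""
    if distance < 1:
        raise ValueError("distance must be at least 1")
    # For a positive integer, bit_length() - 1 is floor(log2(distance)).
    return distance.bit_length() - 1
```

For distances between 1 and n = 1024 this yields phases 0 through 10, which is log n + 1 = 11 distinct phases.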

[1..log n] long-hop links
Suppose each node has k long-hop links.
Average time spent in each phase: $O((\log n)/k)$.
With $O(\log n)$ such phases, total time: $O((\log^2 n)/k)$.
With failures: suppose each node/link fails with probability $(1-p)$.
Average time spent in each phase: $O((\log n)/(pk))$.
Total time: $O((\log^2 n)/(pk))$.
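
As a rough worked instance of how the per-phase and total bounds combine, the block below multiplies the number of phases by the time per phase. The constants hidden by the O-notation are ignored, logarithms are taken base 2, and the values $n = 2^{17}$, $k = 17$, $p = 1/2$ are chosen purely for illustration.

```latex
\[
\underbrace{(\log n + 1)}_{\text{phases}} \cdot
\underbrace{O\!\left(\frac{\log n}{pk}\right)}_{\text{time per phase}}
= O\!\left(\frac{\log^2 n}{pk}\right),
\qquad
\left.\frac{\log^2 n}{pk}\right|_{n = 2^{17},\, k = 17,\, p = 1/2}
= \frac{17^2}{8.5} = 34 .
\]
```

With no failures ($p = 1$) the same expression evaluates to $17^2/17 = 17$, so halving the survival probability doubles the estimate.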

Simulation results
[Figure: simulation results for n nodes (value of n not preserved) with log n = 17 links.]
What happens with > log n links?

What do we care about?
Do we get similar upper bounds on routing time with failures?
Is it possible to design a link distribution that beats the $O(\log^2 n)$ bound for routing given by the 1/d distribution?
Lower bound on routing time as a function of the number of links per node.
Can we dynamically construct such a network?

Intuition for lower bound [KUW’88]
The time needed for a non-increasing real-valued Markov chain $X_0, X_1, X_2, \ldots$ to drop to 1 is bounded by
$$\int_1^{X_0} \frac{1}{\mu_z}\,dz,$$
where $\mu_z = E[X_t - X_{t+1} : X_t = z]$ is a non-decreasing function of z.
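
A short worked example may help. It uses the standard form of the [KUW’88] bound, $T(x) \le \int_1^x dz/\mu_z$ for a chain started at $x$, and an illustrative chain that is an assumption of this example (not taken from the slides): $X_{t+1}$ is uniform on $[0, X_t]$, so the expected drop from $z$ is $z/2$, which is non-decreasing in $z$.

```latex
\begin{align*}
\mu_z &= E[X_t - X_{t+1} : X_t = z] = z - \frac{z}{2} = \frac{z}{2}, \\
T(x) &\le \int_1^x \frac{1}{\mu_z}\,dz = \int_1^x \frac{2}{z}\,dz = 2 \ln x .
\end{align*}
```

So a chain that loses half its value in expectation at every step drops from $x$ to 1 in $O(\ln x)$ expected steps, as one would hope.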

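Upper bound on time
[Figure: a trip from SFO to NYC, starting at x and passing through z.]
$\mu_z$ gives a lower bound on the average speed when crossing z, because $\mu$ is non-decreasing.
Integrating $1/\mu_z$ over z up to the starting point x gives an upper bound on the time.
Starting from x, average speed at z = [expression not preserved].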

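Lower bound on time
[Figure: a trip from SFO to NYC, starting at x and passing through z.]
$m_z$, a supremum of the average speeds at states near z, gives an upper bound on the average crossing speed.
This may give too large an estimate, so condition against high bursts of speed.
Integrating $1/m_z$ over z gives a lower bound on the time.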

Tool for lower bound
Non-increasing Markov chain $X_0, X_1, X_2, \ldots$ with state space S.
Few long jumps: $\Pr[X_t - X_{t+1} \ge U : X_t = x] \le \epsilon$.
Upper bound on speed [no long jumps]: $m_z = \sup\{\mu_x : x \in S,\ x \in [z, z+U)\}$, where $\mu_x = E[X_t - X_{t+1} : X_t = x,\ X_t - X_{t+1} < U]$.
Time from x [no long jumps]: $T(x) = \int_0^x \frac{1}{m_z}\,dz$.
Then $E[\text{time to reach } 0] \ge \dfrac{T(X_0)}{\epsilon\,T(X_0) + (1-\epsilon)}$.

Applying tool to routing
Cannot bound the progress of a single node with an arbitrary distribution!
So use an aggregate chain $S_t$ of nodes to capture the collective behavior of nodes in some range.
Track $\ln(|S_t|)$ for the recurrence relation.
[Figure: nodes on a line around 0 with links of lengths $d_1$, $d_2$, $d_3$; the range $S_t$ shrinks to $S_{t+1}$.]

Lower bounds
Random graph G. Node x has k independent links on average.
x links to (x-1) and (x+1).
Probability of choosing links is symmetric about 0 and unimodal.
Expected time to reach 0 from a point chosen uniformly from 1..n: $\Omega(\ln^2 n)$.
This is worse than the $O(\ln n)$ for a tree: the cost of assuming symmetry between nodes.
1-sided routing and 2-sided routing: [details not preserved; the figure marked which links are ignored in each case].

What do we care about?
Do we get similar upper bounds on routing time with failures?
Is it possible to design a link distribution that beats the $O(\log^2 n)$ bound for routing given by the 1/d distribution?
Can we dynamically construct such a network?

Heuristic for construction
A new node chooses neighbors using the inverse-distance distribution.
It links to the live nodes closest to the chosen ones.
It selects older nodes to point to it.
[Figure: new node x and older node y, showing ideal links, adjusted links, absent nodes, initial links, and new links.]
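
The first two steps of the heuristic can be sketched as below. This is an assumption-laden illustration, not the authors' implementation: positions are integers in a 1D space of size `space_size`, `live_sorted` is a non-empty sorted list of positions of existing nodes, and the names `closest_live` and `choose_links` are hypothetical. The third step (older nodes redirecting links to the new node) is left out because the slide does not state the rule used.

```python
import bisect
import random


def closest_live(live_sorted: list[int], position: int) -> int:
    """Return the live node nearest to position (ties go to the lower one)."""
    i = bisect.bisect_left(live_sorted, position)
    # Only the live nodes just below and just above the position can be nearest.
    candidates = live_sorted[max(i - 1, 0): i + 1]
    return min(candidates, key=lambda v: abs(v - position))


def choose_links(new_pos, live_sorted, space_size, k, rng):
    """Pick k ideal targets by inverse distance, then adjust each to a live node."""
    positions = [v for v in range(space_size) if v != new_pos]
    weights = [1.0 / abs(new_pos - v) for v in positions]
    ideals = rng.choices(positions, weights=weights, k=k)
    # Each pair is (ideal link target, adjusted link target that is actually live).
    return [(ideal, closest_live(live_sorted, ideal)) for ideal in ideals]
```

A call such as `choose_links(5, [2, 10, 20], 32, 4, random.Random(0))` returns four (ideal, adjusted) pairs whose adjusted targets are all live nodes.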

Open problems
Design a self-stabilization mechanism. [Aspnes, Shah ’02], submitted to SODA.
Does the lower bound generalize to multidimensional metric spaces?
Analyze security properties such as anonymity and Byzantine failures.
Does backtracking give a provably good routing bound?