
1 Pastry: Scalable, decentralized object location and routing for large-scale peer-to-peer systems Antony Rowstron and Peter Druschel Proc. of the 18th IFIP/ACM International Conference on Distributed Systems Platforms, November 2001.

2 Outline
Pastry Goals
Routing
Adding/Deleting Nodes
Use of Locality
Results
Conclusions

3 Goals of Pastry
Create a robust P2P overlay substrate
Provide efficient object location and routing services
Make it highly scalable (100,000+ nodes)
Use locality to speed routing over the underlying network
Fault tolerance: self-repair, good replication support
Non-goals: no authentication or encryption; no naming or attribute-based queries

4 Related Work
Chord: another overlay substrate; overlay routing performance is also O(log N), but it has no notion of locality
Built on top of Pastry: PAST (persistent file storage) and SCRIBE (publish/subscribe application)

5 Overlay Substrate
A Pastry overlay network consists of nodes; each node ID is a random 128-bit number.
IDs are uniformly distributed in a circular 128-bit space, so numerically adjacent IDs are likely to be widely dispersed in the underlying physical network. This provides fault tolerance if you replicate data on nodes adjacent in ID space.
Pastry basically provides a message delivery service: give Pastry a 128-bit key and a message, and Pastry routes the message to the node whose ID is numerically closest to the key value. Like a distributed hash table.

6 Pastry API
Pastry's API only supports message delivery; it is a very simple API!
nodeId = pastryInit(credentials, appHandle) // joins the network and returns this node's ID
route(msg, key) // route msg to the node with the ID closest to key
Pastry requires applications to export:
deliver(msg, key) // msg arrived for this node
forward(msg, key, nextId) // about to forward msg
newLeafs(leafSet) // notification that the leaf set changed
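The exported-callback side of this API can be sketched as a Python interface. Only the three callback names come from the slide; the class names and Python-style method spellings are my own illustration, not the paper's code:

```python
class PastryApp:
    """Callbacks a Pastry application must export (per the slide)."""
    def deliver(self, msg, key):
        """msg arrived at this node (its ID is numerically closest to key)."""
    def forward(self, msg, key, next_id):
        """Called just before msg is forwarded toward key via next_id."""
    def new_leafs(self, leaf_set):
        """Notification that this node's leaf set changed."""

# A trivial application that just records delivered messages:
class Recorder(PastryApp):
    def __init__(self):
        self.seen = []
    def deliver(self, msg, key):
        self.seen.append((key, msg))

app = Recorder()
app.deliver("hello", "d46a1c")
print(app.seen)  # [('d46a1c', 'hello')]
```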

7 Basic Routing
IDs are treated as a sequence of digits in base 2^b. A good value for b is 4, so a node or key ID is a sequence of 32 hex digits.
Routing is based on address prefixes: nodes forward messages to a node "closer" to the key ID, where closer means a longer common prefix.
If the current node ID shares its first n digits with the key ID, forward to a node that shares > n prefix digits with the key ID.
If no node ID with a longer common prefix is in the routing table, then forward to a node that is numerically closer to the key ID.
Requires fewer than ⌈log_{2^b} N⌉ steps.
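The prefix comparison that drives this rule can be sketched in Python. This is a hypothetical helper of my own, assuming IDs are hex strings (i.e. b = 4):

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading base-2^b digits (here: hex digits) two IDs share."""
    n = 0
    for da, db in zip(a, b):
        if da != db:
            break
        n += 1
    return n

# The slides' routing example: 65a1fc forwarding toward key d46a1c.
print(shared_prefix_len("65a1fc", "d46a1c"))  # 0: no digits shared yet
print(shared_prefix_len("d4213f", "d46a1c"))  # 2: "d4" shared
```

Each hop must strictly increase this value (or, in the fallback case, reduce the numeric distance to the key), which is why routing terminates in at most one hop per ID digit.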

8 Basic Routing
Routing from 65a1fc to d46a1c: at each hop, the shared prefix gets longer (or the current node is at least numerically closer to the key).

9 Node State for Routing
To support routing, each node maintains three pieces of state: a leaf set, a "neighborhood" set, and a routing table.
Leaf set: the IDs/IPs of the nearest nodes in ID space; the L/2 nearest nodes with greater IDs and the L/2 with lower IDs. Typical L is 16 or 32, i.e. 8 or 16 adjacent nodes on each side of the current node.
Neighborhood set: nodes close by the locality metric. Not used in routing, but used when populating routing tables. Typical set size is 32 nodes.

10 Node State for Routing
The routing table holds forwarding info; each entry has an appropriate node ID and its IP address.
The table has approximately log_{2^b} N rows and 2^b columns. For 100K nodes with b=4: 5 rows and 16 columns.
Every entry in row n shares the same n-digit prefix with the current node (e.g. every entry in row 2 shares a 2-digit prefix with the node).
The (n+1)st digit of each entry in row n equals its column number. E.g. the entry at row 5, column 2 shares its first 5 digits with the current node's ID, and its 6th digit is "2".
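The table dimensions on this slide follow directly from N and b; a quick sketch of the arithmetic (mine, not from the paper):

```python
import math

b = 4
N = 100_000
cols = 2 ** b                           # one column per possible next digit
rows = math.ceil(math.log(N, 2 ** b))   # digits needed to distinguish N IDs
print(rows, cols)  # 5 16
```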

11 Example Routing Table
Example in base 4, with L = 8 leafs and M = 8 neighbors.
Each row down, the shared prefix is one digit longer; the next digit after the shared prefix is the column number.

12 Routing Algorithm
Given a key to forward, a node does the following:
If the key is within range of the leaf set, forward to the numerically closest leaf. Done.
Else if the routing table has a node with a longer common prefix with the key than the current node, forward to it.
Else forward to a node with a same-length prefix but numerically closer to the key.
Routing only fails if the L/2 leafs closer to the key fail simultaneously. Unlikely, since adjacent leaf nodes are widely distributed in the underlying network.
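The three cases above can be sketched as one Python function. This is a simplified illustration under my own assumptions (IDs as hex strings, the routing table as a dict keyed by (row, column)), not the paper's implementation:

```python
def prefix_len(a, b):
    """Leading hex digits shared by two IDs."""
    n = 0
    for da, db in zip(a, b):
        if da != db:
            break
        n += 1
    return n

def next_hop(cur, key, leaf_set, table):
    """Pick the next hop for `key` at node `cur`.
    leaf_set: IDs numerically adjacent to cur; table: (row, col) -> node ID."""
    val = lambda x: int(x, 16)
    dist = lambda n: abs(val(n) - val(key))
    nearby = leaf_set + [cur]
    # Case 1: key falls in the leaf-set range -> deliver to the closest node.
    if val(min(nearby, key=val)) <= val(key) <= val(max(nearby, key=val)):
        return min(nearby, key=dist)
    # Case 2: routing-table entry sharing a longer prefix with the key.
    row = prefix_len(cur, key)
    entry = table.get((row, int(key[row], 16)))
    if entry is not None:
        return entry
    # Case 3: any known node with an equal-length prefix but numerically closer.
    known = [n for n in leaf_set + list(table.values())
             if prefix_len(n, key) >= row and dist(n) < dist(cur)]
    return min(known, key=dist) if known else cur

# At 65a1fc, key d46a1c is outside the leaf range, so case 2 fires:
print(next_hop("65a1fc", "d46a1c",
               ["65a0aa", "65a2bd"], {(0, 0xd): "d13da3"}))  # d13da3
```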

13 Adding a Node
Assumption: a new node joins via a node that is close in terms of the locality metric. Valid if IP multicast is enabled; otherwise this is a problem.
Send a join msg addressed to your own ID (no one else knows it yet); it gets routed to the node with the ID closest to yours. Use that node's leaf set as the basis for yours.
Get neighbors from the node you joined through, since it is assumed to be close.
Get state info from all nodes along the route and from neighbors; build the routing table from this aggregate state, using the best nodes in terms of locality.
Cost: approx. 3 × 2^b × log_{2^b} N ≈ 200+ RPCs for 100K nodes and b=4.

14 Adding a Node
[Figure: sources of the new node's Neighborhood Set, Leaf Set, and Routing Table.]

15 Losing a Node
Leaf node fails: get an adjacent leaf to send you its leaf set.
Routing entry fails: contact other nodes in that row for a replacement; if they don't have one, ask nodes in the next row. Experiments show it takes about 57 RPC calls to fix.
Ping the neighborhood set periodically; if you lose one, ask another neighbor for its set.

16 Locality
Assumption: the locality space obeys the triangle inequality. Not true for IP hops; the authors are looking into it.
The assumption is not necessarily true, but the authors claim experimental results show the algorithm still works.

17 Locating the Nearest of k Nodes
If data is replicated at k nodes (e.g. a leaf set), we want to find the nearest of the k in terms of locality.
The algorithm works pretty well as described, but adding a heuristic can help.
Heuristic: estimate node density by looking at the current node's leafs. Based on this density estimate, switch to routing toward the numerically nearest address to find the nearest replica, instead of using the routing table to reach the node with the ID closest to the key ID.

18 Heuristic Results
Reaching the nearest of k nodes in a 10,000-node network:
Without the heuristic: closest replica 68% of the time, closest or second best 87%.
With the heuristic: closest 76%, closest or second best 92%.

19 Results: Overlay Routing
Simulation with 100K nodes, run in one JVM.
Overlay routing performance with b=4, L=16, and 32 neighbors: average number of hops ≈ 4, max = 5.
Not surprising, given the uniform distribution of IDs: 16^5 > 100K, so 5 hex digits suffice to make an ID unique.

20 Results: Overlay Routing

21 Results: Locality
Using distance in a plane as the locality metric, with node positions picked randomly.
Compared the sum of distances over all hops against a straight line from source to destination: the routed distance was only 30-40% longer than the straight line.
This is impressive given how small the routing tables are; for 100K nodes and b=4, the routing table has about 75 entries.
Interesting, but it doesn't necessarily say much about the Internet.

22 Results: Locality

23 Optional: Some Vulnerabilities
A malicious node can silently consume messages. There is currently no way to detect this, but one can defend against it: randomize routing, usually taking the best hop but not always. After multiple queries one should avoid the bad node.
Partitioning can occur: if connectivity goes away for a while, node groups may "repair" all their connections away from the rest of the overlay, and when connectivity comes back up they are still partitioned. No good solution; the authors suggest periodic IP multicast.

24 Conclusion
Issues to solve: joining is a problem, especially since a new node needs to join via a node close in locality; no security is built in; and it is unclear how well the locality metric works in the real world.
Strengths: overlay routing is very efficient, with few node hops and very small routing tables, and locality works well in the simulation environment.
Overall, Pastry appears to be a very scalable solution.

