Pastry and Squirrel. Presented by Eirik T. Laberg, Håvard Semundseth and Orri G. Pálsson.


What is Pastry? An overlay network that handles routing between nodes and object location. Each node is assigned a unique 128-bit nodeId, derived from a SHA-1 hash (160 bits, so only part of the digest is used) of either the node's public key or its IP address.
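As a minimal sketch of how such a nodeId could be derived, the snippet below hashes an identity with SHA-1 and keeps 128 bits. Keeping the first 16 bytes of the digest, the helper names `node_id` and `to_digits`, and the digit width `b = 4` are assumptions made for illustration, not details taken from the slides.

```python
import hashlib

NODE_ID_BITS = 128  # nodeId width given on the slide


def node_id(identity: bytes) -> int:
    """Derive a 128-bit nodeId from a public key or an IP address given as bytes."""
    digest = hashlib.sha1(identity).digest()  # SHA-1 yields 20 bytes (160 bits)
    # Assumption: keep the first 16 bytes of the digest to obtain 128 bits.
    return int.from_bytes(digest[: NODE_ID_BITS // 8], "big")


def to_digits(value: int, b: int = 4) -> list:
    """Split an id into base-2**b digits, most significant digit first."""
    count = NODE_ID_BITS // b
    mask = (1 << b) - 1
    return [(value >> (b * (count - 1 - i))) & mask for i in range(count)]
```

With `b = 4` an id becomes 32 hexadecimal digits, which is the digit view that prefix-based routing works on.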

Pastry Node. Leaf set (L): the set of nodes whose nodeIds are numerically closest to the present node, half with larger and half with smaller nodeIds. The leaf set is mainly used when routing messages. Routing table (R): consists of a number of rows, where row i contains nodes that share the first i digits of their nodeId with the local node.

Pastry Node (2). Neighborhood set (M): contains the nodeIds and IP addresses of the |M| nodes that are closest to the local node according to the proximity metric. The neighborhood set is used as a starting point when constructing the routing table and for maintaining locality properties.

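Routing. A message with key D arrives at the node with nodeId A. The node checks whether the key falls within the range of nodeIds in its leaf set. If yes, it forwards the message to the destination node, the leaf-set node numerically closest to the key. If no, it uses the routing table and forwards to a node whose nodeId shares a prefix with the key that is at least one digit longer than the prefix shared with A. If that table entry is empty or the node is unreachable, it forwards to a node whose nodeId shares a prefix with the key at least as long as A's and is numerically closer to the key than A.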
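The three routing cases can be sketched as follows. This is an illustrative simplification, not the authors' implementation: ids are short hexadecimal strings, the routing table is a hypothetical dictionary keyed by `(row, digit)`, wrap-around of the circular id space is ignored, and unreachable nodes are not modeled.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def distance(a: str, b: str) -> int:
    """Numerical distance between two hex ids (wrap-around ignored)."""
    return abs(int(a, 16) - int(b, 16))


def next_hop(local_id, key, leaf_set, routing_table):
    """Return the id to forward to, or local_id if the message stays here."""
    members = leaf_set + [local_id]
    values = [int(m, 16) for m in members]
    # Case 1: key within the leaf-set range -> numerically closest member.
    if min(values) <= int(key, 16) <= max(values):
        return min(members, key=lambda m: distance(m, key))
    # Case 2: routing table entry that shares one more digit with the key.
    row = shared_prefix_len(local_id, key)
    entry = routing_table.get((row, key[row]))
    if entry is not None:
        return entry
    # Case 3: any known node with a prefix at least as long that is
    # numerically closer to the key than the local node.
    known = leaf_set + list(routing_table.values())
    better = [n for n in known
              if shared_prefix_len(n, key) >= row
              and distance(n, key) < distance(local_id, key)]
    return min(better, key=lambda n: distance(n, key)) if better else local_id
```

For a node `10a3` with leaf set `["1090", "10b5"]`, a key such as `10a9` is handled by case 1, while a key such as `2abc` falls through to the routing table lookup in case 2.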

Routing (2) Example:

Node Arrival (1/3). The new node has nodeId X, and A is a nearby Pastry node. It is assumed that X initially knows about A. Node X asks A to route a “join” message with a key equal to X. Pastry routes the join message to the existing node Z whose nodeId is numerically closest to X. Nodes A, Z and all nodes encountered on the path send their state tables to X.

Node Arrival (2/3)

Node Arrival (3/3). The new node X initializes its own state tables. The neighborhood set is initialized with the neighborhood set of A, which is close to X in the proximity metric. Since Z is numerically closest to X, X's leaf set is initialized with Z's leaf set. Row 0 (R0) of A's routing table is used to initialize row 0 of X, row 1 (R1) of the routing table of B, the next node on the path, is used to initialize row 1 of X, and so on. Node X then transmits a copy of its resulting state to all nodes in its neighborhood set (M), leaf set (L) and routing table (R). Those nodes update their own state based on the information received.
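The initialization of X's state from the nodes on the join path can be sketched as below. The `NodeState` class and `init_state` function are hypothetical names, and the sketch only copies state as described on the slide; it does not add A or Z themselves to X's sets, remove X's own id, or perform the later state exchange.

```python
from dataclasses import dataclass, field


@dataclass
class NodeState:
    node_id: str
    leaf_set: list = field(default_factory=list)
    neighborhood: list = field(default_factory=list)
    # routing_rows[i] maps a digit to the nodeId stored in row i
    routing_rows: list = field(default_factory=list)


def init_state(new_id: str, path: list) -> NodeState:
    """Build the new node's state from the nodes A, B, ..., Z on the join path."""
    first, last = path[0], path[-1]
    state = NodeState(new_id)
    state.neighborhood = list(first.neighborhood)  # from A, close in proximity
    state.leaf_set = list(last.leaf_set)           # from Z, numerically closest
    for i, node in enumerate(path):
        # Row i is taken from the i-th node encountered on the path.
        row = dict(node.routing_rows[i]) if i < len(node.routing_rows) else {}
        state.routing_rows.append(row)
    return state
```

Copying the lists and dictionaries keeps the new node's state independent of the state objects it was built from.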

Locality. What do we mean by locality? The ability to exploit “local” resources over “global” ones whenever possible. The route chosen for a message is likely to be “good” with respect to the proximity metric. How can we maintain this property when a new node X arrives? The locality property is discussed with regard to: locality in the routing table, route locality, and locating the nearest among k nodes.

Locality in the routing table. Goal: every routing table entry should refer to a node that is near the present node according to the proximity metric, among all live nodes with the appropriate prefix for that entry. This property must be maintained when a new node X arrives. Stage one: require that node A is near X and that the entries in A's R0 are close to A according to the proximity metric. We also assume that the entries in node B's R1 are a reasonable choice for R1 of X. When the new node X initializes its state in this fashion, its routing table (R) and neighborhood set (M) approximate the desired locality property. Problem: the quality of the approximation must be improved to avoid cascading errors that could lead to poor route locality. Solution: use a second stage. Node X requests the state from each of the nodes in its routing table and neighborhood set and updates its entries to closer nodes where it finds them. The neighborhood set contributes valuable information here.

Route locality. Each routing step moves the message closer to the destination in the nodeId space, while traveling the least possible distance in the proximity space. Given that a message routed from A to B at distance d cannot subsequently be routed to a node at a distance of less than d from A, and that only “local” information is used, Pastry minimizes the distance of the next routing step with no sense of global direction. It does not guarantee that the shortest path from source to destination is chosen.

Locating the nearest among k nodes. Goal: peer-to-peer applications may want to use Pastry to replicate information on k Pastry nodes, namely the k nodes whose nodeIds are numerically closest to a given key. A message routed from a client application (CA) should first reach a node near the CA among the k nodes numerically closest to the key. Problem: since Pastry routing is primarily based on nodeId prefixes, it may miss nearby nodes whose prefix differs from the key's. Solution: Pastry uses a heuristic to overcome prefix mismatches. It detects when a message approaches the set of k nodes and switches to routing based on the numerically nearest address. The heuristic is based on estimating the density of nodeIds.

Node departure. A node is considered failed when its immediate neighbors in the nodeId space can no longer communicate with it. Failed node in the leaf set: the failed node's neighbor in the nodeId space contacts the node with the largest index in L on the side of the failed node and asks for its leaf set. Failed node in the routing table: the node contacts another node in the same row and asks for that node's entry for the failed slot. If it cannot find a suitable node in the same row, it tries the nodes in the next row.

Arbitrary node failures. A node continues to be responsive but behaves incorrectly or even maliciously. Repeated queries fail each time, since they normally take the same route. Solution: routing can be randomized. The choice among multiple nodes that satisfy the routing criteria should be made randomly.
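A minimal sketch of randomized next-hop selection follows. It assumes short hexadecimal ids, ignores wrap-around of the id space, and picks uniformly among all eligible nodes; the function names and the uniform choice are illustrative assumptions, and a real deployment might weight the choice differently.

```python
import random


def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def eligible_hops(local_id: str, key: str, known: list) -> list:
    """Nodes that satisfy the routing criteria: a shared prefix with the key
    at least as long as the local node's, and numerically closer to the key."""
    own_prefix = shared_prefix_len(local_id, key)
    own_dist = abs(int(local_id, 16) - int(key, 16))
    return [n for n in known
            if shared_prefix_len(n, key) >= own_prefix
            and abs(int(n, 16) - int(key, 16)) < own_dist]


def random_next_hop(local_id: str, key: str, known: list, rng=random):
    """Pick one eligible node at random, or None if there is none."""
    options = eligible_hops(local_id, key, known)
    return rng.choice(options) if options else None
```

Because each retry may take a different route, a query that failed at a faulty node has a chance of avoiding that node the next time.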

Some experimental results. Quad-processor Compaq AlphaServer with Alpha CPUs, 6 GBytes of main memory, Tru64 UNIX. Implemented in Java. The Pastry nodes were configured to run in a single Java VM.

Routing performance

Maintaining the network (1/2)

Maintaining the network (2/2). Shows the quality of the routing tables with respect to the locality property, and how the information exchanged during the join operation affects that quality. Optimal: the entry is the best one (closest according to the proximity metric). Sub-optimal: the entry is not the closest, or is missing. SL: considers only the appropriate row from each node along the route. WT: fetches the entire state of each node along the path, omitting the second stage of the update. WTF: WT plus the second stage of the update. Result: Pastry's method of node integration (“WTF”) is effective in initializing the routing tables with good locality. Less information exchange during the join operation means lower quality with respect to locality.

Conclusion. Pastry is a generic peer-to-peer content location and routing system. It scales well, is used for applications such as global file sharing and file storage, and takes locality into account when routing messages.

What is Squirrel? Squirrel is an alternative to caches that are deployed on dedicated machines at the boundaries of corporate LANs. Client desktop machines cooperate in a peer-to-peer fashion inside the LAN to provide the functionality of a proxy.

Traditional approach. One machine that has to be capable of handling peak traffic loads. Expensive hardware and administrative costs. Growth in the number of users requires hardware upgrades. Single point of failure.

Web caching. If the object is found in the cache server, it is tested for freshness (TTL). If it is fresh, the object is returned; otherwise a conditional GET (cGET) request is generated by the browser. There are two types of cGET: If-Modified-Since (uses a timestamp) and If-None-Match (uses an ETag, a hash of the web content). The response to a cGET contains either the content or a not-modified message.
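The freshness test and the two kinds of conditional GET can be sketched as below. `If-Modified-Since`, `If-None-Match` and the 304 status are standard HTTP; the function names and the simple "age below TTL" freshness rule are assumptions made for illustration.

```python
def is_fresh(fetch_time: float, ttl: float, now: float) -> bool:
    """A cached object is fresh while its age is below its time-to-live."""
    return now - fetch_time < ttl


def conditional_headers(last_modified=None, etag=None) -> dict:
    """Build the validator headers for a conditional GET."""
    headers = {}
    if last_modified is not None:
        headers["If-Modified-Since"] = last_modified  # timestamp-based validator
    if etag is not None:
        headers["If-None-Match"] = etag               # ETag-based validator
    return headers


def resolve(status: int, body, cached_body):
    """Interpret the cGET response: 304 keeps the cached copy, 200 replaces it."""
    if status == 304:  # Not Modified
        return cached_body
    if status == 200:
        return body
    raise ValueError(f"unexpected status {status}")
```

A stale object therefore costs one small round trip when it is unchanged, and a full transfer only when the origin server has a newer version.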

Pastry. Squirrel uses Pastry to locate web objects by mapping the URL to a node. It hashes the URL and uses the hash as the key. If the web browser does not find the requested object in its own cache, Squirrel tries to locate a copy on another node.

Models: the home-store model and the directory model.

Home-store model. The home node of an object is the node whose nodeId is numerically closest to the object's objectId. All external requests are routed through the home node.
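Mapping a URL to its home node can be sketched as follows. The use of SHA-1 truncated to 128 bits for the objectId, the circular distance measure and the function names are assumptions for illustration. The sketch also selects the home node from a complete list of nodeIds, whereas in Squirrel this lookup is performed by Pastry routing.

```python
import hashlib

ID_SPACE = 1 << 128  # size of the 128-bit circular id space


def object_id(url: str) -> int:
    """Hash the URL to a 128-bit objectId (truncated SHA-1, an assumption)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


def ring_distance(a: int, b: int) -> int:
    """Numerical distance on the circular id space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def home_node(url: str, node_ids: list) -> int:
    """Return the nodeId numerically closest to the URL's objectId."""
    oid = object_id(url)
    return min(node_ids, key=lambda n: ring_distance(n, oid))
```

Since the hash spreads URLs evenly over the id space, each node ends up as home node for roughly the same share of objects.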

Scenario Home store

Directory model. The home node holds a small directory of pointers to nodes (called delegates) that have recently accessed the object. It also stores metadata about the object, such as the ETag, fetch time, last-modified time and TTL. Requests are forwarded to a randomly chosen delegate that is known to have the object.
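A home node's directory can be sketched as below. The class and method names, the cap of four delegates per object, and the choice to add the requesting client to the directory as soon as it is redirected are all assumptions for illustration rather than details confirmed by the slides.

```python
import random
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    etag: str
    fetch_time: float
    ttl: float
    delegates: list = field(default_factory=list)  # most recent last


class HomeNode:
    def __init__(self, max_delegates: int = 4, rng=None):
        self.max_delegates = max_delegates  # hypothetical cap on pointers
        self.rng = rng or random.Random()
        self.directory = {}                 # url -> DirectoryEntry

    def record_fetch(self, url, client, etag, fetch_time, ttl):
        """Called after a client fetched the object from the origin server."""
        entry = self.directory.get(url)
        if entry is None:
            entry = self.directory[url] = DirectoryEntry(etag, fetch_time, ttl)
        else:
            entry.etag, entry.fetch_time, entry.ttl = etag, fetch_time, ttl
        self._add_delegate(entry, client)

    def lookup(self, url, client):
        """Return a random delegate, or None if the client must go to the origin."""
        entry = self.directory.get(url)
        if entry is None or not entry.delegates:
            return None
        delegate = self.rng.choice(entry.delegates)
        self._add_delegate(entry, client)  # the client will soon hold a copy
        return delegate

    def _add_delegate(self, entry, client):
        if client in entry.delegates:
            entry.delegates.remove(client)
        entry.delegates.append(client)
        del entry.delegates[:-self.max_delegates]  # keep only the newest pointers
```

Choosing the delegate at random spreads requests for a popular object over the clients that hold it, instead of sending them all to one node.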

Directory model scenario 1

Directory model scenario 2

Directory model scenario 3

Directory model scenario 4

Arrival, failure and departure. Arrival: the new node automatically becomes a home node. Its two neighboring nodes transfer to it the objects and directories whose objectIds are numerically closest to the newly joined node. Failure: future requests are routed to the node that has become numerically closest to the objectId. Departure: nodes that are able to announce their intention to leave the system can transfer their objects and directories to their neighbors.

External bandwidth consumption. Donating 100 MB of disk space from each client to Squirrel lowers the external bandwidth consumption to the level of a dedicated cache.

Latency. Latency depends on the number of LAN hops. In traditional proxy caching, LAN hops = 2. In Squirrel, LAN hops = 4-7. LAN communication is fast, so the user-perceived latency is minimal.

Load on each node (directory model)
Location      Objects/sec   Objects/min
Redmond*      48            388
Cambridge**   55            125
* 36,782 clients  ** 105 clients

Load on each node (home-store model)
Location      Objects/sec   Objects/min
Redmond*      8             65
Cambridge**   9             35
* 36,782 clients  ** 105 clients
The average load in any given minute is 0.31 objects/min (for Redmond) for both models. Squirrel performs web caching at low cost.

Fault tolerance. The connection to the Internet can be lost due to a router failure. If an internal link or router fails, the network is partitioned, and Squirrel partitions itself into two separate systems. Individual nodes can fail. Most nodes leave voluntarily.