Building Low-maintenance Expressways for P2P Systems Zhichen Xu and Zheng Zhang Presented By Swetha Boinepally.



Overview
- What are "Expressways"?
- CAN
- CAN construction
- Routing in CAN
- Overview of Expressways
- Building Expressways
- Routing with Expressways
- Tuning towards load balance
- On-demand maintenance
- Analysis of routing
- Tuning towards best performance
- Conclusion

Introduction
- An "Expressway" is an auxiliary mechanism for delivering high routing performance.
- Expressways are auxiliary data structures.

Real Life Expressways Vs. Expressways in P2P Systems
- Expressways in P2P systems are analogous to real-world expressways.
- Attributes of expressways:
  - Auxiliary mechanisms, not a wholesale replacement of local routes.
  - Greater span per 'hop'.
  - High bandwidth.
- Unlike real-life expressways, expressways in P2P systems are:
  - Self-organized.
  - Self-maintained.
  - Adaptive to a changing network.

Why CAN?
- Plain CAN has a routing performance of O(N^(1/d)).
- Expressways improve the routing performance to O(log N).
- Highly scalable
- Simple routing algorithm
- Low maintenance cost

Content Addressable Network (CAN)
- CAN is a distributed infrastructure that provides hash-table-like functionality.
- CAN organizes the logical space as a d-dimensional Cartesian coordinate space.
- The entire coordinate space is partitioned among all the nodes.
- Each node owns its individual zone within the overall space.

CAN construction
- All nodes are represented in the coordinate space.
- A new node joins through an existing node that is close to it in IP distance.
- The existing node splits its allocated zone in half.
- The neighbors of the split zone must be notified.
- When a node leaves the network, one of its neighboring nodes takes over its zone.

Example

Routing in CAN
- Each CAN node maintains a coordinate routing table that holds the IP address and the virtual coordinates of each of its neighboring zones.
- A CAN message includes the destination coordinates. Using its neighbor coordinate set, a node routes a message towards its destination by forwarding it to the neighbor whose coordinates are closest to the destination coordinates.
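The greedy forwarding rule can be sketched in a few lines of Python. This is a minimal illustration rather than CAN's actual implementation: it assumes a 4 x 4 grid of equal zones on a 2-d torus, identifies each zone by its grid cell, and the names `torus_dist`, `neighbors`, and `route` are made up for the example.

```python
SIDE = 4  # assumption: a SIDE x SIDE grid of equal zones that wraps around (torus)


def torus_dist(a, b, side=SIDE):
    """Hop distance between two grid cells, with wrap-around on every axis."""
    return sum(min(abs(x - y), side - abs(x - y)) for x, y in zip(a, b))


def neighbors(zone, side=SIDE):
    """Zones that abut `zone`: one step forward or back along each axis."""
    result = []
    for axis in range(len(zone)):
        for step in (-1, 1):
            cell = list(zone)
            cell[axis] = (cell[axis] + step) % side
            result.append(tuple(cell))
    return result


def route(src, dst, side=SIDE):
    """Greedy forwarding: each hop goes to the neighbor closest to dst."""
    path = [src]
    while path[-1] != dst:
        here = path[-1]
        path.append(min(neighbors(here, side), key=lambda z: torus_dist(z, dst, side)))
    return path


hops = len(route((0, 0), (1, 2))) - 1  # number of forwarding steps
```

Each node only looks at its own 2d neighbors, so the per-node state stays constant however many zones exist; the price is that the path length grows with the grid side.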

Example
- We consider a 2-d Cartesian space with 16 equal zones.
- Each node has a maximum of 4 neighbors.
- The maximum routing path length is 3 hops.

D-dimensional Space
- For a d-dimensional space partitioned into n equal zones, the average routing path length is (d/4)(n^(1/d)) hops.
- The number of neighbors of an individual node is 2d.

Scaling Results
- The previous results show that, for a d-dimensional space, we can grow the number of nodes (zones) without increasing the per-node state.
- The average routing path length grows in proportion to n^(1/d).
- Thus the routing performance in CAN is O(n^(1/d)).
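As a sanity check on the (d/4)(n^(1/d)) figure, the following sketch counts hops directly and compares them with the formula. It assumes equal zones arranged as a torus with an even number of zones per side, and it averages over all ordered zone pairs (including a zone paired with itself); the function names are hypothetical.

```python
def mean_hops_on_torus(side, d):
    """Exact mean hop count over all ordered zone pairs of a side^d torus."""
    # Along one axis the wrap-around offset k costs min(k, side - k) hops;
    # greedy routing fixes each axis independently, so the axes add up.
    per_axis = sum(min(k, side - k) for k in range(side)) / side
    return d * per_axis


def formula(n, d):
    """Average path length (d/4) * n^(1/d) for n equal zones in d dimensions."""
    return (d / 4) * n ** (1 / d)


for side, d in [(4, 2), (8, 3), (32, 2)]:
    n = side ** d
    print(f"n={n:5d} d={d}: counted={mean_hops_on_torus(side, d):.2f} "
          f"formula={formula(n, d):.2f}")
```

Under these assumptions the counted and computed values agree, which makes the scaling concrete: going from 16 to 1024 zones with d = 2 multiplies the average path length by 8 while each node still keeps 4 neighbors.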

Overview of Expressways
- Expressways extend CAN with routing tables of increasing span.
- The entire space is partitioned into zones of different spans: the smallest zones are the CAN zones, and all other zones are called Expressway zones.
- Each node owns a CAN zone and is also a resident of the Expressway zones that enclose its CAN zone.
- Each node records its Expressway zones and CAN zone in a data structure called the "total routing table".

Contd…
- In this example, CAN zones are at level 3; four neighboring CAN zones make up one level-2 expressway zone, and four such level-2 zones make up a level-1 zone.
- The total routing table of node 1 consists of CAN's default routing table (plain arcs) and the expressway routing tables (thick, long arcs).

Contd…
- Total routing table R_T: R_T = <R_0, R_1, ..., R_L>, where R_L is the node's default routing table that CAN already builds.
- R_i (i = 0 to L-1) is called an Expressway routing table and has a larger span: the smaller the i, the larger the span.
- For each neighboring expressway zone, the expressway routing table keeps the address of one or more nodes in that zone.
- In the previous figure, CAN zones are at level 3 and each CAN zone is 1/64 of the entire Cartesian space.

Building Expressways
- The algorithm for building expressways is called "evolving snapshot".
- At regular intervals of system growth, a snapshot of the current routing table is taken.
- Each node takes its snapshots independently by observing its zone size, from which it can infer to what stage the system has grown.
- For node x, suppose the current zone size is x.R_L.Z and the target size is x.R_(L-1).Z / K, where K is the span of the expressway. When x's current zone shrinks to its target size, x takes a new snapshot by incrementing L.

Notations Used

Procedure for New Node Join
- When a node y splits with x, it inherits all entries of x's total routing table other than x's current routing table.
- Procedure when a node y joins node x:
  {
    y.R_T = <x.R_0, ..., x.R_(L-1), y.R_L>
    repeat the procedure for testing for a new snapshot
  }
- Both x and y then test whether their current zone has shrunk to 1/K of the zone at their last snapshot.

Algorithm to Test a New Snapshot
- Procedure for testing for a new snapshot (executed by both x and y):
  if (R_L.Z <= R_(L-1).Z / K) {
    R_(L+1) = R_L
    R_T = <R_0, ..., R_L, R_(L+1)>
    L = L + 1
  }
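The snapshot test can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: zone sizes are fractions of the whole space, K = 4, neighbor lists are left opaque, and a node is assumed to start with one frozen snapshot of the whole space plus its live table so that R_(L-1) always exists. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field

K = 4  # assumed expressway span


@dataclass
class RoutingTable:
    zone_size: float                               # R_i.Z as a fraction of the whole space
    neighbors: list = field(default_factory=list)  # R_i.N, left opaque in this sketch


@dataclass
class Node:
    # R_T = <R_0, ..., R_L>; tables[-1] is the live CAN table R_L.
    # Assumption: a node starts with one frozen snapshot of the whole space.
    tables: list = field(default_factory=lambda: [RoutingTable(1.0), RoutingTable(1.0)])

    def test_for_snapshot(self):
        """If R_L.Z <= R_(L-1).Z / K, freeze R_L and continue with a copy as R_(L+1)."""
        current, last = self.tables[-1], self.tables[-2]
        if current.zone_size <= last.zone_size / K:
            self.tables.append(RoutingTable(current.zone_size, list(current.neighbors)))
            return True
        return False

    def split(self):
        """A joining node takes half of our zone; then run the snapshot test."""
        self.tables[-1].zone_size /= 2
        return self.test_for_snapshot()


node = Node()
snapshots = [node.split() for _ in range(6)]  # zone shrinks to 1/64 of the space
```

With K = 4 a snapshot is taken on every second split, so the node needs no global knowledge: its own zone size is enough to decide when the system has grown by another factor of K.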

Expressway Tree
- CAN can be thought of as building a binary tree, since each new node splits a random existing node.
- A node's total routing table can be found by walking down the tree from the root towards the node, picking up the snapshot routing tables along the way.
- The tree rooted at x when it joined the system is called x's expressway tree.

Routing
- If the destination point is within the current zone (R_L.Z), we can route using CAN's default routing table.
- If the destination is in some other zone, the node iterates through its total routing table, starting from the table with the largest span.
- It iterates until it finds a routing table whose zone does not cover the destination.
- If R_i is that routing table, R_i is selected and the message is routed according to R_i to one of R_i's expressway neighbors.

Pseudo-code for Routing With Expressway
- Route with Expressway:
    if (pt ∈ R_L.Z) return;
    for (i = 0; i <= L; i++)
      if (pt ∉ R_i.Z) { route using R_i; break; }
- Route with R_i:
    for (j = 0; j < d; j++)
      if (pt ∉ R_i.Z.U_j) {
        route to x ∈ R_i.N_j that is close to pt;
        break;
      }
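A runnable Python reading of this pseudo-code is sketched below. Several details are assumptions made for the example: zones are axis-aligned boxes given as half-open (lo, hi) ranges, torus wrap-around is ignored, each table stores its neighbors per dimension as (node_id, point) pairs, and `Table`, `covers`, `pick_table`, and `next_hop` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    zone: list                                     # R_i.Z: one (lo, hi) range per dimension
    neighbors: dict = field(default_factory=dict)  # R_i.N: j -> [(node_id, point), ...]


def covers(zone, pt):
    """True if pt lies inside the zone in every dimension."""
    return all(lo <= x < hi for (lo, hi), x in zip(zone, pt))


def pick_table(tables, pt):
    """Index of the table to route with, or None if pt is in our own CAN zone."""
    if covers(tables[-1].zone, pt):       # pt in R_L.Z: nothing to route
        return None
    for i, table in enumerate(tables):    # i = 0 has the largest span
        if not covers(table.zone, pt):
            return i
    return None  # not reached when the zones are nested


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def next_hop(table, pt):
    """Route with R_i: use the first dimension j in which pt falls outside R_i.Z."""
    for j, ((lo, hi), x) in enumerate(zip(table.zone, pt)):
        if not lo <= x < hi:
            candidates = table.neighbors.get(j, [])
            if candidates:
                return min(candidates, key=lambda n: sq_dist(n[1], pt))[0]
    return None


r0 = Table([(0.0, 0.5), (0.0, 0.5)],
           {0: [("east", (0.75, 0.25))], 1: [("north", (0.25, 0.75))]})
r1 = Table([(0.0, 0.25), (0.0, 0.25)],
           {0: [("e1", (0.375, 0.125))], 1: [("n1", (0.125, 0.375))]})
total = [r0, r1]  # R_T, largest span first
```

For a far-away point such as (0.8, 0.1), `pick_table(total, ...)` selects the wide table `r0`, so the message crosses a quarter of the space in one hop; for a nearby point such as (0.3, 0.1) it falls through to the finer table `r1`.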

Tuning Towards Load Balance
- Achieving load balance requires that a node z that routes to x in expressway routing has the option of replacing x with any node inside x's expressway tree.
- Let z.R_i be the routing table used to route to x, and let X be x's expressway zone. We pick a random point pt in X and route to it; whoever owns the zone containing pt now replaces x in z.R_i.
- The number of nodes that can replace x is proportional to the size of x's expressway tree, and therefore to the number of residents inside x's expressway tree.
- Thus the algorithm automatically balances the load among expressways.
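The random-replacement step can be sketched as follows. The sketch assumes the expressway zone X is the unit square with four residents owning one quadrant each; `pick_replacement` and `quadrant_owner` are hypothetical names, and the ownership lookup stands in for actually routing a message to pt.

```python
import random


def pick_replacement(expressway_zone, owner_of, rng=random):
    """Pick a random point pt in zone X and return whoever owns it."""
    pt = tuple(rng.uniform(lo, hi) for lo, hi in expressway_zone)
    return owner_of(pt)


def quadrant_owner(pt):
    """Hypothetical ownership map: four residents, one per quadrant of the unit square."""
    x, y = pt
    return ("S" if y < 0.5 else "N") + ("W" if x < 0.5 else "E")


rng = random.Random(0)  # fixed seed so the run is repeatable
zone_x = [(0.0, 1.0), (0.0, 1.0)]
counts = {}
for _ in range(4000):
    owner = pick_replacement(zone_x, quadrant_owner, rng)
    counts[owner] = counts.get(owner, 0) + 1
```

Because the point is drawn uniformly over the zone, residents are chosen in proportion to the area they own; with equal-sized CAN zones, as assumed here, every resident is about equally likely to become the expressway neighbor.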

Default Vs. Random Algorithms
- This figure shows simulation results for the number of messages each node has to forward with the default routing table and with the routing table constructed using random selection.
- We see that with random selection, more messages are forwarded.

On-demand Maintenance
- When a new node joins the network, it updates its expressway neighbors for the expressway (EW) zones recorded in its total routing table. This maintenance is proportional to the total number of expressway levels, which is O(log_K N).
- When a node departs, one of its neighbors takes over the departed node's responsibilities. However, the departed node may be the expressway neighbor of multiple nodes, and its EW functions need to be taken over by other nodes.

Contd…
- Suppose node x departs from the network and node y attempts to route to node x. The request times out, and node y tries to use the EW with smaller reach.
- Node y then picks a point pt in the space of x recorded in the failed routing table and routes to it. If node z owns the zone in which pt lies, z becomes the replacement of x, repairing y's snapshot.
- On average, the total number of nodes that update their routing table entries is the product of the total number of expressway levels and the number of neighbors in each level, which is O(d · log_K N).

Storage
- Let m be the number of EW levels, with m = log_K N. The total routing table has depth m, so the storage for the routing table increases m-fold.
- For example, if K = 4 and N = 2^20 (about a million nodes), then m is 10 and it takes only a few hundred bytes to store the routing table. Therefore storage is not a concern.
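The arithmetic behind this example can be spelled out as follows. The entry count assumes 2d neighbors per level, as in plain CAN, and d = 2 is chosen here only for illustration.

```latex
m = \log_K N = \log_4 2^{20} = \frac{20}{\log_2 4} = 10,
\qquad
\text{entries} \approx m \cdot 2d = 10 \cdot 4 = 40 \quad (d = 2).
```

At a handful of bytes per entry (an assumption; the slides do not give an entry size), 40 entries is consistent with the few hundred bytes quoted in the example.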

Routing with expressways
- Routing with expressways involves at most m levels of CAN-like routing.
- The routing algorithm iterates through the table to get down to the level where the expressway zone does not cover the destination. It then follows CAN routing to reach an expressway zone that covers the destination.

Analysis
- Routing within any level takes (d/3)K^(1/d) hops on average, and the routing travels through at most log_K N levels. Thus there are (log_K N)(d/3)K^(1/d) hops in the worst case.
- The bigger K is, the fewer levels of expressways there are, but the more expensive routing at each level becomes, so we need to find the optimal K.

Contd…
- Let \( f(x) = \frac{d}{3}\, x^{1/d} \log_x N = \frac{d}{3}\, x^{1/d}\, \frac{\ln N}{\ln x} \).
- Taking the derivative and setting it to zero:
  \[ f'(x) = 0 \;\Longrightarrow\; \frac{1}{d}\, x^{1/d - 1}\, \frac{\ln N}{\ln x} = x^{1/d - 1}\, \frac{\ln N}{(\ln x)^2} \]
  which simplifies to \( \frac{1}{d} = \frac{1}{\ln x} \), so \( x = e^d \).
- Thus the optimal K is \( e^d \), and the optimal routing performance is
  \[ f(e^d) = \frac{d}{3}\,(e^d)^{1/d} \log_{e^d} N = \frac{d}{3}\, e\, \frac{\ln N}{d} = \frac{e}{3} \ln N \]
- Thus the routing performance of CAN with expressways is O(ln N), independent of the choice of d.
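The optimum can also be checked numerically. The sketch below evaluates the same hop estimate f(K) = (d/3) K^(1/d) log_K N for N = 2^20 and compares the best integer K found by brute force with e^d; the function names and the search bound `max_k` are choices made for this example.

```python
import math


def expected_hops(k, d, n):
    """f(K) = (d/3) * K^(1/d) * log_K(N), the hop estimate from the analysis."""
    return (d / 3) * k ** (1 / d) * math.log(n) / math.log(k)


def best_integer_span(d, n, max_k=4096):
    """Brute-force the integer K in [2, max_k] that minimizes f(K)."""
    return min(range(2, max_k + 1), key=lambda k: expected_hops(k, d, n))


N = 2 ** 20
for d in (1, 2, 3):
    k_opt = math.e ** d
    print(d, round(k_opt, 2), best_integer_span(d, N), round(expected_hops(k_opt, d, N), 3))
```

The best integer span lands next to e^d for each d, and the hop estimate at K = e^d is the same for every d, matching the (e/3) ln N result.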

Expressway Performance Compared With Theoretical Values
- In the figure, the theoretical performance using EW is compared with the results obtained by simulation.
- There are small deviations between the two.

Expressway Performance Compared With CAN With Different D
- In this graph, we can see that EW with d=1 outperforms basic CAN with d=5.

Tuning Towards Best Performance
- Tuning towards network conditions
- Locating closest nodes using coordinate maps
- Routing around congested nodes
- Tuning according to application needs

Locating Closest Nodes
- Build maps of network coordinates, and use the archival capacity of the P2P system itself to store and maintain the maps in various zones of the CAN.
- Any node can find a resident close to it in a given expressway zone Z by consulting the appropriate map.
- There is one map for each expressway zone in the system.

Route Around Congested Node
- In an Internet-like dynamic environment, traffic flow is unpredictable and congestion is a common problem.
- Congestion can be avoided by detecting the congested node, reporting it, and finding an alternate route.
- To help find the appropriate expressway neighbor, each node periodically updates its map with information about its load.

Tuning According to Application Needs
- Reducing the per-hop distance may not necessarily reduce the total delay.
- To account for the communication patterns of the application, the total routing table of a node keeps the addresses of multiple nodes, instead of only the address of the node that is physically closest to it.
- While routing to a destination, we attempt to balance the per-hop distance against the total number of logical hops to achieve a small overall latency.
- Routing performance can be further improved by allocating some storage at each node to cache routes based on runtime behavior.

Summary
- Using expressways, the routing performance of CAN is improved from O(N^(1/d)) to O(ln N).
- The important goals for P2P systems are to reduce maintenance cost and to provide avenues to tune the system to suit application needs and the underlying network conditions, such that routing beats O(ln N) in practice.

Thank You