
1 Towards Efficient Load Balancing in Structured P2P Systems
Yingwu Zhu, Yiming Hu, University of Cincinnati

2 Outline
– Motivation and Preliminaries
– Load balancing scheme
– Evaluation

3 Why Load Balancing?
Structured P2P systems, e.g., Chord, Pastry:
– Object IDs and node IDs are produced by a uniform hash function.
– This results in an O(log N) imbalance in the number of objects stored at each node.
Skewed distribution of node capacity:
– Nodes should carry loads proportional to their capacities.
Other problems: different object sizes, non-uniform distribution of object IDs.

4 Virtual Servers (VS)
First introduced in Chord/CFS.
A VS is responsible for a contiguous region of the ID space.
A node can host multiple VSs.
[Figure: Chord ring with nodes A, B, and C, each hosting virtual servers.]
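
To make the model concrete, here is a minimal sketch, assuming a simple object-count load metric (names and types are illustrative, not from the paper), of a node hosting multiple virtual servers:

```python
from dataclasses import dataclass, field

@dataclass
class VirtualServer:
    start: int      # contiguous ID-space region [start, end) owned by this VS
    end: int
    load: int = 0   # e.g., number of objects currently stored in the region

@dataclass
class Node:
    capacity: int   # node capability (storage, bandwidth, ...), from which
                    # its target load is derived
    servers: list = field(default_factory=list)  # the VSs this node hosts

    @property
    def load(self) -> int:
        # A node's load is simply the sum of its virtual servers' loads.
        return sum(vs.load for vs in self.servers)
```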

5 Virtual Server Reassignment
The virtual server is the basic unit of load movement, allowing load to be transferred between nodes. L – load, T – target load.
[Figure: Chord ring with nodes A, B, and C. One node (L = 41, T = 35) is heavy; the other two (L = 45, T = 50 and L = 3, T = 15) are not. The virtual servers have loads 30, 20, 11, 3, 10, and 15.]

7 Virtual Server Reassignment
The virtual server is the basic unit of load movement, allowing load to be transferred between nodes. L – load, T – target load.
[Figure: the ring after reassignment. The heavy node has shed a virtual server of load 11 (L = 41 → 30) to the lightest node (L = 3 → 14), bringing every node below its target.]
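
A rough sketch of the shedding step, reusing the Node/VirtualServer model above. The selection rule here (take the smallest VS that covers the excess, else the largest) is a plausible greedy variant, not necessarily the paper's exact heuristic:

```python
def excess_virtual_servers(node: Node, target: int) -> list:
    """Pick virtual servers for a heavy node to shed until load <= target."""
    remaining = sorted(node.servers, key=lambda v: v.load)  # ascending by load
    shed, load = [], sum(v.load for v in remaining)
    while load > target and remaining:
        excess = load - target
        # smallest VS big enough to cover the excess, else the largest one
        pick = next((v for v in remaining if v.load >= excess), remaining[-1])
        remaining.remove(pick)
        shed.append(pick)
        load -= pick.load
    return shed
```

If the heavy node from slides 5 and 7 hosts the virtual servers of loads 30 and 11 (excess 41 − 35 = 6), this picks exactly the VS of load 11, leaving L = 30 as in the figure.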

8 Advantages of Virtual Servers
Flexible: load is moved in units of a virtual server.
Simple:
– VS movement is supported by all structured P2P systems.
– It is simulated by a leave operation followed by a join operation.
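
Because a transfer is just a leave followed by a join, its mechanics reduce to a few lines in the model above (DHT maintenance traffic omitted from this sketch):

```python
def transfer(vs: VirtualServer, src: Node, dst: Node) -> None:
    # The VS "leaves" the overlay at src and immediately "rejoins" at dst.
    # It keeps its ID region, so only the objects it stores actually move.
    src.servers.remove(vs)
    dst.servers.append(vs)
```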

9 Current Load Balancing Solutions
Some use the concept of virtual servers. However, they:
– either ignore the heterogeneity of node capacities,
– or transfer load without considering proximity relationships between nodes,
– or both.

10 Goals
– Keep each node's load below its target load (the maximum load a node is willing to take).
– Have high-capacity nodes take more load.
– Perform load balancing in a proximity-aware manner, to minimize the overhead of load movement (bandwidth usage) and allow more efficient, faster load balancing.
Load depends on the particular P2P system, e.g., storage, network bandwidth, or CPU cycles.

11 Assumptions
– Nodes in the system are cooperative.
– There is only one bottlenecked resource, e.g., storage or network bandwidth.
– The load of each virtual server is stable over the timescale on which load balancing is performed.

12 Overview of Design
Step 1: Load balancing information (LBI) aggregation, e.g., load and capacity info.
Step 2: Node classification, e.g., into heavy, light, and neutral nodes.
Step 3: Virtual server assignment (VSA).
Step 4: Virtual server transferring (VST).
Proximity-aware load balancing: VSA is proximity-aware.

13 LBI Aggregation and Node Classification
Rely on a fully decentralized, self-repairing, and fault-tolerant K-nary tree built on top of a DHT (distributed hash table).
Each K-nary tree node is planted in a DHT node, and each node reports a tuple representing its load, capacity, and the minimum load of its virtual servers.

14 LBI Aggregation and Node Classification
The tree aggregates these tuples, yielding the system-wide load L and capacity C.
Node i's target load: T_i = (L/C + ε) · C_i, where C_i is its capacity and ε is a slack parameter. A node whose load exceeds T_i is classified as heavy, otherwise light.
[Figure: K-nary tree with leaves classified as light or heavy.]
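
A sketch of the classification rule, assuming L and C in the formula are the aggregated totals. They are computed centrally here for brevity, whereas the real system gathers them up the K-nary tree, and ε = 0.05 is an arbitrary illustrative value:

```python
def classify(nodes: list, epsilon: float = 0.05):
    """Split nodes into heavy and light using T_i = (L/C + epsilon) * C_i."""
    total_load = sum(n.load for n in nodes)      # L: system-wide load
    total_cap = sum(n.capacity for n in nodes)   # C: system-wide capacity
    heavy, light = [], []
    for n in nodes:
        target = (total_load / total_cap + epsilon) * n.capacity
        (heavy if n.load > target else light).append(n)
    return heavy, light
```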

15 Virtual Server Assignment
[Figure: VSA information flows up a tree of rendezvous points. Heavy nodes H_1, H_2, H_3, ..., H_m, H_m+1 report their virtual servers (V_11, V_12, V_21, V_31, V_32, ..., V_m1, V_m2, V_m+1); light nodes L_1, ..., L_n, L_n+1 report their capacities (C_1, ..., C_n, C_n+1). Each rendezvous point pairs heavy and light nodes using best-fit heuristics and forwards unpaired VSA information upward to the final rendezvous point. Logically close nodes meet at lower rendezvous points, so VSA happens earlier between logically closer nodes.]
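
Reading "best-fit heuristics" as a standard best-fit matching, a rendezvous point's pairing step might look like the following sketch (the paper's exact heuristic may differ):

```python
def best_fit_assign(virtual_servers: list, light_nodes: list):
    """Pair shed VSs with light nodes; return (assignments, unpaired VSs)."""
    spare = [n.capacity - n.load for n in light_nodes]  # spare capacities
    assignments, unpaired = [], []
    for vs in sorted(virtual_servers, key=lambda v: v.load, reverse=True):
        fits = [i for i in range(len(light_nodes)) if spare[i] >= vs.load]
        if fits:
            i = min(fits, key=lambda i: spare[i])   # tightest fit wins
            spare[i] -= vs.load
            assignments.append((vs, light_nodes[i]))
        else:
            unpaired.append(vs)  # forwarded to the parent rendezvous point
    return assignments, unpaired
```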

16 Virtual Server Assignment
DHT identifier space-based VSA:
– VSA happens earlier between logically closer nodes.
– This is proximity-ignorant, because nodes that are logically close in the DHT are not necessarily physically close.
[Figure: heavy nodes H1, H2 and light nodes L1 – L4; virtual servers V1 – V3 end up assigned between physically distant nodes.]
[1] Nodes in the same colors are physically close to each other. [2] H – heavy nodes, L – light nodes. [3] V_i – virtual servers.

17 Proximity-Aware VSA
With proximity-aware VSA, VSs are assigned between physically close nodes.
[Figure: the same nodes as in slide 16 (same colors mean physically close; H – heavy node, L – light node, V_i – virtual server), but virtual servers V1 – V3 now move between physically close nodes.]

18 Proximity-Aware VSA
– Use landmark clustering to generate proximity information, e.g., landmark vectors.
– Use space-filling curves (e.g., the Hilbert curve) to map landmark vectors to Hilbert numbers, which serve as DHT keys.
– Heavy nodes and light nodes each put/map their VSA information into the underlying DHT under the resulting DHT keys, aligning physical closeness with logical closeness.
– Each virtual server independently reports the VSA information that is mapped into its responsible region, rather than its own node's VSA information.
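
To illustrate the mapping, here is a sketch with two landmarks using the standard 2-D Hilbert xy-to-d conversion; the real system may use more landmarks (and hence a higher-dimensional curve), and order and max_rtt are illustrative assumptions:

```python
def hilbert_d(order: int, x: int, y: int) -> int:
    """Map grid cell (x, y) on a 2^order x 2^order grid to its index
    along the Hilbert curve (standard xy-to-d algorithm)."""
    d = 0
    s = 2 ** (order - 1)
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        # rotate/reflect the quadrant so the curve stays continuous
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

def landmark_key(rtts_ms: list, order: int = 8, max_rtt: float = 500.0) -> int:
    """Quantize a 2-landmark RTT vector onto the grid, then use its
    Hilbert number as the DHT key (max_rtt and order are assumptions)."""
    side = 2 ** order
    cells = [min(side - 1, int(r / max_rtt * side)) for r in rtts_ms]
    return hilbert_d(order, cells[0], cells[1])
```

Nodes with similar landmark vectors obtain nearby Hilbert numbers, so their VSA reports land in nearby regions of the DHT, which is exactly the alignment of physical and logical closeness the slide describes.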

19 Proximity-Aware Virtual Server Assignment
[Figure: the same rendezvous tree as in slide 15 (heavy nodes report virtual servers, light nodes report capacities, rendezvous points apply best-fit heuristics, and unpaired VSA information rises to the final rendezvous point), except that nodes are now grouped by physical closeness, so VSA happens earlier between physically closer nodes.]

20 Experimental Setup
A K-nary tree built on top of a DHT (Chord), with k = 2 and k = 8.
Two node capacity distributions:
– Gnutella-like capacity profile with 5 capacity levels.
– Zipf-like capacity profile.
Two load distributions of virtual servers:
– Gaussian distribution and Pareto distribution.
Two transit-stub topologies (5,000 nodes each):
– "ts5k-large" and "ts5k-small".

21 High-Capacity Nodes Carry More Load
[Figure: results under the Gaussian load distribution and the Gnutella-like capacity profile.]

22 High-Capacity Nodes Carry More Load
[Figure: results under the Pareto load distribution and the Zipf-like capacity profile.]

23 Proximity-Aware Load Balancing
[Figure: CDFs of the moved-load distribution in ts5k-large, for the Gaussian load distribution with the Gnutella-like capacity profile and for the Pareto load distribution with the Zipf-like capacity profile.]
More load is moved over shorter distances by proximity-aware load balancing.

24 Benefit of Proximity-Aware Scheme
Load movement cost: Σ_d LM(d) · d, where LM(d) denotes the load moved over a distance of d hops.
Benefit: B = 1 − cost_aware / cost_oblivious, the relative reduction in movement cost.
Results:
– For ts5k-large: B = 37-65%.
– For ts5k-small: B = 11-20%.
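
Assuming that reading of the two formulas (distance-weighted cost and relative reduction, as stated above), the benefit can be computed as:

```python
def movement_cost(lm: dict) -> float:
    """lm maps hop distance d -> total load moved over d hops."""
    return sum(d * load for d, load in lm.items())

def benefit(lm_oblivious: dict, lm_aware: dict) -> float:
    """Relative reduction in movement cost from proximity awareness."""
    return 1 - movement_cost(lm_aware) / movement_cost(lm_oblivious)
```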

25 Other Results
– Quantified the overhead of K-nary tree construction: link stress and node stress.
– The latencies of LBI aggregation and VSA are bounded by O(log N).
– The effect of the pairing threshold at rendezvous points.

26 Conclusions
Current load balancing approaches using virtual servers have limitations: they either ignore node capacity heterogeneity, or transfer load without considering proximity relationships between nodes, or both.
Our solution:
– A fully decentralized, self-repairing, and fault-tolerant K-nary tree is built on top of DHTs to perform load balancing.
– Nodes carry loads in proportion to their capacities.
– This is the first work to address the load balancing issue in a proximity-aware manner, thereby minimizing the overhead of load movement and allowing more efficient load balancing.

27 Questions?

