Published by Shea Eve. Modified over 2 years ago.

1
p2p 2006 1 CAN, CHORD, BATON (extensions)

2
Structured P2P (CHORD, BATON)
Additional issues:
- Fault tolerance, load balancing, network awareness, concurrency
- Replication and caching
- Performance evaluation

3
Summary: Design parameters and performance (CAN)
Metrics (columns): path length, neighbor state, total path latency, per-hop latency, volume, multiple routes, replicas.
Entries per design parameter, in the order they appear on the slide:
- Dimensions (d): O(d n^(1/d)), O(d), -, -, -
- Realities (r): O(r), -
- MAXPEERS (p): O(1/p), O(p), O(p)*
- Hash functions (k): -, -, -, O(k), -, O(k)
- RTT-weighted routing: -, -, -, -, -
- Uniform partitioning heuristic: reduced variance, reduced variance, -, -, reduced variance, -, -
* Only on replicated data

4
CHORD
Additional issues discussed for CHORD:
- Fault tolerance
- Concurrency
- Replication (data)
- Load balancing

5
CHORD Stabilization
- Finger tables must be kept consistent when nodes join concurrently.
- Keep each node's successor pointer up to date.
- The successors can then be used to verify and correct the finger tables.

6
CHORD Stabilization
- What about similar problems in CAN?
- Setting performance aside, it generally suffices to keep the network connected.
- Thus: join the network by finding your successor, run stabilize periodically (to fix successors), and run fix_fingers less often (to fix the finger table).

7
CHORD Stabilization
A lookup issued before stabilization has finished falls into one of three cases:
- All finger tables are "reasonably" current: the entry is found in O(log N) steps.
- Successors are correct but finger tables are inaccurate: lookups are correct but may be slow.
- Successor pointers are incorrect, or keys have not been moved yet: the lookup may fail and must be retried.

8
CHORD Stabilization
Works with:
- Concurrent joins
- Lost and reordered messages
May not work (?):
- When the system is split into multiple disjoint cycles
- When a single cycle loops multiple times around the identifier space
- Such states are caused by failures, network partitioning, etc.; in general, this is left unclear

9
CHORD Stabilization: Join
    n.join(n')
        predecessor = nil;
        successor = n'.find_successor(n);
- Upon joining, node n contacts a known node n' and asks it to locate n's successor.
- The rest of the network does not know about n yet; that happens when stabilize runs.

10
CHORD Stabilization: Stabilize
    n.stabilize()
        x = successor.predecessor;
        if (x ∈ (n, successor))
            successor = x;
        successor.notify(n);
- Every node n runs stabilize periodically.
- n asks its successor for its predecessor, say p.
- n checks whether p should be its successor instead (that is, whether p joined the network in the meantime); this is how nodes learn about new nodes.

11
CHORD Stabilization: Notify
    n.notify(n')
        if (predecessor is nil or n' ∈ (predecessor, n))
            predecessor = n';
- successor(n) is also notified and checks whether it should make n its predecessor.

12
CHORD Stabilization: Node n joins
Node n joins between n_p and its successor n_s:
- n_p.successor = n_s
- n_s.predecessor = n_p
- n.predecessor = nil
- n.successor = n_s

13
Example: n.stabilize (n runs stabilize)
Before: n_p.successor = n_s, n_s.predecessor = n_p, n.predecessor = nil, n.successor = n_s
- n calls n_s.notify(n)
- n_s.predecessor = n

14
Example: n_p.stabilize (n_p runs stabilize)
Before: n_p.successor = n_s, n_s.predecessor = n, n.predecessor = nil, n.successor = n_s
- n_p runs n_p.stabilize()
- n_p.successor = n

15
Example: n_p.stabilize (continued)
State: n_p.successor = n, n_s.predecessor = n, n.predecessor = nil, n.successor = n_s
- n_p calls n.notify(n_p)
- n.predecessor = n_p, n.successor = n_s
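The join, stabilize, and notify steps of this example can be replayed in a few lines of Python. This is a minimal sketch, not the Chord implementation: the identifiers 10, 20 and 30, the `Node` class, and the two interval helpers are assumptions made for illustration, and `find_successor` walks successor pointers only (no finger tables).

```python
def in_open(x, a, b):
    """True if x lies in the ring interval (a, b), walking clockwise from a."""
    if a < b:
        return a < x < b
    return x > a or x < b          # interval wraps past zero (or a == b)


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self      # a lone node is its own successor
        self.predecessor = None

    def find_successor(self, key):
        # Walks successor pointers only; finger tables are omitted here.
        node = self
        while not in_half_open(key, node.id, node.successor.id):
            node = node.successor
        return node.successor

    def join(self, known):
        self.predecessor = None
        self.successor = known.find_successor(self.id)

    def stabilize(self):
        x = self.successor.predecessor
        if x is not None and in_open(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        if self.predecessor is None or in_open(
            candidate.id, self.predecessor.id, self.id
        ):
            self.predecessor = candidate


# Stable two-node ring n_p -> n_s -> n_p (identifiers 10 and 30 are made up).
n_p, n_s = Node(10), Node(30)
n_p.successor, n_p.predecessor = n_s, n_s
n_s.successor, n_s.predecessor = n_p, n_p

n = Node(20)
n.join(n_p)        # n.successor = n_s, n.predecessor = nil
n.stabilize()      # n_s.predecessor becomes n
n_p.stabilize()    # n_p.successor becomes n, then n.predecessor becomes n_p
```

After the three calls the ring is n_p -> n -> n_s with matching predecessor pointers, which is the final state shown in the example.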

16
Chord Stabilization: Fix fingers
    n.fix_fingers()
        i = random index > 1 into finger[];
        finger[i].node = find_successor(finger[i].start);
- Finger tables are not updated immediately.
- Lookups may therefore be slow, but find_predecessor and find_successor still work.
- Periodically run fix_fingers: pick a random entry in the finger table and update it.
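A small Python sketch of the fix_fingers idea follows. It assumes a 6-bit identifier space and a made-up node set, and it replaces the distributed find_successor with a lookup in a sorted list of all node identifiers, so it only illustrates how one random finger entry is refreshed per call.

```python
import bisect
import random

M = 6                 # identifier bits; an assumption for this sketch
RING = 2 ** M


def find_successor(node_ids, key):
    """First node id >= key on the ring (global-view stand-in for a lookup)."""
    i = bisect.bisect_left(node_ids, key % RING)
    return node_ids[i % len(node_ids)]


def finger_start(n, i):
    """Start of finger i (1-based): n + 2^(i-1) mod 2^M."""
    return (n + 2 ** (i - 1)) % RING


def fix_fingers(n, fingers, node_ids, rng):
    """Refresh one randomly chosen finger with index > 1."""
    i = rng.randrange(2, M + 1)
    fingers[i] = find_successor(node_ids, finger_start(n, i))
    return i


nodes = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]
fingers = {i: find_successor(nodes, finger_start(8, i)) for i in range(1, M + 1)}

# Node 26 joins; node 8's finger for start 24 is now stale (still 32).
bisect.insort(nodes, 26)
stale = dict(fingers)

rng = random.Random(0)
for _ in range(200):          # periodic calls, compressed into a loop
    fix_fingers(8, fingers, nodes, rng)
```

Index 1 is skipped because finger[1] is the successor pointer, which stabilize already maintains. Between calls, `stale` is what a lookup would use: still correct via successors, just possibly slower.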

17
Stabilization: Finger tables
- As long as there is a finger for each interval, the distance-halving argument still holds.
- A lookup finds the predecessor p of the target t; p then finds the target (it is p's successor).
- A problem arises only when new nodes have joined between p and t.
- These nodes may then have to be scanned linearly, which is acceptable if there are O(log N) of them.

18
Stabilization: Eventually succeeds
- Invariant: once a node can reach a node r via successor pointers, it always can.
- Termination argument:
  - Assume two nodes n_1 and n_2 both think they have the same successor s.
  - Both attempt to notify s.
  - s eventually chooses the closer of the two, say n_1, as its predecessor.
  - The farther one, n_2, then learns from s of a better successor than s, namely n_1.
  - Thus each step makes progress toward a better successor.

19
Failures
When a node n fails:
- Nodes that have n as a successor in their finger table must be informed and must find the successor of n to replace it in their finger table.
- Lookups in progress must continue.
- Correct successor pointers must be maintained.

20
Failures: Replication
- Each node maintains a successor list of its r nearest successors.
- Upon failure, use the next successor in the list.
- Modify stabilize to fix the list.

21
Failures
- Other nodes may attempt to send requests through the failed node.
- Use alternate nodes found in the routing table of preceding nodes or in the successor list.

22
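Failures: Example with r = 3
Ring with identifiers 0 to 15; each node keeps the list of its 3 nearest successors:
- node 3: [5, 6, 9]
- node 5: [6, 9, 12]
- node 6: [9, 12, 14]
- node 9: [12, 14, 15]
- node 12: [14, 15, 3]
- node 14: [15, 3, 5]
- node 15: [3, 5, 6]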
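The successor lists of this example can be recomputed with a short Python sketch. The function names are hypothetical, and the node set {3, 5, 6, 9, 12, 14, 15} is inferred from the lists on the slide; the sketch assumes r is smaller than the number of nodes so that no node lists itself.

```python
def successor_lists(node_ids, r):
    """Map each node to its r nearest successors on the ring."""
    ring = sorted(node_ids)
    n = len(ring)
    return {
        ring[i]: [ring[(i + j) % n] for j in range(1, r + 1)]
        for i in range(n)
    }


def first_live_successor(node, lists, failed):
    """Next usable successor of `node`, skipping failed entries in its list."""
    for s in lists[node]:
        if s not in failed:
            return s
    return None        # all r successors failed: the lookup fails


lists = successor_lists([3, 5, 6, 9, 12, 14, 15], r=3)
for node, succ in lists.items():
    print(node, succ)
```

If nodes 5 and 6 fail, `first_live_successor(3, lists, {5, 6})` falls through to node 9, which is the "use the next successor in the list" rule.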

23
Failures
- A lookup fails if all r nodes in the successor list fail.
- If each node fails independently with probability 1/2, all r fail with probability 2^(-r), which equals 1/N for r = log_2 N.
- Theorem: if we use a successor list of length r = O(log N) in an initially stable network and then every node fails with probability 1/2, then with high probability
  - find_successor returns the closest living successor, and
  - the expected time to execute find_successor in the failed network is O(log N).
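The arithmetic behind the 1/N figure can be checked directly. The helper below is a hypothetical name, and it assumes independent failures, as the slide does.

```python
import math


def all_successors_fail(r, p_fail=0.5):
    """Probability that every node in a successor list of length r has failed."""
    return p_fail ** r


N = 2 ** 20
r = int(math.log2(N))               # successor list of length log2 N
print(all_successors_fail(r), 1 / N)
```

With r = log_2 N the two printed values coincide, so on average about one node in the whole network loses its entire successor list.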

24
Replication
- Store replicas of a key at the k nodes succeeding the key.
- The successor list helps keep the number of replicas per item known.
- Other approach: store a copy per region.

25
Load balance
- K keys, N nodes; in the ideal allocation each node gets K/N keys.
- Experiment: 10^4 nodes, 10^5 to 10^6 keys in steps of 10^5.
- Reported: mean, 1st and 99th percentiles of the number of keys per node (a percentile gives the percentage of the distribution that is less than or equal to a value).
- Large variation, which increases with the number of keys.

26
Load balance: Probability density function
- K keys, N nodes; in the ideal allocation each node gets K/N keys.
- 10^4 nodes, 5 x 10^5 keys.

27
Load balance
- Node identifiers do not uniformly cover the entire identifier space.
- Assume N keys and N nodes.
- If we divide the space into N equal-sized bins, we would like to see one node per bin.
- The probability that a particular bin is empty is (1 - 1/N)^N.
- For large N, this approaches e^(-1) ≈ 0.368.
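A quick numeric check of the empty-bin probability, as a sketch (the function name is made up for illustration):

```python
import math


def empty_bin_probability(n):
    """Probability that one of n equal bins receives none of n random points."""
    return (1 - 1 / n) ** n


for n in (10, 100, 10_000, 1_000_000):
    print(n, empty_bin_probability(n))
print("limit e^-1 =", math.exp(-1))
```

The values increase toward e^(-1), so more than a third of the bins are expected to stay empty no matter how large N gets.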

28
Load balance: Virtual nodes
- Introduce virtual nodes: map multiple virtual nodes with unrelated identifiers to each real node.
- Each key is hashed to a virtual node, which is then mapped to an actual node.
- The number of virtual nodes grows from N to N log N.
- Worst-case path length becomes O(log(N log N)) = O(log N).
- Each actual node needs r times as much space for the finger tables of its r virtual nodes. If r = log N, that is log^2 N entries; for N = 10^6, about 400 entries.
- The routing messages per node also increase.
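The effect of virtual nodes on key load can be sketched with a small simulation. Everything here is an assumption for illustration: the truncated SHA-1 hash, the label format, and the sizes (256 nodes, 25,600 keys) are far smaller than the experiments on the slides.

```python
import bisect
import hashlib
import math
import statistics


def h(label):
    """Hash a label onto a 32-bit ring (SHA-1 truncated; an arbitrary choice)."""
    return int.from_bytes(hashlib.sha1(label.encode()).digest()[:4], "big")


def keys_per_node(num_nodes, num_keys, vnodes):
    # Each real node owns `vnodes` points on the ring with unrelated identifiers.
    points = sorted(
        (h(f"node-{n}/v{v}"), n) for n in range(num_nodes) for v in range(vnodes)
    )
    ids = [p for p, _ in points]
    load = [0] * num_nodes
    for k in range(num_keys):
        i = bisect.bisect_left(ids, h(f"key-{k}")) % len(ids)
        load[points[i][1]] += 1      # key goes to the successor point's owner
    return load


N, K = 256, 25_600
plain = keys_per_node(N, K, 1)
virtual = keys_per_node(N, K, int(math.log2(N)))   # r = log N virtual nodes each
for name, load in (("1 virtual node", plain), ("log N virtual nodes", virtual)):
    print(name, "max =", max(load), "stdev =", round(statistics.pstdev(load), 1))
```

Each real node's load becomes the sum of several independent arcs, which is why the spread around K/N shrinks as the number of virtual nodes per real node grows.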

29
Load balance

30
Load balance: CAN
- One virtual node (zone) -> many physical nodes: fewer virtual nodes, shorter path length; physical network awareness.
- Many virtual nodes -> one physical node: more virtual nodes, longer path length; data load balance.

31
Performance Evaluation
- CAN: simulation; "knobs-on-full" vs. "bare bones" configurations; uses network topology generators.
- CHORD: simulation (lookups run in iterative style; note how this affects network proximity). Also a small distributed experiment that reports latency measurements: 10 sites in the USA, with more than 10 nodes simulated by running more than one copy of CHORD at each site.

32
Metrics for System Performance
- Path length: overlay hops to route between two nodes in the CAN space
- Latency: end-to-end latency between two nodes
- Per-hop latency: end-to-end latency divided by the path length
- Neighbor state
- Volume per node (indicative of data and query load)
- Load per node (lookup)

33
Metrics for System Performance
- Routing fault tolerance
- Data fault tolerance
- Maintenance cost: cost of join/leave, replication, etc. How is it measured? Either separately or as part of the overall network traffic.
- Churn (dynamic behavior)

34
CHORD Magic Constants
- Period of stabilize
- Period of fix_fingers
- Maximum number of virtual nodes m
- Number of virtual nodes per physical node (O(log N))
- Size of the successor list (O(log N))
- Hash function

35
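Path Length
- 2^12 nodes
- The mean path length is actually about ½ log N.
- A lookup follows about half of the log N bits: written in binary, the distance to the target has about log N significant bits, roughly half of them are 1, and each 1 bit costs one hop.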
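The ½ log N behavior can be observed in a small simulation. This is a sketch under several assumptions: a 24-bit identifier space, 2^10 nodes instead of 2^12 to keep it fast, ideal (fully up-to-date) finger tables built from a global view, and hops counted up to the node whose successor is responsible for the key.

```python
import bisect
import math
import random

M = 24                    # identifier bits (assumed)
RING = 2 ** M


def in_open(x, a, b):
    """x in the ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)


def in_half_open(x, a, b):
    """x in the ring interval (a, b]."""
    return a < x <= b if a < b else (x > a or x <= b)


def successor_of(ids, key):
    return ids[bisect.bisect_left(ids, key % RING) % len(ids)]


def build_fingers(ids):
    # fingers[n][i] = successor(n + 2^i); entry 0 is n's immediate successor.
    return {n: [successor_of(ids, n + 2 ** i) for i in range(M)] for n in ids}


def lookup_hops(fingers, start, key):
    """Hops needed to reach the node whose successor is responsible for key."""
    n, hops = start, 0
    while not in_half_open(key, n, fingers[n][0]):
        # Closest preceding finger: highest finger that falls in (n, key).
        n = next(f for f in reversed(fingers[n]) if in_open(f, n, key))
        hops += 1
    return hops


def mean_path_length(num_nodes, num_lookups, seed=1):
    rng = random.Random(seed)
    ids = sorted(rng.sample(range(RING), num_nodes))
    fingers = build_fingers(ids)
    total = sum(
        lookup_hops(fingers, rng.choice(ids), rng.randrange(RING))
        for _ in range(num_lookups)
    )
    return total / num_lookups


N = 2 ** 10
mean_hops = mean_path_length(N, 2000)
print(f"N = {N}: mean hops = {mean_hops:.2f}, 0.5 * log2(N) = {0.5 * math.log2(N):.1f}")
```

The measured mean should land near ½ log2 N rather than log2 N; the exact figure depends on the seed and on whether the final hop to the responsible node is counted.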

36
CHORD (more): Simultaneous Node Failures
- Randomly select a fraction p of the nodes to fail.
- After the network stabilizes, the lookup failure rate is p (it could be worse, for example with a network partition).
- Lookups during stabilization

37
CHORD: Future
- Detect malicious participants, for example nodes that take a false position in the ring to "steal" data
- Physical network awareness

38
BATON
Additional issues:
- Fault tolerance
- Other ways to restructure/balance the tree
- (Workload) load balance

39
Failures
- Upon node departure or failure, the parent can reconstruct the entries.
- Assume node x fails: any detected failure of x is reported to its parent y, and y regenerates the routing tables of x (Theorem 2).
- There is routing redundancy; messages are routed:
  - sideways (redundancy similar to CHORD)
  - up-down (a node can find its parent through its neighbors)

40
AVL-like Restructuring
- The network may be restructured using AVL-like rotations.
- No data movement is needed, but some routing tables need to be updated.

41
Load Balance
- Each node keeps statistics about the number of queries or messages it receives.
- Adjust the data range to equalize the workload between adjacent nodes.
- For leaves: find another, less loaded leaf, say v; have v transfer its load to its parent and rejoin as the child of the overloaded node.
- Performance results (simulation) are reported; no surprises.

42
Replication: Beehive
- Proactive, model-driven replication, as opposed to passive (demand-driven) replication such as caching objects along a lookup path.
- Beehive: the average query path length is reduced by one when an object is proactively replicated at all nodes logically preceding its home node on all query paths.
- Hint for BATON: range queries; many paths to data.
