Chord & CFS. Presenter: Gang Zhou, Nov. 11th, 2002, University of Virginia.


1 Chord & CFS
Presenter: Gang Zhou
Nov. 11th, 2002
Email: gz5d@cs.virginia.edu
University of Virginia

2 Relation of: Chord, CAN, INS, CFS, DCS
[Diagram relating DCS, INS (early binding), Chord, CAN and CFS. Labels: "Scalability", "Polynomial in number of attributes in name specifiers", "File blocks", "Sensor".]

3 Outline of Chord
• Chord protocol
  ◦ What is Chord?
  ◦ Why use Chord?
  ◦ Basic Chord protocol
  ◦ Extended Chord protocol
  ◦ Simulation results
• Chord system
  ◦ APIs provided by Chord system

4 Outline of CFS
• CFS system structure
• Chord layer
  ◦ Server selection
• DHash layer
  ◦ Replication
  ◦ Caching
  ◦ Load balance
  ◦ Quota
  ◦ Update and delete
• Experimental results

5 Part One: Chord

6 What is Chord?
[Diagram: an application on node n asks "Value of key?"; Chord finds node n' that stores the key's value, and "Value of key!" is returned to the application.]

7 Why use Chord?
• High efficiency
  ◦ Resolves a lookup via O(log N) messages to other nodes
• Fault tolerance
• Chord scales well
  ◦ Each node maintains information about only O(log N) other nodes

8 Base Chord protocol
• What is the identifier circle?
• Where to store a key?
• How to look up a key quickly?
• Node joins (departures)

9 The base Chord protocol
• What is the identifier circle?
  ◦ Node's IP address → SHA-1 → node's m-bit identifier
  ◦ Key (a string) → SHA-1 → key's m-bit identifier
  ◦ SHA-1: Secure Hash Algorithm. It takes a message of less than 2^64 bits in length and produces a 160-bit message digest.
[Diagram: Key1, Key2 and node n each occupy a position on the identifier circle.]
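The mapping on this slide can be sketched in a few lines of Python. This is only an illustration, not the Chord implementation: the function name `chord_id`, the sample address and the sample key are hypothetical, and reducing the 160-bit digest modulo 2^m for small m is an assumption made so that toy circles stay easy to inspect.

```python
import hashlib

def chord_id(name: str, m: int = 160) -> int:
    """Map a node address or a key string to an m-bit identifier via SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()  # 20 bytes = 160 bits
    # Assumption: for m < 160, keep the digest modulo 2**m.
    return int.from_bytes(digest, "big") % (2 ** m)

node_id = chord_id("192.0.2.17", m=8)      # hypothetical node IP address
key_id = chord_id("some-file-name", m=8)   # hypothetical key
```

Nodes and keys go through the same function, so both land on the same circle of 2^m positions.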

10 Where is a key stored?
• Successor(k): the first node encountered when moving clockwise from k on the identifier circle (a node whose identifier equals k is itself successor(k)).
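A minimal sketch of the successor rule, assuming a global sorted view of the node identifiers (which a real Chord node does not have). The 3-bit circle with nodes 0, 1 and 3 is chosen only for illustration.

```python
from bisect import bisect_left

def successor(k: int, nodes: list[int], m: int) -> int:
    """First node at or clockwise after identifier k on a 2**m circle."""
    ring = sorted(nodes)
    i = bisect_left(ring, k % (2 ** m))
    return ring[i % len(ring)]  # wrap to the smallest node past the top of the circle

nodes = [0, 1, 3]
placement = {k: successor(k, nodes, m=3) for k in (1, 2, 6)}
```

Here key 1 is stored at node 1, key 2 at node 3, and key 6 wraps around to node 0.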

11 How to look up a key quickly?
[Diagram: node n1 asks "Successor(key)?" to obtain "Value of key?". The query can either move node by node (n1, n2, n3, n4, ...) or jump toward the key; "Value of key!" is returned to n1.]

12 How to look up a key quickly? (cont.)
• We need a finger table

13 How to look up a key quickly? (cont.)
• Finger table for node 1
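The table on this slide did not survive as text, so here is a sketch of how such a table can be computed. It assumes the standard Chord rule that entry i of node n starts at (n + 2^(i-1)) mod 2^m and points to the successor of that start, and it assumes a 3-bit circle with nodes 0, 1 and 3; the function name is hypothetical.

```python
def finger_table(n, nodes, m):
    """Return [(start, node)] for entries i = 1..m of node n."""
    ring = sorted(nodes)
    size = 2 ** m
    table = []
    for i in range(1, m + 1):
        start = (n + 2 ** (i - 1)) % size
        # successor(start): first node at or after start, wrapping around
        node = next((x for x in ring if x >= start), ring[0])
        table.append((start, node))
    return table

table_for_node_1 = finger_table(1, [0, 1, 3], m=3)
```

Under these assumptions node 1 gets the entries (2 → 3), (3 → 3) and (5 → 0).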

14 How to look up a key quickly? (cont.)
Query at node 3: value of key 1? The lookup finds predecessor(1) and then takes its successor, which is successor(1).
• Node 3: Am I predecessor(1)? No.
• Node 3: Try entry 3 of the finger table, and find node 0.
• Node 3: Send the lookup to node 0 (RPC).
• Node 0: Am I predecessor(1)? Yes.
• Node 0: successor(1) is node 1; return to node 3.
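The walk-through on this slide can be replayed with a small simulation. This is a sketch under stated assumptions: all finger tables live in one dictionary instead of on separate machines, RPCs are plain function calls, and the helper names (`between`, `build_fingers`, `lookup`) are hypothetical.

```python
def between(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] when inclusive_right."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past zero (or spans the ring if a == b)

def build_fingers(nodes, m):
    """fingers[n][i] is successor(n + 2**i); fingers[n][0] is n's successor."""
    ring = sorted(nodes)
    def succ(k):
        return next((x for x in ring if x >= k), ring[0])
    return {n: [succ((n + 2 ** i) % 2 ** m) for i in range(m)] for n in ring}

def lookup(start, key, fingers):
    """Return (successor(key), nodes visited), hopping via closest preceding fingers."""
    node, path = start, [start]
    # Stop once key falls in (node, successor(node)]: node is predecessor(key).
    while not between(key, node, fingers[node][0], inclusive_right=True):
        nxt = next((f for f in reversed(fingers[node]) if between(f, node, key)), node)
        if nxt == node:  # safety guard: no finger makes progress
            break
        node = nxt
        path.append(node)
    return fingers[node][0], path

fingers = build_fingers([0, 1, 3], m=3)
owner, path = lookup(start=3, key=1, fingers=fingers)
```

Starting at node 3, the highest finger (entry 3) points to node 0, node 0 recognises itself as predecessor(1), and the lookup returns node 1, matching the steps on the slide.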

15 Node joins
• Two challenges
  ◦ Each node's finger table is correctly filled
  ◦ Each key k is stored at node successor(k)
• Three operations
  ◦ Initialize the predecessor and fingers of the new node n
  ◦ Update the fingers and predecessors of existing nodes
  ◦ Copy to n all keys for which node n has become the successor

16 Initialize the predecessor and fingers of node n
• Idea: Ask an existing node for the information needed
[Diagram: a new node joins the circle.]

17 Update the fingers and predecessors of existing nodes
• Observation: when node n joins the network, n becomes the i-th finger of a node p when the following two conditions are met:
  ◦ p precedes n by at least 2^(i-1)
  ◦ The i-th finger of node p succeeds n
• Solution: Find predecessor(n - 2^(i-1)) for all 1 <= i <= m; check whether n is their i-th finger, and whether n is their predecessor's i-th finger.

18 Update the fingers and predecessors of existing nodes (cont.)
Example: node 6 joins a 3-bit circle with nodes 0, 1 and 3.
• i = 1: predecessor(6 - 2^(1-1)) = 3, update; predecessor(3) = 1, no update
• i = 2: predecessor(6 - 2^(2-1)) = 3, update; predecessor(3) = 1, no update
• i = 3: predecessor(6 - 2^(3-1)) = 1, update; predecessor(1) = 0, update; predecessor(0) = 3, no update
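The update rule can be checked against this example with a short simulation. It is a sketch, not the protocol code: it assumes a global sorted view of the existing nodes, follows the slide's rule of starting at predecessor(n - 2^(i-1)) and walking backwards until a node needs no update, and uses a hypothetical function name.

```python
def fingers_changed_by_join(nodes, n, m):
    """Return [(p, i)]: existing node p whose i-th finger becomes the new node n."""
    ring = sorted(nodes)
    size = 2 ** m

    def succ(k):
        return next((x for x in ring if x >= k), ring[0])

    def pred(k):
        # Node p such that k lies in (p, successor(p)].
        return max((x for x in ring if x < k), default=ring[-1])

    updates = []
    for i in range(1, m + 1):
        step = 2 ** (i - 1)
        p = pred((n - step) % size)
        for _ in ring:  # walk counter-clockwise, at most once around
            start = (p + step) % size
            old = succ(start)
            # n replaces the old finger if it is met first going clockwise from start.
            if (n - start) % size < (old - start) % size:
                updates.append((p, i))
                p = pred(p)
            else:
                break
    return updates

updates = fingers_changed_by_join([0, 1, 3], n=6, m=3)
```

For node 6 joining nodes 0, 1 and 3, the sketch reports updates for finger 1 and finger 2 of node 3 and for finger 3 of nodes 1 and 0, the same four updates as in the example.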

19 Copy to n all keys for which node n has become the successor
• Idea: Node n can become the successor only for keys stored by the node immediately following n
[Diagram: a new node joins the circle.]

20 Extended Chord protocol
• Concurrent joins
• Failures and replication

21 Concurrent joins
• When multiple nodes with similar identifiers join at the same time,
  ◦ They tell the same predecessor that they are its successor.
  ◦ How to update predecessor and successor?
• notify: Only the newcomer with the lowest (highest) identifier will be the predecessor (successor)
• stabilize: Periodically check whether new nodes have inserted themselves between a node and its immediate neighbors

22 Concurrent joins (cont.)
• Example: Node b joins
[Table: Succ(a), Pred(b), Succ(b), Pred(c), Succ(c), Pred(d) at times t0, t1, t1+∆t and t2, updated by notify and stabilize. Final row at t2: Succ(a) = b, Pred(b) = a, Succ(b) = c, Pred(c) = b, Succ(c) = d, Pred(d) = c.]
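How notify and stabilize repair the pointers can be sketched as below. The code follows the stabilization pseudocode of the Chord paper as best understood here; the `Node` class, the numeric identifiers 10, 20, 30 and 50 standing in for a, b, c and d, and the direct method calls in place of RPCs are all assumptions for illustration.

```python
def between(x, a, b):
    """True if x lies in the open ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def join(self, successor):
        # Assumption: the newcomer already learned its successor via a lookup.
        self.predecessor = None
        self.successor = successor

    def stabilize(self):
        # Adopt a node that slipped in between us and our successor, then notify.
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        # Accept candidate as predecessor only if it is closer than the current one.
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

a, c, d = Node(10), Node(30), Node(50)
a.successor, c.successor, d.successor = c, d, a
a.predecessor, c.predecessor, d.predecessor = d, a, c

b = Node(20)
b.join(c)
b.stabilize()   # c learns that b is now its predecessor
a.stabilize()   # a discovers b through c and adopts it as successor
```

After the two stabilize rounds the pointers around b are consistent: Succ(a) = b, Pred(b) = a, Succ(b) = c, Pred(c) = b.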

23 Failures and replication
• Challenges
  ◦ When node n fails, successor(n) must be found
  ◦ Successor(n) must have a copy of the key/value pair
  ◦ n's failure must not disrupt queries in progress while the system is re-stabilizing

24 How to find n's successor after n fails?
• To find:
  ◦ n's predecessor (n2) finds during stabilization that n does not respond
• To recover:
  ◦ n2 looks through its finger table for the first live node n1
  ◦ n2 asks n1 for successor(n2) and uses the result as its new successor

25 How to ensure that n's successor has a copy after n fails?
• Each Chord node maintains a list of its r nearest successors. Each successor has a copy of the data.
  ◦ After node n fails, queries for its keys automatically end up at its successor

26 How to keep in-progress queries undisrupted?
• Detect failure:
  ◦ Before stabilization has completed, node failure can be detected by timing out the requests
• Continue query:
  ◦ Any node with an identifier close to the failed node's identifier will have similar table entries
  ◦ Such a node can be used to route requests at a slight extra cost in route length
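A minimal sketch of the successor list, assuming a global sorted view of node identifiers and an explicit set of failed nodes; both function names and the example circle are hypothetical.

```python
def successor_list(n, nodes, r):
    """The r nodes that follow n clockwise (fewer if the ring is small)."""
    ring = sorted(nodes)
    i = ring.index(n)
    count = min(r, len(ring) - 1)
    return [ring[(i + j) % len(ring)] for j in range(1, count + 1)]

def first_live_successor(n, nodes, r, failed):
    """First entry of n's successor list that has not failed, or None."""
    return next((s for s in successor_list(n, nodes, r) if s not in failed), None)

nodes = [0, 1, 3, 6]
succs = successor_list(1, nodes, r=2)
new_succ = first_live_successor(1, nodes, r=2, failed={3})
```

With r = 2, node 1 keeps nodes 3 and 6; if node 3 fails, node 6 takes over, and because it already holds a copy of the data, queries for node 3's keys can still be answered.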

27 Simulation results
• Path length: the number of nodes traversed by a lookup operation
  ◦ Average: O(log N)

28 Simulation results (cont.)
• Miss rate due to state inconsistency increases fast with failure frequency
[Figure labels: "Miss rate due to state inconsistency", "<", "miss rate due to node failures (key lost)".]

29 APIs provided by Chord system

30 Part Two: CFS

31 CFS system structure
• FS layer: interprets blocks as files; presents a file system interface to applications
• DHash layer: stores data blocks reliably
• Chord layer: maintains routing tables to find blocks

32 Chord layer --- Server selection
• Factors to consider:
  ◦ Distance around the ID ring
  ◦ RPC latency
• Quantities used to rank a candidate next hop n_i:
  ◦ d_i: the latency from node n to node n_i
  ◦ d̄: the average latency of all the RPCs that node n has ever issued
  ◦ H(n_i): an estimate of the number of Chord hops that would remain after contacting n_i
[Diagram: node n choosing among n1, n2 and n3 on the way to the key.]
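One way to combine the quantities on this slide is an additive cost, C(n_i) = d_i + d̄ × H(n_i), which is the form recalled here from the CFS paper; treat it as an assumption of this sketch. The function name, the candidate tuples and the latency values are made up for illustration.

```python
def next_hop(candidates, d_avg):
    """Pick the candidate with the lowest estimated total cost.

    candidates: list of (name, d_i, hops_remaining) tuples, latencies in ms.
    """
    # Assumed cost: latency of this hop plus average latency times remaining hops.
    return min(candidates, key=lambda c: c[1] + d_avg * c[2])[0]

candidates = [
    ("n1", 10.0, 3),  # cheapest RPC, but far from the key on the ring
    ("n2", 80.0, 1),  # closest to the key, but a slow link
    ("n3", 35.0, 2),
]
best = next_hop(candidates, d_avg=30.0)
```

With these numbers the costs are 100, 110 and 95, so n3 wins: neither the lowest-latency candidate nor the one nearest the key is automatically the best choice.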

33 DHash layer --- Replication
• DHash places a block's replicas at the k servers immediately after successor(block).
• After successor(block) fails, the block is immediately available at the new successor(block)
• Failure independence is provided: servers close to each other on the ID ring are not necessarily physically close to each other
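Replica placement can be sketched as follows, assuming a global sorted view of server identifiers; the function name and the 3-bit example circle are hypothetical.

```python
def replica_servers(block_id, nodes, k, m):
    """Return (successor(block), the k servers immediately after it)."""
    ring = sorted(nodes)
    target = block_id % 2 ** m
    i = next((idx for idx, x in enumerate(ring) if x >= target), 0)
    count = min(k, len(ring) - 1)  # cannot have more replicas than other servers
    return ring[i], [ring[(i + j) % len(ring)] for j in range(1, count + 1)]

primary, replicas = replica_servers(block_id=2, nodes=[0, 1, 3, 6], k=2, m=3)
```

Block 2 lives on server 3 with replicas on servers 6 and 0. If server 3 fails, server 6 becomes successor(2) and already holds the block.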

34 DHash layer --- Caching
• How to cache?
• Cache replacement? --- LRU
• Cache vs. replication?
  ◦ Replication is good for handling node failures
  ◦ Caching is good for load balance
[Diagram: a lookup path n1, n2, n3, n4, ... toward the key, asking which nodes hold cached copies and which hold the data.]

35 DHash layer --- Load balance
• Break file systems into many distributed blocks
• Caching
• A real server can act as multiple virtual servers --- the ID is derived from hashing both the real server's IP address and the index of the virtual server
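A sketch of deriving virtual-server IDs. The slide only says that the IP address and the virtual-server index are both hashed; joining them as "ip:index" is an assumption of this example, as are the function name and the sample address.

```python
import hashlib

def virtual_server_ids(ip, count):
    """One 160-bit ID per virtual server hosted on the real server at ip."""
    ids = []
    for index in range(count):
        # Assumption: address and index are joined as "ip:index" before hashing.
        digest = hashlib.sha1(f"{ip}:{index}".encode("utf-8")).digest()
        ids.append(int.from_bytes(digest, "big"))
    return ids

ids = virtual_server_ids("192.0.2.17", count=4)
```

Each ID lands at an unrelated position on the ring, so a server with more capacity can run more virtual servers and take a proportionally larger share of the blocks.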

36 DHash layer --- Quota
• Why quota?
  ◦ The total amount of storage an IP address can consume will grow linearly with the total number of CFS servers
  ◦ Prevent malicious injection of large quantities of data
• Example: If each CFS server limits any one IP address to using 0.1% of its storage, then an attacker would have to mount an attack from about 1000 machines for it to be successful
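The arithmetic behind the example is short enough to spell out. The function names and the byte figures are hypothetical; the 0.1% limit is the one from the slide.

```python
from fractions import Fraction
from math import ceil

def machines_to_fill(per_ip_fraction):
    """Distinct IP addresses needed to fill a server when each may use this fraction."""
    return ceil(1 / Fraction(per_ip_fraction))

def total_quota(per_server_bytes, num_servers):
    """Storage one IP address can consume across the whole system."""
    return per_server_bytes * num_servers

attackers = machines_to_fill("0.001")           # 0.1% per IP address
system_wide = total_quota(1_000_000, 1000)      # hypothetical: 1 MB per server, 1000 servers
```

A 0.1% per-address limit means 1 / 0.001 = 1000 addresses are needed to fill a server, while an honest publisher's usable storage still grows linearly with the number of servers.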

37 DHash layer --- Update and delete
• Update
  ◦ Content-hash block: supplied key = SHA-1(block's content)
  ◦ Root block: only the publisher with the private key can change it
• Delete
  ◦ No delete; useful for recovering from malicious data insertion (Good or not?)
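The content-hash rule can be sketched with an in-memory dictionary standing in for DHash. The names `store`, `put_block` and `get_block` are hypothetical, and signed root blocks are not covered.

```python
import hashlib

store = {}  # stand-in for the distributed block store

def put_block(content: bytes) -> str:
    """Store a block under the SHA-1 of its content and return that key."""
    key = hashlib.sha1(content).hexdigest()
    store[key] = content
    return key

def get_block(key: str) -> bytes:
    """Fetch a block and verify that it still hashes to its key."""
    content = store[key]
    if hashlib.sha1(content).hexdigest() != key:
        raise ValueError("block does not match its key")
    return content

key = put_block(b"hello CFS")
```

Because the key is derived from the content, a content-hash block cannot be modified in place: any change produces a different key, and a reader can detect a tampered block by rehashing it.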

38 Experimental results
• Lookup cost: O(log N)

39 Experimental results (cont.)
• Caching
  ◦ 1000 servers

40 Experimental results (cont.)
• Effect of node failures --- a lookup fails when all replicas of a block fail
  ◦ 6 replicas
  ◦ 1000 blocks
  ◦ 1000 servers

41 Some discussion
• Chord in sensor networks?
• Do you want to use CFS (since there is no delete)?
• Build CFS over CAN and INS?
• Lazy replica copying?

