Self Stabilizing Distributed File System Shlomi Dolev and Ronen I. Kat Department of Computer Science, Ben-Gurion University Research Sponsored by IBM.


1 Self Stabilizing Distributed File System Shlomi Dolev and Ronen I. Kat Department of Computer Science, Ben-Gurion University Research Sponsored by IBM

2 DFS Motivation Performance Fault tolerance Placing files closer to users

3 Related Work File systems NFS – network file system protocol AFS – Andrew file system – CMU (1988) Coda – CMU (1998) Intermezzo – Peter J. Braam, CMU Peer to peer (2000) Global storage: OceanStore – Berkeley Serverless: Microsoft Farsite.

4 Talk Overview Self-stabilization Design Algorithms File system implementation Future work

5 Self Stabilization Self healing Adaptiveness Automatic recovery Autonomic computing Self Stabilization Dijkstra 1974

6 Self Stabilization A self-stabilizing system is a system that can automatically recover following the occurrence of (transient) faults. The idea is to design a system that can be started in an arbitrary state and still converge to a desired behaviour. See, e.g., Self-Stabilization, S. Dolev.
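As an illustration of the definition above, here is a minimal Python sketch of Dijkstra's K-state token ring from 1974, the classic self-stabilizing algorithm. It is not part of the file system described in these slides. The function names, the random scheduler and the move limit are assumptions made for the example. Started from an arbitrary state, the ring converges to a configuration with exactly one privileged machine, provided K is at least the number of machines.

```python
import random


def privileged(states, i):
    """Machine 0 is privileged when it equals the last machine;
    every other machine is privileged when it differs from its left neighbour."""
    if i == 0:
        return states[0] == states[-1]
    return states[i] != states[i - 1]


def fire(states, i, k):
    """Execute the move of privileged machine i (K-state rule)."""
    if i == 0:
        states[0] = (states[0] + 1) % k
    else:
        states[i] = states[i - 1]


def holders(states):
    """Indices of all currently privileged machines (never empty)."""
    return [i for i in range(len(states)) if privileged(states, i)]


def stabilize(states, k, seed=0, max_moves=10_000):
    """Let a central scheduler pick privileged machines at random until
    exactly one privilege remains. Returns the number of moves made."""
    rng = random.Random(seed)
    moves = 0
    while len(holders(states)) != 1:
        if moves >= max_moves:
            raise RuntimeError("did not stabilize within max_moves")
        fire(states, rng.choice(holders(states)), k)
        moves += 1
    return moves


# Arbitrary (faulty) initial state for 5 machines with K = 6.
ring = [3, 1, 4, 1, 5]
stabilize(ring, k=6)
```

After `stabilize` returns, the single privilege keeps circulating: each further move leaves exactly one privileged machine.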

7 Self Stabilization Motivation The combination and type of faults cannot be totally anticipated in on-going systems Any on-going system must be self stabilizing (or manually monitored) A self-stabilizing algorithm can recover from any arbitrary state reached due to the occurrence of faults

8 Design

9 Replication servers joined to a spanning tree A spanning tree is constructed File updates are propagated using a self-stabilizing β-synchronizer

10 Design (Cont’) Clients join the replication tree and form a caching tree File leases Global locking

11 Algorithms – Self Stabilizing → Electing a leader (leader election) Collecting connectivity information Optimising communication costs β-Synchronizer for file consistency

12 Leader Election A single leader coordinates construction If none exists, a server becomes a leader If more than one exists, one survives Messages are periodically broadcast

13 Leader Election Algorithm
Every T1 do:
    If (p = leader) then
        send-multicast(‘I’m a leader’)
        Leader-exists = true
Every T1+Td do:
    If (not leader-exists) then leader = p
    Leader-exists = false
Upon arrival of message do:
    If (p.volume = volume) then
        If (p = leader) then leader = min(leader, sender)
        Else leader = sender
        Leader-exists = true
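The leader election rules on this slide can be simulated in a few lines. The sketch below is a simplified, round-based Python model and not the authors' implementation. It assumes a single volume, reliable multicast, and that each round contains one T1 tick, delivery of all announcements, and one T1+Td tick. The names `Server`, `run_round` and `elect` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    pid: int             # this server's own ID (p on the slide)
    leader: int          # who this server currently believes is leader
    leader_exists: bool  # set when a leader announcement was seen


def run_round(servers):
    # "Every T1": servers that believe they are leader announce themselves.
    announcers = []
    for s in servers:
        if s.leader == s.pid:
            announcers.append(s.pid)
            s.leader_exists = True

    # "Upon arrival of message": deliver each announcement to all other servers.
    for sender in announcers:
        for s in servers:
            if s.pid == sender:
                continue
            if s.leader == s.pid:
                s.leader = min(s.leader, sender)
            else:
                s.leader = sender
            s.leader_exists = True

    # "Every T1+Td": a server that heard no leader appoints itself.
    for s in servers:
        if not s.leader_exists:
            s.leader = s.pid
        s.leader_exists = False


def elect(servers, rounds=10):
    """Run several rounds and return the set of leaders the servers believe in."""
    for _ in range(rounds):
        run_round(servers)
    return {s.leader for s in servers}


# Arbitrary start: every server points at a non-existent leader 42.
cluster = [Server(5, 42, False), Server(2, 42, False),
           Server(9, 42, False), Server(7, 42, False)]
agreed = elect(cluster)
```

In this model the servers end up agreeing on a single leader regardless of the initial values of `leader` and `leader_exists`. When several servers start as competing leaders, the lowest ID survives.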

14 Algorithms – Self Stabilizing Electing a leader (leader election) → Collecting connectivity information Optimising communication costs β-Synchronizer for file consistency

15 Induced Graph Example

16 Update Algorithm Collect routing tables from all neighbours in the induced graph Elect a manager (local leader) for the tree, a server with the minimal ID Build a distributed BFS spanning tree The algorithm converges
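The result of the update algorithm's last two steps can be sketched as follows. This is a centralized Python illustration of the outcome (a BFS spanning tree rooted at the server with the minimal ID), not the distributed, self-stabilizing protocol itself. The adjacency-dictionary representation of the induced graph and the function name `bfs_spanning_tree` are assumptions.

```python
from collections import deque


def bfs_spanning_tree(graph):
    """graph maps each server ID to the IDs of its neighbours in the induced graph.
    Returns (root, parent), where parent[n] is n's parent in the BFS tree."""
    root = min(graph)          # manager: the server with the minimal ID
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(graph[node]):
            if neighbour not in parent:   # first visit fixes the BFS parent
                parent[neighbour] = node
                queue.append(neighbour)
    return root, parent


induced = {3: [1, 4], 1: [3, 2], 2: [1, 4], 4: [3, 2]}
root, parent = bfs_spanning_tree(induced)
# root is 1; parent is {1: None, 2: 1, 3: 1, 4: 2}
```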

17 Algorithms – Self Stabilizing Electing a leader (leader election) Collecting connectivity information → Optimising communication costs β-Synchronizer for file consistency

18 Optimising Communication Costs Goal: find the minimal radius that keeps connectivity Increase the radius by a factor of 2 Run a 2nd instance of update with a smaller radius Search for the minimal radius using binary search
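The search strategy on this slide (doubling, then binary search) can be sketched in Python. The sketch assumes an integer radius and that connectivity is monotone in the radius: if a radius keeps the servers connected, every larger radius does too. The link-cost table, `connected_within` and `minimal_radius` are hypothetical names for illustration; in the real system the connectivity check would be a second instance of the update algorithm.

```python
def connected_within(nodes, links, radius):
    """True if all nodes are reachable using only links whose cost <= radius."""
    adjacency = {n: [] for n in nodes}
    for (a, b), cost in links.items():
        if cost <= radius:
            adjacency[a].append(b)
            adjacency[b].append(a)
    seen = {nodes[0]}
    stack = [nodes[0]]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(nodes)


def minimal_radius(is_connected, start=1, limit=1 << 20):
    """Double the radius until connectivity holds, then binary-search downwards."""
    high = start
    while not is_connected(high):
        high *= 2
        if high > limit:
            raise ValueError("no radius up to the limit keeps connectivity")
    # low is a radius known to fail (or 0 if the starting radius already worked).
    low = high // 2 if high > start else 0
    while high - low > 1:
        mid = (low + high) // 2
        if is_connected(mid):
            high = mid
        else:
            low = mid
    return high


nodes = [0, 1, 2, 3]
links = {(0, 1): 3, (1, 2): 7, (2, 3): 2, (0, 3): 12}
best = minimal_radius(lambda r: connected_within(nodes, links, r))
# best is 7: radius 6 drops the only link joining {0, 1} and {2, 3}
```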

19 Tree Structure

20 Caching Tree Extends the replication tree The update algorithm constructs both Servers execute two instances Caches execute one instance

21 Combined Spanning Tree

22 Algorithms – Self Stabilizing Electing a leader (leader election) Collecting connectivity information Optimising communication costs → β-Synchronizer for file consistency

23 Synchronization Mechanism Provide reliable command and timing Propagate commands between servers Collect and distribute information
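One way to picture "propagate commands" and "collect and distribute information" over the spanning tree is a broadcast from the root followed by a collection back towards it. The Python sketch below models a single such pulse. It is an assumption about the general shape of a tree-based synchronizer, not the authors' protocol, and it omits timing and fault handling. The names `children_of` and `pulse` are hypothetical.

```python
def children_of(parent):
    """Invert a parent map (node -> parent, root -> None) into node -> children."""
    kids = {node: [] for node in parent}
    for node, par in parent.items():
        if par is not None:
            kids[par].append(node)
    return kids


def pulse(parent, command, local_info):
    """Send command from the root to every server, then gather each
    server's local_info back at the root. Returns (received, collected)."""
    kids = children_of(parent)
    root = next(node for node, par in parent.items() if par is None)
    received = {}

    def broadcast(node):
        received[node] = command          # the node learns the command
        for child in sorted(kids[node]):
            broadcast(child)              # and forwards it down the tree

    def collect(node):
        report = {node: local_info[node]}
        for child in sorted(kids[node]):
            report.update(collect(child))  # merge the subtree reports
        return report

    broadcast(root)
    return received, collect(root)


tree = {1: None, 2: 1, 3: 1, 4: 2}
info = {1: "sig-a", 2: "sig-a", 3: "sig-b", 4: "sig-a"}
received, collected = pulse(tree, "report-signature", info)
```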

24 Replication Consistency Verifies signatures Multiple signatures – a conflict Conflict resolution Broadcast resolved signature
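The consistency check can be sketched as comparing the file signatures reported by the replicas. The slides do not say how signatures are computed or how conflicts are resolved, so the Python sketch below makes two loudly labelled assumptions: a signature is a SHA-256 digest of the file content, and a conflict is resolved by majority with a lexicographic tie-break. The names `signature` and `resolve` are hypothetical.

```python
import hashlib
from collections import Counter


def signature(content: bytes) -> str:
    """Assumed signature: SHA-256 hex digest of the file content."""
    return hashlib.sha256(content).hexdigest()


def resolve(reported):
    """reported maps server name -> signature of its replica.
    Returns (conflict, chosen): conflict is True when replicas disagree,
    chosen is the signature to broadcast as the resolved one."""
    counts = Counter(reported.values())
    conflict = len(counts) > 1
    # Hypothetical policy: most common signature wins, ties broken by value.
    chosen = min(counts, key=lambda sig: (-counts[sig], sig))
    return conflict, chosen


reports = {
    "server-1": signature(b"version 2"),
    "server-2": signature(b"version 2"),
    "server-3": signature(b"version 1"),
}
conflict, chosen = resolve(reports)
# conflict is True; chosen is the signature of b"version 2"
```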

25 Locking Table A (unified) global lock table Locks are requested Leader resolves multiple locks Locks are removed by cancelling the lock request
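A global lock table of this kind can be modelled as a map from file path to the set of servers requesting the lock. The Python sketch below is illustrative only. The slides do not state how the leader picks a winner among competing requests, so granting the lock to the lowest server ID is an assumption, and the class and method names are hypothetical.

```python
class LockTable:
    """Unified lock table: path -> IDs of servers currently requesting the lock."""

    def __init__(self):
        self.requests = {}

    def request(self, path, server_id):
        """Record that server_id wants the lock on path."""
        self.requests.setdefault(path, set()).add(server_id)

    def cancel(self, path, server_id):
        """Remove a lock by cancelling the request that created it."""
        waiting = self.requests.get(path)
        if waiting is None:
            return
        waiting.discard(server_id)
        if not waiting:
            del self.requests[path]

    def holder(self, path):
        """Leader's decision: among competing requests the lowest ID wins
        (assumed policy). Returns None when nobody requests the lock."""
        waiting = self.requests.get(path)
        return min(waiting) if waiting else None


table = LockTable()
table.request("/docs/report.txt", 7)
table.request("/docs/report.txt", 3)
winner = table.holder("/docs/report.txt")   # 3 under the assumed policy
table.cancel("/docs/report.txt", 3)         # the lock passes to server 7
```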

26 File System Implementation

27 Accessing a File [Flowchart. Decisions: "Cached?", "Update?" (Yes/No branches). Steps: "Lock file", "Get signature", "Get a copy", "Use local copy"]

28 Closing a File [Flowchart. Decision: "Update?" (Yes/No branches). Steps: "Send new signature", "Confirm signature"]

29 Meta Access Globally processed Blocked until a lock is obtained Lock file Execute command Wait confirmation

30 Linux Based bgRFS [Architecture diagram. Labels: Application; User Level; Linux system calls; System Calls; New implementation: open, close, lstat, mkdir, etc.; SyncDaemon: Cache manager & Server; Up calls; Network Communication]

31 Future Work Kernel VFS module. Communication improvements: – Reducing update messages – Using timers with β-synchronizer Performance enhancements Integrating disconnected operations Conflict resolution algorithms

32 Credits Undergraduate Students: Amir Livneh livneha@cs.bgu.ac.il Itay Granik granik@cs.bgu.ac.il Boris Lansky lanskyb@cs.bgu.ac.il Naama Shmuel shmueln@cs.bgu.ac.il Moshe Shish shishm@cs.bgu.ac.il Guy Erlich erlichg@cs.bgu.ac.il Avital Chohen avitalco@cs.bgu.ac.il Yael Biran birany@cs.bgu.ac.il Tamir Fridman tamirf@cs.bgu.ac.il Shiraz Bernard shirazb@cs.bgu.ac.il Zvika Ferents ferents@cs.bgu.ac.il Roy Feintuch feintuch@cs.bgu.ac.il Chen Shalev shalevc@cs.bgu.ac.il Shay Kraim kraim@cs.bgu.ac.il Alex Hayuit Faculty: Prof. Shlomi Dolev dolev@cs.bgu.ac.il Graduate Students: Ronen I. Kat kat@cs.bgu.ac.il System Engineer: Albina Budker albinabu@cs.bgu.ac.il

33 Visit us at www.cs.bgu.ac.il/~bgrfs

