Weed File System Simple and highly scalable distributed file system (NoFS)


1 Weed File System Simple and highly scalable distributed file system (NoFS)

2 Project Objectives
Yes:
1. Store billions of files!
2. Serve the files fast!
Not:
– Namespaces
– POSIX compliant

3 Design Goals
1. Separate volume metadata from file metadata.
2. Each volume can optionally have several replicas.
3. Flexible volume placement control, with several volume placement policies.
4. When saving data, the user can specify a replication factor or a desired replication policy.

4 Challenges for common FS
POSIX costs space and is inefficient
One folder cannot store too many files
– Need to generate a deep path
Reading one file requires visiting the whole directory path; each level may require one disk seek
Moving deep directories across computers is slow

5 Challenges for HDFS
Stores large files, not lots of small files
Designed for streaming files, not on-demand random access
Name node keeps all metadata: a single point of failure (SPOF) and a bottleneck

6 How are Weed-FS files stored?
Files are stored in 32GB-sized volumes
Each volume server has multiple volumes
The master server tracks each volume's location and free space
The master server generates unique keys and directs clients to a volume server to store the file
Clients remember the fid.

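7 Workflow
[Diagram: Client, Master Node, Volume Node(s)]
1. Get FID (client and master node)
2. Save (client and volume node(s))
Volume Info (volume node(s) and master node)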
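
The two-step workflow on slides 6 and 7 can be sketched as a small in-memory simulation. This is only an illustration under stated assumptions, not Weed-FS code: the class names `Master` and `VolumeServer`, their methods, the volume selection rule, and the simplified fid string used here are all hypothetical.

```python
import itertools


class VolumeServer:
    """Hypothetical in-memory stand-in for a volume node."""

    def __init__(self, name, volume_ids):
        self.name = name
        # volume id -> {fid: file bytes}
        self.volumes = {vid: {} for vid in volume_ids}

    def save(self, volume_id, fid, data):
        self.volumes[volume_id][fid] = data

    def read(self, volume_id, fid):
        return self.volumes[volume_id][fid]


class Master:
    """Hypothetical master: hands out unique keys and knows where volumes live."""

    def __init__(self):
        self.locations = {}              # volume id -> VolumeServer
        self._keys = itertools.count(1)  # source of unique file keys

    def register(self, server):
        for vid in server.volumes:
            self.locations[vid] = server

    def assign(self):
        # Pick a volume and generate a unique key for the new file.
        volume_id = min(self.locations)  # simplistic choice, for illustration only
        fid = f"{volume_id},{next(self._keys):x}"
        return fid, self.locations[volume_id]


def store(master, data):
    fid, server = master.assign()       # 1. get a fid from the master
    volume_id = int(fid.split(",")[0])
    server.save(volume_id, fid, data)   # 2. save the file on the volume node
    return fid                          # the client remembers the fid


master = Master()
node = VolumeServer("node-a", volume_ids=[3, 4])
master.register(node)
fid = store(master, b"hello")
```

The master is touched only once per write, and it never sees the file content; everything after step 1 happens between the client and the volume node.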

8 Master Node
Generates unique keys
Tracks volume status
– Volume status mapping (its contents did not survive in this transcript)
– Maintained via heartbeat
– Can restart
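
Because volume status is maintained through heartbeats, a restarted master can rebuild its view from the next round of reports. A minimal sketch of that idea follows; the class name `MasterState`, the heartbeat payload (a URL plus free bytes per volume), the addresses, and the 15-second timeout are assumptions made for illustration, not details from the slides.

```python
class MasterState:
    """Hypothetical master-side view of volumes, built only from heartbeats."""

    def __init__(self, timeout=15.0):
        # Seconds without a heartbeat before a volume is considered gone (assumed value).
        self.timeout = timeout
        self.volumes = {}  # volume id -> (url, free_bytes, last_seen)

    def heartbeat(self, url, volumes, now):
        # `volumes` maps volume id -> free bytes, as reported by one volume server.
        for volume_id, free_bytes in volumes.items():
            self.volumes[volume_id] = (url, free_bytes, now)

    def live_volumes(self, now):
        return {
            vid: (url, free)
            for vid, (url, free, seen) in self.volumes.items()
            if now - seen <= self.timeout
        }


# A freshly restarted master knows nothing...
state = MasterState()
empty = state.live_volumes(now=0.0)

# ...until the volume servers report in again.
state.heartbeat("10.0.0.1:8080", {3: 1_000, 4: 2_000}, now=1.0)
state.heartbeat("10.0.0.2:8080", {5: 3_000}, now=2.0)
live = state.live_volumes(now=10.0)
stale = state.live_volumes(now=16.5)  # the first server has gone quiet
```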

9 fid format
Sample File Key:
– 3, d6
Each key has 3 components:
– Volume ID = 3
– File Key = 01
– File cookie = d6 (4 bytes)
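
Splitting a fid into its three components might look like the sketch below. The exact string layout is an assumption: a decimal volume ID, a comma, then a hex string whose last 8 hex digits (4 bytes) are the cookie and whose leading digits are the file key. The `Fid` type, the `parse_fid` helper, and the fid in the example are hypothetical and are not the sample shown on the slide.

```python
from typing import NamedTuple


class Fid(NamedTuple):
    volume_id: int
    file_key: int
    cookie: int


def parse_fid(fid: str) -> Fid:
    # Assumed layout: "<volume id>,<hex file key><hex cookie>", where the cookie
    # is the last 8 hex digits (4 bytes).
    volume_part, _, rest = fid.partition(",")
    rest = rest.strip()
    if len(rest) <= 8:
        raise ValueError(f"fid too short to hold a key and a 4-byte cookie: {fid!r}")
    return Fid(
        volume_id=int(volume_part),
        file_key=int(rest[:-8], 16),
        cookie=int(rest[-8:], 16),
    )


parsed = parse_fid("3,01aabbccdd")  # hypothetical fid, for illustration only
```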

10 Volume Node
Keeps several volumes
Each volume keeps a map
– Map (its type parameters did not survive in this transcript)

11 File Entry in Volume

12 Compared to HDFS
HDFS
– Namenode stores all file metadata
– Namenode loss cannot be tolerated
WeedFS
– MasterNode only stores volume location
– MasterNode can be restarted fresh
– Easy to have multiple instances (TODO)

13 Serve File Fast
Each volume server maintains a map (its type parameters did not survive in this transcript) for each of its volumes.
– No disk read for file metadata
– Possibly reads the file with one disk read, O(1), unless
  the file is already in the buffer, or
  the file on disk is not in one contiguous block (use XFS to store files in contiguous blocks)
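
The "no disk read for metadata, one disk read for content" idea can be sketched with an append-only volume and an in-memory index. The contents of the map are an assumption here (file key to offset and size), as are the `Volume` class and its method names; a `BytesIO` buffer stands in for the volume file on disk.

```python
import io


class Volume:
    """Hypothetical append-only volume with an in-memory index."""

    def __init__(self, storage=None):
        self.storage = storage if storage is not None else io.BytesIO()
        self.index = {}  # file key -> (offset, size); assumed map contents

    def append(self, key, data):
        offset = self.storage.seek(0, io.SEEK_END)  # always write at the end
        self.storage.write(data)
        self.index[key] = (offset, len(data))

    def read(self, key):
        offset, size = self.index[key]  # metadata lookup: memory only, no disk read
        self.storage.seek(offset)       # one seek...
        return self.storage.read(size)  # ...and one read for the file content


volume = Volume()
volume.append(1, b"first file")
volume.append(2, b"second")
```

Passing a real file object opened in binary read/write mode as `storage` would exercise the same code path against a disk file.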

14 Automatic Compression
Compresses the data based on MIME type
Transparent
Works with browsers that accept gzip encoding
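
One way transparent compression could work is sketched below, using Python's standard `gzip` module. The list of compressible MIME types and the function names are assumptions (the slides do not specify them), and the `Accept-Encoding` check is deliberately simplified: it ignores quality values such as `gzip;q=0`.

```python
import gzip


def is_compressible(mime_type: str) -> bool:
    # Assumed rule for which MIME types are worth compressing.
    return mime_type.startswith("text/") or mime_type in {
        "application/json",
        "application/javascript",
        "application/xml",
    }


def compress_for_storage(data: bytes, mime_type: str):
    """Return (stored_bytes, is_gzipped)."""
    if is_compressible(mime_type):
        return gzip.compress(data), True
    return data, False


def serve(stored: bytes, is_gzipped: bool, accept_encoding: str):
    """Return (body, headers) for a client with the given Accept-Encoding header."""
    encodings = [token.split(";")[0].strip() for token in accept_encoding.split(",")]
    if is_gzipped and "gzip" in encodings:
        return stored, {"Content-Encoding": "gzip"}  # send compressed bytes as-is
    if is_gzipped:
        return gzip.decompress(stored), {}           # client cannot handle gzip
    return stored, {}
```

The client never has to know how the data is stored: a gzip-capable browser receives the stored bytes unchanged, and any other client receives the decompressed original.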

15 Volume Replica Placement
1. Each volume can have several replicas.
2. Flexible volume placement control, with several volume placement policies.
3. When saving data, the user can specify a replication factor or a desired replication policy.

16 Flexible Replica Placement Policy
1. No replication.
2. 1 replica on the local rack.
3. 1 replica in the local data center, but on a different rack.
4. 1 replica in a different data center.
5. 2 replicas: the first on the local rack, on a random other server; the second in the local data center, on a random other rack.
6. 2 replicas: the first on a random other rack in the same data center; the second in a different data center.
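
The six policies above can be expressed as counts of replicas per locality class. The sketch below is one possible encoding, not the Weed-FS implementation: the `Server` type, the `POLICIES` table, the `place_replicas` function, and the example topology are all hypothetical.

```python
import random
from typing import NamedTuple


class Server(NamedTuple):
    name: str
    data_center: str
    rack: str


# Policy number from the slide -> replica counts for
# (same rack, other rack in the same data center, other data center).
POLICIES = {
    1: (0, 0, 0),
    2: (1, 0, 0),
    3: (0, 1, 0),
    4: (0, 0, 1),
    5: (1, 1, 0),
    6: (0, 1, 1),
}


def place_replicas(primary, servers, policy, rng=random):
    """Pick replica servers for a volume whose first copy lives on `primary`."""
    same_rack, other_rack, other_dc = POLICIES[policy]
    same_dc = [s for s in servers if s.data_center == primary.data_center]
    pools = [
        (same_rack, [s for s in same_dc if s != primary and s.rack == primary.rack]),
        (other_rack, [s for s in same_dc if s.rack != primary.rack]),
        (other_dc, [s for s in servers if s.data_center != primary.data_center]),
    ]
    chosen = []
    for count, pool in pools:
        if count > len(pool):
            raise ValueError(f"policy {policy} cannot be satisfied by this topology")
        chosen.extend(rng.sample(pool, count))  # random choice within the pool
    return chosen


servers = [
    Server("a1", "dc1", "r1"),
    Server("a2", "dc1", "r1"),
    Server("b1", "dc1", "r2"),
    Server("c1", "dc2", "r1"),
]
primary = servers[0]
replicas = place_replicas(primary, servers, policy=5)
```

With this small topology each pool holds a single candidate, so the random choice has only one possible outcome; with more servers per rack or data center the selection would vary between calls.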

17 Future work Tools to manage the file system

