Building Distributed, Wide-Area Applications with WheelFS Jeremy Stribling, Emil Sit, Frans Kaashoek, Jinyang Li, and Robert Morris MIT CSAIL and NYU.

1 Building Distributed, Wide-Area Applications with WheelFS Jeremy Stribling, Emil Sit, Frans Kaashoek, Jinyang Li, and Robert Morris MIT CSAIL and NYU

2 Grid Computations Share Data
Nodes in a distributed computation share:
– Program binaries
– Initial input data
– Processed output from one node as intermediary input to another node

3 So Do Users and Distributed Apps
Shared home directory for testbeds (e.g., PlanetLab, RON)
Distributed apps reinvent the wheel:
– Distributed digital research library
– Wide-area measurement experiments
– Cooperative web cache
Can we invent a shared data layer once?

4 Our Goal
Distributed file system for testbeds/Grids
App can share data between nodes
Users can easily access data
Simple-to-build distributed apps
[Figure: nodes in a testbed/Grid sharing a file foo]

5 Current Solutions
Usual drawbacks:
– All data flows through one node
– File systems are too transparent
    Mask failures
    Incur long delays
[Figure: a node in the testbed/Grid copies file foo from a central file server]

6 Our Proposal: WheelFS
A decentralized, wide-area FS
Main contributions:
1) Provide good performance according to Read Globally, Write Locally
2) Give apps control with semantic cues

7 Talk Outline
1. How to decentralize your file system
2. How to control your files

8 What Does a File System Buy You?
A familiar interface
Language-independent usage model
Hierarchical namespace useful for apps
Quick prototyping for apps

9 File Systems 101
File system (FS) API:
– Open
– { Close/Read/Write }
Directories translate file names to IDs
[Figure: App 1 and App 2 on a node make file system API calls to the operating system, which uses the local hard disk]

10 Distributed File Systems
[Figure: the same node stack (App 1, App 2, operating system, local hard disk) placed in a testbed/Grid; Dir 500 contains the entry foo: 135, which refers to File 135]

11 Basic Design of WheelFS
[Figure: nodes 076, 150, 257, 402, 554 and 653 plus a set of consistency servers; a request for File 135; File 135 updated to v2 and then v3]

12 Read Globally, Write Locally
Perform writes at local disk speeds
Efficient bulk data transfer
Avoid overloading nodes with popular files

13 Write Locally
Create foo/bar:
1. Choose an ID
2. Create dir entry
3. Write local file
[Figure: nodes 076, 150, 257, 402, 554 and 653; Dir 209 (foo) holds the entry bar = 550; File 550 (bar); a later Read foo/bar]
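The three creation steps can be sketched as a toy shell model in which every node is just a directory. This is only an illustration: the placement rule in `owner` (an ID belongs to the first node whose number is at least the ID), the decision to run the create on node 554, and the names `ROOT`, `owner`, `create` and `read_file` are all assumptions, not something the slides specify.

```sh
#!/bin/sh
# Toy model only: every node is a directory under a scratch root.
# ROOT, owner, create and read_file are hypothetical names, not WheelFS code.
ROOT=$(mktemp -d)
NODES="76 150 257 402 554 653"   # node numbers from the slide; 076 is written 76
for n in $NODES; do mkdir -p "$ROOT/$n"; done

# owner ID: ASSUMED placement rule, the first node number >= ID,
# wrapping around to the lowest node.
owner() {
    for n in $NODES; do
        if [ "$n" -ge "$1" ]; then echo "$n"; return; fi
    done
    echo "${NODES%% *}"
}

# create LOCAL_NODE DIR_ID NAME: file contents are read from stdin.
create() {
    # 1. Choose an ID owned by the local node. "Node number minus 4" is an
    #    arbitrary choice that happens to reproduce the slide's 550 on node 554.
    id=$(( $1 - 4 ))
    # 2. Create the directory entry on the node that owns the directory.
    echo "$3 $id" >> "$ROOT/$(owner "$2")/dir-$2"
    # 3. Write the data on the local node's own disk.
    cat > "$ROOT/$1/file-$id"
}

# read_file DIR_ID NAME: resolve NAME to an ID, then read from the ID's owner.
read_file() {
    id=$(awk -v name="$2" '$1 == name { print $2 }' "$ROOT/$(owner "$1")/dir-$1")
    cat "$ROOT/$(owner "$id")/file-$id"
}

echo "hello" | create 554 209 bar   # node 554 creates foo/bar (foo is Dir 209)
read_file 209 bar                   # any node can now resolve and read it
```

The point of the sketch is that only the small directory entry travels to another node; the file data itself is written where the creator runs.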

14 Read Globally
Read file 135:
1. Contact node
2. Receive list
3. Get chunks
[Figure: nodes 076, 150, 257, 402, 554 and 653; File 135 with cached copies on other nodes; the returned list names nodes 076 and 653 (later 076, 554 and 653), from which chunks are fetched]
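A minimal sketch of the read path, again modelling nodes as directories. Several details are assumptions made for illustration: that File 135 lives on node 150, that the list of caching nodes is kept in a `135.cachers` file, and that chunk requests are spread round-robin over the listed nodes. The slide only names the three steps.

```sh
#!/bin/sh
# Toy model only: nodes are directories, chunks are small files.
# ROOT, read_globally and the ".cachers" list file are hypothetical.
ROOT=$(mktemp -d)
mkdir -p "$ROOT/150" "$ROOT/076" "$ROOT/653"

# File 135 sits on node 150 as three chunks; nodes 076 and 653 hold cached copies.
i=0
for data in AAAA BBBB CCCC; do
    printf '%s' "$data" > "$ROOT/150/135.$i"
    cp "$ROOT/150/135.$i" "$ROOT/076/"
    cp "$ROOT/150/135.$i" "$ROOT/653/"
    i=$(( $i + 1 ))
done
printf '076\n653\n' > "$ROOT/150/135.cachers"

# read_globally HOME_NODE FILE_ID NCHUNKS
read_globally() {
    # Steps 1 and 2: contact the file's node and receive the list of cachers.
    peers=$(cat "$ROOT/$1/$2.cachers")
    npeers=$(printf '%s\n' "$peers" | wc -l | tr -d ' ')
    # Step 3: get the chunks, round-robin over the listed nodes
    # (an assumed policy; the slide does not say how chunks are assigned).
    i=0
    while [ "$i" -lt "$3" ]; do
        peer=$(printf '%s\n' "$peers" | sed -n "$(( $i % $npeers + 1 ))p")
        cat "$ROOT/$peer/$2.$i"
        i=$(( $i + 1 ))
    done
    echo
}

read_globally 150 135 3
```

Because the chunks come from the caching nodes, the node holding a popular file only has to hand out the short list, which is how the design avoids overloading it.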

15 Example: BLAST
DNA alignment tool run on Grids
Copy separate DB portions and queries to many nodes
Run separate computations
Later fetch and combine results

16 Example: BLAST
With WheelFS, however:
– No explicit DB copying necessary
– Efficient initial DB transfers
– Automatic caching for reused DBs and queries
Could be better since data is never updated

17 Example: Cooperative Web Cache
Collection of nodes that:
– Serve redirected web requests
– Fetch web content from original web servers
– Cache web content and serve it directly
– Find cached content on other CWC nodes

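18 Example: Cooperative Web Cache
Avoid hotspots
    # Serve the cached copy if one exists and has not expired.
    if [ -f "/wfs/cwc/$URL" ]; then
        if notexpired "/wfs/cwc/$URL"; then
            cat "/wfs/cwc/$URL"
            exit
        fi
    fi
    # Otherwise fetch from the origin server and cache it on the way through.
    wget "$URL" -O - | tee "/wfs/cwc/$URL"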
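The script relies on a `notexpired` command that the slides never define. One possible stand-in, assuming a cached page counts as fresh for a fixed number of minutes after it was last written, is sketched below. The age-based policy, the 60-minute default and the optional second argument are all assumptions; `-mmin` is an extension found in GNU and BSD `find` rather than part of POSIX.

```sh
#!/bin/sh
# notexpired FILE [MAX_AGE_MINUTES]
# Hypothetical helper: succeeds if FILE exists and was modified less than
# MAX_AGE_MINUTES ago (default 60). A real cache would more likely honour
# HTTP expiry headers; modification time is just the simplest stand-in.
notexpired() {
    [ -n "$(find "$1" -mmin "-${2:-60}" 2>/dev/null)" ]
}

f=$(mktemp)
if notexpired "$f"; then echo "fresh"; fi
touch -t 200001010000 "$f"          # pretend the copy was cached in 2000
if ! notexpired "$f"; then echo "expired"; fi
rm -f "$f"
```

Since the check only looks at the file through the ordinary file system interface, it works the same whether the path is on a local disk or under /wfs.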

19 Example: Cooperative Web Cache
    if [ -f "/wfs/cwc/$URL" ]; then
        if notexpired "/wfs/cwc/$URL"; then
            cat "/wfs/cwc/$URL"
            exit
        fi
    fi
    wget "$URL" -O - | tee "/wfs/cwc/$URL"
[Figure: a client sends $URL to one of nodes 076, 150, 257, 402, 554 and 653; the node looks up $URL in Dir 070 (/wfs/cwc), checks the version of File 135 and fetches chunks from nodes holding cached copies; when the lookup answers "No!", the node fetches $URL, stores it as File 550 and adds the entry $URL == 550]

20 Talk Outline
1. How to decentralize your file system
2. How to control your files

21 Example: Cooperative Web Cache
Would rather fail and refetch than wait
Perfect consistency isn't crucial
    if [ -f "/wfs/cwc/$URL" ]; then
        if notexpired "/wfs/cwc/$URL"; then
            cat "/wfs/cwc/$URL"
            exit
        fi
    fi
    wget "$URL" -O - | tee "/wfs/cwc/$URL"

22 Explicit Semantic Cues
Allow direct control over system behavior
Meta-data that attach to files, dirs, or refs
Apply recursively down dir tree
Possible impl: intra-path component
– /wfs/cwc/.cue/foo/bar
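To make the intra-path idea concrete, here is a rough sketch of how a path such as /wfs/cwc/.cue/foo/bar could be split into its cues and the plain path they apply to. The function name `split_cues` and its output format are made up for illustration, and treating every component that starts with a dot as a cue is a simplification (it would misread ordinary dotfiles); the slides only say that cues are a possible path component.

```sh
#!/bin/sh
# split_cues PATH
# Hypothetical helper: prints the cues found in PATH, then PATH without them.
# Any component starting with "." (other than "." and "..") is taken as a
# comma-separated cue list, which is a simplifying assumption.
split_cues() {
    plain=""
    cues=""
    old_ifs=$IFS
    IFS=/
    set -f                      # no globbing while the path is split on "/"
    for part in $1; do
        case $part in
            "")   ;;                                # skip empty components
            .|..) plain="$plain/$part" ;;           # ordinary navigation
            .*)   cues="$cues,${part#.}" ;;         # a cue component
            *)    plain="$plain/$part" ;;           # a normal name
        esac
    done
    set +f
    IFS=$old_ifs
    echo "cues: ${cues#,}"
    echo "path: $plain"
}

split_cues /wfs/cwc/.maxtime=250,bestversion/foo
```

Keeping the cues inside the path means an unmodified program, or a shell one-liner, can request different behavior just by opening a different name.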

23 Semantic Cues: Writability
Applies to files
WriteMany (default)
WriteOnce
[Figure: nodes 076, 150, 257, 402, 554 and 653; a request for File 135; File 135 updated to v2 and v3, with cached copies of 135 and 135 v3]

24 Semantic Cues: Freshness
Applies to file references
LatestVersion (default)
AnyVersion
BestVersion
[Figure: nodes 076, 150, 257, 402, 554 and 653; a request for File 135, the file itself and a cached copy]

25 Semantic Cues: Write Consistency
Applies to files or directories
Strict (default)
Lax
[Figure: nodes 076, 150, 257, 402, 554 and 653; two writes to File 135, producing File 135 v2]

26 Example: BLAST
WriteOnce for all:
– DB files
– Query files
– Result files
Improves cachability of these files

27 Example: Cooperative Web Cache
Reading an older version is ok:
– cat /wfs/cwc/.maxtime=250,bestversion/foo
Writing conflicting versions is ok:
– wget http://foo -O - > /wfs/cwc/.lax,writemany/foo
    if [ -f "/wfs/cwc/.maxtime=250,bestversion/$URL" ]; then
        if notexpired "/wfs/cwc/.maxtime=250,bestversion/$URL"; then
            cat "/wfs/cwc/.maxtime=250,bestversion/$URL"
            exit
        fi
    fi
    wget "$URL" -O - | tee "/wfs/cwc/.lax,writemany/$URL"

28 Discussion
Must break data up into files small enough to fit on one disk
Stuff we swept under the rug:
– Security
– Atomic renames across dirs
– Unreferenced files

29 Related Work
Every FS paper ever written
Specifically:
– Cluster FS: Farsite, GFS, xFS, Ceph
– Wide-area FS: JetFile, CFS, Shark
– Grid: LegionFS, GridFTP, IBP
– POSIX I/O High Performance Computing Extensions

30 Conclusion
WheelFS: distributed storage layer for newly-written applications
Performance by reading globally and writing locally
Control through explicit semantic cues

