Google File System Robert Nishihara. What is GFS? Distributed filesystem for large-scale distributed applications.


1 Google File System Robert Nishihara

2 What is GFS?
- Distributed filesystem for large-scale distributed applications

3 Setting
- Frequent hardware failures
- Large files
- Most writes are appends
- Most reads are sequential
- Throughput matters more than latency

4 Architecture
- Files divided into 64MB “chunks”
- Chunkservers store/write/serve chunks
- Master maps files -> chunks
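As a minimal sketch of the lookup this slide implies (not GFS's actual interface), a client can turn a byte offset into a chunk index and ask the master which chunk, on which chunkservers, holds it. The names `ToyMaster`, `locate`, the handle strings, and the server names are hypothetical and chosen only for illustration.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as stated on the slide


def chunk_index(offset: int) -> int:
    """Return which chunk of a file contains the given byte offset."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return offset // CHUNK_SIZE


class ToyMaster:
    """Hypothetical in-memory master: file name -> ordered list of chunks."""

    def __init__(self) -> None:
        # Each entry is (chunk_handle, [chunkserver names]).
        self.files: dict[str, list[tuple[str, list[str]]]] = {}

    def locate(self, name: str, offset: int) -> tuple[str, list[str]]:
        """Map (file, byte offset) to the chunk handle and its replicas."""
        return self.files[name][chunk_index(offset)]


master = ToyMaster()
master.files["/logs/web.log"] = [
    ("handle-0", ["cs1", "cs2", "cs3"]),
    ("handle-1", ["cs2", "cs3", "cs4"]),
]
# An offset of 70 MB falls in the second chunk (index 1).
handle, replicas = master.locate("/logs/web.log", 70 * 1024 * 1024)
print(handle, replicas)
```

After the lookup, the client would read or write the data by talking to one of the returned chunkservers directly, so file data never passes through the master.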

6 Design Decisions
- Large chunks (64MB)
  – Pro: fewer client/master interactions
  – Pro: less metadata
- No caching of file data
- Writes optimized for “appends”
- Single master => simpler design; global knowledge enables placement and replication optimizations
- Master metadata stored in memory
  – Pro: master operations are fast
  – Con: limits number of files
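A rough back-of-the-envelope sketch of why large chunks mean less metadata in the master's memory. The 64-bytes-per-chunk figure is an assumption taken from the GFS paper's stated upper bound, not from these slides, and the estimate ignores per-file namespace metadata (which is what makes the number of files the limiting factor).

```python
CHUNK_SIZE = 64 * 2**20       # 64 MB per chunk (from the slide)
METADATA_PER_CHUNK = 64       # bytes per chunk; assumed upper bound, see above


def master_memory_bytes(total_data_bytes: int,
                        chunk_size: int = CHUNK_SIZE) -> int:
    """Estimate master memory for chunk metadata, counting partial chunks."""
    num_chunks = -(-total_data_bytes // chunk_size)  # ceiling division
    return num_chunks * METADATA_PER_CHUNK


one_pib = 2**50
# Chunk metadata for 1 PiB of data with 64 MB chunks, in GiB.
print(master_memory_bytes(one_pib) / 2**30)
# The same data with hypothetical 1 MB chunks needs 64 times as much.
print(master_memory_bytes(one_pib, chunk_size=2**20) / 2**30)
```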

7 Fault Tolerance
- Chunks replicated (3x by default)
- Master state replicated (both logs and checkpoints)
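One way to picture chunk replication is a master-side check for chunks that have fallen below the replication target after a chunkserver stops responding. This is a simplified sketch under assumed data structures; `under_replicated`, the handle strings, and the server names are hypothetical.

```python
REPLICATION_TARGET = 3  # default replication factor from the slide


def under_replicated(chunk_locations: dict[str, set[str]],
                     live_servers: set[str]) -> dict[str, int]:
    """Return {chunk_handle: missing_replica_count} for chunks below target."""
    missing = {}
    for handle, servers in chunk_locations.items():
        live = len(servers & live_servers)
        if live < REPLICATION_TARGET:
            missing[handle] = REPLICATION_TARGET - live
    return missing


locations = {
    "handle-0": {"cs1", "cs2", "cs3"},
    "handle-1": {"cs2", "cs3", "cs4"},
}
# Suppose cs1 has stopped responding; only handle-0 had a replica there.
needs_copy = under_replicated(locations, live_servers={"cs2", "cs3", "cs4"})
print(needs_copy)
```

A real master would then schedule new copies of the affected chunks on healthy chunkservers; that step is omitted here.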

8 Consistency
- Namespace mutation (e.g., file creation) is atomic
- Relaxed guarantees (“inconsistent” regions may be interspersed between “consistent” ones)
- Clients can handle de-duplication
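A minimal sketch of client-side de-duplication, assuming each record carries a unique ID assigned by its writer so that a reader can drop copies left behind by retried appends. The record layout and the `deduplicate` helper are hypothetical, not part of GFS itself.

```python
from typing import Iterable, Iterator


def deduplicate(records: Iterable[tuple[str, bytes]]) -> Iterator[bytes]:
    """Yield each payload once, keyed by a writer-assigned record ID."""
    seen: set[str] = set()
    for record_id, payload in records:
        if record_id in seen:
            continue  # a retried append wrote this record more than once
        seen.add(record_id)
        yield payload


# A retried append may leave the same record in the file twice.
raw = [("r1", b"alpha"), ("r2", b"beta"), ("r2", b"beta"), ("r3", b"gamma")]
unique = list(deduplicate(raw))
print(unique)
```

Keeping the set of seen IDs in memory is fine for a sketch; a reader of very large files would need to bound it, for example by tracking IDs per writer.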

9 Conclusion
- GFS is a filesystem designed for large-scale distributed applications
- Optimized for appends and sequential reads
- Fault tolerance via replication, monitoring, fast recovery, checksumming
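To illustrate the checksumming mentioned above, here is a small sketch of per-block checksums used to detect corruption in a chunk. The 64 KB block size and 32-bit checksum follow the GFS paper rather than these slides; the use of CRC-32 via `zlib.crc32` is an assumption, since the specific checksum function is not stated.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # 64 KB checksum blocks (from the GFS paper, not the slides)


def block_checksums(chunk: bytes) -> list[int]:
    """Compute one 32-bit checksum per 64 KB block (CRC-32 is an assumption)."""
    return [zlib.crc32(chunk[i:i + BLOCK_SIZE])
            for i in range(0, len(chunk), BLOCK_SIZE)]


def corrupted_blocks(chunk: bytes, expected: list[int]) -> list[int]:
    """Return indices of blocks whose checksum no longer matches."""
    actual = block_checksums(chunk)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]


data = bytes(3 * BLOCK_SIZE)        # three zero-filled blocks
stored = block_checksums(data)      # checksums recorded at write time
damaged = bytearray(data)
damaged[BLOCK_SIZE + 5] ^= 0xFF     # flip bits inside the second block
print(corrupted_blocks(bytes(damaged), stored))  # reports the damaged block index
```

On a mismatch, a chunkserver would refuse to return the bad block and the data would be read from another replica instead.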

