FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment.


1 FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment

2 Introduction
• Farsite: a serverless distributed file system
    ◦ Logically functions as a centralized file server
• Designed for desktop environments
• Needs some effort for initial configuration
• Needs little central administration to maintain

3 Farsite Characteristics
• Peer-to-peer among untrusted machines
• Need to handle privacy, integrity, durability
    ◦ Cryptography
    ◦ Randomized replication
    ◦ Byzantine fault-tolerance

4 Farsite Workloads
• High access locality
• Low update rate
• Sequential accesses with rare concurrency

5 Administration
• Machine certificates bind machines to their public keys
• User certificates bind users to their public keys
• Namespace certificates bind namespace roots to their managing machines

6 Design Assumptions
• Designed for ~10^5 machines
• All interconnected by a high-bandwidth, low-latency network
• A majority of machines are up most of the time
• Uncorrelated permanent machine failures
• Read-mostly sharing
• Few malicious users

7 Enabling Technology Trends
• Increase in unused disk capacity
    ◦ In 2000, 58% of disk capacity was unused at Microsoft
    ◦ Can replicate data for reliability
• Decrease in the computational cost of cryptography
    ◦ Can easily encrypt at 53 MB/sec
    ◦ Disk transfers at 32 MB/sec
    ◦ Can use strong cryptography for security

8 Namespace Roots
• Allow multiple roots for multiple machines

9 Trust and Certification
• Based on public-key-cryptographic certificates
    ◦ Encrypt(Key_public, text_plain) → text_cipher
    ◦ Decrypt(Key_private, text_cipher) → text_plain
    ◦ Encrypt(Key_private, text_plain) → text_cipher
    ◦ Decrypt(Key_public, text_cipher) → text_plain

10 Public Key Encryption Basics
• Idea
    ◦ Public key is published
    ◦ Private key is the secret
• Encrypt(Key_my_public, “Hi, Andy”)
    ◦ Anyone can create it, but only I can read it
• Encrypt(Key_my_private, “I’m Andy”)
    ◦ Everyone can read it, but only I can create it

11 Public Key Encryption Basics
• Encrypt(Key_your_public, Encrypt(Key_my_private, “I know your secret”))
    ◦ Only you can read it, and only I can send it
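
The three cases on slides 9 to 11 can be tried out with a toy sketch. It uses textbook RSA with tiny primes, which is an assumption made purely for illustration: the slides do not say which public-key algorithm Farsite uses, and this construction is not secure. The names `make_keypair` and `apply_key` are hypothetical.

```python
# Toy textbook RSA with tiny primes: for illustration only, NOT secure.
# The slides do not name an algorithm; RSA is an assumption of this sketch.

def make_keypair(p, q, e=17):
    """Return (public_key, private_key) built from two primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (Python 3.8+)
    return (e, n), (d, n)

def apply_key(key, value):
    """Encrypt or decrypt an integer that is smaller than the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

my_public, my_private = make_keypair(61, 53)      # modulus 3233
your_public, your_private = make_keypair(89, 97)  # modulus 8633

message = 1234  # stands in for the text; must be below both moduli

# Encrypt(Key_my_public, m): anyone can create it, only I can read it.
secret = apply_key(my_public, message)
read_by_me = apply_key(my_private, secret)

# Encrypt(Key_my_private, m): everyone can read it, only I can create it.
signed = apply_key(my_private, message)
read_by_anyone = apply_key(my_public, signed)

# Encrypt(Key_your_public, Encrypt(Key_my_private, m)):
# only you can read it, and only I can send it.
sealed = apply_key(your_public, apply_key(my_private, message))
read_by_you = apply_key(my_public, apply_key(your_private, sealed))
```

The nesting works here only because the inner result (below 3233) fits under the outer modulus (8633); real systems use padding and hybrid encryption instead of raw exponentiation.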

12 Basic System
• Every machine has three roles
    ◦ Client
        ▪ A machine that interacts with a user
    ◦ Directory group
        ▪ A set of machines that manage files via a Byzantine-fault-tolerant protocol
        ▪ Every group member owns a replica
    ◦ File host

13 More on the Basic System
+ Reliability
+ Data integrity
- Performance
    ◦ A Byzantine fault-tolerant protocol tolerates faults in fewer than 1/3 of the replicas
    ◦ Needs lots of replicas
- Privacy
- Storage consumption

14 System Enhancements
• Local caching
    ◦ A client can lease a copy of a file
• Encrypt written files with public keys of all authorized clients
    ◦ Offload those files to file hosts
    ◦ Store only the content hash of those files locally
    ◦ Can detect damaged copies by checking them against the hash
    ◦ Can tolerate n - 1 file host failures (given n copies)
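
A minimal sketch of why a stored content hash lets a reader survive n - 1 bad file hosts: any single intact copy can be recognized on its own. SHA-256 and the names `content_hash` and `fetch_valid_copy` are assumptions for illustration; the slides do not specify the hash function or any interface.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Hash of the (already encrypted) file contents. SHA-256 is assumed."""
    return hashlib.sha256(data).hexdigest()

def fetch_valid_copy(expected_hash, replicas):
    """Return the first replica matching the stored hash, or None.

    A None entry models an unreachable file host; a mismatching entry
    models a damaged or tampered copy.
    """
    for copy in replicas:
        if copy is not None and content_hash(copy) == expected_hash:
            return copy
    return None

original = b"encrypted file contents"
stored_hash = content_hash(original)  # the only thing kept besides the copies

# Three file hosts: one down, one corrupted, one intact.
replicas = [None, b"encrypted file c0ntents", original]
result = fetch_valid_copy(stored_hash, replicas)

# With every host down or corrupted, nothing can be returned.
nothing = fetch_valid_copy(stored_hash, [None, b"garbage"])
```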

15 Traditional Byzantine Approach [CL99]
[Figure: a client, file data and metadata, a Byzantine fault-tolerant protocol, Byzantine servers]
• 3f + 1 file copies to handle f failures

16 Farsite: BFT only for metadata
[Figure: a client, a Byzantine fault-tolerant protocol, a directory group, file hosts]
• f + 1 file copies for f failures
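
The replica arithmetic on slides 13, 15, and 16 can be written out directly. The function names below are hypothetical; the formulas are the ones stated on the slides.

```python
def bft_replicas_needed(f: int) -> int:
    """Replicas a Byzantine fault-tolerant group needs to survive f faults."""
    return 3 * f + 1

def file_copies_needed(f: int) -> int:
    """File copies needed when a trusted hash can validate each copy."""
    return f + 1

def max_byzantine_faults(n: int) -> int:
    """Largest f with 3f + 1 <= n, i.e. fewer than one third of n replicas."""
    return (n - 1) // 3

for f in (1, 2, 3):
    print(f"f={f}: BFT needs {bft_replicas_needed(f)} replicas, "
          f"hash-validated file data needs {file_copies_needed(f)} copies")
```

Running the Byzantine protocol only over metadata keeps the expensive 3f + 1 replication away from bulk file data.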

17 Semantic Differences from NTFS
• Hard limit on concurrent writes
• Soft limit on concurrent reads
    ◦ Sometimes supplies stale snapshots
• No name-locking on an open file’s path

18 File System Features
• Reliability
• Availability
• Security
• Durability
• Consistency
• Scalability
• Efficiency
• Manageability

19 Reliability and Availability
• Replication
• When a machine is unavailable for an extended period
    ◦ Its functions migrate to others
• Caching

20 Privacy
• File content and metadata are encrypted
• Convergent encryption
    ◦ Encrypt(Hash_one_way(block_plain), block_plain) → block_cipher
[Figure: data blocks are hashed, and each hash is used as the key to encrypt its block]
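
The convergent-encryption formula above can be sketched as follows. The key for each block is the one-way hash of its plaintext, so identical blocks produce identical ciphertext no matter who encrypts them. The cipher here is a toy XOR keystream built from SHA-256, chosen only so the example needs no third-party library; it is an assumption of this sketch, not the cipher Farsite uses, and it should not be used for real data.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256(key || counter). NOT a vetted cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def convergent_encrypt(block: bytes):
    """Encrypt(Hash_one_way(block_plain), block_plain) -> (key, block_cipher)."""
    key = hashlib.sha256(block).digest()
    return key, xor_cipher(key, block)

# Two users encrypt the same block independently.
key_a, cipher_a = convergent_encrypt(b"same block contents")
key_b, cipher_b = convergent_encrypt(b"same block contents")
_, cipher_c = convergent_encrypt(b"different contents!")

recovered = xor_cipher(key_a, cipher_a)
```

Because `cipher_a` and `cipher_b` are equal, a storage host can spot the duplicate without ever seeing the plaintext.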

21 More on Convergent Encryption
• Block hashes are used to identify identical block contents
• Block-level encryption allows block-level changes without re-encrypting the entire file

22 More on Convergent Encryption
• Encrypt(Key_file, file_hashes_plain) → file_hashes_cipher
[Figure: the block hashes are encrypted with the file key]

23 More on Convergent Encryption
• Encrypt(Key_client1_public, Key_file) → Key_file_cipher1
• Encrypt(Key_client2_public, Key_file) → Key_file_cipher2
• …
• Store both the encrypted file and the encrypted keys

24 Directories
• Also encrypted
• Use exclusive encryption
    ◦ Prevents a malicious client from encrypting a syntactically illegal name

25 Integrity
• Use hash trees to compare files
    ◦ If the roots match, the two files are identical
    ◦ If not, compare the hashes at the lower level
    ◦ Repeat until the discrepancy is identified
• The cost of in-place updates is logarithmic in the file size
• Linear time to verify the integrity of individual blocks
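
The top-down comparison described above can be sketched with a small binary hash tree. This is a minimal illustration under stated assumptions: SHA-256 as the hash, a binary tree that duplicates the last hash on odd-sized levels, and two files with the same number of blocks. The names `build_tree` and `first_differing_block` are hypothetical, and the slides do not describe Farsite's actual tree layout.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    """Return the tree as a list of levels: leaf hashes first, root last."""
    level = [h(b) for b in blocks]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level.append(level[-1])  # pad odd levels by repeating the last hash
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def first_differing_block(tree_a, tree_b):
    """Walk down from the root; return the first differing block index or None."""
    if tree_a[-1] == tree_b[-1]:
        return None  # roots match, so the files are identical
    index = 0
    for depth in range(len(tree_a) - 2, -1, -1):
        left = 2 * index
        # Descend into the left child if it differs, otherwise the right one.
        index = left if tree_a[depth][left] != tree_b[depth][left] else left + 1
    return index

blocks_a = [b"block0", b"block1", b"block2", b"block3", b"block4"]
blocks_b = list(blocks_a)
blocks_b[3] = b"BLOCK3"  # simulate one damaged block

same = first_differing_block(build_tree(blocks_a), build_tree(blocks_a))
diff = first_differing_block(build_tree(blocks_a), build_tree(blocks_b))
```

Each step of the walk compares one pair of hashes per level, so locating a single damaged block touches a number of nodes proportional to the tree height.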

26 Durability
• Updates are logged and compressed locally
• The log is pushed back to the directory group periodically and when a lease is recalled
• Each log entry is verified

27 Consistency
• Control can be loaned to clients
    ◦ Content leases
    ◦ Name leases
    ◦ Mode leases
    ◦ Access leases

28 Data Consistency
• Content leases
    ◦ Read/write
    ◦ Read-only
        ▪ Assures no stale data
    ◦ Single-writer, multiple-reader semantics
    ◦ A lease is kept until it expires or is recalled
    ◦ Can lease a file, a directory, or a tree
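
Single-writer, multiple-reader semantics can be sketched with a small in-memory lease table. This is an illustration only: the class `ContentLeaseManager` and its methods are hypothetical, expiry timers are omitted, and a conflicting request is simply refused here, whereas the slides indicate that a real lease would be recalled from its holder.

```python
class ContentLeaseManager:
    """Tracks read-only and read/write content leases per path (sketch)."""

    def __init__(self):
        self.readers = {}  # path -> set of client ids holding read-only leases
        self.writer = {}   # path -> client id holding the read/write lease

    def acquire_read(self, path, client):
        holder = self.writer.get(path)
        if holder is not None and holder != client:
            return False  # another client may be changing the file
        self.readers.setdefault(path, set()).add(client)
        return True

    def acquire_write(self, path, client):
        holder = self.writer.get(path)
        if holder is not None and holder != client:
            return False  # only one writer at a time
        if self.readers.get(path, set()) - {client}:
            return False  # other readers would see stale data
        self.writer[path] = client
        return True

    def release(self, path, client):
        self.readers.get(path, set()).discard(client)
        if self.writer.get(path) == client:
            del self.writer[path]

leases = ContentLeaseManager()
r1 = leases.acquire_read("/docs/a.txt", "client1")   # granted
r2 = leases.acquire_read("/docs/a.txt", "client2")   # granted: many readers
w3 = leases.acquire_write("/docs/a.txt", "client3")  # refused: readers exist
leases.release("/docs/a.txt", "client1")
leases.release("/docs/a.txt", "client2")
w3_retry = leases.acquire_write("/docs/a.txt", "client3")  # granted
r1_again = leases.acquire_read("/docs/a.txt", "client1")   # refused: writer
```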

29 Namespace Consistency
• Name leases
    ◦ Can create a file name
    ◦ Can create a directory and its files and subdirectories

30 Windows File-Sharing Semantics
• Mode leases
    ◦ Read, write, delete, exclude-read, exclude-write, exclude-delete

31 Windows Deletion Semantics
• Open it, mark it for deletion, close it
• A file is not deleted until the last file close
• Access leases
    ◦ Public: Lease holder has the file open
    ◦ Protected
        ▪ No other client will be granted access without first contacting the lease holder
    ◦ Private
        ▪ No other client has any access lease on the file

32 Scalability
• Hint-based pathname translation
    ◦ Caching
• Delayed directory-change notification

33 Space Efficiency
• Reclaim space from duplicate files
    ◦ Workgroup-shared documents
    ◦ Multiple copies of common applications
    ◦ Can save 50% of the storage requirement
    ◦ Based on hash comparisons
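
Reclaiming space through hash comparison can be sketched with a small content-addressed store: files with the same hash share one stored copy. The class `DedupStore` and the choice of SHA-256 are assumptions for illustration, and the savings shown depend entirely on the sample data, not on the 50% figure quoted above.

```python
import hashlib

class DedupStore:
    """Stores each distinct content once, keyed by its hash (sketch)."""

    def __init__(self):
        self.blobs = {}  # content hash -> bytes (stored once)
        self.names = {}  # file name -> content hash

    def put(self, name, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # keep only the first copy
        self.names[name] = digest

    def get(self, name):
        return self.blobs[self.names[name]]

    def logical_bytes(self):
        """Bytes the users believe they are storing."""
        return sum(len(self.blobs[d]) for d in self.names.values())

    def physical_bytes(self):
        """Bytes actually kept after duplicates are coalesced."""
        return sum(len(b) for b in self.blobs.values())

store = DedupStore()
app = b"common application binary" * 4
store.put("alice/app.exe", app)
store.put("bob/app.exe", app)          # duplicate of Alice's copy
store.put("bob/notes.txt", b"unique")  # not a duplicate

logical = store.logical_bytes()
physical = store.physical_bytes()
```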

34 Time Efficiency
• Insert a delay between a file's creation and its replication
    ◦ Many files are expected to be deleted shortly after their creation
    ◦ Reduces network traffic
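
The effect of delaying replication can be sketched with a queue that only replicates files that outlive a grace period. The class `DelayedReplicator`, the 60-second delay, and the explicit `now` timestamps are all assumptions for illustration; the slides do not give the actual delay or mechanism.

```python
class DelayedReplicator:
    """Replicates a file only if it still exists after `delay` seconds."""

    def __init__(self, delay):
        self.delay = delay
        self.pending = {}     # file name -> creation time
        self.replicated = []  # files that were actually sent over the network

    def create(self, name, now):
        self.pending[name] = now

    def delete(self, name):
        # A file deleted before its delay elapses is never replicated.
        self.pending.pop(name, None)

    def tick(self, now):
        """Replicate every pending file that has reached the delay."""
        due = [n for n, t in self.pending.items() if now - t >= self.delay]
        for name in due:
            del self.pending[name]
            self.replicated.append(name)
        return due

rep = DelayedReplicator(delay=60)
rep.create("scratch.tmp", now=0)
rep.create("report.doc", now=0)
rep.delete("scratch.tmp")  # short-lived file disappears early
early = rep.tick(now=30)   # nothing is old enough yet
later = rep.tick(now=60)   # only the surviving file is replicated
```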

35 Local-Machine Administration
• Machine replacement
    ◦ A special case of hardware failure
• Little need for backup

36 Performance Measurements
• Used only five machines…
• With only 1 hour of file-system trace
    ◦ 450,164 file operations
• Reads, writes, and closes take 2 to 4 times as long as on NTFS
• Opens take 9 times as long
• Metadata accesses take 20 times as long
• I/O latencies are 5.5 times as long

