FARSITE: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment.



Introduction
• Farsite: a serverless distributed file system
  • Logically functions as a centralized file server
• Designed for desktop environments
• Needs some effort for initial configuration
• Needs little central administration to maintain

Farsite Characteristics
• Peer-to-peer among untrusted machines
• Needs to handle privacy, integrity, and durability
  • Cryptography
  • Randomized replication
  • Byzantine fault tolerance

Farsite Workloads
• High access locality
• Low update rate
• Sequential accesses with rare concurrency

Administration
• Machine certificates bind machines to their public keys
• User certificates bind users to their public keys
• Namespace certificates bind namespace roots to their managing machines

Design Assumptions
• Designed for ~10^5 machines
• All interconnected by a high-bandwidth, low-latency network
• A majority of machines are up most of the time
• Uncorrelated permanent machine failures
• Read-mostly sharing
• Few malicious users

Enabling Technology Trends
• Increase in unused disk capacity
  • In 2000, 58% of disk capacity unused at Microsoft
  • Can replicate data for reliability
• Decrease in the computational cost
  • Can easily encrypt at 53 MB/sec
  • Disk transfers at 32 MB/sec
  • Can use strong cryptography for security

Namespace Roots
• Allow multiple roots for multiple machines

Trust and Certification
• Based on public-key-cryptographic certificates
  • Encrypt(Key_public, text_plain) -> text_cipher
  • Decrypt(Key_private, text_cipher) -> text_plain
  • Encrypt(Key_private, text_plain) -> text_cipher
  • Decrypt(Key_public, text_cipher) -> text_plain

Public Key Encryption Basics
• Idea
  • The public key is published
  • The private key is the secret
• Encrypt(Key_my_public, "Hi, Andy")
  • Anyone can create it, but only I can read it
• Encrypt(Key_my_private, "I'm Andy")
  • Everyone can read it, but only I can create it

Public Key Encryption Basics
• Encrypt(Key_your_public, Encrypt(Key_my_private, "I know your secret"))
  • Only you can read it, and only I can send it
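
The three cases on these slides can be tried out with textbook RSA. This is only a toy sketch: the tiny primes, the numeric message, and the helper names `make_keypair` and `apply_key` are assumptions made for illustration, and nothing here reflects how Farsite itself implements its cryptography.

```python
# Toy RSA with tiny primes: for illustration only, never for real security.

def make_keypair(p, q, e):
    """Return ((e, n), (d, n)): a public and a private key built from primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (requires Python 3.8+)
    return (e, n), (d, n)

def apply_key(key, number):
    """Textbook RSA: the same modular exponentiation serves both directions."""
    exponent, modulus = key
    return pow(number, exponent, modulus)

my_public, my_private = make_keypair(61, 53, 17)     # modulus 3233
your_public, your_private = make_keypair(89, 97, 5)  # modulus 8633

message = 1234  # a number standing in for the text; must be below both moduli

# Encrypt(Key_my_public, m): anyone can create it, but only I can read it.
for_me = apply_key(my_public, message)
assert apply_key(my_private, for_me) == message

# Encrypt(Key_my_private, m): everyone can read it, but only I can create it.
from_me = apply_key(my_private, message)
assert apply_key(my_public, from_me) == message

# Encrypt(Key_your_public, Encrypt(Key_my_private, m)):
# only you can read it, and only I can send it.
sealed = apply_key(your_public, apply_key(my_private, message))
recovered = apply_key(my_public, apply_key(your_private, sealed))
assert recovered == message
```

The nesting works here because the inner result is always below my modulus (3233), which is smaller than your modulus (8633).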

Basic System
• Every machine can play three roles
  • Client
    • A machine that interacts with a user
  • Directory group
    • A set of machines that manage files via a Byzantine-fault-tolerant protocol
    • Every group member owns a replica
  • File host

More on the Basic System
+ Reliability
+ Data integrity
- Performance
  • Byzantine fault tolerance works only while fewer than 1/3 of the replicas are faulty
  • Needs lots of replicas
- Privacy
- Storage consumption

System Enhancements
• Local caching
  • A client can lease a copy of a file
• Encrypt written files with the public keys of all authorized clients
  • Offload those files to file hosts
  • Store only the content hash of those files locally
  • Can detect damaged copies by validating them against the hash
  • Can tolerate n - 1 file host failures
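
A minimal sketch of why a stored content hash lets n copies survive n - 1 failures: any single copy that still matches the hash is enough. SHA-256 and the names `content_hash` and `fetch_valid_copy` are assumptions for illustration; the slides do not say which hash function or interface Farsite uses.

```python
import hashlib

def content_hash(data: bytes) -> str:
    """Hash of the file content (SHA-256 is an assumption, not Farsite's stated choice)."""
    return hashlib.sha256(data).hexdigest()

def fetch_valid_copy(copies, expected_hash):
    """Return the first copy that matches the stored hash, or None if none does.

    A missing copy is modelled as None; a damaged copy simply fails the check.
    """
    for copy in copies:
        if copy is not None and content_hash(copy) == expected_hash:
            return copy
    return None

original = b"quarterly report"
stored_hash = content_hash(original)  # the only thing kept apart from the file hosts

# Three file hosts: one is down, one returns damaged data, one is healthy.
copies = [None, b"quarterly rep0rt", original]
assert fetch_valid_copy(copies, stored_hash) == original
```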

Traditional Byzantine Approach [CL99]
[Figure labels: Client; File; Meta-Data; Byzantine-fault-tolerant protocol; Byzantine servers]
• 3f + 1 file copies to handle f failures

Farsite: BFT only for meta-data
[Figure labels: Client; Byzantine-fault-tolerant protocol; Directory group; File hosts]
• f + 1 file copies for f failures
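
The difference between the two approaches is easiest to see as arithmetic. The function names below are hypothetical; the formulas are the 3f + 1 and f + 1 counts given on the two slides.

```python
def bft_file_copies(f: int) -> int:
    """File copies needed when file data itself goes through the BFT protocol."""
    return 3 * f + 1

def farsite_file_copies(f: int) -> int:
    """File copies needed when only metadata is kept by the BFT directory group."""
    return f + 1

# Compare the storage cost for a few failure counts.
for f in (1, 2, 3):
    print(f"f={f}: traditional={bft_file_copies(f)}, farsite={farsite_file_copies(f)}")
```

For every f the saving is 2f file copies, which is why restricting the Byzantine protocol to metadata reduces storage consumption.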

Semantic Differences from NTFS
• Hard limit on concurrent writes
• Soft limit on concurrent reads
  • Sometimes supplies stale snapshots
• No name-locking on an open file's path

File System Features
• Reliability
• Availability
• Security
• Durability
• Consistency
• Scalability
• Efficiency
• Manageability

Reliability and Availability
• Replication
• When a machine is unavailable for an extended period
  • Its functions migrate to others
• Caching

Privacy
• File content and metadata are encrypted
• Convergent encryption
  • Encrypt(Hash_one_way(block_plain), block_plain) -> block_cipher
[Figure labels: Hash; Encrypt; Data blocks]

More on Convergent Encryption
• Block hashes are used to identify identical block contents
• Block-level encryption allows block-level changes without re-encrypting the entire file

More on Convergent Encryption
• Encrypt(Key_file, file_hashes_plain) -> file_hashes_cipher
[Figure labels: Encrypt; Block hashes]

More on Convergent Encryption
• Encrypt(Key_client1_public, Key_file) -> Key_file_cipher1
• Encrypt(Key_client2_public, Key_file) -> Key_file_cipher2
• …
• Store both the encrypted file and the encrypted keys
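
The core property of convergent encryption, that identical plaintext blocks produce identical ciphertext, can be sketched as follows. The cipher here is a toy stand-in (XOR with a SHA-256 counter-mode keystream), chosen only so the example runs with the standard library; it is not secure and is not the cipher Farsite uses. The names `keystream`, `xor_cipher`, and `convergent_encrypt` are hypothetical.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key plus a counter. A stand-in for a real cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; applying it twice with the same key restores data."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

def convergent_encrypt(block: bytes):
    """Encrypt(Hash_one_way(block_plain), block_plain) -> block_cipher.

    The key is derived from the block itself, so no key exchange is needed
    for two clients to produce the same ciphertext from the same content.
    """
    block_key = hashlib.sha256(block).digest()
    return block_key, xor_cipher(block_key, block)

key_a, cipher_a = convergent_encrypt(b"same block contents")
key_b, cipher_b = convergent_encrypt(b"same block contents")
key_c, cipher_c = convergent_encrypt(b"other block contents")

assert cipher_a == cipher_b  # duplicates are detectable without decrypting
assert cipher_a != cipher_c  # different content gives different ciphertext
assert xor_cipher(key_a, cipher_a) == b"same block contents"
```

In the scheme on the slides, the per-block keys would then be protected under a file key, and the file key encrypted once per authorized client; that wrapping step is omitted here.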

Directories
• Also encrypted
• Use exclusive encryption
  • Prevents a malicious client from encrypting a syntactically illegal name

Integrity
• Use hash trees to compare files
  • If the roots match, the two files are identical
  • If not, compare the hashes at the next lower level
  • Repeat until the discrepancy is identified
• The cost of an in-place update is logarithmic in the file size
• Logarithmic time to verify the integrity of an individual block
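
The comparison procedure on this slide can be sketched with a small binary hash tree. The sketch assumes SHA-256, a power-of-two number of blocks, and two files with the same block count; the names `build_tree` and `first_difference` are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(blocks):
    """Return the tree as a list of levels: levels[0] holds leaf hashes, levels[-1] the root.

    Assumes len(blocks) is a power of two.
    """
    level = [h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def first_difference(tree_a, tree_b):
    """Walk down from the root to the index of one differing block, or None if identical."""
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return None  # roots match, so the files are identical
    index = 0
    for depth in range(top - 1, -1, -1):
        left = 2 * index
        # Follow the left child if it differs, otherwise the right child must differ.
        index = left if tree_a[depth][left] != tree_b[depth][left] else left + 1
    return index

original = build_tree([b"a", b"b", b"c", b"d"])
modified = build_tree([b"a", b"b", b"X", b"d"])

assert first_difference(original, original) is None
assert first_difference(original, modified) == 2
```

The walk visits one node per level, so locating a damaged block takes a number of comparisons proportional to the height of the tree rather than to the number of blocks.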

Durability
• Updates are logged and compressed locally
• The log is pushed back to the directory group periodically and when a lease is recalled
• Each log entry is verified

Consistency
• Control can be loaned to clients
  • Content leases
  • Name leases
  • Mode leases
  • Access leases

Data Consistency
• Content leases
  • Read/write
  • Read-only
    • Assures no stale data
  • Single-writer, multiple-reader semantics
  • A lease is kept until it expires or is recalled
  • Can lease a file, a directory, or a tree
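
Single-writer, multiple-reader semantics can be illustrated with a small lease table for one file. This is a hypothetical sketch: the class `ContentLeases` and its methods are invented for illustration, and a refused request here stands in for the lease recall and expiry that a real system would perform.

```python
class ContentLeases:
    """Hypothetical lease table for a single file: many readers or one writer."""

    def __init__(self):
        self.readers = set()
        self.writer = None

    def request_read(self, client) -> bool:
        # A read-only lease is refused while another client holds the write lease.
        if self.writer is not None and self.writer != client:
            return False
        self.readers.add(client)
        return True

    def request_write(self, client) -> bool:
        # A read/write lease is refused while any other client holds a lease.
        other_readers = self.readers - {client}
        if other_readers or self.writer not in (None, client):
            return False
        self.writer = client
        return True

    def release(self, client) -> None:
        """Give back every lease the client holds (models expiry or recall)."""
        self.readers.discard(client)
        if self.writer == client:
            self.writer = None

leases = ContentLeases()
assert leases.request_read("alice")
assert leases.request_read("bob")        # multiple readers are fine
assert not leases.request_write("carol") # but no writer while others read
leases.release("alice")
leases.release("bob")
assert leases.request_write("carol")     # single writer
assert not leases.request_read("alice")  # readers wait, so they never see stale data
```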

Namespace Consistency
• Name leases
  • Can create a file name
  • Can create a directory and its files and subdirectories

Windows File-Sharing Semantics
• Mode leases
  • Read, write, delete, exclude-read, exclude-write, exclude-delete

Windows Deletion Semantics
• Open it, mark it for deletion, close it
• A file is not deleted until the last file close
• Access leases
  • Public: the lease holder has the file open
  • Protected
    • No other client will be granted access without first contacting the lease holder
  • Private
    • No other client has any access lease on the file
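
The deletion rule on this slide (open, mark for deletion, close, with the actual removal deferred to the last close) can be modelled with a handle counter. The class `TrackedFile` is a hypothetical illustration and does not model the access leases that Farsite uses to track open handles across clients.

```python
class TrackedFile:
    """Hypothetical model of a file that is deleted only on its last close."""

    def __init__(self):
        self.open_handles = 0
        self.delete_pending = False
        self.deleted = False

    def open(self) -> None:
        if self.deleted:
            raise FileNotFoundError("file has already been deleted")
        self.open_handles += 1

    def mark_for_deletion(self) -> None:
        # Marking does not remove the file; it only records the intent.
        self.delete_pending = True

    def close(self) -> None:
        if self.open_handles == 0:
            raise ValueError("file is not open")
        self.open_handles -= 1
        # The file goes away only when the last handle is closed.
        if self.open_handles == 0 and self.delete_pending:
            self.deleted = True

f = TrackedFile()
f.open()               # first client opens the file
f.open()               # second client opens the same file
f.mark_for_deletion()
f.close()
assert not f.deleted   # still open elsewhere, so not deleted yet
f.close()
assert f.deleted       # last close triggers the deletion
```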

Scalability
• Hint-based pathname translation
  • Caching
• Delayed directory-change notification

Space Efficiency
• Reclaim space from duplicate files
  • Workgroup-shared documents
  • Multiple copies of common applications
  • Can save 50% of the storage requirement
  • Based on hash comparisons
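
Finding duplicates through hash comparisons can be sketched in a few lines. The sketch works on whole files held in memory and uses SHA-256; both choices, and the names `find_duplicates` and `reclaimable_bytes`, are assumptions for illustration rather than Farsite's actual mechanism.

```python
import hashlib
from collections import defaultdict

def find_duplicates(files):
    """Group file names by content hash; return only groups with more than one name."""
    by_hash = defaultdict(list)
    for name, data in files.items():
        by_hash[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_hash.values() if len(names) > 1]

def reclaimable_bytes(files) -> int:
    """Bytes saved if every set of identical files were stored only once."""
    seen = set()
    saved = 0
    for data in files.values():
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            saved += len(data)  # a later copy of content we already store
        else:
            seen.add(digest)
    return saved

files = {
    "alice/report.doc": b"shared workgroup document",
    "bob/report.doc": b"shared workgroup document",
    "bob/notes.txt": b"private notes",
}
assert find_duplicates(files) == [["alice/report.doc", "bob/report.doc"]]
assert reclaimable_bytes(files) == len(b"shared workgroup document")
```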

Time Efficiency
• Insert a delay between a file's creation and its replication
  • Many files are expected to be deleted shortly after their creation
  • Reduces network traffic

Local-Machine Administration
• Machine replacement
  • A special case of hardware failure
• Little need for backup

Performance Measurements
• Used only five machines…
• With only 1 hour of file-system trace
  • 450,164 file operations
• Reads, writes, and closes take 2 to 4 times as long as on NTFS
• Opens take 9 times as long
• Metadata accesses take 20 times as long
• I/O latencies are 5.5 times slower