JTE HPC/FS. Pastis: a peer-to-peer file system for persistent large-scale storage. Jean-Michel Busca, Fabio Picconi, Pierre Sens. LIP6, Université Paris 6.

Presentation transcript:

1 Pastis: a peer-to-peer file system for persistent large-scale storage
Jean-Michel Busca, Fabio Picconi, Pierre Sens
LIP6, Université Paris 6 – CNRS, Paris, France
INRIA, Rocquencourt, France

2 Outline
1. DHT-based file systems
2. Pastis
3. Performance evaluation

3 Distributed file systems

Scalability (number of nodes)   Client-server   P2P
LAN (100)                       NFS             -
Organization (10,000)           AFS             FARSITE, Pangaea
Internet ( )                    -               Ivy *, Oceanstore *, Pastis *

* uses a Distributed Hash Table (DHT) to store data

4 Distributed Hash Tables

5 DHTs
[Figure: an overlay network maps nodes located in South America, North America, Australia, Asia and Europe onto a logical address space.]
- high latency, low bandwidth between logical neighbors

6 Insertion of blocks in DHT
[Figure: nodes (04F2, 895D, AC78, C52A, E25A, ...) laid out on the address space, with keys k = 8958 and k = 8959. put(8959, block) routes the block to the root of key 8959; node 895D is labeled as holding a replica.]

7 PAST: Storage System
PAST: cooperative, archival file storage and distribution
- Layered on top of Pastry
Goals:
- Strong persistence of the data
- High availability
- Scalability of the system
- Reduced cost (no backup)
- Efficient use of pooled resources

8 Insertion of blocks in DHT
[Figure: same address space as in slide 6 after put(8959, block); in addition to node 895D, a second node is labeled as holding a replica of the block.]

9 Insertion of blocks in DHT
[Figure: same address space as in slide 6, with replicas of the block at node 895D and one other node; get(8959, block) retrieves the block stored under key 8959.]
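The put/get walk-through in slides 6 to 9 can be sketched as a toy in-memory DHT. This is only an illustration under stated assumptions: the class name ToyDHT, the node ID 3A79, the replication factor of 3 and the "numerically closest node ID" placement rule are choices made for the sketch, not details given on the slides.

```python
# Toy in-memory DHT. Node IDs, REPLICAS and the closest-ID placement rule are
# illustrative assumptions; ring wrap-around and message routing are ignored.
NODE_IDS = [0x04F2, 0x3A79, 0x895D, 0xAC78, 0xC52A, 0xE25A]
REPLICAS = 3


class ToyDHT:
    def __init__(self, node_ids):
        # One local block store (a dict) per node.
        self.stores = {node_id: {} for node_id in node_ids}

    def replica_set(self, key):
        """Return the REPLICAS node IDs closest to key; the first one is the root."""
        closest_first = sorted(self.stores, key=lambda node_id: abs(node_id - key))
        return closest_first[:REPLICAS]

    def put(self, key, block):
        # Store the block on the root and on the other members of the replica set.
        for node_id in self.replica_set(key):
            self.stores[node_id][key] = block

    def get(self, key, failed=()):
        # Ask the replica set in order, skipping nodes assumed to have failed.
        for node_id in self.replica_set(key):
            if node_id not in failed and key in self.stores[node_id]:
                return self.stores[node_id][key]
        return None


dht = ToyDHT(NODE_IDS)
dht.put(0x8959, b"block")
print(hex(dht.replica_set(0x8959)[0]))      # root of key 8959
print(dht.get(0x8959, failed=(0x895D,)))    # still readable from a replica
```

Keeping several replicas is what lets get() succeed when the root node is unreachable, which is the point of the "replica" labels in the figures.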

10 P2P file systems architecture
FS layer (Ivy / Pastis): open(), read(), write(), close(), etc.
- files and directories
- read-write access semantics
- security and access control
DHT layer (DHash / Past): put(key, block), block = get(key)
- block store (DHT)
- message routing
DHT properties:
- scalability
- fault-tolerance
- self-organization

11 DHT-based file systems
Ivy [OSDI’02]
- log-based, one log per user
- fast writes, slow reads
- limited to a small number of users
Oceanstore [FAST’03]
- updates serialized by primary replicas
- partially centralized system
- BFT agreement protocol requires well-connected primary replicas
[Figures: Ivy keeps one log per user (User A's log, User B's log, User C's log), each made of DHT objects; Oceanstore distinguishes primary replicas from secondary replicas.]

12 Pastis

13 Pastis design
Design goals
- simple
- completely decentralized
- scalable (network size and number of users)
Layers: Pastis (FS) on top of Past (DHT storage) on top of Pastry (DHT routing); Pastis accesses the DHT through put(key, block) and block = get(key).

14 Pastis data structures
Data structures similar to the Unix file system
- inodes are stored in modifiable DHT blocks (UCBs)
- file contents are stored in immutable DHT blocks (CHBs)
[Figure: a file inode (UCB) holds metadata and block addresses pointing to the file contents (CHB1, CHB2); the inode key locates the UCB in the DHT address space, and the UCB, CHB1 and CHB2 each have their own replica set.]

15 Pastis data structures (cont.)
- directories contain entries
- use indirect blocks for large files
[Figure: a directory inode (UCB, with metadata and block addresses) points to a CHB holding the entries "file1, key1", "file2, key2", ...; the file1 inode (UCB) points to CHBs with file contents, reaching some of them through an indirect-block CHB; CHBs with old contents remain alongside the current ones.]
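A minimal sketch of how a file could be laid out with these structures: contents are split into immutable blocks keyed by their hash, and a mutable inode lists the block addresses. The dict standing in for the DHT, the 4-byte block size, the inode fields and the function names are assumptions made for the example; signatures, directories and indirect blocks are left out.

```python
import hashlib

BLOCK_SIZE = 4  # deliberately tiny so the example produces several blocks


def put_chb(dht, contents):
    """Store an immutable block under the hash of its contents (SHA-256 assumed)."""
    key = hashlib.sha256(contents).hexdigest()
    dht[key] = contents
    return key


def write_file(dht, inode_key, data):
    """Store data as CHBs, then overwrite the mutable inode with the new addresses."""
    addresses = [
        put_chb(dht, data[offset:offset + BLOCK_SIZE])
        for offset in range(0, len(data), BLOCK_SIZE)
    ]
    dht[inode_key] = {"size": len(data), "blocks": addresses}


def read_file(dht, inode_key):
    """Fetch the inode first, then every content block it points to."""
    inode = dht[inode_key]
    return b"".join(dht[address] for address in inode["blocks"])


dht = {}  # stand-in for put(key, block) / block = get(key)
write_file(dht, "inode-of-file1", b"hello world")
old_blocks = list(dht["inode-of-file1"]["blocks"])
write_file(dht, "inode-of-file1", b"hello there")
print(read_file(dht, "inode-of-file1"))
print(all(address in dht for address in old_blocks))  # old contents are still stored
```

Because content blocks are never modified in place, an update only writes new CHBs and replaces the inode, which matches the "old contents" blocks left behind in the figure.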

16 Content Hash Block (CHB)
Content Hash Block
- block has to be immutable
Solution to check and prevent modification
- block contents determine block key
- can detect if block is modified
block key = Hash(block contents)
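The CHB rule, block key = Hash(block contents), can be sketched in a few lines. SHA-256, the dict used as block store and the function names are assumptions for the illustration; the slides do not name the hash function.

```python
import hashlib


def chb_key(contents: bytes) -> str:
    # The key is derived from the contents alone (SHA-256 is an assumed choice).
    return hashlib.sha256(contents).hexdigest()


def put_chb(store: dict, contents: bytes) -> str:
    key = chb_key(contents)
    store[key] = contents
    return key


def get_chb(store: dict, key: str) -> bytes:
    contents = store[key]
    # A reader recomputes the hash: any modification changes it and is detected.
    if chb_key(contents) != key:
        raise ValueError("block contents do not match block key")
    return contents


store = {}
key = put_chb(store, b"file contents")
print(get_chb(store, key))

store[key] = b"tampered contents"  # simulate a malicious or faulty storage node
try:
    get_chb(store, key)
except ValueError as error:
    print("rejected:", error)
```

Changing the contents legitimately yields a different key, so an update is stored as a new block rather than overwriting the old one.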

17 User Certificate Blocks (UCBs)
UCBs are modifiable by the block owner.
Question: how to check that the file is modified only by the owner?
Protocol
- a key pair (KB_pub, KB_priv) is associated with each block
- the owner signs the block using KB_priv
Authentication
- verify the signature of the UCB using KB_pub
[Figure: a UCB contains the inode contents, a timestamp and a signature sign(KB_priv); block key = Hash(KB_pub).]
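The UCB check can be illustrated with a deliberately insecure toy: textbook RSA over two small known primes, using only the standard library. Everything specific here is an assumption made for the sketch (the RSA scheme and its parameters, SHA-256, the field names, and carrying the public key inside the block); a real system would use a vetted signature library.

```python
import hashlib

# Toy textbook RSA over two Mersenne primes: NOT secure, for illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)
KB_PUB, KB_PRIV = (N, E), D


def digest(data: bytes, modulus: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus


def ucb_key(pub) -> str:
    # block key = Hash(KB_pub)
    return hashlib.sha256(f"{pub[0]}:{pub[1]}".encode()).hexdigest()


def make_ucb(inode: bytes, timestamp: int, priv: int, pub) -> dict:
    # The owner signs the inode contents together with the timestamp.
    payload = inode + timestamp.to_bytes(8, "big")
    signature = pow(digest(payload, pub[0]), priv, pub[0])
    return {"inode": inode, "timestamp": timestamp, "pub": pub, "signature": signature}


def verify_ucb(key: str, ucb: dict) -> bool:
    modulus, exponent = ucb["pub"]
    if ucb_key(ucb["pub"]) != key:  # public key must match the block key
        return False
    payload = ucb["inode"] + ucb["timestamp"].to_bytes(8, "big")
    return pow(ucb["signature"], exponent, modulus) == digest(payload, modulus)


ucb = make_ucb(b"inode v1", 1, KB_PRIV, KB_PUB)
key = ucb_key(KB_PUB)
print(verify_ucb(key, ucb))                           # signed by the owner
print(verify_ucb(key, dict(ucb, inode=b"inode v2")))  # modified without KB_priv
```

Since the block key depends only on KB_pub, the owner can publish a new signed version under the same key, while anyone without KB_priv cannot produce a version that verifies.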