Pond: the OceanStore Prototype Sean Rhea, Patrick Eaton, Dennis Geels, Hakim Weatherspoon, Ben Zhao, and John Kubiatowicz University of California, Berkeley.


1 Pond: the OceanStore Prototype Sean Rhea, Patrick Eaton, Dennis Geels, Hakim Weatherspoon, Ben Zhao, and John Kubiatowicz University of California, Berkeley Proc. of the 2nd USENIX Conf. on File and Storage Technologies (FAST '03) Presented by Park, Seon-Yeong

2 2/26 Ubiquitous Computing Telephone SPO Watch PDA Cell Phone Digital TV PC Storage Pool

3 3/26 OceanStore Overview Internet-scale, Cooperative File System Application Calendars, Email, Contact Lists, Large Digital Libraries, Repositories for Scientific Data, Distributed Design Tool, etc. Requirements Universal Availability Durability Understandable Consistency Model Privacy vs. Information Sharing

4 4/26 Data Model (1/2) Data Object A File in a Traditional File System Named by an Active Globally-Unique Identifier, AGUID –Location Independent –Preventing Name Space Collisions SHA-1 AGUID Application-specified Name + Owner’s Public Key
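As a minimal sketch of the naming rule on this slide, the AGUID can be computed as a SHA-1 hash over the application-specified name and the owner's public key. The function name `make_aguid`, the byte encoding, the concatenation order, and the placeholder key bytes are assumptions made for illustration; the slide does not specify them.

```python
import hashlib


def make_aguid(name: str, owner_public_key: bytes) -> str:
    """Hash an application-specified name together with the owner's public key.

    The encoding and concatenation order are assumptions for this sketch.
    """
    h = hashlib.sha1()
    h.update(name.encode("utf-8"))
    h.update(owner_public_key)
    return h.hexdigest()


# Placeholder key material, not real public keys.
alice_key = b"alice-public-key-bytes"
bob_key = b"bob-public-key-bytes"

# The same name under two different owners yields two different AGUIDs,
# which is how name space collisions between owners are avoided.
aguid_alice = make_aguid("calendar", alice_key)
aguid_bob = make_aguid("calendar", bob_key)
```

Because the identifier depends only on the name and the key, it carries no information about where the object is stored, which matches the location-independence point above.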

5 5/26 Data Model (2/2) Data Object Sequences of Read-only Versions Block Reference –Cryptographically-secure Hash of Child Block’s Contents
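The hash-based block references can be illustrated with a small sketch. It assumes a single-level tree (one top block listing the hashes of its data blocks) and uses hypothetical helper names `block_id` and `build_version`; the real structure may have more levels.

```python
import hashlib
from typing import Dict, List, Tuple


def block_id(data: bytes) -> str:
    """A block reference is the cryptographic hash of the block's contents."""
    return hashlib.sha1(data).hexdigest()


def build_version(data_blocks: List[bytes]) -> Tuple[str, Dict[str, bytes]]:
    """Store the data blocks and a top block that lists their references.

    Returns the reference of the top block and the block store.
    """
    store: Dict[str, bytes] = {}
    child_ids = []
    for block in data_blocks:
        bid = block_id(block)
        store[bid] = block
        child_ids.append(bid)
    top_block = "\n".join(child_ids).encode("ascii")
    top_id = block_id(top_block)
    store[top_id] = top_block
    return top_id, store


# Two read-only versions that differ in one block.
v1_id, v1_store = build_version([b"chapter one", b"chapter two"])
v2_id, v2_store = build_version([b"chapter one", b"chapter two, revised"])

# The unchanged block has the same reference in both versions.
shared = set(v1_store) & set(v2_store)
```

Since a reference changes whenever the referenced contents change, each version gets a distinct top-level reference while unchanged blocks can be shared between versions.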

6 6/26 Underlying Technology Access Control Data Update Primary Replica Archival Storage Secondary Replica Data Read Data Location & Routing: Tapestry

7 7/26 Access Control Reader Restriction Encrypt All Data Distribute Encryption Key to Users with Read Permission Writer Restriction Access Control List (ACL) for an Object All Writes Must be Signed so that Well-behaved Servers and Clients Can Verify Them against the ACL

8 8/26 Underlying Technology Access Control Data Update Primary Replica Archival Storage Secondary Replica Data Read Data Location & Routing

9 9/26 Data Update (1/2) Update Adding a New Version to the Head of Version Stream Array of Potential Actions each Guarded by a Predicate –Predicate Examples Checking Latest Version_Num, Comparing a Region of Bytes to an Expected Value, etc. –Action Examples Replacing a Set of Bytes, Appending New Data, Truncating the Object, etc. Timestamp Client ID... Client Signature
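The predicate-guarded update format can be sketched as follows. The names `Version`, `apply_update`, `version_is`, `append`, and `truncate` are hypothetical, and the rule that the first clause with a true predicate wins is an assumption of this sketch; timestamps, client IDs, and signatures are left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Version:
    num: int
    data: bytes


Predicate = Callable[[Version], bool]
Action = Callable[[bytes], bytes]


def version_is(expected: int) -> Predicate:
    """Predicate: the latest version number equals the expected one."""
    return lambda head: head.num == expected


def append(extra: bytes) -> Action:
    """Action: append new data to the object."""
    return lambda data: data + extra


def truncate(length: int) -> Action:
    """Action: truncate the object to the given length."""
    return lambda data: data[:length]


def apply_update(head: Version, clauses: List[Tuple[Predicate, Action]]) -> Version:
    """Apply the action of the first clause whose predicate holds.

    A successful update adds a new version; if no predicate holds, the
    update has no effect and the current head is returned unchanged.
    """
    for predicate, action in clauses:
        if predicate(head):
            return Version(head.num + 1, action(head.data))
    return head


head = Version(3, b"hello")
committed = apply_update(head, [(version_is(3), append(b" world"))])
rejected = apply_update(head, [(version_is(2), truncate(1))])
```

Here `committed` is version 4 with the appended data, while `rejected` is the untouched head because its only predicate expected a stale version number.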

10 10/26 Data Update (2/2) Application Primary Replica (Inner Ring) Archival Storages Application Secondary Replica Secondary Replica

11 11/26 Primary Replica Inner Ring A Set of Servers that Implement an Object’s Primary Replica Applies Updates and Creates New Versions –Serialization –Access Control –Create Archival Fragments Update Agreement –Byzantine Agreement Protocol Distributed Decision Process in which All Non-faulty Participants Reach the Same Decision A Group of Size 3f+1 Tolerates no more than f Faulty Servers
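The 3f+1 bound can be worked through with a little arithmetic. The helper names `min_group_size` and `max_faulty` are made up for this sketch.

```python
def min_group_size(f: int) -> int:
    """Smallest inner ring that can tolerate f faulty servers."""
    return 3 * f + 1


def max_faulty(n: int) -> int:
    """Largest number of faulty servers a ring of n servers can tolerate."""
    return (n - 1) // 3


# Tolerating one faulty server needs four servers; growing the ring
# from four to six servers does not raise the tolerance, seven does.
sizes = {n: max_faulty(n) for n in (4, 5, 6, 7)}
```

The dictionary `sizes` makes the step behavior visible: tolerance only increases each time three more servers are added.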

12 12/26 Archival Storage Simple Replication Tolerance of One Failure for an Additional 100% Storage Cost Erasure Codes Efficient and Stable Storage for Archival Copies Storage Cost Increases by a Factor of N/M Original Block can be Reconstructed from Any M Fragments [Figure: a Block is Split into M Fragments and Encoded by an Erasure Code into N Fragments, M < N]
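To make the "any M of N fragments" property concrete, here is a toy erasure code with M = 2 and N = 3 built from XOR parity. This is a stand-in chosen for brevity, not the code used by the prototype, and the function names are hypothetical. The storage cost factor is N/M = 1.5, and any single fragment may be lost.

```python
from typing import Dict


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(block: bytes) -> Dict[int, bytes]:
    """Split the block into M = 2 halves and add one parity fragment (N = 3)."""
    half = (len(block) + 1) // 2
    first = block[:half]
    second = block[half:].ljust(half, b"\0")  # pad so both halves match in size
    return {0: first, 1: second, 2: xor(first, second)}


def decode(fragments: Dict[int, bytes], length: int) -> bytes:
    """Rebuild the original block from any two of the three fragments."""
    if len(fragments) < 2:
        raise ValueError("need at least M = 2 fragments")
    if 0 in fragments and 1 in fragments:
        first, second = fragments[0], fragments[1]
    elif 0 in fragments:
        first = fragments[0]
        second = xor(first, fragments[2])
    else:
        second = fragments[1]
        first = xor(second, fragments[2])
    return (first + second)[:length]


block = b"OceanStore"
fragments = encode(block)

# Drop each fragment in turn and reconstruct from the remaining two.
recovered = [
    decode({i: f for i, f in fragments.items() if i != lost}, len(block))
    for lost in (0, 1, 2)
]
```

Compared with simple replication, which pays a factor of 2 to survive one failure, this toy code survives one failure at a factor of 1.5; codes with larger M and N improve the trade-off further.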

13 13/26 Secondary Replica Whole-block Caching to Avoid Erasure Codes on Frequently-read Objects Push-based Update Every Time the Primary Replica Applies an Update Dissemination Tree Application-level Multicast Tree Rooted at Primary Replica Parent Nodes are Pre-existing Replicas to Serve Objects

14 14/26 Underlying Technology Access Control Data Update Primary Replica Archival Storage Secondary Replica Data Read Data Location & Routing

15 15/26 Data Read Application Primary Replica (Inner Ring) Archival Storages Secondary Replica 1. AGUID 2. Latest VGUID 3. Search Blocks from Secondary Replicas 4. Search enough Fragments from Archival Storages
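The four read steps can be sketched as a lookup with a fallback. Everything here is a simplification under stated assumptions: a version is treated as a single block, `latest_version` stands in for asking the primary replica, `secondary_cache` for the secondary replicas, and `reconstruct_from_archive` for fetching and decoding enough fragments. All names and sample identifiers are hypothetical.

```python
from typing import Callable, Dict


def read_object(
    aguid: str,
    latest_version: Dict[str, str],
    secondary_cache: Dict[str, bytes],
    reconstruct_from_archive: Callable[[str], bytes],
) -> bytes:
    # Steps 1-2: resolve the AGUID to the latest VGUID.
    vguid = latest_version[aguid]
    # Step 3: look for a cached whole block on a secondary replica.
    cached = secondary_cache.get(vguid)
    if cached is not None:
        return cached
    # Step 4: fall back to rebuilding the block from archival fragments,
    # then cache the whole block so later reads avoid the decoding work.
    block = reconstruct_from_archive(vguid)
    secondary_cache[vguid] = block
    return block


archive_calls = []


def fake_archive(vguid: str) -> bytes:
    """Stand-in for archival storage that records each reconstruction."""
    archive_calls.append(vguid)
    return b"contents of " + vguid.encode("ascii")


versions = {"aguid-1": "vguid-7"}
cache: Dict[str, bytes] = {}
first_read = read_object("aguid-1", versions, cache, fake_archive)
second_read = read_object("aguid-1", versions, cache, fake_archive)
```

The second read is served from the cache, so the archive is consulted only once.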

16 16/26 Underlying Technology Access Control Data Update Primary Replica Archival Storage Secondary Replica Data Read Data Location & Routing

17 17/26 Data Location & Routing (1/4) Tapestry Decentralized Object Location and Routing System Using Globally Unique Identifier (GUID) to Hosts and Resources Location Independent Locality Aware

18 18/26 Data Location & Routing (2/4) Routing Example Messages are Routed to the Destination ID Digit by Digit ***8 => **98 => *598 => 4598 [Figure: routing mesh of nodes B4F8, 9098, 0325, 2BB8, 7598, 4598, 87CA, 0098, 3E98, 1598, D598, 2118, with links labeled by routing level L1 to L4]
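The digit-by-digit routing toward 4598 can be sketched using the node IDs from the slide. This sketch assumes every hop can see the full node list and breaks ties by picking the smallest ID, whereas a real deployment uses per-node routing tables and prefers nearby nodes; the choice of 0325 as the source and the function names are also assumptions.

```python
from typing import List


def shared_suffix_len(a: str, b: str) -> int:
    """Number of trailing digits that two IDs have in common."""
    n = 0
    while n < len(a) and n < len(b) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def route(source: str, dest: str, nodes: List[str]) -> List[str]:
    """Forward hop by hop, matching at least one more suffix digit each time."""
    path = [source]
    current = source
    while current != dest:
        need = shared_suffix_len(current, dest) + 1
        candidates = [n for n in nodes if shared_suffix_len(n, dest) >= need]
        if not candidates:
            raise ValueError("no node matches the next digit of " + dest)
        # Take the smallest improvement, so one more digit is resolved per hop.
        current = min(candidates, key=lambda n: (shared_suffix_len(n, dest), n))
        path.append(current)
    return path


nodes = ["B4F8", "9098", "0325", "2BB8", "7598", "4598",
         "87CA", "0098", "3E98", "1598", "D598", "2118"]
path = route("0325", "4598", nodes)
matched = [shared_suffix_len(hop, "4598") for hop in path]
```

Each hop after the source matches the patterns ***8, **98, *598, and finally 4598, so `matched` grows by one digit per hop.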

19 19/26 Data Location & Routing (3/4) Location Independent & Locality Aware [Figure: routing mesh with links labeled L1 to L4 and Replica Location Pointers]

20 20/26 Data Location & Routing (4/4) Routing Table

21 21/26 Prototype Prototype Software Architecture

22 22/26 Experimental Results (1/2) Update Performance

23 23/26 Experimental Results (2/2) Comparison with NFS Write Read Read/Write

24 24/26 Related Work Other Peer-to-peer File Systems PAST[Rows01] and CFS[Dabe01] –No Write Sharing IVY[Muth02], Pangaea[Sait02] –Provide Both Read and Write Sharing but, –No Single Point of Consistency

25 25/26 Conclusion Operational OceanStore Prototype Universal Accessibility, Fault Tolerance, Security and Information Sharing Future Research Improving Performance –Efficient Threshold Schemes and Archival Data Generation Self-Maintenance Stability and Fault Tolerance Supporting More Applications

26 26/26 Discussion System Design Choice Security vs. Fast Response Simple vs. Complicated Design Storage Service Provider (SSP) Independent SSP vs. Confederation of Companies such as IBM, AT&T Efficient Storage Usage

27 27/26 Primary Replica (Ext.) Modification of Byzantine Agreement Protocol Public Key Cryptography –Symmetric-key Message Authentication Codes (MACs) for Inner Ring –Public-key Cryptography for All Other Machines Proactive Threshold Signatures –Flexibility in Choosing the Membership of Inner Ring –Single Public Key with l Private Key Shares –Any k Correctly Generated Signature Shares among l can be Combined into a Valid Signature –Independent Sets of Key Shares can be Used to Control Membership Responsible Party –To Choose the Hosts that Make Up Inner Rings

