Give Your Data the Edge A Scalable Data Delivery Platform

1 Give Your Data the Edge A Scalable Data Delivery Platform
Larry Peterson
In collaboration with Arizona, Akamai, Internet2, NSF, North Carolina, Open Networking Lab, Princeton (and several pilot sites)

2 Data Management Challenge
Diagram labels: Distributed Set of Collaborators; Data Management Experts; Share, Pre-Stage, Write-Back; Private Data Repositories; Commodity Cloud Storage; S3, iPlant, DropBox, GenBank
Speaker notes: Emphasize the value, and the limitations, of existing resources: (1) R/W performance of local disk, (2) scalable read bandwidth, (3) persistent storage, (4) popular data sets. Then introduce UG, RG, and AG, all tied together by (1) an HTTP data plane and (2) the Metadata Service (MS). The result is a shared, global volume.

3 Our Goal
To enable a scalable number of collaborators (and their applications) to share access to data independent of where it is stored, using a storage platform that:
- Minimizes the operational burden imposed on users
- Maximizes the use of commodity infrastructure
- Maximizes aggregate I/O performance

4 Syndicate Solution
Diagram labels: Shared Volume; SG (repeated); Metadata Service; CDN; iPlant, DropBox, GenBank

5 Syndicate Solution
Callouts:
- Manages data consistency and key distribution
- Bridges application workflow and HTTP transport; e.g., iRODS, Hadoop
- Acquires data from existing repositories; e.g., iPlant (iRODS), GenBank
- Treats cloud storage as a block device
Diagram labels: Shared Volume; SG (repeated); Metadata Service; CDN; S3, iPlant, DropBox, GenBank
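
One way to read "treats cloud storage as a block device" is that a file is split into fixed-size blocks and each block is stored as its own object. The sketch below is illustrative only: the block size, the key layout, and the in-memory `ObjectStore` stand-in are assumptions made for this example, not Syndicate's actual format or API.

```python
from typing import Dict

BLOCK_SIZE = 4  # hypothetical; kept tiny so the example is easy to follow


class ObjectStore:
    """In-memory stand-in for a cloud object store such as S3 (assumed interface)."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


def block_key(path: str, version: int, index: int) -> str:
    # Hypothetical key layout: one object per (file, version, block index).
    return f"{path}/v{version}/block-{index}"


def write_file(store: ObjectStore, path: str, version: int, data: bytes) -> int:
    """Split data into fixed-size blocks, store each as a separate object,
    and return the number of blocks written."""
    count = 0
    for offset in range(0, len(data), BLOCK_SIZE):
        store.put(block_key(path, version, count), data[offset:offset + BLOCK_SIZE])
        count += 1
    return count


def read_file(store: ObjectStore, path: str, version: int, num_blocks: int) -> bytes:
    """Reassemble a file by fetching its blocks in order."""
    return b"".join(store.get(block_key(path, version, i)) for i in range(num_blocks))
```

Because each block is an independent object, a reader can fetch only the blocks it needs, and any key-value store with `put`/`get` semantics could sit behind the same interface.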

6 Syndicate Solution
Callouts:
- As easy as mounting Dropbox
- Auto-mount in Cloud VMs
Diagram labels: Shared Volume; SG (repeated); Metadata Service; CDN; S3, iPlant, DropBox, GenBank

7 Service Composition
Syndicate = CDN + Object Store + NoSQL DB
Value-Add Storage Service
- Scalable Read Bandwidth (Akamai HyperCache & RequestRouter)
- Data Durability (S3, Glacier, DropBox, Box, Swift)
- Data Consistency (Google App Engine)
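
The composition above can be sketched as a read path that consults a metadata service for the current version, then a cache, then durable storage. This is a minimal sketch under stated assumptions: the class names, the `path@version` key scheme, and the dictionaries standing in for the CDN, the object store, and the NoSQL database are all hypothetical and are not taken from the slides.

```python
from typing import Dict


class MetadataService:
    """Stand-in for the NoSQL-backed service that tracks each file's latest version."""

    def __init__(self) -> None:
        self._versions: Dict[str, int] = {}

    def bump(self, path: str) -> int:
        # Record a new version for the file and return it.
        self._versions[path] = self._versions.get(path, 0) + 1
        return self._versions[path]

    def current(self, path: str) -> int:
        return self._versions[path]


class Volume:
    """Toy shared volume composed of a metadata service, a cache, and an object store."""

    def __init__(self) -> None:
        self.ms = MetadataService()
        self.object_store: Dict[str, bytes] = {}  # durable copy (object-store stand-in)
        self.cdn_cache: Dict[str, bytes] = {}     # read cache (CDN stand-in)
        self.origin_fetches = 0                   # counts cache misses

    def write(self, path: str, data: bytes) -> None:
        # Each write creates a new version, so cached older versions are never overwritten.
        version = self.ms.bump(path)
        self.object_store[f"{path}@{version}"] = data

    def read(self, path: str) -> bytes:
        # Version-tagged keys mean a stale cache entry can never match the current key.
        key = f"{path}@{self.ms.current(path)}"
        if key not in self.cdn_cache:
            self.origin_fetches += 1
            self.cdn_cache[key] = self.object_store[key]
        return self.cdn_cache[key]
```

In this sketch, repeated reads of an unchanged file are served from the cache, while a write changes the key that readers look up, so the cache needs no explicit invalidation.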

8 Value-Add Storage Service
Diagram labels: Commodity Clouds; Private Clouds; Internet2 Backbone; Regional & Campus; End Users; HPC; Amazon AWS; RR; S3; Google Cloud Platform; MS
- Latency matters
- Shared state matters
- Sufficient resources matter

9 Value Proposition
- Cloud-Ready: Allows users to mount shared volumes into cloud-hosted virtual machines (VMs) with minimal operational overhead.
- Scalable Read Bandwidth: Provides scalable read bandwidth (i.e., supports a scalable number of users) with minimal operational overhead.
- Provider Independence: Allows users to take advantage of cost/performance tradeoffs among multiple storage providers (as well as spread risk across those providers) with minimal operational overhead.

10 Value Proposition
- Secure-by-Default: Allows users to securely share files across organizational boundaries, at scale, with minimal operational overhead.
- Adapt to Existing Workflows: Makes it easy to integrate existing user workflows, datasets, and toolkits, as well as extend and customize to meet specific community requirements (e.g., privacy).
- Sustainable Design: Provides a general-purpose storage platform that leverages commodity storage and network caches at every opportunity.

11 More Information
opencloud.us
Tomorrow, 10:50am: Advanced Networking/Joint Techs

12 Next Talk by John Hartman
Diagram labels: Shared Volume; SG (repeated); iRODS, Hadoop; Metadata Service; CDN; S3, iPlant, DropBox, GenBank

