1. 2 Corollary 3 System Overview Second Key Idea: Specialization Think GoogleFS.

1

2 Corollary

3 System Overview

4 Second Key Idea: Specialization. Think GoogleFS.

5 Third idea: Enable cross-layer optimizations (http://netsyslab.ece.ubc.ca). Layered architectures (TCP/IP, file system, API): high benefits, but layering limits information flow across layers.

6 Cross-Layer Optimizations. Examples: – IP – Storage systems – …. Applications → Storage System: – Performance – QoS requirements – Consistency requirements. Applications ← Storage System: – Provide storage-level information to applications. Data Intensive Schedulers: Notification about data movements. Data Intensive Applications: Co-usage of files. What’s missing? A vehicle to pass information across layers.

7 Traditional Use of Custom Metadata. [Diagram labels: Application Layer; File System Layer; Storage System Layer; Metadata Manager; File Organization Module; Basic File System; File Browser; POSIX API. Example tag: Author=Smith on input.dat]

8 Cross-Layer Communication (HPDC'08). [Diagram labels: Application Layer; File System Layer; Storage System Layer; Metadata Manager; File Organization Module; Basic File System; POSIX API. Example messages: "Replicate input.dat 3x"; "input.dat moved from node1 to node3"; "OK. Schedule Task on node3"]
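
The exchange on this slide can be illustrated with a toy model. The sketch below is only an assumption about how such a channel might look: `ToyStorage`, `ToyScheduler`, the `replication` tag name, and the callback mechanism are all hypothetical names invented for illustration, not the system's actual interface. It shows a hint travelling down as custom metadata and a data-movement notification travelling back up.

```python
from collections import defaultdict


class ToyStorage:
    """Hypothetical in-memory stand-in for the storage system layer."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.placement = {}            # file name -> nodes holding a copy
        self.tags = defaultdict(dict)  # file name -> custom metadata
        self.listeners = []            # callbacks for upward notifications

    def subscribe(self, callback):
        self.listeners.append(callback)

    def write(self, name, node):
        self.placement[name] = [node]

    def set_tag(self, name, key, value):
        """Downward path: the application passes a hint as custom metadata."""
        self.tags[name][key] = value
        if key == "replication":  # assumed tag name, not from the slides
            self._replicate(name, int(value))

    def _replicate(self, name, copies):
        holders = self.placement[name]
        for node in self.nodes:
            if len(holders) >= copies:
                break
            if node not in holders:
                holders.append(node)

    def move(self, name, src, dst):
        """Upward path: the storage system reports a data movement."""
        holders = self.placement[name]
        holders[holders.index(src)] = dst
        for notify in self.listeners:
            notify(name, src, dst)


class ToyScheduler:
    """Hypothetical scheduler that prefers the node holding the data."""

    def __init__(self):
        self.preferred = {}

    def on_move(self, name, src, dst):
        self.preferred[name] = dst


storage = ToyStorage(["node1", "node2", "node3", "node4"])
scheduler = ToyScheduler()
storage.subscribe(scheduler.on_move)

storage.write("input.dat", "node1")
storage.move("input.dat", "node1", "node3")    # "input.dat moved from node1 to node3"
storage.set_tag("input.dat", "replication", 3)  # "Replicate input.dat 3x"
```

After the move, the toy scheduler would place the task on `node3`; after the tag is set, `input.dat` has three holders. A real system would carry the tags through the file system interface rather than a direct method call.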

9 Recap: Object-based storage. Enable specialization --> performance. Enable cross-layer optimization --> generality.

10 One intended use: A Workflow-Aware Storage System

11 Workflow Example - ModFTDock. Protein docking application: simulates the creation of a complex protein from two known proteins. Applications: – Drug design – Protein interaction prediction

12 Platform Example – Argonne BlueGene/P. [Diagram labels: 160K cores; 10 Gb/s Switch Complex; GPFS, 24 I/O servers; 2.5K IO Nodes; Torus Network; 2.5 GBps per node; 3D Torus; 850 MBps per 64 nodes; Tree] IO rate: 8 GBps = 51 KBps / core!! The central storage is a potential bottleneck. Underused resources.
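
The per-core figure on this slide can be checked with a few lines of arithmetic. The sketch assumes binary units (GiB, KiB) and 4 cores per compute node; the latter is inferred from the "40960 Compute nodes" and "160K cores" figures in the deck rather than stated on the slide.

```python
GIB = 2**30
KIB = 2**10

compute_nodes = 40960
cores_per_node = 4          # assumption: 40960 * 4 = 163840, i.e. the "160K cores"
aggregate_rate = 8 * GIB    # "8GBps" read as 8 GiB/s (assumption)

cores = compute_nodes * cores_per_node
per_core_kib = aggregate_rate / cores / KIB

print(f"{cores} cores, {per_core_kib:.1f} KiB/s per core")
```

Under these assumptions each core gets roughly 51 KiB/s of central-storage bandwidth, which matches the slide and shows why the central storage is a potential bottleneck.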

13 Background – ModFTDock in Argonne BG/P. [Diagram: Workflow Runtime Engine; App. tasks, each with Local storage; Backend file system (e.g., GPFS, NFS)] Scale: 40960 compute nodes. 1.2 M docking tasks. File-based communication. Large IO volume. IO rate: 8 GBps = 51 KBps / core.

14 Intermediate Storage Approach. [Diagram: Workflow Runtime Engine; App. tasks, each with Local storage; Intermediate Storage, accessed through the POSIX API; Backend file system (e.g., GPFS, NFS); Stage In; Stage Out] Scale: 40960 compute nodes.
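
The stage-in / compute / stage-out pattern on this slide can be sketched with plain directories. Everything named here is hypothetical: the two directories stand in for the backend file system and the intermediate storage, and `run_task` is a placeholder for an application task that only touches the intermediate storage through ordinary file calls.

```python
import shutil
import tempfile
from pathlib import Path


def stage_in(backend, intermediate, names):
    """Copy workflow inputs from the backend into intermediate storage."""
    for name in names:
        shutil.copy2(backend / name, intermediate / name)


def run_task(intermediate, in_name, out_name):
    """Placeholder task: reads and writes intermediate storage only."""
    data = (intermediate / in_name).read_text()
    (intermediate / out_name).write_text(data.upper())


def stage_out(intermediate, backend, names):
    """Copy only the final results back to the backend."""
    for name in names:
        shutil.copy2(intermediate / name, backend / name)


with tempfile.TemporaryDirectory() as root:
    backend = Path(root) / "backend"
    intermediate = Path(root) / "intermediate"
    backend.mkdir()
    intermediate.mkdir()
    (backend / "input.dat").write_text("protein a\n")

    stage_in(backend, intermediate, ["input.dat"])
    run_task(intermediate, "input.dat", "stage1.tmp")   # temporary file stays off the backend
    run_task(intermediate, "stage1.tmp", "result.dat")
    stage_out(intermediate, backend, ["result.dat"])

    backend_files = sorted(p.name for p in backend.iterdir())
    result_text = (backend / "result.dat").read_text()
```

In this sketch the temporary file `stage1.tmp` never reaches the backend, which is the point of the approach: file-based communication between tasks stays on the intermediate layer.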

15 Usage scenario II: Support for deduplication
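
The slide does not say which deduplication scheme is meant. As one possible illustration, the sketch below uses fixed-size chunks addressed by their SHA-256 digest; the `ChunkStore` class, its method names, and the tiny chunk size are assumptions made for the example.

```python
import hashlib


class ChunkStore:
    """Hypothetical content-addressed store with fixed-size chunks."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}  # digest -> chunk bytes, stored once
        self.files = {}   # file name -> ordered list of digests

    def put(self, name, data):
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # identical chunks are kept once
            recipe.append(digest)
        self.files[name] = recipe

    def get(self, name):
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore(chunk_size=4)
store.put("v1", b"AAAABBBBCCCC")
store.put("v2", b"AAAAXXXXCCCC")  # a version that differs in one chunk
```

The two versions total 24 logical bytes but share two of their three chunks, so the store holds only four distinct chunks. Fixed-size chunking is the simplest choice; it stops finding shared chunks after an insertion shifts the data, which is why content-defined chunking is often used instead.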

16 Stakeholders. The final clients: – Financing agencies ($): DoE, NSERC – Science teams. Development team: – Graduate students (6+) – Undergraduate students, visitors (10+). Me.

17 Stakeholders – and their goals. The final clients: – Financing agencies ($): DoE, NSERC – Science teams. Development team: – Graduate students (6+) – Undergraduate students, visitors (10+). Me.

18 Requirements: 1. Easy to deploy 2. Easy to integrate with applications 3. Versatility and ability to configure 4. Efficiency / high performance / scalability 5. Ability to support versioning and partially similar data. All have big architectural implications.

19 Early architectural decisions: 1.) Object-based storage - system structure 2.) Network/protocol stack: uniform - Stateless to the degree possible

20 Early architectural decisions: 3.) FUSE-based implementation - Impact: structure, deployability 4.) Policy to manage tension between code maturity and need to experiment

21 Mid-way architectural decisions: 5.) GeneralIO hack 6.) Test-driven design - integrate 3-month projects

22 Implicit architectural policies: 7.) Personnel management: - prioritize ‘fun’ - Flat team structure - Bottom-up decision making / prioritization: ‘campaigns’ 8.) Align ‘values’

23 Key architectural decisions: 1.) Object-based storage 2.) Uniform protocol stack 3.) POSIX, FUSE-based implementation 4.) Policy to manage tension between code maturity and need to experiment 5.) GeneralIO hack 6.) Test-driven design 7.) Personnel management: prioritize ‘fun’ 8.) Align values

