The Global Storage Grid, or Managing Data for "Science 2.0"
Ian Foster, Computation Institute, Argonne National Lab & University of Chicago

Slide 1: The Global Storage Grid, or Managing Data for "Science 2.0"
Ian Foster, Computation Institute, Argonne National Lab & University of Chicago

Slide 2: "Web 2.0"
- Software as services
  - Data- & computation-rich network services
- Services as platforms
  - Easy composition of services to create new capabilities ("mashups") that may themselves be made accessible as new services
- Enabled by massive infrastructure buildout
  - Google projected to spend $1.5B on computers, networks, and real estate in 2006
  - Dozens of others are spending substantially
- Paid for by advertising
(Declan Butler, Nature)

Slide 3: Science 2.0, e.g., Cancer Bioinformatics Grid (caBIG)
[Figure: a BPEL engine takes a BPEL workflow document and workflow inputs, invokes a data service at uchicago.edu and analytic services at osu.edu and duke.edu, and returns the workflow results.]
caBIG: https://cabig.nci.nih.gov/; BPEL work: Ravi Madduri et al.
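
To make the composition idea concrete, here is a minimal client-side sketch of the same pattern: fetch data from one service, then pass it through two analytic services in sequence, much as the BPEL engine orchestrates. The endpoints, payload shapes, and the use of plain HTTP/JSON are illustrative assumptions, not the caBIG or BPEL interfaces.

```python
# Hypothetical composition of three services into a simple sequential workflow.
# Endpoints and payload formats are invented for illustration.
import requests

def run_workflow(workflow_inputs: dict) -> dict:
    data = requests.post("https://data.example.edu/query", json=workflow_inputs).json()
    step1 = requests.post("https://analytic-a.example.edu/run", json=data).json()
    step2 = requests.post("https://analytic-b.example.edu/run", json=step1).json()
    return step2  # the "workflow results"

if __name__ == "__main__":
    print(run_workflow({"dataset": "example-genes", "analysis": "cluster"}))
```

A real deployment would express this chain declaratively in a BPEL document and hand it to the engine rather than hard-coding it in a client.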

Slide 4: Science 2.0, e.g., Virtual Observatories
[Figure (S. G. Djorgovski): users reach data archives through a gateway, using analysis tools and discovery tools.]

Slide 5: Science 1.0 → Science 2.0: For Example, Digital Astronomy
- "Tell me about this star" → "Tell me about these 20K stars"
- Support 1000s of users
- E.g., Sloan Digital Sky Survey, ~40 TB; others much bigger soon

Slide 6: Global Data Requirements
- Service consumer
  - Discover data
  - Specify compute-intensive analyses
  - Compose multiple analyses
  - Publish results as a service
- Service provider
  - Host services enabling data access/analysis
  - Support remote requests to services
  - Control who can make requests
  - Support time-varying, data- and compute-intensive workloads

Slide 7: Analyzing Large Data: "Move Computation to the Data"
- But:
  - Amount of computation can be enormous
  - Load can vary tremendously
  - Users want to compose distributed services, so data must sometimes be moved anyway
- Fortunately:
  - Networks are getting much faster (in parts)
  - Workloads can have significant locality of reference

Slide 8: "Move Computation to the Core"
[Figure: a poorly connected "periphery" surrounding a highly connected "core".]

Slide 9: Highly Connected "Core": For Example, TeraGrid
[Map: TeraGrid sites and network hubs, including Starlight, LA, Atlanta, SDSC, TACC, PU, IU, ORNL, NCSA, ANL, PSC.]
- 16 supercomputers: 9 different types, multiple sizes
- World's fastest network: 30 Gigabits/s to large sites = 20-30 times major university links = 30,000 times my home broadband = 1 full-length feature film per second
- 75 Teraflops (trillion calculations/s) = 12,500 times faster than all 6 billion humans on Earth each doing one calculation per second
- Globus Toolkit and other middleware providing single login, application management, data movement, Web services
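
The comparisons on this slide are simple ratios; a quick back-of-the-envelope check using the slide's round numbers (the feature-film size is our assumption):

```python
# Back-of-the-envelope check of the slide's comparisons, using its round numbers.
teraflops = 75e12          # aggregate compute, operations per second
population = 6e9           # humans, each doing one calculation per second
print(teraflops / population)          # 12500.0 -> "12,500 times faster"

link_gbps = 30             # gigabits per second to large sites
film_gigabytes = 3.75      # assumed size of one full-length feature film
print(link_gbps / 8 / film_gigabytes)  # 1.0 -> about one film per second
```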

Slide 10: Open Science Grid
[Chart: ~4,000 jobs, May 7-14, 2006.]

Slide 11: The Two Dimensions of Science 2.0
- Decompose across the network: clients integrate dynamically
  - Select & compose services
  - Select "best of breed" providers
  - Publish result as new services
- Decouple resource & service providers (separating function from resource)
[Figure (S. G. Djorgovski): users, data archives, analysis tools, discovery tools.]

Slide 12: Technology Requirements: Integration & Decomposition
- Service-oriented applications
  - Wrap applications & data as services
  - Compose services into workflows
- Service-oriented Grid infrastructure
  - Provision physical resources to support application workloads
[Figure: users compose and invoke application services via workflows; provisioning supplies the underlying resources.]
"The Many Faces of IT as Service", ACM Queue, Foster & Tuecke, 2005

13 13 Technology Requirements: Within the Core … l Provide “service hosting services” that allow consumers to negotiate the hosting of arbitrary data analysis services l Dynamically manage resources (compute, storage, network) to meet diverse computational demands l Provide strong internal security, bridging to diverse external sources of attributes

Slide 14: Globus Software Enables Grid Infrastructure
- Web service interfaces for behaviors relating to integration and decomposition
  - Primitives: resources, state, security
  - Services: execution, data movement, ...
- Open source software that implements those interfaces
  - In particular, the Globus Toolkit (GT4)
- All standard Web services
  - "Grid is a use case for Web services, focused on resource management"

Slide 15: Open Source Grid Software: Globus Toolkit v4 (www.globus.org)
- Data Mgmt: GridFTP, Reliable File Transfer, Replica Location, Data Replication, Data Access & Integration
- Security: Authentication & Authorization, Community Authorization, Delegation, Credential Mgmt
- Execution Mgmt: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol
- Info Services: Index, Trigger, WebMDS
- Common Runtime: Java Runtime, C Runtime, Python Runtime
"Globus Toolkit Version 4: Software for Service-Oriented Systems", LNCS 3779, 2-13, 2005

Slide 16: GT4 Data Services
- Data movement
  - GridFTP: secure, reliable, performant
  - Reliable File Transfer: managed transfers
  - Data Replication Service: managed replication
- Replica Location Service
  - Scales to 100s of millions of replicas
- Data Access & Integration services
  - Access to, and server-side processing of, structured data
[Chart: disk-to-disk GridFTP transfer rates on TeraGrid.]
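
As a rough illustration of what "data movement" looks like in practice, the sketch below shells out to the GT4 globus-url-copy client to run a third-party GridFTP transfer. The endpoint URLs are invented, and the flags shown (-vb for performance feedback, -p for parallel streams) should be checked against the installed client version.

```python
# Sketch: drive a third-party GridFTP transfer with the globus-url-copy client.
# Endpoints are fictitious; verify flag names against your GT4 installation.
import subprocess

src = "gsiftp://gridftp.site-a.example.org/data/run42/output.dat"
dst = "gsiftp://gridftp.site-b.example.org/replicas/run42/output.dat"

subprocess.run(
    ["globus-url-copy", "-vb", "-p", "4", src, dst],
    check=True,  # raise CalledProcessError if the transfer fails
)
```

Managed transfers (Reliable File Transfer) and managed replication (the Data Replication Service) layer queuing, retry, and catalog bookkeeping on top of this basic transfer step.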

Slide 17: Security Services
- Attribute Authority (ATA)
  - Issues signed attribute assertions (incl. identity, delegation & mapping)
- Authorization Authority (AZA)
  - Makes decisions based on assertions & policy
[Figure: two virtual organizations, VO A and VO B, each with its own ATA, AZA, services, and member/admin attributes; a mapping ATA translates VO-A attributes to VO-B attributes, and a delegation assertion states that User B can use Service A.]
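
The attribute/authorization split can be illustrated with a toy decision function: an authorization authority grants access if a trusted attribute authority has asserted the right attribute for the requesting user. Real GT4 deployments carry these assertions as signed SAML/X.509 documents; the plain Python objects below are a deliberate simplification.

```python
# Toy attribute-based authorization: policy maps a service to the
# (issuer, attribute) pairs that grant access; assertions come from VO
# attribute authorities. Signatures and SAML encoding are omitted.
from dataclasses import dataclass

@dataclass(frozen=True)
class Assertion:
    issuer: str     # which VO's attribute authority issued this
    subject: str    # the user the attribute is about
    attribute: str  # e.g. "member", "admin"

POLICY = {
    "ServiceA": {("VO-A", "member"), ("VO-B", "member")},
}

def authorize(service: str, subject: str, assertions: list) -> bool:
    allowed = POLICY.get(service, set())
    return any(a.subject == subject and (a.issuer, a.attribute) in allowed
               for a in assertions)

# User B, a member of VO B, asks to use Service A (as in the slide's figure).
print(authorize("ServiceA", "userB", [Assertion("VO-B", "userB", "member")]))  # True
```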

Slide 18: Service Hosting
[Figure: a client, subject to policy, uses an interface offered by a resource provider to allocate/provision an environment, configure it, and then initiate, monitor, and control an activity in it.]
- Candidate interfaces: WSRF (or WS-Transfer/WS-Management, etc.), Globus GRAM, Virtual Workspaces
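
One way to read the figure is as a lifecycle contract between client and resource provider. The abstract interface below is our own sketch of that contract; the method names are illustrative and do not correspond to WSRF or GRAM operations.

```python
# Sketch of the hosting lifecycle as an abstract interface.
from abc import ABC, abstractmethod

class HostingService(ABC):
    @abstractmethod
    def allocate(self, requirements: dict) -> str:
        """Provision an environment meeting the requirements; return its id."""

    @abstractmethod
    def configure(self, env_id: str, settings: dict) -> None:
        """Install and configure the activity's software in the environment."""

    @abstractmethod
    def initiate(self, env_id: str) -> str:
        """Start the activity; return an activity id."""

    @abstractmethod
    def monitor(self, activity_id: str) -> dict:
        """Report the activity's current status and resource usage."""

    @abstractmethod
    def control(self, activity_id: str, action: str) -> None:
        """Suspend, resume, or terminate the activity."""
```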

Slide 19: Virtual OSG Clusters
[Figure: an OSG cluster run as virtual machines on Xen hypervisors hosted by a TeraGrid cluster.]
"Virtual Clusters for Grid Communities," Zhang et al., CCGrid 2006

Slide 20: Managed Storage: GridFTP with NeST (Demoware)
[Figure: a GridFTP server with a NeST module in front of a NeST server and disk storage; file transfers arrive over GSI-FTP, while a custom application uses the chirp protocol for lot operations and file transfers.]
Bill Allcock, Nick LeRoy, Jeff Weber, et al.

Slide 21: Community Services Provider
1) Integrate services from other sources
  - Virtualize external services as VO services
2) Coordinate & compose
  - Create new services from existing ones
[Figure: the provider hosts science services built from content, services, and capacity supplied by capacity providers.]
"Service-Oriented Science", Science, 2005

Slide 22: Virtualizing Existing Services
- Establish a service agreement with the service
  - E.g., WS-Agreement
- Delegate use to community ("VO") users
[Figure: a VO admin (User A) establishes agreements with existing services and delegates their use to a VO user (User B).]

Slide 23: The Globus-Based LIGO Data Grid
- Replicating >1 Terabyte/day to 8 sites (including Birmingham, Cardiff, and AEI/Golm)
- >40 million replicas so far
- MTBF = 1 month
LIGO Gravitational Wave Observatory; www.globus.org/solutions

Slide 24: Data Replication Service
- Pulls "missing" files to a storage system
[Figure: given a list of required files, the Data Replication Service looks up replicas via the Replica Location Index and Local Replica Catalogs (data location), then uses the Reliable File Transfer Service and GridFTP (data movement) to copy the files.]
"Design and Implementation of a Data Replication Service Based on the Lightweight Data Replicator System," Chervenak et al., 2005
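
The pull pattern in the figure can be summarized in a few lines: determine which required files are missing locally, locate remote replicas via the index, transfer them, and register the new copies. The catalog and transfer objects below are placeholders, not the RLS/RFT APIs.

```python
# Sketch of the "pull missing files" pattern behind the Data Replication
# Service. local_catalog, replica_index, and transfer stand in for the
# Local Replica Catalog, Replica Location Index, and RFT/GridFTP.

def replicate(required_files, local_catalog, replica_index, transfer):
    for logical_name in required_files:
        if local_catalog.has(logical_name):           # already replicated here
            continue
        sources = replica_index.lookup(logical_name)  # remote physical copies
        if not sources:
            print(f"no replica found for {logical_name}")
            continue
        local_url = f"file:///storage/{logical_name}"
        transfer(sources[0], local_url)               # reliable file transfer
        local_catalog.register(logical_name, local_url)
```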

Slide 25: Data Replication Service: Dynamic Deployment
[Figure: a deployment stack: procure the physical machine; deploy the hypervisor/OS; deploy a virtual machine; deploy a JVM container; and finally deploy the DRS service alongside GridFTP and a Local Replica Catalog (LRC) as VO services.]
- State is exposed & accessed uniformly at all levels
- Provisioning, management, and monitoring at all levels

Slide 26: Decomposition Enables Separation of Concerns & Roles
- User to service provider: "Provide access to data D at S1, S2, S3 with performance P"
- Service provider to resource provider: "Provide storage with performance P1, network with P2, ..."
- Mechanisms: replica catalog, user-level multicast, ...
[Figure: data D replicated across sites S1, S2, S3 at each level of the decomposition.]

Slide 27: Example: Biology
- Public PUMA Knowledge Base: information about proteins analyzed against ~2 million gene sequences
- Back-office analysis on the Grid: millions of BLAST, BLOCKS, etc., runs on OSG and TeraGrid
Natalia Maltsev et al., http://compbio.mcs.anl.gov/puma2

Slide 28: Genome Analysis & Database Update (GADU) on OSG
[Chart: ~3,000 GADU jobs, April 24, 2006.]

Slide 29: Example: Earth System Grid
- Climate simulation data
  - Per-collection control
  - Different user classes
  - Server-side processing
- Implementation (GT)
  - Portal-based User Registration (PURSE)
  - PKI, SAML assertions
  - GridFTP, GRAM, SRM
- >2,000 users
- >100 TB downloaded
www.earthsystemgrid.org (DOE OASCR)

Slide 30: "Inside the Core" of ESG

Slide 31: Example: Astro Portal Stacking Service
- Purpose
  - On-demand "stacks" of random locations within a ~10 TB dataset
- Challenge
  - Rapid access to 10 to 10,000 "random" files
  - Time-varying load
- Solution
  - Dynamic acquisition of compute & storage
[Figure: image cutouts from Sloan data are summed into a single stacked image, requested via a web page or Web Service.]
Joint work with Ioan Raicu & Alex Szalay
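
For readers unfamiliar with "stacking": the core operation is just summing small cutouts of many images around given positions, so that a source too faint to see in any single image emerges from the noise. The sketch below works on in-memory arrays; a real service would read FITS files and convert sky to pixel coordinates.

```python
# Toy stacking: sum (2*half) x (2*half) cutouts centered on given pixel
# positions. File formats and coordinate handling are deliberately omitted.
import numpy as np

def stack(images, positions, half=10):
    acc = np.zeros((2 * half, 2 * half))
    for img, (r, c) in zip(images, positions):
        acc += img[r - half:r + half, c - half:c + half]
    return acc

# Synthetic demo: 50 noisy images, each hiding the same faint source.
rng = np.random.default_rng(0)
imgs = [rng.normal(size=(100, 100)) for _ in range(50)]
for img in imgs:
    img[50, 50] += 0.5                       # source well below the noise
result = stack(imgs, [(50, 50)] * 50)
print(result[10, 10])                        # around 25: the source stands out
```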

Slide 32: Preliminary Performance (TeraGrid, LAN GPFS)
Joint work with Ioan Raicu & Alex Szalay

Slide 33: Example: CyberShake
- Calculate hazard curves by generating synthetic seismograms from an estimated rupture forecast
[Figure: pipeline from rupture forecast and strain Green tensor, through synthetic seismograms and spectral acceleration, to hazard curves and a hazard map.]
Tom Jordan et al., Southern California Earthquake Center

Slide 34: Enlisting TeraGrid Resources
- 20 TB of data, 1.8 CPU-years of computation
[Figure: a workflow scheduler/engine and VO scheduler use VO service, provenance, and data catalogs to run computations on TeraGrid compute resources and to move data between SCEC storage and TeraGrid storage.]
Ewa Deelman, Carl Kesselman, et al., USC Information Sciences Institute

Slide 35: dev.globus: Community-Driven Improvement of Globus Software (NSF OCI)
- http://dev.globus.org
- Guidelines (Apache)
- Infrastructure (CVS, email, bugzilla, Wiki)
- Projects include ...

Slide 36: Summary
- "Science 2.0" (science as service, and service as platform) demands:
  - New infrastructure: service hosting
  - New technology: hosting & management
  - New policy: hierarchically controlled access
- Data & storage management cannot be separated from computation management
  - And increasingly become community roles
- A need for new technologies, skills, & roles
  - Creating, publishing, hosting, discovering, composing, archiving, explaining ... services

Slide 37: Science 1.0 → Science 2.0
- Gigabytes → Terabytes
- Tarballs → Services
- Journals → Wikis
- Individuals → Communities
- Community codes → Science gateways
- Supercomputer centers → TeraGrid, OSG, campus
- Makefile → Workflow
- Computational science → Science as computation
- Physical sciences → All sciences (& humanities)
- Computational scientists → All scientists
- NSF-funded → NSF-funded

Slide 38: Science 2.0 Challenges
- A need for new technologies, skills, & roles
  - Creating, publishing, hosting, discovering, composing, archiving, explaining ... services
- A need for substantial software development
  - "30-80% of modern astronomy projects is software" (S. G. Djorgovski)
- A need for more & different infrastructure
  - Computers & networks to host services
- Can we leverage commercial spending?
  - To some extent, but it is not straightforward

Slide 39: Acknowledgements
- Carl Kesselman, for many discussions
- Many colleagues, including those named on slides, for research collaborations and/or slides
- Colleagues involved in the TeraGrid, Open Science Grid, Earth System Grid, caBIG, and other Grid infrastructures
- Globus Alliance members, for Globus software R&D
- DOE OASCR, NSF OCI, & NIH, for support

Slide 40: For More Information
- Globus Alliance: www.globus.org
- dev.globus: dev.globus.org
- Open Science Grid: www.opensciencegrid.org
- TeraGrid: www.teragrid.org
- Background: www.mcs.anl.gov/~foster
- 2nd Edition: www.mkp.com/grid2
Thanks to DOE, NSF, and NIH for research support!

