The Grid Enabling Resource Sharing within Virtual Organizations Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer Science The University of Chicago Invited Talk, Research Library Group Annual Conference, Amsterdam, April 22, 2002
2 ARGONNE CHICAGO Overview l The technology landscape –Living in an exponential world l Grid concepts –Resource sharing in virtual organizations l Petascale Virtual Data Grids –Data, programs, computers, computations as community resources
3 Living in an Exponential World (1): Computing & Sensors l Moore's Law: transistor count doubles every 18 months [Image: magnetohydrodynamics simulation of star formation]
4 Living in an Exponential World (2): Storage l Storage density doubles every 12 months l Dramatic growth in online data (1 petabyte = 1,000 terabytes = 1,000,000 gigabytes) –2000: ~0.5 petabyte –2005: ~10 petabytes –2010: ~100 petabytes –2015: ~1,000 petabytes? l Transforming entire disciplines in the physical and, increasingly, biological sciences; humanities next?
5 Data Intensive Physical Sciences l High energy & nuclear physics –Including new experiments at CERN l Gravity wave searches –LIGO, GEO, VIRGO l Time-dependent 3-D systems (simulation, data) –Earth Observation, climate modeling –Geophysics, earthquake modeling –Fluids, aerodynamic design –Pollutant dispersal scenarios l Astronomy: Digital sky surveys
6 Ongoing Astronomical Mega-Surveys l Large number of new surveys –Multi-TB in size, 100M objects or larger –In databases –Individual archives planned and under way l Multi-wavelength view of the sky –More than 13 wavelengths covered within 5 years l Impressive early discoveries –Finding exotic objects by unusual colors >L, T dwarfs; high-redshift quasars –Finding objects by time variability >Gravitational micro-lensing l Surveys include MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...
7 Crab Nebula in 4 Spectral Regions [Images: X-ray, Optical, Infrared, Radio]
8 Coming Floods of Astronomy Data l The planned Large Synoptic Survey Telescope will produce over 10 petabytes per year by 2008! –All-sky survey every few days, so fine-grained time series will be available for the first time
9 Data Intensive Biology and Medicine l Medical data –X-Ray, mammography data, etc. (many petabytes) –Digitizing patient records (ditto) l X-ray crystallography l Molecular genomics and related disciplines –Human Genome, other genome databases –Proteomics (protein structure, activities, …) –Protein interactions, drug delivery l Virtual Population Laboratory (proposed) –Simulate likely spread of disease outbreaks l Brain scans (3-D, time dependent)
10 A Brain is a Lot of Data! (Mark Ellisman, UCSD) We need to get to one micron to know the location of every cell; we're just now starting to get to 10 microns. Grids will help get us there and further. And comparisons must be made among many brains.
11 An Exponential World (3): Networks (Or, Coefficients Matter …) l Network vs. computer performance –Computer speed doubles every 18 months –Network speed doubles every 9 months –Difference = order of magnitude per 5 years l 1986 to 2000 –Computers: x 500 –Networks: x 340,000 l 2001 to 2010 –Computers: x 60 –Networks: x 4,000 [Graph: Moore's Law vs. storage improvements vs. optical improvements; from Scientific American (Jan. 2001) by Cleo Vilett, source Vinod Khosla, Kleiner, Caufield and Perkins]
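The multipliers on this slide follow directly from the stated doubling periods; a small sketch of the arithmetic (the doubling periods are the slide's, the rounding is mine):

```python
# Growth arithmetic behind the slide: computers double every 18 months,
# networks every 9 months.

def growth(doubling_months: float, years: float) -> float:
    """Multiplicative improvement after `years` at the given doubling period."""
    return 2.0 ** (12.0 * years / doubling_months)

# Gap opened per 5 years: networks outpace computers by roughly 10x.
gap = growth(9, 5) / growth(18, 5)
print(f"network/computer gap over 5 years: {gap:.1f}x")  # prints 10.1x

# 1986-2000 is 14 years; 2001-2010 is 9 years.
print(f"computers 1986-2000: {growth(18, 14):.0f}x")  # 645 (slide rounds to ~500)
print(f"networks  1986-2000: {growth(9, 14):.0f}x")   # 416128 (slide: ~340,000)
print(f"computers 2001-2010: {growth(18, 9):.0f}x")   # 64 (slide: ~60)
print(f"networks  2001-2010: {growth(9, 9):.0f}x")    # 4096 (slide: ~4,000)
```

The exact multipliers differ a little from the slide's round numbers, but the headline claim checks out: a 9-month vs. 18-month doubling period opens almost exactly one order of magnitude every 5 years.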
12 Evolution of the Scientific Process l Pre-electronic –Theorize &/or experiment, alone or in small teams; publish paper l Post-electronic –Construct and mine very large databases of observational or simulation data –Develop computer simulations & analyses –Exchange information quasi-instantaneously within large, distributed, multidisciplinary teams
13 The Grid: Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
14 An Example Virtual Organization: CERN's Large Hadron Collider l 1800 physicists, 150 institutes, 32 countries l 100 PB of data by 2010; 50,000 CPUs?
15 Grid Communities & Applications: Data Grids for High Energy Physics [Tiered data grid diagram] l Tier 0: CERN Computer Centre; Online System feeds Offline Processor Farm (~20 TIPS) at ~100 MBytes/sec; physics data cache at ~PBytes/sec l Tier 1: regional centres (FermiLab ~4 TIPS; France, Italy, Germany), linked at ~622 Mbits/sec (or air freight, deprecated) l Tier 2: centres of ~1 TIPS each (e.g., Caltech), linked at ~622 Mbits/sec l Tier 4: institutes (~0.25 TIPS) and physicist workstations, at ~1 MBytes/sec l There is a bunch crossing every 25 nsec; there are 100 triggers per second; each triggered event is ~1 MByte in size l Physicists work on analysis channels; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server l 1 TIPS is approximately 25,000 SpecInt95 equivalents
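The detector and trigger numbers on this slide are mutually consistent, which a few lines of arithmetic make explicit (the ~1e7 s of live time per year in the last line is a common rule-of-thumb assumption, not stated on the slide):

```python
# Cross-checking the slide's LHC numbers: bunch crossing every 25 ns,
# 100 triggers per second, ~1 MByte per triggered event.

BUNCH_CROSSING_NS = 25
TRIGGER_RATE_HZ = 100
EVENT_SIZE_MB = 1.0

crossing_rate_hz = 1e9 / BUNCH_CROSSING_NS          # crossings per second
raw_to_stored = crossing_rate_hz / TRIGGER_RATE_HZ  # trigger rejection factor
data_rate_mb_s = TRIGGER_RATE_HZ * EVENT_SIZE_MB    # rate into offline storage

print(f"crossing rate: {crossing_rate_hz / 1e6:.0f} MHz")    # 40 MHz
print(f"trigger keeps 1 in {raw_to_stored:,.0f} crossings")  # 1 in 400,000
print(f"stored data rate: {data_rate_mb_s:.0f} MB/s")        # 100 MB/s, the slide's link rate
# Assuming ~1e7 s of live running per year:
print(f"per year: {data_rate_mb_s * 1e7 / 1e9:.0f} PB")      # ~1 PB/year of raw triggered data
```

So the ~100 MBytes/sec link out of the online system is exactly the trigger rate times the event size, and petabyte-scale yearly archives follow directly.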
16 The Grid Opportunity: eScience and eBusiness l Physicists worldwide pool resources for peta-op analyses of petabytes of data l Civil engineers collaborate to design, execute, & analyze shake table experiments l An insurance company mines data from partner hospitals for fraud detection l An application service provider offloads excess load to a compute cycle provider l An enterprise configures internal & external resources to support eBusiness workload
17 Grid Computing
18 The Grid: A Brief History l Early '90s –Gigabit testbeds, metacomputing l Mid-to-late '90s –Early experiments (e.g., I-WAY), academic software projects (e.g., Globus, Legion), application experiments l 2002 –Dozens of application communities & projects –Major infrastructure deployments –Significant technology base (esp. the Globus Toolkit™) –Growing industrial interest –Global Grid Forum: ~500 people, 20+ countries
19 Challenging Technical Requirements l Dynamic formation and management of virtual organizations l Online negotiation of access to services: who, what, why, when, how l Establishment of applications and systems able to deliver multiple qualities of service l Autonomic management of infrastructure elements → Open Grid Services Architecture
20 Data Intensive Science l Scientific discovery increasingly driven by IT –Computationally intensive analyses –Massive data collections –Data distributed across networks of varying capability –Geographically distributed collaboration l Dominant factor: data growth (1 petabyte = 1,000 TB) –2000: ~0.5 petabyte –2005: ~10 petabytes –2010: ~100 petabytes –2015: ~1,000 petabytes? l How to collect, manage, access, and interpret this quantity of data? Drives demand for Data Grids to handle the additional dimension of data access & movement
21 Data Grid Projects l Particle Physics Data Grid (US, DOE) –Data Grid applications for HENP experiments l GriPhyN (US, NSF) –Petascale Virtual-Data Grids l iVDGL (US, NSF) –Global Grid lab l TeraGrid (US, NSF) –Distributed supercomputing resources (13 TFlops) l European Data Grid (EU, EC) –Data Grid technologies, EU deployment l CrossGrid (EU, EC) –Data Grid technologies, EU emphasis l DataTAG (EU, EC) –Transatlantic network, Grid applications l Japanese Grid Projects (APGrid) (Japan) –Grid deployment throughout Japan (Common to all: collaborations of application scientists & computer scientists; infrastructure development & deployment; Globus based)
22 Biomedical Informatics Research Network (BIRN) l An evolving reference set of brains provides essential data for developing therapies for neurological disorders (multiple sclerosis, Alzheimer's, etc.) l Today –One lab, small patient base –4 TB collection l Tomorrow –10s of collaborating labs –Larger population sample –400 TB data collection: more brains, higher resolution –Multiple-scale data integration and analysis
23 GriPhyN = App. Science + CS + Grids l GriPhyN = Grid Physics Network –US-CMS: High Energy Physics –US-ATLAS: High Energy Physics –LIGO/LSC: Gravity wave research –SDSS: Sloan Digital Sky Survey –Strong partnership with computer scientists l Design and implement production-scale grids –Develop common infrastructure, tools, and services –Integration into the 4 experiments –Application to other sciences via the Virtual Data Toolkit l Multi-year project –R&D for grid architecture (funded at $11.9M + $1.6M) –Integrate Grid infrastructure into experiments through the VDT
24 GriPhyN Institutions –U Florida –U Chicago –Boston U –Caltech –U Wisconsin, Madison –USC/ISI –Harvard –Indiana –Johns Hopkins –Northwestern –Stanford –U Illinois at Chicago –U Penn –U Texas, Brownsville –U Wisconsin, Milwaukee –UC Berkeley –UC San Diego –San Diego Supercomputer Center –Lawrence Berkeley Lab –Argonne –Fermilab –Brookhaven
25 GriPhyN: PetaScale Virtual Data Grids [Architecture diagram] l Users: production teams, individual investigators, workgroups l Interactive User Tools l Virtual Data Tools; Request Planning & Scheduling Tools; Request Execution & Management Tools l Underpinned by Resource Management Services, Security and Policy Services, and Other Grid Services l Transforms and raw data sources feed distributed resources (code, storage, CPUs, networks) l Scale: ~1 Petaop/s, ~100 Petabytes
26 GriPhyN Research Agenda l Virtual Data technologies –Derived data, calculable via algorithm –Instantiated 0, 1, or many times (e.g., caches) –Fetch value vs. execute algorithm –Potentially complex (versions, cost calculation, etc.) l E.g., LIGO: get gravitational strain for 2 minutes around each of 200 gamma-ray bursts over the last year l For each requested data value, need to –Locate item: materializations, locations, and algorithm –Determine costs of fetching vs. calculating –Plan data movements & computations to obtain results –Execute the plan
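The fetch-vs.-calculate decision in the last bullet is, at heart, a cost comparison. A minimal sketch of the idea follows; the classes, cost model, and numbers are illustrative assumptions, not GriPhyN's actual planner:

```python
# Illustrative sketch (not the GriPhyN planner): choose between fetching an
# existing materialization of a virtual data item and re-executing its
# algorithm, based on estimated time. All names and numbers are assumptions.

from dataclasses import dataclass

@dataclass
class Materialization:
    site: str
    size_gb: float
    bandwidth_gb_s: float  # achievable transfer rate from this site

@dataclass
class Recipe:
    cpu_seconds: float     # work needed to regenerate the item
    available_cpus: int    # CPUs usable right now

def fetch_cost_s(m: Materialization) -> float:
    return m.size_gb / m.bandwidth_gb_s

def compute_cost_s(r: Recipe) -> float:
    return r.cpu_seconds / r.available_cpus

def plan(materializations: list, recipe: Recipe):
    """Pick the cheaper option: ('fetch', site) or ('compute', None)."""
    options = [(fetch_cost_s(m), "fetch", m.site) for m in materializations]
    options.append((compute_cost_s(recipe), "compute", None))
    _, action, site = min(options, key=lambda o: o[0])
    return action, site

# 10 GB cached at a remote site on a 10 MB/s link vs. 2 CPU-hours on 4 CPUs:
print(plan([Materialization("remote_cache", 10, 0.01)], Recipe(7200, 4)))
# -> ('fetch', 'remote_cache')  (1000 s to fetch vs. 1800 s to recompute)
```

A real planner would fold in the versioning, policy, and co-allocation issues the slide mentions; the point is only that "fetch value vs. execute algorithm" reduces to comparing estimated costs per request.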
27 Virtual Data in Action l A data request may –Compute locally –Compute remotely –Access local data –Access remote data l Scheduling based on –Local policies –Global policies –Cost [Diagram: fetching an item across major facilities & archives, regional facilities & caches, and local facilities & caches]
28 GriPhyN Research Agenda (cont.) l Execution management –Co-allocation (CPU, storage, network transfers) –Fault tolerance, error reporting –Interaction, feedback to planning l Performance analysis (with PPDG) –Instrumentation, measurement of all components –Understand and optimize grid performance l Virtual Data Toolkit (VDT) –VDT = virtual data services + virtual data tools –One of the primary deliverables of R&D effort –Technology transfer to other scientific domains
29 iVDGL: A Global Grid Laboratory l International Virtual-Data Grid Laboratory –A global Grid lab (US, EU, South America, Asia, …) –A place to conduct Data Grid tests at scale –A mechanism to create common Grid infrastructure –A production facility for LHC experiments –An experimental laboratory for other disciplines –A focus of outreach efforts to small institutions l Funded at $13.65M by NSF: "We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science." (From the NSF proposal, 2001)
30 Initial US-iVDGL Data Grid [Map: Tier1 (FNAL); proto-Tier2 and Tier3 university sites: UCSD, Florida, Wisconsin, Fermilab, BNL, Indiana, BU, Caltech; other sites to be added in 2002: SKC, Brownsville, Hampton, PSU, JHU]
31 iVDGL Map [Map: Tier0/1, Tier2, and Tier3 facilities; 10 Gbps, 2.5 Gbps, 622 Mbps, and other links; DataTAG, Surfnet; later additions: Brazil, Pakistan, Russia, China]
32 Programs as Community Resources: Data Derivation and Provenance l Most scientific data are not simple measurements; essentially all are: –Computationally corrected/reconstructed –And/or produced by numerical simulation l And thus, as data and computers become ever larger and more expensive: –Programs are significant community resources –So are the executions of those programs
33 [Diagram: Transformation, Derivation, and Data, linked by execution-of, created-by, and consumed-by/generated-by relations] "I've detected a calibration error in an instrument and want to know which derived data to recompute." "I've come across some interesting data, but I need to understand the nature of the corrections applied when it was constructed before I can trust it for my purposes." "I want to search an astronomical database for galaxies with certain characteristics. If a program that performs this analysis exists, I won't have to write one from scratch." "I want to apply an astronomical analysis program to millions of objects. If the results already exist, I'll save weeks of computation."
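The first scenario (a calibration error) is, in provenance terms, a downstream-reachability query over derivation records. A minimal sketch, using an invented in-memory graph with illustrative dataset names:

```python
# Minimal sketch: each derived dataset records the inputs it was generated
# from; finding what to recompute after a calibration error is then a
# downstream-reachability query. All dataset names are illustrative.

from collections import defaultdict

# dataset -> datasets it was derived from
derived_from = {
    "calibrated_image": ["raw_image", "calibration_table"],
    "object_catalog":   ["calibrated_image"],
    "color_selection":  ["object_catalog"],
    "light_curves":     ["raw_image_2"],
}

def downstream(bad: str) -> set:
    """All datasets that (transitively) consumed `bad` and must be recomputed."""
    consumers = defaultdict(set)  # invert the graph: input -> outputs
    for out, ins in derived_from.items():
        for i in ins:
            consumers[i].add(out)
    stale, frontier = set(), [bad]
    while frontier:
        for out in consumers[frontier.pop()]:
            if out not in stale:
                stale.add(out)
                frontier.append(out)
    return stale

print(sorted(downstream("calibration_table")))
# -> ['calibrated_image', 'color_selection', 'object_catalog']
```

The second scenario (inspecting the corrections applied to a dataset) is the same query run in the opposite direction, walking `derived_from` instead of the inverted graph.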
34 The Chimera Virtual Data System (GriPhyN Project) l Virtual data catalog –Transformations, derivations, data l Virtual data language –Data definition + query l Applications include browsers and data analysis applications
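The catalog idea can be sketched in a few lines: derivations record which transformation produces a data object from which inputs, and a request either returns a cached value or recursively executes the recorded derivations. This is an illustrative toy, not Chimera's actual schema or its virtual data language:

```python
# Toy virtual data catalog (illustrative, not Chimera's schema): derivations
# record how each object is produced; requesting an object materializes it
# on demand, caching results so each derivation runs at most once.

transformations = {
    "sum":    lambda xs: sum(xs),
    "double": lambda xs: 2 * xs[0],
}

# output object -> (transformation name, input object names)
derivations = {
    "b": ("double", ["a"]),
    "c": ("double", ["b"]),
    "d": ("sum",    ["b", "c"]),
}

cache = {"a": 3}  # raw data: materialized up front

def materialize(name: str):
    """Return the value of `name`, computing and caching it if needed."""
    if name not in cache:  # fetch-vs-execute: on a miss, execute the derivation
        transform, inputs = derivations[name]
        cache[name] = transformations[transform]([materialize(i) for i in inputs])
    return cache[name]

print(materialize("d"))  # -> 18  (b = 6, c = 12, d = 6 + 12)
```

Because intermediate results land in the cache, a later request for "c" is a lookup rather than a recomputation, which is exactly the "instantiated 0, 1, or many times" behavior described on the previous slide.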
35 Early GriPhyN Challenge Problem: CMS Data Reconstruction (Scott Koranda, Miron Livny, others) [Workflow diagram: master Condor job running at Caltech; secondary Condor job on the Wisconsin (WI) pool; NCSA Linux cluster; NCSA UniTree, a GridFTP-enabled FTP server; Caltech workstation] 2) Launch secondary job on WI pool; input files via Globus GASS 3) 100 Monte Carlo jobs on Wisconsin Condor pool 4) 100 data files transferred via GridFTP, ~1 GB each 5) Secondary reports complete to master 6) Master starts reconstruction jobs via Globus jobmanager on cluster 7) GridFTP fetches data from UniTree 8) Processed Objectivity database stored to UniTree 9) Reconstruction job reports complete to master
36 Trace of a Condor-G Physics Run [Chart: pre / simulation jobs / post phases on the UW Condor pool; ooHits and ooDigis at NCSA; a delay due to a script error]
37 Knowledge-Based Data Grid Roadmap (Reagan Moore, SDSC) [Diagram: layers of Data, Information (attributes, semantics), and Knowledge, each with ingest, management, and access services (model-based access, data handling system); technologies include MCAT/HDF, Grids, XML DTD, SDLIP, XTM DTD, and rules (KQL); queries range from attribute-based and feature-based to knowledge/topic-based query & browse; storage with replicas and persistent IDs, containers, folders; a knowledge repository holds rules and relationships between concepts]
38 New Programs l U.K. eScience program l EU 6th Framework l U.S. Committee on Cyberinfrastructure l Japanese Grid initiative
39 U.S. Cyberinfrastructure: Draft Recommendations l A new INITIATIVE to revolutionize science and engineering research at NSF and worldwide, capitalizing on new computing and communications opportunities –21st-century cyberinfrastructure includes supercomputing, but also massive storage, networking, software, collaboration, visualization, and human resources –Current centers (NCSA, SDSC, PSC) are a key resource for the INITIATIVE –Budget estimate: incremental $650M/year (continuing) l An INITIATIVE OFFICE with a highly placed, credible leader empowered to –Initiate competitive, discipline-driven, path-breaking cyberinfrastructure applications within NSF that contribute to the shared goals of the INITIATIVE –Coordinate policy and allocations across fields and projects, with participants across NSF directorates, Federal agencies, and international e-science –Develop high-quality middleware and other software that is essential and special to scientific research –Manage individual computational, storage, and networking resources at least 100x larger than individual projects or universities can provide
40 Summary l Technology exponentials are changing the shape of scientific investigation & knowledge –More computing, even more data, yet more networking l The Grid: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations l Petascale Virtual Data Grids represent the future of science infrastructure –Data, programs, computers, computations as community resources
41 For More Information l Grid Book –www.mkp.com/grids l The Globus Project –www.globus.org l Global Grid Forum –www.gridforum.org l TeraGrid –www.teragrid.org l EU DataGrid –www.eu-datagrid.org l GriPhyN –www.griphyn.org l iVDGL –www.ivdgl.org l Background papers –www.mcs.anl.gov/~foster