Grids for 21st Century Data Intensive Science
Paul Avery, University of Florida
University of Michigan, May 8, 2003

Slide 2: Grids and Science

Slide 3: The Grid Concept
- Grid: geographically distributed computing resources configured for coordinated use
- Fabric: physical resources & networks provide the raw capability
- Middleware: software ties it all together (tools, services, etc.)
- Goal: transparent resource sharing

Slide 4: Fundamental Idea: Resource Sharing
- Resources for complex problems are distributed
  - Advanced scientific instruments (accelerators, telescopes, ...)
  - Storage, computing, people, institutions
- Communities require access to common services
  - Research collaborations (physics, astronomy, engineering, ...)
  - Government agencies, health care organizations, corporations, ...
- "Virtual Organizations"
  - Create a "VO" from geographically separated components
  - Make all community resources available to any VO member
  - Leverage strengths at different institutions
- Grids require a foundation of strong networking
  - Communication tools, visualization
  - High-speed data transmission, instrument operation

Slide 5: Some (Realistic) Grid Examples
- High energy physics
  - 3,000 physicists worldwide pool petaflops of CPU resources to analyze petabytes of data
- Fusion power (ITER, etc.)
  - Physicists quickly generate 100 CPU-years of simulations of a new magnet configuration to compare with data
- Astronomy
  - An international team remotely operates a telescope in real time
- Climate modeling
  - Climate scientists visualize, annotate, & analyze terabytes of simulation data
- Biology
  - A biochemist exploits 10,000 computers to screen 100,000 compounds in an hour

Slide 6: Grids: Enhancing Research & Learning
- Fundamentally alters the conduct of scientific research
  - Central model: people and resources flow inward to labs
  - Distributed model: knowledge flows between distributed teams
- Strengthens universities
  - Couples universities to data intensive science
  - Couples universities to national & international labs
  - Brings front-line research and resources to students
  - Exploits intellectual resources of formerly isolated schools
  - Opens new opportunities for minority and women researchers
- Builds partnerships to drive advances in IT/science/engineering
  - "Application" sciences, computer science, physics, astronomy, biology, etc.
  - Universities, laboratories, scientists, students
  - Research community, IT industry

Slide 7: Grid Challenges
- Operate a fundamentally complex entity
  - Geographically distributed resources
  - Each resource under different administrative control
  - Many failure modes
- Manage workflow across the Grid
  - Balance policy vs. instantaneous capability to complete tasks
  - Balance effective resource use vs. fast turnaround for priority jobs
- Match resource usage to policy over the long term
  - Goal-oriented algorithms: steering requests according to metrics
- Maintain a global view of resources and system state
  - Coherent end-to-end system monitoring
  - Adaptive learning for execution optimization
- Build high-level services & an integrated user environment
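
The tension described above, long-term policy shares versus fast turnaround for priority jobs, can be illustrated with a small sketch. This is not part of the original slides and does not describe any particular Grid scheduler; the site fields, weights, and scoring rule are hypothetical.

```python
# Illustrative only: a toy policy-aware matchmaker, not any real Grid scheduler.
# Sites owed capacity under their policy share rank higher; queue depth is
# penalized more strongly for high-priority work to favor fast turnaround.
from dataclasses import dataclass

@dataclass
class Site:
    name: str
    policy_share: float   # fraction of usage this community is entitled to
    recent_usage: float   # fraction of usage actually consumed recently
    queued_jobs: int      # current queue depth (proxy for turnaround time)
    free_cpus: int

def rank_sites(sites, high_priority=False):
    """Return candidate sites ordered by a simple policy-vs-turnaround score."""
    def score(s):
        under_share = s.policy_share - s.recent_usage     # >0: site owes us capacity
        turnaround_penalty = s.queued_jobs / max(s.free_cpus, 1)
        weight = 2.0 if high_priority else 0.5            # priority jobs weight turnaround more
        return under_share - weight * turnaround_penalty
    return sorted((s for s in sites if s.free_cpus > 0), key=score, reverse=True)

if __name__ == "__main__":
    sites = [
        Site("tier2_a", policy_share=0.40, recent_usage=0.55, queued_jobs=20, free_cpus=10),
        Site("tier2_b", policy_share=0.35, recent_usage=0.20, queued_jobs=5,  free_cpus=40),
        Site("tier3_c", policy_share=0.25, recent_usage=0.10, queued_jobs=50, free_cpus=4),
    ]
    print([s.name for s in rank_sites(sites, high_priority=True)])
```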

Slide 8: Data Grids

Slide 9: Data Intensive Science
- Scientific discovery increasingly driven by data collection
  - Computationally intensive analyses
  - Massive data collections
  - Data distributed across networks of varying capability
  - Internationally distributed collaborations
- Dominant factor: data growth (1 petabyte = 1000 TB)
  - 2000: ~0.5 petabyte
  - 2005: ~10 petabytes
  - 2010: ~100 petabytes
  - 2015: ~1000 petabytes?
- How to collect, manage, access and interpret this quantity of data?
- Drives demand for "Data Grids" to handle the additional dimension of data access & movement

Slide 10: Data Intensive Physical Sciences
- High energy & nuclear physics
  - Including new experiments at CERN's Large Hadron Collider
- Astronomy
  - Digital sky surveys: SDSS, VISTA, other gigapixel arrays
  - VLBI arrays: multiple-Gbps data streams
  - "Virtual" observatories (multi-wavelength astronomy)
- Gravity wave searches
  - LIGO, GEO, VIRGO, TAMA
- Time-dependent 3-D systems (simulation & data)
  - Earth observation, climate modeling
  - Geophysics, earthquake modeling
  - Fluids, aerodynamic design
  - Dispersal of pollutants in the atmosphere

Slide 11: Data Intensive Biology and Medicine
- Medical data
  - X-ray, mammography data, etc. (many petabytes)
  - Radiation oncology (real-time display of 3-D images)
- X-ray crystallography
  - Bright X-ray sources, e.g. Argonne Advanced Photon Source
- Molecular genomics and related disciplines
  - Human Genome, other genome databases
  - Proteomics (protein structure, activities, ...)
  - Protein interactions, drug delivery
- Brain scans (1-10 μm, time dependent)

Slide 12: Driven by LHC Computing Challenges
- Physicists from 150 institutes in 32 countries
- Complexity: millions of individual detector channels
- Scale: PetaOps (CPU), petabytes (data)
- Distribution: global distribution of people & resources

Slide 13: CMS Experiment at LHC
- "Compact" Muon Solenoid at the LHC (CERN)
- (Figure: detector drawing with a Smithsonian standard man for scale)

Slide 14: LHC Data Rates: Detector to Storage
- Detector output: 40 MHz, ~1000 TB/sec
- Level 1 trigger (special hardware): 75 kHz, 75 GB/sec
- Level 2 trigger (commodity CPUs): 5 kHz, 5 GB/sec
- Level 3 trigger (commodity CPUs, physics filtering): 100 Hz, 100-1500 MB/sec of raw data to storage
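
The rates on this slide imply an event size of roughly 1 MB after the Level 1 trigger. An illustrative back-of-the-envelope check, using only the numbers shown above:

```python
# Rough arithmetic implied by the trigger rates on the slide (illustrative only).
raw_rate_hz = 40e6        # 40 MHz bunch-crossing rate
raw_bytes_s = 1000e12     # ~1000 TB/s off the detector
l1_rate_hz  = 75e3        # 75 kHz after Level 1
l1_bytes_s  = 75e9        # 75 GB/s after Level 1
l3_rate_hz  = 100         # 100 Hz written to storage

print(raw_bytes_s / raw_rate_hz)               # ~2.5e7 bytes: ~25 MB of raw data per crossing
event_bytes = l1_bytes_s / l1_rate_hz
print(event_bytes)                             # ~1e6 bytes: ~1 MB per accepted event
print(l3_rate_hz * event_bytes)                # ~1e8 bytes/s: ~100 MB/s to storage,
                                               # consistent with the 100-1500 MB/s range quoted
```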

University of Michigan (May 8, 2003)Paul Avery15 All charged tracks with pt > 2 GeV Reconstructed tracks with pt > 25 GeV (+30 minimum bias events) 10 9 events/sec, selectivity: 1 in LHC: Higgs Decay into 4 muons

Slide 16: CMS Experiment: Hierarchy of LHC Data Grid Resources
- (Tiered architecture diagram) Online system feeds Tier 0 (CERN computer center, >20 TIPS); data flows out to Tier 1 national centers (UK, Korea, Russia, USA, ...), Tier 2 regional centers, Tier 3 institute servers, and Tier 4 physics caches / PCs
- Link speeds range from hundreds of MBytes/s near the detector to 1-10 Gbps between tiers
- Aggregate capacity: Tier0 / (sum of Tier1) / (sum of Tier2) ~ 1:1:1
- Scale: ~10s of petabytes, growing to ~1000 petabytes in 5-7 years

Slide 17: Digital Astronomy
- Future dominated by detector improvements
  - Moore's law growth in CCDs; gigapixel arrays on the horizon
  - Growth in CPU/storage tracking data volumes
- (Chart: "glass" = total area of 3m+ telescopes in the world in m^2, vs. total number of CCD pixels in Mpixels)
- 25-year growth: 30x in glass, 3000x in pixels

Slide 18: The Age of Astronomical Mega-Surveys
- Next-generation mega-surveys will change astronomy
  - Large sky coverage
  - Sound statistical plans, uniform systematics
- The technology to store and access the data is here
  - Following Moore's law
- Integrating these archives for the whole community
  - Astronomical data mining will lead to stunning new discoveries
  - "Virtual Observatory" (next slides)

Slide 19: Virtual Observatories
- Source catalogs, image data
- Specialized data: spectroscopy, time series, polarization
- Information archives: derived & legacy data (NED, Simbad, ADS, etc.)
- Discovery tools: visualization, statistics
- Standards
- Multi-wavelength astronomy, multiple surveys

Slide 20: Virtual Observatory Data Challenge
- Digital representation of the sky
  - All-sky + deep fields
  - Integrated catalog and image databases
  - Spectra of selected samples
- Size of the archived data
  - 40,000 square degrees
  - Resolution: 50 trillion pixels
  - One band (2 bytes/pixel): 100 terabytes
  - Multi-wavelength: ... terabytes
  - Time dimension: many petabytes
- Large, globally distributed database engines
  - Multi-petabyte data size, distributed widely
  - Thousands of queries per day, GByte/s I/O speed per site
  - Data Grid computing infrastructure
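
The 100 terabyte figure for a single band follows directly from the pixel count on the slide. An illustrative check:

```python
# Single-band archive size implied by the slide's numbers (illustrative only).
pixels          = 50e12   # 50 trillion pixels covering 40,000 square degrees
bytes_per_pixel = 2       # one band at 2 bytes/pixel
band_bytes      = pixels * bytes_per_pixel
print(band_bytes / 1e12, "TB")   # 100.0 TB per band; several bands plus a time
                                 # dimension push the archive toward the petabyte scale
```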

Slide 21: Sloan Sky Survey Data Grid
- (Figure only)

Slide 22: International Grid/Networking Projects
- US, EU, E. Europe, Asia, S. America, ...

Slide 23: Global Context: Data Grid Projects
- U.S. projects
  - Particle Physics Data Grid (PPDG): DOE
  - GriPhyN: NSF
  - International Virtual Data Grid Laboratory (iVDGL): NSF
  - TeraGrid: NSF
  - DOE Science Grid: DOE
  - NSF Middleware Initiative (NMI): NSF
- EU, Asia major projects
  - European Data Grid (EU, EC)
  - LHC Computing Grid (LCG) (CERN)
  - EU national projects (UK, Italy, France, ...)
  - CrossGrid (EU, EC)
  - DataTAG (EU, EC)
  - Japanese project
  - Korea project

Slide 24: Particle Physics Data Grid
- Funded 2001: US$9.5M (DOE)
- Driven by HENP experiments: D0, BaBar, STAR, CMS, ATLAS

Slide 25: PPDG Goals
- Serve high energy & nuclear physics (HENP) experiments
  - Unique challenges, diverse test environments
- Develop advanced Grid technologies
  - Focus on end-to-end integration
  - Maintain practical orientation
  - Networks, instrumentation, monitoring
  - DB file/object replication, caching, catalogs, end-to-end movement
- Make tools general enough for a wide community
- Collaboration with GriPhyN, iVDGL, EDG, LCG
  - ESnet Certificate Authority work, security

Slide 26: GriPhyN and iVDGL
- Both funded through the NSF ITR program
  - GriPhyN: $11.9M (NSF) + $1.6M (matching), 2000-2005
  - iVDGL: $13.7M (NSF) + $2M (matching), 2001-2006
- Basic composition
  - GriPhyN: 12 funded universities, SDSC, 3 labs (~80 people)
  - iVDGL: 16 funded institutions, SDSC, 3 labs (~80 people)
  - Experiments: US-CMS, US-ATLAS, LIGO, SDSS/NVO
  - Large overlap of people, institutions, management
- Grid research vs. Grid deployment
  - GriPhyN: CS research, Virtual Data Toolkit (VDT) development
  - iVDGL: Grid laboratory deployment
  - 4 physics experiments provide frontier challenges
  - VDT in common

Slide 27: GriPhyN Computer Science Challenges
- Virtual data (more later)
  - Data + programs (content), plus the executions of those programs
  - Representation, discovery, & manipulation of workflows and associated data & programs
- Planning
  - Mapping workflows in an efficient, policy-aware manner onto distributed resources
- Execution
  - Executing workflows, including data movements, reliably and efficiently
- Performance
  - Monitoring system performance for scheduling & troubleshooting

Slide 28: Goal: "PetaScale" Virtual-Data Grids
- (Architecture diagram) Users (single researcher, workgroups, production team) work through interactive user tools
- Layered services: virtual data tools, request planning & scheduling tools, request execution & management tools
- Supported by resource management services, security and policy services, and other Grid services
- Underlying distributed resources: code, storage, CPUs, networks; transforms; raw data source
- Targets: PetaOps, petabytes, performance

Slide 29: GriPhyN/iVDGL Science Drivers
- US-CMS & US-ATLAS
  - HEP experiments at LHC/CERN
  - 100s of petabytes
- LIGO
  - Gravity wave experiment
  - 100s of terabytes
- Sloan Digital Sky Survey
  - Digital astronomy (1/4 of the sky)
  - 10s of terabytes
- Common demands: massive CPU; large, distributed datasets; large, distributed communities
- (Chart: data growth and community growth over time)

Slide 30: Virtual Data: Derivation and Provenance
- Most scientific data are not simple "measurements"
  - They are computationally corrected/reconstructed
  - They can be produced by numerical simulation
- Science & engineering projects are more CPU and data intensive
  - Programs are significant community resources (transformations)
  - So are the executions of those programs (derivations)
- Management of dataset transformations is important!
  - Derivation: instantiation of a potential data product
  - Provenance: exact history of any existing data product
- We already do this, but manually!
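
A minimal sketch of the bookkeeping this slide describes: programs registered as transformations, executions recorded as derivations, and provenance recovered by walking the recorded chain. This is illustrative Python, not the Chimera data model; all class and field names are hypothetical.

```python
# Illustrative only: toy records for transformations (programs), derivations
# (executions of programs), and the provenance chain of a data product.
from dataclasses import dataclass, field

@dataclass
class Transformation:          # a registered program, e.g. a calibration or reconstruction step
    name: str
    version: str

@dataclass
class Derivation:              # one recorded execution of a transformation
    transform: Transformation
    inputs: list               # names of consumed data products
    outputs: list              # names of generated data products
    parameters: dict = field(default_factory=dict)

def provenance(product, derivations):
    """Walk back through derivations to recover the exact history of a product."""
    for d in derivations:
        if product in d.outputs:
            history = [(d.transform.name, d.transform.version, d.parameters)]
            for parent in d.inputs:
                history = provenance(parent, derivations) + history
            return history
    return []                  # a raw input with no recorded derivation

def downstream(product, derivations):
    """Find derived products that must be recomputed if `product` changes."""
    affected = set()
    for d in derivations:
        if product in d.inputs:
            for out in d.outputs:
                affected.add(out)
                affected |= downstream(out, derivations)
    return affected
```

In this toy model, downstream() answers the first question quoted on the next slide (which derived products need recomputation after a calibration error), and provenance() answers the second (exactly which corrections were applied to a dataset).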

Slide 31: Virtual Data Motivations (1)
- (Diagram) A data product is the product of a derivation; a derivation is an execution of a transformation; data products are consumed by and generated by derivations
- "I've detected a muon calibration error and want to know which derived data products need to be recomputed."
- "I've found some interesting data, but I need to know exactly what corrections were applied before I can trust it."
- "I want to search a database for 3-muon SUSY events. If a program that does this analysis exists, I won't have to write one from scratch."
- "I want to apply a forward jet analysis to 100M events. If the results already exist, I'll save weeks of computation."

Slide 32: Virtual Data Motivations (2)
- Data track-ability and result audit-ability
  - Universally sought by scientific applications
- Facilitates tool and data sharing and collaboration
  - Data can be sent along with its recipe
- Repair and correction of data
  - Rebuild data products (cf. "make")
- Workflow management
  - Organizing, locating, specifying, and requesting data products
- Performance optimizations
  - Ability to re-create data rather than move it
- From manual and error-prone to automated and robust
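
The "re-create rather than move" point can be sketched as a make-like decision: materialize a requested product from its recipe when regenerating it locally is cheaper than transferring the stored copy. Illustrative only; the cost model, function name, and parameters are hypothetical.

```python
# Illustrative only: decide whether to transfer an existing data product or
# regenerate it from its recipe (cf. "make"), using a crude cost comparison.
def plan_access(size_gb, wan_gbps, cpu_hours_to_rebuild, free_cpu_slots):
    transfer_s = size_gb * 8 / wan_gbps                       # time to move the stored copy
    rebuild_s  = cpu_hours_to_rebuild * 3600 / max(free_cpu_slots, 1)
    return "rebuild locally" if rebuild_s < transfer_s else "transfer copy"

# Example: a 500 GB derived dataset over a 0.1 Gb/s share of the WAN,
# versus 50 CPU-hours of regeneration spread over 100 free slots.
print(plan_access(size_gb=500, wan_gbps=0.1, cpu_hours_to_rebuild=50, free_cpu_slots=100))
```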

Slide 33: "Chimera" Virtual Data System
- Virtual Data API
  - A Java class hierarchy to represent transformations & derivations
- Virtual Data Language
  - Textual form for people & illustrative examples
  - XML form for machine-to-machine interfaces
- Virtual Data Database
  - Makes the objects of a virtual data definition persistent
- Virtual Data Service (future)
  - Provides a service interface (e.g., OGSA) to persistent objects
- Version 1.0 available
  - To be put into VDT 1.1.7

Slide 34: Chimera Application: SDSS Analysis
- Chimera Virtual Data System + GriPhyN Virtual Data Toolkit + iVDGL Data Grid (many CPUs)
- (Figure: galaxy cluster data and cluster size distribution)

Slide 35: Virtual Data and LHC Computing
- US-CMS
  - Chimera prototype tested with CMS Monte Carlo production (~200K events)
  - Currently integrating Chimera into standard CMS production tools
  - Integrating virtual data into Grid-enabled analysis tools
- US-ATLAS
  - Integrating Chimera into ATLAS software
- HEPCAL document includes first virtual data use cases
  - Very basic cases; need elaboration
  - Discuss with LHC experiments: requirements, scope, technologies
- New $15M proposal to the NSF ITR program
  - "Dynamic Workspaces for Scientific Analysis Communities"
- Continued progress requires collaboration with CS groups
  - Distributed scheduling, workflow optimization, ...
  - Need collaboration with CS to develop robust tools

Slide 36: iVDGL Goals and Context
- International Virtual-Data Grid Laboratory
  - A global Grid laboratory (US, EU, E. Europe, Asia, S. America, ...)
  - A place to conduct Data Grid tests "at scale"
  - A mechanism to create common Grid infrastructure
  - A laboratory for other disciplines to perform Data Grid tests
  - A focus of outreach efforts to small institutions
- Context of iVDGL in the US-LHC computing program
  - Develop and operate proto-Tier2 centers
  - Learn how to do Grid operations (GOC)
- International participation
  - DataTAG
  - UK e-Science programme: supports 6 CS Fellows per year in the U.S.

Slide 37: US-iVDGL Sites (Spring 2003)
- (Map of Tier1/Tier2/Tier3 sites): UF, Wisconsin, BNL, Indiana, Boston U, SKC, Brownsville, Hampton, PSU, J. Hopkins, Caltech, FIU, FSU, Arlington, Michigan, LBL, Oklahoma, Argonne, Vanderbilt, UCSD/SDSC, NCSA, Fermilab
- Prospective partners: EU, CERN, Brazil, Australia, Korea, Japan

Slide 38: US-CMS Grid Testbed
- (Map only)

Slide 39: US-CMS Testbed Success Story
- Production run for Monte Carlo data production
  - Assigned 1.5 million events for "eGamma Bigjets"
  - ~500 sec per event on a 750 MHz processor; all production stages from simulation to ntuple
- 2 months of continuous running across 5 testbed sites
- Demonstrated at Supercomputing 2002
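
The numbers on this slide imply roughly 24 CPU-years of processing, i.e. a sustained pool of order 150 CPUs over the two-month run. An illustrative check:

```python
# Scale of the 2002 US-CMS testbed production run, from the slide's numbers (illustrative only).
events        = 1.5e6
sec_per_event = 500                      # on a 750 MHz processor
cpu_seconds   = events * sec_per_event   # 7.5e8 CPU-seconds
print(cpu_seconds / (3600 * 24 * 365))   # ~23.8 CPU-years of work
run_seconds = 2 * 30 * 24 * 3600         # ~2 months of continuous running
print(cpu_seconds / run_seconds)         # ~145 CPUs busy on average, spread over 5 sites
```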

Slide 40: Creation of WorldGrid
- Joint iVDGL/DataTAG/EDG effort
  - Resources from both sides (15 sites)
  - Monitoring tools (Ganglia, MDS, NetSaint, ...)
  - Visualization tools (Nagios, MapCenter, Ganglia)
- Applications: ScienceGrid
  - CMS: CMKIN, CMSIM
  - ATLAS: ATLSIM
- Submit jobs from the US or EU
  - Jobs can run on any cluster
- Demonstrated at IST2002 (Copenhagen)
- Demonstrated at SC2002 (Baltimore)

Slide 41: WorldGrid Sites
- (Map only)

Slide 42: Grid Coordination

Slide 43: U.S. Project Coordination: Trillium
- Trillium = GriPhyN + iVDGL + PPDG
  - Large overlap in leadership, people, experiments
  - Driven primarily by HENP, particularly the LHC experiments
- Benefits of coordination
  - Common software base + packaging: VDT + Pacman
  - Collaborative / joint projects: monitoring, demos, security, ...
  - Wide deployment of new technologies, e.g. virtual data
  - Stronger, broader outreach effort
- Forum for US Grid projects
  - Joint view, strategies, meetings and work
  - Unified entity to deal with EU & other Grid projects

Slide 44: International Grid Coordination
- Global Grid Forum (GGF)
  - International forum for general Grid efforts
  - Many working groups, standards definitions
- Close collaboration with EU DataGrid (EDG)
  - Many connections with EDG activities
- HICB: HEP Inter-Grid Coordination Board
  - Non-competitive forum: strategic issues, consensus
  - Cross-project policies, procedures and technology; joint projects
- HICB-JTB Joint Technical Board
  - Definition, oversight and tracking of joint projects
  - GLUE interoperability group
- Participation in LHC Computing Grid (LCG)
  - Software Computing Committee (SC2)
  - Project Execution Board (PEB)
  - Grid Deployment Board (GDB)

Slide 45: HEP and International Grid Projects
- HEP continues to be the strongest science driver
  - (In collaboration with computer scientists)
  - Many national and international initiatives
  - LHC is a particularly strong driving function
- US-HEP is committed to working with international partners
  - Many networking initiatives with EU colleagues
  - Collaboration on the LHC Grid Project
- Grid projects driving & linked to network developments
  - DataTAG, SCIC, US-CERN link, Internet2
- New partners being actively sought
  - Korea, Russia, China, Japan, Brazil, Romania, ...
  - Participation in US-CMS and US-ATLAS Grid testbeds
  - Link to WorldGrid, once some software is fixed

Slide 46: New Grid Efforts

Slide 47: CHEPREO: An Inter-Regional Center for High Energy Physics Research and Educational Outreach at Florida International University
- E/O center in the Miami area
- iVDGL Grid activities
- CMS research
- AMPATH network (S. America)
- International activities (Brazil, etc.)
- Status:
  - Proposal submitted Dec.
  - Presented to NSF review panel
  - Project Execution Plan submitted
  - Funding in June?

Slide 48: GECSR: A Global Grid-Enabled Collaboratory for Scientific Research
- $4M ITR proposal from
  - Caltech (HN: PI, JB: Co-PI)
  - Michigan (two Co-PIs)
  - Maryland (Co-PI)
- Plus senior personnel from Lawrence Berkeley Lab, Oklahoma, Fermilab, Arlington (U. Texas), Iowa, Florida State
- First Grid-enabled collaboratory
  - Tight integration between the science of collaboratories and a globally scalable work environment
  - Sophisticated collaborative tools (VRVS, VNC; next generation)
  - Agent-based monitoring & decision support system (MonALISA)
- Initial targets are the global HENP collaborations, but GECSR is expected to be widely applicable to other large-scale collaborative scientific endeavors
- "Giving scientists from all world regions the means to function as full partners in the process of search and discovery"

Slide 49: Large ITR Proposal ($15M): Dynamic Workspaces: Enabling Global Analysis Communities

Slide 50: UltraLight Proposal to NSF
- 10 Gb/s+ network
- Partners: Caltech, UF, FIU, UM, MIT, SLAC, FNAL, international partners, Cisco
- Applications: HEP, VLBI, radiation oncology, Grid projects

Slide 51: GLORIAD
- New 10 Gb/s network linking US-Russia-China
  - Plus a Grid component linking science projects
  - H. Newman, P. Avery participating
- Meeting at NSF April 14 with US-Russia-China reps.
  - HEP people (Hesheng, et al.)
  - Broad agreement that HEP can drive the Grid portion
  - More meetings planned

Slide 52: Summary
- Progress on many fronts in PPDG/GriPhyN/iVDGL
  - Packaging: Pacman + VDT
  - Testbeds (development and production)
  - Major demonstration projects
  - Productions based on Grid tools using iVDGL resources
- WorldGrid providing excellent experience
  - Excellent collaboration with EU partners
  - Building links to our Asian and other partners
  - Excellent opportunity to build lasting infrastructure
- Looking to collaborate with more international partners
  - Testbeds, monitoring, deploying VDT more widely
- New directions
  - Virtual data: a powerful paradigm for LHC computing
  - Emphasis on Grid-enabled analysis

Slide 53: Grid References
- Grid Book
- Globus
- Global Grid Forum
- PPDG
- GriPhyN
- iVDGL
- TeraGrid
- EU DataGrid