Ruth Pordes, Fermilab CD and a PPDG Coordinator: Some Aspects of The Particle Physics Data Grid Collaboratory Pilot (PPDG) and The Grid Physics Network (GriPhyN).



R. Pordes, LCCWS

PPDG
› 6 HENP experiments including 4 labs, 4 CS groups, >25 active participants
› ATLAS, BaBar, CMS, D0, JLAB, STAR
› DOE 3-year SciDAC proposal – expect to know about funding very soon.
› Funding balanced towards experiments, with CS funding for development and deployment.
› Joint CS and experiment integrated deliverables.

GriPhyN
› 17 universities, SDSC, 3 labs, >40 active participants; 4 physics/astrophysics experiments
› ATLAS, CMS, LIGO, SDSS
› 3 years of NSF ITR funding started Sept
› Funded primarily as an IT research project: 2/3 CS + 1/3 physics. CS funding focused on students and post-docs.
› Education and Outreach a focus

PPDG Person ERD
[Diagram: experiment data management efforts (ATLAS, BaBar, CDF, CMS, D0, Nuclear Physics) and their user communities linked to the middleware teams they draw on – the Globus team, the Condor team, the SRB team, and HENP GC]

PPDG Goals
› End-to-end integration and deployment in experiment production systems.
› After one experiment uses software, adapt/reuse/extend it for other experiments.
› Extend & interface existing distributed data access code bases with Grid middleware.
› Develop missing components and new technical solutions.
› Milestones for deliverables on the order of 1 year for each Project Activity.

PPDG End-to-End
[Diagram: layered architecture – experiment-specific applications (ATLAS, BaBar, CMS, D0, STAR, JLAB) sit above generic Grid services, which sit above experiment-specific fabric & hardware, with application-Grid and fabric-Grid interfaces between the layers]

PPDG to date:
› High-speed (100 MByte/sec) point-to-point file transfer
› Common storage management adapters – HRM to SRB, HPSS, SAM, Enstore
› File replication prototypes – BaBar, GDMP
› Input to Globus file replication management
› Collaborative work was very loosely coupled to the overall project activities

PPDG future:
› Project Activities from the proposal
› Understand how to relate these to CS teams who are also delivering to other funded responsibilities and clients.
› Understand how to accommodate these within experiments that have to "deliver to WBS".
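The common storage management adapters listed above (HRM fronting SRB, HPSS, SAM, Enstore) all reduce to one idea: a uniform staging interface over heterogeneous mass-storage systems. A minimal Python sketch of that pattern, assuming a simplified interface (the class and method names are illustrative, not the actual HRM IDL; the command strings stand in for real invocations of the storage clients):

```python
from abc import ABC, abstractmethod

class StorageAdapter(ABC):
    """Uniform front end over heterogeneous mass-storage systems."""

    @abstractmethod
    def stage_in(self, remote_path, local_path):
        """Return the command that copies a file out of storage to local disk."""

    @abstractmethod
    def stage_out(self, local_path, remote_path):
        """Return the command that archives a local file into storage."""

class EnstoreAdapter(StorageAdapter):
    # encp is Enstore's copy command; the argument forms here are simplified
    def stage_in(self, remote_path, local_path):
        return f"encp {remote_path} {local_path}"

    def stage_out(self, local_path, remote_path):
        return f"encp {local_path} {remote_path}"

class SRBAdapter(StorageAdapter):
    # Sget/Sput are SRB client utilities; the argument forms here are simplified
    def stage_in(self, remote_path, local_path):
        return f"Sget {remote_path} {local_path}"

    def stage_out(self, local_path, remote_path):
        return f"Sput {local_path} {remote_path}"

def replicate(src, dst, path, scratch):
    """Move one file between two storage systems via local scratch space."""
    return [src.stage_in(path, scratch), dst.stage_out(scratch, path)]
```

Because every backend presents the same two operations, a transfer between any pair of systems is the same two-step stage-in/stage-out sequence.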

PPDG CS Program & EU Data Grid WPs
› Job Description Language – WP1
› Scheduling and management of processing and data placement activities – WP1
› Monitoring and status reporting – WP3
› Storage resource management – WP5
› Reliable replica management services – WP2
› File transfer services – WP2
› Collect and document current experiment practices and potential generalizations – WP8
PPDG has no equivalent to date for WP4.

Common s/w we will use – in the best case
PPDG (and GriPhyN) include the leadership of existing Grid middleware:
 Condor
 Globus
 Storage Resource Broker
It is natural/required/expected that we will use software from these groups. Existing s/w includes:
 Condor and Condor-G
 ClassAds as a component
 GridFTP
 Globus Replica Catalog
 Globus Replication Management
 LBL IDL definitions of HRM/DRM
 SRB as a system?
Will stimulate new services and extensions in core Grid software components.
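Several of the components above (ClassAds, Condor matchmaking) rest on one mechanism: jobs and machines each advertise an attribute list plus a requirements expression, and a matchmaker pairs a job with a machine only when each side's requirements accept the other's ad. A toy Python illustration of that symmetric match (the real ClassAd language is far richer; all attribute names here are invented):

```python
def matches(job, machine):
    """True only if each ad's requirements expression accepts the other ad."""
    return job["requirements"](machine) and machine["requirements"](job)

def matchmake(job, machines):
    """Return the first machine ad that matches the job ad, else None."""
    for machine in machines:
        if matches(job, machine):
            return machine
    return None

# One job ad and two machine ads (attributes invented for illustration)
job = {
    "owner": "cms_prod",
    "requirements": lambda m: m["arch"] == "INTEL" and m["memory_mb"] >= 512,
}
machines = [
    {"name": "node1", "arch": "SUN4u", "memory_mb": 1024,
     "requirements": lambda j: True},
    {"name": "node2", "arch": "INTEL", "memory_mb": 2048,
     "requirements": lambda j: j["owner"] != "untrusted"},
]
```

The symmetry is the point: a machine can refuse a job (policy) just as a job can refuse a machine (resource needs), which is what lets resource owners retain control in an opportunistic system like Condor.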

Concerns – in the worst case
› Experiments have their own "data handling" systems at different stages of maturity, developed by collaboration projects and/or consensus across many institutions and stakeholders.
 BaBar
 D0 SAM
 CMS ORCA + GDMP + ??
 ATLAS mysql catalogs??
 JLAB Java implementation only
 STAR existing data transfers
› We will not be able to tolerate the overheads:
 Experiments have milestones, e.g. beam
 Across-experiment agreement
 Accepting s/w from outside over which we do not have complete control
 Of not just doing it ourselves

PPDG Project Activity #1 – GDMP
 Prototype developed for Objectivity database replication of CMS Monte Carlo data between CERN and remote sites
 As part of PPDG, will add more general features, e.g. support for flat files
 Also an EU DataGrid WP2 sub-project
 As part of both projects, integrated with the Globus Replica Catalog. An open question is the use of Globus replica management services
 Uses to date: transfer of CMS simulation data from CERN to remote sites; SDSS Science database for query testing at a remote site
 Under discussion for ATLAS
 An issue in spades is how future developments get planned and executed
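GDMP's core pattern is publish/subscribe replication: a producing site publishes newly created files to a shared catalog, and subscriber sites pull whatever they do not yet hold, registering their new copies in turn. A simplified Python sketch of that flow, assuming an in-memory stand-in for the Globus Replica Catalog (this is not GDMP's actual API):

```python
class ReplicaCatalog:
    """Maps logical file names to the set of sites holding a copy
    (an in-memory stand-in for the Globus Replica Catalog)."""
    def __init__(self):
        self.entries = {}

    def register(self, lfn, site):
        self.entries.setdefault(lfn, set()).add(site)

    def locations(self, lfn):
        return self.entries.get(lfn, set())

class Site:
    def __init__(self, name, catalog):
        self.name = name
        self.catalog = catalog
        self.files = set()

    def publish(self, lfn):
        """Announce a newly produced file in the shared catalog."""
        self.files.add(lfn)
        self.catalog.register(lfn, self.name)

    def synchronize(self):
        """Pull every catalogued file we lack, then register our new copies."""
        for lfn in set(self.catalog.entries) - self.files:
            self.files.add(lfn)                  # stands in for the file transfer
            self.catalog.register(lfn, self.name)
```

After a subscriber synchronizes, the catalog lists both sites as replica locations, so later consumers can fetch from whichever copy is closer.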

Project Activity #2 – D0 and University of Wisconsin Job Definition and Global Management layer
 Existing D0 SAM (Grid-enabled) distributed data handling & resource management system – including well-developed file replication and disk caching services
 Aim to make use of existing/extended Condor services – ClassAds, the Condor-G job submission layer
 Develop functionality to benefit D0 end users in an architecture that provides well-defined boundaries between SAM and the extra services. This allows reuse of the concepts and design (e.g. for CMS) and reuse of the actual software
 The Condor team provides the liaison to INFN EUDG WP1; the WP1 PPDG liaison is an ICL D0 collaborator
 Define scope and deliverables by the end of June:
  Job Definition Language and Distributed Management Features – extensions to existing SAM and Condor languages for the Grid
  Interface of the new s/w to SAM
  Demonstration application
 (need a name…)

Other PPDG Activities to be defined soon…
› STAR file replication using Globus replica catalog and management
› BaBar database transfer between SLAC and IN2P3
› ATLAS use of GDMP and Globus
› Java developments from JLAB with Globus
› Disk Resource Management in STAR with LBL
› Other activities:
 Participation in GriPhyN discussions
 Participation in the EU/US Data Grid Coordination meeting

GriPhyN (from I. Foster's all-hands talk)
› The project has two complementary & supporting elements:
 IT research project: will be judged on contributions to knowledge
 CS/application partnership: will also be judged on successful transfer to experiments
› Two associated unifying concepts:
 Virtual data as the central intellectual concept
 Toolkit as a central deliverable and technology transfer vehicle
› Petascale Virtual Data Grids – demonstrate experiment applications using virtual data concepts

GriPhyN Goals (Foster)
"Explore the concept of virtual data and its applicability to data-intensive science," i.e.:
1. Transparency with respect to location
 A known concept; but how to realize it in a large-scale, performance-oriented Data Grid?
2. Transparency with respect to materialization
 To determine: is this useful?
3. Automated management of computation
 Issues of scale, transparency

Virtual Data Toolkit (Livny)
"…a primary GriPhyN deliverable will be a suite of virtual data services and virtual data tools designed to support a wide range of applications. The development of this Virtual Data Toolkit (VDT) will enable the real-life experimentation needed to evaluate GriPhyN technologies. The VDT will also serve as a primary technology transfer mechanism to the four physics experiments and to the broader scientific community."
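"Transparency with respect to materialization" means a request names a data product, and the system either returns an existing copy or re-derives the product from its recorded transformation and inputs, recursively. A toy Python sketch of that derive-on-demand idea (the class and its methods are invented for illustration, not a GriPhyN interface):

```python
class VirtualDataCatalog:
    """Records, for each data product, the transformation and inputs
    that derive it, plus any copies that have already been materialized."""
    def __init__(self):
        self.recipes = {}        # product -> (transform function, input names)
        self.materialized = {}   # product -> value already computed/stored

    def declare(self, product, transform, inputs=()):
        self.recipes[product] = (transform, tuple(inputs))

    def request(self, product):
        """Return the product, re-deriving it (recursively) if no copy exists."""
        if product in self.materialized:
            return self.materialized[product]
        transform, inputs = self.recipes[product]
        value = transform(*(self.request(i) for i in inputs))
        self.materialized[product] = value       # cache the derived data
        return value
```

A requester never needs to know whether "summary" already exists on disk or must be recomputed from raw data; the catalog makes that decision, which is exactly the transparency being explored.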

Primary GriPhyN R&D Components
[Diagram: production teams, individual investigators, and other users reach the system through interactive user tools; virtual data tools, request planning and scheduling tools, and request execution management tools (with performance estimation and evaluation) sit above resource management services, security and policy services, and other Grid services, which manage distributed resources (code, storage, computers, and network), transforms, and raw data sources]

GriPhyN R&D from the all-hands meeting
 Dynamic Replication Strategies for a High Performance Data Grid – University of Chicago
 Agent Grid Interfaces – Information Sciences Institute (ISI)
 Query Optimization – Caltech
 Database query research – UC Berkeley
 Issues in Fault Tolerance and the Grid – University of California San Diego
 NeST: Network Storage Flexible Commodity Storage Appliances – University of Wisconsin
 Prophesy: Performance Analysis & Modeling for Distributed Applications – Northwestern University
 High-performance bulk data transfers with TCP – University of Chicago
 Location-Transparent Adaptive File Caching – LBL

VDT V1.0 & Existing Components
› Ongoing activity by the Condor and Globus teams to implement and harden existing Grid technology
› Condor job and resource management:
 Condor-G – reliable and recoverable job management
 DAGMan – basic job description, control, and flow services
 ClassAds – resource publication and matching
› Globus:
 MDS-2 information service – access to static & dynamic configuration & state information
 GRAM resource access protocol
 GridFTP data access and transfer protocol
 Replica catalog, replica management
 Grid Security Infrastructure – single sign-on
› SRB catalog services
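DAGMan's "control and flow" role is to release each job only after all of its parent jobs have finished, i.e. a topological traversal of the job dependency graph. The scheduling idea can be sketched in a few lines of Python (real DAGMan reads a DAG description file and submits through Condor; the job names below are made up):

```python
from graphlib import TopologicalSorter  # stdlib, Python 3.9+

def run_dag(dependencies, run_job):
    """Run each job after all of its parents, as DAGMan does for Condor jobs.

    dependencies: {job: set of parent jobs that must finish first}
    run_job: callable invoked once per job, in a dependency-respecting order
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for job in order:
        run_job(job)
    return order

# A typical HEP production chain expressed as parent -> child dependencies
deps = {
    "generate": set(),
    "simulate": {"generate"},
    "reconstruct": {"simulate"},
    "analyze": {"reconstruct"},
}
```

Real DAGMan adds what this sketch omits: per-node retry on failure, rescue DAGs for restarting a partially completed run, and parallel release of independent branches.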

GriPhyN Catalogs to date

Finally… back to what is facing us in PPDG:
› How to translate the "G" word into useful & ubiquitous services
› How to actually deliver what was proposed
› How production deployment will be achieved and supported
› Where are the boundaries between, and the overlap between, the projects and the experiment systems, both in the US and in Europe?
› Do the US Grid projects have holes in their architecture, e.g. "WP4" functionality?