Data Grid projects in HENP
R. Pordes, Fermilab

Many HENP projects are working on the infrastructure for globally distributed simulated data production, data processing and analysis, e.g. experiments now taking data: BaBar (0.3 PB/year), D0 (0.5 PB/year in 2002), STAR.

Funded projects
- GriPhyN (USA, NSF): $11.9M + $1.6M
- PPDG I (USA, DOE): $2M
- PPDG II (USA, DOE): $9.5M
- EU DataGrid (EU): $9.3M

Just funded or proposed projects
- iVDGL (USA, NSF): $15M + $1.8M + UK
- DTF (USA, NSF): $45M + $4M/yr
- DataTAG (EU, EC): $2M?
- GridPP (UK, PPARC): > $15M

Other national projects
- UK e-Science (> $100M)
- Italy, France, (Japan?)

Current Data Grid Projects at Fermilab

D0-SAM
- The D0 Run 2 Data Grid. The experiment is starting to take data now: 500 physicists, 80 institutions, 17 countries, 30 TB in the data store.
- Started 3 years ago by the Fermilab Run 2 Joint Offline project.
- Currently ~100 active users; 8 sites in the US and Europe.

PPDG
- A three-year, ten-institution, DOE/SciDAC-funded development and integration effort to deploy data grids for HENP.
- The D0 and CMS Fermilab data handling groups are collaborating on and contributing to the project.
- "Vertical integration" of Grid middleware components into HENP experiments' ongoing work.
- A laboratory for "experimental computer science".

D0 SAM
- Transparent data replication and caching, with support for multiple transports and interfaces to mass storage systems (bbftp, scp; Enstore, SARA).
- A well-developed metadata catalog for data, job and physics parameter tracking.
- A generalized interface to batch systems (LSF, FBS, PBS, Condor), with "economic concepts" to implement collaboration policies for resource management and scheduling.
- An interface for physics data set definition and selection (see the sketch after this list).
- Interfaces to user programs in C++ and Python.
- Support for a robust production service, with restart facilities, self-managing agent servers, and monitoring and logging services throughout.
- Support for simulation production as well as event data processing and analysis.
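
To make the "data set definition and selection" bullet concrete, here is a minimal, hypothetical Python sketch of a metadata-driven dataset definition. The class and field names (FileRecord, DatasetDefinition, data_tier, trigger_stream) are illustrative assumptions only, not the actual SAM Python API or catalog schema.

```python
# Hypothetical sketch: the *idea* of metadata-driven dataset definition and
# selection. These names are NOT the real SAM interfaces.
from dataclasses import dataclass


@dataclass
class FileRecord:
    """One entry in the metadata catalog (cf. the Files table in the schema)."""
    name: str
    data_tier: str          # e.g. "raw", "reconstructed", "thumbnail"
    trigger_stream: str
    run_number: int
    size_bytes: int


@dataclass
class DatasetDefinition:
    """A named, declarative selection over the metadata catalog."""
    name: str
    data_tier: str
    trigger_stream: str
    run_range: tuple        # (first_run, last_run), inclusive

    def select(self, catalog):
        """Return the files that currently satisfy the definition."""
        lo, hi = self.run_range
        return [f for f in catalog
                if f.data_tier == self.data_tier
                and f.trigger_stream == self.trigger_stream
                and lo <= f.run_number <= hi]


if __name__ == "__main__":
    catalog = [
        FileRecord("d0_raw_000123_001.evt", "raw", "muon", 123, 2_000_000_000),
        FileRecord("d0_reco_000123_001.root", "reconstructed", "muon", 123, 500_000_000),
        FileRecord("d0_reco_000200_001.root", "reconstructed", "muon", 200, 480_000_000),
    ]
    dataset = DatasetDefinition("muon_reco_r100_150", "reconstructed", "muon", (100, 150))
    for f in dataset.select(catalog):
        print(f.name)
```

The point of such an interface is that a physicist names a selection once, and the data handling system resolves it to whatever files currently match, wherever they are stored.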

Simplified Database Schema (diagram)
- Files: ID, Name, Format, Size, # Events
- Events: ID, Event Number, Trigger L1, Trigger L2, Trigger L3, Off-line Filter, Thumbnail
- Related tables: Volume, Project, Data Tier, Physical Data Stream, Trigger Configuration, Creation & Processing Info, Run, Event-File Catalog, Run Conditions (Luminosity, Calibration, Alignment), Group and User Information, Station Config. & Cache Info, File Storage Locations, MC Process & Decay
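
For readers who prefer DDL, the two central tables above can be sketched as follows. This is an illustrative SQLite rendering with assumed column types and an assumed link table; the real D0/SAM schema is far richer.

```python
# Sketch of the Files and Events tables from the simplified schema, expressed
# as SQLite DDL. Types and keys are assumptions for illustration only.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE files (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    format    TEXT,
    size      INTEGER,        -- bytes
    n_events  INTEGER
);

CREATE TABLE events (
    id             INTEGER PRIMARY KEY,
    event_number   INTEGER NOT NULL,
    trigger_l1     TEXT,
    trigger_l2     TEXT,
    trigger_l3     TEXT,
    offline_filter TEXT,
    thumbnail      BLOB
);

-- Event-File Catalog: many-to-many link between events and the files
-- (of different data tiers) that contain them.
CREATE TABLE event_file_catalog (
    event_id INTEGER REFERENCES events(id),
    file_id  INTEGER REFERENCES files(id),
    PRIMARY KEY (event_id, file_id)
);
""")
conn.close()
```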

D0 Monte Carlo Management System (diagram)
Remote station MC management: DB tables, an event generator, review/authorize, establish priorities, assign work, web summaries, input requests, and analysis project admin tools. The diagram distinguishes existing SAM pieces, new parts, and remote pieces: FNAL data serving, SAM tape, station caches, and remote Ocean/Mountain/Prairie MSS tape.

Data Added to SAM (plot): 160k files, 25 TB.

SAM on the Global Scale
An interconnected network of primary cache stations, communicating and replicating data where it is needed, over the WAN and backed by the MSS (diagram).
Current active stations: FNAL; Lyon, France (IN2P3); Amsterdam, Netherlands (NIKHEF); Lancaster, UK; Imperial College, UK; and others in the US.
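
The replication behaviour described above can be summarised in a few lines of Python. This is an assumed, simplified model of a cache station, not SAM code: a station serves a file from its own cache, otherwise pulls it from a peer station over the WAN, and only then stages it from the mass storage system.

```python
# Minimal sketch (assumed logic, not SAM code) of cache stations replicating
# data "where it is needed".

class Station:
    def __init__(self, name, peers=None):
        self.name = name
        self.cache = {}            # file name -> contents (or a local path)
        self.peers = peers or []   # other Station objects reachable over the WAN

    def fetch(self, filename, mss):
        # 1. Local cache hit: cheapest option.
        if filename in self.cache:
            return self.cache[filename]
        # 2. Replicate from the first peer that already caches the file.
        for peer in self.peers:
            if filename in peer.cache:
                self.cache[filename] = peer.cache[filename]
                return self.cache[filename]
        # 3. Last resort: stage from the mass storage system.
        self.cache[filename] = mss[filename]
        return self.cache[filename]


if __name__ == "__main__":
    mss = {"run123.raw": b"...tape data..."}   # stand-in for the MSS/tape store
    fnal = Station("FNAL")
    fnal.cache["run123.raw"] = mss["run123.raw"]
    lyon = Station("Lyon (IN2P3)", peers=[fnal])
    lyon.fetch("run123.raw", mss)              # replicated from FNAL, not tape
```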

PPDG – profile and 3-year goals

Funding: US DOE approved 1/1/3/3/3 $M, 1999–2003
Computer Science: Globus (Foster), Condor (Livny), SDM (Shoshani), Storage Resource Broker (Moore)
Physics: BaBar, DZero, STAR, JLAB, ATLAS, CMS
National Laboratories: BNL, Fermilab, JLAB, SLAC, ANL, LBNL
Universities: Caltech, SDSS, UCSD, Wisconsin
Hardware/Networks: no funding

Goals: experiments successfully embrace and deploy grid services throughout their data handling and analysis systems, based on shared experiences and developments. Computer science groups evolve common interfaces and services to serve not only a range of High Energy and Nuclear Physics experiment needs but also other scientific communities.

PPDG main areas of work:

Extending Grid services:
- Storage resource management and interfacing.
- Robust file replication and information services (a sketch follows after this list).
- Intelligent job and resource management.
- System monitoring and information capture.

End-to-end applications:
- Experiments' data handling systems, in use now and in the near future, to give real-world requirements, testing and feedback.
- Error reporting and response.
- Fault-tolerant integration of complex components.

Cross-project activities:
- Authentication and certificate authorization and exchange.
- European DataGrid common project for data transfer (Grid Data Management Pilot).
- SC2001 demo with GriPhyN.
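
As a concrete illustration of the "robust file replication" and "error reporting and response" themes referenced above, the sketch below retries a transfer over alternative transports with logging instead of failing on the first error. The function names and transports are placeholders, not PPDG or GDMP interfaces.

```python
# Hedged illustration only: retrying a file transfer over alternative
# transports (bbftp, scp, ...) with logging. Not a real PPDG/GDMP API.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("replication")


def replicate(source, destination, transports, retries_per_transport=3, backoff_s=5.0):
    """Try each transport in turn until the copy succeeds; raise if all fail."""
    for transport_name, transfer in transports:
        for attempt in range(1, retries_per_transport + 1):
            try:
                transfer(source, destination)
                log.info("copied %s -> %s via %s", source, destination, transport_name)
                return transport_name
            except Exception as exc:
                log.warning("%s attempt %d failed: %s", transport_name, attempt, exc)
                time.sleep(backoff_s)
    raise RuntimeError(f"all transports failed for {source}")


# Placeholder transports; real ones would shell out to bbftp/scp, etc.
def bbftp_copy(src, dst):
    raise IOError("bbftp server unreachable")      # simulate a failure

def scp_copy(src, dst):
    pass                                           # simulate success

if __name__ == "__main__":
    replicate("fnal:/pnfs/d0/run123.raw", "lyon:/cache/run123.raw",
              [("bbftp", bbftp_copy), ("scp", scp_copy)], backoff_s=0.0)
```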

Align PPDG milestones to experiment data challenges, first year:
- ATLAS – production distributed data service – 6/1/02
- BaBar – analysis across partitioned dataset storage – 5/1/02
- CMS – distributed simulation production – 1/1/02
- D0 – distributed analyses across multiple workgroup clusters – 4/1/02
- STAR – automated dataset replication – 12/1/01
- JLAB – policy-driven file migration – 2/1/02
A management/coordination challenge.

Example data grid hierarchy – CMS Tier 1 and Tier 2 (diagram)
Tier 2 centers are used for simulation production today; the hierarchy runs from Tier 0 (CERN) to Tier 1 centers (FNAL and others) to the Tier 2 sites.

PPDG views cross-project coordination as important:

Other Grid projects in our field:
- GriPhyN – Grid Physics Network
- European DataGrid
- Storage Resource Management collaboratory
- HENP Data Grid Coordination Committee

Deployed systems:
- ATLAS, BaBar, CMS, D0, STAR, and JLAB experiment data handling systems
- iVDGL – International Virtual Data Grid Laboratory
- Use DTF computational facilities?

Standards committees:
- Internet2 High Energy and Nuclear Physics Working Group
- Global Grid Forum