Slide 1: BaBar and DØ Experiment Reports
DOE Review of PPDG, January 28-29, 2003
Lee Lueking, Fermilab Computing Division, D0 Liaison to PPDG

Slide 2: Introduction

BaBar's PPDG effort is concentrating on:
- Data distribution on the Grid (SRB, BdbServer++).
- Job submission on the Grid (EDG, LCG).
People involved: Tim Adye (RAL), Andy Hanushevsky (SLAC), Adil Hasan (SLAC), Wilko Kroeger (SLAC).
Interactions with other Grid efforts that are part of BaBar: GridPP (UK), EDG (Europe, through Dominique Boutigny), GridKa, Italian Grid groups, etc.
BaBar Grid applications are being designed to be data-format neutral, so BaBar's new computing model should have little impact on the applications.

DØ's PPDG effort is concentrating on:
- Data distribution on the Grid (SAM).
- Job submission on the Grid (JIM with Condor-G and Globus).
People involved: Igor Terekhov (FNAL, JIM team lead), Gabriele Garzoglio (FNAL), Andrew Baranovski (FNAL), Parag Mhashilkar and Vijay Murthi (via contract with UTA CSE), Lee Lueking (FNAL, D0 liaison to PPDG).
Interactions with other Grid efforts that are part of D0: GridPP (UK), GridKA (DE), NIKHEF (NL), CCIN2P3 (FR).
Working very closely with the Condor team to achieve a Grid job and resource matchmaking service, plus other robustness and usability features.

Slide 3: Overview of BaBar and DØ Data Handling

Both experiments have extensive distributed computing and data handling systems. Significant amounts of data are processed at remote sites in the US and Europe.

[Figures: DØ SAM deployment map (regional centers and analysis sites); charts of DØ integrated files and data consumed, Mar '02 to Mar '03 (1.2 PB of data); BaBar database growth (TB), Jan '02 to Dec '02; BaBar analysis jobs at SLAC, Apr '02 to Mar '03 (140k jobs); BaBar deployment map (Tier A centers, Monte Carlo sites).]

Slide 4: BaBar Bulk Data Distribution – SRB

The Storage Resource Broker (SRB) from SDSC is being used to test data distribution from Tier A to Tier A, with a view to production this summer.
So far there have been two successful demos at Super Computing: 2001 (SLAC -> SLAC) and 2002 (SLAC -> ccin2p3).
Have been testing SRB V2 (released Feb 2003); new features include bulk registering in the RDBMS and parallel-stream file replication.
Busy incorporating the newly designed BaBar metadata tables into SRB's RDBMS tables.
Looking to improve file replication performance (tuning the number of parallel streams, etc.); a scripted replication sketch follows below.
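To make the replication step concrete, here is a minimal Python sketch, assuming the SDSC SRB client Scommands (Sinit, Sput, Sls, Sexit) are installed and already configured against an MCAT server. The file, collection, and resource names are invented, and the exact `-S` destination-resource option should be treated as an assumption, not a confirmed signature.

```python
# Hypothetical sketch of driving SRB Scommands to replicate a set of
# BaBar files between Tier A sites. Names below are placeholders.
import subprocess

FILES = ["run12345.root", "run12346.root"]          # invented file names
DEST_COLLECTION = "/babar/home/tierA.ccin2p3/data"  # invented SRB collection
DEST_RESOURCE = "ccin2p3-hpss"                      # invented SRB resource

def srb(*args):
    """Run one Scommand and fail loudly on error."""
    subprocess.run(list(args), check=True)

srb("Sinit")                          # authenticate to the MCAT server
for f in FILES:
    # -S (assumed) selects the destination storage resource; the new
    # replica is registered in the MCAT RDBMS as part of the put.
    srb("Sput", "-S", DEST_RESOURCE, f, DEST_COLLECTION)
srb("Sls", DEST_COLLECTION)           # verify the replicas are registered
srb("Sexit")                          # drop the session
```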

Slide 5: BaBar User-Driven Data Distribution – BdbServer++

Addresses the use case of a user who wants to copy a collection of sparse events with little space overhead (mainly Tier A to Tier C).
BdbServer++ is essentially a set of scripts that:
- Submit a job to the Grid to make a deep copy of the sparse collection (i.e., copy the objects for events of interest only).
- Then copy the files back to the user's institution through the Grid (can use globus-url-copy; see the sketch after this slide).
Presented as a poster at CHEP 2003.
Have tested the deep copy through the grid using EDG and pure Globus. Just completed a test of extracting data using globus-url-copy (a pure Globus request).
To do: incorporate with BaBar bookkeeping; robustness and reliability tests; production-level scripts for submission and copying.
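A minimal sketch of the copy-back step, assuming the deep-copy grid job has already staged the extracted collection on a GridFTP server and that a valid grid proxy exists (grid-proxy-init). globus-url-copy's basic source/destination usage is real; the host, paths, and file names here are placeholders, not BaBar's actual layout.

```python
import subprocess

# Hypothetical GridFTP location written by the deep-copy grid job, and a
# local destination at the user's institution (Tier C).
REMOTE = "gsiftp://tier-a.example.org/scratch/bdbserver/mycoll/"
LOCAL = "file:///data/babar/mycoll/"

# The file list would come from BaBar bookkeeping; these are invented.
for name in ["deepcopy_001.root", "deepcopy_002.root"]:
    # Basic globus-url-copy usage: one source URL, one destination URL.
    subprocess.run(["globus-url-copy", REMOTE + name, LOCAL + name],
                   check=True)
```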

Slide 6: BaBar Job Submission on the Grid

Many production-like activities could take advantage of compute resources at more than one site:
- Analysis production: ccin2p3 (France), UK, SLAC – using EDG installations.
- Simulation production: Ferrara (Italy) Grid group, Ohio – using EDG and VDT installations.
- Also very useful for data distribution (BdbServer++): ccin2p3 (France), SLAC.

[Figure: proposed BaBar Grid architecture.]

Slide 7: BaBar Job Submission on the Grid (continued)

There was a CHEP 2003 talk and poster, plus a grid demo set up in the UK (running BaBar jobs on the UK grid); Simulation Production and data distribution tests have been run on the Grid.
Plan: test the new EDG2/LCG installations and increase the number of users as releases stabilize.
BbgUtils.pl – a Perl script to allow easier client-side installation of Globus plus CAs (currently works for Sun, Linux).
- The script copies all the tar files, signing policies, etc. necessary for a client installation for that experiment (the idea is sketched below).
- Can be readily extended to include SRB client-side installation, EDG/LCG client-side installation, etc.
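BbgUtils.pl itself is Perl; purely to illustrate the client-install idea in this document's sketch language, here is a Python version of the same pattern: fetch the client bundle plus the CA certificates and signing policies, and unpack them under one root. The URLs and file names are invented placeholders, not the script's actual sources.

```python
import pathlib
import tarfile
import urllib.request

ROOT = pathlib.Path.home() / "grid-client"
BASE = "http://grid.example.org/dist/"  # invented distribution server
FILES = [
    "globus-client-linux.tar.gz",       # hypothetical client tarball
    "babar-ca-certs.tar.gz",            # CA certs + *.signing_policy files
]

ROOT.mkdir(exist_ok=True)
for name in FILES:
    local = ROOT / name
    urllib.request.urlretrieve(BASE + name, local)  # download the bundle
    with tarfile.open(local) as tar:
        tar.extractall(ROOT)                        # unpack under one root
```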

Slide 8: DØ – Objectives of SAMGrid

Bring standard grid technologies (including Globus and Condor) to the Run II experiments.
Enable globally distributed computing for DØ and CDF.
JIM (Job and Information Management) complements SAM by adding job management and monitoring to data handling.
Together, JIM + SAM = SAMGrid.

28 April 2003Lee Lueking, PPDG Review9 JOB Computing Element Submission Client User Interface Queuing System JIM Job Management User Interface Broker Match Making Service Information Collector Execution Site #1 Submission Client Match Making Service Computing Element Grid Sensors Execution Site #n Queuing System Grid Sensors Storage Element Computing Element Storage Element Data Handling System Storage Element Informatio n Collector Grid Sensor s Computin g Element Data Handling System

Slide 10: DØ JIM Deployment

A site can join SAM-Grid with any combination of services: monitoring, and/or execution, and/or submission.
May 2003: expect 5 initial execution sites for SAMGrid deployment, and 20 submission sites.
- GridKa (Karlsruhe) – analysis site.
- Imperial College and Lancaster – MC sites.
- U. Michigan (NPACI) – reconstruction center.
- FNAL – CLueD0 as a submission site.
Summer 2003: continue to add execution and submission sites. The second round of execution-site deployments includes Lyon (ccin2p3), Manchester, MSU, Princeton, UTA, and the FNAL CAB system.
Hope to grow to dozens of execution sites and hundreds of submission sites over the next year(s).
Use grid middleware for job submission within a site too:
- Administrators will have general ways of managing resources.
- Users will use common tools for submitting and monitoring jobs everywhere.

Slide 11: What's Next for SAMGrid? (After JIM Version 1)

Improve job scheduling and decision making.
Improved monitoring: more comprehensive, easier to navigate.
Execution of structured jobs.
Simplify packaging and deployment.
Extend the configuration and advertising features of the uniform, XML-based framework built for JIM (a toy advertisement is sketched below).
CDF is adopting SAM and SAMGrid for their data handling and job submission; CDF has also asked to join PPDG.
Interoperability, interoperability, interoperability:
- Working with EDG and LCG to move in common directions.
- Moving to Web services, Globus V3, and all the good things OGSA will provide. In particular, achieving interoperability by expressing SAM and JIM as a collection of services, and mixing and matching with other Grids.
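As a toy illustration of the XML-based advertising just mentioned, a site advertisement might be parsed as below. The element and attribute names are invented for this sketch; JIM's actual schema is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical site advertisement (invented schema).
AD = """
<site name="GridKa">
  <service type="execution"/>
  <service type="monitoring"/>
  <resource cpus="40" cache_gb="500"/>
</site>
"""

site = ET.fromstring(AD)
services = [s.get("type") for s in site.findall("service")]
print(site.get("name"), "advertises:", ", ".join(services))
```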

Slide 12: Challenges

Meeting the challenges of real data handling and job submission, BaBar and DØ have confronted real-life issues, including:
- File replication integrity.
- Preemptive distributed caching.
- Private networks, and routing data in a worldwide system.
- Reliable network file transfers, timeouts, and retries (a retry sketch follows below).
- Simplifying complex installation procedures.
- Username clashing issues; moving to GSI and Grid certificates.
- Interoperability with many mass storage systems (MSS).
- Security issues, firewalls, and site policies.
- Robust job submission on the grid.
Troubleshooting is an important and time-consuming activity in distributed computing environments, and many tools are needed to do this effectively.
Operating these distributed systems on a 24/7 basis involves coordination, training, and worldwide effort.
Standard middleware is still hard to use, and requires significant integration, testing, and debugging.
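The timeout-and-retry bullet lends itself to a small sketch. This is not SAM's actual stager logic, just the generic pattern with placeholder URLs: bound each transfer attempt with a timeout, and retry a limited number of times with exponential backoff.

```python
import subprocess
import time

SRC = "gsiftp://remote.site.example/data/file.root"  # invented source
DST = "file:///cache/file.root"                      # invented destination

def transfer(src, dst, attempts=5, timeout=600):
    """Run globus-url-copy with a per-attempt timeout and backoff."""
    delay = 30
    for n in range(1, attempts + 1):
        try:
            subprocess.run(["globus-url-copy", src, dst],
                           check=True, timeout=timeout)
            return True
        except (subprocess.CalledProcessError, subprocess.TimeoutExpired):
            print(f"attempt {n} failed; retrying in {delay}s")
            time.sleep(delay)
            delay *= 2  # back off to avoid hammering a struggling server
    return False

transfer(SRC, DST)
```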

Slide 13: [No transcribed text; the slide appears to contain only a figure.]

Slide 14: PPDG Benefits to BaBar and DØ

PPDG has provided very useful collaboration with, and feedback to, other Grid and computer science groups.
Development of tools and middleware that should be of general interest to the Grid community, e.g.:
- BbgUtils.pl
- Condor-G enhancements
Deploying and testing grid middleware under the battlefield conditions of operational experiments hardens the software and helps the CS groups learn what is needed.
The CS groups enable the experiments to examine problems in new, innovative ways, and provide important new technologies for solving them.

Slide 15: The End