ATLAS Data Challenge Production Experience
Kaushik De, University of Texas at Arlington
Oklahoma D0 SARS Meeting, September 26, 2003

ATLAS Data Challenges
 Original Goals (Nov 15, 2001)
 Test the computing model, its software, and its data model, and ensure the correctness of the technical choices to be made
 Data Challenges should be executed at the prototype Tier centres
 Data Challenges will be used as input for a Computing Technical Design Report (due by the end of 2003?) and for preparing a MoU
 Current Status
 Goals are evolving as we gain experience
 Computing TDR ~end of 2004
 DCs are a roughly yearly sequence of increasing scale & complexity
 DC0 and DC1 (completed)
 DC2 (2004), DC3, and DC4 planned
 Grid deployment and testing is a major part of the DCs

ATLAS DC1: July 2002 - April 2003
Goals: produce the data needed for the HLT TDR; get as many ATLAS institutes involved as possible; a worldwide collaborative activity
Participation: 56 institutes (39 in phase 1)
 Australia  Austria  Canada  CERN  China  Czech Republic  Denmark*  France  Germany  Greece  Israel  Italy  Japan  Norway*  Poland  Russia  Spain  Sweden*  Taiwan  UK  USA*
( new countries or institutes; * using Grid)

DC1 Statistics (G. Poulard, July 2003)
[Table of the number of events, CPU time (kSI2k.months), CPU-days (at 400 SI2k) and data volume (TB) for each processing step: physics-event simulation, single-particle simulation, Lumi02 and Lumi10 pile-up, and reconstruction (+ Lvl1/2). Total CPU time: 690 (+84) kSI2k.months.]

DC2: Scenario & Time Scale (G. Poulard)
Milestones:
 End-July 03: Release 7
 Mid-November 03: pre-production release
 February 1st 04: Release 8 (production)
 April 1st 04:
 June 1st 04: “DC2”
 July 15th:
Activities:
 Put in place, understand & validate: Geant4; POOL; LCG applications; Event Data Model; digitization; pile-up; byte-stream; conversion of DC1 data to POOL; large-scale persistency tests and reconstruction
 Testing and validation
 Run test-production
 Start final validation
 Start simulation; pile-up & digitization
 Event mixing
 Transfer data to CERN
 Intensive reconstruction on “Tier0”
 Distribution of ESD & AOD
 Calibration; alignment
 Start physics analysis
 Reprocessing

U.S. ATLAS DC1 Data Production
 Year-long process (DC1 ran July 2002 - April 2003)
 Played the 2nd largest role in ATLAS DC1
 Exercised both farm- and grid-based production
 10 U.S. sites participating
 Tier 1: BNL; Tier 2 prototypes: BU, IU/UC; Grid Testbed sites: ANL, LBNL, UM, OU, SMU, UTA (UNM & UTPA will join for DC2)
 Generated ~2 million fully simulated, piled-up and reconstructed events
 U.S. was the largest grid-based DC1 data producer in ATLAS
 Data used for the HLT TDR, the Athens physics workshop, reconstruction software tests...

U.S. ATLAS Grid Testbed
 BNL - U.S. Tier 1, 2000 nodes, 5% for ATLAS, 10 TB, HPSS through Magda
 LBNL - pdsf cluster, 400 nodes, 5% for ATLAS (more if idle, ~10-15% used), 1 TB
 Boston U. - prototype Tier 2, 64 nodes
 Indiana U. - prototype Tier 2, 64 nodes
 UT Arlington - new, 200 CPUs, 50 TB
 Oklahoma U. - OSCER facility
 U. Michigan - test nodes
 ANL - test nodes, JAZZ cluster
 SMU - 6 production nodes
 UNM - Los Lobos cluster
 U. Chicago - test nodes

U.S. Production Summary
 Total of ~30 CPU-years delivered to DC1 from the U.S.
 Total produced file size: ~20 TB on the HPSS tape system, ~10 TB on disk
 [Table of U.S. datasets, colour-coded: black = majority grid-produced, blue = majority farm-produced]
 Exercised both farm- and grid-based production
 Valuable large-scale grid-based production experience

Grid Production Statistics
Figure: pie chart showing the sites where DC1 single-particle simulation jobs were processed. Only three grid testbed sites were used for this production in August.
Figure: pie chart showing the number of pile-up jobs successfully completed at various U.S. grid sites for dataset 2001 (25 GeV dijets). A total of 6000 partitions were generated.
These are examples of some datasets produced on the Grid. Many other large samples were produced, especially at BNL using batch.

DC1 Production Systems
 Local batch systems - bulk of production
 GRAT - grid scripts, ~50k files produced in the U.S.
 NorduGrid - grid system, ~10k files in the Nordic countries
 AtCom - GUI, ~10k files at CERN (mostly batch)
 GCE - Chimera-based, ~1k files produced
 GRAPPA - interactive GUI for individual users
 EDG - test files only
 + systems I forgot…
 More systems coming for DC2: LCG, GANGA, DIAL

GRAT Software
 GRid Applications Toolkit
 Developed by KD, Horst Severini, Mark Sosebee, and students
 Based on Globus, Magda & MySQL
 Shell & Python scripts, modular design
 Rapid development platform
 Quickly develop packages as needed by the DC:
 Physics simulation (GEANT/ATLSIM)
 Pile-up production & data management
 Reconstruction
 Test grid middleware, test grid performance
 Modules can be easily enhanced or replaced, e.g. EDG resource broker, Chimera, replica catalogue… (in progress); see the sketch below
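The modular design can be pictured as a small, uniform package interface. The following Python sketch is illustrative only: the class names, script names and file-name patterns are assumptions, not the actual GRAT code, but they show how each Data Challenge step can expose the same command/outputs pair so that a module can be enhanced or replaced without touching the driver scripts.

```python
# Illustrative sketch; SimulationStep, PileupStep, run_atlsim.sh and
# run_pileup.sh are hypothetical names, not the real GRAT packages.

class SimulationStep:
    """One ATLSIM simulation partition, expressed as a command plus its outputs."""

    def __init__(self, dataset, partition, nevents):
        self.dataset = dataset
        self.partition = partition
        self.nevents = nevents

    def command(self):
        # Placeholder payload; the DC1 payload drove GEANT/ATLSIM from a shell
        # script staged with the ATLAS release at the remote site.
        return ["./run_atlsim.sh", self.dataset,
                str(self.partition), str(self.nevents)]

    def outputs(self):
        # Logical file names, later registered in the Magda catalogue.
        return ["%s.simul.%04d.zebra" % (self.dataset, self.partition)]


class PileupStep:
    """Same interface, different payload: the driver scripts need not change."""

    def __init__(self, dataset, partition):
        self.dataset = dataset
        self.partition = partition

    def command(self):
        return ["./run_pileup.sh", self.dataset, str(self.partition)]

    def outputs(self):
        return ["%s.pileup.%04d.zebra" % (self.dataset, self.partition)]
```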

GRAT Execution Model
1. Resource discovery
2. Partition selection
3. Job creation
4. Pre-staging
5. Batch submission
6. Job parameterization
7. Simulation
8. Post-staging
9. Cataloging
10. Monitoring
[Diagram: the workflow connects DC1 Prod. (UTA), Param (CERN), MAGDA (BNL), a local replica, and the remote gatekeeper's batch execution and scratch areas.] A sketch of steps 4-10 follows below.
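As a rough illustration of how steps 4-10 map onto standard Globus Toolkit 2 command-line tools (globus-url-copy for staging, globus-job-run for submission through the gatekeeper), here is a hedged Python sketch. The host names, remote paths and payload script are placeholders, and steps 1-3 (resource discovery, partition selection and job creation against the production database) are assumed to have happened before this function is called.

```python
# Hypothetical driver for one partition; hosts, paths and the payload script
# are placeholders, not the real GRAT configuration.
import subprocess


def run_one_partition(gatekeeper, gridftp_host, partition, input_url, output_url):
    # 4. Pre-stage the input file to the remote scratch area over GridFTP.
    subprocess.check_call(["globus-url-copy", input_url,
                           "gsiftp://%s/scratch/input.%06d.zebra"
                           % (gridftp_host, partition)])

    # 5-7. Parameterize and submit the simulation payload to the remote batch
    #      system through the Globus gatekeeper.
    subprocess.check_call(["globus-job-run", gatekeeper,
                           "/scratch/run_atlsim.sh", str(partition)])

    # 8. Post-stage the output back to mass storage (e.g. HPSS, via Magda).
    subprocess.check_call(["globus-url-copy",
                           "gsiftp://%s/scratch/output.%06d.zebra"
                           % (gridftp_host, partition),
                           output_url])

    # 9-10. Registering the output in Magda and updating the job status in the
    #       production database would follow here (see the database sketch below).
```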

Databases used in GRAT
 Production database: defines logical job parameters & filenames; tracks job status, updated periodically by scripts (see the sketch below)
 Data management (Magda): file registration/catalogue; grid-based file transfers
 Virtual Data Catalogue: simulation job definition; job parameters, random numbers
 Metadata catalogue (AMI): post-production summary information; data provenance
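A minimal sketch of the production-database bookkeeping, assuming a hypothetical table layout and the MySQLdb client (the slides do not show the real GRAT schema): logical job parameters and file names are defined up front, and the scripts periodically update the status column as partitions are processed.

```python
# Hypothetical production-database bookkeeping; the table layout and
# connection parameters are assumptions, not the real GRAT schema.
import MySQLdb  # MySQL-python client

SCHEMA = """
CREATE TABLE IF NOT EXISTS partitions (
    dataset   VARCHAR(64),
    part_nr   INT,
    lfn       VARCHAR(128),  -- logical file name, later registered in Magda
    site      VARCHAR(32),
    status    ENUM('defined', 'running', 'done', 'failed') DEFAULT 'defined',
    PRIMARY KEY (dataset, part_nr)
)
"""


def mark_done(db, dataset, part_nr):
    """Scripts update job status periodically; this marks one partition done."""
    cur = db.cursor()
    cur.execute("UPDATE partitions SET status = 'done' "
                "WHERE dataset = %s AND part_nr = %s", (dataset, part_nr))
    db.commit()


if __name__ == "__main__":
    db = MySQLdb.connect(host="localhost", user="dc1", db="dc1_prod")  # placeholders
    db.cursor().execute(SCHEMA)
    mark_done(db, "dc1.002001.simul", 42)
```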

U.S. Middleware Evolution
[Table of the successive middleware options and their DC1 roles:]
 Used for 95% of DC1 production
 Used successfully for simulation
 Tested for simulation, used for all grid-based reconstruction
 Used successfully for simulation (complex pile-up workflow not yet)

U.S. Experience with DC1
 ATLAS software distribution worked well for DC1 farm production, but was not well suited to grid production
 No integration of databases - caused many problems
 Magda & AMI very useful - but we are missing a data management tool for truly distributed production
 Running production in the U.S. required a lot of people, especially with so many sites on both grid and farm
 Startup of grid production was slow - but we learned useful lessons
 Software releases were often late - leading to a chaotic last-minute rush to finish production

Plans for the New DC2 Production System
 Need a unified system for ATLAS
 For efficient usage of facilities, improved scheduling, better QC
 Should support all varieties of grid middleware (& batch?)
 First “technical” meeting at CERN, August 11-12, 2003
 Phone meetings, forming code development groups
 All grid systems represented
 A design document is being prepared
 Planning a Supervisor/Executor model (see the next slide)
 First prototype software should be released in ~6 months
 U.S. well represented in this common ATLAS effort
 Still unresolved: the Data Management System
 Strong coordination with the database group

Schematic of the New DC2 System
 Main features (a sketch of the supervisor/executor split follows below):
 Common production database for all of ATLAS
 Common ATLAS supervisor run by all facilities/managers
 Common data management system a la Magda
 Executors developed by middleware experts (LCG, NorduGrid, Chimera teams)
 Final verification of data done by the supervisor
 U.S. involved in almost all aspects - could use more help
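To make the supervisor/executor split concrete, here is a small Python sketch under stated assumptions: the class names, method signatures and job format are illustrative only, not the interfaces the DC2 system actually adopted. One executor is written per grid flavour, while the supervisor pulls job definitions from the common production database, dispatches them, and performs the final verification of the data.

```python
# Illustrative supervisor/executor sketch; all names and signatures are
# assumptions for this example, not the real DC2 production system.
import abc


class Executor(abc.ABC):
    """One implementation per grid flavour (LCG, NorduGrid, Chimera, plain batch)."""

    @abc.abstractmethod
    def submit(self, job: dict) -> str:
        """Translate a grid-neutral job definition into middleware calls."""

    @abc.abstractmethod
    def status(self, handle: str) -> str:
        """Return 'running', 'done' or 'failed' for a previously submitted job."""


class EchoExecutor(Executor):
    """Trivial stand-in executor so the sketch can be run end to end."""

    def submit(self, job):
        print("would submit", job["name"])
        return job["name"]

    def status(self, handle):
        return "done"


def supervise(jobs, executor, verify):
    """Supervisor loop: submit every defined job, then verify the finished output.

    `verify` stands in for the common data management system's check that the
    expected output files were produced and catalogued.
    """
    handles = {job["name"]: executor.submit(job) for job in jobs}
    results = {}
    for name, handle in handles.items():
        if executor.status(handle) == "done" and verify(name):
            results[name] = "validated"   # final verification done by the supervisor
        else:
            results[name] = "failed"
    return results


if __name__ == "__main__":
    jobs = [{"name": "dc2.simul.0001"}, {"name": "dc2.simul.0002"}]
    print(supervise(jobs, EchoExecutor(), verify=lambda name: True))
```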

Conclusion
 Data Challenges are important for ATLAS software and computing infrastructure readiness
 U.S. playing a major role in DC planning & production
 12 U.S. sites ready to participate in DC2
 UTA & OU - major role in production software development
 Physics analysis will be the emphasis of DC2 - new experience
 Involvement by more U.S. physicists is needed in DC2:
 to verify the quality of the data
 to tune physics algorithms
 to test the scalability of the physics analysis model