Distributed Physics Analysis: Past, Present, and Future
Kaushik De, University of Texas at Arlington (ATLAS & D0 Collaborations)
ICHEP’06, Moscow, July 29, 2006

Introduction
- Computing needs for HENP experiments keep growing
- Computing models have evolved to meet those needs
- We have seen many important paradigm shifts in the past: farm computing, distributed information systems (the world-wide web), distributed production, and now distributed analysis (DA)
- Many lessons from the past – from FNAL, SLAC, RHIC and LHC
- I will talk about some general ideas, with examples from the ATLAS and D0 experiments – see other talks here for additional LHC-specific details

Distributed Analysis Goals
- Mission statement: remote physics analysis on globally distributed computing systems
- Scale: set by experimental needs
- Let's look at an LHC example from ATLAS in 2008:
  - 10,000-20,000 CPUs distributed at ~40 sites
  - 100 TB transferred from CERN per day (100k files per day) – a rough arithmetic check follows this slide
  - PB-scale data stored worldwide from the 1st year at the LHC
  - Simultaneous access to data for distributed production and DA
- Physicists (users) will need access to both large-scale storage and CPU from thousands of desktops worldwide
- DA systems are being designed to meet these challenges at the LHC, while learning from current & past experiments
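
As a quick sanity check of the transfer numbers above, a few lines of arithmetic (purely illustrative, using only the figures quoted on this slide) show what they imply for average file size and sustained network rate:

    # Back-of-the-envelope check of the export numbers quoted on this slide
    # (100 TB/day, 100k files/day); purely illustrative arithmetic.
    TB = 1e12                      # bytes, decimal convention
    daily_volume = 100 * TB
    daily_files = 100_000
    seconds_per_day = 86_400

    avg_file_size_gb = daily_volume / daily_files / 1e9
    sustained_rate_gbit_s = daily_volume * 8 / seconds_per_day / 1e9

    print(f"average file size: {avg_file_size_gb:.1f} GB")
    print(f"sustained export : {sustained_rate_gbit_s:.1f} Gb/s")
    # -> roughly 1 GB per file and ~9 Gb/s sustained out of CERN

The resulting ~9 Gb/s is of the same order as the 10 Gb/s backbone assumed in the ATLAS computing model later in this talk.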

Distributed Analysis Challenges
- Distributed production is now routinely done in HENP
  - For MC production and reprocessing of data – not yet at LHC scale
  - Scale: a few TB of data generated/processed daily in ATLAS
  - Scope: organized activity, managed by experts
- Lessons learned from production:
  - Robust software systems to automatically recover from grid failures (a minimal retry sketch follows this slide)
  - Robust site services – with hundreds of sites, there are daily failures
  - Robust data management – pre-location of data, cataloguing, transfers
- Distributed analysis is in the early stages of testing
  - Moving from the Regional Analysis Center model (e.g. D0) to a fully distributed analysis model – computing on demand
  - Presents new challenges, in addition to those faced in production
  - Chaotic by nature – hundreds of users, random fluctuations in demand
  - Robustness becomes even more critical – software, sites, services
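
To make the "automatic recovery" point concrete, here is a minimal sketch (not ATLAS or D0 production code) of the retry-with-backoff pattern such systems rely on when individual grid operations fail transiently; the function names and failure model are invented for illustration:

    # Minimal retry-with-backoff sketch; submit_job and its failure mode are
    # placeholders, not a real grid API.
    import random
    import time

    class TransientGridError(Exception):
        """Stand-in for a recoverable failure (site down, transfer timeout, ...)."""

    def submit_job(site):
        # Hypothetical submission call; fails randomly to mimic flaky sites.
        if random.random() < 0.3:
            raise TransientGridError(f"submission to {site} failed")
        return f"job-id@{site}"

    def submit_with_retry(sites, max_attempts=5, base_delay=1.0):
        for attempt in range(max_attempts):
            site = sites[attempt % len(sites)]      # rotate over candidate sites
            try:
                return submit_job(site)
            except TransientGridError as err:
                wait = base_delay * 2 ** attempt    # exponential backoff
                print(f"attempt {attempt + 1}: {err}; retrying in {wait:.1f}s")
                time.sleep(wait)
        raise RuntimeError("all attempts failed - flag for operator follow-up")

    print(submit_with_retry(["site_A", "site_B"], base_delay=0.1))

A real production system adds bookkeeping on top of this (failure classification, blacklisting of bad sites, operator alarms), but the basic loop is the same.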

Role of Grid Middleware
- Basic grid middleware for distributed analysis:
  - Most HEP experiments use VDT (which includes Globus)
  - Security and accounting – GSI authentication, Virtual Organizations
  - Tools for secure file transfer and job submission to remote systems
  - Data location catalogues (RLS, LFC)
- Higher-level middleware through international Grid projects:
  - Resource brokers (e.g. LCG, gLite, Condor-G, …)
  - Tools for reliable file transfer (FTS, …)
  - User and group account management (VOMS)
- Experiments build application layers on top of the middleware (sketched after this slide):
  - To manage experiment-specific workflow
  - Data (storage) management tools, and database applications
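
The sketch below illustrates the layering idea in the last bullet group: an experiment-specific workflow function composed from generic middleware primitives. Every function here is a hypothetical placeholder standing in for a real service (catalogue lookup, reliable transfer, broker submission); it is not the API of VDT, LCG or gLite.

    # Hypothetical layering sketch: an experiment "application layer" composed
    # on top of generic middleware primitives (all names invented).
    def locate_replicas(lfn):
        """Data-location catalogue lookup (RLS/LFC-like); placeholder."""
        return [f"gsiftp://se.site{i}.example.org/{lfn}" for i in (1, 2)]

    def transfer(src, dest):
        """Reliable file transfer (FTS-like); placeholder."""
        print(f"transfer {src} -> {dest}")

    def submit(executable, site):
        """Job submission through a resource broker; placeholder."""
        return f"{executable} running at {site}"

    def run_analysis(lfn, executable, site):
        """Experiment workflow layer: locate data, stage it, submit the job."""
        replica = locate_replicas(lfn)[0]
        transfer(replica, f"{site}:/scratch/{lfn}")
        return submit(executable, site)

    print(run_analysis("aod/dataset_001.root", "my_analysis", "site_A"))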

Divide and Conquer
- Experiments optimize/factorize both data and resources
- Data factorization:
  - Successive processing steps lead to compressed physics objects (illustrated in the sketch after this slide)
  - End users do physics analysis using physics objects only
  - Limited access to detailed data for code development and calibration
  - Periodic centralized reprocessing to improve analysis objects
- Resource factorization:
  - Tiered model of data location and processors
  - Higher tiers hold archival data and perform centralized processing
  - Middle tiers for MC generation and some (re)processing
  - Middle and lower tiers play an important role in distributed analysis
  - Regional centers are often used to aggregate nearby resources
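
A few lines of illustrative arithmetic make the data-factorization point concrete. The per-event sizes below are assumptions, roughly the scale used in LHC computing-model planning at the time, not official numbers:

    # Illustrative data-reduction chain; per-event sizes are assumed values.
    event_size_mb = {
        "RAW": 1.6,      # detector output
        "ESD": 0.5,      # event summary data (reconstruction output)
        "AOD": 0.1,      # analysis object data (physics objects)
        "TAG": 0.001,    # event-level metadata for selection
    }

    n_events = 1_000_000   # e.g. the 1M-event sample in use case 1 later on

    for fmt, size in event_size_mb.items():
        total_tb = n_events * size / 1e6
        print(f"{fmt}: {size:6.3f} MB/event -> {total_tb:6.3f} TB per 1M events")

The end user mostly touches the smallest formats (AOD and derived ntuples), which is what makes analysis on modest middle- and lower-tier resources feasible.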

Example of Data Factorization in ATLAS (figure)
- Warning: such projections are often underestimated for DA

Example from D0 (figure, from A. Boehnlein)

Resource Factorization Example: D0 Computing Model (diagram, from A. Boehnlein) – data handling services (SAM, DB servers) linking central farms, remote farms, central and remote analysis systems, ClueD0 and central storage, with raw data, RECO data, RECO MC, user data and fix/skim flows.

ATLAS Computing Model
- Expected resources:
  - 10 Tier 1s, each with CPUs, ~1 PB disk, ~1 PB tape
  - 30 Tier 2s, each with CPUs, TB disk
  - Satellite Tier 3 sites – small clusters, user facilities
  - 10 Gb/s network backbone
- Tier 0 – repository for raw data, first-pass processing
- Tier 1 – repository of the full set of processed data, reprocessing capabilities, repository for MC data generated at Tier 2s
- Tier 2 – MC production, repository of data summaries
- Distributed analysis – uses resources at all Tiers (see the sketch after this slide)
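
The division of labour can be summarized as a small lookup, shown below as an illustrative sketch (the naming and structure are assumptions, not part of ATLAS software):

    # Illustrative tier-role lookup based on the bullets above.
    tier_roles = {
        "Tier 0": ["raw data repository", "first-pass processing"],
        "Tier 1": ["full processed-data repository", "reprocessing",
                   "repository for MC produced at Tier 2s"],
        "Tier 2": ["MC production", "repository of data summaries"],
        "Tier 3": ["small clusters", "user facilities"],
    }

    def tiers_for(activity):
        """Return every tier whose responsibilities mention the given activity."""
        return [tier for tier, roles in tier_roles.items()
                if any(activity in role for role in roles)]

    print(tiers_for("repository"))      # -> ['Tier 0', 'Tier 1', 'Tier 2']
    # Distributed analysis, by contrast, draws on CPU and disk at all tiers.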

ATLAS CM Resource Requirements (table)
- Projected resources needed in 2008, assuming 20% MC

Data Management Systems
- DA needs robust distributed data management systems
- Example from D0 – SAM:
  - 10 years of development/experience
  - Has evolved from a data/metadata catalogue to a grid-enabled workflow system for central production and user analysis (in progress)
- Example from ATLAS – DQ2 (sketched after this slide):
  - 3 years of development/experience
  - Has evolved from a data catalogue API to a data management system
  - Central catalogue for data collection information (datasets)
  - Distributed catalogues for dataset content – file-level information
  - Asynchronous site services for data movement by subscription
  - Client-server architecture with REST-style HTTP calls
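
The sketch below is a toy model, not DQ2 code or its API, of the catalogue/subscription split described above: a central catalogue of datasets, per-site file-level content catalogues, and site services that act asynchronously on subscriptions. The dataset and file names are invented.

    # Toy dataset/subscription model (invented names; not the DQ2 API).
    central_catalogue = {"user.kde.zprime.AOD.v1": {"state": "frozen"}}

    site_content = {     # distributed content catalogues (file level)
        "CERN": {"user.kde.zprime.AOD.v1": ["f001.root", "f002.root"]},
        "UTA":  {"user.kde.zprime.AOD.v1": []},
    }

    subscriptions = [("user.kde.zprime.AOD.v1", "UTA")]   # dataset -> destination

    def run_site_services():
        """Asynchronously fulfil subscriptions by copying missing files."""
        for dataset, dest in subscriptions:
            assert dataset in central_catalogue           # known centrally
            source_files = site_content["CERN"][dataset]  # source replica
            missing = [f for f in source_files
                       if f not in site_content[dest][dataset]]
            for f in missing:
                print(f"transferring {dataset}/{f} -> {dest}")
                site_content[dest][dataset].append(f)

    run_site_services()
    print(site_content["UTA"])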

The Panda Example
- Production and Distributed Analysis system in ATLAS
- Similar to batch systems for the grid (central job queue) – a conceptual sketch follows this slide
- A marriage of three ideas:
  - A common system for distributed production and analysis:
    - Distributed production jobs submitted through a web interface
    - Distributed analysis jobs submitted through a command-line interface
    - Jobs processed through the same workflow system (with a common API)
  - The production operations group maintains Panda as a reliable service for users, working closely with site administrators
  - Local analysis jobs and distributed analysis jobs use the same interface:
    - Use case – a physicist develops and tests code on local data, then submits to the grid for dataset processing (thousands of files) using the same interface
    - The ATLAS software framework Athena becomes ‘pathena’ in Panda
- Highly optimized for, and coupled to, the ATLAS DDM system DQ2
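
The sketch below is a conceptual illustration of the central-job-queue idea only, under the assumption of the simplest possible broker; it is not Panda's actual implementation or API, and the job fields, site names and dataset identifiers are invented.

    # Conceptual central job queue: production and analysis enter through
    # different interfaces but share one queue and one dispatch path.
    from collections import deque
    from dataclasses import dataclass

    @dataclass
    class Job:
        owner: str
        task_type: str        # "production" or "analysis"
        input_dataset: str

    queue = deque()

    def submit_production(dataset):          # web-interface path (managed)
        queue.append(Job("prodsys", "production", dataset))

    def submit_analysis(user, dataset):      # command-line path (pathena-like)
        queue.append(Job(user, "analysis", dataset))

    def dispatch(dataset_locations):
        """Common broker: send the next job to a site that already holds its data."""
        job = queue.popleft()
        site = dataset_locations[job.input_dataset]
        print(f"{job.task_type} job for {job.owner} -> {site}")

    submit_production("ds.raw.005145")
    submit_analysis("kde", "ds.aod.005145")
    locations = {"ds.raw.005145": "BNL", "ds.aod.005145": "UTA"}
    dispatch(locations)
    dispatch(locations)

Brokering jobs to sites that already hold the input data is one reason the system is so tightly coupled to the DDM system DQ2.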

Some ATLAS DA User Examples
- Use case 1 (scripted in the sketch after this slide):
  - User wants to run analysis on 1000 AOD files (1M events)
  - User copies a few data files using DQ2
  - User develops and tests analysis code (Athena) on these local files
  - User runs pathena over the 1000 files on the grid to create ntuples
  - User retrieves the ntuples for final analysis and to make plots
- Use case 2:
  - User needs to process 20,000 ESD files (1M events)
  - Or the user wants to generate a large signal MC sample
  - User requests centralized production through the web interface
- Use case 3:
  - User needs a small MC sample or to process a few files on the grid
  - User runs GUI or command-line tools (Ganga, AtCom, LJSF, pathena, …)
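
Use case 1 can be read as the following scripted sequence. This is a hedged sketch: the tools (the DQ2 client, athena, pathena) are real, but the exact command names, options and dataset names shown here are assumptions for illustration and may not match any particular release.

    # Illustrative recipe for use case 1; command names, options and dataset
    # names are assumptions, not a documented interface.
    import subprocess

    dataset = "csc11.005145.AOD.v1"               # hypothetical input dataset
    out_ds = "user.SomeUser.ichep06.ntuple.v1"    # hypothetical output dataset
    job_opts = "MyAnalysis_jobOptions.py"

    def run(cmd):
        print("$", " ".join(cmd))
        subprocess.run(cmd, check=True)

    # 1) copy a few files locally with the DQ2 client and test the Athena code
    run(["dq2_get", dataset])                     # client command name assumed
    run(["athena", job_opts])

    # 2) run the same job options over the full dataset on the grid via Panda
    run(["pathena", job_opts, "--inDS", dataset, "--outDS", out_ds])

    # 3) when the grid jobs finish, retrieve the ntuples for final plots
    run(["dq2_get", out_ds])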

Panda (pathena) DA Status (figure)

Panda – User Accounting Example (figure)

Conclusion
- Distributed production works well – but still needs to scale up
- Distributed analysis is a new challenge – both for current and future experiments in HENP
  - The scale of resources and users is unprecedented at the LHC
  - Many systems are being tested – I showed only one example
  - Robustness of services and data management is critically important
- Looking to the future:
  - Self-organizing systems
  - Agent-based systems