Are We Ready for the LHC Data? Roger Jones Lancaster University PPD Xmas Bash RAL, 19/12/06.


1 Are We Ready for the LHC Data? Roger Jones Lancaster University PPD Xmas Bash RAL, 19/12/06

2 RWL Jones, 19 December 2006, RAL PPD
Are We Ready? No! Thank you. But seriously…

3 Getting There: Computing System Commissioning (06/07)
The high-level goals of the Computing System Commissioning (CSC) operation during Autumn 06 - Spring 07:
- A running-in of continuous operation, not a stand-alone challenge
- The main aim of CSC is to test the software and computing infrastructure that we will need at the beginning of 2007:
  - Calibration and alignment procedures and conditions DB
  - Full trigger chain
  - Event reconstruction and data distribution
  - Distributed access to the data for analysis
- 60M events have already been produced; a new production of 10M events will run from now until the end of the year
- At the end of 2006 we will have a working, operational system, ready to take data with cosmic rays at increasing rates

4 Facilities at CERN
Tier-0:
- Prompt first-pass processing on express/calibration and physics streams with old calibrations: calibration, monitoring
- Calibration tasks on prompt data
- 24-48 hours later, process the full physics data streams with reasonable calibrations
CERN Analysis Facility:
- Access to ESD and RAW/calibration data on demand
- Essential for early calibration
- Detector optimisation/algorithmic development
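The two-pass Tier-0 flow above can be sketched as a toy model. The stream contents, the `derive_calibrations` step and the pass structure here are illustrative assumptions, not the real ATLAS Tier-0 software:

```python
# Toy sketch of the two-pass Tier-0 flow: a prompt first pass runs on the
# express/calibration stream with old constants, calibration tasks derive
# new constants from its output, and 24-48 h later the full physics streams
# are processed with the updated constants. All names are placeholders.

OLD, NEW = "old-calibrations", "new-calibrations"

def first_pass(express_events):
    """Prompt processing of the express/calibration stream with old constants."""
    return [("reco", ev, OLD) for ev in express_events]

def derive_calibrations(prompt_output):
    """Calibration tasks run on the prompt output."""
    return NEW if prompt_output else OLD

def second_pass(physics_events, calib):
    """24-48 h later: process the full physics streams with better constants."""
    return [("reco", ev, calib) for ev in physics_events]

express = ["e1", "e2"]
physics = ["p1", "p2", "p3"]
calib = derive_calibrations(first_pass(express))
results = second_pass(physics, calib)
```

The point of the split is that the bulk physics streams are only reconstructed once reasonable constants exist, while the small express stream gives the calibration tasks something to chew on immediately.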

5 What is Done
- Large-scale system tests in the last few weeks have shown that the Event Filter → Tier-0 transfers are under control
- Various 'Tier-0' exercises have demonstrated that the first-pass processing can be done if the calibrations are in place
- Missing elements:
  - The rapid calibration process
  - Adequate monitoring and validation

6 Tier-0 Internal Transfers (Oct)
Data flows and operations can be maintained for ~1 week

7 The Calibration Data Challenge (CDC)
- Fully simulate ~20M events (mainly SM processes: Z → ll, QCD di-jets, etc.) with a "realistic" detector. "Realistic" means:
  1) As installed in the pit: already-installed detector components positioned in the software according to survey measurements
  2) Mis-calibrated (e.g. calo cells, R-t relations) and mis-aligned (e.g. SCT modules, muon chambers); also include chamber/module deformations, wire sagging, HV imperfections, etc.
- Use the above samples and the calibration/alignment algorithms to calibrate and align the detector and recover the nominal ("TDR") performance
- Also useful for understanding the trigger performance in more realistic conditions
- Includes an exercise of the (distributed) infrastructure: 3D Conditions DB, bookkeeping, etc.
- Scheduled for Spring 2007; needs ATLAS Release 13 (February 2007)
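The mis-calibration idea above can be illustrated with a toy sketch: smear each cell's response by a hidden factor, then derive constants that undo it. The per-cell model, the smearing width and the use of a known reference are invented for illustration; the real CDC derives constants from physics signatures, not from truth:

```python
import random

# Toy sketch of "mis-calibrating" a simulated detector: each calorimeter
# cell's response is scaled by an unknown factor, which the calibration
# procedure must later recover. Names and numbers are illustrative.

random.seed(42)

def miscalibrate(true_energies, spread=0.05):
    """Return smeared cell energies and the hidden per-cell factors."""
    factors = [random.gauss(1.0, spread) for _ in true_energies]
    return [e * f for e, f in zip(true_energies, factors)], factors

def calibrate(measured, reference):
    """Derive per-cell constants from a reference sample (e.g. a known mass peak)."""
    return [r / m for m, r in zip(measured, reference)]

true_e = [10.0, 20.0, 30.0]
measured, _hidden = miscalibrate(true_e)
constants = calibrate(measured, true_e)
recovered = [m * c for m, c in zip(measured, constants)]
```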

8 Schematic of the Calibration Data Challenge
Goals:
- Obtain final alignment and calibration constants
- Compare the performance of the realistic "as-installed" detector, after calibration and alignment, to the nominal (TDR) performance
- Understand many systematic effects (material, B-field), test trigger robustness, etc.
- Learn how to do analyses without a-priori information (exact geometry, etc.)
Flow: geometry of the realistic "as-installed" detector → G4 simulation of ~20M events (SM processes, e.g. Z → ll) → reconstruction pass N (Release 13, Oct. 06) → analysis → calibration/alignment constants from pass N, stored in the Conditions Database; pass N reconstructs with the constants from pass N-1.
Pass 1 uses nominal calibration, alignment and material. A large part of this is in Release 12.0.0/12.0.x (being validated now).
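The iterative pass structure can be sketched in a few lines. The dict standing in for the Conditions Database, the starting constants and the halve-the-residual convergence model are all invented assumptions, chosen only to show the pass-N-uses-pass-N-1 loop:

```python
# Toy sketch of the CDC iteration: pass N reconstructs with the constants
# produced by pass N-1; pass 1 starts from the nominal constants. The
# "conditions DB" is just a dict keyed by pass number here.

def run_pass(events, constants):
    """Reconstruct and derive improved constants (toy: halve the residual)."""
    return {cell: c + (1.0 - c) / 2 for cell, c in constants.items()}

def calibration_challenge(events, n_passes):
    conditions_db = {0: {"cell_a": 0.8, "cell_b": 1.3}}  # pass 0 = nominal
    for n in range(1, n_passes + 1):
        conditions_db[n] = run_pass(events, conditions_db[n - 1])
    return conditions_db

db = calibration_challenge(events=["ev"] * 5, n_passes=4)
```

Each iteration leaves the constants closer to the ideal value of 1.0, which is the sense in which the challenge "recovers the nominal performance".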

9 Tier-1 Activities
Tier-1:
- Receive RAW and first-pass processed data from CERN
- Redistribute to Tier-2s
- Reprocess 1-2 months after arrival with better calibrations
- Reprocess all resident RAW at year end with improved calibration and software

10 T0-T1-T2 Scaling Test (October 2006)
- A monitoring system is in place allowing sites to see their rates (disk/tape areas), data assignments, and errors in the last hours, per file, dataset, etc.
- FTS channels are in place between T0 and T1, and now progressing between T1s and T2s, by 'pressure' of regional contacts
- The start of the exercise was marked by deployment of a new DQ2 version (LCG and OSG sites)
  - Hopefully this is the last major new release for the near future
  - Many improvements to the handling of FTS requests
- Tier-2s participate on a 'voluntary basis'

11 RWL Jones 19 December 2006 RAL PPD 11 Data transfer (October 2006)

12 Tier-1 Reprocessing
There is a feeling that this is far down the critical path:
- Not needed until 2008
- Can clone much of the Tier-0 system
I do not agree:
- Complex data movement between T1s and out to T2s
- Staging of RAW data differs site by site
- Recall for reprocessing will be different
- Must handle time blocks
- Best handled by each site pulling off a central task queue when ready; this implies local effort
On a positive note, the 3D conditions database tests have worked well, with RAL one of three Tier-1s fully validated.
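The pull model argued for above can be sketched as follows. The site names, slot counts and task naming are invented; the point is only the shape of the mechanism, sites taking work when ready rather than a central scheduler pushing it:

```python
from collections import deque

# Toy sketch of pull-based reprocessing: a central queue holds blocks of
# resident RAW to reprocess, and each Tier-1 pulls as many tasks as it can
# currently run. Site names and capacities are illustrative.

task_queue = deque(f"raw-block-{i}" for i in range(6))

def site_pull(site, free_slots, queue):
    """A Tier-1 site takes tasks off the central queue up to its free capacity."""
    taken = []
    while free_slots > 0 and queue:
        taken.append(queue.popleft())
        free_slots -= 1
    return site, taken

assignments = dict([site_pull("RAL", 2, task_queue),
                    site_pull("FZK", 3, task_queue),
                    site_pull("CNAF", 2, task_queue)])
```

A site that is busy or down simply pulls nothing, and its share of the queue is absorbed by the others, which is why this handles per-site differences in staging and recall more gracefully than pushed assignments.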

13 Analysis Computing Model
The analysis model is broken into two components:
At Tier-1: scheduled central production of augmented AOD, tuples and TAG collections from ESD
- Derived files moved to other T1s and to T2s
- The framework for this is still to be developed
At Tier-2: on-demand user analysis of augmented AOD streams, tuples, new selections etc., and individual user simulation and CPU-bound tasks matching the official MC production
- Modest job traffic between T2s
- Tier-2 files are not private, but may be for small sub-groups in physics/detector groups
- Limited individual space; copy to Tier-3s

14 Optimised Access
- RAW, ESD and AOD will be streamed to optimise access
- Selection and direct access to individual events is via a TAG database
  - A TAG is a keyed list of variables per event
  - The overhead of file opens is acceptable in many scenarios
  - Works very well with pre-streamed data
- Two roles:
  - Direct access to an event in a file via a pointer
  - Data collection definition function
- Two formats, file and database:
  - We now believe large queries require the full database: a multi-TB relational database, which restricts it to Tier-1s and large Tier-2s/CAF
  - The file-based TAG allows direct access to events in files (pointers); ordinary Tier-2s hold the file-based primary TAG corresponding to locally-held datasets
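The two roles of the TAG can be illustrated with a toy sketch. The variable names, the in-memory list standing in for the database, and the (file, offset) pointer form are assumptions for illustration, not the real ATLAS TAG schema:

```python
# Toy sketch of the TAG idea: a keyed list of variables per event, plus a
# pointer (file, index) for direct access. A query both defines an event
# collection and yields the pointers needed to read only those events.

tag_db = [
    {"run": 1, "event": 1, "n_muons": 2, "met": 55.0, "ptr": ("f1.root", 0)},
    {"run": 1, "event": 2, "n_muons": 0, "met": 12.0, "ptr": ("f1.root", 1)},
    {"run": 1, "event": 3, "n_muons": 1, "met": 80.0, "ptr": ("f2.root", 0)},
]

def select(tags, predicate):
    """Define a collection and return pointers for direct event access."""
    return [t["ptr"] for t in tags if predicate(t)]

pointers = select(tag_db, lambda t: t["n_muons"] >= 1 and t["met"] > 50)
files_to_open = {f for f, _ in pointers}
```

Only the files actually containing selected events need to be opened, which is why the file-open overhead stays acceptable and why the scheme works best on pre-streamed data.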

15 Tier-2/3 Activities
~30 Tier-2 centres distributed worldwide:
- Monte Carlo simulation, producing ESD and AOD shipped to Tier-1 centres
- On-demand user physics analysis of shared datasets (limited success so far)
- Limited access to ESD and RAW data sets
Tier-3 centres distributed worldwide:
- Physics analysis
- Data private and local: summary datasets

16 On-demand Analysis
- Restricted to Tier-2s and the CAF
  - Some Tier-2s can be specialised for some groups, but ALL Tier-2s are for ATLAS-wide usage
- Most ATLAS Tier-2 data should be 'placed', with a lifetime of ~months
  - The job must go to the data; this means Tier-2 bandwidth is vastly lower than if you pull data to the job
  - Back-navigation requires AOD/ESD to be co-located
- Role- and group-based quotas are essential; quotas to be determined per group, not per user
- Data selection:
  - Over small samples, with the Tier-2 file-based TAG and the AMI dataset selector
  - TAG queries over larger samples by batch job to the database TAG at Tier-1s/large Tier-2s
- What data?
  - Group-derived formats
  - Subsets of ESD and RAW, pre-selected or selected via a Big Train run by the working group
  - No back-navigation between sites; formats should be co-located
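The "job goes to the data" rule can be sketched as a tiny broker. The replica catalogue contents, dataset names and site names are invented; the real brokering involves the grid workload management, not this function:

```python
# Toy sketch of job-to-data brokering: a job is sent to a site that already
# hosts a replica of its input dataset, instead of copying the dataset to
# wherever the job happens to start. All names are illustrative.

replica_catalog = {
    "aod.zll.v1": {"RAL-T2", "Manchester-T2"},
    "aod.dijet.v1": {"Lancaster-T2"},
}

def broker(job_dataset, preferred_site):
    """Pick the preferred site if it holds the data, else any hosting site."""
    sites = replica_catalog.get(job_dataset, set())
    if not sites:
        raise LookupError(f"no replica of {job_dataset}")
    return preferred_site if preferred_site in sites else sorted(sites)[0]

site = broker("aod.zll.v1", preferred_site="Manchester-T2")
fallback = broker("aod.dijet.v1", preferred_site="Manchester-T2")
```

With placed, months-lived datasets the catalogue lookup almost always succeeds locally, which is what keeps Tier-2 wide-area bandwidth needs low.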

17 Alessandra Forti, GridPP12
Tier-2 Preparation
Simulation has been the basis of all tests so far:
- We can produce over 10M fully simulated and reconstructed events per month
- This needs to double in 07Q1; partly needs new kit, but also higher efficiency
- Tier-2s and Tier-3s all contribute
- Data management needs work (see below)
We have limited analysis activity:
- Most users pull data to a selected local resource over the WAN (bad)
- Better users make a copy to a CLOSE CE over the LAN (better)
- Some have tried streamed access using rfio
  - Problems with the root version for dCache
  - Does not work for DPM: incompatible root plug-in; a fix has been promised 'in a month' for approximately 9 months
Data management:
- This is not yet fit for purpose; more development effort is needed
- Needs work for bulk transfers: partly using the middleware better, partly missing functionality
- The operations effort is being disentangled; every site needs tuning and retuning
- Site SRM stability has been an issue

18 User Tools
- GANGA/pAthena starting to be used for Distributed Analysis
  - Users always improve things
  - Priority queues being requested
  - Hard to track frequent software releases
- DDM user tools exist but are rudimentary
  - Vital to manage requests
  - Also important for traffic to T3s

19 "The Dress Rehearsal"
A complete exercise of the full chain, from trigger to (distributed) analysis, to be performed in July 2007:
- Generate O(10^7) events: a few days of data taking, ~1 pb^-1 at L = 10^31 cm^-2 s^-1
- Filter events at MC generator level to get the physics spectrum expected at HLT output
- Pass events through G4 simulation (realistic "as-installed" detector geometry)
- Mix events from various physics channels to reproduce the HLT physics output
- Run the LVL1 simulation (flag mode)
- Produce byte streams to emulate the raw data
- Send raw data to Point 1, pass through the HLT nodes (flag mode) and SFO, write out events by streams, closing files at luminosity-block boundaries
- Send events from Point 1 to Tier-0
- Perform calibration and alignment at Tier-0 (also outside?)
- Run reconstruction at Tier-0 (and maybe Tier-1s?) to produce ESD, AOD, TAGs
- Distribute ESD, AOD, TAGs to Tier-1s and Tier-2s
- Perform distributed analysis (possibly at Tier-2s) using TAGs
- MC truth propagated down to ESD only (no truth in AOD or TAGs)
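The stream-writing step above (write events out by stream, closing files at luminosity-block boundaries) can be sketched as a toy writer. The event tuples and file-naming scheme are invented for illustration:

```python
# Toy sketch of the SFO-style stream writer: events arrive tagged with a
# stream and a luminosity block; each stream gets its own files, and a new
# file is started whenever the stream's luminosity block changes.

def write_streams(events):
    """events: iterable of (stream, lumi_block, payload) -> {stream: file names}."""
    files, current_lb = {}, {}
    for stream, lb, _payload in events:
        if current_lb.get(stream) != lb:          # luminosity-block boundary
            files.setdefault(stream, []).append(f"{stream}.lb{lb}.data")
            current_lb[stream] = lb
    return files

events = [("egamma", 1, "e1"), ("egamma", 1, "e2"),
          ("muon", 1, "m1"), ("egamma", 2, "e3")]
out = write_streams(events)
```

Closing files on luminosity-block boundaries means every output file maps to a well-defined slice of integrated luminosity, which is what makes later bookkeeping and dataset definition clean.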

20 Whole New Operations Model
- Requires an overall Grid Operations Co-ordinator: Kors Bos from January (tbc)
- Requires data placement and resource co-ordination: TBD
- Requires a Data Movement Operations Co-ordinator and team: Alexei Klimentov
- Requires full production shifts and tasks: prepared, included in the formal requirements; start from July?

21 Local Comments
Tier-1 disk:
- The shortage of Tier-1 disk has been crippling both operations and the Computing System Commissioning/SC4 exercises
- The first Tier-1 to Tier-2 transfer tests using the DDM tools failed because the required T1 disk was absent
- Current tests are ongoing: 6/7 Tier-2s receiving data from the RAL T1, 3/7 shipping it back; lots of pain to clear available disk
- Daily operations stopped because of missing disk space; Tier-2 capacity does not help for operations
- Need to separate production and user disk space/allocations
- The large CSC operations are about to start; even the production will be limited until year end

22 Operations Issues
Tier-1 and other UK Grid operations:
- We now have weekly operations meetings for the Tier-1; RAL also attends the ATLAS weekly production meeting; much improved communication with the T1
- Invitations to DTEAM meetings are harder to take up: we need a bigger ATLAS production group
ATLAS services and operations:
- UK operations shifts are being started, including shifts for the Run Time Testing (RTT) etc., and data movement effort
- We need to provide ~5 FTE: 2 for data movement, 1 for ATLAS production, 1 for user support, 1 for RTT

23 Evolution
- There is a growing Tier-2 capacity shortfall with time
- We need to be careful to avoid wasted event copies

24 Closing Comment
We desperately need to exercise the analysis model:
- With real distributed analysis
- With streamed ('impure') datasets
- With the data realistically placed (several copies available, not pulled locally)

