SLUO LHC Workshop, SLAC, July 16-17, 2009: Analysis Model, Resources, and Commissioning (J. Cochran, ISU)


1 SLUO LHC Workshop, SLAC, July 16-17, 2009
Analysis Model, Resources, and Commissioning
J. Cochran, ISU
Caveat: for the purpose of estimating the needed resources, an analysis model is assumed; the AMFY* report will supersede it and may alter the results.
Given time constraints, some details are in the back-up slides.
*AMFY = Analysis Model First Year

2 ATLAS Computing Model
[Figure: computing model diagram; surviving labels: "analysis focus", "D1PD"]
May be different for early data (i.e. T2).
Note that user analysis on T1 is not part of the Computing Model.

3 ATLAS Analysis/Computing model – simpleton view
ESD/AOD, D1PD, D2PD: POOL based
D3PD: flat ntuple
Chain: Streamed ESD/AOD → (thin/skim/slim) → D1PD → (1st stage anal) → DnPD → root histo/plots
Contents defined by physics group(s)
Made in official production at T0; remake as needed on T1
Produced outside official production on T2 and/or T3
Produced on T3
Claim: all analyses (data reduction) can be broken down into a series of transforms [input → output] (e.g. AOD → D1PD, D1PD → D3PD, D3PD → plots)

4 Expected analysis patterns for early data
Assume the bulk of group/user activity will happen on T2s/T3s (define the user-accessible area of T1 as a T3af).
Assume the final stage of analysis (plots) happens on T3s (T2s are not interactive) [except for SLAC].
Two primary modes:
(1) Physics group/(user?) runs jobs on T2s to make a tailored dataset (usually D3PD) (potential inputs: ESD, AOD, D1PD); the resultant dataset is then transferred to the user's T3 for further analysis.
(2) Group/user copies input files to a specified T3 (potential inputs: ESD, AOD, D1PD). On the T3, the group/user either generates a reduced dataset for further analysis or performs the final analysis on the input dataset.
Choice depends strongly on the capabilities of the T3, size of input datasets, etc.

5 Resources
[Figure: Tier2 simulation for 1 year (~2012, not 2010); horizontal axis: fraction fully simulated; vertical axis: fraction fast-simulated (ATLFAST II)]
Assume only 20% of T2 resources are available for analysis. Is this sufficient?
What resources are needed for analysis? From the T3 report (Amir Farbin):
Number and type of US-based analyses estimated (based on institutional polling).
Using known benchmarks (+ other assumptions), compute the needed storage and cpu-s.

6 Tier2 CPU Estimation: Results
Compare needed cpu-s with available cpu-s:
  Scenario                    kSI2k-s needed / 10^10
  all analyses independent    17
  minimal cooperation         8.5
  maximal cooperation         4.3
  supermax cooperation        1.0

                              kSI2k-s available / 10^10
  US Tier2s                   13
Note that having every analysis make its own D3PDs is not our model! We have always known that we will need to cooperate.
Available Tier2 cpu should be sufficient for 2010 analyses.
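
The comparison on this slide can be written out as a short Python sketch. The scenario labels and numbers are copied from the slide; the function and variable names are illustrative only and do not come from the talk.

```python
# CPU needed per cooperation scenario, in units of 10^10 kSI2k-s (from the slide).
CPU_NEEDED = {
    "all analyses independent": 17.0,
    "minimal cooperation": 8.5,
    "maximal cooperation": 4.3,
    "supermax cooperation": 1.0,
}
CPU_AVAILABLE_US_T2 = 13.0  # same units, from the slide


def cpu_headroom(needed, available):
    """Return available/needed per scenario; a value >= 1 means the scenario fits."""
    return {name: available / need for name, need in needed.items()}


if __name__ == "__main__":
    for name, ratio in cpu_headroom(CPU_NEEDED, CPU_AVAILABLE_US_T2).items():
        verdict = "fits" if ratio >= 1.0 else "does not fit"
        print(f"{name:26s} available/needed = {ratio:5.2f} ({verdict})")
```

Only the fully independent scenario exceeds the available CPU, which is consistent with the slide's point that some level of cooperation is required.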

7 Tier2 Storage Estimation: results
copies of AODs/D1PDs (data+MC) are distributed over US T2s
1 copy of ESD (data only) distributed over US T2s (expect only for ) (may be able to use perfDPDs in some cases)
Included in LCG pledge:
  T1: all AOD, 20% ESD, 25% RAW
  each T2: 20% AOD (and/or 20% D1PD?)
US Plan
US currently behind on LCG pledge for storage.
Available for user analysis: 0 TB (17 TB if we assume only 20% ESD).
  Scenario (2010)             TB needed
  all analyses independent    3717
  minimal cooperation         2379
  maximal cooperation         613
  supermax cooperation        143
We have insufficient analysis storage until the Tier2 disk deficiency is resolved; no level of cooperation is sufficient here.
T2 pledge is somewhat beyond what's needed for 20% AOD.
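
To make the size of the gap concrete, here is a hedged sketch that computes the storage shortfall per scenario. The TB figures are the slide's; the labels for the two availability cases and all identifiers are assumptions introduced for illustration.

```python
# Storage needed for 2010 per cooperation scenario, in TB (from the slide).
STORAGE_NEEDED_TB = {
    "all analyses independent": 3717.0,
    "minimal cooperation": 2379.0,
    "maximal cooperation": 613.0,
    "supermax cooperation": 143.0,
}

# Storage available for user analysis, in TB (from the slide).
# The dictionary keys are illustrative labels, not the slide's wording.
STORAGE_AVAILABLE_TB = {
    "baseline": 0.0,
    "assuming only 20% ESD": 17.0,
}


def storage_shortfall_tb(needed_tb, available_tb):
    """Return how many TB are missing (0 if the need is already covered)."""
    return max(needed_tb - available_tb, 0.0)


if __name__ == "__main__":
    for case, available in STORAGE_AVAILABLE_TB.items():
        for scenario, needed in STORAGE_NEEDED_TB.items():
            gap = storage_shortfall_tb(needed, available)
            print(f"{case:24s} {scenario:26s} shortfall = {gap:7.0f} TB")
```

Even the most cooperative scenario leaves a shortfall in both availability cases, which is the slide's conclusion.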

8 Analysis Commissioning/Testing
T2 analysis queues in existence and use since Fall 2008.
Queues tested with robotic submission (by experts) of example user jobs.
Robotic stress tests have been running on some queues since Fall 2008, and on all queues since April 2009.
Robotic tests reached their peak as a major component of the STEP09 exercise (early June 2009).
User tests are more difficult to organize:
  US held a 4-day, 3-site Jamboree/Stress Test in Sep. 2008: useful but not nearly extensive enough.
  US expert user testing included as part of STEP09 (and ongoing); see Andy's talk.
  Expanded US tests next week (including a T2 → T3 data transfer test).
  Single-day "all hands" US test in mid-August (?)
  ATLAS-wide user tests now in planning stages.

9 Backup

10 Data Formats
  Format                                                             Size (kB)/evt
  RAW – data output from DAQ (streamed on trigger bits)              1600
  ESD – event summary data: reco info + most RAW                     500
  AOD – analysis object data: summary of ESD data                    150
  TAG – event-level metadata with pointers to data files             1
  Derived Physics Data (DPDs):
  D1PD – subset, refined, little brother of AOD                      ~25
  D2PD – specific to physics (sub)group, augmented, undefined        ~30
  D3PD – flat roottuple                                              ~5
  perfDPD – performance DPD, calibrations, etc. (early data)
Claim: all analyses can be broken down into a series of transforms [input → output] (e.g. AOD → D1PD, D1PD → D3PD, D3PD → plots)
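
As a rough illustration of what these per-event sizes imply, the sketch below converts them into dataset sizes. The per-event sizes are the slide's (the "~" values are approximate); the event count of 10^9 is a purely hypothetical input, and the identifiers are not from the talk.

```python
# Per-event sizes in kB, from the slide ("~" entries are approximate).
SIZE_KB_PER_EVENT = {
    "RAW": 1600.0,
    "ESD": 500.0,
    "AOD": 150.0,
    "TAG": 1.0,
    "D1PD": 25.0,
    "D2PD": 30.0,
    "D3PD": 5.0,
}

KB_PER_TB = 1e9  # decimal units: 1 TB = 10^9 kB


def dataset_size_tb(data_format, n_events):
    """Return the size in TB of n_events stored in the given format."""
    return SIZE_KB_PER_EVENT[data_format] * n_events / KB_PER_TB


if __name__ == "__main__":
    n_events = 1e9  # hypothetical number of events, chosen only for illustration
    for fmt in SIZE_KB_PER_EVENT:
        print(f"{fmt:5s} {dataset_size_tb(fmt, n_events):8.1f} TB")
```

The ratio between formats is the useful part: going from AOD to D3PD reduces the footprint by roughly a factor of 30 for the same number of events, before any skimming.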

11 Starting point: The Transforms
Claim: all analyses can be broken down into a series of transforms [input → output] (e.g. AOD → D1PD, D1PD → D3PD, D3PD → plots)
  Skimming – removing entire events
  Slimming – removing parts of objects
  Thinning – removing objects
  Augmenting – costs cpu, may increase output size
  Merging – concatenating files of same type
[Table: transform matrix; inputs: ESD, AOD, D1PD, D2PD, D3PD; outputs: ESD, AOD, D1PD, D2PD, D3PD, plots]
Assume (for 2009 & 2010 user analysis):
  T2 activity will be ESD/PerfDPD → D3PD and AOD/D1PD → D3PD
  T3 activity will be D3PD → plots
This is the basic model most people are using now.
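
The difference between skimming, thinning, and slimming is easy to blur, so here is a toy sketch in plain Python. It assumes events are dictionaries holding lists of objects; this is not the ATLAS event model or any Athena API, and every name (`skim`, `thin`, `slim`, `make_toy_d3pd`, the muon fields) is hypothetical.

```python
def skim(events, keep_event):
    """Skimming: remove entire events that fail a selection."""
    return [ev for ev in events if keep_event(ev)]


def thin(events, collection, keep_object):
    """Thinning: remove individual objects from one collection in each event."""
    return [
        {**ev, collection: [obj for obj in ev[collection] if keep_object(obj)]}
        for ev in events
    ]


def slim(events, collection, keep_fields):
    """Slimming: remove parts (fields) of every object in one collection."""
    return [
        {**ev, collection: [{k: obj[k] for k in keep_fields} for obj in ev[collection]]}
        for ev in events
    ]


# Toy input: two events, each with a (possibly empty) muon collection.
TOY_EVENTS = [
    {"run": 1, "muons": [{"pt": 30.0, "eta": 0.5, "chi2": 1.1},
                         {"pt": 4.0, "eta": 2.0, "chi2": 3.0}]},
    {"run": 1, "muons": []},
]


def make_toy_d3pd(events):
    """Chain the three reductions: thin, then skim, then slim."""
    events = thin(events, "muons", lambda mu: mu["pt"] > 10.0)   # drop soft muons
    events = skim(events, lambda ev: len(ev["muons"]) >= 1)      # drop empty events
    return slim(events, "muons", ["pt", "eta"])                  # keep only pt, eta


if __name__ == "__main__":
    print(make_toy_d3pd(TOY_EVENTS))
```

Each function returns new lists and dictionaries rather than modifying its input, so the transforms can be chained in any order, in the spirit of the [input → output] picture on the slide.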

12 Transform rates
Obtained from PanDA on FDR data; stable over expected range of file sizes (number of events).
  Transform               Rate
  ESD → D3PD              13 Hz
  ESD → D3PD-small        30 Hz
  ESD → D3PD-verysmall    82 Hz
  AOD → D3PD              14 Hz
  AOD → D3PD-small        35 Hz
  AOD → D3PD-verysmall    91 Hz
Rates correspond to kSpecInt2k
Don't yet have enough info to know which analyses will use standard, small, or verysmall. Assume all standard for
HammerCloud tests on AOD find ~10 Hz.
Choose 10 Hz for both ESD/pDPD → D3PD and AOD/D1PD → D3PD.
D3PD → plots: large variation (see ATL-COM-SOFT-002.pdf). Depending on input/output file size and choice of analysis software, rate varied from kHz
Choose 10 kHz for D3PD → plots.
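
A chosen rate turns directly into a CPU-time estimate: CPU-seconds = number of events / rate. The sketch below applies this to the two chosen rates. The event count and core count are hypothetical inputs, the identifiers are illustrative, and the result is in seconds on whatever benchmark CPU the rates are normalized to (the slide's normalization figure did not survive in this transcript).

```python
RATE_T2_HZ = 10.0      # chosen on the slide for ESD/pDPD -> D3PD and AOD/D1PD -> D3PD
RATE_T3_HZ = 10_000.0  # chosen on the slide for D3PD -> plots (10 kHz)


def cpu_seconds(n_events, rate_hz):
    """CPU time on the benchmark CPU needed to process n_events at rate_hz."""
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return n_events / rate_hz


def wall_hours(n_events, rate_hz, n_cores):
    """Wall-clock hours, assuming perfect scaling over n_cores benchmark-equivalent cores."""
    if n_cores <= 0:
        raise ValueError("n_cores must be positive")
    return cpu_seconds(n_events, rate_hz) / n_cores / 3600.0


if __name__ == "__main__":
    n_events = 1e8  # hypothetical dataset size, for illustration only
    n_cores = 100   # hypothetical number of cores
    print("T2 step:", cpu_seconds(n_events, RATE_T2_HZ), "cpu-s,",
          round(wall_hours(n_events, RATE_T2_HZ, n_cores), 1), "h on", n_cores, "cores")
    print("T3 step:", cpu_seconds(n_events, RATE_T3_HZ), "cpu-s")
```

With these rates the T2 step costs a thousand times more CPU per event than the T3 step, which is why the D3PD-making stage dominates the CPU estimate.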

