
1 ATLAS Computing Model – US Research Program Manpower J. Shank N.A. ATLAS Physics Workshop Tucson, AZ 21 Dec., 2004

2 Overview
- Updates to the Computing Model
- The Tier hierarchy
- The base numbers
- Size estimates: T0, CAF, T1, T2
- US ATLAS Research Program Manpower

3 Computing Model
http://atlas.web.cern.ch/Atlas/GROUPS/SOFTWARE/OO/computing-model/Comp-Model-December15.doc
Computing Model presented at the October Overview Week
- Revision concerning the Tier-2s since then
- Revision concerning the effect of pile-up and the luminosity profile
There are (and will remain) many unknowns
- We are starting to see serious consideration of calibration and alignment needs in the sub-detector communities, but there is a way to go!
- Physics data access patterns MAY start to be seen from the final stage of DC2
  - Too late for the document
  - Unlikely to know the real patterns until 2007/2008!
- Still uncertainties on the event sizes
  - RAW without pile-up is just over the 1.6 MB limit
  - ESD is (with only one tracking package) about 60% larger than nominal, 140% larger with pile-up
  - AOD is smaller than expected, but functionality will grow
- With the advertised assumptions, we are at the limit of available disk
The model must maintain as much flexibility as possible; for the review, we must present a single coherent model.
All Computing Model slides are from Roger Jones at the last software week: http://agenda.cern.ch/age?a036309

4 Resource estimates
These have been revised again:
- Luminosity profile 2007-2010 assumed
- More simulation (20% of data rate)
- Now only ~30 Tier-2s
  - We can count about 29 candidates
  - This means that the average Tier-2 has grown, because of simulation and because it represents a larger fraction
- The needs of calibration from October have been used to update the CERN Analysis Facility resources
- Input buffer added to Tier-0

5 The System
- Event Builder -> Event Filter (~7.5 MSI2k): 10 GB/sec (physics data cache, ~PB/sec upstream)
- Event Filter -> Tier 0 (~5 MSI2k): 320 MB/sec; ~5 PB/year
- Tier 0 -> 10 Tier-1s (US Regional Centre (BNL), UK Regional Center (RAL), French Regional Center, Dutch Regional Center, ...): ~75 MB/s/T1 for ATLAS; ~2 MSI2k/T1 and ~2 PB/year/T1; re-reconstruction, storage of simulated data, group analysis; no simulation; 622 Mb/s links
- Tier-1s -> Tier-2 Centers (~200 kSI2k each): 622 Mb/s links; ~200 TB/year/T2; each Tier 2 has ~20 physicists working on one or more channels; each Tier 2 should have the full AOD, TAG & relevant Physics Group summary data; Tier 2s do the bulk of simulation
- Some data for calibration and monitoring to institutes; calibrations flow back
- Tier-2s -> Tier 3 (~0.25 TIPS), workstations and desktops: 100 - 1000 MB/s links
- PC (2004) = ~1 kSpecInt2k

6 Computing Resources
Assumptions (a small worked sketch of these numbers follows below):
- 200 days running in 2008 and 2009 at 50% efficiency (10^7 sec live)
- 100 days running in 2007 (5x10^6 sec live)
- Events recorded are rate limited in all cases; luminosity only affects data size and data processing time
- Luminosity: 0.5x10^33 cm^-2 s^-1 in 2007; 2x10^33 cm^-2 s^-1 in 2008 and 2009; 10^34 cm^-2 s^-1 (design luminosity) from 2010 onwards
Hierarchy:
- Tier-0 has raw + calibration data + first-pass ESD
- CERN Analysis Facility has AOD, ESD and RAW samples
- Tier-1s hold RAW data and derived samples and 'shadow' the ESD for another Tier-1
- Tier-1s also house simulated data
- Tier-1s provide reprocessing for their RAW and scheduled access to full ESD samples
- Tier-2s provide access to AOD and group Derived Physics Datasets and carry the full simulation load
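A minimal sketch (Python, added here for illustration) of the live-time and event-count arithmetic behind these assumptions; the 200 Hz recording rate and 1.6 MB RAW size are the nominal values quoted on the input-numbers slide that follows:

```python
# Minimal sketch of the live-time / event-count arithmetic used in this model.
# The live times are the rounded values quoted on the slide: 200 days at 50%
# efficiency is taken as ~1e7 s live, and 100 days in 2007 as ~5e6 s.
RECORD_RATE_HZ = 200      # trigger output rate; rate-limited in all years
RAW_EVENT_MB = 1.6        # nominal RAW event size (see the input numbers below)

LIVE_SECONDS = {2007: 5e6, 2008: 1e7, 2009: 1e7}

for year, live in LIVE_SECONDS.items():
    events = RECORD_RATE_HZ * live           # events recorded that year
    raw_tb = events * RAW_EVENT_MB / 1e6     # MB -> TB
    print(f"{year}: {events:.1e} events, ~{raw_tb:.0f} TB RAW")
# 2008: 2.0e+09 events and ~3200 TB of RAW, matching the input-numbers table.
```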

7 Processing
Tier-0:
- First-pass processing on express/calibration lines
- 24-48 hours later, process the full primary stream with reasonable calibrations
Tier-1:
- Reprocess 1-2 months after arrival with better calibrations (steady state: and with the same software version, to produce a coherent dataset)
- Reprocess all resident RAW at year end with improved calibration and software

8 The Input Numbers
                              Rate (Hz)  sec/year   Events/y   Size (MB)  Total (TB)
Raw Data (inc. express etc.)     200     1.00E+07   2.00E+09     1.6        3200
ESD (inc. express etc.)          200     1.00E+07   2.00E+09     0.5        1000
General ESD                      180     1.00E+07   1.80E+09     0.5         900
General AOD                      180     1.00E+07   1.80E+09     0.1         180
General TAG                      180     1.00E+07   1.80E+09     0.001         2
Calibration (ID, LAr, MDT)                                                   44 (8 long-term)
MC Raw                                              2.00E+08     2           400
ESD Sim                                             2.00E+08     0.5          50
AOD Sim                                             2.00E+08     0.1          10
TAG Sim                                             2.00E+08     0.001         0
Tuple                                                            0.01
Nominal year: 10^7 s; accelerator efficiency 50%
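The Total (TB) column for the real-data rows is simply rate x live time x event size; a short sketch reproducing those rows (the simulated-sample rows involve additional assumptions not spelled out here):

```python
# Reproduce the data-volume rows of the input-numbers table:
# Total(TB) = Rate(Hz) * sec/year * Size(MB) / 1e6
LIVE_SECONDS = 1e7   # nominal year at 50% accelerator efficiency

rows = [
    # name,                     rate_hz, size_mb
    ("Raw Data (inc. express)",     200,   1.6),
    ("ESD (inc. express)",          200,   0.5),
    ("General ESD",                 180,   0.5),
    ("General AOD",                 180,   0.1),
    ("General TAG",                 180,   0.001),
]

for name, rate_hz, size_mb in rows:
    events = rate_hz * LIVE_SECONDS
    total_tb = events * size_mb / 1e6
    print(f"{name:26s} {events:.2e} events/y  {total_tb:7.0f} TB")
# Raw Data: 2.00e9 events/y and 3200 TB; General ESD: 1.80e9 and 900 TB; etc.
```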

9 Resource Summary (15 Dec. version)
Table 1: The estimated resources required for one full year of data taking in 2008 or 2009.

10 Amount of Simulation is a "free" parameter
(Two scenarios shown on the slide: simulation at 20% of data vs. 100% of data.)

11 2008 T0 requirements
CERN T0: Storage requirement (Table Y1.2)
                       Disk (TB)   Tape (TB)   Integrated Tape (TB)
Raw                        0          3040          4454
General ESD (prev.)        0          1000          1465
Calibration              240           168           280
Buffer                   114             0             0
Total                    354          4208          6165

CERN T0: Computing requirement
              Reconstr.  Reprocess.  Calibr.  Cent. Analysis  User Analysis  Total
CPU (kSI2k)     3529          0        529          0              0         4058

Understanding of the calibration load is evolving.

12 T0 Evolution – Total capacity
Note detailed evolutions differ from the draft – revised and one bug fixed

13 T0 Cost/Year Evolution

14 CERN Analysis Facility
- Small-sample chaotic reprocessing: 170 kSI2k
- Calibration: 530 kSI2k
- User analysis: ~1470 kSI2k – much increased
- This site does not share in the global simulation load
- The start-up balance would be very different, but we should try to respect the envelope

Storage requirement – 2008 data only
                        Disk (TB)   Auto. Tape (TB)
Raw                        241            0
General ESD (curr.)        229            0
General ESD (prev.)          0           18
AOD (curr.)                257            0
AOD (prev.)                  0            4
TAG (curr.)                  3            0
TAG (prev.)                  0            2
ESD Sim (curr.)            286            0
ESD Sim (prev.)              0            4
AOD Sim (curr.)             57            0
AOD Sim (prev.)              0           40
Tag Sim (curr.)              0.6          0
Tag Sim (prev.)              0            0.4
Calibration                240          168
User Data (100 users)      303          212
Total                     1615          448

15 Analysis Facility Evolution

16 Analysis Facility Cost/Year Evolution

17 2008 Average T1 Requirements
- Estimate about 1800 kSI2k for each of 10 T1s
- Central analysis (by groups, not users): ~1300 kSI2k
- Typical Tier-1 Year-1 resources; this includes a '1 year, 1 pass' buffer
- ESD is 47% of Disk; ESD is 33% of Tape
- Current pledges are ~55% of this requirement
- Making event sizes bigger makes things worse!

T1: Storage requirement
                        Disk (TB)   Auto. Tape (TB)
Raw                         43          304
General ESD (curr.)        257           90
General ESD (prev.)        129           90
AOD                        283           36
TAG                          3            0
Calib                      240            0
RAW Sim                      0           80
ESD Sim (curr.)             57           20
ESD Sim (prev.)             29           20
AOD Sim                     63            8
Tag Sim                      1            0
User Data (20 groups)      126            0
Total                     1230          648

18 Single T1 Evolution

19 Single T1 Cost/Year Evolution

20 20-User Tier-2, 2008 Data Only
Typical Storage requirement
                      Disk (TB)
Raw                       1
General ESD (curr.)      13
AOD                      86
TAG                       3
RAW Sim                   0
ESD Sim (curr.)           6
AOD Sim                  19
Tag Sim                   1
User Group               42
User Data                61
Total                   230

- User activity includes some reconstruction (algorithm development etc.)
- Also includes user simulation (increased)
- T2s also share the event simulation load (increased), but not the output data storage

Typical Computing requirement
              Reconstruction  Reprocessing  Simulation  User Analysis  Total (kSI2k)
CPU (kSI2k)         68             0           180          293            541

(A cross-check against the overall Tier-2 totals follows below.)
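As a cross-check (an illustrative calculation, not taken from the slides): scaling this typical site by the ~30 Tier-2s assumed earlier reproduces the "All T2" figures on the overall-resources slide that follows:

```python
# Scale one "typical" Tier-2 (230 TB disk, 541 kSI2k CPU) up to ~30 sites
# and compare with the "All T2" column of the 2008-only resource summary.
N_TIER2 = 30
TYPICAL_T2_DISK_TB = 230
TYPICAL_T2_CPU_KSI2K = 541

all_t2_disk_pb = N_TIER2 * TYPICAL_T2_DISK_TB / 1000       # TB -> PB
all_t2_cpu_msi2k = N_TIER2 * TYPICAL_T2_CPU_KSI2K / 1000   # kSI2k -> MSI2k

print(f"All T2 disk ~{all_t2_disk_pb:.1f} PB   (summary slide: 6.9 PB)")
print(f"All T2 CPU  ~{all_t2_cpu_msi2k:.1f} MSI2k (summary slide: 16.2 MSI2k)")
```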

21 20-user T2 Evolution

22 20-user T2 Cost Evolution

23 Overall 2008-only Resources ('One Full Year' Resources)
               CERN    All T1   All T2   Total
Tape (PB)       4.6      6.5      0.0     11.1
Disk (PB)       2.0     12.3      6.9     21.2
CPU (MSI2k)     6.2     18.0     16.2     40.5
If T2 supports private analysis, add about 1 TB and 1 kSI2k/user

24 Overall 2008 Total Resources
               CERN    All T1   All T2   Total
Tape (PB)       6.9      9.5      0.0     16.4
Disk (PB)       2.9     18.0     10.1     31.0
CPU (MSI2k)     9.0     26.1     23.5     58.7
If T2 supports private analysis, add about 1.5 TB and 1.5 kSI2k/user

25 Important points
- Discussion on disk vs. tape storage at Tier-1s
  - Tape in this discussion means low-access, slow, secure storage
- Storage of simulation
  - Assumed to be at T1s
  - Need partnerships to plan networking
  - Must have fail-over to other sites
- Commissioning
  - These numbers are calculated for the steady state, but with the requirement of flexibility in the early stages
- Simulation fraction is an important tunable parameter in T2 numbers!

26 Latencies
On the input side of the T0, assume the following:
- Primary stream – every physics event
  - Publications should be based on this; uniform processing
- Calibration stream – calibration + copied selected physics triggers
  - Need to reduce latency of processing the primary stream
- Express stream – copied high-pT events for 'excitement' and (with the calibration stream) for detector optimisation
  - Must be a small percentage of the total
Express and calibration streams get priority in the T0:
- New calibrations determine the latency for primary processing
- Intention is to have primary processing within 48 hours
- Significantly more would require a prohibitively large input buffer (see the buffer-sizing sketch below)
Level of access to RAW?
- Depends on functionality of ESD
- Discussion of a small fraction of DRD – augmented RAW data
- Software and processing model must support very flexible data formats
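A rough, illustrative buffer-sizing sketch, assuming RAW accumulates at the 320 MB/s EF-to-T0 rate quoted on the next slide while primary processing waits for calibrations; the comparison to the 114 TB buffer in the T0 storage table is my reading, not something stated on the slide:

```python
# How much Tier-0 input buffer does a given primary-processing latency imply?
# Assumes RAW arrives continuously at the EF -> T0 rate of 320 MB/s while the
# delayed events wait for calibrations (a simplification of the real system).
EF_TO_T0_MB_S = 320
SECONDS_PER_DAY = 86_400

def buffer_tb(latency_days: float) -> float:
    """RAW volume accumulated at 320 MB/s over the given latency."""
    return EF_TO_T0_MB_S * latency_days * SECONDS_PER_DAY / 1e6   # MB -> TB

for days in (2, 4, 7):
    print(f"{days} days latency -> ~{buffer_tb(days):.0f} TB buffer")
# ~55 TB for 48 hours, ~110 TB for 4 days (cf. the 114 TB buffer in the T0
# storage table), ~190 TB for a week -- hence "prohibitively large" if primary
# processing slips well beyond ~48 hours.
```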

27 Networking
- EF -> T0: maximum 320 MB/s (450 MB/s with headroom)
- Off-site networking now being calculated with David Foster
- Recent exercise with (almost) current numbers:
  - Traffic from T0 to each Tier-1 is 75 MB/s – will be more with overheads and contention (225 MB/sec)
  - Significant traffic of ESD and AOD from reprocessing between T1s: 52 MB/sec raw, ~150 MB/sec with overheads and contention
- Dedicated networking test beyond DC2; plans in HLT
(A sketch of where these rates come from follows below.)
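A small sketch of how these rates follow from the nominal rates and event sizes quoted earlier; the per-T1 decomposition (1/10 of RAW, its own ESD share plus a shadow copy, and a full AOD copy) is an assumption consistent with the Tier-1 roles described earlier, not a breakdown given on the slide:

```python
# Raw-rate arithmetic behind the quoted network figures (before overheads).
RAW_RATE_HZ = 200    # full primary stream (inc. express/calibration copies)
GEN_RATE_HZ = 180    # general ESD/AOD streams (from the input-numbers table)
RAW_MB, ESD_MB, AOD_MB = 1.6, 0.5, 0.1
N_T1 = 10

ef_to_t0 = RAW_RATE_HZ * RAW_MB                  # 200 Hz * 1.6 MB = 320 MB/s
print(f"EF -> T0: {ef_to_t0:.0f} MB/s")

# Assumed per-T1 share: 1/10 of RAW, 2/10 of ESD (its own share plus a shadow
# copy of another T1's share), and a full AOD copy.  This split is a plausible
# reconstruction consistent with the hierarchy slide, not quoted on the slide.
t0_to_t1 = (RAW_RATE_HZ * RAW_MB / N_T1
            + GEN_RATE_HZ * ESD_MB * 2 / N_T1
            + GEN_RATE_HZ * AOD_MB)
print(f"T0 -> each T1: ~{t0_to_t1:.0f} MB/s (slide quotes ~75 MB/s)")
# 32 + 18 + 18 = 68 MB/s, i.e. the right ballpark before overheads/contention.
```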

28 Conclusions and Timetable
- Computing Model documents required by 15th December
  - This is the last chance to alter things
  - We need to present a single coherent model
  - We need to maintain flexibility
  - Intend to produce a requirements and recommendations document
- Computing Model review in January 2005 (P. McBride)
  - We need to have serious inputs at this point
- Documents to April RRBs
- MoU signatures in Summer 2005
- Computing & LCG TDR June 2005

29 Calibration and Start-up Discussions
- Richard will present some comments from others on what they would like at start-up
- Some would like a large e/mu second copy on disk for repeated reprocessing
- Be aware of the disk and CPU requirements:
  - 10 Hz + 2 ESD versions retained = >0.75 PB on disk
  - The full sample would take 5 MSI2k to reprocess in a week
    - Requires scheduled activity or huge resources
    - If there are many reprocessings you must either distribute it or work with smaller samples
What were (are) we planning to provide?
- At CERN:
  - 1.1 MSI2k in T0 and Super T2 for calibration etc.
  - T2 also has 0.8 MSI2k for user analysis
  - Super T2 with 0.75 TB disk, mainly AOD but could be more RAW+ESD to start
- In the T1 Cloud:
  - The T1 cloud has 10% of RAW on disk and 0.5 MSI2k for calibration
- In T2s:
  - 0.5 PB for RAW+ESD; should allow small unscheduled activities

30 Reality check
- Putting 10 Hz e/mu on disk would require more than double the CERN disk
- We are already short of disk in the T1s (funding source the same!)
- There is capacity in the T1s so long as the sets are replaced with the steady-state sets as the year progresses

Snapshot of Tier-1 status (Split 2008), offered vs. required per experiment:
                 ALICE    ATLAS     CMS     LHCb
Offered           6690    16240    10325    7450
Required          9100    16600    12600    9500
Balance           -26%     -2%      -18%    -22%

Offered            769     5171     4406    1154
Required          3000     9200     8700    1300
Balance           -74%     -44%     -49%    -11%

Offered            2.1      8.9      5.1     1.7
Required           3.6      6.0      6.6     0.4
Balance           -42%      48%     -23%    325%

(Balance = (Offered - Required) / Required; see the sketch below.)
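A minimal sketch of the pledge-balance arithmetic, using the ATLAS column as transcribed above:

```python
# Pledge balance for the ATLAS column of the Tier-1 snapshot:
# Balance = (Offered - Required) / Required, expressed as a percentage.
atlas = [
    # (offered, required) for the three resource blocks in the table above
    (16240, 16600),
    (5171, 9200),
    (8.9, 6.0),
]

for offered, required in atlas:
    balance = (offered - required) / required
    print(f"offered {offered} vs required {required}: {balance:+.0%}")
# -2%, -44%, +48% -- matching the ATLAS entries of the Balance rows.
```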

31 End of Computing Model talk.

32 U.S. Research Program Manpower

33 FY05 Software Manpower
- 5.75 FTE @ LBNL: Core/Framework
- 5 FTE @ ANL: Data Management / Event Store
- 5 FTE @ BNL: DB / Distributed Analysis / SW Infrastructure
- 1 FTE @ U. Pittsburgh: Detector Description
- 1 FTE @ Indiana U.: Improving end-user usability of Athena

34 Est. FY06 Software Manpower
- Maintain FY05 levels
- Move from PPDG to Program funds
  - Puts about 1 FTE burden on the program + approx. 2 FTE at universities
- In the long term, the total expected program-funded manpower at universities is about 7 FTE

35 FY07 and Beyond: Software Manpower
- Reaching a plateau
  - Maybe 1-2 more FTE at universities
- Obviously, manpower for physics analysis (students, post-docs) is going to have to come from the base program
  - We (project management) try to help get DOE/NSF base funding for all, but... prospects have not been good
  - "Redirection" from the Tevatron is starting to happen, but it might not be enough for our needs in 2007

