1 CMS-HI Offline Computing
Charles F. Maguire, Vanderbilt University
For the CMS-HI US Institutions
May 14, 2008 (Draft Version 5, May 13 at 8:40 PM CDT)

2 CMS-HI at the LHC for High Density QCD with Heavy Ions
LHC: New Energy Frontier for Relativistic Heavy Ion Physics
- Quantum Chromodynamics at extreme conditions (density, temperature, …)
- Pb+Pb collisions at 5.5 TeV, roughly thirty times the Au+Au energy at RHIC
- Expecting a longer-lived Quark Gluon Plasma state, accompanied by greatly enhanced yields of hard probes with high mass and/or transverse momentum
CMS: Excels as a Detector for Heavy Ion Collisions at the LHC
- Sophisticated high-level trigger for recording rare, important events at a rapid rate
- Best momentum resolution and tracking granularity
- Large-acceptance tracking and calorimetry; proven jet finding in HI events

3 CMS-HI in the US
10 Participating Institutions
- Colorado, Iowa, Kansas, LANL, Maryland, Minnesota, MIT, UC Davis, UIC, Vanderbilt
- Projected to contribute ~60 FTEs (Ph.D.s and students) as of 2012
- MIT is the lead US institution, with Boleslaw Wyslouch as Project Manager
US-CMS-HI Tasks (Research Management Plan in 2007)
- Completion of the HLT CPU farm upgrade for HI events at CMS (MIT)
- Construction of the Zero Degree Calorimeter at CMS (Kansas and Iowa)
- Development of the CMS-HI compute center in the US (task force established)
  - The task force recommended that the Vanderbilt University group lead the proposal composition
  - Also want to retain the expertise developed by the MIT HI group at their CMS Tier2 site
CMS-HI Compute Proposal to DOE Is Due This Month
- Will be reviewed by ESnet managers, and also by external consultants

4 Basis of Compute Model for CMS-HI
Heavy Ion Data Operations for CMS at the LHC
- Heavy ion collisions expected in 2009, the second year of running for the LHC
- Heavy ion running takes place during a 1-month period (10^6 seconds)
- At the design DAQ bandwidth the CMS detector will write 225 TBytes of raw data per heavy ion running period, plus ~75 TBytes of support files (the implied rates are sketched below)
- Raw data will stay resident on CERN Tier0 disks for a few days at most, while transfers take place to the CMS-HI compute center in the US
- Possibility of having 300 TBytes of disk for HI data (NSF REDDnet project)
- Raw data will not be reconstructed at the Tier0, but will be written to a write-only (emergency archive) tape system before deletion from the Tier0 disks
Projected Data Volumes (optimistic scenario)
- Assumes that we don't need 400 TBytes of disk space immediately
- TBytes in first year of HI operations
- TBytes in second year of HI operations
- 300 TBytes nominal size achieved in third year of HI operations
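
The rates implied by these volumes are modest compared with a 10 Gbps network path; the minimal sketch below is only that arithmetic, using the figures quoted on this slide (225 TBytes of raw data plus ~75 TBytes of support files in a 10^6-second run).

```python
# Minimal sketch using only the figures quoted on this slide: 225 TB of raw
# data plus ~75 TB of support files written during a 10^6-second HI run.

RAW_TB = 225.0
SUPPORT_TB = 75.0
RUN_SECONDS = 1.0e6

total_tb = RAW_TB + SUPPORT_TB               # ~300 TB per HI running period
avg_rate_mb_s = RAW_TB * 1e6 / RUN_SECONDS   # 1 TB = 1e6 MB (decimal units)
avg_rate_gbps = avg_rate_mb_s * 8.0 / 1e3    # MB/s -> Gbit/s

print(f"total volume per run : {total_tb:.0f} TB")
print(f"average raw-data rate: {avg_rate_mb_s:.0f} MB/s (~{avg_rate_gbps:.1f} Gbps)")
# total volume per run : 300 TB
# average raw-data rate: 225 MB/s (~1.8 Gbps)
```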

5 Data Transport Options for CMS-HI
Following ESnet-NP 2008 Workshop Recommendations
CMS-HEP Raw Data Transport from CERN to the FNAL Tier1
- Uses LHCnet to cross the Atlantic, with LHCnet terminating at the Starlight hub
- ESnet transport from Starlight into the FNAL Tier1 center
- Links are rated at 10 Gbps
CMS-HI Raw Data Transport from CERN to the US Compute Center
- Network topology has not been established at this time (options exist)
- Vanderbilt is establishing a 10 Gbps path to SoX (Atlanta) for the end of 2008
Trans-Atlantic Network Options (DOE requires a non-LHCnet backup plan)
- Use LHCnet to Starlight during the one month when HI beams are being collided; transport the data from Starlight to the Vanderbilt compute center via ESnet/Internet2, with transfer links still rated at 10 Gbps so the data can be moved within ~1 month
- Do not use LHCnet; instead use other trans-Atlantic links supported by NSF, rated at 10 Gbps, such that the data are transferred over ~1 month
- Install a 300 TByte disk buffer at CERN and use non-LHCnet trans-Atlantic links to transfer the data over 4 months (US-ALICE model) at ~2.5 Gbps (the sustained rates implied by the 1-month and 4-month windows are sketched below)
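
For orientation, the sustained rate implied by each transfer window can be computed from the ~300 TBytes per run quoted on slide 4; the sketch below is only that arithmetic, and the quoted link ratings (10 Gbps or ~2.5 Gbps) then represent headroom over these averages for protocol overhead, retries, and competing traffic.

```python
# Sketch of the sustained transfer rate needed to move one HI run's data
# (~300 TB, the figure from slide 4) within each proposed window.

def sustained_gbps(volume_tb: float, window_days: float) -> float:
    """Average rate in Gbps needed to move volume_tb within window_days."""
    bits = volume_tb * 1e12 * 8.0        # decimal TB -> bits
    seconds = window_days * 86400.0
    return bits / seconds / 1e9

for label, days in [("1-month window", 30), ("4-month window (CERN disk buffer)", 120)]:
    print(f"{label}: {sustained_gbps(300.0, days):.2f} Gbps sustained")
# 1-month window: 0.93 Gbps sustained
# 4-month window (CERN disk buffer): 0.23 Gbps sustained
```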

6 Vanderbilt to SoX Using Managed Service Connecting 10G to ACCRE only
More elaborate network topologies are shown in slides 13 and 14. The extra cost of these, if necessary, would be borne by Vanderbilt.

7 Data Transport Issues for CMS-HI
Following ESnet-NP 2008 Workshop Recommendations
To Use LHCnet or Not To Use LHCnet
- Use of LHCnet to the US, following the CMS-HEP path, is the simplest approach
- A separate trans-Atlantic link would require dedicated CMS-HI certifications
- DOE requires that a non-LHCnet plan be discussed in the CMS-HI compute proposal
Issues With the Use of LHCnet by CMS-HI
- ESnet officials believe that LHCnet is already fully subscribed by FNAL
- The HI month was supposed to be used for getting the final sets of HEP data transferred
- FNAL was quoted as having only 5% "headroom" left in its use of LHCnet
- The HI data volume is 10% of the HEP data volume (a rough reading of these last two numbers is sketched below)
Issues With the Non-Use of LHCnet by CMS-HI
- Non-use of LHCnet would be a new, unverified path for data out of the LHC
- CERN computer management would have to approve (same for US-ALICE)
- Installing a 300 TByte disk buffer system at CERN (ESnet recommendation) would also have to be approved and integrated into the CERN Tier0 operations
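
The last two bullets above are easiest to read side by side with a little arithmetic. The sketch below is only an illustration of that comparison: the 10% and 5% figures come from this slide, but the even-spread-over-twelve-months normalization of the HEP traffic is an assumption made here, not a number from the proposal.

```python
# Illustrative comparison only: the 10% and 5% figures come from the slide,
# but the even-spread-over-12-months normalization of HEP traffic is an
# assumption made here, not a number from the proposal.

hep_annual = 1.0                      # HEP volume per year, arbitrary units
hep_monthly_avg = hep_annual / 12.0   # assumed even spread of HEP transfers
hi_volume = 0.10 * hep_annual         # "HI data volume is 10% of the HEP data volume"
hi_monthly_load = hi_volume / 1.0     # all HI data inside the ~1-month HI window

ratio = hi_monthly_load / hep_monthly_avg
print(f"HI load during its month ~ {ratio:.1f}x the average monthly HEP load")
# ~1.2x: an extra stream comparable to the HEP traffic itself during that
# month, far beyond a 5% "headroom" on the same LHCnet links.
```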

8 Tape Archive for CMS-HI? Raw Data and Processed Data for Each Year
CMS-HI Raw Data Tape Archive: FNAL or Vanderbilt?
- The original suggestion was that the raw data archive be at Vanderbilt
- Vanderbilt already has a tape robot system and can upgrade it by 1.5 more PBytes
- Possibly more economical to archive the raw (and processed) data at FNAL?
  - The savings come from not needing a person at VU dedicated to tape service
  - The cost savings translate into more CPUs for the CMS-HI compute center
Issues for the Tape Archive Decision
- Is there a true cost savings as anticipated? (quantitative discussion given on slide 10)
- What specific new burdens would be assumed at FNAL?
  - Serving the raw data (300 TB) twice per year to Vanderbilt for reconstruction
  - Receiving processed data annually for archiving into the tape system
- Is the tape archive decision coupled with the use/non-use of LHCnet?
- Administrative issue of transferring funds from DOE-NP to DOE-HEP

9 Data Processing for CMS-HI
Real Data Processing Model
- Two reconstruction passes per year, taking 4 months each
- Two following analysis passes per year, taking 2 months each
- Analyzed data placed at selected data depots (MIT, UIC, Vanderbilt)
- A smaller CMS-HI analysis center is also being established in South Korea
- All CMS-HI institutions access AOD files via OSG job submissions
Real Data Output File Production Model
- Normal reco output is 20% of the raw data input
- Reco output with diagnostics is 400% of the raw data input (only for the initial runs)
CPU Processing Requirement
- Extensive (several months of) studies with MC HI events done at the MIT Tier2
- MC studies done with the current CMSSW framework
- Conclusion that ~2900 CPU cores will be required for a nominal year of running (an illustrative core-count sketch follows this slide)
- Cores quoted at the SpecInt2K 1600 equivalent (an obsolete unit; actual benchmark testing will be done)
- The ramp-up to the nominal year will require fewer cores, since there will be fewer and less complex events
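
How a core count follows from an event sample, a per-event reconstruction time, and the 4-month pass window can be sketched as below. The event count and per-event time are placeholders, not results of the MIT Tier2 study; that study is the actual source of the ~2900-core figure.

```python
# Illustrative sketch of how a core count follows from an event sample, a
# per-event reconstruction time, and the 4-month reconstruction-pass window.
# The two starred inputs are placeholders, NOT numbers from the MIT study;
# the quoted requirement of ~2900 cores comes from that study.

N_EVENTS = 1.0e7            # * placeholder: events recorded in one HI run
SEC_PER_EVENT = 2500.0      # * placeholder: reco time per event on one core
PASS_DAYS = 120             # one reconstruction pass takes ~4 months (slide)
CPU_EFFICIENCY = 0.80       # assumed average utilization of the farm

core_seconds_needed = N_EVENTS * SEC_PER_EVENT
core_seconds_per_core = PASS_DAYS * 86400 * CPU_EFFICIENCY
cores = core_seconds_needed / core_seconds_per_core

print(f"cores needed for one pass: {cores:.0f}")   # ~3000 with these placeholders
```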

10 Implementation of CMS-HI Compute Center
Three Scenarios Being Considered
Raw Data + Output Production Archived on Tape at FNAL
- Cost transfer of $35K/year to FNAL obtains 2.23 PBytes of tape in 5 years (a rough volume cross-check is sketched below)
- These numbers are only preliminary estimates of the cost and size of the tape component
- Five year sequence: TBytes of tape
- Cost savings estimated at $135K over 5 years
- After 5 years there will be 3000 CPU cores available at Vanderbilt and 400 TBytes of disk
Raw Data + Output Production Archived on Tape at VU
- Costs will include tapes and a dedicated FTE to support the additional tape load
- After 5 years there will be 2480 CPU cores available at Vanderbilt, plus disk space
Tape Archive at VU and a 300 TByte Buffer Disk at CERN
- The buffer disk at CERN allows the raw data to be transferred over 4 months instead of 1 month
- Would need consultation with the CERN Tier0 managers
- Suggested by ESnet managers as a fallback if the trans-Atlantic network is not fast enough
- After 5 years there will be 2120 CPU cores available at Vanderbilt, plus disk space
- The possible 300 TBytes at CERN could be funded by a separate proposal, e.g. NSF REDDnet
NOTE: None of these scenarios includes costs for network transfers
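
As a rough cross-check of the 2.23 PByte figure, the per-year volumes quoted earlier in this talk can simply be accumulated over five years. The flat-per-year assumption below ignores the ramp-up and the diagnostic-heavy output of the initial runs, so it is only an illustration, not the proposal's actual five-year sequence.

```python
# Rough cross-check of the five-year tape volume, using only numbers quoted
# earlier in this talk; the flat-per-year assumption ignores the ramp-up and
# the 400% diagnostic output of the initial runs, so it falls somewhat below
# the 2.23 PB quoted for the FNAL scenario.

RAW_TB = 225.0          # raw data per nominal HI run (slide 4)
SUPPORT_TB = 75.0       # support files per run (slide 4)
RECO_FRACTION = 0.20    # normal reco output as a fraction of raw (slide 9)
PASSES_PER_YEAR = 2     # two reconstruction passes per year (slide 9)
YEARS = 5

per_year_tb = RAW_TB + SUPPORT_TB + PASSES_PER_YEAR * RECO_FRACTION * RAW_TB
total_pb = per_year_tb * YEARS / 1000.0

print(f"per year : {per_year_tb:.0f} TB archived")   # 390 TB
print(f"5 years  : {total_pb:.2f} PB")                # ~1.95 PB vs the quoted 2.23 PB
```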

11 Backup Slides
1) VU network topologies ( )
2) Spreadsheet of CPU reco times (16)

12 (figure-only backup slide)

13 VU network topology (figure): Add ORNL or Starlight later using Dark Fiber (unmanaged)

14 VU network topology (figure): LHCNet?? Fermilab?? (*Would need to have peering or a layer 2 VLAN setup.)

15 (figure-only backup slide)

16 HLT Bandwidth Decomposition for HI Events and Associated Reconstruction CPU Times

