CMS T1/T2 Estimates (Meeting, 5/12/06)

CMS T1/T2 Estimates
- CMS perspective:
  - Part of a wider process of resource estimation
  - Top-down Computing Model -> real per-site estimates
  - More detail exists than is presented in the Megatable
- Original process:
  - CMS had a significant resource shortfall (esp. at T1)
  - To respect pledges -> ad hoc descoping of the Computing Model
- After new LHC planning:
  - New top-down planning roughly matches the overall pledged resources
  - Allow resource requirements at T1 centres to float a little
  - Establish a self-consistent balance of resources
- Outputs:
  - Transfer capacity estimates between centres
  - New guidance on the balance of resources at T1/T2

Inputs: CMS Model
- Data rates, event sizes (a back-of-envelope check is sketched below):
  - Trigger rate: ~300 Hz (450 MB/s)
  - Sim-to-real ratio is 1:1 (though not all full simulation)
  - RAW (sim): 1.5 (2.0) MB/evt; RECO (sim): 250 (400) kB/evt
  - All AOD is 50 kB/evt
- Data placement:
  - RAW/RECO: one copy across all T1s, disk1tape1
  - Sim RAW/RECO: one copy across all T1s, on tape with a 10% disk cache
    - How is this expressed in the diskXtapeY formalism? Is this formalism in fact appropriate for resource questions?
  - AOD: one copy at each T1, disk1tape1
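The 450 MB/s figure in the first bullet follows directly from the trigger rate and the RAW event size. A minimal Python sketch of that arithmetic, using only the numbers quoted on this slide:

```python
# Back-of-envelope check of the event-size and rate figures quoted above.
# All numbers are taken from the slide; nothing here is an official CMS value.

trigger_rate_hz = 300                    # ~300 Hz HLT output
raw_mb   = {"data": 1.5, "sim": 2.0}     # RAW event size, MB/evt
reco_mb  = {"data": 0.25, "sim": 0.40}   # RECO event size, MB/evt
aod_mb   = 0.05                          # AOD event size, MB/evt (50 kB)

# T0 input bandwidth for RAW data alone
t0_raw_rate_mb_s = trigger_rate_hz * raw_mb["data"]
print(f"RAW rate into the T0: {t0_raw_rate_mb_s:.0f} MB/s")   # ~450 MB/s, as quoted

# Per-event footprint once RECO and AOD exist (real data)
per_event_mb = raw_mb["data"] + reco_mb["data"] + aod_mb
print(f"RAW+RECO+AOD per event: {per_event_mb:.2f} MB")       # 1.80 MB
```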

Inputs: LHC, Centres
- 2008 LHC assumptions (the implied integrated volume is sketched below):
  - 92 days of 'running' (does not include long MD periods)
  - 50% efficiency during 'running'
  - Practical implication: the T0 is 100% busy for this time
  - Input smoothing at the T0 is required; assume a queue of less than a few days
  - T0 output rate is flat during 'running' (follows directly from T0 capacity)
  - More precise input is welcome and would be useful
    - Not expected to have strong effects on most of the estimates
- Efficiencies, overheads, etc.:
  - Assume a 70% T1/T2 disk fill factor (the 30% is included in the experiment requirement)
  - Assume a 100% tape fill factor (i.e. any overhead is owned by the centre)
  - T1 CPU efficiency back to 75% / 85% (chaotic / scheduled)
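Taken together with the 450 MB/s figure from the previous slide, these assumptions fix the integrated 2008 RAW volume. A minimal sketch of that arithmetic; the 450 MB/s input is carried over, everything else is quoted above:

```python
# Integrated 2008 data volume implied by the running assumptions above.

SECONDS_PER_DAY = 86_400
running_days    = 92
lhc_efficiency  = 0.50     # fraction of 'running' time actually delivering data
raw_rate_mb_s   = 450      # RAW rate while taking data (previous slide)

live_seconds  = running_days * SECONDS_PER_DAY * lhc_efficiency
raw_volume_pb = raw_rate_mb_s * live_seconds / 1e9          # MB -> PB

print(f"Effective live time: {live_seconds / SECONDS_PER_DAY:.0f} days")
print(f"RAW written in 2008: ~{raw_volume_pb:.1f} PB")       # roughly 1.8 PB
```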

Centre Roles: T1
- T1 storage requirements:
  - Curation of the assigned fraction of RAW
    - Assigned raw-data fractions: 1st fundamental input to the T1/T2 process
  - Storage of the corresponding RECO / MC from associated T2 centres
    - Association of T1/T2: 2nd fundamental input to the T1/T2 process
  - Hosting of the entire AOD
- T1 processing requirements:
  - Re-reconstruction: RAW -> RECO -> AOD
  - Skimming; group and end-user bulk analysis of all data tiers
  - Calibration, alignment, detector studies, etc.
- T1 connections:
  - T0 -> T1: prompt RAW/RECO from the T0 (to tape)
  - T1 <-> T1: replication of new AOD versions / hot data
  - T1 -> T2; T2 -> T1 (see below)

Centre Roles: T2
- T2 storage requirements:
  - Caching of T1 data for analysis; no custodial function
  - Working space for analysis groups, MC production
- T2 processing requirements:
  - Analysis / MC production only
  - Assume the analysis:MC ratio is constant across T2s
- T1 -> T2 dataflow (the routing rules are sketched below):
  - AOD: comes from any T1 in principle, often from the associated T1
    - Centres without a 'local' T1 can usefully share the load
  - RECO: must come from the T1 defined as holding that sample
  - Implies a full T1 -> T2 many-to-many interconnection
    - A natural consequence of a storage-efficient computing model
- T2 -> T1 dataflow:
  - MC data always goes to the associated T1
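The routing rules above can be written down compactly. The following is a minimal sketch only; the site names and the sample-to-T1 mapping are invented placeholders, not CMS assignments:

```python
# Illustrative encoding of the T1/T2 routing rules above.
# Site names and dataset placements are hypothetical examples.

ALL_T1S = ["T1_X", "T1_Y", "T1_Z"]                                  # placeholder names
ASSOCIATED_T1 = {"T2_Example_A": "T1_X", "T2_Example_B": "T1_Y"}    # hypothetical associations
RECO_HOST_T1  = {"/SampleA/RECO": "T1_X", "/SampleB/RECO": "T1_Z"}  # hypothetical placement

def t1_sources(t2, dataset, tier):
    """Which T1(s) may serve this dataset tier to the given T2."""
    if tier == "AOD":
        # A full AOD copy sits at every T1, so any T1 can serve it;
        # prefer the associated T1, fall back to the others.
        home = ASSOCIATED_T1[t2]
        return [home] + [t1 for t1 in ALL_T1S if t1 != home]
    if tier == "RECO":
        # RECO must come from the T1 custodially hosting that sample.
        return [RECO_HOST_T1[dataset]]
    raise ValueError("T2s only cache AOD and RECO in this sketch")

def mc_destination(t2):
    """MC produced at a T2 is always uploaded to its associated T1."""
    return ASSOCIATED_T1[t2]

print(t1_sources("T2_Example_A", "/SampleB/RECO", "RECO"))   # ['T1_Z']
print(mc_destination("T2_Example_B"))                        # 'T1_Y'
```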

T1/T2 Associations
- NB: these are working assumptions in some cases
- Stream "allocation" ~ available storage at the centre

Centre Roles: CERN CAF / T1
- CAF functionality:
  - Provides a short-latency analysis centre for critical tasks
  - e.g. detector studies, DQM, express analysis, etc.
  - All data available in principle
- T1 functionality:
  - CERN will act as the associated T1 for the RDMS / Ukraine T2s
  - Note: not a full T1 load, since there is no T1 processing and no RECO serving
  - There is the possibility to carry out more general T1 functions
    - e.g. a second source of some RECO in case of overload
  - Reserve this T1 functionality to ensure flexibility
    - Same spirit as the CAF concept
- CERN non-T0 connections:
  - Specific CERN -> T2 connections to the associated centres
  - Generic CERN -> T2 connections for serving unique MC data, etc.
  - T1 <-> CERN connection for new-AOD exchange

Transfer Rates
- Calculating data flows (the average-rate arithmetic is sketched below):
  - T0 -> T1: from data rates and the running period
    - Rate is constant during running, zero otherwise
  - T1 <-> T1: from the total AOD size and the replication period (currently 14 days)
    - High rate, short duty cycle (so OPN capacity can be shared)
    - The short replication period is driven by the disk required for multiple AOD copies
  - T1 -> T2: from the T2 capacity and the refresh period at the T2 (currently 30 days)
    - This gives the average rate only, not a realistic use pattern
  - T2 -> T1: from the total MC per centre per year
- Peak versus average (T1 -> T2):
  - Worst-case peak for a T1 is the sum of the T2 transfer capacities, weighted by the data fraction at that T1
  - Realistically, aim for: average_rate < T1_capacity < peak_rate
  - The difference between peak and average is uniformly a factor of 3-4
  - Better information on T2 connection speeds will be needed
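As a concrete reading of the recipe above, each average rate reduces to "volume divided by period". A minimal sketch, with placeholder site numbers rather than Megatable values:

```python
# Average-rate arithmetic for the flows described above.
# The example inputs (200 TB, 150 TB/year) are placeholders, not Megatable values.

SECONDS_PER_DAY = 86_400

def t1_to_t2_avg_mb_s(t2_disk_tb, refresh_days=30):
    """Average rate to refresh a T2's disk cache once every `refresh_days`."""
    return t2_disk_tb * 1e6 / (refresh_days * SECONDS_PER_DAY)   # TB -> MB

def t1_to_t1_avg_mb_s(total_aod_tb, replication_days=14):
    """Average rate to replicate a new AOD version within `replication_days`."""
    return total_aod_tb * 1e6 / (replication_days * SECONDS_PER_DAY)

def t2_to_t1_avg_mb_s(mc_tb_per_year):
    """Average rate for a T2 shipping its yearly MC production to its T1."""
    return mc_tb_per_year * 1e6 / (365 * SECONDS_PER_DAY)

print(f"{t1_to_t2_avg_mb_s(200):.0f} MB/s")   # a 200 TB T2, 30-day refresh: ~77 MB/s
print(f"{t2_to_t1_avg_mb_s(150):.1f} MB/s")   # 150 TB/year of MC: ~4.8 MB/s
```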

Outputs: Rates
- Units are MB/s
- These are raw rates: no catch-up (x2?), no overhead (x2?)
  - Potentially some large factor still to be added
  - A common understanding is needed
- FNAL T2-out-avg is around 50% US, 50% external
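To make the open question concrete, this is how the hedged factors would compound on top of a raw average rate; both x2 values are the slide's question marks, not agreed numbers:

```python
# How the (still undecided) catch-up and overhead factors would scale a raw rate.

def provisioned_rate_mb_s(raw_mb_s,
                          catchup_factor=2.0,    # "x2?" on the slide
                          overhead_factor=2.0):  # "x2?" on the slide
    return raw_mb_s * catchup_factor * overhead_factor

print(provisioned_rate_mb_s(77))   # 77 MB/s raw -> ~308 MB/s provisioned
```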

Outputs: Capacities
- "Resource" comes from a simple estimate of relative unit costs (see the sketch below):
  - CPU : Disk : Tape at 0.5 : 1.5 : 0.3 (a la B. Panzer)
- Clearly some fine-tuning is left to do
  - But this is a step towards a reasonably balanced model
- The total is consistent with the top-down input to the CRRB, by construction
- Storage classes are still under study:
  - Megatable totals are firm, but the diskXtapeY categories are not
  - This may be site-dependent (also, details of caching)
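One way to read the weighting is as a single figure of merit per site. A minimal sketch, assuming CPU is counted in kSI2k and disk/tape in TB; the units and the example site numbers are assumptions, only the 0.5 : 1.5 : 0.3 weights come from the slide:

```python
# Single 'resource' figure of merit built from the quoted relative unit costs.
# Units (kSI2k, TB) and the example site numbers are assumptions for illustration.

UNIT_COST = {"cpu": 0.5, "disk": 1.5, "tape": 0.3}   # relative cost per unit

def resource_score(cpu_ksi2k, disk_tb, tape_tb):
    """Weighted sum used to compare and balance site capacity mixes."""
    return (UNIT_COST["cpu"] * cpu_ksi2k
            + UNIT_COST["disk"] * disk_tb
            + UNIT_COST["tape"] * tape_tb)

# A hypothetical T1 slice: 1000 kSI2k CPU, 600 TB disk, 1200 TB tape
print(resource_score(1000, 600, 1200))   # 0.5*1000 + 1.5*600 + 0.3*1200 = 1760.0
```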

e.g. RAL Storage Planning

Comments / Next Steps?
- T1 / T2 process:
  - Has been productive and useful; exposed many issues
- What other information is useful for sites?
  - Internal dataflow estimates for centres (-> cache sizes, etc.)
  - Assumptions on storage classes, etc.
  - Similar model estimates for 2007 / ...
  - Documentation of the assumed CPU capacities at centres
- What does CMS need?
  - Feedback from sites (not overloaded with this so far)
  - Understanding of site ramp-up plans, resource balance, network capacity
  - Input on a realistic LHC schedule, running conditions, etc.
  - Feedback from providers on network requirements
- Goal: a detailed self-consistent model for 2007/8:
  - Based upon real / guaranteed centre and network capacities
  - Gives at least an outline for ramp-up at the sites and for the global experiment
  - Much work left to do

Backup: Rate Details

Backup: Capacity Details