Answers to Panel1/Panel3 Questions - John Harvey / LHCb, May 12th, 2000

Presentation transcript:

Slide 1: Answers to Panel1/Panel3 Questions - John Harvey / LHCb, May 12th, 2000

Slide 2: Question 1: Raw Data Strategy
- Can you be more precise on the term "selected" samples to be exported, and the influence of such exported samples (size, who will access, purposes) on networks and other resources?
- Do you plan any back-up?

Slide 3: Access Strategy for LHCb Data
- Reminder: once we get past the start-up we will be exporting only AOD+TAG data in the production cycle, and only 'small' RAW+ESD samples (controlled frequency and size) upon physicist request.
- It is a requirement that physicists working remotely should have ~immediate access to AOD+TAG data as soon as they are produced:
  - so as not to penalise physicists working remotely;
  - so as not to encourage physicists working remotely to run their jobs at CERN.
- The rapid distribution of AOD+TAG data from CERN to the regional centres places the largest load on the network infrastructure.

Slide 4: Access to AOD + TAG Data
- EXPORT each 'week' to each of the regional centres:
  - AOD+TAG: 5 x 10^7 events * 20 kB = 1 TB.
- A one-day turnaround for exporting 1 TB would imply an 'effective' network bandwidth requirement of 10 MB/s from CERN to each of the regional centres.
- At the same bandwidth, a year's worth of data could be distributed to the regional centres in 20 days. This would be required following a reprocessing.

Slide 5: Access to RAW and ESD Data
- Worst-case planning: the start-up period (1st and 2nd years):
  - we export sufficient RAW + ESD data to allow for remote physics and detector studies.
- At start-up our selection tagging will be crude, so assume we select 10% of the data taken for a particular channel, and of this sample include RAW+ESD for 10% of events.
- EXPORT each 'week' to each of the regional centres:
  - AOD+TAG: 5 x 10^7 events * 20 kB = 1 TB;
  - RAW+ESD: 10 channels * 5 x 10^5 events * 200 kB = 1 TB.
- A one-day turnaround for exporting 2 TB implies an 'effective' bandwidth requirement of 20 MB/s from CERN to each of the regional centres.

Slide 6: Access to RAW and ESD Data
- In steady running, after the first 2 years, people will still want to access the RAW/ESD data for physics studies, but only for their private selected samples.
- The size of the data samples required in this case is small:
  - of the order of 10^5 events (i.e. 20 GB) per sample;
  - turnaround time < 12 hours;
  - the bandwidth requirement for one such transaction is < 1 MB/s (the bandwidth figures on slides 4-6 are checked in the sketch below).
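
The bandwidth figures quoted on slides 4-6 all follow from the same volume-over-turnaround arithmetic. The short Python sketch below reproduces them; it is an illustrative back-of-envelope check added here, not part of the original slides, and it assumes decimal units (1 TB = 10^12 bytes).

```python
# Back-of-envelope check of the export bandwidth figures on slides 4-6.
# Assumes decimal units (1 TB = 1e12 bytes) and the turnaround times
# quoted on the slides.

MB, GB, TB = 1e6, 1e9, 1e12
DAY = 24 * 3600

def effective_bandwidth(volume_bytes, turnaround_s):
    """Sustained rate (bytes/s) needed to move a volume in a given time."""
    return volume_bytes / turnaround_s

# Slide 4: weekly AOD+TAG export, 1 TB in one day -> quoted as 10 MB/s
print(effective_bandwidth(1 * TB, DAY) / MB)         # ~11.6 MB/s

# Slide 5: start-up, AOD+TAG plus RAW+ESD, 2 TB in one day -> quoted as 20 MB/s
print(effective_bandwidth(2 * TB, DAY) / MB)         # ~23 MB/s

# Slide 6: private sample, ~20 GB with a turnaround of < 12 hours -> < 1 MB/s
print(effective_bandwidth(20 * GB, 12 * 3600) / MB)  # ~0.5 MB/s
```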

Slide 7: Access to RAW and ESD Data
- Samples of RAW and ESD data will also be used to satisfy requirements for detailed detector studies.
- Samples of background events may also be required, but it is expected that the bulk of the data requirements can be satisfied with the samples distributed for physics studies.
- It should be noted, however, that for detector/trigger studies the people working on the detectors will most likely be at CERN, and it may not be necessary to export RAW+ESD for such studies during the start-up period.
- After the first two years, smaller samples will be needed for detailed studies of detector performance.

Slide 8: Backup of Data
- We intend to make two copies of the RAW data on archive media (tape).

Slide 9: Question 2: Simulation
- Can you be more precise about your MDC (mock data challenges) strategy?
- In correlation with hardware cost decreases. (Remember: a 10% MDC 3 years before T0 could cost ~as much as a 100% MDC at T0.)

Slide 10: Physics: Plans for Simulation
- In 2000 and 2001 we will produce simulated events each year for detector optimisation studies in preparation for the detector TDRs (expected in 2001 and early 2002).
- In 2002 and 2003 studies will be made of the high-level trigger algorithms, for which we are required to produce simulated events each year.
- In 2004 and 2005 we will start to produce very large samples of simulated events, in particular background, for which samples of 10^7 events are required.
- This on-going physics production work will be used, as far as is practicable, for testing the development of the computing infrastructure.

Slide 11: Computing: MDC Tests of Infrastructure
- 2002: MDC 1 - application tests of grid middleware and farm management software, using a real simulation and analysis of 10^7 B channel decay events. Several regional facilities will participate:
  - CERN, RAL, Lyon/CC-IN2P3, Liverpool, INFN, ...
- 2003: MDC 2 - participate in the exploitation of the large-scale Tier 0 prototype to be set up at CERN:
  - High Level Triggering - online environment, performance
  - Management of systems and applications
  - Reconstruction - design and performance optimisation
  - Analysis - study chaotic data access patterns
  - STRESS TESTS of data models, algorithms and technology
- 2004: MDC 3 - start to install the event filter farm at the experiment, to be ready for commissioning of the detectors in 2004 and 2005.

Slide 12: Growth in Requirements to Meet Simulation Needs

Slide 13: Cost / Regional Centre for Simulation
- Assume there are 5 regional centres.
- Assume costs are shared equally.

Slide 14: Tests Using Tier 0 Prototype in 2003
- We intend to make use of the Tier 0 prototype planned for construction in 2003 to make stress tests of both hardware and software.
- We will prepare realistic examples of two types of application:
  - tests designed to gain experience with the online farm environment;
  - production tests of simulation, reconstruction, and analysis.

Slide 15: Event Filter Farm Architecture [diagram]
- A switch functions as the readout network, connecting ~100 RUs to the sub-farms.
- Each sub-farm consists of an SFC with ~10 CPUs and a CPC attached to the controls network; storage controller(s), the controls system and storage/CDR complete the picture.
- Network technologies: readout network (GbE?), sub-farm network (Ethernet), controls network (Ethernet).
- Legend: SFC = Sub-Farm Controller, CPC = Control PC, CPU = Work CPU (a rough tally of the implied scale follows below).
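
As a rough illustration of the scale this architecture implies, the sketch below tallies a farm built from the approximate counts in the diagram (~100 RUs, ~100 sub-farms, ~10 worker CPUs per sub-farm). The exact numbers are illustrative assumptions, not a specification.

```python
# Rough tally of the event filter farm topology sketched above.
# Counts are the approximate figures from the diagram, used here only
# to illustrate the scale of the system.

N_RU = 100              # readout units feeding the switch
N_SUBFARM = 100         # sub-farms, one SFC each
CPUS_PER_SUBFARM = 10   # worker CPUs behind each SFC
CPC_PER_SUBFARM = 1     # control PC per sub-farm (assumed)

n_workers = N_SUBFARM * CPUS_PER_SUBFARM
n_control_pcs = N_SUBFARM * CPC_PER_SUBFARM
switch_ports = N_RU + N_SUBFARM   # RU side plus SFC side of the readout switch

print(f"worker CPUs: {n_workers}, SFCs: {N_SUBFARM}, control PCs: {n_control_pcs}")
print(f"readout switch ports (RUs + SFCs): {switch_ports}")
```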

Slide 16: Data Flow [diagram]
- Event fragments are sent from the RUs through the switch to an SFC, where raw events are built; accepted and reconstructed events are forwarded to the storage controller(s).
- Components shown: RU, switch, SFC with NIC (EB) and NIC (SF), CPU/MEM, storage controller(s) (a minimal sketch of this flow follows below).
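
A minimal sketch of the flow shown in the diagram, with the SFC's two network interfaces made explicit; all function and variable names here are illustrative, not LHCb software.

```python
# Illustrative sketch of the data flow: RUs send event fragments through the
# switch to an SFC, which builds a complete raw event (event-building NIC),
# hands it to a worker CPU (sub-farm NIC), and forwards accepted events plus
# their reconstruction output to the storage controller(s).

def build_event(fragments):
    """Event building in the SFC: assemble the fragments from all RUs."""
    return b"".join(fragments)

def sfc_process(fragments, workers, storage):
    raw_event = build_event(fragments)                # received via NIC (EB)
    worker = workers[hash(raw_event) % len(workers)]  # dispatched via NIC (SF)
    accepted, reco = worker(raw_event)                # trigger + reconstruction
    if accepted:
        storage.append((raw_event, reco))             # to storage controller(s)

# Toy usage with a dummy worker that accepts every event.
storage = []
workers = [lambda event: (True, b"reconstructed")]
sfc_process([b"frag-RU1", b"frag-RU2"], workers, storage)
print(len(storage))   # 1 accepted event recorded
```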

Slide 17: Testing/Verification [diagram]
- The same farm architecture (switch as readout network, ~100 RUs, SFCs with ~10 CPUs and CPCs, storage controller(s), controls system, storage/CDR, controls network), annotated with the planned test coverage.
- Legend: small-scale lab tests + simulation; full-scale lab tests; large/full-scale tests using the farm prototype.

Slide 18: Requirements on Farm Prototype
- Functional requirements:
  - a separate controls network (Fast Ethernet at the level of the sub-farm, GbEthernet towards the controls system);
  - farm CPUs organized in sub-farms (as opposed to a "flat" farm);
  - every CPU in a sub-farm should have two Fast Ethernet interfaces.
- Performance and configuration requirements:
  - SFC: NIC > 1 MB/s, > 512 MB memory;
  - storage controller: NIC > 40-60 MB/s, > 2 GB memory, > 1 TB disk;
  - farm CPU: ~256 MB memory;
  - switch: > 95 x 1 Gb/s (Gbit Ethernet).

Slide 19: Data Recording Tests
- Raw and reconstructed data are sent from ~100 SFCs to the storage controller and inserted in the permanent storage in a format suitable for re-processing and off-line analysis.
- Performance goal:
  - the storage controller should be able to populate the permanent storage at an event rate of ~200 Hz and an aggregate data rate of ~40-50 MB/s (see the consistency check below).
- Issues to be studied:
  - data movement compatible with the DAQ environment;
  - scalability of data storage.
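
The performance goal above is internally consistent: at ~200 Hz, an aggregate rate of 40-50 MB/s corresponds to roughly 200-250 kB per accepted event (raw plus reconstructed), in line with the event sizes used earlier. The quick check below is an added illustration, not an LHCb calculation.

```python
# Consistency check of the data recording goal: ~200 Hz at ~40-50 MB/s.
event_rate_hz = 200

for aggregate_mb_s in (40, 50):
    event_size_kb = aggregate_mb_s * 1e6 / event_rate_hz / 1e3
    print(f"{aggregate_mb_s} MB/s at {event_rate_hz} Hz -> ~{event_size_kb:.0f} kB per event")
# -> ~200 kB and ~250 kB per event (raw + reconstructed)
```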

Slide 20: Farm Controls Tests
- A large farm of processors is to be controlled through a controls system.
- Performance goals:
  - reboot all farm CPUs in less than ~10 minutes;
  - configure all farm CPUs in less than ~1 minute (see the rate estimate below).
- Issues to be studied:
  - scalability of the booting method;
  - scalability of the controls system;
  - scalability of access to and distribution of configuration data.
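
To put the controls goals in perspective, the sketch below converts them into an average node turnaround rate, assuming a farm of ~1000 CPUs (~100 sub-farms of ~10 CPUs, as in the architecture slide); the farm size is an assumption for illustration.

```python
# What the controls performance goals imply for a ~1000-node farm
# (~100 sub-farms x ~10 worker CPUs, per the architecture slide).
n_nodes = 100 * 10
reboot_budget_s = 10 * 60    # reboot all farm CPUs in < ~10 minutes
config_budget_s = 60         # configure all farm CPUs in < ~1 minute

print(f"reboot:    ~{n_nodes / reboot_budget_s:.1f} nodes/s on average")
print(f"configure: ~{n_nodes / config_budget_s:.1f} nodes/s on average")
# Sustaining these rates clearly requires booting and configuring nodes in
# parallel (e.g. per sub-farm), which is exactly the scalability to be tested.
```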

Slide 21: Scalability tests for simulation and reconstruction
- Test writing of reconstructed+raw data at 200 Hz in the online farm environment.
- Test writing of reconstructed+simulated data in the offline Monte Carlo farm environment:
  - population of the event database from multiple input processes.
- Test efficiency of the event and detector data models:
  - access to conditions data from multiple reconstruction jobs;
  - online calibration strategies and distribution of results to multiple reconstruction jobs;
  - stress testing of the reconstruction to identify hot spots, weak code, etc.

Slide 22: Scalability tests for analysis
- Stress test of the event database:
  - multiple concurrent accesses by "chaotic" analysis jobs.
- Optimisation of the data model:
  - study the data access patterns of multiple, independent, concurrent analysis jobs;
  - modify the event and conditions data models as necessary;
  - determine data clustering strategies.

Slide 23: Question 3: Luminosity and Detector Calibration
- What is the strategy in the analysis for getting access to the conditions data?
- Will it be performed at CERN only or also at outside institutes?
- If outside, how can the required raw data be accessed, and how will the detector conditions DB be updated?

Slide 24: Access to Conditions Data
- Production updating of the conditions database (detector calibration) is to be done at CERN, for reasons of system integrity.
- Conditions data amount to less than 1% of the event data.
- Conditions data for the relevant period will be exported to the regional centres as part of the production cycle:
  - detector status data (being designed): < 100 kbyte/sec, ~< 10 GB/week;
  - essential alignment + calibration constants required for reconstruction: ~100 MB/week.

Slide 25: Luminosity and Detector Calibration
- Comments on detector calibration:
  - VELO: done online, needed for the trigger (pedestals, common mode + alignment for each fill);
  - tracking alignment will be partially done at start-up without magnetic field;
  - calorimeter: done with test beam and early physics data;
  - RICHes will have an optical alignment system.
- Comment on luminosity calibration (based at CERN):
  - strategy being worked on; thinking of basing it on the 'number of primary vertices' distribution (measured in an unbiased way).

Slide 26: Question 4: CPU estimates
- "Floating" factors of at least 2 were quoted at various meetings by most experiments, and the derivative is definitely positive. Will your CPU estimates continue to grow?
- How far?
- Are you convinced your estimates are right to within a factor of 2?
- Would you agree with a CPU sharing of 1/3, 1/3, 1/3 between Tier0, {Tier1}, {Tier2,3,4}?

Slide 27: CPU Estimates
- CPU estimates have been made using performance measurements made with today's software.
- Algorithms have still to be developed and final technology choices made, e.g. for data storage.
- Performance optimisation will help reduce the requirements.
- Estimates will be continuously revised.
- The profile with time for acquiring CPU and storage has been made.
- Following acquisition of the basic hardware, it is assumed that acquisition will proceed at 30% per year for CPU and 20% per year for disk; this is to cover growth and replacement (illustrated in the sketch below).
- We will be limited by what is affordable and will adapt our simulation strategy accordingly.
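
As an illustration of the stated acquisition profile, the sketch below compounds a 30% yearly CPU purchase and a 20% yearly disk purchase on top of an arbitrary initial capacity; the starting values and five-year horizon are placeholders, not LHCb estimates, and part of each year's purchase goes to replacement rather than growth.

```python
# Illustration of the acquisition profile: after the basic hardware purchase,
# buy an additional 30% of the installed CPU and 20% of the installed disk
# each year (covering growth and replacement). Starting values are arbitrary.

cpu, disk = 100.0, 100.0   # relative units at year 0
for year in range(1, 6):
    cpu *= 1.30
    disk *= 1.20
    print(f"year {year}: CPU {cpu:.0f}, disk {disk:.0f} (year 0 = 100)")
# Note: usable capacity grows more slowly than these compounded figures,
# since part of each purchase replaces retired hardware.
```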

Slide 28: Question 5: Higher network bandwidth
- Please summarise the bandwidth requirements associated with the different elements of the current baseline model. Also please comment on possible changes to the model if very high, guaranteed-bandwidth links (10 Gbps) become available.
- NB: with 10 Gbps sustained throughput (i.e. a ~20 Gbps link), one could transfer:
  - a 40 GB tape in half a minute;
  - one TB in less than 15 minutes;
  - one PB in 10 days.
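
The transfer-time examples in the panel's note follow directly from a sustained 10 Gb/s, i.e. about 1.25 GB/s; the check below just reproduces that arithmetic.

```python
# Transfer times at a sustained 10 Gb/s (~1.25 GB/s), matching the
# examples quoted in the question.
rate_bytes_s = 10e9 / 8   # 10 Gb/s in bytes per second

for label, size_bytes in (("40 GB tape", 40e9), ("1 TB", 1e12), ("1 PB", 1e15)):
    t = size_bytes / rate_bytes_s
    print(f"{label}: {t:,.0f} s (~{t/60:.1f} min, ~{t/86400:.1f} days)")
# -> ~32 s, ~13 min and ~9 days respectively
```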

Slide 29: Bandwidth requirements in/out of CERN

Slide 30: Impact of 10 Gbps connections
- The impact of very high bandwidth network connections would be to give optimal turnaround for the distribution of AOD and TAG data, and to give very rapid access to RAW and ESD data for specific physics studies.
- Minimising the latency of the response to individual physicists' requests is convenient and improves the efficiency of analysis work.
- At present we do not see any strong need to distribute all of the RAW and ESD data as part of the production cycle.
- We do not rely on this connectivity, but will exploit it if it is affordable.

Slide 31: Question 6: Event storage DB management tools
- Options include Objectivity, Root, a new project (Espresso), or an improved version of the first two?
- Shall we let experiments make a free choice, or be more directive?
- Shall we encourage a commercial product or an in-house open software approach?
- Multiple choices would mean fewer resources per experiment. Can we afford to have different such tools for the 4 experiments, or only one (two maximum)? Can we interfere with decisions in which each experiment has already invested many man-years, or shall we listen more to the "all-purpose Tier 1" (i.e. a Tier 1 that will support several LHC experiments, plus perhaps non-LHC experiments), which would definitely prefer to support a minimum of systems? Similar comments could be made about other software packages.

Slide 32: Free choice?
- The problem of choice is not only for the DB management tool: the complete data handling problem needs to be studied and decisions need to be made.
- This comprises object persistency and its connection to the experiment framework, bookkeeping and event catalogues, interaction with the networks and mass storage, etc.
- It involves many components (machines, disks, robots, tapes, networks, etc.) and a number of abstraction layers.
- The choice of product or solution for each of the layers needs to be carefully studied so as to form a coherent overall solution.

Slide 33: Commercial or in-house
- We are of the opinion that more than one object storage solution should be available to the LHC experiments, each with a different range of applicability:
  - a full-fledged solution for the experiment's main data store, capable of storing petabytes distributed worldwide; this implies security, transactions, replication, etc. (commercial);
  - a much lighter solution for end-physicists doing the final analysis with their own private datasets (in-house).
- Perhaps a single solution can cover the complete spectrum, but in general this would not be the case.
- If a commercial solution is not viable, then an in-house solution will have to be developed.

Slide 34: Question 7: Coordination Body?
- A complex adventure such as LHC computing needs continuous coordination and follow-up, at least until after the first years of LHC running.
- What is your feeling on how this coordination should be organized?
- How would you see an "LCB++" for the coming decade?

Slide 35: LCB++ - a possible scenario [organisation chart, following the structure of JCOP]
- Steering: IT/DL, EP/DDL (computing), LHC computing coordinators, Common Project Coordinator; the steering body agrees the programme and manages resources.
- Common project coordination over a set of work packages: SDTools, Analysis Tools, ESPRESSO, Wired, Conditions Database.
- Review: independent reviewers report to management (Directors, spokesmen, ...).
- Meetings: project meetings fortnightly, steering meetings quarterly, workshops quarterly.