2 1 The MEG Software Project, PSI, 9/2/2005, Corrado Gatto. Offline Architecture, Computing Model, Status of the Software Organization

3 2 The Offline Architecture

4 3 Dataflow and Reconstruction Requirements. 100 Hz L3 trigger, event size: 1.2 MB. Raw data throughput: (10+10) Hz × 1.2 MB/phys evt × 0.1 + 80 Hz × 0.01 MB/bkg evt ≈ 3.5 MB/s (≈ 35 kB/evt on average). Total raw data storage: 3.5 MB/s × 10^7 s = 35 TB/yr.
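To make the arithmetic behind these figures explicit, here is a small stand-alone C++ check; the rates and per-event sizes are the ones quoted on the slide, everything else (names, output format) is purely illustrative, and the result reproduces the quoted numbers to within rounding.

```cpp
#include <cstdio>

// Back-of-the-envelope check of the dataflow numbers quoted above.
int main() {
    // Rates and per-event sizes as quoted on the slide.
    const double physRateHz   = 10.0 + 10.0;  // (10+10) Hz of physics events
    const double physSizeMB   = 1.2;          // MB per physics event
    const double physFraction = 0.1;          // reduction factor applied to physics events
    const double bkgRateHz    = 80.0;         // 80 Hz of background events
    const double bkgSizeMB    = 0.01;         // MB per background event
    const double liveSeconds  = 1e7;          // seconds of data taking per year

    const double throughputMBs = physRateHz * physSizeMB * physFraction
                               + bkgRateHz * bkgSizeMB;             // ~ 3-3.5 MB/s
    const double avgEventKB    = throughputMBs / 100.0 * 1000.0;    // at the 100 Hz L3 rate
    const double storageTByr   = throughputMBs * liveSeconds / 1e6; // ~ 35 TB/yr

    std::printf("throughput   : %.2f MB/s\n", throughputMBs);
    std::printf("avg evt size : %.1f kB\n",   avgEventKB);
    std::printf("raw storage  : %.1f TB/yr\n", storageTByr);
    return 0;
}
```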

5 4 Requirements for Software Architecture. Geant3 compatible (at least at the beginning). Easy interface with existing packages: Geant3, Geant4, external (Fortran) event generators. Scalability. Simple structure, usable by non-computing experts. Written and maintained by a few people. Portability. Use a world-wide accepted framework. Use ROOT + an existing offline package as the starting point.

6 5 Expected Raw Performance. Pure Linux setup, 20 data sources, Fast Ethernet local connection.

7 6 MC & Data Processing Scheme. Signal and background simulations (Fortran & Geant3, or C++ & G3/G4/Fluka) produce hits files (signal, bkg 1, bkg 2); per-sample SDigitizers (Fortran & G3, or C++) turn the hits into SDigits files; a Digitizer merges the signal and background SDigits into merged digits files, equivalent to raw data; reconstruction (C++) produces the ESD from either the merged digits or the raw data.

8 7 General Architecture: Guidelines. Ensure a high level of modularity (for ease of maintenance). No code dependencies between different detector modules (to avoid C++ header problems). The structure of every detector package is designed so that static parameters (such as geometry and detector response parameters) are stored in distinct objects. The data structure is built up as ROOT TTree objects. Access is possible either to the full set of correlated data (i.e., the event) or to one or more sub-samples, stored in different branches of the data structure (TBranch class) and corresponding to one or more detectors.
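As a rough illustration of this branch layout (not the actual MEG classes), a plain ROOT macro can create an event tree with one branch per detector and later read back a single detector branch on its own; the branch names and the use of generic TClonesArray containers are placeholders.

```cpp
#include "TFile.h"
#include "TTree.h"
#include "TClonesArray.h"

// Sketch of an event tree with one branch per detector, so that a
// sub-sample (a single detector) can be read without touching the rest.
void writeEventTree() {
    TFile file("event.root", "RECREATE");
    TTree tree("Event", "MEG event structure (sketch)");

    // Placeholder hit containers, one per detector branch.
    TClonesArray dchHits("TObject");
    TClonesArray emcHits("TObject");
    TClonesArray tofHits("TObject");
    TClonesArray* pDch = &dchHits;
    TClonesArray* pEmc = &emcHits;
    TClonesArray* pTof = &tofHits;

    tree.Branch("DCH", &pDch);  // drift chamber branch
    tree.Branch("EMC", &pEmc);  // LXe calorimeter branch
    tree.Branch("TOF", &pTof);  // timing counter branch

    tree.Fill();                // one (empty) event for illustration
    tree.Write();
    file.Close();
}

// Reading back only one detector: activate a single branch.
void readOneDetector() {
    TFile file("event.root");
    TTree* tree = static_cast<TTree*>(file.Get("Event"));
    tree->SetBranchStatus("*", 0);    // disable everything...
    tree->SetBranchStatus("DCH", 1);  // ...except the drift chamber
    TClonesArray* dch = nullptr;
    tree->SetBranchAddress("DCH", &dch);
    tree->GetEntry(0);
}
```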

9 8 Computing Model

10 9 Elements of a Computing Model. Components – Data Model: event data sizes, formats, streaming; data "tiers" (DST/ESD/AOD etc.): roles, accessibility, distribution; calibration/conditions data: flow, latencies, update frequency; simulation: sizes, distribution; file sizes – Analysis Model: canonical group needs in terms of data, streams, re-processing, calibrations; data movement, job movement, priority management; interactive analysis. Implementation – Computing Strategy and Deployment: roles of the computing tiers; data distribution between tiers; data management architecture; databases: masters, updates, hierarchy, active/passive; experiment policy – Computing Specifications: profiles (per tier and time) for processors, storage, network (wide/local), database services, specialized servers; middleware requirements.

11 10 CPU Needed. Calibration: 1-7 CPUs; MC production: 11-33 CPUs; MC reconstruction: 3-14 CPUs per reprocessing; raw data reconstruction: 3-14 CPUs per reprocessing; alignment: to be estimated. Assumptions: trigger rate 20 Hz, beam duty cycle 50%, MC = data, calibration 1 Hz. Estimates obtained by scaling existing code and a C++ framework. Optimal solution for reconstruction: take a 100% beam duty cycle and use the no-beam time for reprocessing -> double the CPU for reconstruction.

12 11 Storage Needed. Assumptions: raw data (compressed) event size 120 kB; hits (from MC) 9 kB/evt; SDigit + Digit size (from MC) 30 kB + 120 kB; track references (from MC) 15 kB; kinematics (from MC) 4 kB; ESD size (data or MC) 30 kB. Storage required yearly (L3 trigger 20 Hz, duty cycle 50%): raw data 12 TB (+ calibration); hits (from MC) 0.9 TB; digits (from MC) 15 TB; track references (from MC) 1.5 TB; kinematics (from MC) 0.4 TB; ESD (data or MC) 3 TB per reprocessing. From the ALICE data model.
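A short stand-alone check (a hypothetical helper, not project code) showing how the yearly volumes above follow from the assumed trigger rate, duty cycle, live time and per-event sizes; it reproduces the slide's numbers exactly.

```cpp
#include <cstdio>

int main() {
    // Assumptions quoted on the slide.
    const double triggerRateHz = 20.0;   // L3 trigger rate
    const double dutyCycle     = 0.5;    // beam duty cycle
    const double liveSeconds   = 1e7;    // seconds of running per year
    const double eventsPerYear = triggerRateHz * dutyCycle * liveSeconds;  // 1e8 events

    // Per-event sizes in kB, from the slide.
    struct Item { const char* name; double kB; };
    const Item items[] = {
        {"Raw data (compressed)", 120.0},
        {"Hits (MC)",               9.0},
        {"SDigits+Digits (MC)",   150.0},   // 30 kB + 120 kB
        {"Track references (MC)",  15.0},
        {"Kinematics (MC)",         4.0},
        {"ESD (data or MC)",       30.0},
    };
    for (const Item& it : items) {
        const double tByte = eventsPerYear * it.kB / 1e9;  // kB -> TB
        std::printf("%-24s %6.1f TB/yr\n", it.name, tByte);
    }
    return 0;
}
```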

13 12 A Computing Model for MEG. The crucial decisions are being taken at the collaboration level. The model is very dependent on the CPU and storage requirements. Two degrees of complexity are being considered. INFN has requested a CM design by Nov 2005.

14 13 Data Access: ROOT + RDBMS Model. The event store (trees, histograms) lives in ROOT files; calibrations, geometries and the run/file catalog live in an RDBMS (Oracle or MySQL).
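A sketch of what this hybrid access pattern could look like with stock ROOT classes: TSQLServer is ROOT's generic SQL interface, while the host, database, table and column names below are invented for illustration only.

```cpp
#include "TFile.h"
#include "TTree.h"
#include "TString.h"
#include "TSQLServer.h"
#include "TSQLResult.h"
#include "TSQLRow.h"

// Sketch of the hybrid model: bulk event data in ROOT files (TTree),
// calibrations / geometry / run catalog in an RDBMS reached through
// ROOT's TSQLServer interface.  All names are illustrative.
void openEventAndCalibration(int runNumber) {
    // Event store: a plain ROOT file holding the event tree.
    TFile eventFile("run00123.root");
    TTree* events = static_cast<TTree*>(eventFile.Get("Event"));

    // Conditions store: MySQL here, but Oracle works the same way.
    TSQLServer* db = TSQLServer::Connect("mysql://dbhost/megdb", "reader", "secret");
    if (!db) return;

    TString query = TString::Format(
        "SELECT channel, pedestal, gain FROM calib WHERE run = %d", runNumber);
    if (TSQLResult* result = db->Query(query)) {
        while (TSQLRow* row = result->Next()) {
            // row->GetField(0..2): channel, pedestal, gain for this run
            delete row;
        }
        delete result;
    }
    db->Close();

    if (events) events->GetEntry(0);  // read the first event, as an example
}
```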

15 14 Computing Deployment: Implementation #1. Concentrated computing – Will be considered only in the case that the CPUs needed would exceed PSI's capabilities – Easiest implementation – It requires the least manpower for maintenance – PROOF is essential for Analysis and DQM

16 15 Computing Deployment: Implementation #2. Distributed computing – Posting of the data / catalog synchronization – Data distribution dependent upon connection speed – PSI -> INFN is OK (3 MB/s) – PSI -> Japan needs GRID. (Diagram: MONARC-style multi-tier layout with PSI as Tier 0, or PSI as Tier 1, Tier-1 centres X, Y, Z, Tier-2 labs and universities, department and desktop machines, connected by links from 155 Mbps to 2.5 Gbps.) "Transparent" user access to applications and all data.

17 16 Processing Flow. Raw data -> calibration & reconstruction -> ESD -> AOD -> Tag; reprocessing and MC production feed the same ESD/AOD/Tag chain; analysis (Ana) produces DPD.

18 17 Computing Model Configuration

19 18 How We Will Proceed. 1. Decide for a single-Tier or multi-Tier CM (depends heavily on the CPU + storage needed) 2. PSI is OK for Tier-0 (existing infrastructure + Horizon Cluster) 3. Find the candidate sites for Tier-1. Requirements for a Tier-1: 1. 1 FTE for job running and control 2. 0.5-1 FTE for data export, catalog maintenance and DQM 3. If GRID is needed, 0.3-0.5 FTE for GRID maintenance 4. Software installation would be the responsibility of the Offline group (after startup)

20 19 Computing Model Status. Started interaction with PSI's Computing Center (J. Baschnagel and coll.). The PSI CC survey was very encouraging: physical space and electrical power are not a limitation; however, the CC is running near 80% of its cooling and clean-power capabilities; the relevant services (backup, connection to the experimental area) are OK. Alternative options are being considered (set up the farm in the experimental area and use the CC only for backup). With the Horizon Cluster, PSI stands well as a Tier-0 candidate (in a multi-tier CM); it might even have enough resources to fulfill all of MEG's computing requirements.

21 20 Status of the Software

22 21 Software Organization. Development of the Montecarlo (G3 + Fortran) and of Calibration/Reconstruction (VMC + ROOT) will initially proceed in parallel; migration will eventually occur later (end 2005). The Montecarlo group is in charge of the simulation (geometry + event generator); the Offline group is in charge of the rest (architecture, framework, reconstruction, computing, etc.). Detector-specific code is developed along with the detector experts. Present analyses for R&D are done within the ROME environment (same as Online).

23 22 Status of the Montecarlo: DCH (P. Cattaneo). Progress – Implementation of the cable duct and various mechanical supports – Reorganization of the GEANT volumes inside the chambers – First integration of the time-to-distance profiles from GARFIELD. Next – Complete the integration of the GARFIELD profiles – Calculation of the time profiles for signals – Effect of the electrode patterns on the signals – Electronics simulation – Signal digitization – Improving details: cards, cables

24 23 Status of the Montecarlo: EMCAL. Progress – Pisa model: implement the new geometry – Pisa model: new tracking model (not based on GTNEXT) – Tokio model: GEANT-based (GNEXT) photon tracking, including refraction at the PMT quartz window and at the PMT holder, absorption and scattering – Tokio model: ZEBRA output (number of p.e. in each PMT, PMT positions; to be replaced by MySQL/XML) – Tokio model: ZEBRA-to-NTuple converter. Next – Update the geometry and complete the integration between the Pisa and Tokio models – Signal digitization – ZEBRA output for hit timing

25 24 Status of the Montecarlo: TOF. Progress – Implementation of the actual geometry: tapered slanted bars, square fibers – Addition of phototubes, photodiodes and related mechanics. Next – Generation of photons in the bars/fibers – Propagation of photons to the PMTs and photodiodes – Electronics and PMT/photodiode simulation – Improving details of the material distribution (e.g. better PMTs, cables) – Add mechanical support – Signal digitization

26 25 Status of the Montecarlo: more. Graphics: display of the geometry only (no event) – Some zooming capability – Possibility of displaying a single volume – Addition of phototubes, photodiodes and related mechanics. Beam, Target and Trigger – Preliminary approach, outside GEM

27 26 Software used in the LP beam test (R. Sawada). Midas: data taking, slow control, data logging, run control. ROME + ROOT: online monitoring, offline analysis. MySQL: channel information, geometry information, calibration constants.

28 Procedure of data taking. BoR record: run number, time, trigger mode; configuration (by configuration ID): trigger mode, channel info, geometry info, calibration data; event records are written to histograms and trees (ntuples); EoR record: time, number of events, calibration.
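The record structure can be sketched in plain C++; this is not Midas or ROME code, and all type and field names are assumptions made only to illustrate the begin-of-run / event / end-of-run flow described above.

```cpp
#include <cstdio>
#include <vector>

// Generic sketch of the record structure: a begin-of-run (BoR) record
// carrying the configuration, a stream of events, and an end-of-run
// (EoR) record.  All names are illustrative, not Midas/ROME classes.
struct BeginOfRun  { int runNumber; long time; int triggerMode; int configId; };
struct EventRecord { long time; int triggerMode; /* detector banks ... */ };
struct EndOfRun    { long time; long nEvents; /* calibration summary ... */ };

void processRun(const BeginOfRun& bor, const std::vector<EventRecord>& events) {
    // At BoR: the configuration ID keys the channel map, geometry and
    // calibration (in the beam test these were kept in MySQL).
    std::printf("run %d, config %d, trigger mode %d\n",
                bor.runNumber, bor.configId, bor.triggerMode);

    long nFilled = 0;
    for (const EventRecord& ev : events) {
        (void)ev;      // here: fill histograms and the tree/ntuple
        ++nFilled;
    }

    EndOfRun eor{events.empty() ? bor.time : events.back().time, nFilled};
    std::printf("end of run: %ld events\n", eor.nEvents);
}
```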

29 Procedure of offline analysis. Before a run: calibration, looked up by run number; BoR: trigger mode, channel info, geometry info, calibration data; events are turned into processed data; EoR: calibration.

30 29 Status of the Offline Project. Preliminary offline architecture approved by MEG (June 2004); INFN-CSN1 also approved the project (Sept. 2004); funds have been provided for installing a minifarm in Lecce (22 kEUR); real work started in November 2004.

31 30 Immediate Tasks and Objectives. Set up a system to test the offline code: DAQ, prompt calibration, reco farm, online disk server, staging to tape, RDBMS. Prototype the main program: modules, classes, steering program, FastMC. Test functionality and performance. Core Offline System development.

32 31 Manpower Estimate. Available manpower (FTE, table); +1 waiting for funds.

33 32 The People Involved. Core Offline (all 100%, except Chiri): – Coordinator: Gatto – Steering program: Di Benedetto, Gatto – FastMC: Mazzacane, Tassielli – TS implementation: Chiri – Interface to raw data: Chiri, Di Benedetto, Tassielli – Calibration classes & interface to RDBMS: Barbareschi (will start in Mar. 2005) – Trigger: Siragusa. Detector experts (all <100%): – LXe: Signorelli, Yamada, Sawada – TS: Schneebeli (hit), Hajime (pattern), Lecce (pattern) – TOF: Pavia/Genova – Trigger: Nicolo' (Pisa) – Magnet: Ootani. Montecarlo (all <100%): – LXe: Cattaneo, Cei, Yamada

34 33 Milestones. Offline: 1. Start-up: October 2004 2. Start deploying the minifarm: January 2005 3. First working version of the framework (FastMC + reconstruction): April 2005 (badly needed to proceed with the Computing Model) 4. Initial estimate of the CPU/storage needed: June 2005 5. Computing Model: November 2005 6. MDC: 4th quarter 2005. Montecarlo: 1. Include the calorimeter in GEM 2. Keep the existing MC in the Geant3 framework; form a panel to decide if and how to migrate to ROOT: 4th quarter 2005

36 35 Conclusions. The Offline project has been approved by MEG and by INFN; the Offline group has consolidated (mostly in Lecce) and work has started; a minifarm for testing the software is being set up; the definition of the Computing Model is under way; the Montecarlo is on its way to a detailed description of the detector.

37 36 Backup Slides

38 37 Compare to the others

39 38 Data Model (MONARC). ESD (Event Summary Data) – contain the reconstructed tracks (for example, track pT, particle ID, pseudorapidity and phi, and the like), the covariance matrix of the tracks, the list of track segments making up a track, etc. AOD (Analysis Object Data) – contain information on the event that will facilitate the analysis (for example, centrality, multiplicity, number of positrons, number of EM showers, and the like). Tag objects – identify the event by its physics signature (for example, a cosmic ray) and are much smaller than the other objects; Tag data would likely be stored in a database and be used as the source for the event selection. DPD (Derived Physics Data) – are constructed from the physics analysis of AOD and Tag objects; they are specific to the selected type of physics analysis (e.g. mu -> e gamma) and typically consist of histograms or ntuple-like objects; these objects will in general be stored locally on the workstation performing the analysis, thus adding no constraint to the overall data-storage resources.
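Purely illustrative class skeletons for these tiers, to make their relative content concrete; the field names echo the examples given above and are not the actual MEG definitions.

```cpp
#include <vector>

// ESD: reconstructed tracks with their covariance and constituents.
struct ESDTrack {
    double pt, eta, phi;            // kinematics (pT, pseudorapidity, phi)
    int    particleId;              // particle ID hypothesis
    double cov[15];                 // packed covariance matrix of the track
    std::vector<int> segmentIds;    // track segments making up the track
};
struct ESD { std::vector<ESDTrack> tracks; };

// AOD: event-level quantities that speed up analysis.
struct AOD {
    int multiplicity;
    int nPositrons;
    int nEmShowers;
};

// Tag: tiny physics signature, suitable for a database-backed event selection.
struct Tag {
    long eventId;
    bool isCosmic;
    bool isSignalCandidate;         // e.g. a mu -> e gamma candidate
};

// DPD: analysis-specific histograms / ntuples built from AOD + Tag,
// kept locally on the analysis workstation.
```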

40 39 Reconstruction Structure. The Run Manager (Run Class) executes the detector objects (e.g. DCH, EMC, TOF, plus one or more Global Reco objects) in the order of the list; each detector executes its own list of detector tasks; the Global objects execute tasks involving objects from several detectors; the data are kept in a ROOT tree with per-detector branches (the ROOT data base). On-demand actions are possible but are not the default.
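A minimal sketch of this run-manager / detector / task hierarchy, assuming simple stand-alone classes rather than the real framework (which, in ROOT, could equally be built on TTask); all class and task names are illustrative.

```cpp
#include <cstdio>
#include <string>
#include <vector>

// A run manager holds an ordered list of detector objects; each detector
// holds an ordered list of tasks.  Names are illustrative, not MEG classes.
struct DetectorTask {
    std::string name;
    void Execute() const { std::printf("  task: %s\n", name.c_str()); }
};

class Detector {
public:
    explicit Detector(std::string name) : fName(std::move(name)) {}
    void AddTask(DetectorTask task) { fTasks.push_back(std::move(task)); }
    void Execute() const {
        std::printf("detector: %s\n", fName.c_str());
        for (const auto& t : fTasks) t.Execute();   // tasks run in list order
    }
private:
    std::string fName;
    std::vector<DetectorTask> fTasks;
};

class RunManager {
public:
    void AddDetector(Detector det) { fDetectors.push_back(std::move(det)); }
    void Run() const {
        // Detectors are executed in the order they were registered.
        for (const auto& d : fDetectors) d.Execute();
    }
private:
    std::vector<Detector> fDetectors;
};

int main() {
    RunManager run;
    Detector dch("DCH");
    dch.AddTask({"hit reconstruction"});
    dch.AddTask({"pattern recognition"});
    run.AddDetector(std::move(dch));

    Detector global("GlobalReco");
    global.AddTask({"track-calorimeter matching"});  // uses objects from several detectors
    run.AddDetector(std::move(global));

    run.Run();
    return 0;
}
```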

41 40 Responsibilities & Tasks (all software). Detector experts: – LXe: Signorelli, Yamada, Sawada – DC: Schneebeli (hit), Hajime (pattern), Lecce (pattern) – TC: Pavia/Genova – Trigger: Nicolo' (Pisa) – Magnet: Ootani

42 41 Manpower Estimate (framework only). Available at Lecce; job posted: apply now; 4 from Naples + 2.2 from Lecce; +1 waiting for funds.


