COMPASS Computing
Horst Fischer, University of Freiburg / CERN

Outline
- Central Data Recording and Event Database
- Event Reconstruction
- Some words on statistics
- Analysis Model

A word of warning: 2001 was the year of COMPASS detector commissioning, 2002 the year of COMPASS analysis commissioning. Hence, some numbers are preliminary and not yet on rock-solid ground.

Central Data Recording
- 35 MB/s input rate
- Data servers: up to 20 x 500 GB
- During shutdown: use the disks of the online farm as an additional stage pool (an additional 3.5 TB)
- CASTOR: a CERN development of an "infinite file system"; COMPASS is the first experiment that uses CASTOR heavily
(Diagram: data flow from the data servers through CASTOR to tape and to DST production.)
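Because CASTOR presents the tape store as a hierarchical file system, retrieving a raw data file comes down to copying it out of the /castor namespace. The sketch below is only illustrative: it assumes the standard RFIO copy tool rfcp, and the CASTOR path, file names and local target are hypothetical.

```python
# Illustrative only: stage a raw data file out of CASTOR with the RFIO copy tool rfcp.
# The CASTOR path and local destination below are hypothetical.
import subprocess

castor_file = "/castor/cern.ch/compass/data/2002/raw/run12345_chunk01.dat"  # hypothetical
local_copy = "/tmp/run12345_chunk01.dat"

# Basic usage: rfcp <source> <destination>; CASTOR stages the file from tape if needed.
subprocess.run(["rfcp", castor_file, local_copy], check=True)
```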

Central Data Recording
- Design value: 35 MB/s; 260 TByte recorded in ~100 days
- Compare to BaBar: 1 TByte/day, 662 TByte in ...
- Or CDF: 500 TByte/year (a U.S. year >> 100 days)
(Plot: accumulated data volume in GB, May 27 to Sep 18.)
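As a back-of-the-envelope cross-check of these numbers (my own arithmetic, not from the slide), 260 TByte over roughly 100 days corresponds to an average of about 30 MB/s, i.e. close to the 35 MB/s design value once dead time and downtime are folded in:

```python
# Back-of-the-envelope check of the quoted data volume against the design rate.
total_bytes = 260e12               # 260 TByte recorded
seconds = 100 * 24 * 3600          # ~100 days of data taking
avg_rate = total_bytes / seconds / 1e6
print(f"average rate: {avg_rate:.0f} MB/s (design value: 35 MB/s)")
```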

Data in Database
2001:
- Detector commissioning: 65 days, 13 TByte
- Target polarized longitudinally: 14 days, 15 TByte
- Total: 28 TByte
2002:
- Target polarized longitudinally: 57 days, 173k spills
- Target polarized transversely: 19 days, 52k spills
- Total: 260 TByte
- On average 22k triggers/spill in 2002 -> about 5 x 10^9 events (3.8 x 10^9 longitudinal, 1.2 x 10^9 transverse)
- Online: ~260,000 files (~80 files per run)
- After transfer to the CCF, ~2000 ev/s are formatted into Objectivity/DB; different databases for raw events and special events
- Objectivity is phased out at CERN and replaced by Oracle 9i
- Planned: event reconstruction before storing the DB federations to tape; in 2002: offline reconstruction (calibrations were/are not available)
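The quoted event counts follow directly from the spill statistics above; a quick check (my own arithmetic restating the slide's numbers):

```python
# Check of the 2002 event counts from the spill statistics quoted above.
triggers_per_spill = 22e3
longitudinal_spills = 173e3
transverse_spills = 52e3

long_events = triggers_per_spill * longitudinal_spills   # ~3.8e9
trans_events = triggers_per_spill * transverse_spills    # ~1.1e9
print(f"{long_events:.2e} + {trans_events:.2e} = {long_events + trans_events:.2e} events")
# ~260,000 files holding ~260 TByte also implies roughly 1 GByte per raw data file.
```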

Database Migration
- Raw events: original DATE format, flat files kept in CASTOR
- Metadata: mainly event headers and navigational information for raw and reconstructed events, kept in a relational database; about 0.25% of the raw data volume; at CERN stored in Oracle (but without Oracle-specific features), with the possibility of another database at the outside institutes (MySQL)
- Conditions data: migrated to the new CDB implementation based on Oracle; no interface change (abstract interface)
- DST: object streaming to files
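To make the metadata layer concrete, here is a minimal sketch of what such an event-header table might look like. The table and column names are purely illustrative and not the actual COMPASS schema; 'conn' is assumed to be any Python DB-API 2.0 connection, which fits the stated goal of avoiding Oracle-specific features so the same schema can also live in MySQL at the outside institutes.

```python
# Purely illustrative event-header table for raw-event metadata (not the real COMPASS schema).
# Column types are kept ANSI-generic so the DDL works in Oracle as well as MySQL.
DDL = """
CREATE TABLE event_headers (
    run_number    INTEGER     NOT NULL,
    spill_number  INTEGER     NOT NULL,
    event_number  INTEGER     NOT NULL,
    trigger_mask  INTEGER,
    file_name     VARCHAR(255),
    file_offset   NUMERIC(19),
    PRIMARY KEY (run_number, spill_number, event_number)
)
"""
# file_name / file_offset would locate the DATE-format raw event inside its CASTOR flat file.

def create_event_header_table(conn):
    """Create the illustrative metadata table via any DB-API 2.0 connection."""
    cur = conn.cursor()
    cur.execute(DDL)
    conn.commit()
```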

More on DB Migration
- CERN: IT/DB is involved in the migration project to a large extent. Planning activities started early this year; several people (IT/DB/PDM section) are working on the project, coordinating with the experiments and with other IT groups (servers: ADC group; CASTOR: DS group). The database is migrated to "borrowed" tapes.
- COMPASS: IT/DB is working closely with COMPASS experts to certify the interfaces to the Oracle DB.
- Licenses are being negotiated between CERN and Oracle for >2003.

Event Reconstruction
COMPASS Computing Farm (CCF):
- 100 dual-processor PIII Linux PCs
- Mainly for event reconstruction and (mini-)DST production
- Maintained and operated by CERN staff
- Part of the CERN LSF batch system
Average time to reconstruct one event: 400 ms/ev (300...500 ms; during the first 400 ms of a spill up to 1300 ms/ev).
5 x 10^9 events x 400 ms/ev = 2 x 10^9 CPU seconds; spread over 200 CPUs that is 10^7 s per CPU, i.e. about 115 days on 200 CPUs at 100% efficiency.
Today about 30% of the 2002 data have been pre-processed for preliminary physics analysis (since September).
Process as much in parallel as possible: 1 run ~ 80 data files ~ 80 batch jobs running at the same time.
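Restating that estimate as a small calculation (my own arithmetic, using only the numbers on the slide):

```python
# The reconstruction-time estimate from the slide, spelled out.
events = 5e9            # ~5 x 10^9 triggers recorded in 2002
t_per_event = 0.400     # average reconstruction time in seconds per event
n_cpus = 200            # 100 dual-processor PCs

total_cpu_seconds = events * t_per_event        # 2 x 10^9 s
seconds_per_cpu = total_cpu_seconds / n_cpus    # 1 x 10^7 s per CPU
days = seconds_per_cpu / 86400                  # ~115 days at 100% efficiency
print(f"{days:.1f} days on {n_cpus} CPUs at 100% efficiency")
```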

Batch Statistics
(Plots: jobs per week and CPU time per week.)
- It is a full-time job to control the data flow and the quality of the output
- Must set up production shifts to share the workload, like one does for data taking
- Alternative: outsourcing, i.e. hiring technicians?
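Production itself amounts to submitting one reconstruction job per raw data chunk to the LSF batch system (about 80 jobs per run, as noted above). A minimal sketch of such a submission loop, assuming the standard LSF bsub command; the queue name, wrapper script and job naming are hypothetical:

```python
# Illustrative submission of one reconstruction job per raw-data chunk of a run.
# Assumes the standard LSF 'bsub' command; queue, script and names are hypothetical.
import subprocess

RUN = 12345
N_CHUNKS = 80                          # ~80 data files (chunks) per run

for chunk in range(1, N_CHUNKS + 1):
    job = f"coral_run{RUN}_chunk{chunk:02d}"
    subprocess.run(
        ["bsub",
         "-q", "compass_prod",         # hypothetical batch queue
         "-J", job,                    # job name, useful for monitoring with bjobs
         "-o", f"{job}.log",           # where LSF writes the job's standard output
         "./run_coral.sh", str(RUN), str(chunk)],   # hypothetical wrapper script
        check=True,
    )
```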

COMPASS Analysis Model
(Diagram: RAW data (Oracle 9i), together with calibrations (MySQL), conditions (PVSS/Oracle 9i) and the e-logbook (MySQL), are processed by CORAL into DSTs (Oracle 9i) and mini-DSTs (ROOT trees); PHAST then produces micro-DSTs (ROOT trees) for the individual analyses.)

CORAL - Event Reconstruction
CORAL = COMPASS Reconstruction and AnaLysis program
- Modular architecture
- Following OO techniques
- Fully written in C++
- Written from scratch
- Defined interfaces for easy exchange of external packages (e.g. Objectivity/DB -> Oracle 9i)
- Access to the event, conditions and calibration databases
- Not as user friendly, however, as it may look
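The point about defined interfaces is what made the database migration tractable: the reconstruction code talks to an abstract event store, and only the back end is swapped. CORAL itself is C++; the sketch below merely illustrates the pattern in Python, and all class and method names are illustrative, not CORAL's actual API.

```python
# Illustration of the abstract-interface pattern only; not CORAL's real C++ API.
from abc import ABC, abstractmethod

class EventStore(ABC):
    """Fixed interface the reconstruction code programs against."""

    @abstractmethod
    def read_raw_event(self, run: int, spill: int, event: int) -> bytes:
        ...

class ObjectivityEventStore(EventStore):
    """Legacy back end, kept only until the migration is complete."""
    def read_raw_event(self, run, spill, event):
        raise NotImplementedError("would read the event from the Objectivity/DB federation")

class OracleEventStore(EventStore):
    """New back end: look up the event header in Oracle, read the DATE record from CASTOR."""
    def read_raw_event(self, run, spill, event):
        raise NotImplementedError("would query the Oracle metadata and fetch the raw record")

def reconstruct(store: EventStore, run: int, spill: int, event: int) -> bytes:
    """Reconstruction code stays unchanged whichever back end is plugged in."""
    raw = store.read_raw_event(run, spill, event)
    # ... decoding, tracking and PID would follow here ...
    return raw
```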

Calibrations etc.
Presently (2001 & 2002):
- Calibration DB: 42 MByte
- Map DB: 4.5 MByte
- Geometry DB: 4.4 MByte
Calibrations are stored in a run-dependent, file-structured database which allows for different versions (a minimal look-up sketch follows below); bookkeeping in MySQL (project launched recently).
Calibrations and responsible groups:
- Alignment (Torino)
- RICH refraction index, PID likelihood (Trieste)
- Time-zero calibrations (Bonn, Erlangen & Mainz)
- r-t relations (Saclay & Munich)
- Amplitude and time for GEM & Silicon (Munich)
- Calorimeter (Dubna & Protvino)
Detector calibrations cannot be automated yet; there are still new surprises every day ...
- This is a drain on our most valuable resource, "manpower"
- The tasks must stay in the responsibility of the listed groups, otherwise it becomes very inefficient because of the learning curve
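To illustrate what a run-dependent, versioned calibration store amounts to, here is a minimal sketch; the directory layout, file naming and selection rule are hypothetical and not the actual COMPASS conventions:

```python
# Hypothetical layout: calib/<detector>/<first_run>-<last_run>.v<version>
# (one file per detector, validity range and version). Not the real COMPASS scheme.
import os
import re

CALIB_DIR = "calib"
PATTERN = re.compile(r"^(\d+)-(\d+)\.v(\d+)$")

def find_calibration(detector: str, run: int) -> str:
    """Return the highest-version calibration file whose run range covers 'run'."""
    best = None
    for name in os.listdir(os.path.join(CALIB_DIR, detector)):
        m = PATTERN.match(name)
        if not m:
            continue
        first, last, version = map(int, m.groups())
        if first <= run <= last and (best is None or version > best[0]):
            best = (version, name)
    if best is None:
        raise FileNotFoundError(f"no calibration for {detector}, run {run}")
    return os.path.join(CALIB_DIR, detector, best[1])
```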

Experience: Reconstruction & Analysis (my personal, subjective view)
- "Distributed" data analysis is very inefficient (presently!)
- The individual groups participating in the analysis are not yet large enough to make fast & significant progress on their own: they frequently struggle with infrastructural problems, communication between the satellites is poor or non-existent, and discussions & stimulation are missing
- This unsatisfactory situation may improve with time (?)
Short-term solution:
- COMPASS needs a strong & centralized analysis group
- Exchange with the satellite groups through people travelling from and back to their home institutes

Analysis: Where & Resources
A significant core at CERN (the "Analysis Center"); resources: CCF + ...
Satellite resources:
- Trieste: 55 GHz, 4 TB
- TU Munich: 26 GHz, 2.4 TB (2003: double)
- GridKa (German LHC Tier 1): 530 GHz, 50 TB; COMPASS share 20 GHz (2003: double)
- Freiburg: 40 GHz, 3 TB (2003: x 1.5)
- Dubna: 14.5 GHz, 1.7 TB (2003: double)
- Saclay (-> Lyon)
- Mainz/Bonn
- Torino: 66 GHz, 1 TB

Conclusions & Suggestions
- COMPASS faces its first year of "real" data analysis. The challenges are enormous (calibrations, data handling, etc.), but very interesting physics lies ahead, and thus it is worth all the effort (and sleepless nights)
- Unfortunately we are forced to change the event database at a moment when we should be working at full pace on our first data analysis (a disadvantage of being dependent on third-party decisions!)
- Tools for event reconstruction and (mini- & micro-) DST analysis exist but need continuous improvement and adaptation to reality
- We are still weak on tools to submit & monitor and to manipulate & handle the output of 200+ batch jobs per hour (an effort underestimated by many of us); presently under evaluation: AliEn (an ALICE development)
- Organizational structure of the analysis group: 1. establish a strong core group at CERN with regular WORK(!)shops; 2. set up shifts for DST production