Status of the Analysis Task Force


Status of the Analysis Task Force
J.F. Grosse-Oetringhaus
ALICE Week, Bologna, 21.06.06

Status of the Analysis Task Force, J.F. Grosse-Oetringhaus, CERN

- Initiated by Yves Schutz in May
- The analysis task force's objective is to ensure that a robust and efficient analysis framework is available, by stressing the analysis tools on the Grid and with PROOF
- Share the experience with ALICE users
- Submit requirements to the Offline team

Members

- Federico Carminati - PH/AIP <Federico.Carminati@cern.ch>
- Panos Christakoglou - PH/AIP <Panos.Christakoglou@cern.ch>
- Rafael Diaz Valdes - PH/UAI <Rafael.Diaz.Valdes@cern.ch>
- Andrei Gheata - PH/UAT <Andrei.Gheata@cern.ch>
- Jan Fiete Grosse-Oetringhaus - PH/AIP <Jan.Fiete.Grosse-Oetringhaus@cern.ch>
- Peter Hristov - PH/AIP <Peter.Hristov@cern.ch>
- Claus Jorgensen - PH/AIO <Claus.Jorgensen@cern.ch>
- Mercedes Lopez Noriega - PH/AIP <Mercedes.Lopez.Noriega@cern.ch>
- Andreas Morsch - PH/AIP <Andreas.Morsch@cern.ch>
- Tapan Nayak - PH/AIP <tapan.nayak@cern.ch>
- Yves Schutz - PH/AIP <Yves.Schutz@cern.ch>

Outlook

Summary of 4 talks from the Offline Week:
- Preparation of the new Analysis page in the Offline Web (Diaz, R.)
- File and event level metadata - queries to the file catalog (Christakoglou, P.)
- The CERN Analysis Facility: PROOF for Day-1 Physics and Calibration (Grosse-Oetringhaus, J.)
- A skeleton of the analysis framework (Gheata, A.)

All found here: http://indico.cern.ch/conferenceDisplay.py?confId=a056303#3

More information about analysis on the Grid (A. Peters): http://indico.cern.ch/getFile.py/access?contribId=4&sessionId=4&resId=0&materialId=slides&confId=1148

Analysis web page

http://aliceinfo.cern.ch/Offline/Activities/Analysis/
- Linked on the Offline page
- Information about analysis
- Dedicated pages to come:
  - Analysis framework
  - The CERN Analysis Facility (CAF)
  - Analysis on the Grid

Mailing list: alice-project-analysis-task-force@cern.ch

Run and event level metadata

You want to analyse events from/with, e.g.:
- ESD files
- pp collisions
- the start and stop time of the run between 19/03/2008 and 20/03/2008
- a properly reconstructed vertex
- a Vz value between -1 cm and 1 cm

The first three criteria are run (file) level metadata; the last two are event level metadata.

Run level metadata

- run comment
- run type: physics, laser, pulser, pedestal, simulation
- run start time
- run stop time
- run stop reason: normal, beam loss, detector failure, ...
- magnetic field setting: FullField, ReversedField, ZeroField, HalfField
- collision system: PbPb, pp, pPb, ...
- collision energy
- trigger class
- detectors present in this run
- # of events in this run
- run sanity
- for reconstructed events: production tag, production software library version, calibration & alignment setting
- for simulation: simulation config tag
- ...

Internal note by Markus Oldenburg: http://cern.ch/Oldenburg/MetaData/MetaData.doc
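As a rough illustration of run-level selection (field names follow the slide; this is not the actual schema from the Oldenburg note), a query like "physics pp runs inside a time window" could be sketched in plain C++:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative subset of the run-level metadata fields listed above.
struct RunMetadata {
    std::string runType;          // "physics", "laser", "pulser", "pedestal", "simulation"
    long        startTime = 0;    // run start (Unix time)
    long        stopTime  = 0;    // run stop (Unix time)
    std::string collisionSystem;  // "pp", "PbPb", "pPb", ...
    std::string fieldSetting;     // "FullField", "ReversedField", "ZeroField", "HalfField"
    int         nEvents   = 0;
};

// Select physics runs for a given collision system fully contained in
// a time window, before any event data is opened.
std::vector<RunMetadata> selectRuns(const std::vector<RunMetadata>& runs,
                                    const std::string& system,
                                    long tMin, long tMax) {
    std::vector<RunMetadata> out;
    for (const auto& r : runs)
        if (r.runType == "physics" && r.collisionSystem == system &&
            r.startTime >= tMin && r.stopTime <= tMax)
            out.push_back(r);
    return out;
}
```

The point of the run-level step is that whole files can be discarded from the file catalog query without touching their contents.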

The event tag system

After selecting a set of files to be analyzed, more selection criteria can be applied at the event level in order to analyze only the events that fulfil them, thus reducing the analysis time.

Criteria, e.g.:
- multiplicity range
- mean pt range
- NChargedAbove10GeV range
- ...

The classes that perform this step are AliEventTagCuts and AliTagAnalysis.
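Conceptually, the event tag step is a cheap filter over per-event summaries. The sketch below models that idea in plain C++; it is illustrative only and does not reproduce the actual AliEventTagCuts/AliTagAnalysis interface:

```cpp
#include <cassert>
#include <vector>

// Per-event summary, as an event tag might store it (illustrative fields).
struct EventTag {
    int    multiplicity       = 0;
    double meanPt             = 0.0;  // GeV/c
    int    nChargedAbove10GeV = 0;
};

// Cut ranges, defaulting to "accept everything".
struct EventCuts {
    int    multMin = 0,     multMax = 1 << 30;
    double meanPtMin = 0.0, meanPtMax = 1e9;

    bool pass(const EventTag& t) const {
        return t.multiplicity >= multMin && t.multiplicity <= multMax &&
               t.meanPt >= meanPtMin && t.meanPt <= meanPtMax;
    }
};

// Return the indices of events worth reading, so the expensive ESD I/O
// touches only the selected events.
std::vector<int> selectEvents(const std::vector<EventTag>& tags,
                              const EventCuts& cuts) {
    std::vector<int> idx;
    for (int i = 0; i < static_cast<int>(tags.size()); ++i)
        if (cuts.pass(tags[i])) idx.push_back(i);
    return idx;
}
```

The time saving comes from evaluating the cuts on the small tag objects rather than on the full event records.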

The event tag system (2)

For further information on the event tag system (structure, fields and use): http://indico.cern.ch/materialDisplay.py?subContId=2&contribId=s5t2&sessionId=s5&materialId=0&confId=a056302

A note on the architecture and usage of the event tag system is almost ready and will be submitted soon (P. Christakoglou).

PROOF - Parallel ROOT Facility

- Interactive parallel analysis on a local cluster
- PROOF itself is not tied to the Grid (but it can be used in the Grid)
- The aim is that the use of PROOF is fully transparent
- The same code can be run locally and in a PROOF system (certain rules have to be followed)

PROOF Schema (diagram, built up over several slides)

The diagram shows a local PC running a ROOT session next to a remote PROOF cluster: a PROOF master in front of worker nodes (node1 ... node4), each running a proof server and holding ESD files on local disk. The macro ana.C is distributed to the workers, and stdout and the result come back to the local session. The session stays the same whether the analysis runs locally or on the cluster:

  $ root
  root [0] tree->Process("ana.C")     // local run over a tree
  root [1] gROOT->Proof("remote")     // connect to the PROOF cluster
  root [2] chain->Process("ana.C")    // same macro, now run in parallel

CERN Analysis Facility

The CERN Analysis Facility (CAF) will run PROOF for ALICE:
- prompt analysis of pp data
- pilot analysis of PbPb data
- calibration & alignment

Available to the whole collaboration, but the number of users will be limited for efficiency reasons.

Design goals:
- 500 CPUs
- 100 TB of selected data locally available

CAF Schema (diagram)

Data flows from the experiment disk buffer to tape storage and to Tier-1 data export; a moderated subset is staged to the CAF computing cluster, which consists of PROOF nodes, each with a local disk.

Analysis Framework

Which constraints on the analysis are there?
- Big events
  - maximize the profit from having one event in memory
  - parallelize an analysis task for subsets of events (PROOF)
  - if possible, serialize the current event for several analysis chains
- Complex analysis chains
  - split a complex analysis into smaller modules
  - reuse functionality provided by modules; create a pool of useful/tested/agreed ones
  - assemble a big analysis from smaller independent pieces
  - easily locate/debug possible problems
- Something that every user wants
  - simple to use and robust

Prototype

A data-driven model controlled by a top-level TSelector mechanism:
- AliAnalysisTask - a basic task depending only on input data
- AliAnalysisDataSlot - represents a slot where data of a defined type may be connected to a task (a task can define several input/output slots)
- AliAnalysisDataContainer - a data placeholder
  - top-level containers publish the initial input data for several client tasks
  - result containers are connected to the output slot of a producer task and also to several client tasks
- AliAnalysisManager - a selector talking to a list of top-level tasks

Task execution is serialized event by event (TSelector). The analysis developer does not have to take care of the internal mechanisms of TSelector/TTask or the implementation details of this framework.
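Very loosely, the task/container wiring can be modeled in plain C++ as below. The names mirror the AliRoot classes, but the interfaces here are invented for illustration and are not the real framework API:

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// cf. AliAnalysisDataContainer: a named placeholder for data of some type.
struct DataContainer {
    std::string name;
    std::shared_ptr<void> data;  // untyped payload for this sketch
};

// cf. AliAnalysisTask: consumes data from input slots, publishes to output slots.
struct Task {
    std::vector<DataContainer*> inputs;   // input slots
    std::vector<DataContainer*> outputs;  // output slots
    std::function<void()> exec;           // per-event work
};

// cf. AliAnalysisManager: drives the top-level tasks, serialized per event.
struct Manager {
    std::vector<Task*> tasks;
    void processEvent() {
        for (auto* t : tasks) t->exec();
    }
};
```

A producer task's result container can then feed several client tasks, which is what lets one event in memory serve multiple analysis modules.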

The analysis tasks and data containers (diagram)

Analysis manager

- Allows definition of the set of tasks and the data containers to be used
- Provides methods for connecting tasks to containers to create an analysis tree
- Derives from TSelector:
  - initializes/connects input data to top-level containers via the Init()/Begin() methods
  - executes the top-level tasks during Process()
  - assembles the output during Terminate()
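A minimal sketch of that call sequence (a hypothetical stand-in class, not the real ROOT TSelector or AliRoot signatures):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the manager, recording which lifecycle phase ran when.
class AnalysisManager {
    std::vector<std::string> log;
public:
    void Init()      { log.push_back("Init"); }       // connect input to top-level containers
    void Begin()     { log.push_back("Begin"); }      // per-job initialization
    void Process(int /*event*/) { log.push_back("Process"); }  // run top-level tasks
    void Terminate() { log.push_back("Terminate"); }  // assemble the output
    const std::vector<std::string>& calls() const { return log; }
};

// Drive the manager over n events, the way tree->Process() drives a
// TSelector: setup once, Process per event, Terminate at the end.
std::vector<std::string> runJob(AnalysisManager& m, int nEvents) {
    m.Init();
    m.Begin();
    for (int i = 0; i < nEvents; ++i) m.Process(i);
    m.Terminate();
    return m.calls();
}
```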

Summary

- The Analysis Task Force exists to make sure that all tools needed for analysis will be in place. We have to try, test and stress the tools now → join!
- Analysis web page for all relevant information
- Run & event tags for the selection of files on the Grid → try it when you use events from PDC06!
- The CERN Analysis Facility, a PROOF cluster for analysis, is being set up → try it when you analyze PDC06 events!
- A prototype of the analysis framework is nearly ready → try it!