Status of the CERN Analysis Facility

Slides:



Advertisements
Similar presentations
Status GridKa & ALICE T2 in Germany Kilian Schwarz GSI Darmstadt.
Advertisements

GSIAF "CAF" experience at GSI Kilian Schwarz. GSIAF Present status Present status installation and configuration installation and configuration usage.
1 Databases in ALICE L.Betev LCG Database Deployment and Persistency Workshop Geneva, October 17, 2005.
Grid and CDB Janusz Martyniak, Imperial College London MICE CM37 Analysis, Software and Reconstruction.
Statistics of CAF usage, Interaction with the GRID Marco MEONI CERN - Offline Week –
New CERN CAF facility: parameters, usage statistics, user support Marco MEONI Jan Fiete GROSSE-OETRINGHAUS CERN - Offline Week –
1 Status of the ALICE CERN Analysis Facility Marco MEONI – CERN/ALICE Jan Fiete GROSSE-OETRINGHAUS - CERN /ALICE CHEP Prague.
Staging to CAF + User groups + fairshare Jan Fiete Grosse-Oetringhaus, CERN PH/ALICE Offline week,
Status of CAF and SKAF Arsen Hayrapetyan, Yerevan Physics Institute 1 November 18, 2010 ALICE Offline Week.
PROOF: the Parallel ROOT Facility Scheduling and Load-balancing ACAT 2007 Jan Iwaszkiewicz ¹ ² Gerardo Ganis ¹ Fons Rademakers ¹ ¹ CERN PH/SFT ² University.
The ALICE Analysis Framework A.Gheata for ALICE Offline Collaboration 11/3/2008 ACAT'081A.Gheata – ALICE Analysis Framework.
AliEn uses bbFTP for the file transfers. Every FTD runs a server, and all the others FTD can connect and authenticate to it using certificates. bbFTP implements.
Test Of Distributed Data Quality Monitoring Of CMS Tracker Dataset H->ZZ->2e2mu with PileUp - 10,000 events ( ~ 50,000 hits for events) The monitoring.
US ATLAS Western Tier 2 Status and Plan Wei Yang ATLAS Physics Analysis Retreat SLAC March 5, 2007.
Alexandre A. P. Suaide VI DOSAR workshop, São Paulo, 2005 STAR grid activities and São Paulo experience.
Experiment Support CERN IT Department CH-1211 Geneva 23 Switzerland t DBES P. Saiz (IT-ES) AliEn job agents.
Status of the production and news about Nagios ALICE TF Meeting 22/07/2010.
1 Part III: PROOF Jan Fiete Grosse-Oetringhaus – CERN Andrei Gheata - CERN V3.2 –
Offline shifter training tutorial L. Betev February 19, 2009.
PROOF Cluster Management in ALICE Jan Fiete Grosse-Oetringhaus, CERN PH/ALICE CAF / PROOF Workshop,
ROOT and Federated Data Stores What Features We Would Like Fons Rademakers CERN CC-IN2P3, Nov, 2011, Lyon, France.
Infrastructure for QA and automatic trending F. Bellini, M. Germain ALICE Offline Week, 19 th November 2014.
Andrei Gheata, Mihaela Gheata, Andreas Morsch ALICE offline week, 5-9 July 2010.
CASTOR evolution Presentation to HEPiX 2003, Vancouver 20/10/2003 Jean-Damien Durand, CERN-IT.
AliRoot survey P.Hristov 11/06/2013. Offline framework  AliRoot in development since 1998  Directly based on ROOT  Used since the detector TDR’s for.
David Adams ATLAS DIAL: Distributed Interactive Analysis of Large datasets David Adams BNL August 5, 2002 BNL OMEGA talk.
PROOF and ALICE Analysis Facilities Arsen Hayrapetyan Yerevan Physics Institute, CERN.
Development of the CMS Databases and Interfaces for CMS Experiment: Current Status and Future Plans D.A Oleinik, A.Sh. Petrosyan, R.N.Semenov, I.A. Filozova,
ALICE Offline Week October 4 th 2006 Silvia Arcelli & Chiara Zampolli TOF Online Calibration - Strategy - TOF Detector Algorithm - TOF Preprocessor.
Latest Improvements in the PROOF system Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers, Gerri Ganis, Jan Iwaszkiewicz CERN.
Latest Improvements in the PROOF system Bleeding Edge Physics with Bleeding Edge Computing Fons Rademakers, Gerri Ganis, Jan Iwaszkiewicz CERN.
ALICE experiences with CASTOR2 Latchezar Betev ALICE.
Dynamic staging to a CAF cluster Jan Fiete Grosse-Oetringhaus, CERN PH/ALICE CAF / PROOF Workshop,
CMS: T1 Disk/Tape separation Nicolò Magini, CERN IT/SDC Oliver Gutsche, FNAL November 11 th 2013.
Good user practices + Dynamic staging to a CAF cluster Jan Fiete Grosse-Oetringhaus, CERN PH/ALICE CUF,
AAF tips and tricks Arsen Hayrapetyan Yerevan Physics Institute, Armenia.
ALICE Full Dress Rehearsal ALICE TF Meeting 02/08/07.
ANALYSIS TRAIN ON THE GRID Mihaela Gheata. AOD production train ◦ AOD production will be organized in a ‘train’ of tasks ◦ To maximize efficiency of full.
CALIBRATION: PREPARATION FOR RUN2 ALICE Offline Week, 25 June 2014 C. Zampolli.
Lyon Analysis Facility - status & evolution - Renaud Vernet.
Availability of ALICE Grid resources in Germany Kilian Schwarz GSI Darmstadt ALICE Offline Week.
ATLAS – statements of interest (1) A degree of hierarchy between the different computing facilities, with distinct roles at each level –Event filter Online.
The ALICE Analysis -- News from the battlefield Federico Carminati for the ALICE Computing Project CHEP 2010 – Taiwan.
Federating Data in the ALICE Experiment
Experience of PROOF cluster Installation and operation
WP18, High-speed data recording Krzysztof Wrona, European XFEL
Predrag Buncic ALICE Status Report LHCC Referee Meeting CERN
ALICE Monitoring
Report PROOF session ALICE Offline FAIR Grid Workshop #1
Status of the Analysis Task Force
ALICE analysis preservation
PROOF – Parallel ROOT Facility
Grid status ALICE Offline week Nov 3, Maarten Litmaath CERN-IT v1.0
GSIAF & Anar Manafov, Victor Penso, Carsten Preuss, and Kilian Schwarz, GSI Darmstadt, ALICE Offline week, v. 0.8.
ALICE Physics Data Challenge 3
Offline shifter training tutorial
Analysis Trains - Reloaded
Data Federation with Xrootd Wei Yang US ATLAS Computing Facility meeting Southern Methodist University, Oct 11-12, 2011.
MC data production, reconstruction and analysis - lessons from PDC’04
GSIAF "CAF" experience at GSI
Simulation use cases for T2 in ALICE
AliEn central services (structure and operation)
Offline shifter training tutorial
Ákos Frohner EGEE'08 September 2008
The INFN Tier-1 Storage Implementation
ALICE Computing Upgrade Predrag Buncic
Support for ”interactive batch”
Backup Monitoring – EMC NetWorker
Overview Multimedia: The Role of WINS in the Network Infrastructure
Presentation transcript:

Status of the CERN Analysis Facility J. F. Grosse-Oetringhaus, ALICE Offline Week, October 2007

Content CAF / PROOF Status Current Development Current version Available data Problems Current Development Datasets and automatic staging from AliEn Groups, disk quotas, CPU fairshare CAF workshop announcement The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

PROOF Schema Client – Local PC Remote PROOF Cluster root root root Result stdout/result root root ana.C node1 Result ana.C Data root Data node2 Result The Parallel ROOT Facility allows interactive parallel analysis on a local cluster Data node3 Result Proof master Proof worker Data node4 The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Proof master, xrootd redirector AliEn SE CASTOR MSS Tier-1 data export A cluster called CERN Analysis Facility (CAF) runs PROOF for Prompt analysis of pp data Pilot analysis of PbPb data Fast simulation/reconstruction Calibration & Alignment Experiment Disk Buffer Tape Staging CAF computing cluster Proof master, xrootd redirector Proof worker local disk Proof worker local disk Proof worker local disk ... The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Current ROOT version ROOT version from 30th July Since this week under test: new version with SVN tag 20268, probably will get default next week Last month's cpu usage The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Available data PDC06 14 TeV pp minbias: 3 million events 0.9 TeV pp minbias: 200k events AliRoot v4-04-Release available PDC07 0.9 TeV pp minbias: 3 million events staged AliRoot v4-05-Release not there yet, but ESD.par can be used See http://aliceinfo.cern.ch/Offline/Analysis/CAF/index.html#data The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Recent Problems Full cluster failure around 20th Sep (single incident) Reason still not fully understood "Hard" reset required by IT User problems Memory exceeded (especially when filling trees) Could affect other users and system processes (failing new, malloc calls) New: Hard limit of 2 GB per process Under development Memory tracking directly from PROOF progress window Creation of files to deal with large output on workers (in new CAF version) Running over long chains failed (~ 20.000 files) Fixed by Gerri Ganis The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Groups, Quotas, Fairshare ALICE wants to assure a fair use of the CAF resources Users are grouped E.g. sub detectors or physics working groups Users can be in several groups Each group has a quota on the disk This space can be used for datasets staged from AliEn (see next slides) Each group has a CPU fairshare target This governs the priority of concurrent queries More details in Gerri's and Marco's talks The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Monitoring of cluster usage Bytes read CPU time Events processed User waiting time per interval accumulated More details, tests and examples in Marco's talk Groups The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Easing the Staging of Data – PROOF Datasets A dataset represents a list of files (e.g. physics run X) Correspondence between AliEn collection and PROOF dataset Users register datasets The files contained in a dataset are automatically staged from AliEn (and kept available) Datasets are used for processing with PROOF Contain all relevant information to start processing (location of files, abstract description of content of files) File-level storing by underlying xrootd infrastructure Datasets are public for reading (you can use datasets from anybody!) There are common datasets (for data of common interest) The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Dataset concept PROOF master / xrootd redirector data manager daemon keeps dataset persistent by requesting staging updating file information touching files PROOF master data manager daemon Dataset registers dataset removes dataset uses dataset stage olbd/ xrootd selects disk server and forwards stage request read, touch PROOF worker / xrootd disk server (many) olbd/ xrootd WN disk read AliEn SE stages files removes files that are not used (least recently used above threshold) … file stager CASTOR MSS write delete The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Datasets in Practice Create DS from AliEn collection TFileCollection* ds = TGridResult::GetFileCollection() Upload to PROOF cluster gProof->RegisterDataSet("myDS", ds) Check status: gProof->ShowDataSet("myDS") Use it: gProof->Process("myDS", "mySelector.cxx+") (not yet implemented) verify api The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Dataset in Practice (2) List available datasets: gProof->ShowDataSets() You always see common datasets and datasets of your group This method was used to stage 30 M events of PDC07 Demo today in Gerhard's talk The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Monitoring of datasets Number of files per host Data set usage per group The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Using CAF @ lxplus CAF can be accessed by all ROOT production releases ROOT/AliRoot version compatible with CAF for lxplus users is on AFS source /afs/cern.ch/alice/caf/caf-lxplus.sh (bash shell) NB: Built for lxplus architecture, gnu/linux distribution Soon a parameter is needed, to indicate the required AliRoot version Is kept up to date, if CAF version is changed The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Information http://aliceinfo.cern.ch/Offline/Analysis/CAF Monthly tutorials at CERN, contact Yves Schutz Slides of the last tutorial (AliRoot, PROOF, AliEn) http://aliceinfo.cern.ch /Offline/Analysis/Tutorial/ Server installation at other institutes http://root.cern.ch/twiki/bin/view/ROOT/ProofInstallation The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

CAF workshop We will have a CAF workshop between November 28-30 (probably 1 day)  Please mark in your calendars Feedback from users Feature requests for PROOF and CAF Discussion about staging to CAF The analysis framework on CAF Not a tutorial The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Summary & Plans CAF up and running with minor problems Several new developments coming up Users need to be grouped. We will soon send a call to assign yourself to a group Datasets functionality to be made public in the next weeks  You will be able to decide yourself what is staged to the system There will be a CAF workshop at the end of November (around 28-30). Please mark in your calendars The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus

Backup The CERN Analysis Facility - Jan Fiete Grosse-Oetringhaus