DCS Status and Amanda News
Peter Chochula and Svetozar Kapusta

Introduction
- This talk focuses on the DCS archival and the Amanda server. A general description of PVSS data handling has been presented several times; please check the slides from previous workshops or contact the DCS team.
- A report on the general DCS status will be given by Andre this afternoon.
- The DCS computing infrastructure is in place; we are currently tuning the load balancing in the detector cluster.

DCS Database Services
Three main services:
- DCS lab: used for debugging, commissioning and performance tests
- IT service: used mainly for performance tests, to be dropped soon
- CR3 servers: used for commissioning

Present DCS Database Service in CR3
- A redundant DB service configured as Oracle Data Guard is available
- Used both for configuration data and data archival
[Diagram: Oracle Data Guard redundant service]

DB Usage Status
[Table: database usage per detector, with columns Detector, E-mail, IT Lab [MB], P2 [MB], FERO [MB], FWcfg [MB], PVSSarch [MB] and TOTAL [MB]. The largest per-detector figures appear for TPC (55.44 + 88.19 + 3.38 + 71.10 + 14482.62 = 14700.73 MB, dominated by the PVSS archive), HMPID (38869.50 MB), Trigger (1513.56 MB), TOF (414.56 MB) and MUON (249.94 MB); the DCS row itself lists 409 GB. A grand total of 60120.30 MB is quoted, with archive sizes of ~2.2e9 and ~315e6 rows.]

Final Database Service Architecture
- Oracle RAC with 6 nodes
  - RAID, redundant power supplies
  - Two dual-core Xeon 5160 CPUs and 4 GB RAM per node
- 3 disk arrays (Infortrend EonStor A16F-R2431, 16 disks each), 20 TB in total
- SAN infrastructure based on SANbox 5600 switches in a redundant configuration
- Status:
  - Hardware installed
  - Software installation by DCS and IT ongoing (the schedule was not clear in July)
  - Service agreement between IT and the experiments under discussion
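
To make the load-balancing side of this architecture concrete, the minimal sketch below shows how a client could connect to such a RAC service with client-side load balancing and failover. It is only an illustration under assumptions: the host names, port and service name are hypothetical placeholders (and only three of the six nodes are listed), not the actual CR3 configuration.

# Minimal sketch of a client connection to a multi-node Oracle RAC service
# with client-side load balancing and failover. Host names, port and the
# service name are hypothetical placeholders.
import cx_Oracle

RAC_DSN = """(DESCRIPTION=
  (ADDRESS_LIST=
    (LOAD_BALANCE=ON)(FAILOVER=ON)
    (ADDRESS=(PROTOCOL=TCP)(HOST=dcs-rac1.example.cern.ch)(PORT=1521))
    (ADDRESS=(PROTOCOL=TCP)(HOST=dcs-rac2.example.cern.ch)(PORT=1521))
    (ADDRESS=(PROTOCOL=TCP)(HOST=dcs-rac3.example.cern.ch)(PORT=1521)))
  (CONNECT_DATA=(SERVICE_NAME=dcs_arch)))"""

def connect(user, password):
    """Open a connection; Oracle Net picks one of the listed RAC nodes."""
    return cx_Oracle.connect(user, password, RAC_DSN)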

Interface Between DCS and Offline
- FES:
  - Infrastructure in place
  - Framework used by the SPD to transfer data from DAQ to DCS
  - First systems to use the DCS FES: PHS and GMS (end of October)
- Amanda server (retrieval of data from the DCS archive):
  - PVSS API-based version at its limits
  - New server being implemented (see next slides)

Comments on Amanda Problems (status as of July 2007)
Two functionalities of the AMANDA server:
- Single DP request (one DP per connection)
  - Used, for example, for the analysis of TPC commissioning data
  - Stability problems, otherwise working fine
- Multiple DP request (several DPs per connection)
  - Reasonably faster
  - Instabilities (memory leaks) to be solved
  - Some requests caused crashes
[Diagram: AMANDA data flow: DCS protocol, PVSS interface, PVSS API (suspected source of the problems), file and DB access to the file and DB archives]
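
Since the two modes only differ in how much is requested per connection, a small client-side sketch may help to picture the usage patterns. The framing below (length-prefixed UTF-8 strings and a made-up "GET" request) is entirely hypothetical, since the real DCS/ADMP protocol is not shown in these slides; the code only illustrates the one-DP-per-connection versus several-DPs-per-connection distinction.

# Hypothetical client sketch: the framing and request syntax are invented for
# illustration only and do not reproduce the real AMANDA/ADMP protocol.
import socket
import struct

def _recv_exact(sock, n):
    """Read exactly n bytes from the socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("connection closed by server")
        buf += chunk
    return buf

def _send_request(host, port, request):
    """Open one connection, send a length-prefixed request, return the reply."""
    with socket.create_connection((host, port)) as sock:
        payload = request.encode("utf-8")
        sock.sendall(struct.pack("!I", len(payload)) + payload)
        reply_len = struct.unpack("!I", _recv_exact(sock, 4))[0]
        return _recv_exact(sock, reply_len)

def fetch_single_dp(host, port, dp, t_start, t_end):
    # Single-DP mode: one datapoint per connection (the mode used e.g. for the
    # analysis of TPC commissioning data).
    return _send_request(host, port, f"GET {dp} {t_start} {t_end}")

def fetch_many_dps(host, port, dps, t_start, t_end):
    # Multi-DP mode: several datapoints in one connection; faster, but the
    # PVSS-API-based server showed memory leaks and crashes in this mode.
    return _send_request(host, port, f"GET {','.join(dps)} {t_start} {t_end}")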

- In July we promised a quick fix in case the problems came from our code
- Deeper investigation showed that the source is in the PVSS API (commercial code):
  - Long DP aliases and aliases containing '/' were not handled correctly
  - Large amounts of data caused timeouts in the watchdog service (the PVSS API is not thread safe)
  - Internal serialization of data requests in the PVSS API causes performance limits which cannot be solved in our code
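
The last point, that the serialization happens inside the API, is the crucial one, and a toy example may make it clearer. The sketch below (plain Python, not AMANDA code) mimics an API that funnels every data request through one internal lock: no matter how many worker threads are added on our side, the total wall-clock time stays the same, which is exactly why the limit cannot be removed in our code.

# Toy illustration of an API that serializes all requests internally.
import threading
import time

_api_lock = threading.Lock()      # stands in for the API-internal serialization

def serialized_api_request(duration=0.05):
    with _api_lock:               # every request waits for the previous one
        time.sleep(duration)      # pretend work inside the (thread-unsafe) API

def run(n_threads, requests_per_thread=20):
    """Issue the requests from n_threads parallel threads and time them."""
    def worker():
        for _ in range(requests_per_thread):
            serialized_api_request()
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    t0 = time.time()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.time() - t0

if __name__ == "__main__":
    # The elapsed time grows with the total number of requests, not with the
    # number of threads: the internal lock is the bottleneck.
    for n in (1, 2, 4):
        print(f"{n} thread(s): {run(n):.2f} s")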

AMANDA Upgrade
- A new AMANDA, independent of PVSS, is being commissioned
- Code exists (both the Amanda server and a local client for tests) – within the timescale promised in July
- First tests done this week
- Release during next week (scheduled for Monday 15th)
[Diagram: old vs. new architecture: DCS protocol and PVSS interface going through the PVSS API to the file and DB archives, vs. DCS protocol with direct DB access to the DB archive]
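
The essential change is that the new server reads the archived values straight from the database instead of going through the PVSS API. The sketch below illustrates that idea, assuming a PVSS RDB-archiver-like layout with an ELEMENTS name-to-id mapping table and an EVENTHISTORY value table; the actual table owners, names and columns would have to be checked against the real archive schema.

# Sketch of direct archive access (the idea behind the new AMANDA), under the
# assumption of a PVSS RDB-archiver-like schema; table and column names are
# not guaranteed to match the real archive.
import cx_Oracle

HISTORY_QUERY = """
SELECT e.element_name, h.ts, h.value_number
  FROM elements e
  JOIN eventhistory h ON h.element_id = e.element_id
 WHERE e.element_name = :dp_name
   AND h.ts BETWEEN :t_start AND :t_end
 ORDER BY h.ts
"""

def read_dp_history(conn, dp_name, t_start, t_end):
    """Return (element name, timestamp, value) rows for one datapoint element."""
    with conn.cursor() as cur:
        cur.execute(HISTORY_QUERY, dp_name=dp_name, t_start=t_start, t_end=t_end)
        return cur.fetchall()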

Consequences of the Amanda Upgrade
- The ADMP protocol (communication protocol between the Shuttle and Amanda) needs to be modified (due to the database architecture)
- Changes already discussed with the offline team; a proposal was submitted in writing
- Working towards eliminating the most inconvenient changes (DP types are not stored in the database, but directly in PVSS, hence unreadable for the new Amanda)
  - A workaround is being tested

Implications Coming With the New Amanda
- The Amanda server sees all data produced in the DCS
  - No need to run one Amanda per detector; Offline will be served by one Amanda
  - Several servers can be installed if the performance is not sufficient
- Amanda can serve only data from the database
  - Detectors must use the Oracle archiver – which is also the DCS policy

Preparing for the FDR
- Full simulation running
  - Number of simulated datapoints reduced from ~70000 to ~17000 (PHS changed its request)
- DCS infrastructure:
  - Oracle database available in CR3 for detectors installed in the cavern
  - Oracle database available in the DCS lab for detectors being installed on the surface (no access to CR3 possible)
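
In essence the simulation amounts to generating timestamped values for a fixed set of datapoints at the scale quoted above. The standalone sketch below is not the actual PVSS-based simulator; it only illustrates the size of the load, and the datapoint naming scheme and values are invented.

# Standalone sketch of an FDR-like simulation load: ~17000 datapoints, each
# producing a timestamped value per cycle. Names and values are invented.
import random
import time

N_DATAPOINTS = 17000   # reduced from ~70000 after the PHS request changed

def make_dp_names(n=N_DATAPOINTS):
    return [f"fdr_sim/sector{(i % 18):02d}/channel{i:05d}.value" for i in range(n)]

def simulate_cycle(dp_names, base=100.0, spread=0.5):
    """One update cycle: yield (dp_name, timestamp, value) for every datapoint."""
    now = time.time()
    for name in dp_names:
        yield name, now, base + random.uniform(-spread, spread)

if __name__ == "__main__":
    updates = list(simulate_cycle(make_dp_names()))
    print(len(updates), "simulated values, first:", updates[0])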

Present Limitations
- Offline expects that all datapoints are available in the queried PVSS system
- This is satisfied for the simulation but might create problems for real data:
  - The DPs are created once the device is installed and integrated into the system
  - DPs do not exist for missing devices – incomplete detectors will have an incomplete set of datapoints
- Workaround:
  - Use the simulator for missing data
  - Do not request data which is not measured
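
Putting the two workaround options together, the client-side retrieval logic can be sketched as below. The three helpers (dp_exists, read_dp_history, simulate_dp) are hypothetical placeholders for whatever mechanism a detector uses to check for, read, or simulate a datapoint; the point is only the decision: fall back to simulated data, or do not ask for it at all.

# Sketch of the workaround: only query datapoints that actually exist, and
# fall back to the simulator (or skip) for devices that are not installed yet.
# dp_exists, read_dp_history and simulate_dp are hypothetical helpers.
def fetch_or_simulate(conn, dp_name, t_start, t_end,
                      dp_exists, read_dp_history, simulate_dp=None):
    if dp_exists(conn, dp_name):
        return read_dp_history(conn, dp_name, t_start, t_end)
    if simulate_dp is not None:
        # Missing device: serve simulated data so Offline sees a complete set.
        return simulate_dp(dp_name, t_start, t_end)
    # Otherwise the data is simply not requested and nothing is returned.
    return []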