The Performance and Scalability of the back-end DAQ sub-system


The Performance and Scalability of the back-end DAQ sub-system
Igor SOLOVIEV (CERN), ATLAS DAQ/EF-1
CHEP 2000

Contents
- Introduction
  - ATLAS DAQ/EF P-1 project
  - back-end software overview & architecture
- Test Results
  - component test results
  - integrated back-end sub-system test results
- Summary and Future

ATLAS DAQ/EF P-1 Project
- Goal: to produce a prototype system representing a "full slice" of a DAQ, suitable for evaluating candidate technologies and architectures for the final ATLAS DAQ system
- Sub-systems: Detector Interface, Data-Flow, Event Filter, Back-end
- Status:
  - base-line system developed and working in a lab environment
  - exploitation phase up to the TDR (2001)
  - to be used on the test-beam (summer 2000)

Back-end Sub-system
- configures, controls and monitors the DAQ system
- excludes management, processing and transportation of physics data
- talks to all the other online systems (the "glue" of the experiment)
- More information:
  - WWW pages: http://atddoc.cern.ch/Atlas/
  - "Impact of Software Review and Inspection": talk F331, today 17:50, Doris Burckhart

Back-end Architecture
- Components: the back-end software is split into groups with similar functionality (core components + TDAQ & detector integration components)
- Operational environment:
  - heterogeneous collection of UNIX workstations, PCs and embedded systems (e.g. PPC on VME under the real-time Lynx OS) connected via a local network
  - developed in C++ and ported to several compilers on Solaris, Linux, Lynx OS, HP-UX and Windows NT
- Design: uses freeware and commercial software: Tools.h++, OODB, CORBA, CHSM, CLIPS, Motif/Java

Back-end core components
- Configuration Databases: describe all aspects of the configuration
- Information Service (IS): general-purpose information exchange facility
- Message Reporting System (MRS): allows software components to report messages in a distributed environment
- Process Manager (PMG): performs distributed job control of components
- Run Control (RC): controls configuration and data-taking operations
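
As an illustration of the publish/subscribe role of IS and the message-reporting role of MRS listed above, here is a minimal, self-contained C++ sketch. The names (InfoService, reportMessage, "DF.EventRate") are hypothetical stand-ins, not the actual back-end API.

    #include <functional>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // In-process stand-in for an Information Service server: named values are
    // published and subscribers are notified on every update.
    class InfoService {
    public:
        using Callback = std::function<void(const std::string&, const std::string&)>;

        void publish(const std::string& name, const std::string& value) {
            values_[name] = value;
            for (const auto& cb : subscribers_[name]) cb(name, value);
        }
        void subscribe(const std::string& name, Callback cb) {
            subscribers_[name].push_back(std::move(cb));
        }

    private:
        std::map<std::string, std::string> values_;
        std::map<std::string, std::vector<Callback>> subscribers_;
    };

    // Stand-in for MRS: components report tagged messages to a central facility
    // instead of writing to private log files.
    void reportMessage(const std::string& severity, const std::string& text) {
        std::cout << "[MRS:" << severity << "] " << text << "\n";
    }

    int main() {
        InfoService is;
        is.subscribe("DF.EventRate",
                     [](const std::string& n, const std::string& v) {
                         std::cout << "IS update: " << n << " = " << v << "\n";
                     });
        is.publish("DF.EventRate", "1000 Hz");  // a data-flow component publishes status
        reportMessage("INFO", "run controller reached the configured state");
        return 0;
    }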

Component Unit Test Results
- Configuration Databases
  - used by many components during system start-up
  - tests done for different OKS configurations (single read-out crate, typical P-1 configuration, expected ATLAS DAQ configuration)
  - on an average workstation, the time to load the P-1 configuration, make a complete traverse and close is about 1.5 s; on a PPC VME board the same test requires about 3 s
- Information Services (IS & MRS)
  - used by many components during all phases of system operation (publish/subscribe facilities)
  - scalable (multiple servers to split the load)
  - benchmarks done on a single workstation and on several computers for different configurations (size, up to 50+10 clients)
  - the response time is a few milliseconds; better results for distributed systems
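
A minimal sketch of how a per-operation latency such as the "few milliseconds" IS response time quoted above can be measured; publishUpdate() is a hypothetical placeholder, not the real IS client call, which in the actual benchmark is a remote request to an information server.

    #include <chrono>
    #include <iostream>

    // Placeholder for a single IS update over the network.
    void publishUpdate() {}

    int main() {
        const int iterations = 10000;
        const auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < iterations; ++i) publishUpdate();
        const auto stop = std::chrono::steady_clock::now();
        const double totalMs =
            std::chrono::duration<double, std::milli>(stop - start).count();
        std::cout << "average time per update: " << totalMs / iterations << " ms\n";
        return 0;
    }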

Component Unit Test Results
- Process Manager
  - used during system start-up and shutdown
  - results obtained on a single Solaris workstation
  - the time to start a process is a few hundred milliseconds and slowly increases with the number of managed processes
- Run Control
  - required to change the state of the system
  - scalable by changing the structure of the RC tree
  - tests on all available workstations (up to 250 controllers)
  - the time to change the system's state with several tens of nodes varies from several hundred milliseconds up to a few seconds, depending on the state of the system
  - the time to change between running/configured states is < 1 s
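
To illustrate why the Run Control scales "by changing the structure of the RC tree", the following hedged sketch shows a hierarchical controller that propagates a state-change command to its children before completing its own transition. Class and controller names are illustrative, not the actual RC implementation.

    #include <iostream>
    #include <memory>
    #include <string>
    #include <vector>

    class Controller {
    public:
        explicit Controller(std::string name) : name_(std::move(name)) {}
        void addChild(std::shared_ptr<Controller> child) {
            children_.push_back(std::move(child));
        }

        // Propagate a state-change command (e.g. "configure", "start") down the
        // tree, then report the local transition.
        void command(const std::string& cmd) {
            for (auto& child : children_) child->command(cmd);
            std::cout << name_ << ": " << cmd << " done\n";
        }

    private:
        std::string name_;
        std::vector<std::shared_ptr<Controller>> children_;
    };

    int main() {
        // Build a small tree: one root, three crate controllers, one local DAQ
        // controller per crate.
        auto root = std::make_shared<Controller>("RootCtrl");
        for (int i = 0; i < 3; ++i) {
            auto crate = std::make_shared<Controller>("CrateCtrl" + std::to_string(i));
            crate->addChild(std::make_shared<Controller>("LDAQCtrl" + std::to_string(i)));
            root->addChild(crate);
        }
        root->command("configure");
        return 0;
    }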

Component Unit Test Conclusions
- Unit tests made for the back-end core components show that they meet the DAQ P-1 requirements
- Similar tests will be done for the back-end integration components

Back-end Sub-system Tests
- What: bring together all the core and several TDAQ/detector integration components
- Why: to simulate the control and configuration of data-taking sessions
- Where:
  - back-end servers run on a UNIX workstation
  - the others (PMG agent, LDAQ emulator & RC controller) run on PCs running Linux or on VME-based PowerPC CPU boards running Lynx OS

Test Configurations
[Diagram: per-crate PMG agents, LDAQ emulators and RC controllers connected over the network to the back-end servers (G IPC, P IPC, DF IS, PMG IS, RC IS, RDB, RM, MRS), the DAQ Supervisor, the IGUI and the RC root controller]

Test Description
- Done by a shell script:
  - start communication services
  - launch configuration processes via the DAQ supervisor
  - marshal the hierarchy of RC controllers through the states: I - L - C - R - C - R - C - L - I
  - stop the DAQ supervisor processes
  - stop the servers
- States: B - booted, I - initialized, L - loaded, C - configured, R - running
- [State diagram with transitions: cold start, setup, warm stop, shutdown, cold stop, warm start]
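
A small illustrative sketch (not the actual test script, which is a shell script) of the state vocabulary above and the I - L - C - R - C - R - C - L - I sequence the script drives the RC hierarchy through:

    #include <iostream>
    #include <vector>

    // The five run-control states named on the slide.
    enum class State { Booted, Initialized, Loaded, Configured, Running };

    const char* name(State s) {
        switch (s) {
            case State::Booted:      return "booted";
            case State::Initialized: return "initialized";
            case State::Loaded:      return "loaded";
            case State::Configured:  return "configured";
            case State::Running:     return "running";
        }
        return "?";
    }

    int main() {
        // The sequence the test script marshals the controllers through:
        // I - L - C - R - C - R - C - L - I
        std::vector<State> sequence = {
            State::Initialized, State::Loaded,  State::Configured, State::Running,
            State::Configured,  State::Running, State::Configured, State::Loaded,
            State::Initialized};
        for (State s : sequence) std::cout << "-> " << name(s) << "\n";
        return 0;
    }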

Start-up & warm start/stop
[Plots: time (seconds) vs. number of processors/crates, measured on PowerPC 100/200 MHz, 32/64 MB, Lynx OS and on Pentium III 450 MHz, 128 MB, Linux]

Start-up & close
[Plots: time (seconds) vs. number of processors/crates, measured on PowerPC 100/200 MHz, 32/64 MB, Lynx OS and on Pentium III 450 MHz, 128 MB, Linux]

Back-end system test summary
- Results
  - the time to start/stop processes depends on the OS, computer architecture and configuration
  - once all processes are started, the time to change the system state remains constant (good distributed control)
  - the use of IS, MRS and the configuration database has a negligible effect on performance
  - the results even for the largest configurations are within an acceptable range (< 1 minute to start up on Linux)
- Known problems
  - PMG agents started via RSH with long delays (20 s)
  - the computers were not dedicated to the tests

Summary & Future
- Individual back-end component tests
  - done for the core components; they show that the components meet the DAQ/EF P-1 requirements
  - similar tests have to be done for the integration components
- Integrated back-end system tests
  - performed employing the majority of the components
  - verified correct component inter-operation and the ability to work in a distributed multi-platform environment
  - gathered performance measurements
- Future
  - more statistics for larger configurations (more hosts)
  - script improvement and better start-up/shutdown synchronization

Appendix: Configuration Databases
- Importance
  - used by many components during initialization
  - performance is important for system start-up
- Results (with OKS)
  - [Plot: time (s) vs. number of crates; 1 crate = single read-out crate, 10 crates = prototype-1, 200 crates = expected ATLAS DAQ]

Appendix: Information Service
- Importance
  - used by many components
  - performance is important during all phases of system operation
- Results
  - scalable (multiple servers to split the load)
  - results presented are for updating medium-size information (on a single host); similar results for publish and remove
  - [Plot: update time (ms) vs. number of sources]

Appendix: Message Reporting System
- Importance
  - used by many components
  - performance is important during all phases of system operation
- Results
  - presented tests obtained on a single host
  - better results obtained in a distributed environment
  - [Plot: report time per message (ms) vs. number of senders]

Appendix: Process Manager
- Importance
  - performance is important for system start-up and shutdown
- Results
  - obtained on a single Solaris workstation
  - [Plot: time per process (ms)]

Appendix: Run Control
- Importance
  - required to change the state of the system
- Results
  - scalable by changing the structure of the RC tree
  - tests done on all available workstations
  - [Plot: time (s) vs. number of controllers]