HLT Data Transport Framework
Timm M. Steinbeck, Kirchhoff Institute of Physics, University of Heidelberg

Outline
– Overview of the Framework / Components
– HLT Control System
– Benchmarks & Tests
  ● Framework Communication Benchmark
  ● Framework Interface Benchmark
  ● Framework Sample Configurations
– Summary & Conclusion

Overview of the Framework
– The framework consists of independent components
– Components communicate via a publisher-subscriber interface
  ● Data-driven architecture
  ● Components receive data from other components
  ● Process the received data
  ● Forward their own output data to the next component
– Named pipes are used to exchange small messages and descriptors
– The data itself is exchanged via shared memory
Major design criteria
  ● Efficiency: CPU cycles should go into the analysis, not the framework
  ● Flexibility: different configurations are needed
  ● Fault tolerance: PCs are unreliable
(Figure: two components in a chain, each with a subscriber, a processing stage, and a publisher; event data is passed via shared memory, descriptors via a named pipe.)
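The mechanism can be illustrated with plain POSIX primitives. The following is a minimal single-process sketch, not the framework's actual API; the EventDescriptor struct, the shared-memory name, and the pipe path are invented for illustration. The key point it shows is that only a small descriptor travels through the named pipe, while the event data itself stays in shared memory and is never copied.

```cpp
// Minimal POSIX sketch of the publish/subscribe step: bulk event data sits in
// shared memory, only a small descriptor (offset + size) goes through a named
// pipe. All names are illustrative; this is not the framework's API.
#include <cstdio>
#include <cstring>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

struct EventDescriptor {    // the small message sent through the named pipe
    unsigned long offset;   // where the event starts in the shared buffer
    unsigned long size;     // event size in bytes
};

int main() {
    const long kBufferSize = 1 << 20;            // 1 MiB shared event buffer

    // "Publisher" side: create the shared memory region for event data.
    int shmFd = shm_open("/hlt_demo_buffer", O_CREAT | O_RDWR, 0600);
    ftruncate(shmFd, kBufferSize);
    char* buffer = static_cast<char*>(
        mmap(nullptr, kBufferSize, PROT_READ | PROT_WRITE, MAP_SHARED, shmFd, 0));

    // Place one dummy "event" into the shared buffer.
    const char payload[] = "raw event data";
    memcpy(buffer, payload, sizeof(payload));

    // Named pipe for descriptors. O_RDWR keeps open() from blocking in this
    // single-process demo (Linux behaviour); normally the subscriber process
    // would hold the read end.
    mkfifo("/tmp/hlt_demo_pipe", 0600);
    int pipeFd = open("/tmp/hlt_demo_pipe", O_RDWR);

    // Announce the event: send only the descriptor, never the data.
    EventDescriptor announce{0, sizeof(payload)};
    write(pipeFd, &announce, sizeof(announce));

    // "Subscriber" side (same process here for brevity): read the descriptor
    // and access the event directly in shared memory, without copying.
    EventDescriptor received;
    read(pipeFd, &received, sizeof(received));
    printf("event of %lu bytes at offset %lu: %s\n",
           received.size, received.offset, buffer + received.offset);

    // Cleanup.
    munmap(buffer, kBufferSize);
    close(shmFd);
    close(pipeFd);
    shm_unlink("/hlt_demo_buffer");
    unlink("/tmp/hlt_demo_pipe");
    return 0;
}
```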

Application Components
Templates are provided for application-specific components:
● Data Source Template: read data from a source and publish it
  (data source addressing & handling, buffer management, data announcing)
● Analysis Template: accept data, process it, publish the results
  (accepting input data, input data addressing, data analysis & processing, output buffer management, output data announcing)
● Data Sink Template: accept data and process it, e.g. for storage
  (accepting input data, input data addressing, data sink addressing & writing)
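As an illustration of the template idea, the sketch below shows how an analysis-style component could supply only the processing step, while a base class handles accepting input, managing the output buffer, and announcing the result. All class and method names here are hypothetical; the real templates are considerably richer.

```cpp
// Hypothetical sketch of the analysis-template pattern: the base class owns
// the input/output plumbing, the derived component implements only the
// processing. Names are illustrative, not the framework's real interface.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

class AnalysisComponentTemplate {
public:
    virtual ~AnalysisComponentTemplate() = default;

    // Called for each announced input event.
    void HandleEvent(const std::vector<std::uint8_t>& input) {
        std::vector<std::uint8_t> output;  // output buffer management
        ProcessEvent(input, output);       // application-specific analysis
        AnnounceOutput(output);            // forward result to subscribers
    }

protected:
    // The only part an application developer has to write.
    virtual void ProcessEvent(const std::vector<std::uint8_t>& in,
                              std::vector<std::uint8_t>& out) = 0;

private:
    void AnnounceOutput(const std::vector<std::uint8_t>& out) {
        std::cout << "publishing " << out.size() << " bytes\n";
    }
};

// Example component: a trivial stand-in for real processing that keeps
// every fourth input byte.
class DummyClusterFinder : public AnalysisComponentTemplate {
protected:
    void ProcessEvent(const std::vector<std::uint8_t>& in,
                      std::vector<std::uint8_t>& out) override {
        for (std::size_t i = 0; i < in.size(); i += 4)
            out.push_back(in[i]);
    }
};

int main() {
    DummyClusterFinder finder;
    finder.HandleEvent(std::vector<std::uint8_t>(1024, 0x42));  // one dummy event
    return 0;
}
```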

Data Flow Components
– The framework contains components to shape the flow of data in a cluster
– All of them use the publisher-subscriber interface
  ● Event merger: merge parts of events into complete events
  ● Bridges: transparently connect components on different nodes across the network
  ● Scatter/gather: split and rejoin a data stream, e.g. for load balancing
(Figure: publisher-subscriber diagrams of an event merger combining sub-event streams, a scatterer/gatherer pair splitting and rejoining a stream, and a bridge connecting components on Node A and Node B over the network.)
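The load-balancing scatterer can be pictured as a round-robin dispatcher. The sketch below is only an illustration of that idea; the Scatterer class and its interface are invented and are not the framework's actual component.

```cpp
// Sketch of the scatter part of scatter/gather: each incoming event is
// forwarded to exactly one of several downstream outputs, chosen round-robin.
// A matching gatherer would reunite the streams. Illustrative only.
#include <cstddef>
#include <functional>
#include <iostream>
#include <vector>

struct Event { unsigned long id; };

class Scatterer {
public:
    using Output = std::function<void(const Event&)>;

    void AddOutput(Output out) { outputs_.push_back(std::move(out)); }

    // Publish one event to the next output in round-robin order.
    void Publish(const Event& event) {
        outputs_[next_ % outputs_.size()](event);
        ++next_;
    }

private:
    std::vector<Output> outputs_;
    std::size_t next_ = 0;
};

int main() {
    Scatterer scatterer;
    for (int worker = 0; worker < 3; ++worker)      // three downstream consumers
        scatterer.AddOutput([worker](const Event& e) {
            std::cout << "event " << e.id << " -> worker " << worker << "\n";
        });

    for (unsigned long id = 0; id < 6; ++id)        // six dummy events
        scatterer.Publish(Event{id});
    return 0;
}
```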

HLT Control System
Purpose of the HLT Control System: management of the HLT processes
– Supervise processes
– React to state changes
  ● e.g. errors, unexpected termination
– Manage system startup
  ● Connect components only when they are ready
  ● Connect bridges between nodes only when the nodes are ready
– Orchestration of the HLT system
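A minimal sketch of the supervision part, assuming plain POSIX process control: start a controlled process, wait for it to change state, and restart it after an unexpected termination. The real control system does far more (hierarchy, configuration, readiness handling); the command, the restart limit, and the policy below are placeholders.

```cpp
// Supervise one controlled process: detect termination and restart on an
// unexpected (non-zero) exit. Replace the "sleep 2" placeholder with a
// crashing command to exercise the restart branch.
#include <cstdio>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pid_t StartControlledProcess() {
    pid_t pid = fork();
    if (pid == 0) {
        // Placeholder for launching an analysis component binary.
        execlp("sleep", "sleep", "2", static_cast<char*>(nullptr));
        _exit(127);  // only reached if exec itself fails
    }
    return pid;
}

int main() {
    pid_t child = StartControlledProcess();
    for (int restarts = 0; restarts < 3; ++restarts) {   // arbitrary limit
        int status = 0;
        waitpid(child, &status, 0);          // block until the child terminates
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            printf("child finished cleanly, nothing to do\n");
            break;
        }
        // Unexpected termination (crash or non-zero exit): restart it.
        printf("child terminated unexpectedly, restarting\n");
        child = StartControlledProcess();
    }
    return 0;
}
```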

Requirements
– Flexible
  ● HLT configurations are not yet defined
  ● Usable for other programs as well
– Hierarchical
  ● >2000 processes cannot be managed by a single supervisor instance
  ● Eases configuration
– No single point of failure
  ● PC cluster nodes are unreliable
  ● No need for highly reliable, expensive equipment
– Should be able to run on the cluster nodes themselves, alongside the analysis processes

Core TaskManager Architecture
(Figure: block diagram of the TaskManager, consisting of a configuration engine fed by an XML configuration file, a program state & action engine with an embedded Python interpreter, and a program interface engine that talks to the controlled programs through an interface library.)
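The embedded Python interpreter can be illustrated with the standard CPython embedding API: the C++ host process starts an interpreter and evaluates a rule that maps a program state to an action. The rule shown is an invented example, not an actual HLT control rule; the build line assumes Python 3.8+ development headers are installed.

```cpp
// Evaluate a state/action rule in an embedded Python interpreter.
// Build (assumption: Python 3.8+ dev package available):
//   g++ taskmanager_demo.cc $(python3-config --embed --cflags --ldflags)
#include <Python.h>

int main() {
    Py_Initialize();  // start the embedded interpreter

    // A tiny, made-up rule: derive an action from a reported program state.
    const char* rule =
        "state = 'error'\n"
        "action = 'restart' if state == 'error' else 'none'\n"
        "print('state:', state, '-> action:', action)\n";
    PyRun_SimpleString(rule);

    Py_Finalize();    // shut the interpreter down again
    return 0;
}
```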

Framework Communication Benchmark
● Network throughput
(Plot: throughput over Gigabit Ethernet, InfiniBand with TCP (IPoIB, SDP), and InfiniBand with uDAPL.)

Framework Communication Benchmark
● CPU load per network throughput (sender + receiver)

Interface Benchmarks
– Benchmark of the publisher-subscriber interface only
  ● No shared memory access or other processing
– Benchmarks are sensitive to the L1 cache size (P4 family has only 8 kB)
– Maximum rate achieved: >110 kHz (dual Opteron)
– Time overhead for an event round-trip: < 18 µs
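Rate and round-trip numbers of this kind come from timing a large number of iterations and dividing. The sketch below shows only that measurement pattern; DummyInterface stands in for the real publisher-subscriber interface, so the numbers it produces are meaningless in themselves.

```cpp
// Micro-benchmark pattern: time many "round-trips" and report the average.
// DummyInterface is a placeholder for the real publisher-subscriber calls.
#include <chrono>
#include <cstdio>

struct DummyInterface {
    // volatile so the compiler cannot optimise the loop away entirely
    volatile unsigned long counter = 0;
    void AnnounceEvent() { counter = counter + 1; }  // stands in for publish + callback
};

int main() {
    DummyInterface iface;
    const unsigned long kEvents = 10000000;  // 10 million iterations

    auto start = std::chrono::steady_clock::now();
    for (unsigned long i = 0; i < kEvents; ++i)
        iface.AnnounceEvent();
    auto stop = std::chrono::steady_clock::now();

    double seconds = std::chrono::duration<double>(stop - start).count();
    printf("rate: %.1f kHz, time per event: %.3f us\n",
           kEvents / seconds / 1e3, seconds / kEvents * 1e6);
    return 0;
}
```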

2-Node Network Test
● 2 nodes connected by bridge components
● Dual 3.2 GHz Xeon 4 with EM64T
● Gigabit Ethernet
● Sender node: 1 FilePublisher (FP)
● Receiver node: 1 EventRateSubscriber (ERS)
● 4 kB dummy events
● Achieved rate: 16.7 kHz
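Conceptually, the test pushes fixed-size dummy events across the network link and counts how many arrive per second. The sketch below reproduces that pattern with a local socketpair standing in for the Gigabit Ethernet bridge connection, so the numbers it prints are not comparable to the 16.7 kHz quoted above.

```cpp
// Push 4 kB dummy events through a socket and derive an event rate on the
// receiving side. A local AF_UNIX socketpair replaces the real network link;
// the pattern, not the result, is the point.
#include <chrono>
#include <cstdio>
#include <cstring>
#include <sys/socket.h>
#include <unistd.h>

int main() {
    int fds[2];
    socketpair(AF_UNIX, SOCK_STREAM, 0, fds);  // stand-in for the GbE link

    const size_t kEventSize = 4096;            // 4 kB dummy events, as in the test
    const int kEvents = 100000;
    char event[kEventSize];
    memset(event, 0, sizeof(event));

    auto start = std::chrono::steady_clock::now();
    if (fork() == 0) {                         // "sender node"
        for (int i = 0; i < kEvents; ++i)
            write(fds[0], event, sizeof(event));
        _exit(0);
    }

    char recvBuf[kEventSize];                  // "receiver node"
    size_t received = 0;
    while (received < static_cast<size_t>(kEvents) * kEventSize) {
        ssize_t n = read(fds[1], recvBuf, sizeof(recvBuf));
        if (n <= 0) break;
        received += static_cast<size_t>(n);
    }
    auto stop = std::chrono::steady_clock::now();

    double seconds = std::chrono::duration<double>(stop - start).count();
    printf("approx. event rate: %.1f kHz\n", kEvents / seconds / 1e3);
    return 0;
}
```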

“Compact” HLT Configuration
● HLT test configuration using 5 nodes to process 1 slice
● 3 ClusterFinder nodes
  – 2 FilePublishers (FP) + 2 ClusterFinders (CF) per node (1 each per patch)
● 2 Tracker nodes
  – 1 Tracker (TR) process per node
(Figure: FP → CF chains on the ClusterFinder nodes feeding the Tracker nodes.)

“Compact” HLT Configuration Results
● Dual 800 MHz Pentium 3, Fast Ethernet: 870 Hz (runtime 142 h)
● Dual 3.2 GHz Xeon 4 with EM64T, Gigabit Ethernet: 2.9 kHz
● Using empty events

Local Scatterer Test
● Test on one node
  – 1 FilePublisher (FP)
  – 1 Scatterer (S)
  – 2 DummyLoads (DL)
● Rates:
  – Dual 800 MHz Pentium 3: 6 kHz
  – Single 1.6 GHz Opteron 242: 9.8 kHz
  – Dual 1.6 GHz Opteron 242: 16.8 kHz

HLT Data Transport Framework Summary
● The HLT Data Transport Framework has matured
● Already being used in projects, testbeams, commissioning, and HLT data challenges
● Stable: one setup ran continuously for 1 month
● Performance improved significantly
  – Already sufficient for the HLT
● The component approach has already proved useful
  – Easy to create configurations for short tests
  – Flexible configuration changes, e.g. during testbeams, commissioning, and data challenges