Data Acquisition Backbone Core (DABC)
J. Adamczewski, H.G. Essel, N. Kurz, S. Linev, GSI, Darmstadt
CHEP2007, Victoria BC, Canada, 2-7 September 2007. Work supported by EU FP6 project JRA1 FutureDAQ, RII3-CT-2004-506078.

[Figure: The new Facility for Antiproton and Ion Research (FAIR) at GSI]
[Figure: First sketch of the data acquisition concept for CBM (2004); H.G. Essel, "FutureDAQ for CBM: On-line event selection", IEEE TNS Vol. 53, No. 3, June 2006, pp. 677-681]

General purpose DAQ
- Event building network with standard components such as InfiniBand
- Scheduled data transfer with ~10 µs accuracy
- Thread synchronization with determined latency
- Needs real-time patches of the standard Linux kernel 2.6.19: RT priorities, nanosleep and pthread condition variables; PREEMPT_RT (Ingo Molnar) and the high resolution timer (Thomas Gleixner). A minimal POSIX sketch of these mechanisms is given at the end of this section.

Use cases
- Detector tests
- Front-end equipment tests
- High speed data transport
- Time distribution
- Switched event building
- Software evaluation
- MBS* event builder

Requirements
- build events over fast networks
- handle triggered or self-triggered front-ends
- process time-stamped data streams
- provide data flow control (to the front-ends)
- connect (nearly) any front-ends
- provide interfaces to plug in application code
- connect MBS readout or collector nodes
- be controllable by several controls frameworks

* Multi Branch System: the current standard DAQ at GSI

Motivation for developing DABC
The left figure shows the hardware components DABC has to deal with. Front-end boards typically carry sampling digitizers, whereas combiner and dispatcher boards have fast data links controlled by FPGAs. Prototypes of these boards are under test. Data rates on the order of 1 GByte/s can be delivered through the PCI express interface.
[Figure: Frontend hardware example]

Logically, such a setup looks as sketched in the right figure. Depending on the performance requirements (the data rates of the front-end channels), one may connect one or more front-ends to one DABC node. From a minimum of one PC with Ethernet up to medium-sized clusters, all kinds of configurations are possible. Depending on the CPU requirements, one may separate input and processing nodes, or combine both tasks on each machine using bi-directional links.
[Figure: DABC as data flow engine]

If the receiver nodes are busy with other calculations such as event building, it is necessary to hold back the senders. This is achieved by the back-pressure mechanism shown in the data flow control figure: the receivers inform the senders about their queue status after a collection of buffers, and the senders send only when the receivers have signalled sufficient resources. The back pressure works smoothly. (A code sketch of such a credit scheme follows at the end of this section.)
[Figure: Data flow control of DABC — producer, send and receive threads pass buffers from a memory pool across the network interfaces; the consumer side returns feed-back signals that throttle the sender]
[Figure: Logical structure of DABC — data input, sorting, tagging, filtering and analysis stages on front-end readout PCs (MBS and other front-ends) and analysis/archive PCs, connected by Gigabit Ethernet and InfiniBand under a scheduler]

DABC: http://www-linux.gsi.de/~dabc/dabc/DABC.php
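As a complement to the back-pressure mechanism described above, the following is a minimal sketch of a credit-based scheme of this kind. It is not DABC code: the class name CreditSender, its methods and the in-process synchronization are illustrative assumptions, and a real transport would run over InfiniBand or Gigabit Ethernet rather than calling a local Transmit stub.

    // Hypothetical sketch of credit-based back pressure: the receiver grants
    // "credits" (free queue slots) to the sender; the sender blocks when the
    // credits are exhausted and resumes once the receiver has consumed a
    // collection of buffers and reported its queue status back.
    #include <condition_variable>
    #include <cstddef>
    #include <mutex>

    class CreditSender {                    // hypothetical name, not a DABC class
    public:
        explicit CreditSender(int queueSlots) : credits_(queueSlots) {}

        // Called by the producer thread for every buffer to be transmitted.
        // Blocks while the receiver has not signalled free queue slots.
        void SendBuffer(const void* buf, std::size_t len) {
            std::unique_lock<std::mutex> lock(mutex_);
            cond_.wait(lock, [this] { return credits_ > 0; });
            --credits_;
            lock.unlock();
            Transmit(buf, len);             // stand-in for the real network transfer
        }

        // Called when the receiver reports its queue status after it has
        // consumed a collection of buffers (the "feed back" in the figure).
        void Acknowledge(int freedSlots) {
            {
                std::lock_guard<std::mutex> lock(mutex_);
                credits_ += freedSlots;
            }
            cond_.notify_one();
        }

    private:
        void Transmit(const void*, std::size_t) { /* placeholder */ }

        std::mutex mutex_;
        std::condition_variable cond_;
        int credits_;                       // free receiver slots known to the sender
    };

Granting credits only after a collection of buffers keeps the feed-back traffic small, which matches the behaviour described above.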
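The scheduled data transfer with ~10 µs accuracy listed above relies on RT scheduling priorities and the high-resolution timer of the patched kernel. The sketch below shows, with plain POSIX calls, how a thread could be given a real-time priority and woken at absolute deadlines with clock_nanosleep; it illustrates the mechanism only, is not code from DABC, and the 100 µs slot length is an arbitrary example value.

    // Minimal POSIX sketch (not DABC code): give the calling thread a real-time
    // priority and sleep until absolute deadlines, as needed for scheduled
    // data transfer on a PREEMPT_RT kernel with high-resolution timers.
    #include <pthread.h>
    #include <sched.h>
    #include <time.h>
    #include <cstdio>

    static void make_realtime(int priority) {
        sched_param param{};
        param.sched_priority = priority;             // e.g. 50 under SCHED_FIFO
        if (pthread_setschedparam(pthread_self(), SCHED_FIFO, &param) != 0)
            std::fprintf(stderr, "no RT priority (missing privileges?)\n");
    }

    static void sleep_until_next_slot(timespec& deadline, long period_ns) {
        deadline.tv_nsec += period_ns;               // advance the absolute deadline
        if (deadline.tv_nsec >= 1000000000L) {
            deadline.tv_nsec -= 1000000000L;
            ++deadline.tv_sec;
        }
        // Absolute sleep on the monotonic clock; with an RT priority and the
        // high-resolution timer the wake-up error is only a few microseconds.
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &deadline, nullptr);
    }

    int main() {
        make_realtime(50);
        timespec next{};
        clock_gettime(CLOCK_MONOTONIC, &next);
        for (int slot = 0; slot < 1000; ++slot) {
            sleep_until_next_slot(next, 100000);     // 100 µs time slots (example)
            // ... send the buffers assigned to this time slot ...
        }
        return 0;
    }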
Performance measurements: InfiniBand
Bandwidth achieved by DABC event building on 4 nodes at GSI (non-synchronized mode, left graph): the result of a plain test program is included for comparison, and with buffers larger than 32 KB DABC delivers nearly the bandwidth of that test program. Gigabit Ethernet delivers 70 MByte/s. Four data sources per node generate data; event building on the receiver nodes only checks the completeness of the data.

On the IWF cluster the scaling of the bandwidth with the number of nodes has been measured from 5 to 23 nodes (right graph). The rates are plotted normalized to 5 nodes (100%). The rates drop to 83% for non-synchronized traffic and large buffers; with 8 KB buffers the rate stays at nearly 100%. The fourth measurement shows the effect of synchronizing the traffic: the performance for large buffers on 23 nodes is even better than the non-synchronized performance on 5 nodes. All nodes send to all nodes (bi-directionally), and the data rates are given per direction. In synchronized mode the senders send data according to a time schedule that avoids conflicts at the receivers; without synchronization all senders send round robin to all receivers and the network must handle the collisions.

The measurements have been performed at GSI (4 machines with two Opterons each, 2.2 GHz, non-synchronized, left graph) and at Forschungszentrum Karlsruhe (IWF) on a cluster with 23 machines, each with two dual-core Opterons (2.2 GHz)*, and a SilverStorm (QLogic) switch. On the 4 nodes at GSI there is no difference between synchronized and non-synchronized mode.
* Many thanks to Frank Schmitz and Ivan Kondov

Using the real-time features of Linux
Linux kernel 2.6.19 with RT patches (the high resolution timer is now included in 2.6.21):
- With two threads doing condition ping-pong and one "worker" thread, one turn takes 3.2 µs.
- With the high resolution timer and hardware priority, a sleeping thread gets control back after 10 µs.
- Synchronized transfer with microsleep achieves 800 MByte/s.

[Figure: Java GUI (DIM client)]
[Figure: DABC data flow components — modules with input and output ports, memory management, data queues, and Transport and Device components connecting ports locally or across the network]
[Figure: DABC thread model — a thread runs 1-n modules; an ActionEvent manager dispatches data-ready events and commands to the module functions processInput and processCommand; Device threads handle the Transports and data queues]
[Figure: Frontend hardware setup — FE: front-end board (4*4 sampling ADCs, clock distribution); DC: data combiner board (clock distribution to the FEs); DD: data dispatcher board (PCI express card); 8 x 2.5 GBit/s bi-directional (optical) links for data and clock; 4 x 2.5 GBit/s data links; GE: Gigabit Ethernet; IB: InfiniBand; 8-20 dual quad-core PCs perform bi-directional event building over the builder and process networks]

The right figure shows the components of the DABC data flow mechanism. The basic object is a module, which gets access to data buffer queues through input ports, processes the buffers, and passes them on through output ports. The data queues on each node are controlled by a memory management component which allows modules running on the same node to process buffers without copying; in this case the Transport component propagates only references. If a port is connected to a remote port, the Transport and Device components perform the transfers. Module components can either run in separate threads to utilize multiple CPUs, or in one thread, which is faster because no module synchronization by mutexes is needed (left figure). The data processing functions of a module are called by an ActionEvent manager. The data flow is controlled by the buffer resources (queues) handled through the Transports. Once a queue buffer is filled, the Transport sends a signal to the ActionEvent manager, which in turn calls the processInput function of the module associated through the port. This function has access to the data buffers through the port (figure above).
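The module and port scheme described above can be condensed into a short sketch. The class and method names below (Module, Port, Buffer, ProcessInput, ProcessCommand) merely mirror the description and are not the actual DABC API; the queue and memory-pool handling are simplified assumptions.

    // Illustrative sketch (hypothetical API, not the real DABC classes):
    // a module receives buffer references through an input port, processes
    // them without copying, and forwards them through an output port.
    #include <cstddef>
    #include <deque>
    #include <memory>
    #include <string>
    #include <utility>

    struct Buffer {                        // reference to a slice of pooled memory
        std::shared_ptr<char[]> data;      // owned by the node-local memory pool
        std::size_t length = 0;
    };

    class Port {                           // bounded buffer queue attached to a module
    public:
        bool HasBuffer() const { return !queue_.empty(); }
        Buffer Take() { Buffer b = std::move(queue_.front()); queue_.pop_front(); return b; }
        void Put(Buffer b) { queue_.push_back(std::move(b)); }  // filled/drained by a Transport
    private:
        std::deque<Buffer> queue_;
    };

    class Module {
    public:
        Port input;                        // connected to Transports by the framework
        Port output;

        // Called by the action-event manager when a Transport has signalled
        // that a buffer in the input queue is ready.
        virtual void ProcessInput() {
            if (!input.HasBuffer()) return;
            Buffer buf = input.Take();
            // ... sort / tag / filter / build events on buf ...
            output.Put(std::move(buf));    // only the reference moves, no copy
        }

        // Called by the action-event manager when a command arrives,
        // e.g. from a controls framework or the GUI.
        virtual void ProcessCommand(const std::string& cmd) { (void)cmd; }

        virtual ~Module() = default;
    };

Because only buffer references are passed between ports, modules on the same node exchange data without copying; whether several modules share one thread or each runs in its own thread is then purely a scheduling decision, with the trade-off described above.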
Similarly, commands can be sent to the ActionEvent manager, which calls the associated module function processCommand. The GUI is built up dynamically from the information delivered by the DIM servers running in the application threads on all nodes.
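The status information for the GUI is published through DIM. The minimal server sketch below uses the standard DIM C++ classes (DimService, DimServer); the service and server names are invented for illustration and the actual DABC services will differ.

    // Minimal DIM server sketch (invented service names, not the actual DABC
    // services): each node publishes status values as DIM services, and the
    // Java GUI subscribes to them as a DIM client to build itself up dynamically.
    #include <dis.hxx>      // DIM server classes: DimService, DimServer
    #include <unistd.h>     // sleep()

    int main() {
        int eventRate = 0;                            // value exported to the GUI
        // "DABC/NODE01/EventRate" is an invented service name.
        DimService rateService("DABC/NODE01/EventRate", eventRate);
        DimServer::start("DABC_NODE01");              // register with the DIM name server

        for (;;) {
            ++eventRate;                              // stand-in for reading real counters
            rateService.updateService();              // push the new value to subscribers
            sleep(1);
        }
    }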

