
1 Modeling of the architectural studies for the PANDA DAT system
K. Korcyl 1,2, W. Kuehn 3, J. Otwinowski 1, P. Salabura 1, L. Schmitt 4
1 Jagiellonian University, Krakow, Poland
2 Cracow University of Technology, Krakow, Poland
3 Justus-Liebig-Universität Giessen, Giessen, Germany
4 GSI, Darmstadt, Germany

2 Outline
- The PANDA experiment and TDAQ system
- Architecture proposal and operation of basic components
  - Detector Concentrator Board organization
  - L1 processing node
- Model of the architecture
- Preliminary results
- Conclusions

3 Detector and DAT requirements
- interaction rate: 10 MHz
- raw data flow: 40-80 GB/s
- typical event size: 4-8 kB
- no hardware trigger signal
- continuously sampling front-end electronics
- flexibility in the choice of triggering algorithms
- cost efficiency (COTS components)

4 PANDA DAT architecture
[Block diagram: Detector Front-end data flow into Detector Concentrator Boards, then through a switch to the L1 farm (L1out); further switches connect the L2 farm (L2out) and the L3 farm (L3out).]

5 Detector Front End Electronics
- receives a precise synchronous clock signal from the central distribution system
- operates in continuously sampling mode, capable of autonomous hit detection
- stamps data with the interaction time derived from the central clock and forwards the message towards the DCB (a sketch follows below)
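
A minimal C++ sketch of this behaviour, with hypothetical names and a threshold-based hit detector standing in for the real front-end logic:

    // Sketch only: a front-end channel samples continuously, detects hits
    // autonomously, and stamps each hit with the centrally distributed clock.
    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct HitMessage {
        uint64_t timeStamp;   // interaction time from the central clock
        uint16_t channel;     // front-end channel that saw the hit
        uint16_t amplitude;   // sampled pulse height
    };

    class FrontEndChannel {
    public:
        FrontEndChannel(uint16_t id, uint16_t threshold) : id_(id), threshold_(threshold) {}

        // Called once per sample of the continuously running digitizer;
        // 'clock' is the synchronous counter from the central distribution.
        void sample(uint64_t clock, uint16_t adc, std::vector<HitMessage>& toDCB) {
            if (adc > threshold_)                      // autonomous hit detection
                toDCB.push_back({clock, id_, adc});    // stamp and forward to the DCB
        }
    private:
        uint16_t id_, threshold_;
    };

    int main() {
        FrontEndChannel ch(7, 100);
        std::vector<HitMessage> link;              // stands in for the link to the DCB
        uint16_t trace[] = {20, 30, 250, 40};      // one sample above threshold
        for (uint64_t t = 0; t < 4; ++t) ch.sample(t, trace[t], link);
        std::cout << "hits sent to DCB: " << link.size() << "\n";
    }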

6 Detector Concentrator Board
[Block diagram: detector data arrive with a time stamp and are written to a paged main memory; a free-pages FIFO supplies page addresses (FIFO empty = board busy); a feature extractor feeds the L1out port; L1 and L2 bookkeeping structures hold page addresses per time stamp; timeout processes driven by the central clock react to L1-YES (keep), L2-NO (drop) and L3 release messages via the L1out/L2out/L3out ports.]

7 Detector Concentrator Board (2)
- The local filtering process (feature extraction) starts after DCB_INSPECT_LATENCY and may generate a message to the LVL1 farm.
- The L1 structure is cleared by a purge process running at the TIME STAMP rate with DCB_PURGE_LATENCY: addresses of pages found in L1 at the DCB_PURGE_LATENCY position are returned to the free-pages FIFO, so their detector data will be overwritten.
- A positive LVL1 decision saves the detector data by moving the page address from the L1 to the L2 structure (see the sketch below).
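
A compact sketch, under our own naming, of the page bookkeeping described above (free-pages FIFO, purge at DCB_PURGE_LATENCY, promotion from L1 to L2 on a positive decision):

    #include <cstdint>
    #include <deque>
    #include <map>

    // Sketch only: page addresses cycle through a FIFO; the L1/L2 structures
    // map a time stamp to the page holding its detector data.
    class DCBPageManager {
    public:
        explicit DCBPageManager(int nPages) {
            for (int p = 0; p < nPages; ++p) freePages_.push_back(p);
        }
        // Store incoming detector data under its time stamp.
        bool store(uint64_t timeStamp) {
            if (freePages_.empty()) return false;   // FIFO empty means board busy
            l1_[timeStamp] = freePages_.front();
            freePages_.pop_front();
            return true;
        }
        // Purge process running at the TIME STAMP rate, DCB_PURGE_LATENCY behind.
        void purge(uint64_t now, uint64_t purgeLatency) {
            while (!l1_.empty() && l1_.begin()->first + purgeLatency < now) {
                freePages_.push_back(l1_.begin()->second);  // page back to the FIFO
                l1_.erase(l1_.begin());                     // data will be overwritten
            }
        }
        // Positive LVL1 decision: keep the data by moving the page to L2.
        void acceptL1(uint64_t timeStamp) {
            auto it = l1_.find(timeStamp);
            if (it != l1_.end()) { l2_[it->first] = it->second; l1_.erase(it); }
        }
    private:
        std::deque<int> freePages_;
        std::map<uint64_t, int> l1_, l2_;   // time stamp -> page address
    };

    int main() {
        DCBPageManager dcb(4);
        dcb.store(100);        // detector data for time stamp 100 occupies a page
        dcb.acceptL1(100);     // positive decision: page promoted to L2
        dcb.purge(1000, 50);   // anything still in L1 this old would be recycled
    }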

8 Detector Concentrator Board (3)
- The DCB uses the TIME STAMPs to calculate the address of the LVL1 processing node to send the features to.
- N consecutive TIME STAMPs are directed to the same L1 port, which neutralizes the quantization of time into TIME STAMPs.
- N, the number of TIME STAMPs sent to each destination, is a parameter (minimum: 3).
- The DCBs therefore change the LVL1 destination port every (TIME STAMP width [ns]) * N nanoseconds.
- This allows the architecture to be rescaled and avoids overflowing the switch output ports (see the sketch below).
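
Because the destination is a pure function of the time stamp, every DCB computes the same LVL1 port without exchanging control messages. A sketch, with N and the port count as the tunable parameters:

    #include <cstdint>
    #include <iostream>

    // Sketch only: N consecutive time stamps map to the same LVL1 port, so
    // the destination changes every (time stamp width) * N nanoseconds.
    int l1Port(uint64_t timeStamp, int n, int nPorts) {
        return static_cast<int>((timeStamp / n) % nPorts);
    }

    int main() {
        const int N = 3;        // minimum per the slide; raise it to rescale
        const int PORTS = 40;   // LVL1 processing nodes in the model
        for (uint64_t ts = 0; ts < 8; ++ts)
            std::cout << "stamp " << ts << " -> port " << l1Port(ts, N, PORTS) << "\n";
    }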

9 LVL1 processing node
Inputs: Detector Concentrator data and neighbour LVL1 data (a copy of DCB data).
Three timeout-based processes run at the TIME STAMP rate:
- open latency: allows storage of the data
- neighbour latency: sends a message to the neighbour LVL1
- close latency: closes storage and starts filtration
A sketch of the three timeouts follows below.
[Block diagram: paged main memory with a free-pages FIFO (empty = busy); time-stamped buffers (buffer0/buffer1/buffer2) are opened and closed by the central clock and hand page addresses to the LVL1 processing node.]
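
A toy illustration of the three ordered timeouts, with made-up latency values; the real node drives one such counter per time slice at the TIME STAMP rate:

    #include <cstdint>
    #include <iostream>

    // Sketch only: open < neighbour < close, all measured from the slice time.
    struct L1SliceTimeouts {
        uint64_t openLatency, neighborLatency, closeLatency;

        void tick(uint64_t now, uint64_t slice) {
            if (now == slice + openLatency)
                std::cout << "slice " << slice << ": open buffer, accept data\n";
            if (now == slice + neighborLatency)
                std::cout << "slice " << slice << ": copy data to neighbour LVL1\n";
            if (now == slice + closeLatency)
                std::cout << "slice " << slice << ": close buffer, start filtering\n";
        }
    };

    int main() {
        L1SliceTimeouts t{2, 4, 6};                      // illustrative latencies
        for (uint64_t now = 0; now <= 6; ++now) t.tick(now, 0);
    }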

10 L1 operation
- L1 uses a sliding window of 3 TIME STAMPs to concatenate data originating from the same interaction (example: N = 8, so the window spans TIME STAMP - 2 to TIME STAMP + 7).
- The close timeout analyses 3 adjacent TIME SLICES, synchronous with the TIME SLICE rate and with a preprogrammed latency time.
- TIME STAMP - 2 and TIME STAMP - 1 are received from the neighbour L1 process port with "earlier times".
- (+) Copying data across L1 access points aims to address the L1 segmentation and boundary problems.
- (-) It duplicates data.
A sketch of the window concatenation follows below.
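
A sketch of the window concatenation (our simplification: a hit is just an int):

    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <vector>

    // Sketch only: when slice s closes, data from s-2, s-1 and s are analysed
    // together, so an interaction straddling a slice boundary is seen whole.
    using Slice = std::vector<int>;

    Slice concatWindow(const std::map<uint64_t, Slice>& slices, uint64_t s) {
        Slice out;
        for (uint64_t k = s - 2; k <= s; ++k) {
            auto it = slices.find(k);
            if (it != slices.end())
                out.insert(out.end(), it->second.begin(), it->second.end());
        }
        return out;
    }

    int main() {
        std::map<uint64_t, Slice> slices{{3, {10}}, {4, {11, 12}}, {5, {13}}};
        std::cout << "window at slice 5 holds "
                  << concatWindow(slices, 5).size() << " hits\n";
    }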

11 L2 operation
- On reception of the event info (plus results from processing) from the L1 processor, the L2 may request additional data from the detector concentrators, referencing the event number (unicast requests - PULL architecture).
- L2 processing may become a sequential procedure, requesting more data from various detectors in the course of verifying physics hypotheses; the L2 latency can therefore vary with the data (see the sketch below).
- Negative L2 decisions are broadcast to the DCBs.
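
A sketch of the data-driven pull loop; the transport stubs and the evidence threshold are ours, not the real protocol:

    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct EventData { int size = 0; };

    // Stubs standing in for the real unicast transport and broadcast.
    EventData pullFromDCB(int dcb, uint64_t evt) { return EventData{dcb + 1}; }
    void broadcastReject(uint64_t evt) { std::cout << "reject evt " << evt << "\n"; }

    // Sketch only: more detectors are queried only while the hypothesis is
    // undecided, which is why the L2 latency varies event by event.
    bool l2Filter(uint64_t evt, const std::vector<int>& dcbs) {
        int evidence = 0;
        for (int dcb : dcbs) {
            evidence += pullFromDCB(dcb, evt).size;   // unicast, by event number
            if (evidence > 3) return true;            // hypothesis confirmed
        }
        broadcastReject(evt);                         // negative decision to all DCBs
        return false;
    }

    int main() { std::cout << l2Filter(42, {0, 1, 2}) << "\n"; }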

12 EB operation
- For positively annotated events, the L2 processor sends the event info plus processing results to the EB (L3) processor for the last stage of filtering and event building.
- On reception of the event info, the EB processor makes a series of unicast requests to ALL detector concentrators for the event data (this avoids overflow from spontaneous replies).
- After collecting replies from ALL detectors, the L3 processor broadcasts a RELEASE message for the event to all DCBs, performs event building and sends the data to permanent storage (see the sketch below).
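
A gather-then-release sketch with stand-in transport functions:

    #include <cstdint>
    #include <iostream>
    #include <vector>

    struct Fragment { int dcb; int bytes; };

    // Stubs only; the real system sends these over the Ethernet switches.
    Fragment requestFragment(int dcb, uint64_t evt) { return {dcb, 512}; }
    void broadcastRelease(uint64_t evt) { std::cout << "RELEASE evt " << evt << "\n"; }

    // Sketch only: one unicast request per DCB avoids a burst of spontaneous
    // replies; RELEASE is broadcast only after all fragments have arrived.
    void buildEvent(uint64_t evt, int nDCBs) {
        std::vector<Fragment> event;
        for (int dcb = 0; dcb < nDCBs; ++dcb)
            event.push_back(requestFragment(dcb, evt));
        // ... last filtering stage and write to permanent storage ...
        broadcastRelease(evt);                        // DCBs may now reuse the pages
    }

    int main() { buildEvent(42, 40); }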

13 The PANDA DAT model
Uses SystemC, a discrete event simulation platform. The model of the architecture:
- creates a physics generator (exponentially distributed inter-event times, 100 ns mean)
- creates 5 detectors with various data sizes
- creates 40 DCBs (8 DCBs per detector)
- creates 40 LVL1 processing nodes, 40 LVL2 processors and 40 EB processors
- creates 3 Ethernet switches
- connects all the components with 1 Gbps Ethernet links
A SystemC sketch of the generator follows below.
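
How the generator might look in SystemC (requires the SystemC library; the module and parameter names are ours, not the actual model's):

    #include <systemc>
    #include <cstdint>
    #include <random>
    using namespace sc_core;

    // Sketch only: inter-event gaps drawn from an exponential distribution
    // with 100 ns mean, i.e. the 10 MHz interaction rate.
    SC_MODULE(PhysicsGenerator) {
        SC_CTOR(PhysicsGenerator) : rng_(1234), gap_(1.0 / 100.0) {
            SC_THREAD(generate);
        }
        void generate() {
            for (uint64_t evt = 0;; ++evt) {
                wait(sc_time(gap_(rng_), SC_NS));   // exponential inter-event time
                // ... emit event 'evt' to the 5 detector models here ...
            }
        }
        std::mt19937 rng_;
        std::exponential_distribution<double> gap_;  // rate 0.01/ns -> mean 100 ns
    };

    int sc_main(int, char*[]) {
        PhysicsGenerator gen("gen");
        sc_start(sc_time(1, SC_US));    // simulate one microsecond of beam
        return 0;
    }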

14 The model – DCB Buffer Occupancy

15 The model – Filtering Latency
Time measured at the DCB between the arrival of the detector data and the LVL1, LVL2 and EB decisions.

16 The model - Latency Variation Sufficient processing resources (number of CPUs) installed at the LVL1 processing node guarantee deterministic latency and lossless operation

17 The model – LVL1 Throughput Scaling The number of time stamps per LVL1 processing node and the time stamp width are the key parameters which allow the architecture to scale with the amount of data produced by the DCBs

18 The model – LVL1 Throughput Scaling (continued)

19 What we did so far
- We proposed an architecture for a continuously sampling data acquisition system for PANDA.
- We built a behavioural model to evaluate and understand the impact of the key architectural parameters on performance.
- The architecture meets the requirements:
  - operates at a 10 MHz interaction rate
  - allows data correlation based on time stamps
  - offers flexibility for a wide range of filtering algorithms
  - scales with increasing amounts of data to be transferred and processed - this still has to be proven and is now our main focus

20 Scalability studies
1. The generator assigns a number to each interaction.
2. It introduces a random delay of 200 ns per detector and stamps the data with the delayed time.
3. It delivers the delayed messages to the DCBs (each detector has 8 DCBs).
4. Each DCB randomly selects 10% of the messages and sends them to the LVL1 nodes.
5. LVL1 sorts the data and stores it according to the time stamp.
6. LVL1 analyses messages from 3 adjacent time slices.
7. If the sum of messages reaches a criterion, the contents of the 3 time slices are histogrammed against the interaction number; if the number of collected messages for an interaction number meets another limit, the event is considered interesting and not previously recognized, and is assigned to a CPU for processing. The earliest time stamp with a message belonging to the given interaction defines the INTERACTION TIME STAMP.
A toy version of steps 6-7 is sketched below.
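
A toy C++ version of steps 6-7, with the two thresholds as hypothetical 'criterion' parameters:

    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <vector>

    struct Msg { uint64_t timeStamp; uint64_t interaction; };

    // Sketch only: histogram a 3-slice window against the interaction number;
    // the earliest stamp seen defines the INTERACTION TIME STAMP.
    void analyseWindow(const std::vector<Msg>& window, int countCriterion) {
        std::map<uint64_t, int> histo;             // interaction -> message count
        std::map<uint64_t, uint64_t> firstStamp;   // interaction -> earliest stamp
        for (const Msg& m : window) {
            ++histo[m.interaction];
            auto it = firstStamp.find(m.interaction);
            if (it == firstStamp.end() || m.timeStamp < it->second)
                firstStamp[m.interaction] = m.timeStamp;
        }
        for (const auto& [evt, n] : histo)
            if (n >= countCriterion)
                std::cout << "interaction " << evt << " accepted, INTERACTION TIME STAMP "
                          << firstStamp[evt] << "\n";
    }

    int main() {
        std::vector<Msg> window{{100, 7}, {101, 7}, {102, 8}, {100, 7}};
        analyseWindow(window, 3);   // interaction 7 has 3 messages -> accepted
    }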

21 Scalability studies


