Presentation is loading. Please wait.

Presentation is loading. Please wait.

BonnDAQ Status and upgrade possibilities

Similar presentations


Presentation on theme: "BonnDAQ Status and upgrade possibilities"— Presentation transcript:

1 BonnDAQ Status and upgrade possibilities
Florian Lütticke On behalf of the DEPFET Collaboration

2 BonnDAQ in the experiment Structure of BonnDAQ Support software
Contents BonnDAQ in the experiment Structure of BonnDAQ Support software Scaling for Belle Operation File format These slides are documentation for BonnDAQ as of

3 Current Idea for BonnDAQ
HLT x 8 DHE DHE ONSEN EVB DHE Express Reco DHE DHC DHE Simplified BonnDAQ Subset of events Standalone DAQ running without HLT for PXD Calibration data PXD DQM on full data (no ROI) measuring ROI efficiency, Slow Pion rescue Proven useful during 2016 April test beam Currently: standard lab tool for measurements

4 Current status of BonnDAQ (05/2016)
Keyboard Command Decoder TCP Command Server CMD Client(s) CMD Client(s) Thread Main Thread dh<x>_receiver.cpp Startup options Setup of data structures Cleanup CMD Client(s) Data structure Control Statistics/ Health Write/Control Read/React Read/React DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread DHE/DHC Buffer 1 Buffer 2 Socket File Data listener(s) Compression thread pool Data File Error log file Listener Compression thread pool Listener Compression thread pool

5 Current status of BonnDAQ
Main Thread (dh<x>_receiver.cpp) Startup options: --file, -f [string] use special file name, autostart --fileprefix [string] file prefix (default: Run) --path, -p [string] path to data folder (default: DATA) -- runnumber, -r [int] Use given runnumber (default: take from file) --inputport, -I [int] Port, where DH<x> data is received (default: 6000) --modid, -m [int] Select ModID to put into header (not implemented) (default: 11) --nokeyboard Do not start keyboad thread CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File

6 Current status of BonnDAQ
Keyboard Command Decoder (cmd_thread.cpp) React to keys pressed by user Quit [q], Pause (stop receiving packets) [p] Echo (debug output 1) [e] raw mode (print all packets, do nothing else) [r] More debug output [1] [2] (udpecho, tcpecho) To be discontinued soon… (Disabled by “--nokeyboard“) (defines.h) Controls for the Software Inter thread communication, Startup options Status bits: Run started, Debug Modes, Currrent State, DHE/DHC, Queues to be used, Ports to be used, etc… CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File Additional debug output switchable but deprecated! Debug in terminal only, not in Belle II logger. No debug level implemented Status bits currently without mutexes Control

7 Current status of BonnDAQ
TCP Command Server CMD Client(s) (cmd_thread.cpp) TCP server, Clients connect & give commands Commands are terminated with \n, case ignored “cmd start”, “cmd stop”: Start/Stop run, returns running/stopped message, runnumber is incremented. “cmd exit”: Exit daq “cmd stats”: get JSON serialized statistics “cmd status”: get running/stopped message “cmd evb runnum”: Set runnumber for run (default from file) “cmd evb prefix”: Set prefix for run (default “Run”) “cmd evb path”: Set runnumber for run (default “DATA”) “cmd evb filename”: use special filename, do not use above “cmd fr echo/tcpecho/udpecho/rawmode on/off” debug, see keyboard thread CMD Client(s) CMD Client(s) CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File

8 Current status of BonnDAQ
TCP Command Server CMD Client(s) (cmd_thread.cpp) TCP server, Clients connect & give commands Not recognized commands are distributed to everyone in the network Yes, you can misuse it as a chat program (utils.cpp) All threads write statistics/status in here, “cmd stats” queries it. Statistics gets serialized using JSON Thread save as long as threads stick to their own variables. CMD Client(s) CMD Client(s) CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File Statistics/ Health No tcp client implementation yet Fragile implementation: Clients need to read messages. Does not handle multi message packets yet („cmd start\n cmd stop\n“ would just start) Stats not complete yet

9 Current status of BonnDAQ
DHE/DHC UDP Data Socket. Times out if no data for 5ms Data arrives as UDP datagrams of (max) 1490 bytes Bigger frames are split, 4 byte header: FF 00 XX XX where XX XX is the packet number (starting with 1) (dh<x>_receiver_thread.cpp) Reads data from DHE/DHC and builds events Receive data, check basic consistency (CRC, sane ordering, trigger number & DHE ID correct in all packages), build (device) event If output buffer is not full, writes events to buffer. Selectable: Check trigger number and insert empty events if needed. CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File DHE/DHC Receiver Thread Performance: Data is copied on receive DHE: Big endian, 16 bit. – data is converted to little endian What to do if buffer full and trigger number checking is selected?

10 Current status of BonnDAQ
Error log file (utils.cpp) Log warning messages and the offending packets At each start up a log file is created (/tmp/dhh_daq/daq_errors<number>.txt). Log files contains header „DAQ Error Log Tue May 17 16:12: ” Error message with number of the offending packet “packet 40: This is a error” Lines containing packets before and after the offending one. Time (usec), relative packet number: data in hex: CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File Performance: Writing to file using streams (slow) Lots of errors: could slow down system Profiling needed BUT: many errors -> slow down is a minor issue, exept when it is not! packet 40: This is a error 16:12: , -1: e6 5e c f d a 00 00 16:12: , 0: 00 e6 5e c f d a fe 7a 16:12: , 1: e6 5e c f

11 Current status of BonnDAQ
Buffer 1/ Buffer 2 Internal buffer for events Currently: max 500 frames (raw frames are huge, 500 ZS frames are just 10 ms) (event_client.cpp) Data listeners Extension of TCP Socket. Guarantees non-blocking Large blocks will be written in several steps if TCP buffer is small or full. Internal buffer for rest of event Keeps track of receivers perceived run state Built in functions for BOR, EOR and Info events Buffer 1 Buffer 2 CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File Data listener(s) Listener Listener

12 Current status of BonnDAQ
Data Distribution Server (data_server_thread.cpp) Server for data listeners Handles connect/disconnect, full buffers Reads events from receiver thread and sends to file thread after distribution 3 request schemes: Single event per request continuous stream of events exclusive stream of events (blocking if requester is not fast enough) Multiple receiver possible (max 2k) Receiver class available (receiveClient.cpp) init_connection(); subscribe(); event_get(); CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File Known Issues When start/stop happen fast (and reading is slow), system could omit EOR and BOR No high load tests done with multiple listeners

13 Current status of BonnDAQ
File Writer Thread (file_thread.cpp) Write events to disc Compressed or uncompressed File in BonnDAQ Format File can be given by filename If not, creates (DATA/RunXXXX-YYY.dat) Fileprefix (Run) can be changed Path (DATA/) can be changed Runnumber (XXXX) taken from runnumber file in the path (created with 0 if not available) Runnumber can be changed Changing works over command line at startup or over TCP command. new sub file (YYY) created every 2GByte CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File Larger sub files not possible Files with names capped at 2 GB Not tested under high load (e.g. 100MB/sec)

14 Current status of BonnDAQ
Compression thread pool (compressed_file_writer.cpp) Compress & write to file Buffer data and compress when enough is available Non-blocking implementation: When buffer is compressed a second stores data New file can be started while the last is being finished Blocking functions exist (e.g. close & block until written) Multi-threaded compression Algorithms: Blosc (very fast), lz4 (fast) or zlib (good) Class to read (and write) compressed files available (compressed_file.cpp) User does not notice compression. Both uncompressed and compressed look the same for user. Compression thread pool Compression thread pool CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File Worst Case: Data not compressible: Max Overhead of 20B/2MB chunk (0.001%)

15 BonnDAQ & pyDepfetReader
Data File pyDepfetReader: python extention Written in Cython & C++ Interpretation of both DHE and DHC data Transparent: User does not notice Data given as numpy array Raw: Shape is Row, Col ZS: Shape is Firing Pixel, 4 Multiple words of [Col, Row, Value, CM] Can select single or all ASICS and/or DHEs Single DHE requested: Return is [Data, Datatype, additional flags] All DHEs requested: List of DHEs [Name, Data, Datatype, additional flags] plus a DHC trigger (if available) CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File

16 BonnDAQ & pyDepfetReader
Data File pyDepfetReader.FileReader Reads BonnDAQ files and interprets data Datatype is true (raw) or (false) Number of events can be read from file Can select to only read ZS or raw events – no data type given in this case pyDepfetReader.TcpReader Connects to BonnDAQ, receives and interprets live data Datatype: TcpReader.{EMPTY_FRAME, BAD_FRAME, BOR_FRAME, EOR_FRAME, INFO_FRAME, ZS_FRAME, RAW_FRAME} CMD Client(s) TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Control DHE/DHC Listener Data listener(s) Error log file Compression thread pool Data File (Open) Issues Different datatype in File and TCP. (historical) Change and make uniform? Some Information lost ZS data not sorted, stays as received

17 BonnDAQ: EPICS Integration status
DAQ Health module (python + CS-Studio) Shows status of threads, filename, file size, connected clients (for live data), buffer fill status, error counts etc. Creates EPICs server, requests DAQ status and publishes EPICS variables. OPI display available Used in April testbeam Trigger mismatch display bad Start/Stop buttons sometimes crash. Stability testing needed Sometimes does not reconnect when BonnDAQ is restarted

18 BonnDAQ: EPICS Integration status
DQM display (python + CS-Studio) Uses pyDepfetReader.TcpReader Gets live data and calculates hit map, Seed/Cluster histogram, pixel occupancy histogram, common mode histogram, event size histogram Creates EPICs server and publishes results OPI display available Used in April 2016 testbeam ~500 frames/sec two PXD modules, single thread ~25 frames/sec for 40 modules Currently can only handle 1 DHC (5 DHEs) OPI Display has hardcoded DHE IDs Clustering only in space not in time. Single threaded

19 Scaling BonnDAQ for Belle II
CMD Client(s) DAQ Layout: Keyboard Command Decoder TCP Command Server BonnDAQ instance Control Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Buffer 2 DHE/DHC Compression thread pool Compression thread pool Compression thread pool Error log file Data listener(s) Data File

20 BonnDAQ for Belle II – multiple instances
TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Buffer 2 Control DHE/DHC Data listener(s) Error log file CMD Client(s) Compression thread pool Data File BonnDAQ instance TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Buffer 2 Control DHE/DHC Data listener(s) Error log file CMD Client(s) Compression thread pool Data File BonnDAQ instance TCP Command Server Keyboard Command Decoder Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread Buffer 1 Buffer 2 Control DHE/DHC Data listener(s) Error log file CMD Client(s) Compression thread pool Data File BonnDAQ instance Start multiple instances Easy Distributable on multiple machines Same run but multiple Data files CMD clients Log files data listeners DAQ not synchronized Hard to get all data from one event

21 BonnDAQ for Belle II – multiple threads
CMD Client(s) Start multiple receiver threads for each DHC Single control „Single“ statistics Easy to control Same run but multiple Data files Log files data listeners DAQ not synchronized Hard to get data from one event Keyboard Command Decoder TCP Command Server BonnDAQ instance Control Statistics/ Health Statistics/ Health Statistics/ Health DHE/DHC Receiver Thread Data Distribu-tion Server File Writer Thread DHE/DHC Receiver Thread Buffer 1 Data Distribu-tion Server File Writer Thread DHE/DHC Receiver Thread Buffer 2 Buffer 1 Data Distribu-tion Server Buffer 2 File Writer Thread Buffer 1 Buffer 2 DHE/DHC DHE/DHC DHE/DHC Compression thread pool Compression thread pool Compression thread pool Compression thread pool Compression thread pool Compression thread pool Compression thread pool Compression thread pool Compression thread pool Error log file Data listener(s) Data File Error log file Data listener(s) Data File Error log file Data listener(s) Data File

22 BonnDAQ for Belle II – integrated threads
Create multiple DH<x> threads and build events Easy to control, one file for all data, full online events. Needs work (and testing) for implementation Needs powerfull machine CMD Client(s) Keyboard Command Decoder TCP Command Server BonnDAQ instance Control Statistics/ Health DHE/DHC Receiver Thread Event building and sorting Data Distribu-tion Server File Writer Thread DHE/DHC Receiver Thread DHE/DHC Receiver Thread Buffer 1 Buffer 3 Buffer 1 Buffer 2 Buffer 1 DHE/DHC DHE/DHC DHE/DHC Compression thread pool Compression thread pool Compression thread pool Error log file Data listener(s) Data File Error log file Error log file

23 Worst case performance considerations
10k-30k frames per second 8 DHCs: 8*Start, 8*Stop = 16 Packets/frame 40 DHEs: 8*Start, 8*Stop = 80 Packets/frame 160 DHPs: 2-3 packets from DHP frames: Packets/frame Total: 5-15M Ethernet packets per second: Can work, but probably not with OS memory management. (malloc/free need ~50-100ns) 700k-2M packets per Ethernet port – possible? – 700k yes, 2M probably hard. Check if compression works for “real” data and gives advantage Check if 1Gbyte/sec can be compressed Check if 1Gbyte/sec can be written to disc (raid has limitations) Server: 8+1* 1Gbit Ethernet for DHC and User 1 * 10Gbit Ethernet for Raid CPU: 8 cores for receiver (+8 for Network hashes?) 1 core for event builder, 1-2 cores for CRC, 1 core for file writer, 1 core for the rest (data distribution, TCP commands) 8-16? cores for compression Mem: 8 * 512MB input buffer 4-8 GB event buffer Some MB for compression UDP size limited These numbers are a feasibility study. We should not aim for 10KHz

24 BonnDAQ File structure – Header Layout
All packets have 8 byte header: Dev ID is data type. DHE (7), DHC (8), Group (0), INFO (14), others for V4, S3B, DHHemulator, Geant4 … Mod ID is the number of module. 16 modules can be in the data stream (e.g. 8 DHCs). Type can be: BOR (0), EOR (1), DATA (2) 2 Flags available 20bit Eventsize (in 4Byte words) including the header. 8Byte - 4MByte Data field (e.g containing trigger number) Dev ID 31:28 Mod ID 27:24 Type 23:22 Flags 21:20 Eventsize (in 4Byte words) 19:0 Trigger number/ Run Number / Events in File 31:0 (Open) Issues What if Dev ID changes? Is it a new module or is it different data for the same module?

25 BonnDAQ File structure – Event Layout
Event Groups Each event can contain multiple devices. A Event group consist of a Group Event and one or several device events. Each device event contains its own size, a group event contains size of itself plus all device events belonging to it. Group Event Dev ID: Group (0) Mod ID: 0 Type: same as following events Flags: 2 Eventsize: 2 (x4 Byte) + payload size (in 4 Byte words) DataField: Trigger number 0x0 31:28 27:24 XX 23:22 0x2 21:20 2 + X 19:0 Trigger number 31:0 (Open) Issues Why Flag=2? Unused! Size limited to 4 MB. Use Flags to get 16 MB? 4 MB ok for single DHC Max Belle2: 7.9 MB (160xRaw (192 gates) ) 11 MB if full (256 gates)

26 BonnDAQ File structure – Info Events
FileType Event: Dev ID: INFO (14) Mod ID: 2 Type: Data (2) Flags: 3 if uncompressed file, 2 if compressed file Eventsize: 2 (x4 Byte) DataField: Events in File (or 0 if DAQ crashed/killed) This is the first event in the file and can be only there RunNumber Event: Mod ID: 1 Flags: None DataField: Current Runnumber Has to be before BOR 0xE 31:28 0x2 27:24 23:22 0x2/0x3 21:20 19:0 Events in File 31:0 0xE 31:28 0x1 27:24 0x2 23:22 0x0 21:20 19:0 Run number 31:0

27 BonnDAQ File structure – BOR/EOR
BOR (Begin Of Run) Event: Dev ID: Same Dev ID as device Mod ID: Same Mod ID as device Type: BOR (0) Flags: Same as device Eventsize: 2 (x4 Byte) DataField: BOR Trigger 0x EOR (End Of Run) Event: Type: EOR (1) DataField: EOR Trigger 0xAAAAAAAA XX 31:28 YY 27:24 0x0 23:22 ZZ 21:20 0x2 19:0 0x 31:0 XX 31:28 YY 27:24 0x1 23:22 ZZ 21:20 0x2 19:0 0xAAAAAAAA 31:0

28 BonnDAQ File structure – Device Events
Dev ID: Dev ID of device (7 for DHE, 8 for DHC) Mod ID: Mod ID of device (11 as default) Type: Data (2) Flags: 0 Eventsize: 2 (x4 Byte) + Payload (in 4 Byte words) DataField: Trigger number Payload in our case: Info Word (4 byte) Multiple DHH frames, each with a old_onsen header XX 31:28 YY 27:24 0x2 23:22 0x0 21:20 0x2+ X 19:0 Trigger Number 31:0

29 DHE has Dev ID 7, DHC has Dev ID 8
DHE and DHC Data DHE has Dev ID 7, DHC has Dev ID 8 Both have an Infoword Temperature: DEPRECATED (0) ver: Version of startgate. DEPRECATED (0) ZS: Zero suppressed (0 = raw data, 1= ZS data) Startgate: DEPRECATED(0) frames/event: DEPRECATED (0) Both consist of sequence of old ONSEN header + DHH Frame MAGIC to check data alignment (32bit) Size of payload (32bit) 2 Empty fields (32bit) N DHH Data words (16bit , N even) Temperature 31:22 ver 21 ZS 20 Startgate 19:0 frames/event 9:0 MAGIC (0xCAFEBABE) 31:0 Size in Byte 0x DHE Frame Word 1 31:16 DHE Frame Word 2 15:0 DHE Frame Word n-1 DHE Frame Word n (Open) Issues Infoword: ZS can be inferred from data Infoword Obsolete – remove? Remove empty fields in ONSEN header?

30 BonnDAQ File structure - example
Dev ID 31:28 Mod ID 27:24 type 23:22 flags 21:20 Size 19:0 Data word, 31:0 Type EventSize (Words) FileType event (uncompressed) 2 RunNumber event 2 BOR Group event 2+2 BOR Device event 1 2 Group event 2+2+X Device event 1 2+X EOR Group event 2+2 EOR Device event 1 2 assuming Run 55 1 DHE ( Dev ID 7, Mod ID11) 1 event in file (trigger NR 9) 0xE 0x2 0x3 1 (Events in File) 0xE 0x1 0x2 0x0 55 (Run number) 0x0 0x2 0x4 0x (BOR Trigger) 0x7 0xB 0x0 0x2 0x 0x0 0x2 0x4 + X 0x9 (Trigger number) 0x7 0xB 0x2 0x0 0x2 + X 0x9 (Trigger number) 0 (temp) 1 0(startgate) 0 (frames) 0xCAFEBABE n/2 Size in Byte 0x DHE Word 1 DHE Word 2 DHE Word n-1 DHE Word n K x 0x0 0x1 0x2 0x4 0xAAAAAAAA (EOR Trigger) 0x7 0xB 0x1 0x0 0x2 0xAAAAAAAA

31 Compression uses the blosc metacompressor
Compressed Data Compression uses the blosc metacompressor 16 byte header 1 byte typesize of data to be compressed (for preconditioner) 1 byte flags (compressor, preconditioner, etc…) 1 byte compression version 1 byte blosc version 4 byte size of uncompressed data 4 byte blocksize (size of each compression block) 4 byte size of compressed data (X) X byte compressed data 4 byte CRC checksum typesize 31:24 Flags 23:16 Version LZ 15:8 Version 7:0 Uncompressed Size 31:0 Blocksize Compressed Size

32 Compressed Data Compressed Files are made of compressed blocks. After uncompressing, data looks like uncompressed BonnDAQ data. Compression is aligned to event boundaries: No event group is in more than one compressed block. Blocksize may vary, depending on received frame Blocksize may be increased to fit huge events For good performance: Blocksize >256kb/thread recommended. (Open) Issues Blosc/LZ4 very fast but very weak (almost no effect on random data) zlib strong but (too) slow Are there better choices (zstd, density?)

33 Compressed File Structure in BonnDAQ
FileType event (compressed) 8 byte Compressed Block 1: Blosc Header byte Data X byte CRC byte Compressed Block 2: Data Y byte

34 Improvements to the data format
Remove Info Word (4 byte/event) Remove empty old ONSEN fields ( byte/event) More efficient header format like new ONSEN Save byte/event more Even more if you go crazy Harder event building Remove DH<X> End of Event Only if no error Saves byte/event Remove all CRC checksum and replace by single one. Remove Trigger number from Device event (is copy from DH<x> start of event) OR Remove DH<x> start of event What Info would be lost? What can be inferred? (Open) Issues Is DHC Start of event needed? Runnumber is known Triggernumber is known Timefields? Else: Skip Trigger field in header Event = 1 DHC event


Download ppt "BonnDAQ Status and upgrade possibilities"

Similar presentations


Ads by Google