
1 Farm Issues L1&HLT Implementation Review
Niko Neufeld, CERN-EP
Tuesday, April 29th

2 Overview
– Requirements
– Architecture
– Protocols & Dataflow
– Latencies
– Implementation

3 Requirements
– Scalable up to several thousand CPUs
– Organised in sub-farms, which perform local load balancing and hide the large number of nodes from the RUs
– Support partitioning (at the sub-farm level)
– Interface to the throttle via the Experiment Control System (ECS)
– Must fit within the cooling and space limits in UX8
– Low latency for data movement, and control over that latency
– Allow concurrent, seamless use by the L1 and HLT algorithms running on a standard OS, while prioritising L1 traffic wherever possible

4 The Event Filter Farm

5 Architecture - Farm

6 Structure
– The (initially) 1200 CPUs are distributed over several sub-farms.
– To minimise the number of inputs from the event-building network, the number of sub-farms is chosen such that the average link load into a sub-farm is close to 110 MB/s.
  – For the minimal system (VELO + TT), this yields ~50 sub-farms.
  – It is also advantageous to minimise the number of sub-farms (while keeping the number of CPUs constant) from the point of view of local load balancing (see later).
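To make the sizing arithmetic explicit, here is a small sketch using only the figures quoted on this slide (1200 CPUs, ~50 sub-farms, ~110 MB/s average per input link); the derived quantities are illustrative, not design values.

```c
/* Illustrative arithmetic only, using the numbers quoted on the slide. */
#include <stdio.h>

int main(void)
{
    const int    total_cpus    = 1200;   /* initial farm size                */
    const int    n_subfarms    = 50;     /* minimal (VELO + TT) system       */
    const double link_load_mbs = 110.0;  /* average input load per sub-farm  */

    printf("CPUs per sub-farm      : %d\n", total_cpus / n_subfarms);
    printf("Aggregate input (MB/s) : %.0f\n", n_subfarms * link_load_mbs);
    return 0;
}
```

With these figures each sub-farm holds roughly 24 CPUs and the event-building network delivers on the order of 5.5 GB/s in total.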

7 Anatomy of a sub-farm
– Each sub-farm consists of a gateway to the event builder, the Sub-farm Controller (SFC), and worker CPUs.
– Each sub-farm handles an aggregate data stream of approximately 2 Gbit/s (one in, one out) (see later).
– The SFC is connected to the worker CPUs by an Ethernet (Layer 2) switch.
– A completely separate network connects the SFC and the worker nodes to the ECS.
– The transport protocol is light-weight, directly on top of Ethernet (or raw IP if necessary); no TCP.
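As an illustration of a light-weight transport directly on top of Ethernet, the sketch below opens a raw AF_PACKET socket on Linux and receives a single frame. The EtherType 0x88B5 and the interface name eth0 are assumptions; the actual LHCb frame format and protocol are not specified here.

```c
/* Minimal raw-Ethernet receive sketch (needs root / CAP_NET_RAW). */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/if_packet.h>
#include <linux/if_ether.h>
#include <net/if.h>
#include <arpa/inet.h>

#define EVT_ETHERTYPE 0x88B5   /* experimental EtherType: an assumption */

int main(void)
{
    /* Raw socket that delivers whole Ethernet frames of our EtherType. */
    int fd = socket(AF_PACKET, SOCK_RAW, htons(EVT_ETHERTYPE));
    if (fd < 0) { perror("socket"); return 1; }

    /* Bind to a specific interface; "eth0" is an assumption. */
    struct sockaddr_ll addr;
    memset(&addr, 0, sizeof addr);
    addr.sll_family   = AF_PACKET;
    addr.sll_protocol = htons(EVT_ETHERTYPE);
    addr.sll_ifindex  = if_nametoindex("eth0");
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("bind"); return 1;
    }

    /* Receive one frame: no TCP/IP stack involved. */
    unsigned char frame[ETH_FRAME_LEN];
    ssize_t n = recv(fd, frame, sizeof frame, 0);
    if (n > 0)
        printf("received %zd-byte frame\n", n);

    close(fd);
    return 0;
}
```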

8 Dataflow
– Completely assembled events are sent to the SFC as (several) raw Ethernet frames.
– The SFC keeps a list of idle worker CPUs and forwards each event to a node.
  – A node buffers only a single L1 event at any time (latency!). When no node is free, the event is buffered in the SFC and accumulates extra latency.
  – A node buffers several HLT events (50 to 100). When all buffers are full, events are buffered in the SFC.
– When the high-water marks of the SFC buffers are reached, a throttle signal is issued via the ECS.
– The worker CPU processes the event and always sends an answer (= decision).
  – For an L1 event the answer is only yes or no, plus a short summary for the L1 decision sorter.
  – For an HLT event a positive answer also contains the raw and reconstructed event data.
– The SFC forwards L1 decisions to the L1 decision sorter and HLT-accepted events to the storage controller.
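A schematic sketch of the SFC dispatch logic described above: an idle-worker list, a FIFO buffer for events that cannot be forwarded immediately, and a throttle signal when a high-water mark is reached. All names, sizes and the threshold are illustrative assumptions; the real SFC moves raw Ethernet frames and signals the ECS.

```c
#include <stdbool.h>
#include <stdio.h>

#define MAX_WORKERS     32
#define SFC_BUFFER_SIZE 256
#define HIGH_WATER_MARK 200            /* assumed throttle threshold */

typedef struct { int event_id; } Event;

static int   idle_workers[MAX_WORKERS];
static int   n_idle;
static Event sfc_buffer[SFC_BUFFER_SIZE];      /* FIFO ring buffer */
static int   buf_head, buf_count;

static void send_to_worker(int w, Event ev)    /* stub: would send frames  */
{ printf("event %d -> worker %d\n", ev.event_id, w); }

static void throttle_via_ecs(bool on)          /* stub: would signal ECS   */
{ printf("throttle %s\n", on ? "ON" : "OFF"); }

/* A fully assembled event arrives from the event-building network. */
void on_event(Event ev)
{
    if (n_idle > 0) {
        send_to_worker(idle_workers[--n_idle], ev);          /* forward at once   */
    } else if (buf_count < SFC_BUFFER_SIZE) {
        sfc_buffer[(buf_head + buf_count++) % SFC_BUFFER_SIZE] = ev;
        if (buf_count == HIGH_WATER_MARK)
            throttle_via_ecs(true);                          /* buffer filling up */
    }
}

/* A worker has returned its decision and is idle again. */
void on_worker_done(int w)
{
    if (buf_count > 0) {                                     /* drain backlog first */
        Event ev = sfc_buffer[buf_head];
        buf_head = (buf_head + 1) % SFC_BUFFER_SIZE;
        if (--buf_count == HIGH_WATER_MARK - 1)
            throttle_via_ecs(false);                         /* back below the mark */
        send_to_worker(w, ev);
    } else {
        idle_workers[n_idle++] = w;
    }
}

int main(void)                                               /* tiny demonstration */
{
    for (int w = 0; w < 4; w++) idle_workers[n_idle++] = w;
    for (int id = 0; id < 6; id++) on_event((Event){ .event_id = id });
    on_worker_done(0);                    /* frees a node, drains one buffered event */
    return 0;
}
```

The L1/HLT distinction (one buffered L1 event per node versus 50 to 100 HLT events) sits on the worker side and is omitted here.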

9 Latencies
[Diagram: event path through the sub-farm switch, the SFC and a CPU node, with the latency contributions:]
– Reception of the event and invocation of the trigger algorithm
– Queuing in the SFC (all nodes are busy with an L1 event)
– Transmitting from the switch to the CPU node via Ethernet
– Forwarding in the sub-farm switch

10 Minimising the Latencies
– Transmission of events (transport time):
  – use Gigabit Ethernet for the internal sub-farm network (although the links are loaded only to a few %)
– Reception of events and invocation of the trigger algorithm:
  – use raw Ethernet/IP, zero-copy sockets, etc.
  – use real-time scheduling, pre-emptible system calls and low context-switching latency
– Queuing due to statistical fluctuations in the processing time of earlier events:
  – keep the number of nodes per sub-farm high ⇒ keep the number of sub-farms low
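For the real-time scheduling point, a minimal sketch of what this looks like on Linux is shown below: the L1 process requests a SCHED_FIFO priority so it runs to completion without being time-sliced, and locks its memory to avoid page-fault latency. The priority value 50 and the use of mlockall() are assumptions, not the farm's actual settings.

```c
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
    struct sched_param sp = { .sched_priority = 50 };   /* assumed priority */

    /* A SCHED_FIFO task is not time-sliced: it runs until it blocks or
     * finishes, so the L1 algorithm is not preempted by the HLT task. */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler (needs root privileges)");
        return 1;
    }

    /* Keep all pages resident to avoid page-fault latency during an event. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        perror("mlockall");

    /* ... run the L1 trigger algorithm here ... */
    return 0;
}
```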

11 Context Switching Latency
What is it?
– On a multi-tasking OS, whenever the OS switches from one process to another it needs a certain time to do so.
Why do we worry?
– Because we run the L1 and the HLT algorithms concurrently on each CPU node.
Why do we want this concurrency?
– We want to minimise the idle time of the CPUs.
– We cannot use double-buffering in L1 (the latency budget would be halved!).

12 Scheduling and Latency
– Using Linux 2.5.55 we have established two facts about the scheduler:
  – Real-time priorities work: the L1 task is never interrupted until it finishes.
  – The context-switch latency is low: 10.1 ± 0.2 µs.
– These measurements were made on a high-end server (2.4 GHz Pentium 4 Xeon, 400 MHz FSB); we should have machines at least 2x faster in 2007.
– Conclusion: the scheme of running both tasks concurrently is sound.
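The slide does not say how the 10.1 µs figure was obtained; the sketch below shows one common way to estimate context-switch latency, by ping-ponging a byte between two processes over a pair of pipes (each round trip contains at least two context switches plus pipe overhead). It is an illustration only, not the benchmark behind the quoted number.

```c
/* Pin both processes to one CPU (e.g. with taskset) for a meaningful
 * result on a multi-processor machine; the figure is an upper bound. */
#include <stdio.h>
#include <time.h>
#include <unistd.h>

#define ROUNDS 100000

int main(void)
{
    int a[2], b[2];
    char c = 0;
    if (pipe(a) || pipe(b)) { perror("pipe"); return 1; }

    if (fork() == 0) {                       /* child: echo everything back */
        for (int i = 0; i < ROUNDS; i++) {
            if (read(a[0], &c, 1) != 1) _exit(1);
            if (write(b[1], &c, 1) != 1) _exit(1);
        }
        _exit(0);
    }

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ROUNDS; i++) {       /* parent: ping, wait for pong */
        if (write(a[1], &c, 1) != 1) { perror("write"); return 1; }
        if (read(b[0], &c, 1) != 1)  { perror("read");  return 1; }
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    /* Two context switches per round trip. */
    printf("~%.1f us per context switch (upper bound)\n",
           ns / ROUNDS / 2.0 / 1000.0);
    return 0;
}
```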

13 Latency due to queuing
[Plot: latency distribution from a Ptolemy simulation of one of 120 sub-farms, 10^6 L0-accepted events.]
– Processing-time distribution derived from the number of clusters
– Assuming 9 processors and a share of the L1 trigger rate of 9 kHz per sub-farm
– 0.1 % of events have a latency larger than the 30 ms cut-off (timeout)
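To illustrate the kind of queuing study quoted here, the toy Monte Carlo below feeds Poisson arrivals at 9 kHz into 9 FIFO-served workers and counts the fraction of events whose total latency exceeds 30 ms. The exponential processing time with a 0.8 ms mean is an assumption made purely for illustration; the Ptolemy simulation used a cluster-based processing-time distribution, so the numerical result will differ from the 0.1 % on the slide.

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

#define N_EVENTS   1000000L          /* 10^6 L0-accepted events              */
#define N_WORKERS  9
#define RATE_HZ    9000.0            /* L1 rate shared by this sub-farm      */
#define MEAN_PROC  0.0008            /* 0.8 ms mean processing time (assumed) */
#define CUTOFF     0.030             /* 30 ms latency cut-off                */

static double expo(double mean)      /* exponential random variate */
{
    return -mean * log(1.0 - drand48());
}

int main(void)
{
    double free_at[N_WORKERS] = {0};    /* when each worker becomes idle */
    double t = 0.0;
    long   over = 0;

    srand48(12345);
    for (long i = 0; i < N_EVENTS; i++) {
        t += expo(1.0 / RATE_HZ);               /* Poisson arrivals       */

        int w = 0;                              /* earliest-free worker   */
        for (int k = 1; k < N_WORKERS; k++)
            if (free_at[k] < free_at[w]) w = k;

        double start = t > free_at[w] ? t : free_at[w];
        double done  = start + expo(MEAN_PROC);
        free_at[w]   = done;

        if (done - t > CUTOFF) over++;          /* latency beyond cut-off */
    }
    printf("fraction over %.0f ms: %.4f %%\n",
           CUTOFF * 1e3, 100.0 * over / N_EVENTS);
    return 0;
}
```

Re-running the same sketch with N_WORKERS = 18 and RATE_HZ = 18000, i.e. the configuration on the next slide, shrinks the tail and illustrates why fewer, larger sub-farms are preferred.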

14 Beating the statistics of small numbers
– Sub-farm now with 18 nodes, sharing ~18 kHz of L1 trigger rate ⇒ one of 60 sub-farms. The total number of CPUs in the system stays constant.
– Only 0.05 % of events have a latency larger than 30 ms ⇒ minimise the number of sub-farms.

15 Implementation
– The SFC is either a high-performance PC (better than 2 Gbit/s sustained I/O) or a single NP module.
– Farm nodes are disk-less, booted from the network, and run (most likely) Linux:
  – rack-mounted PCs (1U or blade servers), single or dual CPU
– The farm will be installed in UX8:
  – limits on floor/rack space and cooling power
– Joint studies of rack cooling and the physical realisation (optimal cabling, mechanics, etc.) are ongoing.

