Gökhan Ünel / CHEP 2004, Interlaken: Performance of the ATLAS DAQ DataFlow system


1 Performance of the ATLAS DAQ DataFlow system
Gökhan Ünel on behalf of the ATLAS TDAQ Group, CHEP 2004, Interlaken
Introduction/Generalities
–Presentation of the ATLAS DAQ components
Functionality & Performance Measurements
–Prototype setup
–Event Building, ROI collection, Combined systems
–At2sim: discrete event simulation
Conclusions
–From prototype setup & simulations
Outlook

2 Generalities: ATLAS DAQ
Level1 (L1) rate: 75 kHz min, upgradeable to 100 kHz
Level2 (L2) rate per ROS: 20 kHz; L2 time budget per event: 10 ms
EventBuilding (EB) rate: 3-3.5 kHz for 1.5-2 MByte events
Recording rate: 200 Hz for 1.5-2 MByte events
[Diagram: data flow between L2SV, L2PU, pROS, ROS, DFM and SFI. Labels: ROI data (100 kHz), Event data (100 kHz), ROI data, L2 decision, L2 details, Request data, Assign event, End of event, Event Clear, To EventFilter (3 kHz)]

3 Matching requirements
–DataFlowManager (DFM), L2SuperVisor (L2SV): previous work (TDR) has shown that currently available hardware can match the requirements.
–ReadOutSystem (ROS), SubFarmInput (SFI): the latest studies are presented in this talk.
–L2ProcessingUnit (L2PU): since the physics algorithms for event selection are not finalized, only the time to fetch fragments from the ROS is compared to the computation budget.
–Networking: a discrete event simulation tool is used to scale from the prototype setup up to the final ATLAS size.

4 EB / L2 Setups
EB: up to 16 SFIs, up to 24 ROSs
L2: up to 14 L2PUs, up to 6 L2SVs, up to 8 ROSs
Switches: FastIron (64 ports), T6 (31 ports)
Few FAST ROS

5 EventBuilding Rate
ROS: 12 emulated input channels, 1 kB/channel
SFI: no output to EF
More ROS = bigger events!
Solid lines: ROS = 2 GHz; dashed line: ROS = 3 GHz
8.55 kHz x 12.4 kB = 106 MB/s → ROS CPU limit
9.66 kHz x 12.4 kB = 120 MB/s → ROS NIC limit
110 MB/s per SFI → NIC limit
Small & large systems have the same max EB rate → no penalty as event size grows
A 24 ROS vs 16 SFI EB system runs stably
Faster ROS does a better job (we hit the I/O limit)
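The two bandwidth figures on this slide follow from multiplying the per-ROS event-building rate by the data each ROS ships per event. A minimal sketch of that arithmetic, assuming the "12.4k" on the slide means 12.4 kB per ROS per event (12 channels of 1 kB plus some overhead); the function name is illustrative only:

```python
# Per-ROS output bandwidth = EB rate x data sent per event.
# Assumption: 12.4 kB per ROS per event, read off the slide's "12.4k".
FRAGMENT_KB = 12.4


def ros_bandwidth_mb_s(eb_rate_khz: float, fragment_kb: float = FRAGMENT_KB) -> float:
    """Return the ROS output bandwidth in MB/s.

    kHz x kB = (1e3 events/s) x (1e3 bytes/event) = 1e6 bytes/s = MB/s.
    """
    return eb_rate_khz * fragment_kb


if __name__ == "__main__":
    # Rates quoted on the slide for the 2 GHz and 3 GHz ROS plateaus.
    for rate_khz in (8.55, 9.66):
        print(f"{rate_khz} kHz -> {ros_bandwidth_mb_s(rate_khz):.0f} MB/s")
```

The second value sits close to the nominal capacity of a Gigabit Ethernet link, which is consistent with the slide attributing it to the NIC rather than the CPU.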

6 Scaling in EB throughput
EB throughput scales linearly with the number of SFIs
No show-stoppers
Possible to estimate the rate of any EB system in the prototype setup
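One way to picture such an estimate is to take the smaller of the ROS-side plateau and the aggregate SFI input capacity. The sketch below is not the author's method; it only combines the limits quoted on the EventBuilding Rate slide (110 MB/s per SFI, 12.4 kB per ROS per event, 8.55 and 9.66 kHz ROS plateaus) under the assumption that nothing else in the system limits the rate:

```python
# Hypothetical back-of-the-envelope model of the prototype EB rate.
SFI_NIC_LIMIT_MB_S = 110.0  # per-SFI input limit quoted on the slide
FRAGMENT_KB = 12.4          # assumed data sent by one ROS per event
ROS_MAX_RATE_KHZ = {"2GHz": 8.55, "3GHz": 9.66}  # ROS plateau rates


def estimate_eb_rate_khz(n_ros: int, n_sfi: int, ros_type: str = "2GHz") -> float:
    """Estimate the EB rate in kHz for n_ros ROSs feeding n_sfi SFIs."""
    event_size_kb = n_ros * FRAGMENT_KB  # more ROSs means bigger events
    # Aggregate SFI capacity divided by event size: MB/s / kB = kHz.
    sfi_limited_khz = n_sfi * SFI_NIC_LIMIT_MB_S / event_size_kb
    return min(ROS_MAX_RATE_KHZ[ros_type], sfi_limited_khz)


if __name__ == "__main__":
    for n_sfi in (4, 8, 16):
        rate = estimate_eb_rate_khz(n_ros=24, n_sfi=n_sfi)
        print(f"24 ROS x {n_sfi} SFI: {rate:.2f} kHz")
```

In the SFI-limited regime the estimate doubles when the number of SFIs doubles, which is the linear scaling the slide describes; with few ROSs the estimate saturates at the ROS plateau instead.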

7 Determining Number of SFIs
Requirement: 3-3.5 kHz of EB for 60-70% bandwidth usage per SFI
At a typical event size of 1.5 MB, 60 SFIs (2.4 GHz SMP) are enough
Output to EF + extra SFIs for safety margin should be considered → 100 SFIs (2.4 GHz SMP) would easily handle 3-3.5 kHz of 1.5-2 MB events
[Plot labels: 60% bw, 90% bw, Typical ATLAS event size]
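The figure of 60 SFIs can be reproduced by dividing the total event-building bandwidth by the share of one link that each SFI is allowed to use. A minimal sketch, assuming each SFI has one Gigabit Ethernet input with a nominal line speed of 125 MB/s (the slides do not state the line speed explicitly) and taking the 3 kHz, 1.5 MB, 60% operating point:

```python
import math

# Assumption: Gigabit Ethernet, 1 Gbit/s = 125 MB/s nominal line speed.
LINE_SPEED_MB_S = 125.0


def sfis_needed(eb_rate_khz: float, event_size_mb: float, bw_percent: int) -> int:
    """Number of SFIs needed to absorb the full event-building bandwidth."""
    total_mb_s = eb_rate_khz * 1e3 * event_size_mb       # events/s x MB/event
    per_sfi_mb_s = LINE_SPEED_MB_S * bw_percent / 100.0  # usable share of the link
    return math.ceil(total_mb_s / per_sfi_mb_s)


if __name__ == "__main__":
    print(sfis_needed(3.0, 1.5, 60))  # typical operating point
    print(sfis_needed(3.5, 2.0, 60))  # upper end of rate and event size
```

Under these assumptions the typical point gives 60 nodes and the upper end of the requirement stays below 100, in line with the safety margin suggested on the slide.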

8 Level2 Rate
Dummy algorithms in L2PUs
6 concurrent ROI collections per L2PU
Linear scaling when ROS is not the limiting factor
[Plot label: ROS CPU limited]

9 L2 Time budget
Requirement: 10 ms per event for L2 decision, ROI fetch time << 10 ms
If 500 L2PUs (3 GHz SMP) are used:
–10 ms/event at 100 kHz L1 rate for L2 decision
–Worst case of 16 ROLs all from different ROS < 0.8 ms
Longest ROI fetch: 13-16 ROL
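The 10 ms budget follows from sharing the L1 rate across all L2 processors. A short sketch of that calculation, assuming "SMP" means two CPUs per node (the conclusions mention dual 3 GHz L2PUs) and that each CPU handles one event at a time; the function name is illustrative:

```python
def l2_budget_ms(n_nodes: int, cpus_per_node: int, l1_rate_khz: float) -> float:
    """Average time available per event on one CPU, in ms.

    Each CPU sees l1_rate / n_cpus events per second, so the time per
    event is n_cpus / l1_rate; with the rate in kHz the result is in ms.
    """
    return n_nodes * cpus_per_node / l1_rate_khz


if __name__ == "__main__":
    budget = l2_budget_ms(n_nodes=500, cpus_per_node=2, l1_rate_khz=100.0)
    worst_fetch_ms = 0.8  # upper bound quoted for 16 ROLs from different ROSs
    print(f"budget: {budget} ms, ROI fetch share: {worst_fetch_ms / budget:.0%}")
```

With these inputs the worst-case ROI fetch takes at most 8% of the budget, leaving the rest for the selection algorithms.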

10 Combined setups: EB + L2
[Diagram: switches Foundry FastIron 800, Foundry EI and BATM T6 connecting SFI(O) 1-16, ROS01 to ROS24, L2P01 to L2P14, L2SV01 to L2SV06, pROS and DFM]

11 Small system: 3 ROS x 2 SFI x ..12 L2PU
Since the max rates for EB and L2 are known → use the plateau region to calculate the ROS CPU utilization for the "clear" task
[Plot label: Plateau: ROS CPU limit]

12 Analysis for ROS CPU
$\mathrm{CPU} = R_{EB} \times \mathrm{CPU}_{EB} + R_{L2} \times \mathrm{CPU}_{L2} + R_{L1} \times \mathrm{CPU}_{Cl}$
–$\mathrm{CPU}_{EB}$ is the CPU power spent by the ROS on 1 kHz of Event Building
–$\mathrm{CPU}_{L2}$ is the CPU power spent by the ROS on 1 kHz of Level 2 ROI
–$\mathrm{CPU}_{Cl}$ is the CPU power spent by the ROS on 1 kHz of Event Clears
Requirement: 100 kHz L1, 20 kHz L2, 3-3.5 kHz EB, including clears**
** using 2 NICs simultaneously
2 GHz ROS needs: 20 x 0.06061 + 3 x 0.2252 + 100 x 0.0074 = 2.6 > 2.0 (not sufficient)
3 GHz ROS needs: 20 x 0.05564 + 3 x 0.20274 + 100 x 0.0083 = 2.55 < 3.06 (sufficient)
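The two sums on this slide can be checked with a few lines of code. The sketch below evaluates the linear CPU model with the coefficients quoted on the slide; reading the coefficients as GHz of CPU per kHz of rate is an assumption, and the function name is illustrative:

```python
def ros_cpu_needed(r_l2_khz: float, r_eb_khz: float, r_l1_khz: float,
                   cpu_l2: float, cpu_eb: float, cpu_cl: float) -> float:
    """Linear model: CPU = R_L2*CPU_L2 + R_EB*CPU_EB + R_L1*CPU_Cl."""
    return r_l2_khz * cpu_l2 + r_eb_khz * cpu_eb + r_l1_khz * cpu_cl


# Requirement from the slide: 20 kHz L2, 3 kHz EB, 100 kHz L1 (clears).
REQUIREMENT = dict(r_l2_khz=20.0, r_eb_khz=3.0, r_l1_khz=100.0)

# Measured coefficients quoted on the slide (assumed unit: GHz per kHz).
need_2ghz = ros_cpu_needed(**REQUIREMENT, cpu_l2=0.06061, cpu_eb=0.2252, cpu_cl=0.0074)
need_3ghz = ros_cpu_needed(**REQUIREMENT, cpu_l2=0.05564, cpu_eb=0.20274, cpu_cl=0.0083)

if __name__ == "__main__":
    print(f"2 GHz ROS: needs {need_2ghz:.2f}, has 2.0  -> ok: {need_2ghz <= 2.0}")
    print(f"3 GHz ROS: needs {need_3ghz:.2f}, has 3.06 -> ok: {need_3ghz <= 3.06}")
```

The event-clear term is the smallest per kHz but is multiplied by the full L1 rate, so it contributes roughly as much as the event-building term.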

13 Combined system
Largest possible system using 2 GHz ROS: 18 ROS x 16 SFI x 12 L2PU runs stably

14 Meeting requirements with 3 GHz ROS
Good agreement between data and simulation
3 GHz ROS can do 20 kHz L2 & 3 kHz EB at 100 kHz L1
[Plot labels: L1 = 100 kHz, L2 = 20 kHz, EB = 3 kHz, acc = 3%]

15 Final system Simulation - 1
160 ROS x 110 SFI x N L2PU
Using concentrating switches for PUs (6 → 1)
Realistic Trigger Menu & ROI distribution
[Plot labels: Stable @ 75 kHz, Stable @ 95 kHz]

16 Final system Simulation - 2
at2sim: 127 ROS, 110 SFIs, 504 L2PUs with concentrator switches
Final size system runs smoothly with fast ROSs (3.06 GHz)
[Plot: quantities vs time (s), 0 to 10 s. Curves: L1 rate (kHz), EB latency (ms), # events in L2, Slowest ROS Q]

17 Conclusions - I
3 GHz ROS can do 3 kHz EB & 20 kHz L2
–We need ~140 such nodes
Dual 2.4 GHz SFI can do 3 kHz EB at 60% of line-speed
–We need ~100 such nodes
Dual 3 GHz L2PU can do ROI collection in less than 8% of its time budget
–We need ~500 such nodes
The largest test system was 18x16x12
–No scalability/functionality problems observed

18 Conclusions - II
at2sim of the final setup: 160x100x..500
–Scaling from 20% to 100%: no surprises, no queues, no anomalies
Network: we can handle extreme traffic caused by ultra-fast L2PUs without algorithms
–Prototype L2PUs running @ 12.5 kHz, ~25 times faster than in the final system

19 Next Steps
Test: prototype custom hardware with 2 input channels
Preseries: 10% setup down in the ATLAS cavern
–A bigger switch (128 ports) will be bought
–Merge with existing prototype setup
–Time scale: Q2 / 2005
Networking aspects: scalability & performance
–Separate test bed
–Dedicated hardware (line-speed @ any frame size)
–Stress testing candidate switches

20 Backup slides

21 Hardware inventory
–Networking
  1 EB switch: Foundry FastIron 800, 62 ports
  1 L2 switch: BATM T6, 31 ports
  1 X-over switch: Foundry EdgeIron, 10 ports
–PCs (Intel Xeon, 64bit/66MHz PCI)
  31 Tower Uni-proc. (2.0 GHz)
  –25 used as ROS for scaling studies
  –06 used as L2SVs
  –01 used as DFM
  16 Tower Dual-proc. (3.06 GHz)
  –Used as L2PUs
  –5 used as ROS for performance studies
  16 rack mountable Dual-proc. (2.4 GHz)
  –Used as SFIs

22 EFD setup
[Diagram: DFM, ROS1, ROS2, SFI, EFD1, EFD2 ... EFD15]

23 EFD Studies
Single SFI: small events, WORST case
40% performance loss
[Plot label: No EF output]

24 DFM & L2SV performance

25 ROS input emulation vs Prototype Hardware
[Plot labels: Data, Emulation]

