EPICS Archiving Appliance Test at ESS

1 EPICS Archiving Appliance Test at ESS
J. Bobnar, S. Gysin, November 25, 2014

2 Goal
Assess the feasibility of the EPICS Archiver Appliance (AA) for the European Spallation Source (ESS).
Measure performance and compare it to the requirements.
Propose new features for the services.

3 Requirements: Capacity Planning
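Columns: description, # records, records archived, bytes/record, record/sec, bytes/sec, GB/day
Rack estimation (ESS Bilbao Ion source): # records 28,400; records archived 2,840; bytes/record 14.3; record/sec 1.00; bytes/sec 40,612; GB/day 3.31
SNS (BEAUtY): # records 340,000; records archived 85,000; bytes/record 30; record/sec 0.02; bytes/sec 52,298; GB/day 4.21
FRIB (estimates): records 200,000; bytes/record 8; record/sec 0.20; bytes/sec 320,000; GB/day 26
SLAC : Archive appliance test : test-arch: records 102,255; record/sec 0.03; bytes/sec 80,406; GB/day 6.47
Jaka: Medical Accelerator (BEAUtY): records 150,000; record/sec 0.22; bytes/sec 994,205; GB/day 80
LHC logging (MDB): bytes/sec 3,625,990; GB/day 292

For ESS we decided to double the capacity of SNS:
SNS: # records 340,000; records archived 85,000; bytes/record 30; record/sec 0.02; GB/day 4.21
ESS (2x SNS): # records 680,000; records archived 170,000; GB/day 8.42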
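The capacity figures above can be cross-checked with a short calculation. This is a minimal sketch, not the authors' spreadsheet: the function names are hypothetical, and the use of binary gigabytes (2^30 bytes) is an assumption, chosen because it reproduces the quoted SNS, FRIB and ESS values.

```python
SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 2**30  # assumption: binary gigabytes, which matches the quoted GB/day


def bytes_per_second(records_archived, bytes_per_record, records_per_second):
    """Average archive rate in bytes/sec for a set of archived records."""
    return records_archived * bytes_per_record * records_per_second


def gb_per_day(bytes_per_sec):
    """Convert a sustained bytes/sec rate into GB/day."""
    return bytes_per_sec * SECONDS_PER_DAY / BYTES_PER_GB


# FRIB estimate: 200,000 records, 8 bytes/record, 0.20 record/sec
frib = bytes_per_second(200_000, 8, 0.20)
print(round(frib), round(gb_per_day(frib)))   # 320000 26

# SNS at the quoted 52,298 bytes/sec, and ESS at twice that capacity
print(round(gb_per_day(52_298), 2))           # 4.21
print(round(2 * gb_per_day(52_298), 2))       # 8.42
```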

4 But … there will be spikes in the rate the data is archived
Waveforms are significantly larger (~5 kB/record)
Post-mortem buffers: ~15 GB/beam stop; 1 beam stop/hour = 24 beam stops/day = 360 GB/day (commissioning)
Data on demand: 10 events/day × 1000 channels × ~2 MB per channel per event = 20 GB/day
EPICS V4 data types
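The two spike estimates are plain multiplications, sketched here so they can be re-run with other assumptions. The function names are hypothetical, and the decimal conversion of 1000 MB per GB is an assumption that matches the 20 GB/day figure.

```python
MB_PER_GB = 1000  # assumption: decimal units, consistent with the 20 GB/day figure


def post_mortem_gb_per_day(gb_per_beam_stop, beam_stops_per_hour):
    """Daily post-mortem volume during commissioning."""
    return gb_per_beam_stop * beam_stops_per_hour * 24


def on_demand_gb_per_day(events_per_day, channels, mb_per_channel_per_event):
    """Daily data-on-demand volume."""
    return events_per_day * channels * mb_per_channel_per_event / MB_PER_GB


print(post_mortem_gb_per_day(15, 1))      # 360
print(on_demand_gb_per_day(10, 1000, 2))  # 20.0
```

Both numbers are far above the 8.42 GB/day baseline for scalar data, which is why the spikes are listed separately.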

5 Short, Medium and Long term archiving
Examples:
SLAC Archiver Appliance: 1 hour, 1 day, 1 year
FRIB (planned): 1 week, 1 month, forever
LHC Timber Logging System: MDB 7 days, LDB > 20 years
SNS Archiving Service: no division
DESY: 1 month, forever
ESS requirements:
Short term: 10 days (8.4 GB/day)
Medium term: 100 days (20% of short term = 1.9 GB/day)
Long term: forever (20% of medium term = 0.19 GB/day)
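A rough sizing of the ESS tiers follows from the retention windows and daily rates quoted above. This sketch assumes each tier holds a full retention window at its stated rate, which is an interpretation rather than something the slides spell out; the function name is hypothetical.

```python
def tier_volume_gb(retention_days, gb_per_day):
    """Steady-state volume of a tier that keeps `retention_days` of data."""
    return retention_days * gb_per_day


short_term = tier_volume_gb(10, 8.4)    # about 84 GB
medium_term = tier_volume_gb(100, 1.9)  # about 190 GB
long_term_growth_per_year = tier_volume_gb(365, 0.19)  # long term never expires

print(round(short_term), round(medium_term), round(long_term_growth_per_year, 2))
```

The long term tier has no upper bound, so only its yearly growth is computed.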

6 Rate of retrieval
Depends on:
The archive rate
Reduction algorithm
Number of clients simultaneously reading data
Hardware
Retrieval from short term storage: not slower than 1000 points/sec

7 Test setup
2 dedicated machines on a dedicated network, both running CODAC version of the Scientific Linux 4.3
Archive Appliance computer: Intel Xeon 8-core (16 threads) CPU, 16 GB RAM, solid state drive
Drive performance: ~240 MB/s for reading (random) and ~280 MB/s for writing (sequential)
ESS Control Box with IOC: 30,000 scalar double-type PVs and 200 waveform (aSub) long-type PVs of length 1000, both at 10 Hz
Units: "number of samples per second", N/s = number of PVs × 10 Hz
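The N/s unit defined above is simply the number of PVs times the update rate. A minimal sketch, with hypothetical function names, applied to the two PV sets in the test setup:

```python
UPDATE_RATE_HZ = 10  # both PV sets in the test update at 10 Hz


def samples_per_second(num_pvs, rate_hz=UPDATE_RATE_HZ):
    """Load in N/s: one sample per PV per update."""
    return num_pvs * rate_hz


def samples_in_window(num_pvs, seconds, rate_hz=UPDATE_RATE_HZ):
    """Total number of samples produced over a time window."""
    return samples_per_second(num_pvs, rate_hz) * seconds


print(samples_per_second(30_000))     # 300000 N/s for all scalar PVs
print(samples_per_second(200))        # 2000 N/s for the waveform PVs
print(samples_in_window(30_000, 10))  # 3000000 samples in 10 seconds
```

For waveforms, one sample is a whole array of 1000 elements, so the same N/s value carries far more data than a scalar sample.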

8 Test results: Scalars, JVM needs optimal setup
Adaptive heap memory (-Xms < -Xmx):
N/s -> all is well
N/s -> event drop rate 0.04%
> N/s -> higher drop rate
Performance degrader: management of the Java heap memory size by the virtual machine (CPU was at 100% all the time)
Fixed heap size (8 GB for the engine): N/s without a problem

9 Test results: Scalars
Saving 10 seconds' worth of data (1M samples)
With ETL running (transfer between short and medium term storage): between 8 and 11 seconds
Probable cause: the same physical drive was used for the short and medium term storage

10 Test results: Scalars
Increased the sampling rate to 300,000 N/s
Saving 10 seconds' worth of data (3M samples): between 3.5 and 4 seconds
However: event drops at start-up
With ETL running, the time increased by an order of magnitude and the drop rate was very high.
CPU time remained the same
IO seems to be the bottleneck

11 Test results: waveforms
200 PVs of length 1000 at 10 Hz: 2000 N/s, 1 N ≈ 8 kB
Saving 10 seconds' worth of data: between 200 and 300 milliseconds
When ETL was running the time increased to 1 second
Archiving the same amount of data in waveforms is 15 times faster than in scalar PVs -> the number of PVs matters.
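One reading that reproduces the factor of 15 is the ratio of the times needed to save 10 seconds' worth of data: 3.5 to 4 seconds for scalars at 300,000 N/s versus 200 to 300 milliseconds for waveforms. Using the midpoints of those ranges is an assumption, not something the slides state, and the function names are hypothetical.

```python
def midpoint(low, high):
    """Midpoint of a measured range."""
    return (low + high) / 2


scalar_save_s = midpoint(3.5, 4.0)      # 300,000 scalar N/s, 10 s of data
waveform_save_s = midpoint(0.2, 0.3)    # 2000 waveform N/s, 10 s of data

speedup = scalar_save_s / waveform_save_s
print(round(speedup))                   # 15

# Raw waveform throughput implied by 2000 N/s at about 8 kB per sample
waveform_kb_per_s = 2000 * 8
print(waveform_kb_per_s)                # 16000 kB/s, i.e. about 16 MB/s
```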

12 Test results: rate of retrieval scalars
Data stored: N/s, 8 hours, 54 GB
Short term: 2 files for the last hour
Medium term: 1 file for the rest
Retrieval rate:
Short intervals (minutes; less than 800 data points available): 100 – 150 ms
Longer intervals (hours; more than 800 data points available): 200 – 400 ms
Even longer intervals (1 day, 2 days): 700 – 800 ms, ~1500 ms
No problems with a large number of PVs (file fragmentation)

13 Test results: rate of retrieval waveforms
Retrieval rate:
1 hour interval (reduction: > 800 samples): ~3500 ms
Every additional hour adds approximately 3000 ms
1 day interval (reduction: > 800 samples): > 1 min
Room for improvement in the reduction algorithm and in the client
More tests planned with a longer acquisition period.
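The waveform retrieval figures suggest a roughly linear cost per hour of data. The sketch below is a simple extrapolation from the two quoted numbers, not a measured model, and the function name is hypothetical.

```python
FIRST_HOUR_MS = 3500       # quoted: ~3500 ms for a 1 hour interval
ADDITIONAL_HOUR_MS = 3000  # quoted: each additional hour adds ~3000 ms


def estimated_retrieval_ms(hours):
    """Linear estimate of waveform retrieval time for an interval of `hours`."""
    if hours < 1:
        raise ValueError("the estimate is only defined for 1 hour or more")
    return FIRST_HOUR_MS + ADDITIONAL_HOUR_MS * (hours - 1)


print(estimated_retrieval_ms(1))   # 3500
print(estimated_retrieval_ms(24))  # 72500, i.e. about 72 s
```

The 24 hour estimate is consistent with the observed "> 1 min" for a 1 day interval.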

14 Conclusion
SNS archives 0.02 samples per second per PV. At archived PVs that means 1600 N/s.
One EPICS Archiver Appliance can archive N/s, which is 60 times more.
To reduce retrieval time we recommend running several instances of AA and distributing the PVs among them.
The retrieval rate (for scalars) is good and meets the requirements: for the most common time intervals (i.e. 1 day or less) it is < 1 second.
We also have a list of recommendations for AA and for AA users, to be published after completion of the tests.

