1 ATLAS FroNTier cache consistency stress testing David Front, Weizmann Institute, September 2009

2 Outline: Goals, ??, Parameters to test

3 Handling cache consistency
Client machine: ATHENA → COOL → CORAL → FroNTier client → squid
Server site: squid → Tomcat FroNTier servlet → Oracle DB server
The server side memorizes the modification time of each table, and consults these modification times to decide whether or not to invalidate cached entries.
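The invalidation scheme on this slide can be sketched in a few lines of Python. This is an illustrative sketch only, not the FroNTier servlet code; the function names and in-memory tables are hypothetical, and the logical clock stands in for the table modification times that the real system keeps in Oracle via a DB trigger.

```python
import itertools

# Logical clock, so "before/after" comparisons are unambiguous in this
# sketch (the real system compares modification times set by a trigger).
_clock = itertools.count(1)

table_mod_times = {}   # table name -> last modification "time"
cache = {}             # (table, query) -> (result, time the result was cached)

def record_modification(table):
    """Trigger side: memorize the modification time of a table."""
    table_mod_times[table] = next(_clock)

def get(table, query, run_query):
    """Serve a query, consulting modification times to decide
    whether to invalidate the cached result."""
    key = (table, query)
    if key in cache:
        result, cached_at = cache[key]
        if table_mod_times.get(table, 0) <= cached_at:
            return result              # cached result is still fresh
    result = run_query(query)          # stale or missing: go to the DB
    cache[key] = (result, next(_clock))
    return result
```

A repeated identical query is served from the cache until `record_modification` marks the table as changed, after which the next request goes back to the database.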

4 Goals
1) Verify that the table modification time trigger does not consume too many resources (I expect this to be trivial)
2) Show that the performance of FroNTier with cache consistency matches expectations
3) Compare the performance of FroNTier to a direct Oracle connection
Testing to be done at CERN, and possibly also sending requests from remote sites

5 Testing
To create high load on the caching system, multiple instances of a job created by Richard Hawkings will be submitted
Jobs are submitted as grid jobs
Two ATLAS FroNTier server machines that are currently being configured will be used for this testing

6 Testing challenge
According to Dave Dykstra, in order to load a FroNTier server one needs many tens of squids:
– squid is a single-threaded application
– The reasonable number of squids to install on a machine is its number of cores
Allocating servers to function as pools of squids does not seem to be the right approach
An alternative is to make a squid available on each client machine

7 A squid per client machine
In order to have a squid per client machine:
– Tens of worker nodes should be pre-allocated
– As suggested by Johannes Elmsheuser, HammerCloud cannot accommodate such a setting efficiently
– Rather, local submission to the scheduler, using a dedicated job queue, may be done
– For this, the appropriate resources should be allocated (Dario Barberis?)
– In addition, a way (a script) is needed to set up a squid at the beginning of the test and tear it down at the end
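The setup/tear-down script mentioned in the last bullet could look roughly like the following Python sketch. The squid binary path, config file location, and config values are hypothetical placeholders; `-f` (config file), `-k shutdown`, and the `http_port`/`cache_dir` directives are standard squid features.

```python
import subprocess

SQUID = "/usr/sbin/squid"              # hypothetical binary location
CONF = "/tmp/stress-test-squid.conf"   # hypothetical config location

def setup_cmd(port=3128, cache_mb=1000):
    """Write a minimal squid config and return the command that
    starts squid on the worker node at the start of the test."""
    with open(CONF, "w") as f:
        f.write("http_port %d\n" % port)
        f.write("cache_dir ufs /tmp/squid-cache %d 16 256\n" % cache_mb)
    return [SQUID, "-f", CONF]

def teardown_cmd():
    """Command that shuts the local squid down at the end of the test."""
    return [SQUID, "-f", CONF, "-k", "shutdown"]

def run(cmd):
    # On a real worker node this launches/stops the local squid:
    return subprocess.call(cmd)
```

Keeping the commands as data makes it easy for the testing manager to run the same setup and tear-down step on every pre-allocated worker node.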

8 Testing elements
– Squid setup and tear down
– Spawn writer jobs, one every 15 minutes (or 30, or 60)
– Spawn reader jobs via the grid
– Monitoring

9 FroNTier stress testing (at CERN)
– Tens of pre-allocated worker nodes, using a dedicated job queue, are used to spawn Athena reader jobs via the grid
– Each Athena reader job goes through a squid to the Tomcat FroNTier servlet and on to the Oracle DB server
– Once in a while (every 15 minutes) a writer job writes fresh data to Oracle
– A testing manager sets up and tears down the squids, spawns the writer and reader jobs, and monitors the DB server and the FroNTier machines

10 Parameters to test with

11 Parameter: arrival rate of reading client jobs
– I do not know what the expected rate is
– Taking into account the number of queries per Athena job (~1-3k), and comparing with the capability of CMS FroNTier to answer queries, I expect the whole system to be able to handle 2-300 jobs per minute (100-20K jobs per hour)
– Hence, testing may be repeated with the following rates of reader jobs per minute: 10, 100, 1000
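The estimate above can be checked with a little arithmetic: the per-second query load implied by a given reader-job arrival rate, assuming the ~1-3k queries per Athena job cited on the slide.

```python
def queries_per_second(jobs_per_minute, queries_per_job):
    """Aggregate query rate implied by a reader-job arrival rate."""
    return jobs_per_minute * queries_per_job / 60.0

# At the candidate test rates, with 1k-3k queries per job:
for rate in (10, 100, 1000):
    print("%4d jobs/min -> %6.0f to %6.0f queries/s"
          % (rate, queries_per_second(rate, 1000),
             queries_per_second(rate, 3000)))
```

Even the lowest candidate rate of 10 jobs per minute already implies a few hundred queries per second against the caching system.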

12 Parameters to test with
The rate at which ATLAS COOL tables are updated:
– Using: once every 15 minutes (this is the rate at which PVSS2COOL works; according to Fred Luehring, other COOL tables are updated at a slower rate)
– Testing may be repeated with slower periods: once every 30 minutes, once every 60 minutes

13 Compare performance of FroNTier to Oracle
Network latency is an important issue, hence test:
– The reading client runs from CERN or a remote near/far location
– TBD: candidate remote locations may be
– For each of these 3 (4) sites, run the example workload via FroNTier or directly against Oracle at CERN

14 Cache consistency policy
A comparison between performance with and without cache consistency may be expected. Test with both compared cases using the same FroNTier delays (5 minutes).

15 FroNTier delays
We are currently using 5 minutes for all three delays (squid / FroNTier server / DB trigger)
I do not see a need to test with shorter delays. Yet it may be expected to compare performance between different delay times, in minutes: [5, 10, 15]

16 Monitoring

17 Testing time frame
Each test will be run for a given period (one hour) at steady load, after ramping up the load
Ramp-up may take on the order of 1-2 hours, according to Johannes Elmsheuser
After job spawning stops, load will wind down (perhaps within an hour or two as well)
Hence each test is expected to last about 3-5 hours

18 Links
David Dykstra: poster of the CMS solution
http://frontier.cern.ch/dist/Poster_CHEP09_Frontier-newcaching.pdf
Richard Hawkings: ATLAS COOL reference workloads and tests
https://twiki.cern.ch/twiki/bin/view/Atlas/CoolRefWork
David Front: previous related presentations
http://indico.cern.ch/getFile.py/access?contribId=5&resId=1&materialId=slides&confId=59928
http://indico.cern.ch/getFile.py/access?subContId=2&contribId=2&resId=1&materialId=slides&confId=62120

