LFC Replication Tests LCG 3D Workshop Barbara Martelli.

1 LFC Replication Tests LCG 3D Workshop Barbara Martelli

2 Objectives of LFC Replication Tests
- Understand if and how Streams replication impacts LFC behaviour.
- Understand whether the achievable throughput, in entries inserted per second, is suitable for LHCb needs.
- Understand whether the achievable sustained rate, in entries inserted per second, is suitable for LHCb needs.
- Measure the replication delay for a particular entry (see the sketch below).
- Measure the maximum throughput achievable with our configuration.
- Measure the maximum sustained rate achievable with our configuration.
- Compare read performance between the present setup and the Streams setup (we expect it to improve with a replica).
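One way to address the delay objective is to insert a marker entry on the master and poll the read-only replica until it appears. A minimal sketch in Python, assuming hypothetical insert_on_master() and exists_on_replica() helpers that wrap the actual LFC client calls (not shown here):

```python
import time

def measure_replication_delay(insert_on_master, exists_on_replica,
                              lfn, poll_interval=0.05, timeout=60.0):
    """Insert one entry on the master LFC, then poll the replica until the
    entry becomes visible; return the observed delay in seconds.
    The two callables are placeholders for real LFC client calls."""
    start = time.time()
    insert_on_master(lfn)
    while time.time() - start < timeout:
        if exists_on_replica(lfn):
            return time.time() - start
        time.sleep(poll_interval)
    raise RuntimeError("entry %s not replicated within %.0f s" % (lfn, timeout))
```

The polling interval bounds the measurement resolution, so it should be chosen well below the delay one expects to observe.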

3 LHCb Access Pattern on LFC
- At the moment the LFC is used for DC06 MC production, Stripping and Analysis.
- It is really difficult to estimate the future access pattern, but we can take a snapshot of what happens today.
- Read access (end 2006): 10M PFNs expected; read access is mainly for analysis, and an average user starts O(100) jobs. Each job contacts the LFC twice: once for DIRAC optimization and once to create an XML POOL slice that the application will use to access data. Every 15 minutes 1000 users are expected to submit jobs, each contacting the LFC 200 times: 24 * 4 * 1000 * 200 ~ 20M LFC requests per day for analysis, i.e. about 200 Hz of read-only requests (see the arithmetic check below).
- Write access (today): MC production adds 10-15 inserts per day. DC06 transfers about 40 MB/s from CERN to the T1s with a file size of about 100 MB, i.e. one replicated file every ~3 seconds. For every 30 files processed, 2 are created, so we can expect about 1 Hz of write access.
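The rate estimates on this slide can be reproduced with simple arithmetic; a quick check in Python using only the figures quoted above:

```python
# Read access: 1000 users every 15 minutes, each contacting the LFC 200 times
# (O(100) jobs x 2 contacts per job).
requests_per_day = 24 * 4 * 1000 * 200            # = 19.2M, quoted as ~20M
read_rate_hz = requests_per_day / 86400.0         # ~222 Hz, quoted as ~200 Hz

# Write access: 40 MB/s of CERN -> T1 transfers with ~100 MB files.
replicated_files_hz = 40.0 / 100.0                # ~0.4 Hz, one file every ~2.5-3 s
# Together with the newly created files (2 per 30 processed), the slide
# rounds this up to about 1 Hz of write access.

print("read: ~%.0f Hz, replicated files: ~%.1f Hz"
      % (read_rate_hz, replicated_files_hz))
```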

4 LFC Local Test Description (Feasibility Test)
- 40 LFC clients, 40 LFC daemon threads, streams pool.
- Client actions (sketched in the code below):
  Check whether the LFN already exists in the database (SELECT from cns_file_metadata).
  If yes -> add an SFN for that LFN (INSERT into cns_file_replica).
  If not -> add both the LFN and the SFN (INSERT into cns_file_metadata, then INSERT into cns_file_replica).
- For each LFN, 3 SFNs are inserted.
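The client logic above boils down to a lookup in cns_file_metadata followed by inserts into cns_file_replica (and into cns_file_metadata when the LFN is new). A minimal sketch of those SQL operations with cx_Oracle; in the real test they are issued by the LFC daemon on behalf of the clients, and the column names used here (name, fileid, sfn) are illustrative assumptions rather than the exact CNS schema:

```python
import cx_Oracle

def insert_entry(conn, lfn, sfns):
    """Check whether the LFN exists; if so, only add replicas (SFNs),
    otherwise create the metadata entry first.
    Table and column names are illustrative only."""
    cur = conn.cursor()
    cur.execute("SELECT fileid FROM cns_file_metadata WHERE name = :lfn", lfn=lfn)
    row = cur.fetchone()
    if row is None:
        # LFN not present yet: insert it, then re-read its fileid.
        cur.execute("INSERT INTO cns_file_metadata (name) VALUES (:lfn)", lfn=lfn)
        cur.execute("SELECT fileid FROM cns_file_metadata WHERE name = :lfn", lfn=lfn)
        row = cur.fetchone()
    fileid = row[0]
    # The feasibility test inserts 3 SFNs per LFN.
    for sfn in sfns:
        cur.execute("INSERT INTO cns_file_replica (fileid, sfn) VALUES (:fid, :sfn)",
                    fid=fileid, sfn=sfn)
    conn.commit()
```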

5 LFC Master HW Configuration
- 2-node RAC (rac-lhcb-01, rac-lhcb-02) on Oracle 10gR2; each node is a dual Xeon 3.2 GHz with 4 GB memory, RHEL 4 kernel 2.6.9-34.ELsmp.
- Gigabit switch, private LHCb link.
- Dell 224F array: 14 Fibre Channel disks (73 GB each).
- HBA Qlogic Qla2340, Brocade FC switch.
- Disk storage managed with Oracle ASM (striping and mirroring).

6 LFC Slave Configuration
- LFC read-only replica.
- Dual Xeon 2.4 GHz, 2 GB RAM.
- Oracle 10gR2 (Oracle RAC, but used as a single instance).
- RHEL 3 kernel 2.4.21.
- 6 x 250 GB disks in RAID 5.
- HBA Qlogic Qla2340, Brocade FC switch.
- Disk storage formatted with OCFS2.

7 Performance About 75 transactions per second on each cluster node. Inserted and replicated 1700k entries in 4 hours (118 inserts per second). Almost real-time replication with Oracle Streams, without significant delays (<< 1 s).

8 CPU load on cluster nodes is far from being saturated.

9 CERN to CNAF LFC Replication
- At CERN: 2 LFC servers connected to the same LFC master DB backend (single instance).
- At CNAF: 1 LFC server connected to the replica DB backend (single instance).
- Oracle Streams sends entries from the master DB at CERN to the replica DB at CNAF.
- Population clients: a Python script starts N parallel clients that write entries and replicas into the master LFC at CERN (see the sketch below).
- Read-only clients: a Python script reads entries from both the master and the replica LFC.
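The population client can be sketched as a small Python driver that forks N workers; write_entry() below is a hypothetical placeholder for the real LFC insert (e.g. through the LFC Python bindings or command-line tools), which is not reproduced here:

```python
import multiprocessing

def write_entry(i):
    """Placeholder: insert one LFN plus its replica(s) into the master LFC
    at CERN.  The LFN/SFN names below are made up for illustration."""
    lfn = "/grid/lhcb/streams-test/file_%08d" % i
    sfn = "srm://some-se.example.org%s" % lfn      # hypothetical SFN
    # ... real LFC client call would go here ...
    return lfn

def populate(n_clients, entries_per_client):
    """Spread the inserts over n_clients parallel worker processes."""
    pool = multiprocessing.Pool(processes=n_clients)
    try:
        return pool.map(write_entry, range(n_clients * entries_per_client))
    finally:
        pool.close()
        pool.join()

if __name__ == "__main__":
    populate(n_clients=20, entries_per_client=1000)   # 20 or 40 clients in the tests
```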

10 LFC Replication Testbed (diagram)
Master side at CERN: population clients -> two LFC R-W servers (lxb0716.cern.ch, lxb0717.cern.ch) -> LFC Oracle server, master DB (rls1r1.cern.ch).
Oracle Streams propagates entries from the master DB to the replica DB over the WAN.
Replica side at CNAF: LFC Oracle server, replica DB (lfc-streams.cr.cnaf.infn.it) -> LFC read-only server (lfc-replica.cr.cnaf.infn.it) -> read-only clients.

11 Test 1: 40 Parallel Clients
- 40 parallel clients, equally divided between the two LFC master servers.
- 3700 replicas per minute inserted during the first two hours.
- Very good performance at the beginning, but after a few hours the master fell into a Flow Control state.
- Flow Control means the master is notified by the Streams client that the update rate is too fast; the master slows down to avoid Spill Over on the client side.
- Spill Over means the buffer of the Streams queue is full, so Oracle has to write the entries to disk (the persistent part of the queue), which decreases performance.
- The apply side of Streams replication (the slave) is usually slower than the master side, so we argue that the insert rate must be decreased to achieve good sustained performance (see the monitoring sketch below).
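Flow control and spill-over can be observed on the database side through the Streams monitoring views; a sketch with cx_Oracle, assuming a user with the necessary privileges (V$STREAMS_CAPTURE and V$BUFFERED_QUEUES are the documented Oracle 10g views, but column availability should be double-checked on the actual release):

```python
import cx_Oracle

def streams_status(user, password, dsn):
    """Report the capture state (e.g. 'PAUSED FOR FLOW CONTROL') and how
    many messages of each buffered queue have spilled to disk."""
    conn = cx_Oracle.connect(user, password, dsn)
    cur = conn.cursor()
    cur.execute("SELECT capture_name, state FROM v$streams_capture")
    for name, state in cur:
        print("capture %s: %s" % (name, state))
    cur.execute("SELECT queue_name, num_msgs, spill_msgs FROM v$buffered_queues")
    for queue, buffered, spilled in cur:
        print("queue %s: %d buffered, %d spilled to disk" % (queue, buffered, spilled))
    conn.close()
```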

12 Test 2: 20 Parallel Clients
- 20 parallel clients, equally divided between the two LFC master servers.
- 3000 replicas per minute inserted, i.e. 50 replicas per second.
- Apply parallelism increased: 4 parallel apply processes on the slave (see the sketch below).
- After some hours the rate decreases, but it stabilizes at 33 replicas per second.
- Achieved a sustained rate of 33 replicas per second.
- No flow control detected on the master.
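The apply parallelism used here is an Oracle Streams apply-process parameter, which can be raised on the slave with DBMS_APPLY_ADM.SET_PARAMETER. A sketch wrapped in Python, where the apply process name 'LFC_APPLY' is a hypothetical placeholder:

```python
import cx_Oracle

def set_apply_parallelism(conn, apply_name="LFC_APPLY", parallelism=4):
    """Set the number of parallel apply servers for the given apply process.
    'LFC_APPLY' is a placeholder; use the real apply process name."""
    cur = conn.cursor()
    cur.execute("""
        BEGIN
          DBMS_APPLY_ADM.SET_PARAMETER(
            apply_name => :name,
            parameter  => 'parallelism',
            value      => :val);
        END;""", name=apply_name, val=str(parallelism))
```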

13

14 Conclusions
- Even though this test setup is less powerful than the production one, the sustained insertion rate is higher than LHCb needs.
- Random read access still needs to be tested to understand if and how replication impacts the response time.
- It would be interesting to find the best replication rate achievable with this setup, even if this is not requested by the experiments.

