1 DQ2 status & plans
Distributed Data Management, Miguel Branco
BNL workshop, October 3, 2007

2 Outline
Focusing mostly on site services:
– Releases
– Observations from 0.3.x
– 0.4.x
– Future plans

3 Release status
0.3.2 (‘stable’) in use on OSG [I think :-)]
0.4.0 being progressively rolled out
– 0.4.0_rc8 in use on all LCG sites
0.3.x was focused on moving the central catalogues to a better DB; the site services were essentially the same as in 0.2.12
0.4.x is about site services only
– In fact, the 0.4.x version number applies to the site services only; clients remain on 0.3.x

4 Observations from 0.3.x
Some problems were solved:
– overloading of the central catalogues
– big datasets
– choice of the next files to transfer
… but this introduced another problem:
– as we became more ‘successful’ at filling up the site services' queue of files…
– …we ended up overloading the site services with expensive ORDERing queries, e.g. to choose which files to transfer next (illustrated below)
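To make the last point concrete, here is a minimal sketch of the kind of 'pick the next files' query whose ORDER BY becomes expensive once a site queue holds hundreds of thousands of rows. The table and column names are invented for the illustration and are not the actual DQ2 0.3.x schema; the MySQLdb bindings are assumed.

    # Illustrative only: 'transfer_queue' and its columns are hypothetical.
    import MySQLdb

    def pick_next_files(conn, site, limit=100):
        """Choose the next files to transfer to a site.

        Without a covering index, MySQL must sort the whole per-site
        queue (a filesort) on every polling cycle, which is the
        overload the slide refers to."""
        cur = conn.cursor()
        cur.execute(
            "SELECT file_guid FROM transfer_queue "
            "WHERE destination_site = %s AND state = 'QUEUED' "
            "ORDER BY priority DESC, request_time ASC "
            "LIMIT %s",
            (site, int(limit)))
        return [guid for (guid,) in cur.fetchall()]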

5 Observations from 0.3.x
The ramp-up in simulation production, followed by poor throughput, created a large backlog
Also, the site services were (and still are) insufficiently protected against requests that are impossible to fulfill:
– "transfer this dataset, but keep polling for data to be created, or streamed from another site"… but the data never came / was never produced / was subscribed from a source that was never meant to have it…
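One way to picture the missing protection, as a sketch only (the data structure and the one-week grace period are assumptions, not the DQ2 implementation): give up on subscriptions that never find a source replica instead of polling for them forever. Slide 11 notes that BROKEN subscriptions were later introduced along these lines.

    import time

    GRACE_PERIOD = 7 * 24 * 3600   # hypothetical: one week without any source replica

    def check_subscription(sub, now=None):
        """Flag a subscription BROKEN if no source replica ever appeared.

        'sub' is a hypothetical dict with 'created', 'last_replica_seen'
        and 'state' keys; the real site services keep this state in MySQL."""
        now = now or time.time()
        if sub['last_replica_seen'] is None and now - sub['created'] > GRACE_PERIOD:
            sub['state'] = 'BROKEN'    # stop polling FTS/LFC for this request
        return sub['state']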

6 Observations from 0.3.x
The most relevant changes to the site services were those to handle large datasets
– less important for PanDA production, but very relevant for LCG, where all datasets were large

7 Implementation of 0.3.x
The implementation was also too simplistic for the load we observed:
– MySQL database servers were not extensively tuned; too much reliance on the database for IPC; heavy overhead from ‘polling’ FTS and the LFCs (e.g. no GSI session reuse for LFC, as sketched below)
– machine loads >10 were common (I/O + CPU), in particular with the MySQL database on the same machine (~10 processes doing GSI plus mysqld doing ORDER BYs)
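The session-reuse point can be illustrated with a small sketch; LFCClient is a hypothetical stand-in, not the real LFC Python bindings, and the only point is where the GSI handshake cost is paid.

    class LFCClient(object):
        """Hypothetical LFC wrapper, used only to show where GSI costs land."""
        def open_session(self):      # one GSI handshake: CPU plus network round trips
            pass
        def close_session(self):
            pass
        def lookup_replicas(self, guid):
            return []

    def lookup_without_reuse(client, guids):
        # 0.3.x-style: a full handshake per file lookup
        results = {}
        for guid in guids:
            client.open_session()
            results[guid] = client.lookup_replicas(guid)
            client.close_session()
        return results

    def lookup_with_reuse(client, guids):
        # one handshake amortised over the whole batch
        client.open_session()
        try:
            return dict((g, client.lookup_replicas(g)) for g in guids)
        finally:
            client.close_session()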

8 Deployment of 0.3.x
FTS @ T1s is usually run on high-performing hardware, with the FTS agents split from the database
– … but FTS usually has only a few thousand files in the system at a typical T1
DQ2 is usually deployed with the DB co-located with the site services
– and its queue is hundreds of thousands of files at a typical T1
DQ2 is the system throttling FTS (sketched below)
– it is where the expensive brokering decisions are made and where the larger queue is maintained
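A minimal sketch of that throttling relationship, under assumptions (the constant and function names are invented; the real site services drive this through their MySQL queue and the FTS client): DQ2 keeps the deep queue and only feeds FTS enough files to keep a shallow queue busy.

    FTS_TARGET_QUEUE = 2000   # assumed: 'a few thousand files' in FTS at a typical T1

    def feed_fts(fts_queued, dq2_best_files, submit):
        """Submit just enough files from the (huge) DQ2 queue to keep FTS busy.

        fts_queued     : number of files currently queued or active in FTS
        dq2_best_files : files DQ2 has brokered, best-first
        submit         : callable handing one batch of files to FTS
        """
        free_slots = max(0, FTS_TARGET_QUEUE - fts_queued)
        batch = list(dq2_best_files)[:free_slots]
        if batch:
            submit(batch)
        return len(batch)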

9 Evolution of 0.3.x
During 0.3.x, the solution ended up being to simplify 0.3.x, in particular by reducing its ORDERing
Work on 0.4.x had already started
– to fix the database interactions while maintaining essentially the same logic

10 0.4.x
New DB schema, with more reliance on newer MySQL features (e.g. triggers, sketched below) and less on expensive features (e.g. foreign key constraints)
Able to sustain and order large queues (e.g. FZK/LYON are running with >1.5M files in their queues)
– One instance of DQ2 0.4.x is used to serve ALL M4 + Tier-0 test data
– One instance of DQ2 0.4.x is used to serve 26 sites (FZK + T2s AND LYON + T2s)
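As a hedged example of the 'newer MySQL features' point (the trigger and table names are invented, not the 0.4.x schema): a trigger can keep a per-dataset file counter up to date at insert time, so the hot path needs neither joins nor foreign-key checks.

    import MySQLdb

    # Illustrative MySQL 5.0+ trigger; 'files' and 'dataset_summary' are hypothetical.
    CREATE_TRIGGER = """
    CREATE TRIGGER file_added AFTER INSERT ON files
    FOR EACH ROW
      UPDATE dataset_summary
         SET nfiles = nfiles + 1
       WHERE dataset_id = NEW.dataset_id
    """

    def install_trigger(conn):
        cur = conn.cursor()
        cur.execute(CREATE_TRIGGER)   # the DB then maintains the counter itself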

11 What remains to be done
Need another (smaller) iteration on channel allocation and transfer ordering
– e.g. in-memory buffers to prevent I/O on the database, creating temporary queues in front of each channel with tentative ‘best files to transfer next’ (sketched below)
Some work remains to make the services resilient to failures
– e.g. a dropped MySQL connection
Still need to tackle some ‘holes’
– e.g. queues of files for which we cannot find replicas may still grow forever; if a replica does eventually appear for one of them, the system may take too long to consider that file for transfer
– … but we already introduced BROKEN subscriptions to release some load
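A sketch of the per-channel buffer idea from the first bullet (class and parameter names are hypothetical, not DQ2 code): a small in-memory queue in front of each FTS channel holds the tentative 'best files to transfer next' and is refilled from the database only when it runs low, so the expensive ordering query runs far less often.

    import collections

    class ChannelBuffer(object):
        """Hypothetical in-memory queue in front of one FTS channel."""

        def __init__(self, refill, size=500):
            self.refill = refill            # callable fetching the next 'best' files from the DB
            self.size = size
            self.queue = collections.deque()

        def next_files(self, n):
            if len(self.queue) < n:         # hit the database only when running low
                self.queue.extend(self.refill(self.size))
            return [self.queue.popleft() for _ in range(min(n, len(self.queue)))]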

12 What remains to be done
More site services local monitoring
– work nearly completed
– will be deployed when we are confident it does not cause any harm to the database (we still observe deadlocks; see the next slides…)

13 [slide image not captured in the transcript: the MySQL deadlocks referenced on the previous slide]

14 [slide image not captured in the transcript]

15 Expected patches to 0.4.x
The 0.4.x branch will continue to focus on site services only
– channel allocation, source replica lookup and submit queues + monitoring
Still, the DDM problem as a whole can only be solved by having LARGE files
– while we need to sustain queues with MANY files, if we continue with the current file size the "per-event transfer throughput" will remain very inefficient (see the worked example below)
– plus more aggressive policies on denying/forgetting about subscription requests
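The file-size argument can be made concrete with a small worked example; the rate and per-file overhead below are illustrative assumptions, not measurements. With a fixed per-file cost (SRM negotiation, GSI handshakes, FTS bookkeeping), the effective throughput collapses for small files and most of it is recovered with GB-sized files.

    def effective_throughput(file_size_mb, rate_mb_s=20.0, per_file_overhead_s=15.0):
        """Effective MB/s once a fixed per-file cost is paid (illustrative numbers)."""
        transfer_time = file_size_mb / rate_mb_s
        return file_size_mb / (transfer_time + per_file_overhead_s)

    for size_mb in (50.0, 2000.0):
        # 50 MB files  -> ~2.9 MB/s effective
        # 2 GB files   -> ~17.4 MB/s effective
        print("%.0f MB files: %.1f MB/s effective" % (size_mb, effective_throughput(size_mb)))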

16 After 0.4.x
0.5.x will include an extension to the central catalogues
– location catalogue only
This change follows a request from user analysis and LCG production
The goal is to provide a central overview of incomplete datasets (files missing)
– but also to handle dataset deletion (returning the list of files to delete at a site, coping with overlapping datasets; quite a hard problem! see the sketch below)
First integration efforts (as the prototype is now complete) are expected to begin mid-Nov
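A minimal sketch of the deletion problem mentioned above, reduced to set arithmetic (names are hypothetical and the catalogue machinery is ignored): a file may only be deleted at a site if no other dataset still resident there contains it.

    def files_to_delete(dataset_files, resident_datasets):
        """Return the files of a dataset that are safe to delete at a site.

        dataset_files     : set of file GUIDs in the dataset being removed
        resident_datasets : iterable of sets of file GUIDs for the other
                            datasets still kept at the site (overlaps are
                            exactly what makes this hard)
        """
        still_needed = set()
        for other in resident_datasets:
            still_needed |= other
        return dataset_files - still_needed

    # e.g. two datasets sharing 'f2':
    # files_to_delete({'f1', 'f2'}, [{'f2', 'f3'}]) -> {'f1'}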

17 After 0.5.x
Background work has begun on a central catalogue update
– Important schema change: new timestamp-oriented unique identifiers for datasets, allowing the backend DB to be partitioned transparently to the user and providing a more efficient storage schema (sketched below)
– 2007 datasets on one instance, 2008 datasets on another…
Work has started, on a longer timescale, as the change will be fully backward compatible
– old clients will continue to operate as they do today, to facilitate any 0.3.x -> 1.0 migration
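A hedged sketch of what a timestamp-oriented dataset identifier could look like (the format and the routing rule are assumptions for illustration, not the actual design): a creation-time prefix lets the backend route each dataset to a per-year partition without clients noticing.

    import time
    import uuid

    def new_dataset_id(now=None):
        """Hypothetical timestamp-prefixed unique identifier for a dataset."""
        now = now or time.time()
        prefix = time.strftime('%Y%m%d%H%M%S', time.gmtime(now))
        return '%s-%s' % (prefix, uuid.uuid4().hex[:16])

    def backend_for(dataset_id):
        """Route to a per-year DB partition (2007 datasets vs 2008 datasets)."""
        year = dataset_id[:4]
        return 'dq2_datasets_%s' % year    # e.g. dq2_datasets_2007, dq2_datasets_2008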

18 Constraints
In March we decided to centralize DQ2 services on LCG
– as a motivation to understand the problems with production simulation transfers, since our Tier-0 -> Tier-1 tests had always been quite successful in using DQ2
– now, 6 months later, we finally start to see some improvement
– many design decisions of the site services were altered to adapt to production simulation behaviour (e.g. many fairly "large" open datasets)
We expect to continue to need to operate all LCG DQ2 instances centrally for a while longer
– support and operations are now being set up
– but there is an important lack of technical people aware of MySQL/DQ2 internals

19 Points
1. Longish threads and the use of Savannah for error reports
– e.g. recently we kept getting internal Panda error messages from the worker node for some failed jobs, which were side effects of a failure in the central Panda part
– Propose a single (or at least a primary) contact point for central catalogue issues (Tadashi + Pedro?)
2. For site services, and whenever the problem is clear, please also post a report on Savannah
– We have missed minor bug reports because of this
3. Clarify DQ2's role on OSG and signal possible contributions

