Distributed Data Management – DQ2 status & plans. Miguel Branco, BNL workshop, October 3, 2007.

Presentation transcript:

1 Distributed Data Management – DQ2 status & plans – Miguel Branco – BNL workshop, October 3, 2007

2 Outline (focusing mostly on the site services): Releases – Observations from 0.3.x – 0.4.x – Future plans

3 Release status: 0.3.x (‘stable’) in use on OSG [I think :-)]; 0.4.x being progressively rolled out – and 0.4.0_rc8 in use on all LCG sites. 0.3.x was focused on moving the central catalogues to a better DB; the site services were essentially the same as in the previous release. 0.4.x is about site services only – in fact, the 0.4.x version number applies to the site services only, and clients remain on 0.3.x.

4 Observations from 0.3.x: Some problems were solved – overloading of the central catalogues, big datasets, choice of the next files to transfer… but this introduced another problem: as we became more ‘successful’ at filling up the site services’ queue of files, we started overloading the site services themselves with expensive ORDERing queries (e.g. to choose which files to transfer next).
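To make the problem concrete, here is a hypothetical illustration (Python, with the SQL as a plain string) of the kind of ranking query meant above; the real DQ2 0.3.x schema and SQL are not shown in these slides, so the table and column names are invented:

```python
# Hypothetical table/column names -- the real DQ2 0.3.x schema is not shown here.
NEXT_FILES_QUERY = """
SELECT file_id, dataset_id
  FROM file_queue
 WHERE state = 'PENDING' AND channel = %s
 ORDER BY dataset_priority DESC, request_time ASC  -- sorts the whole backlog
 LIMIT 100
"""
# With hundreds of thousands of PENDING rows and no covering index, every such
# query makes MySQL sort a large candidate set; issued on each agent cycle,
# this is the ORDER BY load referred to on the following slides.
```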

5 Observations from 0.3.x: The ramp-up in simulation production, followed by poor throughput, created a large backlog. Also, the site services were (and still are) insufficiently protected against requests that are impossible to fulfil: “transfer this dataset, but keep polling for data to be created – or streamed from another site”… but the data never came, was never produced, or was subscribed from a source that was never meant to have it…

6 Observations from 0.3.x: The most relevant changes to the site services were those to handle large datasets – less important for PanDA production, but very relevant for LCG, where all datasets were large.

7 Implementation of 0.3.x: The implementation was also too simplistic for the load we observed – the MySQL database servers were not extensively tuned, there was too much reliance on the database for IPC, and there was heavy overhead from ‘polling’ FTS and the LFCs (e.g. no GSI session reuse for the LFC). Machine loads >10 were common (I/O + CPU), in particular with the MySQL database on the same machine (~10 processes doing GSI plus mysqld doing ORDER BYs).

8 Deployment of 0.3.x: FTS at the T1s usually runs on high-performing hardware, with the FTS agents split from the database… but FTS typically holds only a few thousand files in the system. At a typical T1, DQ2 is usually deployed with the database on the same machine as the site services, and its queue holds hundreds of thousands of files. DQ2 is the system throttling FTS – it is where the expensive brokering decisions are made and where the larger queue is maintained.
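As a rough sketch of this throttling relationship (the three callables are hypothetical stand-ins, not DQ2 or FTS APIs; the slot target is an assumed figure):

```python
import time

FTS_TARGET = 2000  # keep only a few thousand files inside FTS (assumed figure)

def feed_fts(pick_next_files, fts_active_file_count, submit_to_fts,
             poll_interval=60):
    """Keep FTS topped up from the much larger DQ2 queue.

    All three callables are hypothetical placeholders. The point is the shape
    of the loop: DQ2 holds the O(100k)-file queue and does the brokering,
    handing FTS only small batches -- i.e. DQ2 throttles FTS, not vice versa.
    """
    while True:
        free_slots = FTS_TARGET - fts_active_file_count()
        if free_slots > 0:
            batch = pick_next_files(limit=free_slots)  # the expensive decision
            if batch:
                submit_to_fts(batch)
        time.sleep(poll_interval)
```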

9 Evolution of 0.3.x: During 0.3.x, the solution ended up being to simplify the release, in particular by reducing its ORDERing. Work on 0.4.x had already started – to fix the database interactions while maintaining essentially the same logic.

10 0.4.x: New DB schema, with more reliance on newer MySQL features (e.g. triggers) and less on expensive ones (e.g. foreign key constraints). Able to sustain and order large queues (e.g. FZK/LYON are running with >1.5M files in their queues). One instance of DQ2 0.4.x is used to serve ALL M4 + Tier-0 test data; one instance of DQ2 0.4.x is used to serve 26 sites (FZK + T2s AND LYON + T2s).
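For illustration only, the kind of trigger-based bookkeeping meant here might look as follows; the tables and the trigger are invented, not the actual 0.4.x schema:

```python
# Invented DDL for illustration -- not the actual 0.4.x schema.  A trigger keeps
# a small summary table up to date so agents never COUNT(*) over the big queue.
MAINTAIN_PENDING_COUNT = """
CREATE TRIGGER file_queue_after_insert
AFTER INSERT ON file_queue
FOR EACH ROW
  UPDATE dataset_summary
     SET files_pending = files_pending + 1
   WHERE dataset_id = NEW.dataset_id;
"""
# Note the deliberate absence of FOREIGN KEY constraints between the two tables:
# referential integrity is left to the application, trading safety for speed.
```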

11 What remains to be done: Need another (smaller) iteration on channel allocation and transfer ordering – e.g. in-memory buffers to avoid I/O on the database, creating temporary queues in front of each channel with tentative ‘best files to transfer next’ (sketched below). Some work remains to make the services resilient to failures – e.g. a dropped MySQL connection. We still need to tackle some ‘holes’ – e.g. the queue of files for which we cannot find replicas may still grow forever, and if a replica does eventually appear for one of them the system may take too long to consider that file for transfer… but we have already introduced BROKEN subscriptions to release some load.
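A minimal sketch of such a per-channel in-memory buffer, assuming a hypothetical fetch_candidates(channel, n) database helper (a possible design, not the shipped code):

```python
from collections import deque

class ChannelBuffer:
    """Tentative 'best files to transfer next' for one FTS channel.

    The buffer is refilled from the database in a single query, so agents do
    not hit MySQL for every file they hand to FTS. `fetch_candidates` is a
    hypothetical helper returning up to `n` ordered candidates for a channel.
    """

    def __init__(self, channel, fetch_candidates, size=500):
        self.channel = channel
        self.fetch_candidates = fetch_candidates
        self.size = size
        self.buffer = deque()

    def next_files(self, n):
        if len(self.buffer) < n:
            # one database round-trip refills the whole buffer
            self.buffer.extend(self.fetch_candidates(self.channel, self.size))
        return [self.buffer.popleft() for _ in range(min(n, len(self.buffer)))]
```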

12 What remains to be done: More site services local monitoring – work nearly completed – to be deployed when we are confident it does not cause any harm to the database; we still observe deadlocks (next slides…)

13 – 14 (slides with no transcript text; presumably screenshots of the observed deadlocks)

15 Expected patches to 0.4.x: The 0.4.x branch will continue to focus on the site services only – channel allocation, source replica lookup and submit queues, plus monitoring. Still, the DDM problem as a whole can only be solved by having LARGE files – while we need to sustain queues with MANY files, if we continue with the current file size the “per-event transfer throughput” will remain very inefficient – plus more aggressive policies on denying/forgetting about subscription requests.
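A toy calculation of why file size dominates “per-event transfer throughput”; all numbers (link speed, per-file overhead) are assumptions for illustration, not measurements from these slides:

```python
def effective_throughput(file_size_mb, per_file_overhead_s, link_mb_per_s=100.0):
    """Toy model: every file pays a fixed overhead (SRM negotiation, catalogue
    lookups, FTS bookkeeping) before any byte moves.  Numbers are assumptions."""
    transfer_s = file_size_mb / link_mb_per_s
    return file_size_mb / (transfer_s + per_file_overhead_s)

# Assuming 10 s of per-file overhead on a 100 MB/s channel:
print(effective_throughput(100, 10))   # ~9 MB/s  -> ~9% of the link
print(effective_throughput(1000, 10))  # ~50 MB/s -> 50% of the link
# The same events packed into 10x larger files move ~5x faster end to end,
# which is the "per-event transfer throughput" argument for LARGE files.
```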

16 After 0.4.x: 0.5.x will include an extension to the central catalogues – location catalogue only. This change follows a request from user analysis and LCG production. The goal is to provide a central overview of incomplete datasets (missing files), but also to handle dataset deletion (returning the list of files to delete at a site while coping with overlapping datasets – quite a hard problem!). First integration efforts (the prototype is now complete) are expected to begin mid-November.
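The overlapping-dataset part of deletion can be illustrated with simple set logic (assuming, purely for illustration, that each dataset is known at the site as a set of file GUIDs):

```python
def files_safe_to_delete(doomed, datasets_at_site):
    """A file may only be removed if no *other* dataset at the site still
    contains it -- the overlapping-dataset difficulty in a nutshell."""
    still_needed = set()
    for name, files in datasets_at_site.items():
        if name != doomed:
            still_needed |= files
    return datasets_at_site[doomed] - still_needed

# Example: file 'c' is shared with group.B, so deleting user.A must keep it.
site = {'user.A': {'a', 'b', 'c'}, 'group.B': {'c', 'd'}}
print(sorted(files_safe_to_delete('user.A', site)))  # ['a', 'b']
```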

17 After 0.5.x: Background work has begun on a central catalogue update. Important schema change: new timestamp-oriented unique identifiers for datasets, allowing partitioning of the backend DB transparently to the user and providing a more efficient storage schema – 2007 datasets on one instance, 2008 datasets on another… This work has started on a longer timescale, as the change will be fully backward compatible – old clients will continue to operate as today, to facilitate any 0.3.x → 1.0 migration.
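A hypothetical sketch of what a timestamp-oriented identifier and the resulting partition routing could look like; the actual identifier format chosen for DQ2 is not specified in these slides:

```python
import uuid
from datetime import datetime, timezone

def new_dataset_uid(now=None):
    """Invented format: creation year as a prefix to a random UUID, so the
    backend can be partitioned by time without clients parsing the identifier."""
    now = now or datetime.now(timezone.utc)
    return "%04d.%s" % (now.year, uuid.uuid4().hex)

def backend_instance(dataset_uid):
    # e.g. 2007 datasets live on one DB instance, 2008 datasets on another.
    year = dataset_uid.split('.', 1)[0]
    return "dq2_catalog_%s" % year

uid = new_dataset_uid()
print(uid, '->', backend_instance(uid))
```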

18 Constraints: In March we decided to centralize the DQ2 services on LCG, motivated by the need to understand the problems with production simulation transfers, as our Tier-0 → Tier-1 tests had always been quite successful in using DQ2. Now, 6 months later, we finally start to see some improvement; many design decisions of the site services were altered to adapt to production simulation behaviour (e.g. many fairly “large” open datasets). We expect to need to keep operating all LCG DQ2 instances centrally for a while longer – support and operations are now being set up – but there is an important lack of technical people aware of MySQL/DQ2 internals.

19 Points: 1. Longish threads and the use of Savannah for error reports – e.g. recently we kept getting internal Panda error messages from the worker node for some failed jobs, which were side effects of a failure in the central Panda part. Propose a single (or at least a primary) contact point for central catalogue issues (Tadashi + Pedro?). 2. For the site services, and whenever the problem is clear, please also post a report on Savannah – we have missed minor bug reports because of this. 3. Clarify DQ2's role on OSG and signal possible contributions.