Dirk Duellmann CERN IT/PSS and 3D
LCG 3D Project Update
Dirk Duellmann, CERN IT/PSS and 3D
http://lcg3d.cern.ch

LCG 3D Service Architecture
[Architecture diagram: replication flows via Oracle Streams, http caches (Squid), and cross-DB copies to MySQL/SQLite files]
Online DB: autonomous, reliable service
T0: autonomous, reliable service
T1: database backbone, all data replicated, reliable service
T2: local database cache, subset of the data, only local service
R/O access at Tier 1/2 (at least initially)
LCG 3D Status, Dirk Duellmann

LCG Database Deployment Plan
After the October '05 workshop a database deployment plan was presented to the LCG GDB and MB: http://agenda.cern.ch/fullAgenda.php?ida=a057112
Two production phases:
March - Oct '06: partial production service
- Production service (parallel to the existing testbed)
- H/W requirements defined by experiments/projects
- Based on Oracle 10gR2
- Subset of LCG tier 1 sites: ASCC, CERN, BNL, CNAF, GridKA, IN2P3, RAL
Oct '06 onwards: full production service
- Adjusted h/w requirements (defined at the summer '06 workshop)
- Remaining tier 1 sites join in: PIC, NIKHEF, NDGF, TRIUMF

Oracle Licenses for Tier 1s
3D collected the license needs from the experiments and s/w projects and validated them with the T1 site responsibles: some 152 processor licenses (incl. Grid Services and Castor).
CERN has negotiated a proposal with Oracle with very attractive conditions. The T1 sites have been contacted and agreed to the proposal; FNAL had already acquired their licenses.
Tier 1 sites should now be covered for s/w and support! Support accounts for Tier 1s will be enabled as soon as the papers have been signed and the site contribution has been received.

Oracle Instant Client Distribution
The issue of client distribution has been discussed with Oracle and an agreement has been reached: the Instant Client can be integrated into the LCG middleware and application distributions, as long as the included license file is preserved and adhered to.
The SPI project in the Application Area will from now on bundle the software as part of AA releases. Experiments and LCG middleware should take advantage and pick up validated client releases from this single source. Version management will happen as for other AA packages, via the established channels for external packages.

Tier 1 Hardware Setup
Proposed setup for the first 6 months:
- 2-3 dual-CPU database nodes with 2 GB of memory or more, set up as a RAC cluster (preferably) per experiment
  - ATLAS: 3 nodes with 300 GB storage (after mirroring)
  - LHCb: 2 nodes with 100 GB storage (after mirroring)
  - Shared storage (e.g. FibreChannel) proposed to allow for clustering
- 2-3 dual-CPU Squid nodes with 1 GB of memory or more
  - Squid s/w packaged by CMS will be provided by 3D
  - 100 GB storage per node
  - Need to clarify service responsibility (DB or admin team?)
Target s/w release: Oracle 10gR2 on RedHat Enterprise Server, to ensure Oracle support.

LCG 3D - Tier 1 Database Setup
Phase 1 sites:
- ASGC, BNL, CNAF, IN2P3, RAL: DB cluster available, part of the 3D throughput tests
- GridKA: DB cluster available, expected to join soon
Phase 2 sites:
- TRIUMF: regular attendance in 3D planning meetings
- PIC, NIKHEF/SARA: DBA contact established
- NDGF: early discussions

LCG 3D Throughput Tests
Initially scheduled for May: use the production database clusters at tier 1 and obtain a first estimate of the replication throughput which can be achieved with the setup, e.g. as input to experiment models for the calibration data flow.
Tests started at the beginning of May, but will extend until the end of June. Main reasons:
- Server setup problems (often db storage) at sites
- Firewall configuration
- Throughput optimization needs Oracle experts to be involved

June Test Setup
The tests are now split into:
- Basic site tests with all T1s
- LAN throughput optimization with Oracle (CERN local); ATLAS and CMS online groups involved
- WAN throughput (CERN-CNAF); LFC team and LHCb involved
- Monitoring setup tests (CERN-RAL)
This puts a significant load on one central expert at CERN; we are trying to offload it with the help of the database service team.

Database Monitoring
A central Oracle Enterprise Manager repository has been set up at CERN to collect the status and detailed diagnostics of all 3D production clusters. Some sites will in parallel have the information integrated into their site-local OEM setups.
Integration into higher level monitoring tools, e.g. the experiment dashboards or GridView, will be based on this information. In addition, test jobs (a la SFT) are planned as soon as higher level services (e.g. COOL) are deployed by the experiments.
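The SFT-style test jobs mentioned above were still being planned at the time, but their core would be a trivial availability probe per cluster. The sketch below is purely illustrative (the function names and the reporting format are assumptions, not the actual 3D test-job code); it runs a cheap query through a caller-supplied connection factory and reports OK or FAIL:

```python
# Hypothetical sketch of an SFT-style availability check for a 3D
# database cluster. The caller supplies a zero-argument connection
# factory (e.g. lambda: cx_Oracle.connect(dsn=...)), so the check
# itself stays independent of any particular driver.

def check_cluster(connect):
    """Return 'OK' if a trivial query succeeds, 'FAIL: <reason>' otherwise."""
    try:
        conn = connect()
        cur = conn.cursor()
        cur.execute("SELECT 1 FROM dual")      # cheap liveness probe on Oracle
        row = cur.fetchone()
        return "OK" if row and row[0] == 1 else "FAIL: unexpected result"
    except Exception as exc:                   # listener down, auth error, ...
        return f"FAIL: {exc}"
```

A higher-level monitor could run this per site and feed the OK/FAIL strings into the dashboard integration described above.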

Preliminary Streams Performance
Out-of-the-box replication performance: between 10 and 30 MB/min (LAN) for a single conditions insert job. This is sufficient for the current experiment conditions scenarios.
There is still quite some room for optimisation, as neither the source nor the destination DB seems to be significantly loaded. Regular phone calls take place between Oracle and the 3D DBAs, with the head of Streams development attending.
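To put the quoted 10-30 MB/min range in perspective, a quick back-of-envelope calculation is useful. The 1 GB dataset size below is a hypothetical example, not an experiment requirement:

```python
# Back-of-envelope check of the out-of-the-box Streams rates
# (10-30 MB/min over the LAN). The dataset size is illustrative.

def replication_time_minutes(dataset_mb, rate_mb_per_min):
    """Time in minutes to replicate a dataset at a given throughput."""
    return dataset_mb / rate_mb_per_min

fast = replication_time_minutes(1024, 30)   # 1 GB at the high end: ~34 min
slow = replication_time_minutes(1024, 10)   # 1 GB at the low end: ~102 min
print(f"1 GB of conditions data: {fast:.1f} min at 30 MB/min, "
      f"{slow:.1f} min at 10 MB/min")
```

Even at the low end, a 1 GB conditions update replicates in under two hours, which is consistent with the statement that the rates suffice for the current conditions scenarios.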

Frontier/SQUID Tests
- Stress tests with many clients at CERN to validate the production setup for CMS; DNS-based failover validated
- Setting up an additional node for ATLAS Frontier tests with COOL
- CMS Frontier tests at FNAL focus on connection retry and failover, a common s/w component shared with database access
- The CMS July release may pick up the LCG AA s/w release which is currently being prepared
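The retry/failover behaviour under test can be sketched in a few lines. This is an illustrative model only (server names and the fetch callable are hypothetical; the real logic lives in the shared client component mentioned above): try each configured server in order and return the first successful response.

```python
# Minimal sketch of connection retry/failover across a list of
# Frontier/Squid servers. Names are examples, not real endpoints.

def fetch_with_failover(servers, fetch):
    """Try each server in turn; return (server, result) of the first success."""
    last_error = None
    for server in servers:
        try:
            return server, fetch(server)
        except Exception as exc:       # network error, bad response, ...
            last_error = exc           # remember it, fall through to the next
    raise RuntimeError(f"all servers failed: {last_error}")
```

The DNS-based failover validated at CERN moves this decision into name resolution instead, but the client-side retry loop covers the case where a resolved server accepts connections and then fails.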

CMS Squid Deployment
Squids are deployed at several Tier 1 and Tier 2 sites:
- Tier 1 - LCG: ASCC, IN2P3, PIC, RAL, CNAF, FZK, CERN; OSG: FNAL
- Tier 2 - LCG: Bari, CIEMAT, DESY, Taiwan; OSG: UCSD, Purdue, Caltech
- Remaining Tier 2 sites for the SC4 goals - LCG: Belgium, Legnaro, Budapest, Estonia, CSCS, GRIF, Imperial, Rome, Pisa, NCU/NTU, ITEP, SINP, JINR, IHEP, KNU; OSG: Florida, Nebraska, Wisconsin, MIT. In progress, plan to finish in the next month.
Request WLCG support for Tier-2 Squid installation:
- Minimum specs: 1 GHz CPU, 1 GByte memory, GBit network, 100 GB disk
- Needs to be well connected (network-wise) to the worker nodes, and have access to WAN and LAN if on a private network
- Having 2 machines for failover is a useful option for a Tier 2, but not required
- Sites perform the Squid installation using the existing instructions
- Installation should be completed and tested by July 15, 2006
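For orientation, a minimal squid.conf fragment matching the minimum specs above might look as follows. All values and the worker-node subnet are illustrative assumptions, not CMS-mandated settings; sites should follow the official installation instructions referenced on the slide.

```conf
# Illustrative squid.conf fragment for a Tier-2 Frontier cache node
# (1 GByte memory, 100 GB disk, per the minimum specs above).
# Every value here is an example, not a prescribed configuration.

http_port 3128
cache_mem 256 MB                               # leave headroom for the OS on a 1 GB node
cache_dir ufs /var/spool/squid 100000 16 256   # roughly 100 GB of disk cache
maximum_object_size 128 MB

# Only the local worker nodes should be allowed to use the cache
acl worker_nodes src 192.168.0.0/16            # example private WN subnet
http_access allow worker_nodes
http_access deny all
```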

Messages to T1 Sites
Initial set of Tier 1s:
- Production setups need to be usable for the throughput tests of the experiments by the end of June
- Need consistent meeting attendance from all sites
- Need all sites to join the monitoring setup asap so that the integration can start
- Need DBA coverage during the experiment tests July-Oct
New Tier 1s (NDGF, PIC, NIKHEF/SARA and TRIUMF):
- The production setup needs to be usable for full production in October
- DB h/w and s/w setup planning will need to start soon (sites should start from the current h/w request)
- Please let us know when your h/w acquisition could/should start

Messages to Experiments
- Need the experiments to confirm or update their current database and/or Frontier requests for the October production
- Need a work plan from each experiment for the period from July to October, covering both database and Frontier
- A significant fraction of the service expected as of October needs to be demonstrated by the middle of August
- The plan should address the full chain from the online databases over offline to tier 1
- This activity will require significant experiment resources to drive the tests

Proposed Timeline
- End of June: throughput phase closed; experiment application and throughput tests start
- Early July: 3D DBA day to plan database setup options with the new tier 1 sites (hosted by one of the new sites?)
- Late August: 3D workshop (experiments and sites) defining the October setup and service
- September: experiment ramp-up tests
- October: full service open at all tier 1 sites