Database Services at CERN Status Update

Database Services at CERN Status Update Maria Girone, CERN IT-PSS

Database Service Evolution
- Until summer 2005:
  - Solaris-based shared Physics DB cluster (2 nodes for HA): low CPU power, hard to extend, shared by all experiments
  - (Many) Linux disk servers as DB servers: high maintenance load, no resource sharing, no redundancy
- Now: consolidation on extensible database clusters
  - No sharing across experiments
  - Higher-quality building blocks: midrange PCs (RedHat ES), FibreChannel-attached disk arrays
- As of last month, all LHC services have been moved
LCG Database Workshop Maria Girone

Service Architecture - Oracle Database Clusters
- The Physics Database Production and Validation services are deployed on 2-node RAC/Linux, in failover mode

Experience with RAC availability
- Managed to apply Oracle security patches in a rolling fashion
  - Big step towards decreasing planned downtime
  - Need timely patch information from Oracle
- Most RAC-based services stayed up during the last power cut; the service is now on critical power
- Investigating some glitches on ATLAS RAC nodes
- Startup after a service problem is significantly faster than with the old disk-server-based services
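The rolling-patch idea above can be illustrated with a small sketch: nodes are patched one at a time, and the procedure refuses to continue if taking a node down would leave no node serving. This is only a model of the invariant, not the procedure actually used at CERN; the function name `rolling_patch`, the node names and the `apply_patch` callback are all hypothetical.

```python
def rolling_patch(nodes, apply_patch):
    """Patch nodes one by one, keeping at least one node up at all times.

    `nodes` is a list of (hypothetical) node names and `apply_patch` is a
    caller-supplied function that patches a single node.
    """
    up = set(nodes)
    patched = []
    for node in nodes:
        up.remove(node)              # take this node out of service
        if not up:
            # With no other node left, this would be a full outage,
            # i.e. exactly the planned downtime a rolling patch avoids.
            raise RuntimeError("rolling patch would take the whole service down")
        apply_patch(node)
        up.add(node)                 # node rejoins the cluster
        patched.append(node)
    return patched


log = []
patched = rolling_patch(["rac-node-1", "rac-node-2"], log.append)
```

With a 2-node cluster the service stays reachable through the other node while each one is patched; on a single-node service the same call raises an error, which is why this was not possible on the old disk-server-based setup.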

DB Storage Configuration (in production)
[Diagram: Oracle RAC nodes (DB N.1, DB N.2) attached to storage arrays carrying the ASM disk groups Data DG-1, Data DG-2, Recovery DG-1 and Recovery DG-2]
- Disk groups created with ‘horizontal’ slicing
- Benefits: more effective use of available storage
  - High availability - allows backups to be kept on disk
  - Higher performance (30%-50%) - allows clusterware mirroring
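One possible reading of the diagram is sketched below. It assumes that each storage array is split into two horizontal slices, that one slice holds the data disk group of one database, and that the other slice holds the recovery disk group of the other database. The slide does not spell this out, so the "outer"/"inner" naming, the array names and the helper functions are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    array: str        # hypothetical storage array name
    region: str       # assumed: "outer" or "inner" part of every disk
    disk_group: str   # ASM disk group the slice belongs to


def horizontal_layout(arrays):
    """Put Data DG-i on array i and the recovery group of the next database
    on the same array, so data and its backup never share an array."""
    n = len(arrays)
    layout = []
    for i, array in enumerate(arrays):
        layout.append(Slice(array, "outer", f"Data DG-{i + 1}"))
        layout.append(Slice(array, "inner", f"Recovery DG-{(i + 1) % n + 1}"))
    return layout


def surviving_groups(layout, failed_array):
    """Disk groups still available after losing one whole array."""
    return sorted(s.disk_group for s in layout if s.array != failed_array)


layout = horizontal_layout(["array-A", "array-B"])
after_failure = surviving_groups(layout, "array-A")
```

Under these assumptions, losing the array that holds Data DG-1 still leaves Recovery DG-1 (the on-disk backup of that database) on the other array.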

Service Throttling - Resource Usage Reports
- Ran into degraded service after a single remote user submitted many (idle) jobs
  - Defined account profile for larger apps
  - DB accounts are shared among many users
  - Switched on idle session “sniping” (default = 3h idle time)
- Proposing a (e.g. weekly) resource overview to the experiment database coordinator
  - Allows the experiment to prioritize resources and identify unexpected usage patterns
  - Which jobs/users got affected by what limit?
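The selection logic behind idle-session sniping and a per-account overview can be sketched as follows. The slide does not describe how either is implemented, so the `Session` record, the account names and the two helper functions are hypothetical; only the 3-hour default comes from the slide.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta

IDLE_LIMIT = timedelta(hours=3)  # default idle time quoted on the slide


@dataclass
class Session:
    session_id: int
    account: str             # shared DB account, not an individual user
    last_activity: datetime


def sessions_to_snipe(sessions, now, idle_limit=IDLE_LIMIT):
    """Return the sessions that have been idle for longer than the limit."""
    return [s for s in sessions if now - s.last_activity > idle_limit]


def sniped_per_account(sniped):
    """Count sniped sessions per account, as input for a usage report."""
    return Counter(s.account for s in sniped)


now = datetime(2006, 2, 1, 12, 0)
sessions = [
    Session(1, "app_a", now - timedelta(hours=4)),
    Session(2, "app_a", now - timedelta(minutes=10)),
    Session(3, "app_b", now - timedelta(hours=3, minutes=1)),
]
sniped = sessions_to_snipe(sessions, now)
report = sniped_per_account(sniped)
```

Because accounts are shared among many users, a report keyed on the account alone cannot answer "which jobs/users got affected"; that would need extra per-session information that this sketch does not model.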

RAC Hardware evolution for 2006
- Linear ramp-up budgeted for hardware resources in 2006-2008
- Planning next major service extension for Q3 this year
Columns: ALICE, ATLAS, CMS, LHCb, Grid, 3D, Non-LHC, Validation
Current state: - 2-node offline 2-node 2x2-node 2-node online test Pilot on disk server
Proposed structure in Q2 2006: 4-node 4-node 2-node (PDB replacement) 2-node valid/test 2-node pilot Compass?? Online?

RAC Expansion for Q2
- New mid-range servers received and installed
  - Passed acceptance tests by IT-FIO
- Waiting for additional disk arrays and fibre channel switches
  - Expect delivery end of February
- Planning the setup in collaboration with IT-FIO
- Proceed in two steps
  - February: extension of existing RACs with additional CPUs
    - Cabling work for fibre channel and IP networks has started
  - March: creation of new RACs, e.g. dedicated experiment validation servers, once disk arrays and switches have arrived

Moving to 10gR2
- Proceed with move to 10gR2 as main production platform for 2006
- Planning with IT-DES to migrate development service for experiments to 10gR2 this month
- Plan to set up new RAC servers with 10gR2
  - Will start with validation setups
- Plan to migrate production service to the new release as soon as experiments have validated their apps on the dev or validation service
  - Target: complete move by end of March

Backup Strategy - Review with Experiments
- Default backup retention policy and frequency need review by experiments
  - Backup schedule: is the default of two full backups sufficient?
  - Is the latency of a partial or full recovery acceptable?
- Can we reduce the fraction of active writeable data, and thereby backup volume and latency?
  - Impact on physical data organisation and applications
- Database backup/recovery at Tier 1s
  - Any experiment requirements on latency to recover?
  - Impact on Tier 0 services for replicated data
- Propose to set up meetings with experiment database coordinators
  - Document an agreed strategy and present it at the next workshop (summer)
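The question about reducing the fraction of active writeable data can be made concrete with a back-of-the-envelope calculation. The sketch below assumes that read-only data needs to be backed up only once, so that the recurring backup volume scales with the writeable fraction, and that restore time is simply volume divided by throughput. The database size, fraction and throughput are invented numbers, not figures from the slides.

```python
def recurring_backup_volume_gb(total_gb, writable_fraction):
    """Volume that must be backed up repeatedly, assuming read-only data
    is backed up once and then left alone."""
    if not 0.0 <= writable_fraction <= 1.0:
        raise ValueError("writable_fraction must be between 0 and 1")
    return total_gb * writable_fraction


def restore_hours(volume_gb, throughput_mb_s):
    """Idealised restore latency: volume divided by sustained throughput."""
    return volume_gb * 1024 / throughput_mb_s / 3600


total_gb = 1000            # hypothetical database size
throughput_mb_s = 50       # hypothetical restore throughput

volume = recurring_backup_volume_gb(total_gb, 0.25)
full_restore = restore_hours(total_gb, throughput_mb_s)
partial_restore = restore_hours(volume, throughput_mb_s)
```

With these made-up inputs, restoring only the writeable quarter of the data takes a quarter of the time of a full restore, which is the kind of trade-off the review with the experiments would need to weigh against the impact on data organisation.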

Summary
- LCG database services now fully based on RAC
  - Benefits of consolidation and additional flexibility obtained
- Q2 database extension proceeding as planned
  - Dedicated experiment database clusters will double in CPU power
  - Dedicated validation resources will simplify planning
- Second h/w extension (Q3) will need to go out soon
- Need to regularly plan evolution with those responsible for the experiment databases
  - Regular resource usage reports could be a good basis
- Get started with backup and recovery strategy discussions