D0 Taking Stock1 By Anil Kumar CD/CSS/DSG July 10, 2006.

Presentation transcript:

D0 Taking Stock1 By Anil Kumar CD/CSS/DSG July 10, 2006

D0 Taking Stock2 D0 offline Production/Integration Infrastructure
- 8 900MHz CPUs, 16G RAM
- The machine has a Clariion 4500 hardware RAID array.
- Oracle Server 10gR2 (64 bit) on Solaris 2.9 (64 bit)
- Load Avg 4-6, CPU usage ~77%, Memory Free: 50%
- Uptime excluding scheduled down times % Uptime (based on 120 min of total db unavailability) since June, 2005 vs

D0 Taking Stock3 D0 offline production Luminosity
- Sun v40z with 2 AMD 844MHz CPU
- RHEL3 x86_64
- 2 Ultra 160 RAID controllers
- 16G RAM
- Oracle server 10gR1

D0 Taking Stock4 D0 offline development Infrastructure
D0ora1
- Sun E MHz CPU, 4GB of RAM
- Oracle 10gR2 64 bit on Solaris 2.9
- Load Avg 1-2, CPU usage ~33%, Mem Free 19%
D0lum1
- Sun v40z with 2 844MHz AMD CPU
- RHEL3 x86_64
- 16G RAM
- Oracle server 10gR2

D0 Taking Stock5 Space Usage

D0 Taking Stock6 Space Usage Summary
- d0ofprd GB used.
- d0ofint1 103 GB used.
- 2.25TB is available for use for int and production.
- d0ofdev1 120 GB used. 11GB is available for use.
- d0oflumd 285GB used.
- d0oflumi 482MB used.
- 150GB is available for d0oflumd and d0oflumi.
- d0oflump 363GB used. 411 GB is available.

D0 Taking Stock7 Capacity Planning
- Next three years expected growth: 1.1 TB.
- SAM growth 375GB/year and other apps 15GB/year. This excludes the Luminosity DB.
- We have around 2.2TB available.
- Luminosity growth is 125GB/year.
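The three-year figure on this slide can be checked with a little arithmetic. The sketch below is only an illustration. It assumes linear growth at the quoted rates and decimal units (1 TB = 1000 GB), and all function and variable names are hypothetical.

```python
# Growth rates quoted on the slide, in GB per year.
SAM_GROWTH_GB_PER_YEAR = 375
OTHER_APPS_GROWTH_GB_PER_YEAR = 15
LUMINOSITY_GROWTH_GB_PER_YEAR = 125
AVAILABLE_GB = 2200  # "around 2.2TB available", assuming 1 TB = 1000 GB


def projected_growth_gb(rates_gb_per_year, years):
    """Total growth in GB, assuming every rate stays constant (linear growth)."""
    return sum(rates_gb_per_year) * years


# SAM plus other applications, excluding the Luminosity DB as the slide does.
offline_growth = projected_growth_gb(
    [SAM_GROWTH_GB_PER_YEAR, OTHER_APPS_GROWTH_GB_PER_YEAR], years=3
)
luminosity_growth = projected_growth_gb([LUMINOSITY_GROWTH_GB_PER_YEAR], years=3)

print(f"Offline growth over 3 years: {offline_growth} GB")
print(f"Luminosity growth over 3 years: {luminosity_growth} GB")
print(f"Headroom after offline growth: {AVAILABLE_GB - offline_growth} GB")
```

Under these assumptions the offline databases grow by 1170 GB over three years, in line with the roughly 1.1 TB quoted, and the stated free space covers that growth.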

D0 Taking Stock8 Accomplishments
- Upgraded D0 offline databases to 10gR2. Also OS upgrade for D0dbsrv nodes.
- Replacement of d0dbsrv5 node with new hardware and upgraded memory to 4GB vs 2GB.
- Export of Trigger Database. Retention policy: 30 days on disk and daily taken to Dcache.
- Mini-trigger Simulator set-up.
- Deployment of Lum DB in production 10gR2.
- Quarterly Database Security/OS patches up to date.
- Upgrade OEM to 10g.
- Rewrite of dbatools/toolman for enhanced features of monitoring and 10g support.
- Disk capacity upgrade on d0 offline production database.
- DB security enhancements: restricting access to Dictionary, restricted usage of Database Links, password complexity, and locking the obsolete accounts.
- Deployment of SAM Request System Schema v6_0. Also deployed version v6_1. v6_3 in development.
- Moving d0 offline to a standardized backup recovery method using a SAN and enstore.
- Parallel testing of SAN as backup media for development and production instances going well.

D0 Taking Stock9 Back-up/Recovery
D0ofprd1
- Daily, 7 days of archives, one backup always on disk
- Bi-monthly backup of READ ONLY tablespaces
- Allocated 2TB, used 1.2TB, to tape daily
- RMAN backup time -> 6 hrs (3 hrs 45 excl. READ ONLY + 2 hrs 20 READ ONLY)
- No export
- Tape rotation: 1 week for daily backups and 2 months for read-only backups.
- Backups taken to dcache 2x/week, read-only taken 2x/month. Archives taken every 30 min.
D0lump
- Daily backups to SAN. To dcache daily. Archives taken every 30 min.
D0ofint1
- Once a week on local disk
D0ofdev1
- Sat. backup on local disk, otherwise on SAN
- Allocated 100GB, used 58GB, daily tape backup
- RMAN backup time -> 2 hrs.
- Tape rotation: 2 months.
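The backup window and tape rotation figures above lend themselves to a quick check. The following is a minimal sketch, not the actual backup tooling. The helper name `tape_reusable` is hypothetical, and "2 months" is approximated as 60 days, which is an assumption.

```python
from datetime import date, timedelta

# Durations quoted on the slide for the d0ofprd1 RMAN backup.
read_write_part = timedelta(hours=3, minutes=45)  # excluding READ ONLY tablespaces
read_only_part = timedelta(hours=2, minutes=20)   # READ ONLY tablespaces
total = read_write_part + read_only_part          # the slide rounds this to 6 hrs

# Rotation periods; "2 months" is approximated as 60 days (an assumption).
ROTATION = {"daily": timedelta(weeks=1), "read_only": timedelta(days=60)}


def tape_reusable(kind, taken_on, today):
    """Return True once a backup of the given kind is older than its rotation period."""
    return today - taken_on > ROTATION[kind]


print(f"Total RMAN backup time: {total}")
print(tape_reusable("daily", date(2006, 7, 1), date(2006, 7, 10)))
print(tape_reusable("read_only", date(2006, 7, 1), date(2006, 7, 10)))
```

The two parts add up to 6 hours 5 minutes, consistent with the 6 hours quoted. A nine-day-old daily backup is past its one-week rotation, while a read-only backup of the same age is not.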

D0 Taking Stock10 Production backups to SAN
- Two 1TB SAN mount points in use on d0ora2
- One in use on d0lum2
- Daily backup to SAN
- Always 1 backup on disk, plus X200 tape library backup of RMAN from local disk, and dcache copy
- Read-only portion of database backed up twice/month to SAN

D0 Taking Stock11 SAN issues
- Current SAN does not have 24 x 7 support.
- IDE disks are not as reliable as other, more expensive disks. However, these seem to be reliable. We run RMAN backup validate on the backup files on the SAN. Also, a recovery was recently done after a restore from the SAN.
- Current SAN has been trouble free except when the path failed a couple of months ago. Because the SAN is not dual path, the failure prevented backups over the weekend, and because there is no 24/7 support we had to wait until Monday to get support.
- Purchasing a 24 x 7 SAN requires licensing and changes to the O/S to be able to use it.
- Details on the future of the SAN for RunII databases will be covered in Ray P./Steve K.'s presentation.

D0 Taking Stock12 SAM Schema
- Production deployments: Storage Location v6_1, SAM Request Sub System v6_0
- Work in progress: v6_3 Retiring Files.
- Upgrade to Mini SAM as the SAM schema evolved. -> This lets individual developers have a copy of SAM metadata and seed data available for a server software rewrite if needed.
- Mini-SAM in Postgres. Initiative to move towards freeware databases for SAM. Proof of product not complete, requires testing with a dbserver from the SAM development team.
- 2.38B events in 47 partitions. Now avg 1 partition / 3 running weeks.
- Partitions rollover dates URL :
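The partition figures on this slide imply an average partition size and a rough event rate. The sketch below is only illustrative. It assumes that new partitions hold about the historical average number of events, which the slide does not state, and the variable names are hypothetical.

```python
TOTAL_EVENTS = 2.38e9    # "2.38B events"
PARTITIONS = 47          # "47 partitions"
WEEKS_PER_PARTITION = 3  # "1 partition / 3 running weeks"

# Historical average number of events stored per partition.
avg_events_per_partition = TOTAL_EVENTS / PARTITIONS

# Implied rate if a new partition of average size fills every 3 running weeks.
events_per_running_week = avg_events_per_partition / WEEKS_PER_PARTITION

print(f"Average events per partition: {avg_events_per_partition / 1e6:.1f} million")
print(f"Implied events per running week: {events_per_running_week / 1e6:.1f} million")
```

Under that assumption each partition holds about 50.6 million events, which corresponds to roughly 16.9 million events per running week.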

D0 Taking Stock13 What’s Next ?
- Deploy SAN/enstore backup recovery plan.
- Replacement of aging Clariion array.
- Maybe new d0dbsrv nodes, at least primary nodes.
- Luminosity DB server is 2 times more performant with the C++ caching server, but it causes intermittent crashes of other Calib servers. Maybe dedicated nodes for Luminosity servers.
- Cut new event partitions for SAM.
- ASO (Advanced Security Option) deployment.
- Upgrade Designer and its repository to 10g.
- Bundling of Redhat renewal licenses into one P.O.
- Testing of Postgres mini SAM for proof of product.

D0 Taking Stock14 Concerns
- Python Dcoracle to be built with Oracle 10g Client. Oracle recommends that the client version be the same as the database version. Any Oracle patch may break Python Dcoracle built with the 8i client.
- Backups will get bigger. So backup of VLDB SAM Servers on Linux ?
- Security audits may mandate dedicated nodes for SAM servers and web servers.
- Not enough space for Integration DB to do a full refresh of SAM.
- Single points of failure with D0 offline database.
- Future of the aging Clariion array must be addressed in the next budget.
- Hardware for D0 DB server machines is very old. Should consider upgrading the hardware for d0 db servers.
- Post the Performance Graphs gone in 10gR2 monitoring tool.