Copyright © 2009 Rolta International, Inc., All Rights Reserved Oracle High Availability - A Case Study Rama Balaji Senior Oracle Consultant.

Overview

This presentation is a case study of a customer whose storage array failed during a period of heavy transaction processing. It covers the methods used to restore the database quickly, how the best-practice configuration was reached while keeping at least two copies of the production database online, and how Oracle's HA components were used to reach the final configuration with only a few minutes of production downtime.

Agenda

- Background
- Approach
- Initial Configuration
- Disaster to Full Recovery
- High Availability Features
- Conclusion and Best Practices
- Question and Answer

Oracle Features Used

- RMAN compressed backups
- Physical Standby using Data Guard
- ASM to non-ASM and vice versa
- Cascaded Destinations
- Standby 2-node RAC
- Failover and Switchover
- Flash Recovery Area and Flashback Logs

Background

- E-commerce client
- Nature of business
- Initial cost-effective business approach
- Business growth
- IT infrastructure

Approach

Event                                                            Database Availability
1. Storage array failure                                         No database
2. Restored database from tape using RMAN                        One
3. Set up the physical standby                                   Two
4. Second physical standby using cascaded destination feature    Three
5. Failover from production to standby                           Two

Approach

Event
6. 2-node RAC with ASM on the new storage array as standby
7. Switched over from single-node database instance to 2-node RAC as primary
8. Disconnected the second standby
9. 2-node RAC standby to 2-node RAC primary
10. Flashback logs cleanup from FRA

Database Availability: Three, Two, No Downtime

Initial Configuration

[Diagram: storage array, 2-node RAC + ASM]

- Server: Linux Red Hat x86_64
- Database: Oracle
- ASM: two disk groups
  - DATA_DG: database
  - FLASH_DG: flash_recovery_area
- Daily full database backup to disk with RMAN, and to tape using third-party software
- Hourly archivelog backups to disk with RMAN, and to tape using third-party software
- No RMAN catalog

RMAN

RMAN> show controlfile autobackup;
RMAN configuration parameters are:
CONFIGURE CONTROLFILE AUTOBACKUP ON;

RMAN> show controlfile autobackup format;
RMAN configuration parameters are:
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO '/opt/oracle/admin/PROD/backup/cntl/%F';

RMAN Backup Script

RUN {
  CONFIGURE RETENTION POLICY TO REDUNDANCY 1;
  ALLOCATE CHANNEL ch1 DEVICE TYPE DISK MAXPIECESIZE 2G
    FORMAT '/opt/oracle/admin/PROD/backup/db/RCOMPRESSED_%U';
  BACKUP as compressed backupset DATABASE plus archivelog channel ch1;
}

RMAN Archivelog Backup Script

RUN {
  ALLOCATE CHANNEL ch1 DEVICE TYPE DISK
    FORMAT '/opt/oracle/admin/PROD/backup/archive/ARCH_%U';
  BACKUP as compressed backupset archivelog all;
}

Storage Array Failure

[Diagram: storage array, 2-node RAC + ASM]

The storage admin determined that the SAN failure had caused the loss of all data, including the database and the backups. We validated that the previous night's RMAN backup on tape and the archivelog files from the previous hour were intact.

Restore using RMAN

[Diagram: storage array with 2-node RAC + ASM; single database, non-ASM]

Restored the full backup and all archivelog backups from tape to internal disks on one of the RAC nodes.

Restore from ASM to non-ASM

- Restored the backup from tape to the original backup location.
- Restored the spfile from autobackup:
  RMAN> restore spfile from autobackup;
- Created a pfile from the spfile.

Restore from ASM to non-ASM (Continued…)

Edited the following parameters in the pfile:
- control_files: changed +DATA_DG and +FLASH_DG to file system
- db_file_name_convert: changed +DATA_DG to file system
- log_file_name_convert: changed +DATA_DG and +FLASH_DG to file system

Started the instance with startup nomount using the pfile.

Using RMAN nocatalog, restored the controlfile from autobackup:
RMAN> restore controlfile from '/opt/oracle/admin/PROD/backup/cntl/c e '
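The parameter edits on this slide amount to remapping ASM disk group names to file system directories. As a rough illustration only (Oracle performs the substitution itself through db_file_name_convert and log_file_name_convert), the Python sketch below replaces the first matching pattern in a file name. The target directories /u01/oradata/PROD and /u01/fra/PROD, the sample file names, and the helper name convert_name are hypothetical; only the disk group names come from the slides.

```python
def convert_name(filename, pairs):
    """Return filename with the first matching pattern replaced.

    pairs is a list of (pattern, replacement) tuples checked in order,
    loosely mirroring the pattern/replacement pairs written in a
    *_file_name_convert parameter. This is a simplified model, not
    Oracle's actual implementation.
    """
    for pattern, replacement in pairs:
        if pattern in filename:
            # Replace only the first occurrence of the pattern.
            return filename.replace(pattern, replacement, 1)
    # No pattern matched: the name is left unchanged.
    return filename


# Hypothetical file system targets; the disk group names are from the slides.
PAIRS = [
    ("+DATA_DG", "/u01/oradata/PROD"),
    ("+FLASH_DG", "/u01/fra/PROD"),
]
```

With these assumed pairs, convert_name("+DATA_DG/prod/datafile/users.dbf", PAIRS) returns "/u01/oradata/PROD/prod/datafile/users.dbf", and a name that contains neither disk group is returned unchanged.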

Restore from ASM to non-ASM (Continued…)

alter database mount;
restore database;
recover database;
Created temporary tablespaces.
alter database open ;

Physical Standby

[Diagram: primary database PROD, standby database PRODS]

Physical Standby Steps

On the primary database (PROD), generated a pfile and copied that file to the standby server:
create pfile='/tmp/initPROD.ora' from spfile;

On the primary (PROD), backed up the current controlfile using RMAN:
RMAN> backup current controlfile for standby;
RMAN> copy current controlfile for standby to '/tmp/sby_control01.ctl';

Physical Standby Steps (Continued…)

Ran a full RMAN backup on the primary (PROD):
RMAN> RUN {
  ALLOCATE CHANNEL ch1 DEVICE TYPE DISK MAXPIECESIZE 2G
    FORMAT '/home/oracle/backup_standby/RCOMPRESSED_%U';
  BACKUP as compressed backupset DATABASE plus archivelog channel ch1;
}

Copied all backup files, as well as the controlfile backup, to the standby (PRODS). The backup files need to be in the same location on the standby as on the primary; otherwise, create a symbolic link.

Physical Standby Steps (Continued…)

Edited the parameter file on the standby (PRODS):
- Changed control_files to point to the controlfile on the standby server.
- Changed db_file_name_convert to use the new location on the standby server.
- Changed log_file_name_convert to use the new location on the standby server.

Started the standby (PRODS) instance in nomount.

Physical Standby Steps (Continued…)

rman nocatalog
RMAN> connect target
RMAN> connect auxiliary /
RMAN> RUN {
  ALLOCATE auxiliary CHANNEL ch1 DEVICE TYPE DISK
    FORMAT '/home/oracle/backup_standby/RCOMPRESSED_%U';
  duplicate target database for standby;
}

Changed the following parameter on the primary (PROD):
SQL> ALTER SYSTEM SET log_archive_dest_2='SERVICE=PRODS_XPT REOPEN=300';

Physical Standby Steps (Continued…)

Changed the standby (PRODS) to managed recovery:
alter database recover managed standby database disconnect from session;

Issued the following query to make sure the archive log files are applied on the standby (PRODS):
select sequence#, applied from v$archived_log order by sequence#;
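The v$archived_log check can be read mechanically: each archived sequence should eventually show APPLIED = YES. The Python sketch below takes rows shaped like that query's output and reports the sequences that are not yet applied. The helper name unapplied_sequences and any sample rows are illustrative assumptions, not part of the case study.

```python
def unapplied_sequences(rows):
    """Return the sorted sequence numbers whose APPLIED flag is not 'YES'.

    rows is an iterable of (sequence, applied) tuples, shaped like the
    output of:
        select sequence#, applied from v$archived_log order by sequence#;
    """
    return sorted(
        seq for seq, applied in rows
        if applied.strip().upper() != "YES"
    )
```

For example, unapplied_sequences([(6501, "YES"), (6502, "YES"), (6503, "NO")]) returns [6503], which would suggest the standby has not yet applied the most recent log.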

Data Guard Cascaded Destinations

[Diagram: Primary Database (PROD) -> Data Guard -> Standby Database 1 (PRODS) -> Data Guard cascaded destination -> Standby Database 2 (PRODC)]

On PROD: LOG_ARCHIVE_DEST_2='SERVICE=PRODS_XPT REOPEN=300';
On PRODS: LOG_ARCHIVE_DEST_2='SERVICE=PRODC_XPT REOPEN=300';

Data Guard Failover

[Diagram: Data Guard failover among PROD, PRODS, and PRODC]

Data Guard Failover Steps

On the primary (PROD), after users were logged out, created a table called TEST. This was the last operation on the primary database:
create table test as select * from dba_users;

On the primary (PROD), issued the following command a couple of times:
alter system switch logfile;

Data Guard Failover Steps (Continued…)

On the primary (PROD), noted down the sequence#:
SQL> archive log list
Database log mode              Archive Mode
Automatic archival             Enabled
Archive destination            /d01/oracle/flashback
Oldest online log sequence     6503
Next log sequence to archive   6504
Current log sequence           6504

On the standby (PRODS), issued the following command to make sure the standby is in maximum performance mode:
SQL> alter database set standby database to maximize performance;

Data Guard Failover Steps (Continued…)

On the standby (PRODS), issued the following query:
SQL> select thread#, low_sequence#, high_sequence# from v$archive_gap;
no rows selected

On the standby (PRODS), checked the last sequence#:
SQL> select max(sequence#) from v$archived_log;
MAX(SEQUENCE#)
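The two checks on this slide ask whether any log sequences are missing on the standby and whether its highest sequence matches what the primary last archived. As a minimal sketch of the first idea, the Python helper below finds ranges of missing sequence numbers in a list of received sequences. It only models the notion of a gap; it does not reproduce how Oracle populates v$archive_gap, and the helper name find_gaps and the sample numbers are assumptions.

```python
def find_gaps(sequences):
    """Return (low, high) ranges of sequence numbers missing from sequences.

    Only gaps between the lowest and highest sequence present are
    reported; an empty or fully contiguous list yields no gaps.
    """
    gaps = []
    ordered = sorted(set(sequences))
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > 1:
            # Everything strictly between prev and cur is missing.
            gaps.append((prev + 1, cur - 1))
    return gaps
```

For instance, find_gaps([6500, 6501, 6503]) returns [(6502, 6502)], while a contiguous list returns an empty list, the analogue of "no rows selected".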

Data Guard Failover Steps (Continued…)

On the standby (PRODS):
SQL> alter database recover managed standby database finish force;
Database altered

On the standby (PRODS):
SQL> alter database commit to switchover to primary;
Database altered

On the standby instance (PRODS):
SQL> shutdown immediate
ORA-01507: database not mounted
ORACLE instance shut down.

Data Guard Failover Steps (Continued…)

SQL> startup
ORACLE instance started.
Total System Global Area   bytes
Fixed Size                 bytes
Variable Size              bytes
Database Buffers           bytes
Redo Buffers               bytes
Database mounted.
Database opened.

At this point the standby is opened as the primary.

Data Guard Failover Steps (Continued…)

On the current primary (PRODS), issued the following query to make sure the table was brought over, then dropped the table:
select * from test;

On the second standby (PRODC) instance, verified the cascaded standby destination by querying v$archived_log. I had to bounce the cascaded standby instance (PRODC) and put it back in recovery mode.

2-Node RAC as Standby

[Diagram: primary database PRODS, standby PRODC, and the 2-node RAC + ASM (PROD1, PROD2) on the storage array as an additional standby]

Data Guard Switchover Steps

Make sure only one instance on the RAC system is mounted (PROD1).

Make sure the FAL_SERVER and FAL_CLIENT parameters are set correctly on the primary as well as the standby.

On the primary:
FAL_SERVER=standby instance
FAL_CLIENT=primary instance

On the physical standby:
FAL_SERVER=primary instance
FAL_CLIENT=standby instance

Data Guard Switchover Steps (Continued…)

On the primary (PRODS), issued the following query:
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
TO STANDBY

On the standby (PROD1) instance, issued the following query:
SQL> select switchover_status from v$database;
SWITCHOVER_STATUS
TO PRIMARY
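The status check on this slide works as a simple gate: proceed only when the primary reports TO STANDBY and the standby reports TO PRIMARY. A minimal Python sketch of that gate follows. The function name ready_for_switchover is hypothetical, and treating every other status value as "not ready" is a simplifying assumption of this sketch rather than a statement of Oracle's rules.

```python
def ready_for_switchover(primary_status, standby_status):
    """Return True only for the status pair shown on the slide.

    primary_status and standby_status are the SWITCHOVER_STATUS values
    read from v$database on each side. Any other combination is
    treated as not ready in this simplified check.
    """
    primary_ok = primary_status.strip().upper() == "TO STANDBY"
    standby_ok = standby_status.strip().upper() == "TO PRIMARY"
    return primary_ok and standby_ok
```

With the values from the slide, ready_for_switchover("TO STANDBY", "TO PRIMARY") returns True; swapping the two arguments returns False.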

Data Guard Switchover Steps (Continued…)

On the primary (PRODS), issued the following commands:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PHYSICAL STANDBY;
SQL> ALTER SYSTEM SET log_archive_dest_state_2='DEFER';

Shut down the primary instance (PRODS) and mounted it as a standby:
SQL> shutdown immediate;
SQL> startup nomount;
SQL> alter database mount standby database;

Data Guard Switchover Steps (Continued…)

On the standby (PROD1), issued the following command:
SQL> ALTER DATABASE COMMIT TO SWITCHOVER TO PRIMARY;

On the standby (PROD1):
SQL> shutdown immediate;
SQL> startup open;
SQL> alter system switch logfile;

Make sure the other instance (PROD2) comes up as well.

Data Guard Switchover

[Diagram: 2-node RAC + ASM on the storage array (PROD) as primary, with standby databases PRODS and PRODC]

Final Configuration

[Diagram: 2-node RAC + ASM primary database and 2-node RAC + ASM standby database, each on a storage array]

Real Application Clusters

- RAC misconception.
- Always register the database and instances with the cluster.
- In 10g, you have to use "netca" to register the listener with CRS.
- srvctl is case sensitive.
- Always use a single parameter file for multiple RAC nodes.
- Client-side load balancing.
- Server-side load balancing.

ASM Configuration

[Diagram: LUN 1 through LUN 8 presented as ASM disks in Storage Group 1 and Storage Group 2, forming the ASM disk groups DATA and FLASH. Contents shown: data files, control files, online log files, archive log files, RMAN backups, mirrored control and log files, flashback logs]

Flash Recovery Area

SQL> show parameter db_recovery
NAME                         TYPE          VALUE
db_recovery_file_dest        string        +FLASH_DG
db_recovery_file_dest_size   big integer   M

SQL> select space_used/(1024*1024*1024), space_limit/(1024*1024*1024) from v$recovery_file_dest;
SPACE_USED (in GB)   SPACE_LIMIT (in GB)

Flash Recovery Area

SQL> select * from v$flash_recovery_area_usage;
CONTROLFILE
ONLINELOG
ARCHIVELOG
BACKUPPIECE
IMAGECOPY
FLASHBACKLOG

ORA-19815: WARNING: db_recovery_file_dest_size of bytes is 87.63% used, and has remaining bytes available
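The percentage in the ORA-19815 warning is the ratio of space used to the space limit, the same two columns queried from v$recovery_file_dest. The Python sketch below performs that arithmetic, including the division by 1024*1024*1024 used in the query. The function name fra_usage, the 85% default threshold, and the sample byte counts are assumptions made for illustration; they are not taken from the slides.

```python
GIB = 1024 * 1024 * 1024  # same divisor as in the v$recovery_file_dest query


def fra_usage(space_used, space_limit, warn_pct=85.0):
    """Summarise recovery area usage from two byte counts.

    warn_pct is an assumed alert threshold for this sketch, not a value
    from the case study.
    """
    if space_limit <= 0:
        raise ValueError("space_limit must be positive")
    pct_used = round(100.0 * space_used / space_limit, 2)
    return {
        "used_gb": space_used / GIB,
        "limit_gb": space_limit / GIB,
        "pct_used": pct_used,
        "warning": pct_used >= warn_pct,
    }
```

As a usage example with made-up numbers, fra_usage(8763, 10000)["pct_used"] evaluates to 87.63, and the "warning" entry is True under the assumed threshold.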

Flash Recovery Area

SQL> alter system set db_recovery_file_dest_size=55G scope=memory;

SQL> select * from v$flash_recovery_area_usage;
CONTROLFILE
ONLINELOG
ARCHIVELOG
BACKUPPIECE
IMAGECOPY
FLASHBACKLOG

Cleanup of Archivelog Files from Flash Recovery Area

Issue the following RMAN commands:
RMAN> crosscheck archivelog all;
RMAN> delete expired archivelog all;

After the "delete expired" command:
CONTROLFILE
ONLINELOG
ARCHIVELOG
BACKUPPIECE
IMAGECOPY
FLASHBACKLOG

Flashback Logs

SQL> show parameter db_flashback
NAME                            TYPE      VALUE
db_flashback_retention_target   integer   1440

SQL> show parameter log_archive_min_succeed_dest;
NAME                           TYPE      VALUE
log_archive_min_succeed_dest   integer   1

How to Clean Up Flashback Logs from Flash Recovery Area

SQL> alter system set log_archive_dest_1='LOCATION=/home/oracle/temp_archivelog' scope=memory;
SQL> alter system set log_archive_dest_state_10=defer scope=memory;
SQL> alter system set db_recovery_file_dest_size=55G scope=memory;

Check the alert log for "deleted Oracle managed files" messages.

How to Clean Up Flashback Logs from Flash Recovery Area

SQL> select * from v$flash_recovery_area_usage;
CONTROLFILE
ONLINELOG
ARCHIVELOG
BACKUPPIECE
IMAGECOPY
FLASHBACKLOG

SQL> alter system set db_recovery_file_dest_size=190G scope=memory;

How to Clean Up Flashback Logs from Flash Recovery Area

SQL> alter system set log_archive_dest_1='' scope=memory;
SQL> alter system set log_archive_dest_state_10=enable scope=memory;

Conclusion and Best Practices

- All configuration changes were performed without significant downtime for the production or standby databases.
- Several Oracle technologies, including RMAN, RAC, ASM, and Data Guard, were used together to achieve the final high-availability solution.
- Awareness and understanding of the technologies available from Oracle, together with an innovative approach to implementation, were critical in building the environment while meeting the customer's business needs.

Questions and Answers

Contact Information: Rama Balaji (303)

World-wide Team of IT Professionals

Contact Information: Rama Balaji (303)

Core Services
- Oracle E-Business Suite
- Enterprise Performance Management (EPM)
- Oracle DBA, Database, and Infrastructure
- Business Intelligence and Data Warehousing
- Managed Services: remote support & hosting
- Oracle Fusion Middleware
- Oracle Hyperion
- Software
- Training