Maximum Availability Architecture Enterprise Technology Centre.

Causes of Downtime
Inadequate System Design, Testing & Process
Unscheduled Outages: Data Center Disasters; Human Error; System Faults and Crashes; Data and Media Failures
Scheduled Outages: Maintenance & Continuous Operations
Presenter notes: Our approach started with the causes of downtime: we identified the outages we wanted to handle effectively, then built an architecture with these in mind. Oracle Support recently reviewed 761 recovery TARs (User Managed Backup and Recovery Issue Analysis Report for December 2001). The results showed that 46% of the issues were due to data failures or corruptions, most of which were caused by hardware or software failures; 32% were due to user errors, such as accidentally removing data files or dropping a table; and the final 22% required support assistance for resolution. This review indicates that data failures and user errors are likely, and that you need an architecture to deal with these outages. Automation and simplicity are important components that help achieve service levels and target MTTR. After 9/11, customers are raising the priority of disaster recovery and of protection from site disasters or sabotage.
We looked at both scheduled outages, which are used for maintenance, and unscheduled outages. Unscheduled outages fall into three main areas: system failures, such as a system crash; data failures and disasters, where the data is either inaccessible or unusable in its current state; and human error, which is unavoidable.

Summary: Oracle9i architected for Continuous Data Availability
Unplanned Downtime
System Failure: Real Application Clusters, Fast Restart, Data Guard
Data Failure & Disaster: Recovery Manager, Data Guard
Human Error: Flashback Query, Log Miner, Data Guard
Planned Downtime
System Maintenance: Partitioning, Data Guard, Dynamic Reconfiguration
Database Maintenance: Online Redefinition, Data Guard

High Availability Objectives
Maximize Mean Time To Failure (MTTF): provide consistent, reliable and uninterrupted service to the end user, as close to 24 x 7 as is economically feasible
Minimize Mean Time To Recover (MTTR): recover from outages within a small and predictable window of time
Provide system access in the event of a site disaster while preventing data loss due to data corruption, user errors, virus attack, etc.
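The MTTF and MTTR objectives above combine into the standard steady-state availability ratio, availability = MTTF / (MTTF + MTTR). The sketch below is only an illustration of that ratio: the function names and the sample figures are assumptions and do not come from the presentation.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the service is up."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


def annual_downtime_hours(avail: float, hours_per_year: float = 8760.0) -> float:
    """Expected downtime per year implied by an availability fraction."""
    return (1.0 - avail) * hours_per_year


# Illustrative numbers only (assumed, not taken from the slides):
# one failure every 1,000 hours, two hours to recover each time.
a = availability(1000.0, 2.0)
print(f"availability = {a:.4%}, downtime = {annual_downtime_hours(a):.1f} h/year")
```

Raising MTTF and lowering MTTR both push the ratio toward 1, which is why the slide lists them as separate objectives.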

Maximum Availability Architecture

Highly Available Application Oracle9iAS
Availability
Oracle9iAS J2EE (OC4J) and Web Cache clustering for protection against system outages
Automatic monitor and restart of failed processes
Application state preserved through failures
Add and remove nodes transparently
Scalability
Hardware network load balancer distributes client requests to Web Cache
Web Cache clustering for distributed caching and load balancing across multiple OC4J instances

System Failure
Unplanned Downtime
System Failure: Real Application Clusters, Fast Restart, Data Guard

Fast-Start™ Fault Recovery
Provides near instantaneous recovery time
Oracle9i automatically recovers data
DBA can specify time limit for recovery process
Oracle9i dramatically reduces recovery times
Minimal impact from deferred rollback operations
[Chart: Recovery Time (minutes), Oracle9i versus others, for an example high OLTP and batch workload]

Single Instance Failure
Server presents a single point of failure
[Diagram: Server 1 running Single Instance ‘X’ against Database ‘A’]

Clusters with ‘cold’ Failover
[Diagram: Server 1 and Server 2, Instance ‘X’, Database]
e.g. HP ServiceGuard

High Availability with 9iRAC
[Diagram: Server 1 with Instance ‘A’ and Server 2 with Instance ‘B’, one Database]
Protect from SERVER failures

High Availability with 9iRAC
[Diagram: Server 1 with Instance ‘A’ and Server 2 with Instance ‘B’, one Database]
SERVER failure: your database remains available
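The point of the two 9iRAC slides, that losing one server still leaves an instance able to serve the database, can be illustrated with a toy model. This is a minimal sketch under stated assumptions: `Instance`, `InstanceDown`, and `run_with_failover` are hypothetical names invented for illustration, and nothing here uses or represents an Oracle API.

```python
class InstanceDown(Exception):
    """Raised by the toy instance when its server has failed."""


class Instance:
    """Toy stand-in for a database instance running on one server."""

    def __init__(self, name: str, up: bool = True):
        self.name = name
        self.up = up

    def query(self, sql: str) -> str:
        if not self.up:
            raise InstanceDown(self.name)
        return f"{self.name} executed: {sql}"


def run_with_failover(instances, sql):
    """Try each instance in turn; succeed if any surviving instance answers."""
    for inst in instances:
        try:
            return inst.query(sql)
        except InstanceDown:
            continue  # this server is gone, try the next instance
    raise InstanceDown("no surviving instance")


a, b = Instance("A"), Instance("B")
a.up = False  # simulate the failure of Server 1
print(run_with_failover([a, b], "SELECT 1 FROM dual"))
```

With a single instance, as on the Single Instance Failure slide, the same loop would have nothing left to try once that instance is down.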

Data Failure and Disaster
Unplanned Downtime
System Failure: Real Application Clusters, Fast Restart, Data Guard
Data Failure & Disaster: Recovery Manager, Data Guard

Oracle9i Data Guard
[Diagram: Clients, Primary Site and Standby Site; Data Changes flow from the Primary Database to the Standby Database]
Presenter notes: Oracle Data Guard consists of a production database (also known as the primary database) and one or more standby databases, which are transactionally consistent copies of the production database. A standby database can be either a physical standby database or a logical standby database. Later I will talk more about these two types of standby databases.
The transactional consistency between primary and standby databases is maintained using Oracle online redo logs. As transactions change information stored in the primary database, the changes are also written to the online redo logs. These changes are transferred to the standby destinations by Log Transport Services and applied to the standby databases by Log Apply Services. Role Management Services work with Log Transport Services and Log Apply Services to reduce downtime of the primary database for planned outages and from unplanned failures by facilitating switchover and failover operations.
Data Guard provides several management interfaces, including SQL statements, initialization parameters, a PL/SQL package, and the Oracle Data Guard Broker. The Broker is a distributed management framework that automates the creation and management of Data Guard configurations through the Data Guard Manager graphical user interface or its command-line interface.
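The flow described in the notes (changes recorded as redo on the primary, shipped to the standby, then applied there) can be modelled in a few lines. This is a toy sketch, not Data Guard itself: the `Database` class and the `commit`, `transport`, and `apply_redo` functions are hypothetical names that only mirror the roles of the primary database, Log Transport Services, and Log Apply Services.

```python
from dataclasses import dataclass, field


@dataclass
class Database:
    """Toy database: a key/value store plus the redo it has written or received."""
    name: str
    data: dict = field(default_factory=dict)
    redo_log: list = field(default_factory=list)
    applied: int = 0  # number of redo records already reflected in `data`


def commit(primary: Database, key: str, value: str) -> None:
    """Change the primary and record the same change as a redo record."""
    primary.data[key] = value
    primary.redo_log.append((key, value))
    primary.applied = len(primary.redo_log)


def transport(primary: Database, standby: Database) -> int:
    """Ship redo records the standby has not yet received; return how many."""
    missing = primary.redo_log[len(standby.redo_log):]
    standby.redo_log.extend(missing)
    return len(missing)


def apply_redo(standby: Database) -> int:
    """Apply received-but-unapplied redo records in order; return how many."""
    pending = standby.redo_log[standby.applied:]
    for key, value in pending:
        standby.data[key] = value
    standby.applied = len(standby.redo_log)
    return len(pending)


primary, standby = Database("primary"), Database("standby")
commit(primary, "order:1", "new")
commit(primary, "order:1", "shipped")
transport(primary, standby)
apply_redo(standby)
print(standby.data == primary.data)
```

Because redo is applied in the order it was generated, the standby ends up with the same final state as the primary, which is the transactional consistency the notes refer to.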

Database Maintenance
Unplanned Downtime
System Failure: Power Outage, System Crash
Data Failure & Disaster: Data Corruption; Flood, Fire, Etc.
Human Error: Drop Tables, Sys Admin Error
Planned Downtime
System Maintenance: Hardware & O/S Upgrades
Database Maintenance: Data Evolution

Online Redefinition
On-line schema redefinition: add, modify, drop table columns
Complete on-line index operations: create, recreate
On-line table re-organization and redefinition
On-line analyze and validate
Updates & queries continue uninterrupted
Addresses the ‘Planned Downtime’ challenge

Oracle9i Partitioning
Split table into Orders by Month
Set previous months to read-only
[Diagram: Orders Table]
Maintenance by partition reduces planned and unplanned downtime
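The idea on this slide, splitting an orders table by month and freezing earlier months, can be sketched as a toy in-memory model. It is an illustration only: `PartitionedOrders` and its methods are hypothetical names, and the code does not represent Oracle partitioning syntax or behaviour.

```python
from collections import defaultdict
from datetime import date


class PartitionedOrders:
    """Toy orders table partitioned by (year, month)."""

    def __init__(self):
        self._partitions = defaultdict(list)  # (year, month) -> list of rows
        self._read_only = set()               # partitions closed to writes

    def insert(self, order_date: date, row: dict) -> None:
        key = (order_date.year, order_date.month)
        if key in self._read_only:
            raise PermissionError(f"partition {key} is read-only")
        self._partitions[key].append(row)

    def set_read_only(self, year: int, month: int) -> None:
        """Freeze one month; other partitions are unaffected."""
        self._read_only.add((year, month))

    def rows(self, year: int, month: int) -> list:
        """Read one partition without touching any other."""
        return list(self._partitions.get((year, month), []))


orders = PartitionedOrders()
orders.insert(date(2002, 1, 15), {"id": 1})
orders.set_read_only(2002, 1)                # previous month is now frozen
orders.insert(date(2002, 2, 3), {"id": 2})   # current month still accepts writes
print(orders.rows(2002, 1), orders.rows(2002, 2))
```

Each operation touches a single partition, which is the property that lets maintenance proceed one month at a time while the rest of the table stays in use.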

Maximum Availability Architecture complete solution
