High Availability 24 hours a day, 7 days a week, 365 days a year…

Vik Nagjee Product Manager, Core Technologies InterSystems Corporation

Topics
- What is High Availability (HA)?
- Current HA strategies
- What’s coming?
- Questions & Discussion

What is High Availability (HA)?
Reliability, fault tolerance, high uptime, operational continuity, redundancy, minimal disruption.

Availability %   Downtime per year   Downtime per month   Downtime per week
90%              36.5 days           72 hours             16.8 hours
95%              18.25 days          36 hours             8.4 hours
99%              3.65 days           7.20 hours           1.68 hours
99.9%            8.76 hours          43.2 minutes         10.1 minutes
99.99%           52.6 minutes        4.32 minutes         1.01 minutes
99.999%          5.26 minutes        25.9 seconds         6.05 seconds
99.9999%         31.5 seconds        2.59 seconds         0.605 seconds
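
The figures in the table follow directly from the availability percentage. As a minimal sketch, assuming a 365-day year, a 30-day month, and a 7-day week (the function name `downtime_seconds` is made up for this illustration), the allowed downtime can be computed as:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_seconds(availability_pct, period_days):
    """Allowed downtime in seconds for a given availability over a period."""
    unavailable_fraction = 1 - availability_pct / 100
    return unavailable_fraction * period_days * SECONDS_PER_DAY


# Assumed period lengths: 365-day year, 30-day month, 7-day week.
for pct in (90, 99, 99.9, 99.999):
    per_year_hours = downtime_seconds(pct, 365) / 3600
    per_month_hours = downtime_seconds(pct, 30) / 3600
    per_week_hours = downtime_seconds(pct, 7) / 3600
    print(pct, per_year_hours, per_month_hours, per_week_hours)
```

Each additional "nine" divides the allowed downtime by ten, which is why the step from 99.9% to 99.999% moves the yearly budget from hours to minutes.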

High Availability vs. Disaster Recovery
High Availability = fault detection and correction procedures to maximize the availability of critical services and applications, often in an automated fashion.
Disaster Recovery = the process of preparing for recovery or continuation of the technology infrastructure critical to an organization after a natural or human-induced disaster.
High Availability ≠ Disaster Recovery!

Current HA Strategies
Failover = automatic switch to a redundant system
- Uses some type of heartbeat software (e.g., HACMP)
Current failover options:
- Failover Clusters
- Concurrent Clusters
- ECP Clusters
  - With Failover Cluster for Database
  - With Concurrent Cluster for Database

Failover Clusters
- One active system (PROD) and one standby system (STDBY), with a heartbeat connection
- Windows Cluster, IBM HACMP, Sun Cluster, HP Serviceguard, Red Hat Cluster Suite, Veritas Cluster Services…
- Needs shared disk for the install directory, WIJ, database files, and journal files
- Users/applications connect to a DNS name that is mapped to PROD
- In the event of failure, the 3rd party cluster software fails Caché over to the STDBY node
- Caché performs recovery on the STDBY node before allowing connections: open transactions are rolled back, open locks are released, etc.
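
The heartbeat idea behind a failover cluster can be sketched as follows. This is a toy illustration, not the API of any of the cluster products listed above: the class `HeartbeatMonitor`, the function `active_node`, and the 5-second timeout are all assumptions made for the example. Real cluster software also handles shared-disk handoff and starts Caché recovery on the standby.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat received from the active (PROD) node."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = None  # time of the most recent heartbeat, if any

    def beat(self, now):
        """Record a heartbeat that arrived at time `now` (in seconds)."""
        self.last_seen = now

    def is_alive(self, now):
        """PROD counts as alive if a heartbeat arrived within the timeout."""
        if self.last_seen is None:
            return False
        return (now - self.last_seen) <= self.timeout_seconds


def active_node(monitor, now):
    """Return the node that should serve users at time `now`."""
    return "PROD" if monitor.is_alive(now) else "STDBY"


monitor = HeartbeatMonitor(timeout_seconds=5)  # hypothetical timeout
monitor.beat(now=100)
print(active_node(monitor, now=103))  # heartbeat is recent
print(active_node(monitor, now=106))  # heartbeat missed, fail over
```

Times are passed in explicitly rather than read from the clock so the behaviour is easy to trace by hand.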

Concurrent Clusters AKA Caché Clusters
- Can be configured on OpenVMS and Tru64 UNIX
- Two or more servers, each running an instance of Caché and each with access to all disks, concurrently provide access to all data
- Users connect to any one of the clustered nodes; Caché provides data and lock synchronization across nodes
- If one machine fails, users can immediately reconnect to any of the remaining cluster nodes
- Caché performs cluster-wide recovery during failover; logical and physical data integrity is maintained

ECP Clusters – with DB as Failover Cluster
- Enterprise Cache Protocol (ECP) provides a distributed, tiered system
- Typical configuration: N+1 application servers
- Users are load-balanced across the app servers
- If any app server goes down, its users can be reconnected to the remaining app servers
- If the database goes down, users on the app servers will experience a pause while DB failover completes (here the DB is configured as a failover cluster)
- Application servers will reconnect after the database has performed recovery
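
The app-server tier described above can be illustrated with a small sketch of how users might be spread across N+1 application servers and moved to the survivors when one goes down. The function `assign_server`, the modulo-based placement, and the server names are hypothetical; this is not how ECP or any particular load balancer is implemented.

```python
def assign_server(user_id, servers, down=frozenset()):
    """Pick an app server for a user, skipping servers known to be down."""
    available = [s for s in servers if s not in down]
    if not available:
        raise RuntimeError("no application servers available")
    # Simple modulo placement spreads users evenly over what is left.
    return available[user_id % len(available)]


servers = ["app1", "app2", "app3"]  # hypothetical N+1 tier with N = 2

print(assign_server(4, servers))                 # normal operation
print(assign_server(4, servers, down={"app2"}))  # app2 has failed
```

With N+1 servers, the remaining N can carry the full user load after a single failure, which is the point of the extra server.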

ECP Clusters – with DB as Concurrent Cluster
- Similar to the previous example, except that the DB server is configured as a concurrent cluster (OpenVMS or Tru64 UNIX)
- App servers can connect to any one of the nodes
- If any node fails, the app server(s) connected to that node will reconnect to another surviving node after failover
- Caché performs cluster-wide recovery during failover; logical and physical data integrity is maintained

High Availability: What’s Coming?
Database Mirroring:
- Delivers faster, automated failover
- Eliminates the requirement for shared disk configurations
- Reduces dependency on 3rd party clustering software
- Uses multiple redundant servers
- Integrated ECP recovery

Database Mirroring
- Multiple servers in a Mirror Set: one is the Primary, the others (one or more) are Backups
- TCP connections between mirror members
- The Primary pushes journal updates to the Backups, which acknowledge them and continuously de-journal
- The Primary role can flip from one server to another within moments (automated failover)
- All clients (except ECP) connect to a Mirror Virtual IP; the mirror handles redirection to the current Primary
- The ECP protocol is “mirror aware”: app servers connect directly to the current Primary and fail over to the new Primary as appropriate. ECP performs recovery on reconnection.
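
The push, acknowledge, and de-journal flow can be illustrated with an in-memory toy model. It is a sketch under stated assumptions, not the Caché mirroring API: the classes `MirrorMember` and `MirrorSet`, their methods, and the rule of promoting the first remaining Backup are all invented for the example, and real mirroring runs over TCP between separate servers.

```python
class MirrorMember:
    """One server in the mirror set, holding a journal and database state."""

    def __init__(self, name):
        self.name = name
        self.journal = []  # journal records, in arrival order
        self.data = {}     # database state after applying the journal

    def receive(self, record):
        """Store a pushed journal record, de-journal it, and acknowledge."""
        self.journal.append(record)
        key, value = record
        self.data[key] = value
        return True  # the acknowledgement sent back to the Primary


class MirrorSet:
    """A Primary plus one or more Backups (hypothetical toy model)."""

    def __init__(self, names):
        self.members = [MirrorMember(n) for n in names]
        self.primary = self.members[0]

    def backups(self):
        return [m for m in self.members if m is not self.primary]

    def write(self, key, value):
        """Apply an update on the Primary and push it to every Backup."""
        record = (key, value)
        self.primary.journal.append(record)
        self.primary.data[key] = value
        return all(b.receive(record) for b in self.backups())

    def fail_over(self):
        """Drop the failed Primary and promote the first remaining Backup."""
        self.members.remove(self.primary)
        if not self.members:
            raise RuntimeError("no backup available to promote")
        self.primary = self.members[0]
        return self.primary.name


mirror = MirrorSet(["serverA", "serverB"])  # hypothetical member names
mirror.write("patient:1", "admitted")
print(mirror.fail_over())   # name of the new Primary
print(mirror.primary.data)  # the update survived the failover
```

Because each Backup has already applied every acknowledged journal record, the promoted member can take over without a shared disk.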

Questions & Discussion
Wrap-up Questions & Discussion
