1 HA Cluster SuperDome Configurations. John Foxcroft, BCC/Availability Clusters Solutions Lab, HA Products Support Planning and Training. Version 1.0, 9/22/00. © 2000 Hewlett-Packard Co.

2 HA Cluster SuperDome Configurations (Agenda)
  HA Cluster Review: HA Cluster Architectures, Cluster Quorum, Cluster Lock, Power Requirements, Disaster Tolerant Solutions
  Single Cabinet Configuration
  Multi Cabinet Configurations
  Mixed Server Configurations
  Disaster Tolerant Solutions with SuperDome
  References
  FAQs
  Lab Exercises

3 Range of HA Cluster Architectures (flexibility and functionality increase with distance)
  Local Cluster: single cluster, automatic failover, same data center
  Campus Cluster: single cluster, automatic failover, same site
  Metro Cluster: single cluster, automatic failover, same city
  Continental Clusters: separate clusters, "push-button" failover between cities
  SuperDome is fully supported across all HA cluster architectures.

4 MC/ServiceGuard Features: Multi OS, one-stop GUI, rolling upgrade, tape sharing, 16 nodes, no idle system, online reconfiguration, automatic failback, rotating standby, closely integrated with the OS (HP-UX). [Diagram: clients connecting to an application tier and a database tier protected by MC/ServiceGuard.]

5 Examples of Failures and Dynamic Quorum
  100% quorum is required to boot the cluster, unless the manual override (cmruncl -f) is used.
  Starting from 4 nodes, one node fails: 3 of 4 remain, > 50% quorum (no cluster lock required).
  Another node fails: 2 of 3 remain, > 50% quorum.
  Another node fails: 1 of 2 remains, exactly 50% quorum; the cluster lock is needed to re-form the cluster.

6 Examples of Failures and Dynamic Quorum (continued)
  4-node cluster, 2 nodes fail: 2 of 4 remain, exactly 50% quorum; the cluster lock is needed to re-form the cluster.
  5-node cluster, 2 nodes fail: 3 of 5 remain, > 50% quorum (no lock required).
  5-node cluster, 3 nodes fail: 2 of 5 remain, < 50% quorum; the cluster goes down.
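The -f override mentioned on slide 5 is a last resort for starting a cluster when fewer than 100% of the configured nodes can join. A minimal sketch, assuming hypothetical node names (check the cmruncl manpage for the exact options supported by your ServiceGuard release):

    # Normal start: all configured nodes must join to form the cluster
    cmruncl -v

    # Manual override: form the cluster with only the named nodes.
    # Use with care; the operator must be certain the missing nodes are really down,
    # otherwise a split-brain (two independent clusters) is possible.
    cmruncl -v -f -n node1 -n node2

    # Confirm cluster membership and node status afterwards
    cmviewcl -v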

7 Cluster Lock Disk
  A cluster lock disk is required in a 2-node cluster (and recommended for 3- and 4-node clusters) to provide a tie-breaker for the cluster after a failure.
  The cluster lock disk is supported for up to 4 nodes maximum.
  It must be a disk that is connected to all nodes.
  It is a normal data disk; the lock functionality is only used after a node failure.
  [Diagram: two nodes (A and B) sharing mirrored disks and a cluster lock disk.]
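For illustration only, a sketch of how a cluster lock disk is typically declared in the ServiceGuard cluster ASCII configuration file; the names, device files and addresses below are invented, and the exact parameter set should be taken from the MC/ServiceGuard Users Manual:

    # Generate a template configuration for the two nodes (hypothetical names)
    cmquerycl -v -C /etc/cmcluster/cluster.ascii -n node1 -n node2

    # Relevant excerpt of the edited cluster.ascii file
    CLUSTER_NAME             sdcluster
    FIRST_CLUSTER_LOCK_VG    /dev/vglock

    NODE_NAME                node1
      NETWORK_INTERFACE      lan0
        HEARTBEAT_IP         192.168.1.1
      FIRST_CLUSTER_LOCK_PV  /dev/dsk/c1t2d0

    NODE_NAME                node2
      NETWORK_INTERFACE      lan0
        HEARTBEAT_IP         192.168.1.2
      FIRST_CLUSTER_LOCK_PV  /dev/dsk/c1t2d0

    # Validate and distribute the configuration, then start the cluster
    cmcheckconf -C /etc/cmcluster/cluster.ascii
    cmapplyconf -C /etc/cmcluster/cluster.ascii
    cmruncl -v

The lock volume group (vglock in this sketch) must be on a disk connected to all nodes and, as stated above, powered independently of the cabinets.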

8 Secure Power Supply
  Care should be taken to make sure a single power supply failure does not take out both:
  - half the nodes, and
  - the cluster lock disk.
  [Diagram: nodes A and A' with the cluster lock disk on an independent power circuit.]

9 HA SuperDome Configurations: Remarks/Assumptions
  Each partition is equivalent to a traditional standalone server running an OS.
  Each partition comes equipped with core I/O, other I/O, and LAN connections.
  Each partition connects to boot devices, data disks, and removable media (DVD-ROM and/or DAT).
  Redundant components exist in each partition to remove single points of failure (SPOFs):
  - redundant I/O interfaces (disk and LAN)
  - redundant heartbeat LANs
  - boot devices protected via mirroring (MirrorDisk/UX or RAID)
  - critical data protected via mirroring (MirrorDisk/UX or RAID)
  - LAN protection: Auto-Port Aggregation for Ethernet LANs; MC/ServiceGuard standby interfaces for Ethernet and FDDI; Hyperfabric and ATM provide their own LAN failover abilities
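As one example of the boot-device protection listed above, root volume mirroring on HP-UX is normally done with MirrorDisk/UX. A rough sketch of the common procedure, using a hypothetical second boot disk (the authoritative steps are in the MirrorDisk/UX documentation):

    # Make the second disk a bootable LVM physical volume and add it to vg00
    pvcreate -B /dev/rdsk/c2t6d0
    vgextend /dev/vg00 /dev/dsk/c2t6d0

    # Install boot utilities and set the AUTO file to allow booting without quorum
    mkboot /dev/rdsk/c2t6d0
    mkboot -a "hpux -lq" /dev/rdsk/c2t6d0

    # Add a mirror copy of the root/boot logical volumes onto the new disk
    # (repeat for every logical volume in vg00)
    for lv in /dev/vg00/lvol1 /dev/vg00/lvol2 /dev/vg00/lvol3; do
        lvextend -m 1 $lv /dev/dsk/c2t6d0
    done

    # Update the boot data reserved area and verify
    lvlnboot -R /dev/vg00
    lvlnboot -v

As the slides note, the mirror disk should sit on a separate power circuit (and, ideally, behind a card in a different I/O card cage) from the primary boot disk.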

10 HA SuperDome Configurations: Remarks/Assumptions (continued)
  Any partition that is protected by MC/ServiceGuard can be configured in a cluster with:
  - a standalone system
  - another partition within the same SuperDome cabinet (see the HA considerations for more details)
  - another SuperDome
  Any partition that is protected by MC/ServiceGuard contains as many redundant components as possible to further reduce the chance of failure. For example:
  - dual AC power to a cabinet is recommended, if possible
  - a redundant I/O chassis attached to a different cell is recommended, if possible

11 HA SuperDome Configurations: Cabinet Considerations
  Three single points of failure (SPOFs) have been identified within single-cabinet 16-Way and 32-Way systems and dual-cabinet 64-Way systems: the system clock, the power monitor, and the system backplane.
  To configure an HA cluster with no SPOF, the membership must extend beyond a single cabinet:
  - The cluster must be configured such that the failure of a single cabinet does not result in the failure of a majority of the nodes in the cluster.
  - The cluster lock device must be powered independently of the cabinets containing the cluster nodes.
  Some customers want a "cluster in a box" configuration. MC/ServiceGuard supports this configuration, but recognize that it contains SPOFs that can bring down the entire cluster.
  Mixed OS and ServiceGuard revisions should only exist temporarily while performing a rolling upgrade within a cluster.
  64-Way dual-cabinet systems connected with flex cables have worse SPOF characteristics than single-cabinet 16-Way and 32-Way systems; there is no HA advantage to configuring a cluster within a 64-Way system vs. across two 16-Way or 32-Way systems.
  Optional AC input power on a separate circuit is recommended.

12 HA SuperDome Configurations: I/O Considerations
  Cluster heartbeat is carried over LAN connections between SuperDome partitions.
  Redundant heartbeat paths are required and can be provided either by multiple heartbeat subnets or by standby interface cards (see the sketch below). Redundant heartbeat paths should be configured in separate I/O modules (I/O card cages) when possible.
  Redundant paths to storage devices used by the cluster are required and can be provided using either disk mirroring or LVM PV-links. Redundant storage device paths should be configured in separate I/O modules (I/O card cages) when possible.
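A sketch of the two heartbeat options named above, as they would appear per node in the cluster ASCII file; interface names and addresses are hypothetical and should be adapted to the actual LAN layout:

    NODE_NAME                node1
      # Option 1: two heartbeat subnets, on cards in different I/O card cages
      NETWORK_INTERFACE      lan1
        HEARTBEAT_IP         192.168.1.1
      NETWORK_INTERFACE      lan2
        HEARTBEAT_IP         192.168.2.1

      # Option 2: a single heartbeat subnet plus a standby interface;
      # an interface listed with no IP address acts as the local failover standby
      # NETWORK_INTERFACE    lan3

Either way, the redundant paths should use cards in separate I/O card cages, as noted above.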

13 HA SuperDome Configurations: Redundant I/O Paths Example
  Redundant paths are required for shared storage devices in a cluster.
  MirrorDisk/UX or PV-Links (a Logical Volume Manager feature) can be configured to provide alternate paths to disk volumes and protect against I/O card failure, as sketched below.
  At least two I/O card cages per partition are recommended to protect against I/O card cage failure.
  [Diagram: two partitions, each running its own copy of HP-UX, each with two 12-slot I/O card cages; a FW SCSI card in card cage 1 (which also holds the core I/O card) carries the primary path and a FW SCSI card in card cage 2 carries the mirror/alternate path to the shared disks.]
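A hedged sketch of the two protections shown in the diagram, using hypothetical device files; the first commands build a MirrorDisk/UX mirror across disks reached through the two card cages, and the commented line shows the PV-links alternative for when the same LUN is visible on two hardware paths:

    # One disk behind each I/O card cage (hypothetical device files)
    pvcreate /dev/rdsk/c4t8d0
    pvcreate /dev/rdsk/c6t8d0
    vgcreate /dev/vgdata /dev/dsk/c4t8d0 /dev/dsk/c6t8d0

    # MirrorDisk/UX: create a 1 GB logical volume and add a mirror copy,
    # so losing either card cage or disk leaves one good copy
    lvcreate -L 1024 -n datalv /dev/vgdata
    lvextend -m 1 /dev/vgdata/datalv

    # PV-links alternative: if the array presents the same disk on a second path,
    # adding that device file to the volume group registers it as an alternate link
    # vgextend /dev/vgdata /dev/dsk/c5t8d0

    # vgdisplay -v shows the mirror copies and any alternate links
    vgdisplay -v /dev/vgdata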

14 HA Single Cabinet Configuration: "Cluster in a Box"
  [Diagram: one 16-Way, 32-Way or 64-Way system; a two-node ServiceGuard cluster formed from Partition 1 and Partition 2, with an external cluster lock disk.]
  Notes:
  - Considered a "single system" HA solution: SPOFs in the cabinet (clock, backplane, power monitor) can cause the entire cluster to fail.
  - A four-node (four-partition) cluster is supported within a 16-Way system (*).
  - Up to an eight-node (eight-partition) cluster is supported within a 32-Way system (*).
  - Up to a sixteen-node (sixteen-partition) cluster is supported within a 64-Way system (*).
  - A cluster lock is required for two-partition configurations; the cluster lock must be powered independently of the cabinet.
  - N+1 power supplies are required (included in the base price of SuperDome).
  - Dual power connected to independent power circuits is required.
  - Root volume mirrors must be on separate power circuits.

15 HA Multi Cabinet Configuration
  [Diagram: two independent 16-Way or 32-Way systems; a two-node ServiceGuard cluster spanning one partition in each cabinet, with an independently powered cluster lock disk; the remaining partitions are independent nodes.]
  Notes:
  - No-SPOF configuration.
  - A cluster lock is required if the cluster is wholly contained within two 16-Way or 32-Way systems (due to a possible 50% cluster membership failure).
  - ServiceGuard only supports a cluster lock for up to four nodes, so a two-cabinet solution is limited to four nodes.
  - Two-cabinet configurations must evenly divide the nodes between the cabinets (e.g., 3 and 1 is not a legal 4-node configuration).
  - The cluster lock must be powered independently of either cabinet.
  - N+1 power supplies are required.
  - Dual power connected to independent power circuits is required.
  - Root volume mirrors must be on separate power circuits.

16 HA Multi Cabinet Configuration
  [Diagram: two independent 32-Way systems hosting two four-node ServiceGuard clusters; each cluster uses two partitions in each cabinet and has its own independently powered cluster lock disk.]
  Notes: the same considerations as slide 15 apply: no SPOF; a cluster lock is required when the cluster is wholly contained within two 16-Way or 32-Way systems; the cluster lock supports at most four nodes; nodes must be divided evenly between the cabinets; the cluster lock must be powered independently of either cabinet; N+1 power supplies and dual power on independent circuits are required; root volume mirrors must be on separate power circuits.

17 HA Multi Cabinet Configuration: 64-Way Systems
  [Diagram: two 64-Way (dual-cabinet) systems, each divided into three partitions built from 4-CPU, 8-GB-RAM cells; a two-node ServiceGuard cluster is formed from one partition in each system, with a cluster lock disk.]

18 HA Mixed Configurations
  [Diagram: a 16-Way, 32-Way or 64-Way system plus other HP9000 servers; a four-node ServiceGuard cluster of Partition 1, Partition 2, and two N-Class servers, with a cluster lock disk.]
  Notes:
  - A cluster configuration can contain a mixture of SuperDome and non-SuperDome nodes.
  - Care must be taken to keep the number of nodes outside the SuperDome cabinet equal to or greater than the number inside.
  - Using an equal number of nodes inside and outside the SuperDome requires a cluster lock (maximum cluster size of four nodes); a cluster lock is not supported for clusters with more than four nodes.
  - ServiceGuard supports up to 16 nodes; a cluster of more than four nodes requires more nodes outside the SuperDome than inside.
  - Without a cluster lock, beware of configurations where the failure of a SuperDome cabinet leaves the remaining nodes with 50% or less of the quorum: the cluster will fail.

19 HA Mixed Configurations
  [Diagram: a 16-Way, 32-Way or 64-Way system plus other HP9000 servers; a five-node ServiceGuard cluster of Partition 1, Partition 2, and three N-Class servers, with no cluster lock.]
  Notes: the same rules as slide 18 apply; with five nodes there is no cluster lock, so more nodes (three of the five) are placed outside the SuperDome to keep a quorum majority if the cabinet fails.

20 HA Mixed Configurations: Using a Low-End System as an Arbitrator
  [Diagram: a 16-Way, 32-Way or 64-Way system plus other HP9000 servers; a five-node ServiceGuard cluster of two SuperDome partitions, two N-Class servers, and one A-Class arbitrator, with no cluster lock.]
  Notes:
  - A cluster of more than four nodes requires more nodes outside the SuperDome than inside. One option is to configure a low-end system to act only as an arbitrator (providing >50% quorum outside the SuperDome).
  - The arbitrator requires redundant heartbeat LANs and must be on a separate power circuit.
  - The SMS (Support Management Station) A-Class system could be used for this purpose.
  - Suitable A-Class models include the A180, A400 and A500, with external LAN connections only (the built-in 100/BT card is not supported with ServiceGuard).

21 HA Mixed Configurations
  [Diagram: the same five-node mixed cluster as slide 19, with no cluster lock; one N-Class is down for maintenance, so a SuperDome cabinet failure (SPOF) leaves only 50% quorum and the cluster fails.]
  Notes: this illustrates the warning from the previous slides: without a cluster lock, if one of the outside nodes is down for maintenance, the failure of the SuperDome cabinet leaves the remaining nodes at 50% quorum or less and the cluster goes down.

22 Frequently Asked Questions (FAQs)
  Q: Can I configure a ServiceGuard cluster within a single SuperDome cabinet?
  A: Yes, a cluster within a single cabinet (16-Way, 32-Way or 64-Way) is supported. Recognize that this configuration contains SPOFs that can bring down the entire cluster.
  Q: In a two-cabinet configuration (using 16-Way, 32-Way or 64-Way systems), can I configure 1 node in one cabinet and 3 nodes in the other?
  A: No. There are only two valid ways to create a cluster between two SuperDome systems: a 2-node cluster (1 node in each cabinet) or a 4-node cluster (2 nodes in each cabinet).
  Q: Is a lock disk required for a 4-node (two-cabinet) configuration?
  A: Yes, since a single failure can take down exactly half of the cluster nodes.
  Q: Are dual power cables recommended in each cabinet?
  A: Yes, this optional feature should be ordered in HA configurations.
  Q: Can a cluster be four 32-Way systems, each with one partition 8 cells wide?
  A: Yes. Single-partition SuperDome systems (and non-SuperDome nodes) can be configured in up to a 16-node cluster.
  Q: Are SuperDomes supported in Campus/Metro Cluster and ContinentalClusters configurations?
  A: Yes, subject to the rules covered in this presentation.
  Q: Is heartbeat handled any differently between partitions within SuperDome boxes?
  A: No. Heartbeat is done over LAN connections between partitions; from the ServiceGuard perspective, each partition is just another HP-UX node.

23 References
  - ACSL product support information (patches, PSP, etc.): http://haweb.cup.hp.com/Support, or Kmine
  - MC/ServiceGuard Users Manual
  - Designing Disaster Tolerant HA Clusters Users Manual: http://docs.hp.com/hpux/ha
  - XP256 documentation: http://docs.hp.com/hpux/systems/#massstorage
  - HPWorld '99 tutorial "Disaster-Tolerant, Highly Available Cluster Architectures": http://docs.hp.com/hpux/ha or http://haweb.cup.hp.com/ATC/WP

24 Additional Refresher Slides

25 MC/ServiceGuard Features (refresher): repeats the feature list and tier diagram from slide 4.

26 ServiceGuard OPS Edition Features: the same protection functionality for applications as MC/ServiceGuard; additional protection for the Oracle database; a parallel database environment for increased availability and scalability. [Diagram: end-user clients connecting to a ServiceGuard OPS Edition cluster.]

27 ServiceGuard Comparative Features

28 Campus Cluster Solution = MC/ServiceGuard + Fibre Channel
  [Diagram: two sites up to ~10 km apart connected by heartbeat networking and Fibre Channel through FC hubs.]

29 Campus Cluster Comparative Features

30 MetroCluster with Continuous Access XP: Delivering city-wide automated failover
  [Diagram: HP 9000 systems and HP SureStore E disk arrays in Manhattan and New Jersey linked by HP Continuous Access XP.]
  - Protects against tornadoes, fires, floods
  - Rapid, automatic site recovery without human intervention
  - Effective between systems that are up to 43 km apart
  - Provides very high cluster performance
  - Backed by collaborative implementation, training and support services from HP
  - Also available: MetroCluster with EMC SRDF, using EMC Symmetrix disk arrays

31 MetroCluster Comparative Features

32 HP ContinentalClusters
  - Highest levels of availability and disaster tolerance; reduces downtime from days to minutes
  - Locate data centers at the economically and/or strategically best locations
  - Transparent to applications and data
  - Push-button failover across thousands of km
  - Supports numerous wide-area data replication tools for complete data protection
  - Comprehensive support and consulting services, as well as Business Recovery Services, for planning, design, support and rehearsal
  - Requires CSS support or greater
  [Diagram: two clusters linked by data replication and cluster detection over a wide area network.]

33 ContinentalClusters: Comparative Features
  Cluster Topology: Two clusters, each up to 16 nodes
  Geography: Continental or inter-continental
  Network Subnets: Dual IP subnets
  Network Types: Dedicated Ethernet or FDDI within each data center; Wide Area Network (WAN) between data centers
  Cluster Lock Disk: Required for 2 nodes, optional for 3-4 nodes, not used with larger clusters
  Failover Type: Semi-automatic
  Failover Direction: Uni-directional
  Data Replication: Physical, in hardware (XP256 CA or EMC SRDF); logical, in software (Oracle Standby Database, etc.)

34 Two Data Center Campus Cluster Architecture (#1)
  - Example: 4-node campus cluster using 16-Way, 32-Way or 64-Way systems and Fibre Channel for disk connectivity (500 meters point-to-point, 10 kilometers using long-wave ports with FC-AL hubs)
  - Multi-cabinet SuperDome configurations are recommended at each data center for increased availability
  - Each data center must contain the same number of nodes (partitions)
  - Use of MirrorDisk/UX is required to mirror data between the data centers
  - All systems are connected to both mirror copies of data for the packages they can run
  - All systems must be connected to the redundant heartbeat network links
  - Dual cluster lock disks are REQUIRED, with all systems connected to both of them (see the sketch below)
  - MAXIMUM cluster size is currently 4 nodes when using cluster lock disks
  [Diagram: Data Center A with two partitions, disk copies A/B and cluster lock 1; Data Center B with a SuperDome hosting two partitions, disk copies A'/B' and cluster lock 2; physical data replication between the centers using MirrorDisk/UX over a highly available network.]
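The dual cluster lock requirement above is normally expressed as a second lock entry in the cluster ASCII file, one lock disk per data center. A sketch with invented names and device files (parameter names and layout should be confirmed against the MC/ServiceGuard and Designing Disaster Tolerant HA Clusters manuals for your release):

    CLUSTER_NAME              campus1
    FIRST_CLUSTER_LOCK_VG     /dev/vglockA     # lock disk in Data Center A
    SECOND_CLUSTER_LOCK_VG    /dev/vglockB     # lock disk in Data Center B

    NODE_NAME                 nodeA1
      NETWORK_INTERFACE       lan1
        HEARTBEAT_IP          192.168.10.1
      FIRST_CLUSTER_LOCK_PV   /dev/dsk/c4t0d0
      SECOND_CLUSTER_LOCK_PV  /dev/dsk/c6t0d0

    # ...the same two lock PV entries are repeated for the other three nodes,
    # since every node must be connected to both lock disks

The inter-site mirroring itself is done with MirrorDisk/UX; PVG-strict allocation (for example, lvcreate -s g with physical volume groups defined per data center) is a common way to keep one mirror copy in each data center, though the exact procedure should be taken from the disaster-tolerant cluster documentation.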

35 Three Data Center Campus Architecture (#2)
  - Maximum cluster size: 16 nodes with HP-UX 11.0 and later 11.x versions
  - Multi-cabinet SuperDome configurations are recommended at each data center for increased availability
  - The same number of nodes must be in each non-Arbitrator data center, to maintain quorum in case an entire data center fails
  - Arbitrators need not be connected to the replicated data
  - No cluster lock disk(s)
  - All non-Arbitrator systems must be connected to both replica copies of the data
  - All systems must be connected to the redundant heartbeat network links
  [Diagram: Data Center A and Data Center B each hosting SuperDome partitions, with physical data replication between them using MirrorDisk/UX; Data Center C contains 1 or 2 Arbitrator system(s); all centers joined by a highly available network.]

36 Three Data Center MetroCluster Architecture
  - Maximum cluster size: 16 nodes with HP-UX 11.0 and later 11.x versions
  - Multi-cabinet SuperDome configurations are recommended at each data center for increased availability
  - The same number of nodes must be in each non-Arbitrator data center, to maintain quorum in case an entire data center fails
  - Arbitrators need not be connected to the replicated data
  - No cluster lock disk(s)
  - Systems are not connected to both replica copies of the data (you cannot have two distinct devices accessible with the same VGID)
  - All systems must be connected to the redundant heartbeat network links
  [Diagram: Data Center A and Data Center B each hosting SuperDome partitions, with physical data replication between them using EMC SRDF or XP256 Continuous Access; Data Center C contains 1 or 2 Arbitrator system(s); all centers joined by a highly available network.]

37 Two Data Center ContinentalClusters Architecture
  - Systems are not connected to both replica copies of the data (hosts in each cluster are connected to only one copy of the data)
  - Each cluster must separately conform to heartbeat network requirements
  - Each cluster must separately conform to quorum rules (cluster lock disks or Arbitrators)
  - Multi-cabinet SuperDome configurations are recommended at each data center for increased availability
  - Use of cluster lock disks requires three power circuits in each cluster
  - The HA WAN is used for both data replication and inter-cluster monitoring
  [Diagram: a primary cluster in Data Center A and a recovery cluster in Data Center B, each containing SuperDome partitions, connected by a highly available wide area network (WAN) carrying physical or logical data replication.]

