Failover and High Availability

Stefan Zier, Sr. Software Engineer, Server Team
May 2006

Agenda
- Introduction
- High availability set-up explained
- Software
- Systems
- IP addresses
- Heartbeat networks
- Installation overview

Introduction

High Availability vs. Disaster Recovery
High availability: reduce unscheduled downtime
- Hardware failures
- Operating system failures
- Application errors
Disaster recovery: stay operational during catastrophes
- Natural disasters
- Major infrastructure issues

Technical Approaches
High availability:
- Second system connected through multiple communication paths
- Shared storage between the two systems
- Software that monitors both systems and moves services between them as needed
- Quick, automatic recovery
Disaster recovery:
- Mostly single points of failure in the communication paths
- Replicated storage between the two sites
- Often a manual switch between the two sites (which may be sufficient and more cost-effective)

Removing Single Points of Failure
A single point of failure is a component whose failure brings down the entire system. Removing them is not as simple as it first seems:
- Many details make the implementation complex
- The system is only as good as the weakest link in the chain
Examples:
- A single switch that connects all the NICs
- A single power circuit powering the systems
- A file server used to host system files

Current HA/DR Support in ArcSight
High availability configurations are supported for:
- ArcSight Manager
- ArcSight Database
- ArcSight SmartConnectors
Disaster recovery configurations deployed at customers:
- Oracle DataGuard: can only be used without partition archiving at the moment; Oracle is working on the issue
- SAN block-level replication: requires more bandwidth between the sites, but is less complex

Connector Redirection
Connectors can redirect events to another Manager:
- Offers access to real-time events to cover shorter periods of downtime
- Useful for development Manager set-ups (a second Manager for developing content before moving it to production)
Not a full failover solution:
- The resource replication scripts provide incomplete replication
- Managers will get out of sync (Active Lists, Assets, Rule Engine state, Event IDs, M1s, etc.)
- Requires duplicate databases

Technologies that Don't Fit the ArcSight Profile
- Load balancing: would compromise correlation capabilities, since correlation requires all events to go through one system
- Hot-hot standby: would consume substantial network and CPU resources to keep the rich Manager state in sync
- Seamless Console failover: would cause a similar delay during failover anyway, and add some CPU/network cost on the Manager

HA Setup Explained

Failover Architecture
[Diagram: Manager 1 and Manager 2 on the public network, connected to each other by heartbeat networks, both attached to shared storage and a shared database]

Failover Management Software (FMS)
Software that manages the services in a failover cluster:
- Starts and stops the Manager/Connector/Oracle
- Monitors whether the Manager/Connector/Oracle is running
- Migrates a service IP (virtual IP) between the systems
- Needs to run on both hosts
Tested products:
- EMC AutoStart (Legato AAM)
- Veritas Cluster Server (VCS)

Software
[Diagram: the same failover architecture: Manager 1 and Manager 2 on the public network, heartbeat networks, shared storage, shared database]

Why Didn't ArcSight Invent a Proprietary FMS?
- ArcSight wants to provide a best-of-breed solution
- Existing FMS products are very mature and well tested
- Many OS/platform combinations are supported
- An FMS needs to solve many low-level problems that are not ArcSight's core competency

Why Do We Need Identical Systems?
Any difference between the two systems presents a risk that failover may fail, leaving the component nonfunctional:
- The standby system may not be powerful enough to handle the load
- A different OS/patch version on the standby system could prevent the component from starting up
It is also easier to keep the systems updated if they are identical. Using the standby system for double duty is not recommended!
Good practice: when you restart the Manager for a configuration change, use the FMS and bring it up on the standby node, so the failover path is exercised regularly.

Shared Storage
- A shared volume hosts $ARCSIGHT_HOME or $ORACLE_HOME
- Needed so that all files (rules checkpoints, archived reports, etc.) are reliably in sync between the components
- The storage itself needs to be highly available (use RAID and multiple I/O channels)
- A shared SCSI bus cannot be used
- A simple NFS server or similar would be a single point of failure
- A SAN is used for the shared storage
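To make this concrete, here is a sketch of how the shared volume might appear on both nodes; the device name and mount point are hypothetical, and in practice the FMS (not the OS) mounts the volume on whichever node is active:

    # Hypothetical /etc/fstab entry, identical on both cluster nodes.
    # "noauto" keeps the OS from mounting the SAN volume at boot; the
    # FMS mounts it on the active node only, since both nodes must
    # never mount it at the same time.
    /dev/sdb1   /opt/arcsight   ext3   noauto,defaults   0 0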

IP Address Transparency
A typical set-up has three IP addresses:
- One IP address for each system (the system IPs)
- One IP address for the Manager/Connector/Database, also called the virtual IP or service IP
Clients always talk to the service IP: point DNS and/or hosts files there! (A minimal sketch follows.)
Some FMS products use IP-based networks for the heartbeat networks; these may require additional IPs.
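As a sketch, using the example addresses from the diagram below (the host names for the two system IPs are hypothetical):

    # Client-side /etc/hosts (or equivalent DNS records): the Manager
    # name resolves to the service IP, which follows the active node.
    192.168.10.89   arcsight.customer.com    # service IP
    192.168.10.90   manager1.customer.com    # system IP of Manager 1 (hypothetical name)
    192.168.10.98   manager2.customer.com    # system IP of Manager 2 (hypothetical name)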

IP Address Transparency
[Diagram, shown in two steps: a Console or Connector resolves arcsight.customer.com to the service IP 192.168.10.89. Before failover, Manager 1 (system IP 192.168.10.90) holds the service IP; after failover, it moves to Manager 2 (system IP 192.168.10.98). Heartbeat networks, shared storage, and a shared database connect the two Managers]

Multiple Communication Paths
- Multiple discrete communication channels between the systems
- Use a mix of technologies (serial, disk, Ethernet)
- At least two channels
- Needed to avoid split brain syndrome (see the sketch below)
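As one concrete example, Veritas Cluster Server builds its heartbeats on LLT; a minimal per-node /etc/llttab sketch, assuming two dedicated Ethernet interfaces (eth1, eth2) and hypothetical node and cluster IDs:

    # /etc/llttab on manager1 (all values are examples only).
    # Each "link" line is one discrete heartbeat channel; two
    # independent links are what protect against split brain.
    set-node manager1
    set-cluster 100
    link hb1 eth1 - ether - -
    link hb2 eth2 - ether - -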

Split Brain Syndrome
[Diagram sequence: Manager 1 and Manager 2 on the public network, with shared storage and a shared database]
1. The heartbeat links between the two Managers are lost; only the shared storage and database still connect them.
2. Each Manager wonders: "What happened to the other server?"
3. The active Manager decides: "Let's just keep going." The standby decides: "Probably broke. Let's start up."
4. Both Managers now run against the shared storage and database, which soon "looks pretty inconsistent…"

Installation Overview

Set-Up Procedure for ArcSight Manager
1. From one system, install the Manager on the shared disk.
2. In managersetup, install the startup scripts, but disable them so that the Manager doesn't auto-start during boot (option #3).
3. Set the Cluster ID to the host name in managersetup.
4. On the other host, run managersetup from the shared disk and install the startup scripts there as well. Note that the startup scripts do not reside on the shared disk, but on the local disk of each system.
5. Test the Manager on both systems. (A shell sketch of the procedure follows.)
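A rough shell sketch of the procedure, assuming the shared volume is mounted at /opt/arcsight (the path is an assumption; the managersetup choices come from the steps above):

    # On node 1: install the Manager onto the shared volume, then run
    # managersetup. Install the startup scripts but leave them disabled
    # (option #3), and set the Cluster ID to this host's name.
    cd /opt/arcsight/manager/bin
    ./arcsight managersetup

    # On node 2: the install already sits on the shared volume; run
    # managersetup again so this node gets its own local startup scripts.
    cd /opt/arcsight/manager/bin
    ./arcsight managersetup

    # Test on each node in turn -- only one node may own the volume at a time.
    /etc/init.d/arcsight_manager start
    /etc/init.d/arcsight_manager stop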

FMS Set-Up
- Install the FMS
- Set up the IP addresses and heartbeat networks
- Set up Manager startup, shutdown, and monitoring (scripts are under utilities/failover)
- Set up the service IP/virtual IP
- Test whether the Manager can be started, stopped, and migrated
- Group the Manager process and the service IP so they move between hosts together (see the sketch below)
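In Veritas Cluster Server, for instance, that grouping might look roughly like the following main.cf fragment; the group and resource names, addresses, and the monitor script path are all hypothetical (the monitor script is assumed to wrap arcsight managerup as described later):

    // Hypothetical VCS main.cf fragment: one service group ties the
    // service IP and the Manager process together, so the FMS always
    // moves them between hosts as a unit.
    group arcsight_manager_grp (
        SystemList = { manager1 = 0, manager2 = 1 }
        AutoStartList = { manager1 }
        )

        IP manager_vip (
            Device = eth0
            Address = "192.168.10.89"
            NetMask = "255.255.255.0"
            )

        Application manager_app (
            StartProgram = "/etc/init.d/arcsight_manager start"
            StopProgram = "/etc/init.d/arcsight_manager stop"
            MonitorProgram = "/opt/arcsight/manager/bin/monitor_manager.sh"
            )

        // The Manager may only start once the service IP is plumbed.
        manager_app requires manager_vip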

FMS Options
EMC AutoStart (formerly Legato AAM):
- Communication over IP
- UNIX, Windows
Veritas Cluster Server:
- Communication over a proprietary protocol
- UNIX, Windows

ARCSIGHT_CID
ARCSIGHT_CID is the ArcSight Cluster ID:
- Set in /etc/arcsight/arcsight.conf
- Manually set it in .profile as well, for convenience
- Must be different on each of the hosts; usually the host name is used
- Affects where log files get written: logs/$ARCSIGHT_CID/ (the default value is "default")
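A minimal sketch of the two places to set it, assuming the host name is used as the Cluster ID (the exact arcsight.conf syntax shown here is an assumption):

    # /etc/arcsight/arcsight.conf -- local to each node, so each node
    # logs to its own logs/$ARCSIGHT_CID/ directory on the shared volume:
    ARCSIGHT_CID=manager1

    # ~/.profile of the arcsight user -- exported for interactive shells:
    export ARCSIGHT_CID=`hostname`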

Scripts
Startup and shutdown:
- Use /etc/init.d/arcsight_manager start|stop
- arcsight_manager needs to be executed as root
Monitoring:
- Use $ARCSIGHT_HOME/bin/arcsight managerup
- arcsight managerup needs to run as the arcsight user
- If needed, alter the monitor script to pipe the output of managerup into a file so you can watch the output (see the sketch below)
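Such a monitor wrapper might look like the following sketch; the install path and log file location are assumptions:

    #!/bin/sh
    # Hypothetical monitor wrapper for the FMS. The FMS invokes it as
    # root; it drops to the arcsight user to run managerup and appends
    # the output to a log file. The exit status is passed through so
    # the FMS can tell whether the Manager is up.
    su - arcsight -c "/opt/arcsight/manager/bin/arcsight managerup" \
        >> /var/log/arcsight_managerup.log 2>&1
    exit $?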

Summary
- ArcSight now supports HA for Connectors
- The same HA solution can be used for the Manager and the database
- EMC AutoStart and Veritas Cluster Server are supported and tested
- Tech Notes on configuring both products are available

Questions and Answers
Download slides: https://support.arcsight.com
More ArcSight events: http://www.arcsight.com
Join the User Forum: https://forum.arcsight.com
