Server Upgrade HA/DR Integration

Presentation transcript:

Server Upgrade HA/DR Integration
Presented by: Tony Barnes & Arash K. Ardestani
April-May 2014

Introductions
Add intro information here

About this session
The goal of this session is to give LIS administrators an overview of the LIS systems that reside under HA cluster support. In the following slides we will review the general elements of each cluster and how they interact with each other. We will also look at some of the risks and limitations of the cluster and how to minimize their impact.

Resources of Cluster - Dedicated
A typical SCC cluster consists of two AIX hosts, also known as nodes. By default, each node has a few resources of its own:
- Its own network port/adapter (dedicated ports)
- Its own permanent IP address, also known as the persistent address
- Its own AIX operating system
- Its own dedicated disks and devices such as tape, CD, etc.

Resources of Cluster - Shared
Depending on the purpose of the cluster, the two nodes share sets of resources, meaning both nodes can see them simultaneously:
- Multiple sets of disk drives (shared storage)
  Note: A special disk is used as a communication interface among the nodes
- Multiple sets of IP addresses (service IP addresses)
- A set of policies that define and govern the roles and responsibilities of the nodes within the cluster

Resource Groups & Policies
Resource group: A collection of resources of a node (IP, disk, devices) and their responsibilities
Example: Data volumes, filesystems, scripts, printing, etc.
Policy: A set of rules and regulations that exactly define the behavior of a resource group
Policies are broken down into two major sets:
- Events: An event is a set of conditions that results in a particular set of actions
- Actions: A set of one or more procedures that are performed in response to a particular set of events

Resource Groups & Policies
Example resource group: Computer Repair Department (software tech, hardware tech, manager)
Policy: Find and fix problems with computers
Events: If a problem is found on a computer, determine whether it is a hardware or a software problem
Actions: The manager creates a ticket and sends it to the proper tech for evaluation and repair
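The computer repair analogy can be written down as a small data structure: a resource group that holds its resources plus a policy mapping each event to an action. This is only an illustrative sketch. The class name `ResourceGroup`, the event name `problem_found`, and the `route_ticket` helper are hypothetical and are not part of any cluster product.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ResourceGroup:
    """A named collection of resources plus the policy that governs them."""
    name: str
    resources: List[str]
    # The policy maps an event name to the action performed when it occurs.
    policy: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def handle(self, event: str, detail: str) -> str:
        """Look up the action for an event and perform it."""
        action = self.policy.get(event)
        if action is None:
            return f"no action defined for event '{event}'"
        return action(detail)


def route_ticket(kind: str) -> str:
    """Action: the manager sends a ticket to the proper tech (hypothetical)."""
    techs = {"HW": "hardware tech", "SW": "software tech"}
    if kind not in techs:
        raise ValueError(f"unknown problem kind: {kind}")
    return f"manager opens a ticket and sends it to the {techs[kind]}"


repair_dept = ResourceGroup(
    name="Computer Repair Department",
    resources=["software tech", "hardware tech", "manager"],
    policy={"problem_found": route_ticket},
)

print(repair_dept.handle("problem_found", "HW"))
```

Keeping events and actions in one mapping mirrors the slide: the policy is the whole table, an event is a key, and an action is the procedure stored under it.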

Resource Group: MAIN Application Functions, AUX Application Functions
Policy:
- The MAIN node starts, runs, and controls the MAIN application functions
- The AUX node starts, runs, and controls the AUX application functions
- MAIN asks for the health of AUX, and AUX asks for the health of MAIN
Events: If MAIN does not get a response from AUX within 60 seconds
Actions: Then MAIN takes over AUX: it starts, runs, and controls the functions of the AUX node

Policy:
- The MAIN node starts, runs, and controls the MAIN application functions
- The AUX node starts, runs, and controls the AUX application functions
- MAIN asks for the health of AUX, and AUX asks for the health of MAIN
Events: If AUX does not get a response from MAIN within 60 seconds
Actions: Then AUX takes over MAIN: it starts, runs, and controls the functions of the MAIN node
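The mutual health check and takeover rule can be sketched in a few lines. This is a simplified model, not how the cluster software is actually implemented: the `Node` class and its methods are hypothetical, time is passed in explicitly so the example is deterministic, and only the 60-second threshold comes from the slides.

```python
from dataclasses import dataclass
from typing import Set

HEARTBEAT_TIMEOUT_S = 60.0  # threshold stated in the policy above


@dataclass
class Node:
    """One cluster node and the application functions it currently runs."""
    name: str
    functions: Set[str]
    last_reply_from_peer: float = 0.0  # time of the peer's last health reply

    def record_reply(self, now: float) -> None:
        """Event input: the peer answered a health request at time `now`."""
        self.last_reply_from_peer = now

    def check_peer(self, peer: "Node", now: float) -> bool:
        """Take over the peer's functions if it has been silent too long.

        Returns True when a takeover happened.
        """
        if now - self.last_reply_from_peer >= HEARTBEAT_TIMEOUT_S:
            # Action: start, run, and control the peer's functions here.
            # The silent peer is assumed to no longer be running them.
            self.functions |= peer.functions
            peer.functions = set()
            return True
        return False


main = Node("MAIN", {"MAIN application functions"})
aux = Node("AUX", {"AUX application functions"})

main.record_reply(now=0.0)               # AUX answered at t = 0
print(main.check_peer(aux, now=30.0))    # AUX silent for 30 s: no takeover
print(main.check_peer(aux, now=60.0))    # AUX silent for 60 s: MAIN takes over
print(sorted(main.functions))
```

The rule is symmetric: swapping the roles of `main` and `aux` gives the second policy, in which AUX takes over MAIN.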

Important Considerations
- A cluster solution does not translate to 100% uptime for users
- The goal is to minimize the downtime and the manual effort needed to recover from single points of failure
- As with any technology, a cluster needs to be maintained and tested
- The initial design and implementation of a solution set the tone for everything that follows
- Understand the purpose of the cluster and adjust your expectations
- Have redundancy for everything you possibly can, and have a backup plan
- A cluster is only as robust as the infrastructure it is running on!

Limitations of Cluster
- A cluster cannot undo or correct a human mistake
- The proper resources must exist to build the cluster on
- There are conditions that cause the cluster to break
- A broken system with a cluster is much harder to fix
- There is still a short downtime when a failover event occurs
- Some applications do not play well with a cluster

Beyond HA Cluster
More and more often we have a big question on the table: What happens if no node of the cluster is available or able to work? This simply translates to complete downtime, and the cluster cannot do anything to help you! So what can you do to protect yourself against such a disaster? The answer is simply: have a disaster recovery solution.

Types of Disaster Recovery
Cold DR: A set of servers is secured for the critical applications. The cold DR solution is never ready to be used as-is. It is an empty shell waiting to be installed when a disaster is declared.
Method of recovery: Usually from backup tapes or a backup server
Warm DR: A set of servers is secured, installed, and maintained, ready to be used when a disaster is declared.
Method of recovery: Usually direct data replication in real time

Warm or Cold?
With the cost of hardware steadily and rapidly declining, organizations are more interested than ever in a disaster recovery solution. Ultimately, the decision about which DR solution is right for you comes down to two important factors:
- RTO: Recovery Time Objective. In simple terms, RTO determines how long it will take to have a working system. Simply: downtime!
- RPO: Recovery Point Objective. How close to 100% the data is that you will recover and have to work with. Simply: data loss!
Here is a chart from an expert...

How to find RTO/RPO?
- Figure out what it will cost when you lose the LIS system for 1 hour
- Figure out how long you can be without the LAB system and still survive
- Figure out how much data you can lose and still survive
- Know the limitations of the applications that you are using
- Know the limitations of the infrastructure that you have in place
- Know the limitations of the staff and other dependent departments
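The questions above reduce to simple arithmetic once the numbers are known. The sketch below is a hedged illustration only: every figure in it is a made-up placeholder, not a measurement or a recommendation, and the names `Requirement`, `DrOption`, `downtime_cost`, and `meets` are hypothetical. It also assumes RPO is expressed as a time window of lost data, which is one common convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """What the organization can tolerate."""
    rto_hours: float  # longest acceptable downtime
    rpo_hours: float  # largest acceptable window of lost data


@dataclass(frozen=True)
class DrOption:
    """What a DR solution is expected to deliver."""
    name: str
    recovery_hours: float   # expected time until a working system
    data_loss_hours: float  # expected window of data lost


def downtime_cost(cost_per_hour: float, hours: float) -> float:
    """Cost of losing the LIS system for a given number of hours."""
    return cost_per_hour * hours


def meets(option: DrOption, req: Requirement) -> bool:
    """True if the option satisfies both the RTO and the RPO."""
    return (option.recovery_hours <= req.rto_hours
            and option.data_loss_hours <= req.rpo_hours)


# Placeholder numbers for illustration only.
req = Requirement(rto_hours=4.0, rpo_hours=1.0)
cold = DrOption("Cold DR", recovery_hours=48.0, data_loss_hours=24.0)
warm = DrOption("Warm DR", recovery_hours=1.0, data_loss_hours=0.0)

for option in (cold, warm):
    cost = downtime_cost(cost_per_hour=1000.0, hours=option.recovery_hours)
    print(option.name, "meets requirement:", meets(option, req),
          "| downtime cost:", cost)
```

Plugging in your own cost per hour and tolerances makes the warm-versus-cold decision a comparison of numbers rather than a guess.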

What we offer...
Over the past few years we have designed and tested quite a few different DR solutions that fit the needs of most of our clients while staying within an affordable rate. The design is based on these elements:
- Add a third system in the DR data center
- Repeat system profiles from MAIN/AUX on DR
- SAN-2-SAN replication from MAIN/AUX to DR
- Use DNS to connect all the LAB devices to LIS systems
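One way to read the last element is that LAB devices connect to a host name rather than a fixed IP address, so that repointing a single DNS record moves every device to the DR system. The sketch below models that idea with an in-memory dictionary standing in for DNS. It is an assumption about the intent, not a description of the actual design; the host name `lis.example.org`, the classes, and the addresses (taken from the reserved documentation ranges) are all hypothetical.

```python
from typing import Dict


class DnsZone:
    """In-memory stand-in for a DNS zone (illustration only)."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}

    def set_record(self, name: str, address: str) -> None:
        """Create or repoint the record for a host name."""
        self._records[name] = address

    def resolve(self, name: str) -> str:
        """Return the address currently published for a host name."""
        return self._records[name]


class LabDevice:
    """A LAB device configured with the LIS host name, never its IP."""

    def __init__(self, name: str, lis_hostname: str) -> None:
        self.name = name
        self.lis_hostname = lis_hostname

    def connect(self, dns: DnsZone) -> str:
        """Resolve the LIS name at connect time and return the target."""
        return dns.resolve(self.lis_hostname)


MAIN_ADDR = "192.0.2.10"    # hypothetical MAIN/AUX service address
DR_ADDR = "198.51.100.10"   # hypothetical DR system address

dns = DnsZone()
dns.set_record("lis.example.org", MAIN_ADDR)
analyzer = LabDevice("analyzer-1", "lis.example.org")

print(analyzer.connect(dns))                  # normal operation: MAIN/AUX
dns.set_record("lis.example.org", DR_ADDR)    # disaster declared: repoint
print(analyzer.connect(dns))                  # same device config, now DR
```

In a real network, cached lookups mean devices may keep using the old address until the record's time-to-live expires, so that delay counts toward the RTO.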