IT Business Continuity Briefing March 3, 2011.  Incident Overview  Improving the power posture of the Primary Data Center  STAGEnet Redundancy  Telephone.


IT Business Continuity Briefing March 3, 2011

Agenda
• Incident Overview
• Improving the power posture of the Primary Data Center
• STAGEnet Redundancy
• Telephone Redundancy
• Secondary Data Center and Recovery Point Objectives (RPO)
• Secondary Data Center and Recovery Time Objectives (RTO)
• Customer communications during outage incidents

IT Business Continuity Dependencies
• SYSTEMS & DATA
• NETWORK SERVICES
• POWER & ENVIRONMENTALS
• FACILITIES & STAFF

Incident Impact
• SYSTEMS & DATA
• NETWORK SERVICES
• POWER & ENVIRONMENTALS
• FACILITIES & STAFF

January 18th Incident Response
• ITD powered down servers and equipment in the Primary Data Center to minimize data loss.
• ITD started to provision equipment to allow the Secondary Data Center to assume the role of the primary data center.
• Initial time estimates projected power being restored to the Primary Data Center by 6:00 pm.
• Power was restored at 5:50 pm, core network services at 6:30 pm, and the final systems and applications by 11:30 pm.

Power Posture Improvements
• Primary Data Center and Secondary Data Center both have generators to provide backup power.
• ITD is working with Facilities Management and Sirius Computer Solutions to identify and implement solutions that will provide a second redundant power source to the Primary Data Center.
• Hoping to be completed by the end of

STAGEnet Redundancy
• The Four Quadrant RPR Ring provides redundancy on the statewide ring by allowing traffic to fail over automatically if a core node fails.
• The Network Point of Presence in each quadrant has equipment architected for High Availability and backup power generation.
• Internet Gateways in Bismarck and Fargo are load balanced and architected to provide failover if one of the Internet Gateways fails.
• Agencies should coordinate with ITD if they require redundancy (network diversity) at individual endpoint locations.
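The ring failover described above can be illustrated with a toy model. This is only a sketch of the general idea (traffic goes the other way around the ring when a node is down), not the RPR protocol itself; the node names and the `ring_path` function are hypothetical.

```python
# Hypothetical names for the four quadrant nodes, listed in ring order.
RING = ["NW", "NE", "SE", "SW"]

def ring_path(src, dst, failed=frozenset()):
    """Return a path from src to dst that avoids failed nodes, or None."""
    if src in failed or dst in failed:
        return None
    n = len(RING)
    start = RING.index(src)
    for step in (1, -1):  # try one direction, then the other
        path = [src]
        i = start
        while RING[i] != dst:
            i = (i + step) % n
            if RING[i] in failed:
                path = None  # this direction is blocked
                break
            path.append(RING[i])
        if path is not None:
            return path
    return None

print(ring_path("NW", "SE"))                 # normal path
print(ring_path("NW", "SE", failed={"NE"}))  # rerouted around the failed node
```

With a single failed core node, the remaining nodes can still reach each other by travelling in the opposite direction, which is the property the slide attributes to the ring.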

Telephone Redundancy - Current
• Current Design is a Standard Digital Design
• Dependent on the PBX serving the endpoint
• The PBX has high availability components
• Does not provide redundant service if the PBX fails
• There is a service agencies can purchase to re-route critical numbers (e.g. Crisis Hotlines) in the event of a disaster.

Telephone Redundancy - VoIP
• New Voice over IP (VoIP) design during the next two years.
• As part of the standard VoIP design we will have four redundant Call Managers on STAGEnet, which provide failover if the primary Call Manager serving a site fails.
• Provides the ability to relocate telephone numbers to other sites with network connectivity.
• Provides redundant core services for dial tone, call center and automatic call distribution (ACD).
• Will not initially provide redundancy for voice mail, mobility and Interactive Voice Response (IVR).
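The Call Manager failover idea can be sketched as a priority list: a site uses its primary Call Manager and falls back to the next healthy one. This is a minimal illustration under assumed names (`cm-1` to `cm-4`, `select_call_manager`); it does not describe how any particular VoIP product implements failover.

```python
# Hypothetical identifiers for the four redundant Call Managers.
CALL_MANAGERS = ["cm-1", "cm-2", "cm-3", "cm-4"]

def select_call_manager(preferred, healthy):
    """Return the first Call Manager in priority order that is healthy."""
    for cm in preferred:
        if cm in healthy:
            return cm
    return None  # no Call Manager is reachable

# The site's primary (cm-1) has failed, so the next one in line takes over.
print(select_call_manager(CALL_MANAGERS, healthy={"cm-2", "cm-3", "cm-4"}))
```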

Recovery Point Objective (RPO) Recovery Time Objective (RTO)

Recovery Point Objective (RPO)
• The Recovery Point Objective (RPO): the point in time to which you must go back to recover data when a loss incident occurs.
• RPO focuses on data and is independent of the time it takes to get a non-functional system back on-line (the Recovery Time Objective or RTO).
• Generally a definition of what an agency determines is an “acceptable loss” in a disaster situation.
• The value of the data in the “acceptable loss” window can then be weighed against the cost of the additional loss-prevention measures that would be necessary to narrow the window.

Recovery Point Objective (RPO)
• Generally speaking, backups are performed on a nightly basis to tape at our Secondary Data Center.
• Databases have full weekly backups and nightly incremental backups.
• Other data: only items that have changed during the day are backed up.
• Generally speaking, the RPO or potential loss window for most data is one day: a Tuesday 4 pm disaster would require you to restore the Monday night backup, and the activity for Tuesday is lost.
• Agencies whose business requirements don’t allow for this potential data loss implement data replication.
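The Tuesday 4 pm example can be worked through in a short sketch. It assumes the nightly backup completes at 23:00 and that the weekly full backup runs on Sunday night; the slides state neither, so both values, and the function names, are illustrative only.

```python
from datetime import datetime, time, timedelta

NIGHTLY_BACKUP_TIME = time(23, 0)  # assumed; the slides only say "nightly"
FULL_BACKUP_WEEKDAY = 6            # assumed Sunday (Monday is 0)

def last_backup_before(disaster):
    """Return the most recent nightly backup taken before the disaster."""
    candidate = datetime.combine(disaster.date(), NIGHTLY_BACKUP_TIME)
    if candidate >= disaster:
        candidate -= timedelta(days=1)
    return candidate

def data_loss_window(disaster):
    """Return how much activity is lost after restoring the last backup."""
    return disaster - last_backup_before(disaster)

def restore_chain(disaster):
    """List the backups to restore: the last full, then each incremental."""
    day = last_backup_before(disaster)
    chain = []
    while day.weekday() != FULL_BACKUP_WEEKDAY:
        chain.append(("incremental", day))
        day -= timedelta(days=1)
    chain.append(("full", day))
    return list(reversed(chain))

disaster = datetime(2011, 3, 1, 16, 0)  # a Tuesday at 4 pm
print(last_backup_before(disaster))     # Monday night backup
print(data_loss_window(disaster))       # Tuesday's activity is lost
print(restore_chain(disaster))
```

Under these assumptions the database restore needs the Sunday full backup followed by the Monday incremental, and everything entered on Tuesday before 4 pm falls inside the loss window.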

Recovery Time Objective (RTO)
• Recovery Time Objective (RTO): a measure of how quickly a system must resume normal operations to avoid unacceptable business impacts.
• Prior to 2006, ITD contracted for an out-of-state disaster recovery hot site with a best-case mainframe RTO of 72 hours.
• With the deployment of online applications and multiple platforms, a contracted hot site with adequate network bandwidth and processing capacity became unaffordable.
• ITD invested in a second data center to improve the State’s RPO and moved to a four-hour RTO for core network services.

Recovery Time Objective (RTO)
• Now looking to improve the RTO of the second data center from four hours to a matter of minutes for core network services.
• Base services that will be up within the first hour:
   • File and print services
   • AS/400 platform and applications
   • Current replicated hardware
   • Disaster Recovery Web Site: basic information

Recovery Time Objective (RTO)
• Base services that will be up within four to twelve hours:
   • Mainframe (must IPL) / DELA
   • ConnectND
• Selected shared services and some agencies have development and/or test environments residing at the second data center. These environments will be converted to assume the role of production servers in a disaster scenario.

Recovery Time Objective (RTO)
• Agencies that do not invest in replicated data solutions and backup processing capacity will need to wait for additional storage and servers to be shipped and provisioned. Estimated RTO: 3 to 8 weeks for production systems, depending on hardware availability, staffing priorities and the amount of data to restore.
• Agencies that invest in replicated data solutions but no backup processing capacity will need to wait for servers to be shipped and provisioned. Estimated RTO: 2 to 4 weeks, depending on hardware availability and staffing priorities.
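The two investment scenarios above can be captured as a small lookup. This is a sketch only: the `RtoEstimate` type and `estimated_rto` function are hypothetical, and combinations the slide does not give an estimate for return `None` rather than a guessed figure.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RtoEstimate:
    low_weeks: int
    high_weeks: int

def estimated_rto(replicated_data, backup_capacity):
    """Return the slide's estimated RTO range, or None if it gives none."""
    if not replicated_data and not backup_capacity:
        # Storage and servers must be shipped, then data restored.
        return RtoEstimate(3, 8)
    if replicated_data and not backup_capacity:
        # Data is already at the second site; only servers are needed.
        return RtoEstimate(2, 4)
    return None  # not stated on this slide

print(estimated_rto(replicated_data=False, backup_capacity=False))
print(estimated_rto(replicated_data=True, backup_capacity=False))
```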

Disaster Recovery Communications
• We feel we can improve our communications process during any future disaster events.
• Planned communication avenues:
   • DR Website
   • Customer Service Desk
   • Notifind: currently used to communicate with our staff
• We may be asking for emergency contacts for critical applications

Questions ITD Contingency Planning Contact Larry