Presentation on theme: "Business Continuity: Ensuring Survival. Ron LaPedis, CBCP, CISSP, Sr. Product Manager, Compaq." Presentation transcript:
Business Continuity: Ensuring Survival Ron LaPedis, CBCP, CISSP, Sr. Product Manager, Compaq
2 Agenda Continuity planning? I thought it was called disaster recovery… Why? Professional practices Continuity planning model Step by step Horror stories Food for thought
3 Source: San Francisco Chronicle Some people never learn… 11/30/89 Crane Collapse Closes Buildings (Over 1 month after the Loma Prieta earthquake) …for 10 minutes…her job was to race through work areas and scoop up appointment books, payroll records and Rolodexes needed to carry on business elsewhere… Many tenants' main concern was getting payroll checks…phone lists and calendars
4 Something happens Business process loss Productivity (Single department or multiple departments) Disaster event occurs Time Source: DRII
5 Disaster recovery Business process loss Productivity Disaster event occurs Time Source: DRII
6 Continuity planning Business process loss Disaster event occurs Time Source: DRII Productivity
8 Downtime is lost revenue Source: Contingency Planning Research, 2000
Industry        Application           Average cost per hour of downtime (US$)
Financial       Brokerage operations  $7,840,000
Financial       Credit card sales     $3,160,000
Media           Pay-per-view          $183,000
Retail          Home shopping (TV)    $137,000
Retail          Catalog sales         $109,000
Transportation  Airline reservations  $108,000
Entertainment   Tele-ticket sales     $83,000
Shipping        Package shipping      $34,000
Financial       ATM fees              $18,000
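The per-hour figures on this slide turn into an outage estimate by simple multiplication. A minimal sketch in Python, assuming the cost scales linearly with the length of the outage (a simplification); the function name `downtime_cost` and the dictionary are illustrative and not part of the original slides:

```python
# Average cost per hour of downtime (US$), copied from the slide
# (Contingency Planning Research, 2000).
COST_PER_HOUR = {
    "Brokerage operations": 7_840_000,
    "Credit card sales": 3_160_000,
    "Pay-per-view": 183_000,
    "Home shopping (TV)": 137_000,
    "Catalog sales": 109_000,
    "Airline reservations": 108_000,
    "Tele-ticket sales": 83_000,
    "Package shipping": 34_000,
    "ATM fees": 18_000,
}


def downtime_cost(application: str, minutes: float) -> float:
    """Estimated revenue lost, ASSUMING cost grows linearly with outage time."""
    return COST_PER_HOUR[application] * minutes / 60


if __name__ == "__main__":
    for app in ("Brokerage operations", "ATM fees"):
        print(f"{app}: 30-minute outage ~ ${downtime_cost(app, 30):,.0f}")
```

Real losses are rarely linear (a short outage may be absorbed, a long one may lose customers for good), so treat the result as an order-of-magnitude figure for the business impact analysis.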
9 Time zones are no longer a barrier for conducting business If your site is down, your competition is one click away –Utility failure –Communications failure –System failure –Application failure –OS failure –Utility upgrade –Communications upgrade –System upgrade –Application upgrade –OS upgrade And what about system and database maintenance? Downtime is not acceptable
10 Downtime is controllable System and network architecture –High-availability systems –Redundant network –Hardened primary site –Remote backup site Continuity planning –Know what you will do before you need to do it
11 Continuity planning perspective Ensures that an event doesn't become a disaster Covers a broad spectrum of business and technology issues The key goal: –Required business process availability
12 Disaster Recovery Institute International (DRII) Mission DRII's mission is to provide the leadership and best practices that serve as a base of common knowledge for all business continuity and disaster recovery planners and organizations in the industry.
13 DRII's professional practices Pre-planning 1. Project initiation and management 2. Risk evaluation and control 3. Business impact analysis Planning 4. Developing business continuity strategies 5. Emergency response and operations 6. Developing and implementing business continuity plans Post-planning 7. Awareness and training programs 8. Maintaining and exercising business continuity plans 9. Public relations and crisis communication 10. Coordination with public authorities
14 DRII's business continuity planning model 1. Project initiation phase 2. Functional requirements phase 3. Design and development phase 4. Implementation phase 5. Testing and exercise phase 6. Maintenance and update phase 7. Execution phase
15 It's a process Project initiation Functional requirements Design and development Implementation Maintenance and updating Testing and exercising Procedures Business continuity process Start Required availability times Source: DRII
16 Project initiation phase Management commitment and policies Objectives and requirements Baseline assumptions Project management Teams –Delphi – Business function knowledge –Corporate team – Infrastructure / common activities –EMT – Emergency Management Team: the workers –CMT – Crisis Management Team: the decision makers
17 Project initiation phase Project management CP is a process consisting of programs and projects It does not take a subject matter expert to manage projects; it takes a project manager Use your CP experts to perform CP activities, not to manage projects.
18 Source: DRII Project initiation Functional requirements Design and development Implementation Maintenance and updating Testing and exercising Procedures Business continuity process Required availability times You are here
19 Functional requirements phase Fact gathering, alternatives and decisions Risk analysis and controls Business impact analysis –RTO – Recovery Time Objective – How fast –RPO – Recovery Point Objective – How much Alternative strategies Cost benefit analysis and budgeting
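RTO and RPO become useful once a candidate strategy is checked against them. A minimal sketch in Python, assuming that the worst-case data loss equals the backup interval plus the delay before a copy is safely off-site; that model, and every name below, is illustrative rather than taken from the slides:

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta,
                         offsite_delay: timedelta = timedelta(0)) -> timedelta:
    """ASSUMED model: a failure just before the next copy is safe off-site
    loses one full backup interval plus the off-site transfer delay."""
    return backup_interval + offsite_delay


def meets_objectives(rto: timedelta, rpo: timedelta,
                     estimated_recovery_time: timedelta,
                     backup_interval: timedelta,
                     offsite_delay: timedelta = timedelta(0)) -> bool:
    """True if the strategy recovers fast enough (RTO: how fast)
    and loses little enough data (RPO: how much)."""
    fast_enough = estimated_recovery_time <= rto
    little_enough = worst_case_data_loss(backup_interval, offsite_delay) <= rpo
    return fast_enough and little_enough


if __name__ == "__main__":
    # Hypothetical targets: back in business within 4 hours,
    # losing at most 1 hour of data.
    rto, rpo = timedelta(hours=4), timedelta(hours=1)
    print(meets_objectives(rto, rpo, timedelta(hours=3), timedelta(minutes=30)))
    print(meets_objectives(rto, rpo, timedelta(hours=3), timedelta(hours=24)))
```

In this sketch a nightly backup fails a one-hour RPO no matter how quickly the systems come back, which is why the two objectives are set separately.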
20 Functional requirements phase Risk analysis Asset inventory and definition Evaluation of controls Decision Vulnerability and threat assessment Communication and monitoring
21 Functional requirements phase Risk analysis Quantitative – Facts and figures, hard –Statistical –Actuarial –Annualized Loss Exposure (ALE) –Objective Qualitative – Not calculable, soft –Reputation –Future market share –Subjective
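Annualized Loss Exposure is commonly computed as the expected loss from a single event multiplied by the expected number of such events per year. A minimal sketch in Python using that common definition; the function names and the dollar figures are hypothetical, and the slides do not prescribe a formula:

```python
def annualized_loss_exposure(single_loss: float, events_per_year: float) -> float:
    """ALE = expected loss per event x expected events per year."""
    return single_loss * events_per_year


def control_net_benefit(loss_without_control: float,
                        loss_with_control: float,
                        events_per_year: float,
                        annual_control_cost: float) -> float:
    """Yearly saving from a control that lowers the loss per event.

    The event frequency is deliberately left unchanged: the control is
    modeled as reducing exposure, not the threat itself.
    """
    before = annualized_loss_exposure(loss_without_control, events_per_year)
    after = annualized_loss_exposure(loss_with_control, events_per_year)
    return before - after - annual_control_cost


if __name__ == "__main__":
    # Hypothetical: a $400,000 outage expected once every four years.
    print(annualized_loss_exposure(400_000, 0.25))
    # Hypothetical control: cuts the loss to $100,000, costs $30,000 a year.
    print(control_net_benefit(400_000, 100_000, 0.25, 30_000))
```

A positive net benefit argues for the control on quantitative grounds alone; the qualitative factors on the slide (reputation, future market share) still have to be weighed separately.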
22 Functional requirements phase Risk analysis Controls do not reduce the threat, they reduce the exposure (and hence, the risk)
23 Functional requirements phase Business impact analysis [Chart: money plotted against time to recover, with a COST curve and a LOSS curve; marked points: maximum cost of control, acceptable downtime]
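One way to read the cost and loss trade-off on this slide: faster recovery costs more to provide, slower recovery loses more business, and the strategy to fund is the one with the lowest combined total. A minimal sketch in Python under that reading; the strategy names, costs, and recovery times are hypothetical, and cost and loss are assumed to be expressed over the same period:

```python
from typing import List, Tuple

# (name, hours to recover, cost of providing the strategy)
Strategy = Tuple[str, float, float]


def total_exposure(strategy: Strategy, loss_per_hour: float) -> float:
    """Cost of the control plus the loss incurred while recovering."""
    _name, recovery_hours, cost = strategy
    return cost + recovery_hours * loss_per_hour


def cheapest_strategy(strategies: List[Strategy], loss_per_hour: float) -> str:
    """Name of the strategy with the lowest cost-plus-loss total."""
    return min(strategies, key=lambda s: total_exposure(s, loss_per_hour))[0]


if __name__ == "__main__":
    # Hypothetical options and figures, for illustration only.
    options = [
        ("hot site", 1, 200_000),
        ("warm site", 24, 50_000),
        ("cold site", 168, 10_000),
    ]
    print(cheapest_strategy(options, loss_per_hour=10_000))
    print(cheapest_strategy(options, loss_per_hour=1_000))
```

The same options give different answers at different loss rates, which is why the business impact analysis has to come before the choice of strategy.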
24 Source: DRII Project initiation Functional requirements Design and development Implementation Maintenance and updating Testing and exercising Procedures Business continuity process Required availability times You are Here
25 Design and development phase Scope and objectives Recovery teams Cookbook Key disaster scenario Escalation, notification, and activation
26 Design and development phase Recovery teams Evaluation and declaration Notification Emergency response Interim processing Salvage Relocation/reentry
27 Design and development phase Key disaster scenario A fire broke out in the computer room. We are unsure of the state of the computers and data stored there. The building has been shut down by the fire department until they are sure that it is safe to enter. They are estimating that we will not have access to the building for a couple of days
28 Design and development phase Escalation, notification, and activation Who activates the EMT? How does the EMT get activated? Who decides to activate the CMT? How does the CMT get activated? How does the CMT decide to activate the plan? What happens if certain members of the CMT are unavailable?
29 Source: DRII Project initiation Functional requirements Design and development Implementation Maintenance and updating Testing and exercising Procedures Business continuity process Required availability times You are Here
30 Implementation phase Emergency response Command and control Designation of authority Scripts Vendors and resources
31 Implementation phase Designation of authority Who is in charge? –If they are not available, who is in charge? If they are not available, who is in charge? –If they are not available, who is in charge? Committees cannot be in charge!
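The repeated question on this slide describes an ordered succession list: walk down it until someone is reachable. A minimal sketch in Python; the role names are hypothetical placeholders, and a real plan would list named individuals with contact details:

```python
from typing import Iterable, Sequence


def person_in_charge(succession: Sequence[str], available: Iterable[str]) -> str:
    """Return the first person in the succession list who is available.

    Exactly one person is returned: a committee cannot be in charge.
    """
    reachable = set(available)
    for person in succession:
        if person in reachable:
            return person
    raise LookupError("no one in the succession list is available")


if __name__ == "__main__":
    # Hypothetical succession list, most senior first.
    succession = ["CMT leader", "deputy CMT leader",
                  "operations director", "site manager"]
    print(person_in_charge(succession, {"operations director", "site manager"}))
```

Raising an error when the list is exhausted is deliberate: a plan that can reach that state needs a longer list, and an exercise is the time to find that out.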
32 Implementation phase Scripts Step by step listing of activities to be performed every step of the way –In a disaster situation, people do not think rationally Scripts can be tested, tuned, and tested again –The person who follows a script does not need to be the person who developed the script Automate as much as possible –One company has 800 automated scripts just for recovering their database!
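A script in this sense is an ordered list of steps that anyone can run and that leaves a record of what happened. A minimal sketch of such a runner in Python; the step descriptions and the `run_script` name are hypothetical, and the lambdas stand in for real recovery actions:

```python
from typing import Callable, List, Tuple

# A step is a description plus an action that returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_script(steps: List[Step]) -> List[str]:
    """Run steps in order, log each result, and stop at the first failure."""
    log: List[str] = []
    for number, (description, action) in enumerate(steps, start=1):
        try:
            ok = action()
        except Exception as exc:  # a crashing step counts as a failed step
            ok = False
            description = f"{description} ({exc})"
        log.append(f"{number}. {'OK' if ok else 'FAILED'}: {description}")
        if not ok:
            break  # later steps may depend on this one
    return log


if __name__ == "__main__":
    # Hypothetical steps; real actions would call recovery tooling.
    script = [
        ("Verify backup site power", lambda: True),
        ("Restore database", lambda: False),
        ("Start applications", lambda: True),
    ]
    for line in run_script(script):
        print(line)
```

Because the steps are data rather than someone's memory, the same script can be exercised, timed, and tuned, and it can be run by a person who did not write it.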
33 Implementation phase Vendors and resources Hot site, warm site, cold site, off-site records storage Equipment replacement Rent-a-guard Salvage experts Catering Hotel rooms, rental cars Local authorities –Police, fire, hospitals, hazmat teams
34 Source: DRII Project initiation Functional requirements Design and development Implementation Maintenance and updating Testing and exercising Procedures Business continuity process Required availability times You are Here
35 Testing and exercise phase Training and awareness Exercise program objectives Exercise plans, scenarios and exercises Evaluation and modification
36 Testing and exercise phase Exercise program objectives Practice makes perfect – Some companies spend hundreds of hours tweaking parts of their plans to decrease recovery time Every second counts
37 Testing and exercise phase Evaluation and modification What went wrong and how do we fix it for next time? Do not find someone to blame. A fault found now could save your company later Were any of our assumptions wrong? Do we need to revisit a previous phase?
38 Source: DRII Project initiation Functional requirements Design and development Implementation Maintenance and updating Testing and exercising Procedures Business continuity process Required availability times You are Here
39 Maintenance and update phase Remember to budget for this phase. An untested, stale plan is worse than no plan at all! Review criteria – still current? Status, reporting, and audits Distribution and security –Your plan is a competitive asset
40 Execution phase If an event becomes a disaster –Decide –Declare –Notify –Execute
41 Not just an IT problem IT recovery is part of a complete contingency plan, like Cheerios are part of a complete breakfast… IT can recover computers and applications, not business processes The computers are humming, the applications are loaded… and no one is around to use them
43 Horror stories Your backup site is in Atlantic City. You declare during the Miss America pageant (Hurricane Andrew) Your computer room is in the basement and there's a fire in the building (Bell Canada) Will the generators be safe? Do you have a way to refuel them? (Tropical storm Allison)
44 Horror stories 1. You power up the generators and nothing happens 2. You power up the generators and the power surge blows out your systems 3. You power up the generators and realize that your air conditioning isn't on backup power Hint: Exercise your plan!
45 Food for thought Tapes Where is your tape backup hardware? Where are tapes stored until they go offsite? How quickly do your tapes go offsite? Are multiple tape copies sent via different routes? Do you do tape retrieval / restore tests? For recovery, do you ship tapes in waves?
46 Food for thought Replicated enterprise storage Vendors guarantee disk integrity –Backup disk = primary disk at a bit level Database integrity is not guaranteed Your database software needs to recover the database to a consistent state before you can begin processing on the backup system
47 Site Failure Physical disk does not equal logical database [Diagram: a source system and a target system, each with Disk 1, Disk 2, and an Audit Log Disk, showing log entries for transactions T1, T2, and T3 against data blocks D1 and D2. Callouts: "On disk, but not committed"; "Not flushed to disk but transaction committed and log flushed". A legend marks disk cache flushes.] Audit disk cache flushed at transaction commit for safety Database disk cache flushed infrequently for performance
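Why a bit-for-bit disk copy still needs database recovery can be shown with a toy log replay. The sketch below, in Python, is a simplified redo-then-undo pass written for illustration; it is not the recovery algorithm of any particular database product, and the record layout, keys, and values are hypothetical. It assumes the audit log reached the backup site intact and that uncommitted transactions do not overlap with others on the same item:

```python
from typing import Dict, List, Tuple

# Log records (hypothetical layout):
#   ("begin", txn), ("write", txn, key, old_value, new_value), ("commit", txn)
Record = Tuple


def recover(disk: Dict[str, int], log: List[Record]) -> Dict[str, int]:
    """Bring a replicated disk image to a transaction-consistent state."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state = dict(disk)
    # Redo: reapply every logged write in order, so committed changes that
    # never left the database cache are not lost.
    for rec in log:
        if rec[0] == "write":
            _, _txn, key, _old, new = rec
            state[key] = new
    # Undo: roll back writes of transactions that never committed, newest first.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _txn, key, old, _new = rec
            state[key] = old
    return state


if __name__ == "__main__":
    log = [
        ("begin", "T1"), ("write", "T1", "a", 0, 1), ("commit", "T1"),
        ("begin", "T2"), ("write", "T2", "b", 0, 2),          # never committed
        ("begin", "T3"), ("write", "T3", "c", 0, 3), ("commit", "T3"),
    ]
    # Disk image at failure: T1 flushed, T2's uncommitted write flushed,
    # T3 committed but still only in the database cache.
    disk = {"a": 1, "b": 2, "c": 0}
    print(recover(disk, log))
```

The disk image here is a perfect physical copy, yet it holds an uncommitted change and lacks a committed one; only after the log pass does it match what the application believes it committed. The time that pass takes counts against the recovery time objective.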
48 Food for thought Check your third party site contract –How many other companies in the same threat area use the same vendor? –How soon do you have to vacate? Where will you go? –Have you included workstations and space for them?
49 Remember that building? One year later, the tornado-scarred Bank One tower in Ft. Worth, Texas is still closed. 2000/03/ /02/10