6 The History of Business Continuity
Timeline markers shown on the slide: 11/9/2001 and mid 2007.
Evolution of planning approaches:
- Alternative Planning / Plan B: fallback plans, contingency plans
- Disaster Recovery Planning (DRP): IT or technical contingency plans
- Business Continuity Planning (BCP): organization-wide contingency plans
- Business Continuity Management System: holistic contingency plans
Definitions:
- DRP: The advance planning and preparations that are necessary to minimize loss and ensure continuity of the critical business functions of an organization in the event of a disaster. It is the technological aspect of business continuity planning.
- BCP: The process of developing advance arrangements and procedures that enable an organization to respond to an event in such a manner that critical business functions continue with planned levels of interruption or essential change.
8 What is a BCM System?
A Business Continuity Management System:
- Is a living system: managed every day and updated whenever the situation changes
- Is holistic
- Requires a strategy/policy
- Restores critical business processes end to end (the focus is not asset recovery only)
- Covers communication with clients, employees and emergency organisations
- Integrates with the business processes of other Business Units (BUs)
- Addresses employee safety & wellbeing
9 Describing BCM Risk
Risk has four key components:
- Threats: fire, earthquake, power failure, loss of key staff etc.
- Assets: human, mission-critical systems and infrastructure, suppliers, clients, buildings, information and records
- Vulnerabilities: weaknesses in assets such as a single point of failure, inadequate fire protection, poor staffing levels, unreliable IT security, inefficient data backup etc.
- Impact: financial or non-financial (e.g. reputational, health & safety impacts)
Risk description (example): “Significant financial and reputational impact on the Company because the IT Manager is unable to restore, on time, the Billing Server that was damaged by a power surge.”
11 BCM Risks
- BCM plans are mitigation controls (i.e. they minimise consequences) rather than prevention controls.
- BCM risks are managed through Quantate as part of TPL’s Risk Management Program.
12 BCM Assumptions
- A secondary or alternative site is always available to the company to restore business-critical processes if the primary site is not accessible due to damage sustained in a disaster.
- Minimum resources (e.g. staff, IT equipment etc.) are still available to restore business-critical functions following a disaster.
- Popua power station is unharmed and the distribution network assets have sustained only minor damage.
- National communication networks (i.e. TCC or Digicel) are available to communicate with the public.
- BCM plans do not apply to nationwide disasters (e.g. a tsunami) that may have damaged all of the company’s primary and secondary sites and its generation and distribution assets.
13 BS 25999 Standard
The standard requires:
- A policy statement
- A focus on restoring end-to-end processes (not just equipment or machinery)
- RTO/RPO analysis for all critical processes
- Risk assessment and mitigation
- A hierarchy of plans (IMP, BURP, ERP etc.)
- A command centre and DR sites
- A call tree
- Periodic testing, maintenance and auditing of BCM plans
14 Overview of BS 25999:2007
Plan-Do-Check-Act cycle:
PLAN - Planning
- BCM policy
- Steering committee
- Structure, roles & responsibilities
- BCM scope & objectives
- Top management commitment
DO - Implementation & Operation
- Identify BC threats
- Business impact analysis
- Process criticality analysis
- Gap analysis
- Risk assessment
- Recovery strategies & scenarios
- Generating BCM plans / BCM manual
CHECK - Testing & Auditing BCM Plans
- Periodic tests: table-top exercises, simulations, live exercises
- Periodic audits
- Management review
ACT - Maintaining and Improving
- Ongoing training
- Update plans whenever there is a major change to the company processes/structure
- Embedding BCM in the culture
17 BURP vs. DR Plan
- A BURP is a holistic plan used to restore an entire business unit: recovering its key processes, deciding which employees move to the DR site, activating the call tree, coordinating with other business units, and resuming business as normal.
- A DR Plan is a technical plan designed to restore equipment and machinery to normal operation (e.g. the IT DR Plan).
- Companies use both names interchangeably.
18 Managing Incidents Within the Capability of TPL Plans
1. Identify the incident
2. Assess damage and identify lost critical processes
3. Declare a disaster if critical operations cannot be restored within the primary premises
4. Shut down the power supply for the safety of the public
5. Activate BURPs/DRPs and move to the DR site to restore operations at a minimum acceptable level
6. Restore the damaged distribution assets and network
7. Restore power generation
8. Resume business as usual at the DR site
9. Monitoring
10. Review & documentation
20 RTO & RPO
- The Recovery Time Objective (RTO) is the goal for how quickly TPL needs to recover the interrupted processes.
- The Recovery Point Objective (RPO) is the point in time to which data must be restored to successfully resume the interrupted processes (often thought of as the time between the last backup and the moment the interruption occurred).
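The two objectives can be checked against the timestamps of an actual or exercised incident. The following is a minimal Python sketch, not a TPL procedure: the function name `meets_objectives` and the example timestamps are hypothetical, and RPO is treated as the maximum tolerated gap between the last backup and the interruption, as the slide suggests.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, interruption, restored, rpo, rto):
    """Compare one incident against its RPO and RTO (illustrative helper).

    Data loss window = interruption - last_backup (compared with the RPO).
    Recovery time    = restored - interruption    (compared with the RTO).
    """
    data_loss_window = interruption - last_backup
    recovery_time = restored - interruption
    return {
        "data_loss_window": data_loss_window,
        "recovery_time": recovery_time,
        "rpo_met": data_loss_window <= rpo,
        "rto_met": recovery_time <= rto,
    }


# Hypothetical example: backup at midnight, failure at 20:00,
# service restored at 10:00 the next day.
example = meets_objectives(
    last_backup=datetime(2024, 1, 1, 0, 0),
    interruption=datetime(2024, 1, 1, 20, 0),
    restored=datetime(2024, 1, 2, 10, 0),
    rpo=timedelta(days=1),
    rto=timedelta(hours=12),
)
```

In this example the data loss window is 20 hours (inside a one-day RPO), while the 14-hour recovery exceeds a 12-hour RTO, so the process would be flagged as a risk.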
21 Business Impact Analysis
- BIA identifies the impacts (financial & non-financial) of disasters that cause company-critical processes to malfunction.
- The goal of BIA is to identify, categorise and prioritise the mission-critical processes and the resources (e.g. technology, infrastructure, vital records, personnel, suppliers) these processes need in order to function within the company.
- Examples of such processes are customer service & support, order & data processing, payroll, IT & communication, and purchasing and production.
- BIA also identifies interdependencies (telephones, IT facilities, office space etc.) between different business units within the company.
- The priority of each critical process for subsequent resumption is based on its RTO and RPO.
22 BIA - Identifying Critical Processes
- Administer the BIA Questionnaire with each BU Manager (template supplied), then collect, review and analyse the data.
- Consider the worst-case scenario: a total loss of personnel, facilities and property.
- Does the disaster affect the critical processes?
- Determine the impacts if a process is lost due to the disaster:
  - Identify and quantify financial impacts (recovery costs, production losses & revenue losses)
  - Identify non-financial impacts:
    - Impact on staff or public wellbeing
    - Impact of damage to premises/assets/records
    - Impact of breaches of statutory regulations
    - Damage to reputation
    - Deterioration of product/service quality
    - Environmental damage
23 BIA - Identify Critical Processes
- Determine the minimum resources required for a minimum acceptable recovery of each process, manually or using alternative processes:
  - Staff: skills and knowledge
  - Secondary premises
  - Plant, equipment, software, data/records
  - External service providers
- Determine RTOs and RPOs for each process. Note that the shorter the RTO, the greater the cost of recovery, while a longer RTO increases the chance that recovery will not be achieved within the MTO.
- Determine whether the company can currently achieve the RTO/RPO objectives. If not, flag them as risks to be analysed at a later stage.
- Rank the processes by impact to determine the recovery priority of critical processes.
- Example: results of an RTO/RPO analysis are supplied.
- Conduct a process dependency analysis with other business units.
24 BIA - Process Dependency Analysis
[Diagram: processes A, B, C, D and E spread across BU 1, BU 2 and BU 3, with RTO labels of 2 hrs, 7 hrs, 3 hrs, 5 hrs and 1 hr]
- Consider the entire chain of processes to be recovered together
- Adjust RTOs if necessary
- Define the Maximum Tolerable Outage (MTO)
- BCM plans are designed to recover company-critical processes within the MTO
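One way to read "adjust RTOs if necessary" is that a process cannot be recovered before the processes it depends on, so each prerequisite needs an RTO no longer than that of its dependants. The Python sketch below works under that assumption. The function name `adjust_rtos`, the dependency links, and the assignment of RTO values to processes A to E are all hypothetical, since the slide's diagram does not survive in the text.

```python
def adjust_rtos(rto_hours, depends_on):
    """Tighten prerequisite RTOs so no process waits on a slower dependency.

    rto_hours:  {process: RTO in hours}
    depends_on: {process: [processes it needs recovered first]}
    Returns a new dict; the input is left unchanged.
    """
    adjusted = dict(rto_hours)
    changed = True
    while changed:  # repeat until a full pass makes no change
        changed = False
        for process, prerequisites in depends_on.items():
            for prereq in prerequisites:
                if adjusted[prereq] > adjusted[process]:
                    adjusted[prereq] = adjusted[process]
                    changed = True
    return adjusted


# Hypothetical chain: E needs D, D needs B, and A needs C.
original = {"A": 2, "B": 7, "C": 3, "D": 5, "E": 1}
links = {"E": ["D"], "D": ["B"], "A": ["C"]}
adjusted = adjust_rtos(original, links)
```

With these illustrative inputs, B and D are tightened to 1 hour because E depends on them, and C is tightened to 2 hours because A depends on it. Every adjusted value can then be compared with the MTO.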
25 BIA - Disaster Declaration
Scenario 1: Estimated Recovery Time < MTO
- The primary site is intact (example: malfunction of IT servers)
- Execute the IT DR Plan for all IT issues
Scenario 2: Estimated Recovery Time <= MTO
- The primary site is inaccessible but the DR sites are operational, key staff are available and national communication systems are functional (example: cyclone)
- Minimal and acceptable reputational/financial losses
- Execute all BCM plans, including the IMP, BURPs and IT DR Plan
- Resume company-critical processes from the DR sites at a minimum acceptable level
Scenario 3: Estimated Recovery Time > MTO
- All primary and DR sites are not operational, key staff are unavailable and the national communication system has malfunctioned (example: tsunami)
- Requires a good communication plan (e.g. radio communication)
- Reputational/financial losses are inevitable
- Execute BCM plans with a delay
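The three scenarios can be expressed as a small decision function. This is a rough Python sketch under simplifying assumptions: it looks only at the estimated recovery time, the state of the primary site and the state of the DR sites, and it ignores staff availability and communications. The name `declaration_scenario` is hypothetical, and combinations the slide does not describe return `None` rather than a guessed scenario.

```python
def declaration_scenario(est_hours, mto_hours, primary_site_intact,
                         dr_sites_operational):
    """Return 1, 2 or 3 per the slide's scenarios, or None if not covered."""
    if est_hours > mto_hours:
        return 3  # recovery cannot be achieved within the MTO
    if primary_site_intact and est_hours < mto_hours:
        return 1  # handled at the primary site, e.g. via the IT DR Plan
    if not primary_site_intact and dr_sites_operational:
        return 2  # relocate to the DR sites and execute all BCM plans
    return None   # combination not described on the slide


MTO_HOURS = 48  # TPL MTO of 2 days, expressed in hours

server_fault = declaration_scenario(3, MTO_HOURS, True, True)
cyclone = declaration_scenario(40, MTO_HOURS, False, True)
tsunami = declaration_scenario(96, MTO_HOURS, False, False)
```

Returning `None` for uncovered cases keeps the gap visible so that the IMT decides, instead of the code silently choosing a scenario.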
27 Disaster Declaring Criteria
TPL MTO = 2 days
- Level 1: Minor Incident. Critical business operations are affected, and distribution network recovery or IT problems are expected to be resolved within 3 hours. The incident is resolved on a ‘business as usual’ basis and no disaster is declared.
- Level 2: Major Incident. Major disruption to critical business operations, with the system outage expected to last more than 1 day but not more than two business days. The incident is escalated to a semi-disaster, which is declared by the members of the IMT. At this level, only the Generation and Distribution DR Plans are invoked, as applicable.
- Level 3: Critical Incident. Critical disruption to business operations, with the system outage expected to last more than two business days. The incident is escalated and a disaster is declared by the members of the IMT. At this level, the BURPs/DRPs of all critical Business Units (IT, Finance, Generation and Distribution) are invoked.
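The level thresholds above can be sketched as a lookup on the expected outage. This is a minimal Python illustration, not the IMT's actual decision rule: the function name `incident_level` is hypothetical, business days are simplified to calendar days, and only the outage duration is considered. The slide does not assign a level to outages longer than 3 hours but no longer than 1 day, so the sketch returns `None` for that range.

```python
from datetime import timedelta


def incident_level(expected_outage):
    """Map an expected outage duration to incident level 1, 2 or 3."""
    if expected_outage <= timedelta(hours=3):
        return 1  # minor incident: business as usual, no disaster declared
    if timedelta(days=1) < expected_outage <= timedelta(days=2):
        return 2  # major incident: semi-disaster declared by the IMT
    if expected_outage > timedelta(days=2):
        return 3  # critical incident: disaster declared, all BURPs/DRPs
    return None   # more than 3 hours, up to 1 day: not defined on the slide


levels = [
    incident_level(timedelta(hours=2)),
    incident_level(timedelta(hours=36)),
    incident_level(timedelta(days=3)),
    incident_level(timedelta(hours=12)),
]
```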
29 Command Centre Options
Command Centre activities:
- Declaring the disaster based on the damage assessment
- Establishing 24-hour communication channels
- Activating, coordinating and monitoring BURPs/DRPs
- Monitoring and acting on staff health & safety
- Media relations
- Making all key decisions for successful restoration
Possible locations for the Command Centre:
- Head Office (if available)
- Distribution Office (if available)
- Scenic Hotel, Digicel Network Operating Centre
31 DR Sites
[Table: Incident Site or Network against DR Site or Network. Office sites named: Head Office, Tongatapu; Distribution Office, Tongatapu; Popua Generation Plant, Tongatapu; Office, Haapai; Distribution Office, Haapai; Office, Vavau; Distribution Office, Vavau]
Networks and generation plants without a DR equivalent:
- Distribution Network, Tongatapu: There is no DR network for TPL in Tongatapu. If the network is severely damaged by a disaster, TPL technical staff (with assistance from external emergency teams) have to restore the network as quickly as possible.
- Generation Plant, Tongatapu: There is no DR generation plant for TPL in Tongatapu. If the existing generation facility is severely damaged by a disaster, TPL technical staff (with assistance from external emergency teams) have to restore the generation facility as quickly as possible.
- Distribution Network / Generation Plant, Haapai: There is neither a DR generation plant nor a DR network for TPL in Haapai. If the existing generation facility or distribution network is severely damaged by a disaster, TPL technical staff have to restore it as quickly as possible, working day and night.
- Distribution Network / Generation Plant, Vavau: There is neither a DR generation plant nor a DR network for TPL in Vavau. If the existing generation facility or distribution network is severely damaged by a disaster, TPL technical staff have to restore it as quickly as possible, working day and night.
- Distribution Network / Generation Plant, Eua: There is neither a DR generation plant nor a DR network for TPL in Eua. If the existing generation facility or distribution network is severely damaged by a disaster, TPL technical staff have to restore it as quickly as possible, working day and night.
32 Internal Communication - Call Tree
Call tree levels:
- Incident Management Team
- Business Unit Managers
- Field Supervisors & Foremen; Office Supervisors
- Linesmen; Staff
Information to pass on with each call:
- Brief description of the disaster, loss of life or injuries, damage summary, response and recovery details
- Location of the DR or Network Site to report to, or an instruction to remain at home on standby
- Phone number of the DR Site/Network Supervisor
- Immediate actions to be taken
- Location and time the team should meet at the DR or Network Site
- An instruction that everyone notified must not make any statements to the media or on social media
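A call tree can be modelled as a mapping from each caller to the people or groups they notify, then walked level by level. The Python sketch below is illustrative only. The pairing of Field Supervisors & Foremen with Linesmen and of Office Supervisors with Staff is an assumption about the slide's diagram, and the function name `call_order` is hypothetical.

```python
from collections import deque

# Assumed structure: who calls whom (the pairings are not confirmed by the slide).
CALL_TREE = {
    "Incident Management Team": ["Business Unit Managers"],
    "Business Unit Managers": ["Field Supervisors & Foremen",
                               "Office Supervisors"],
    "Field Supervisors & Foremen": ["Linesmen"],
    "Office Supervisors": ["Staff"],
}


def call_order(tree, root):
    """Return (caller, callee) pairs in breadth-first (level-by-level) order."""
    order = []
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in tree.get(caller, []):
            order.append((caller, callee))
            queue.append(callee)
    return order


calls = call_order(CALL_TREE, "Incident Management Team")
```

Walking the tree breadth-first means each level is fully notified before the next one starts. Comparing the resulting list with who was actually reached during a call-tree activation trial shows where communication broke down.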
33 External Communication
Internal roles shown on the slide:
- IMT
- Distribution Manager, Finance Manager, Generation Manager, Planning & Design Manager, IT Supervisor
Plans referenced: Business Unit Restoration Plans, Business Unit DR Plans
External parties:
- Vendors & suppliers
- Police, Fire & Hospital
- National Emergency Management Office (Ministry of Works)
- Tonga Met Service
- Radio Tonga
- Tonga Defense Service
- Government organisations
- Embassies & High Commissions
34 Media Relations
Only the authorised spokespeople listed below may comment to the media:
- John van Brink (CEO): English media
- Steven Esau (Finance Manager): Tongan media
All other staff should know that, if contacted by the media, they are not to comment. Their standard reply should be: “I’m sorry I can’t help you, I am not the appropriate person to speak to. If you provide me your name and contact number, I’ll get the right person to get in touch with you shortly”.
Jane Guttenbeil and Nau Lavemai coordinate media relations.
36 Head Office BURP
Minimum Systems Required at DR Site
| Priority Rank | Process to Recover | Minimum Systems Required | Can TPL Provide Min. Systems Required? | RPO | RTO | Minimum Staff Required |
|---|---|---|---|---|---|---|
| 1 | Customer call management | Voice calls & call redirection to mobile phones | Yes (refer IT DR Plan) | NA | 1 day | Dedicated staff |
| 2 | Meter reading & billing | Computers with Orien, printer, mobile phones | | | 2 days | Billing Supervisor, meter readers |
| 3 | Revenue management | Computers with Orien, mobile phones | | | | Credit supervisor and 2 cashiers |
Minimum Staff Required at DR Site
| Division | Manager/Supervisor | Team Members |
|---|---|---|
| IT | Lualala | Pei, Sonia |
| Billing | Ofa | Makueta |
| Revenue | Heta | Ovava, Grace |
37 Distribution Office BURP
Minimum Systems Required at DR Site
| Priority Rank | Process to Recover | Minimum Systems Required | Can TPL Provide Min. Systems Required? | RPO | RTO | Minimum Staff Required |
|---|---|---|---|---|---|---|
| 1 | Customer faults call management | Call redirection to mobile phones, computer with File Maker software | Yes (refer IT DR Plan) | NA | 1 day | 2 Call Centre staff |
| 2 | GIS database management | Computers with GIS database, printer, mobile phones | | | 2 days | 1 Planning staff |
Minimum Staff Required at DR Site
| Division | Manager/Supervisor | Team Members |
|---|---|---|
| Faults | Malia Hehea | |
| GIS Database | Ian | Semi |
38 Generation DR Plan
| Priority Rank | Process to Recover | Minimum Systems Required | Can TPL Provide Min. Systems Required? | RPO | RTO | Minimum Staff Required |
|---|---|---|---|---|---|---|
| Option 1 | Power generation to critical organizations | 3 MW backup generator (TPU) and 500 kW generators for outer islands, 2 weeks' fuel supply | No, but will have to restore damaged generators as quickly as possible | NA | 4 days | Generation Manager and all generation technical staff |
| Option 2 | | If the power station is unrecoverable, generators will be hired from Aggreko NZ | | | 1 week | CEO, Power Generation Manager |
39 Distribution Network DR Plan
| Priority Rank | Process to Recover | Minimum Systems Required | Can TPL Provide Min. Systems Required? | RPO | RTO | Minimum Staff Required |
|---|---|---|---|---|---|---|
| 1 | Distribution network | Network equipment (poles, transformers, insulators etc.) | Yes, Transnet stores have an adequate standby supply for disasters | NA | Urban areas: 2 days; villages: 1 week | Distribution Manager, supervisors and 100% of field staff |
| 2 | Field vehicles, PPE, VHF comms | Assume 50% damaged | Yes, there are VHF systems for 50% of field vehicles | | | 100% of field staff |
Notes:
- H/O staff are expected to support distribution staff with cooking etc.
- Meter readers are expected to support linesmen in the field.
45 Emergency Response Plans
ERPs contain specific emergency procedures that must be followed during a disaster in order to protect people and assets and to mitigate further damage. For example:
- Building evacuation
- Damage assessment
- Spillage (oil/chemical/diesel fuel) at the power station
- Fuel supply cut-off
- Civil disturbance
- Hurricane/storm
- Records recovery
- Bomb threat
- Earthquakes
47 Testing Business Continuity Plans
- Developing BCM plans does not end the BCM process; the emphasis of BCM is on management.
- Without regular maintenance and testing, the plans’ usefulness in a real crisis may be severely limited.
- Testing exercises the company’s ability to recover from an incident.
- Test the scenarios identified under the Scenario Planning section (refer above).
- Tests validate the effectiveness and timeliness (RTO and RPO objectives) of the restoration of critical activities.
- Tests determine the adequacy of SLAs (service level agreements) with third-party suppliers.
- Testing identifies communication breakdowns during call-tree activation trials.
- Tests are essential to developing the teamwork, competence, confidence and knowledge that are vital at the time of an incident.
- Testing can be annual, bi-annual etc., and announced or unannounced.
48 Types of Tests
- Table-top checks: the simplest and most frequent form of test. The author of the plan checks the contents of the plan to ensure that the information (e.g. employee names, contact numbers) is up to date.
- Walk-through: similar to a table-top exercise but involves all named participants in testing a specific disaster scenario. The participants are brought together to role-play their defined resumption procedures alongside those of others. This is the most common method of testing BCM plans as it is relatively inexpensive.
- Simulation: widens participation, with prior notice, to all those involved in business recovery. A simulation may include an interruption such as a building fire, in which people have no access to normal facilities and must relocate to an alternative location (i.e. the DR site).
- Full or live exercises: the most extensive and expensive form of test, so they are normally undertaken on a yearly or bi-yearly basis. This is the largest scale of test and involves the invocation of all BCM plans (i.e. IMP, BURPs & ERPs) to deal with a scenario that normally involves a move to an alternative site where operations are to be resumed. The dependencies and links between BURPs are the focal point of this type of testing.
49 Post Test Evaluations
The post-test evaluation should consider the following issues:
- Did the plans help or hinder recovery efforts?
- Did people deviate from the plan and, if so, what was the effect of this?
- Were RTO & RPO objectives achieved?
- Where and when did delays occur?
- What did staff do well?
- What did staff do badly?
- How did the expectations differ from what actually happened?
- Were all BURPs integrated sufficiently to achieve recovery?
- What are the priorities for change?
- Is there a paper or audit trail?
- How should changes be implemented?
- Could the observation process be improved?
Follow-up actions:
- Identify and document all deficiencies, lessons learned etc.
- Update BCM plans if required, based on test outcomes
50 BCM Maintenance & Auditing
- Ongoing BCMS audits should be conducted to evaluate the entire BCM portfolio of plans and identify gaps and inconsistencies.
- If testing has shown that plans failed to meet the recovery objectives, a fundamental review of the plans may be required.
- In addition, audits should be conducted after a company restructure resulting from a merger or acquisition, after the installation of new systems & facilities (e.g. IT) etc.
- The auditor will issue corrective and preventive actions focused on continuous improvement.
- Ensure the BCMS is current and up to date at all times (i.e. BCM plans are living documents).
- Provide ongoing training on the entire BCM process (e.g. BIA, risk management etc.).
- Communicate the BCM initiatives to all employees through newsletters, induction programmes etc.
- Cultivate a BCM culture.
51 Interruptions Outside the Capabilities of BCM Plans
52 Managing Disasters Outside the Capability of BCM Plans
BCM plans are usually prepared to mitigate impacts and resume business operations after manageable interruptions such as fire, flood, supply chain interruptions, IT failure etc.
However, there are some disasters (e.g. a tsunami) in which the company’s current BCM plans may simply be ineffective, for the following reasons:
- Business may be interrupted for a while
- RTO and RPO objectives may not be achievable (Estimated Recovery Time > predefined MTO)
- All DR sites may have been damaged
- Possible loss of key staff
- Severe damage to the company’s mission-critical assets (e.g. the distribution network)
- All communication networks (e.g. TCC, Digicel) have malfunctioned
- Severe damage to the power station
53 Managing Disasters Outside the Capability of BCM Plans
In the case of such a disaster, a possible action plan would be:
- Shut down power generation for the safety of the public
- Monitor and support staff health, safety and wellbeing
- Coordinate with NEMO on possible recovery efforts (e.g. assistance from TDS to clear fallen trees etc.)
- Broadcast to the public that power will be interrupted for a while
- Restore the distribution network, with some delays (network equipment might have to be imported from overseas)
- Establish an alternative fuel supply (if the fuel tank has been damaged by the tsunami)
- If the power station cannot be recovered, import generators from Aggreko Ltd., NZ
- Establish a new DR site to resume key processes (e.g. billing/revenue) at a minimum acceptable level
- Implement the current BCM plans with a delay