Maintaining Business Continuity After Internal and External Incidents Greg Schaffer, CISSP Director of Network Services Middle Tennessee State University.

1 Maintaining Business Continuity After Internal and External Incidents Greg Schaffer, CISSP Director of Network Services Middle Tennessee State University

2 Copyright Greg Schaffer This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.

3 Our Story Begins Like Many… It was late in the afternoon one weekday when alarms suddenly sounded in the NOC. It was clear SOMETHING had happened, because connectivity was shattered across campus. Students could not access online classes, purchase orders could not be processed, would not go through… BUSINESS DISCONTINUITY

4 Troubleshooting the Problem
• It was relatively easy to pinpoint what wasn’t talking to what.
• The fact that many things were not talking to many other things indicated that more than one “thing” was affected.
• A check of the devices indicated the problem was not in the equipment but at the physical layer.
• It was clear that this was going to take SOME TIME to fix!

5 Location, Location, Location
• The physical layer problem was traced to the site of the new stadium construction.
• However, there were no initial indications of anything wrong.
• When asked, the construction workers said they had not been digging…

6 BUT
• …neglected to mention they had been pile driving rocks to prepare a trench for a new water line.
• The concrete-encased conduits were damaged by the equipment.
• The area was excavated to reveal what we hoped was minimal damage…

7 Minimal Damage?!

8

9 Getting Services Up
• The full extent of the physical damage wasn’t known until excavation was completed the next morning, but it was already clear that the conduits were too damaged to carry replacement fiber optics.
• There were redundant fiber cables between data centers that took different routes across campus…

10 Forming the Plan
• …except for one portion, which happened to be the pulverized area!
• A plan was needed to restore communications… fast.
• The plan:
  – Access manholes on either end of the damage and splice new fibers in the manholes
  – Run fibers temporarily on the road, and close the road to all traffic (planned anyway)
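The shared portion described on slides 9 and 10 is the kind of hidden single point of failure that a simple comparison of documented routes can surface. The following is a minimal sketch, assuming each fiber route is recorded as an ordered list of manholes and buildings; the node names (`DC-A`, `MH-1`, and so on) and the helper functions are hypothetical and do not come from the presentation.

```python
def segments(path):
    """Split a path (ordered list of nodes) into undirected segments."""
    return [frozenset(pair) for pair in zip(path, path[1:])]


def shared_segments(path_a, path_b):
    """Return the segments both paths traverse, in path_a order.

    Each shared segment is a point where a single physical event
    (such as one damaged conduit) could cut both routes at once.
    """
    in_b = set(segments(path_b))
    return [tuple(sorted(seg)) for seg in segments(path_a) if seg in in_b]


# Hypothetical route records, for illustration only.
primary = ["DC-A", "MH-1", "MH-2", "MH-3", "DC-B"]
secondary = ["DC-A", "MH-4", "MH-2", "MH-3", "MH-5", "DC-B"]

for a, b in shared_segments(primary, secondary):
    print(f"Both routes share the segment between {a} and {b}")
```

An empty result would suggest the two routes are physically diverse, at least as far as the route records are accurate.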

11 Finding Manhole Difficult

12 Eventually Circuits Back Up

13 But Almost Down Again!
• Graduation was that Saturday
• Road opened for visitors
• Temporary fibers had vehicles driving over them most of the day!
• Fibers held, but needless to say they would not be reused…

14 Post Mortem
• Eventually (nearly one month later) a manhole was constructed around the break, and new fibers were pulled through the repaired area and spliced
• Despite “normal” controls (“Tennessee One Call”, conduits encased in concrete, redundant fibers, etc.) “Bad Stuff” happened
• Bad Stuff = Good Lessons

15 Operations Security Controls
• Preventive
• Detective
• Corrective
• Directive
• Recovery
• Deterrent
• Compensating
CISSP CBK

16 Preventive/Detective
• Failed:
  – Tennessee One Call (dirt covered markings)
  – Hardened Physical Paths
• Worked (but after the fact):
  – Network monitoring
  – Help desk reporting
  – Documentation

17 And Keep Manhole Uncovered!

18 Corrective/Directive
• Worked:
  – Emergency Web Communications
  – Temporary fiber construction (temporary corrective control for Business/Mission Continuity)
  – Shovel
• Failed:
  – Blocking car and truck traffic

19 Recovery
• More of a longer-term approach to prevent a recurrence
• Redundant fiber between data centers
• Must also consider separate building entrances
• Analysis of cost of solution vs. cost of downtime
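The cost-of-solution vs. cost-of-downtime comparison on slide 19 can be sketched as a small calculation. This is one simple way to frame it, not the author's method: it compares an annualized solution cost against an expected annual downtime loss. Every figure below is a hypothetical placeholder, and the function names are assumptions made for illustration.

```python
def expected_annual_downtime_cost(incidents_per_year, hours_per_incident, cost_per_hour):
    """Expected yearly loss from outages of this kind."""
    return incidents_per_year * hours_per_incident * cost_per_hour


def annualized_solution_cost(capital_cost, service_life_years, annual_upkeep=0.0):
    """Spread the capital cost evenly over the service life, plus upkeep."""
    return capital_cost / service_life_years + annual_upkeep


def solution_pays_off(solution_cost_per_year, downtime_cost_per_year):
    """True when the solution costs less per year than the downtime it avoids."""
    return solution_cost_per_year < downtime_cost_per_year


# Hypothetical inputs, for illustration only.
downtime = expected_annual_downtime_cost(
    incidents_per_year=0.5,   # assumed: one such outage every two years
    hours_per_incident=10,    # assumed outage length in hours
    cost_per_hour=5000,       # assumed cost of lost service per hour
)
solution = annualized_solution_cost(
    capital_cost=200000,      # assumed cost of a fully diverse fiber path
    service_life_years=20,    # assumed useful life of the new path
    annual_upkeep=2000,       # assumed yearly maintenance
)

print(f"Downtime: {downtime:.0f}/yr, solution: {solution:.0f}/yr, "
      f"pays off: {solution_pays_off(solution, downtime)}")
```

A sketch like this assumes the solution removes the downtime entirely. A fuller analysis would also account for residual risk and for costs that are hard to express per hour.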

20

21 Deterrent/Compensating
• Worked:
  – Penalty/Insurance
  – Temporary fiber run
  – Cutting of ducts
  – Creation of new manhole

22 Finally
It ended up being a late night, hampered by many events. Our DR/BC plan did not specifically address this problem... NOR SHOULD IT HAVE. A good DR/BC plan is flexible and adaptive. The necessary resources were mobilized quickly based on existing DR/BC plans. What could have been a very large disaster goes down as an outage that lasted 10 hours.

