Presentation on theme: "CANHEIT 2012 Building the Digital University. 2 DRP is like Laundry Nancy Wellard Western University Canada"— Presentation transcript:
CANHEIT 2012 Building the Digital University
2 DRP is like Laundry Nancy Wellard Western University Canada email@example.com
3 Who is Nancy Wellard? Employed at Western University since 1985 Information Technology Services Technical Support Manager Responsible for – Server teams supporting Unix, Linux and Windows servers – Computer Accounts Office
4 I am not an expert on DRP!! My intent is to convey some of the experiences we have had getting to where we are now! We started our DRP planning in 2004. There have been definite epiphany moments. There are lots of opportunities to improve our current model.
5 Key Points or Ah-ha Moments 1. Service catalog vision 2. It's a never-ending story 3. It's all about consistency 4. Dependencies exist 5. Perfection is difficult to achieve
6 Process Repetitive Process is the foundation. Loose definition of a project: – An activity with a beginning and an end that facilitates a defined outcome. Statements about the IT sector: – The only constant is change. – We don't support computers… We facilitate upgrades… This implies that information captured from this DRP process feeds a living document. There is no end to the process, therefore DRP is not a project – It is a journey!
7 Meetings Weekly 1 hr meetings – Attendees Core Team Invited Subject Matter Experts (SME) Core team – Comprised of 7 individuals whose roles represent a cross section of the Technical Services team Network Operations, Windows and Unix Support Teams Phone System Client Services (admin for hire) Storage and Operations Scribe and Coder SME – One or more individuals who are the resident experts on the topic being discussed.
8 Meetings (cont'd) 1 meeting per month utilized for strategy – Attended by core group – Discuss improvements to the process – Review topics that are not specific to a defined service – Schedule service topics for the next month Remaining meetings in the month are specific to one or more service reviews
9 Standardization It's all about consistency Glossary consensus is very important. Decision points must hinge on consistent evaluation. Information is captured into a standardized template that ensures common questions are processed.
10 Danger Point Repetition develops boredom Boredom leads to complacency Persistence is important – Lock the meeting into calendars – Use humor to break down the drabness of having to repetitively discuss a negative perspective on the environment – Avoid scope creep
11 Data Capture We use a Wiki (Atlassian Confluence) as the User Interface Template Document Captures: – Dependencies – Current Environment – Recovery Steps and Concerns for 2 Scenarios – Red/yellow/green (green is good)
13 Scenarios Currently working 2 scenarios Point: Use of the word "current" continues to validate that it's a living document! We have 2 datacenters on our campus – Support Services Building (SSB) Newest and considered Primary – Stevenson Hall (StvH) Dates back to the Sixties and considered Secondary
14 Automation We initially started using only the wiki to track the information gathered. After some time we recognized that sophisticated reporting required a database function. Standardizing on a wiki template enabled us to associate our info with an SQL schema. Store data captured via the UI (wiki) into a Microsoft SQL database – Daily process scrapes template data and feeds the SQL tables (Perl)
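The scrape-and-load step above could be sketched roughly as follows. This is only an illustration in Python, with the standard-library sqlite3 module standing in for the Perl script and the Microsoft SQL database named on the slide. The template labels ("Service", "Depends on", "Scenario 1", "Scenario 2"), the table names and the sample page are all hypothetical; the actual Confluence template and schema are not shown in the presentation.

```python
import re
import sqlite3

# Hypothetical plain-text rendering of one wiki template page.
PAGE = """\
Service: DNS
Depends on: Core Network, Power
Scenario 1: green
Scenario 2: yellow
"""

def parse_template(text):
    """Turn 'Label: value' lines into a dict keyed by label."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^([^:]+):\s*(.*)$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields

def load_page(conn, text):
    """Store one page's dependencies and scenario ratings (assumed schema)."""
    fields = parse_template(text)
    service = fields["Service"]
    deps = [d.strip() for d in fields.get("Depends on", "").split(",") if d.strip()]
    conn.execute("CREATE TABLE IF NOT EXISTS dependency (service TEXT, parent TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS status (service TEXT, scenario TEXT, colour TEXT)")
    # Replace the rows from the previous run so a daily rerun does not duplicate data.
    conn.execute("DELETE FROM dependency WHERE service = ?", (service,))
    conn.execute("DELETE FROM status WHERE service = ?", (service,))
    conn.executemany("INSERT INTO dependency VALUES (?, ?)", [(service, d) for d in deps])
    for label, value in fields.items():
        if label.startswith("Scenario "):
            conn.execute("INSERT INTO status VALUES (?, ?, ?)", (service, label, value))
    conn.commit()

conn = sqlite3.connect(":memory:")
load_page(conn, PAGE)
load_page(conn, PAGE)  # a second run leaves the same rows in place
```

Because every page follows the same template, one parser covers all services, which is the benefit of standardizing described above.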
15 Services Epiphany – It's all about the Service – When we began… The general thought was that DRP is about ensuring server hardware is functioning. – Post Epiphany Servers are infrastructure required to support services. The product important to our customers is the service. DRP is about ensuring services are appropriately protected and that recovery steps are documented, reviewed and tested. It's about the service
16 Service Catalog 122 Interdependent Services. Examples…
* Active Directory
* Antivirus
* Application Managers
* Cable Plant
* Calendar Service
* CallTracker
* CANIT
* Central Print Services
* Central Proxy
* CHUBB - Campus
* Cluster
* Communications Closet Monitoring - Environmentals
* Contact Center
* Core Network
* Databases
  o Microsoft SQL
    + SQL
  o MySQL
  o Oracle Databases
    + Oracle DataGuard Protected Databases
    + Oracle Development Databases
    + Oracle Non-DataGuard Protected Databases
    + RMAN
* Data Network
  o Backup Storage (Tape)
  o Backup Storage (VTL and SIR)
  o Biotron X4500 Storage
  o Equallogic San
  o Fibre Channel Fabric
  o IPSAN Network
  o ITS X4500 Storage
  o NetApp
  o WISG X4500 Storage
* DHCP
* Dial
* DNS
* E-mail
* External Carrier
* External Connectivity
  o External Intrusion Detection and Prevention (IDP)
  o External Routing
  o External Switching
17 Dependencies Services are recognized as being interdependent. When a service topic is reviewed it is assumed that the parent services (defined dependencies) are out of scope of the discussion. This perspective provides context to the individual service discussion. Power is at the top of the pyramid For example: – DNS -> Core Network -> Power
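As a small illustration of the parent-service idea, the sketch below walks a dependency map and lists everything a service relies on, directly or indirectly. Those are the services treated as out of scope when that service is reviewed. The `PARENTS` mapping and the function name are assumptions for the example; only the DNS -> Core Network -> Power chain comes from the slide.

```python
def ancestors(service, parents):
    """Return every service that `service` depends on, directly or indirectly."""
    seen = []
    stack = list(parents.get(service, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            # Follow the chain upward through the parent's own parents.
            stack.extend(parents.get(current, []))
    return seen

# The one chain given on the slide: DNS -> Core Network -> Power.
PARENTS = {"DNS": ["Core Network"], "Core Network": ["Power"], "Power": []}

print(ancestors("DNS", PARENTS))  # ['Core Network', 'Power']
```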
18 Dependency Visualization Weekly process which extracts data from SQL database. We run a link normalization process to eliminate redundant dependencies. We use Graphviz to generate a hierarchical view of the dependency tree.
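The slide does not define "link normalization". One plausible reading, assumed here, is a transitive reduction: a direct dependency is dropped when the same parent is already reachable through another dependency. The Python sketch below does that on a small hand-written edge map and then emits Graphviz DOT text. The function names and sample data are hypothetical, and the graph is assumed to contain no cycles.

```python
def reachable(start, edges, skip_edge):
    """Services reachable from `start` without using the edge `skip_edge`."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for parent in edges.get(node, ()):
            if (node, parent) == skip_edge or parent in seen:
                continue
            seen.add(parent)
            stack.append(parent)
    return seen

def normalize(edges):
    """Drop a direct dependency when the parent is still reachable another way."""
    reduced = {svc: list(parents) for svc, parents in edges.items()}
    for svc in edges:
        for parent in edges[svc]:
            if parent in reachable(svc, reduced, (svc, parent)):
                reduced[svc].remove(parent)
    return reduced

def to_dot(edges):
    """Render the dependency map as DOT text for Graphviz."""
    # rankdir=BT draws each parent above its dependants, so Power ends up on top.
    lines = ["digraph dependencies {", "  rankdir=BT;"]
    for svc in sorted(edges):
        for parent in sorted(edges[svc]):
            lines.append(f'  "{svc}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)

# DNS lists Power directly, but Power is already implied via Core Network.
EDGES = {"DNS": ["Core Network", "Power"], "Core Network": ["Power"], "Power": []}
print(to_dot(normalize(EDGES)))
```

The printed text can be saved to a file and rendered with the Graphviz `dot` tool, for example `dot -Tpng deps.dot -o deps.png`.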
21 Executive Level Summary Created an executive level dashboard which is a stoplight view into the environment – Grey Service has not been reviewed for both scenarios – Red Issues of significant concern with this service – Yellow Issues exist with this service – Green This service should survive a scenario Can drill down to specific detail for a service.
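The stoplight roll-up might work along these lines. The sketch assumes each service carries one rating per scenario and that the dashboard shows the worst of the two, with grey whenever a scenario is still unreviewed. That roll-up rule, the scenario labels and the sample ratings are assumptions for illustration, not details taken from the presentation.

```python
SEVERITY = {"green": 0, "yellow": 1, "red": 2}

def service_status(ratings, scenarios=("Scenario 1", "Scenario 2")):
    """Roll per-scenario ratings up into one dashboard colour."""
    # Grey: the service has not yet been reviewed for both scenarios.
    if any(ratings.get(s) is None for s in scenarios):
        return "grey"
    # Otherwise report the worst rating across the scenarios.
    return max((ratings[s] for s in scenarios), key=SEVERITY.__getitem__)

# Made-up ratings for three catalog services.
SAMPLE = {
    "DNS": {"Scenario 1": "green", "Scenario 2": "green"},
    "E-mail": {"Scenario 1": "green", "Scenario 2": "red"},
    "DHCP": {"Scenario 1": "yellow"},
}

for name, ratings in SAMPLE.items():
    print(name, service_status(ratings))
```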
23 DRP for the DRP The entire wiki space is exported weekly in PDF format to our departmental Windows share. Operations copies these exports onto jump drives. Each team member also carries a copy of these files on their corporate-provided jump drives.
24 Opportunities for improvement Improvement in presentation of the dependency tree. Continual addition of new services makes addressing the reds problematic. Testing. Need to get better definition of Business Continuity variables – RPO – RTO …