Accident Exchange Group Plc Accident Exchange Limited InSpire November 2008 Building a Business Continuity Service: Reasons, Framework, and Technology
Agenda
Context: Who are Accident Exchange Limited?
Reasons: What does IT do for AX? Why bother with Business Continuity?
Framework: How do we design, implement, and manage the service?
Technology: How did we choose what to use, how did we implement it, and does it work?
What is CSI?
Questions
Context
Accident Exchange has a turnover of over £200m.
We have approximately 5,000 vehicles, many of them prestige (Bentley, Mercedes, BMW, etc.).
We expect in excess of 1.1m rental days this year.
We have approximately 800 employees, distributed between the main HQ at Coleshill and depots in Glasgow, Belfast, Warrington, and Dartford.
We collect approximately £0.7m a day from claims.
We outsource nothing; everything is developed and supported in house.
The company is a very dynamic, entrepreneurial business environment.
Reasons: What does IT do for AX? Why bother with BC?
Accident Exchange acts as the supporting partner for many motor manufacturers and retailer groups. They expect our service to be available 6/7 days a week, and the end user community expects mobility services 24/7. This functionality rests solely on the IT services provided to the business. Examples:
Communications (VoIP, MPLS, LAN, VPN, GPS, Mobile Services Support)
Hosting Services (AX Web, DCML, Car Auctions, User Repair Services, etc.)
Data Centres (Production, Business Continuity DC, Security, etc.)
Applications (Fax, SMS, AX Accident Management Application, accounts, payroll, HR, CRM, ERP, Voice services, Litigation, Customer Support, Asset Protection and Recovery services, Backup, Archive, data security, surveillance systems, access control, project management)
Development Services (application development, bug fix, customisation)
Business Continuity Services
Framework
Before any service delivery can occur, you need a framework to build these services on: a set of business-led objectives, deliverables, service level agreements, and defined roles and responsibilities, so that IT is accountable and measures its worth to the business (KPIs). This aspect of IT is the hardest to quantify or measure, but without it you cannot justify expenditure. AX have decided on the ITIL V3 framework and have invested in training key staff to basic accreditation, to be followed by advanced accreditation in defined critical areas. We have started writing our own ITIL framework relevant to AX. The first exercise we carried out was raising the profile of the Service Desk, followed by identifying the service catalogue. From this we prioritised the services into 3 main categories: 1 mission critical, 2 impactive, 3 non-essential. From there we decided on the RPOs, RTOs and RTAs. These are justified by measuring the business loss or impact when a service is not available.
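A minimal sketch of how a prioritised service catalogue of this kind could be represented. The three categories come from the slide; the class names, the assignment of services to categories, and the RPO/RTO values are hypothetical and are not AX's actual figures.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """The three service categories named on the slide."""
    MISSION_CRITICAL = 1
    IMPACTIVE = 2
    NON_ESSENTIAL = 3


@dataclass(frozen=True)
class Service:
    name: str
    tier: Tier
    rpo_minutes: int  # recovery point objective: tolerated data loss
    rto_minutes: int  # recovery time objective: tolerated downtime


def recovery_order(catalogue):
    """Order services for recovery: most critical tier first, then shortest RTO."""
    return sorted(catalogue, key=lambda s: (s.tier, s.rto_minutes))


# Hypothetical entries; tier assignments and objectives are illustrative only.
catalogue = [
    Service("Payroll", Tier.NON_ESSENTIAL, rpo_minutes=240, rto_minutes=1440),
    Service("Accident Management Application", Tier.MISSION_CRITICAL, rpo_minutes=0, rto_minutes=15),
    Service("CRM", Tier.IMPACTIVE, rpo_minutes=15, rto_minutes=240),
]

for service in recovery_order(catalogue):
    print(service.tier.name, service.name, service.rpo_minutes, service.rto_minutes)
```

Keeping the tier and the objectives on the same record means the recovery order and the SLA targets are derived from one source rather than maintained separately.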
Technology: How did we choose what to use?
Before any technology was considered, a comprehensive ITT was written and submitted to key vendors, asking for solutions to a specific set of criteria. The responses were evaluated for functionality, adherence to success criteria, longevity, skill sets required and, lastly, cost. Two vendors were selected for full proofs of concept to prove categorically that they could deliver the success criteria, and then one was selected and implemented under our project management services. One aspect of the Business Continuity Service will be discussed further.
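One common way to run an evaluation like this is a weighted scoring matrix. The sketch below assumes that approach; the slide does not say how the responses were actually scored. The criteria names come from the slide, while the weights, vendor names, and scores are hypothetical.

```python
# Hypothetical weights: higher means more important. Cost is weighted lowest
# only because the slide lists it last; the real weighting is not stated.
WEIGHTS = {
    "functionality": 5,
    "success_criteria": 4,
    "longevity": 3,
    "skill_sets": 2,
    "cost": 1,
}

# Hypothetical vendor scores on a 1 (poor) to 5 (excellent) scale.
RESPONSES = {
    "Vendor A": {"functionality": 4, "success_criteria": 5, "longevity": 3, "skill_sets": 3, "cost": 2},
    "Vendor B": {"functionality": 5, "success_criteria": 4, "longevity": 4, "skill_sets": 3, "cost": 3},
    "Vendor C": {"functionality": 3, "success_criteria": 3, "longevity": 5, "skill_sets": 4, "cost": 5},
}


def weighted_score(scores):
    """Sum of score times weight over every criterion."""
    return sum(WEIGHTS[criterion] * scores[criterion] for criterion in WEIGHTS)


def shortlist(responses, size=2):
    """Return the names of the `size` highest-scoring vendors, best first."""
    ranked = sorted(responses, key=lambda name: weighted_score(responses[name]), reverse=True)
    return ranked[:size]


print(shortlist(RESPONSES))
```

The shortlist of two mirrors the two vendors taken forward to proof of concept; the final choice would then rest on the POC results rather than on the paper scores.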
Technology: How did we implement? Does it work?
Before the project started, a full risk analysis was carried out, and it was agreed that the full implementation would be done in parallel to production so that at no time was there any risk to the business. AX staff followed the full installation; then, at key points, configurations were trashed, recovered and reconfigured by AX to ensure that the documentation was fully accurate and that we were able to administer and adjust the configuration. Full performance testing was done at each stage, and the results were compared to the tender and POC tests. Resiliency tests were carried out again and the impacts documented. Finally, copies of production data were used, and test procedures were run and compared to the current production environment. Does it work? Yes, it does! What is the solution?
What is the solution?
We have gone for a fully resilient HA system based on the Hitachi USP with 4 tiers of storage. These are linked into Cisco MDS SAN switches, and we use FCIP links between data centres with synchronous and asynchronous replication. We use ONStor Bobcats in HA clusters for CIFS file shares, and HDPS (CV) for the backup and archiving solution. We use Microsoft MNS clusters split between sites, so we have a full failover option on the production site and can also fail over to the BC site with no downtime or data loss, as we have integrated the HCS solution. The clustered storage software (HORCM files) allows the MNS cluster to fail over between sites, and the cluster resource swaps the P-VOLs and S-VOLs accordingly, so replication switches direction automatically and business data is always protected to RPO 0. We have also integrated local synchronous copies using SI, so that reporting services have up-to-date copies with no impact on OLTP. Backup uses maglibs and then tapes, and we distribute LUNs between sites using IVR at switch level. We have also used VMware and integrated SRM (Site Recovery Manager), which again uses HORCM to swap the P-VOLs and S-VOLs, so failover is immediate and data is always protected when BC is tested (using the SI copy).
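The role swap described above can be illustrated with a toy model. This is only a conceptual sketch of synchronous replication with a primary/secondary swap on failover; it does not use HORCM, SRM, or any Hitachi interface, and the class and site names are hypothetical.

```python
class ReplicatedVolume:
    """Toy model of a synchronously replicated volume pair across two sites."""

    def __init__(self, primary_site, secondary_site):
        self.primary_site = primary_site      # site holding the P-VOL role
        self.secondary_site = secondary_site  # site holding the S-VOL role
        self.blocks = {primary_site: [], secondary_site: []}

    def write(self, block):
        # Synchronous replication: the write completes only once both
        # copies hold the block, which is what gives an RPO of 0.
        self.blocks[self.primary_site].append(block)
        self.blocks[self.secondary_site].append(block)

    def fail_over(self):
        # Swap the roles: the old secondary becomes the primary, and
        # replication now flows in the opposite direction.
        self.primary_site, self.secondary_site = self.secondary_site, self.primary_site

    def in_sync(self):
        return self.blocks[self.primary_site] == self.blocks[self.secondary_site]


volume = ReplicatedVolume("production", "bc")
volume.write("claim-001")
volume.fail_over()           # production site lost or under test
volume.write("claim-002")    # now written at the BC site and replicated back
print(volume.primary_site, volume.in_sync())
```

Because every write lands on both copies before it completes, the copy at the surviving site is always current, so swapping the roles loses no data.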
What is CSI: Continual Service Improvement?
Before we sign off a specific service as ready for business usage, we fully test its functionality against all its SLAs, i.e. performance, backup, recovery, and failover/resilience, and if the SLA is RPO 0/15/240 then synchronous data tests are run. This is tested by both the level 3 team and the support desk team (who follow the documentation). Once we are happy, the service is entered into the Service Catalogue and becomes subject to monthly testing, without warning, of some aspect of its SLA. For example, it may well be failed over to the BC site and back again with no noticeable business effect (or within SLA), and the RTA is noted. If improvements are possible, we make them under the CSI work time; they are noted, the documentation is updated and, if necessary, the SLA is adjusted. The business gets KPI reports on this service and can thus see that it is getting what it paid for, or better. Any failure to meet the SLAs is investigated vigorously, and recommendations are made and implemented. We see this process as fundamental to our ITIL framework; it is how we intend to develop our IT plans in future and propose budgets based on business objectives and SLAs.
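A minimal sketch of how the monthly test results could be turned into a KPI for the business: each test records the RTA achieved against the RTO in the SLA, and the report shows the share of tests within SLA plus the breaches to investigate. The record layout, service names, and figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailoverTest:
    service: str
    rto_minutes: float  # recovery time objective agreed in the SLA
    rta_minutes: float  # recovery time actually achieved in the test

    @property
    def within_sla(self):
        return self.rta_minutes <= self.rto_minutes


def sla_compliance(tests):
    """Percentage of tests in which the RTA met the RTO."""
    if not tests:
        raise ValueError("no test results to report on")
    return 100.0 * sum(t.within_sla for t in tests) / len(tests)


def breaches(tests):
    """Tests that missed the SLA and therefore need investigating."""
    return [t for t in tests if not t.within_sla]


# Hypothetical results from four unannounced monthly tests.
results = [
    FailoverTest("Accident Management Application", rto_minutes=15, rta_minutes=4),
    FailoverTest("CRM", rto_minutes=240, rta_minutes=35),
    FailoverTest("File shares", rto_minutes=60, rta_minutes=75),
    FailoverTest("Payroll", rto_minutes=1440, rta_minutes=120),
]

print(f"SLA compliance: {sla_compliance(results):.1f}%")
for test in breaches(results):
    print(f"Investigate {test.service}: RTA {test.rta_minutes} min against RTO {test.rto_minutes} min")
```

Recording the RTA alongside the RTO also shows where a service consistently beats its target, which is the evidence needed before tightening an SLA.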