Single Point Failure: The case study of RBS CS/SE 6361 Advanced Requirements Engineering Shahed Shuman.


1 Single Point Failure: The case study of RBS CS/SE 6361 Advanced Requirements Engineering Shahed Shuman

2 The Incident On June 19, 2012, Olivia Downey from Aberdeenshire, Scotland, was waiting for cancer treatment in a Mexican hospital when the money raised by charity for her care was suddenly frozen by NatWest. Olivia's family had to scramble to obtain the funds another way so that her treatment could start that day. A man granted bail by a court in Kent, England, could not leave jail: his family's account was frozen and they could not post the bail money. Some 16.9 million customers of the UK bank NatWest and about 100,000 customers of Northern Ireland's Ulster Bank were affected. Some could not withdraw money; others did not receive their wages or have payments and other transactions processed, and faced fines for late payment of bills. NatWest and Ulster Bank are part of RBS (Royal Bank of Scotland).

3 The RBS payment processing system suddenly stopped. None of the overnight jobs that calculate account balances ran, all transaction processing was disrupted, ATMs were not operational, and online banking showed incorrect balances. The bank had to announce emergency steps to mitigate the incident: all branches stayed open over the weekend and for extended hours so customers could access their funds. Image from: http://www.google.com/patents/US6304860

4 What was the issue? RBS payment processing ran on its IBM mainframe under software called CA-7. CA-7 is a job scheduling / workflow automation software package sold by CA Technologies; it was originally developed by UCCEL Corp and was formerly called UCC-7. CA-7 is used by many banks and financial institutions to run job schedules on IBM mainframes. The day before the incident, an upgrade was installed to the RBS CA-7 software and it malfunctioned, resulting in a total stoppage of payment processing jobs.
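To make the "single point of failure" concrete, here is a minimal sketch of dependency-ordered batch scheduling. It is not CA-7 and does not use any CA-7 interface; the job names, the `JOBS` table, and the `scheduler_ok` flag are hypothetical and exist only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical overnight jobs: each job maps to the jobs it depends on.
JOBS = {
    "collect_transactions": set(),
    "post_payments": {"collect_transactions"},
    "calculate_balances": {"post_payments"},
    "publish_online_balances": {"calculate_balances"},
    "refresh_atm_limits": {"calculate_balances"},
}


def run_schedule(jobs, scheduler_ok=True):
    """Return the jobs that would be dispatched, in dependency order.

    If the scheduler itself is unavailable, nothing is dispatched,
    however healthy the individual jobs are.
    """
    if not scheduler_ok:
        return []
    return list(TopologicalSorter(jobs).static_order())
```

With a working scheduler, `run_schedule(JOBS)` lists every job after its prerequisites. With `scheduler_ok=False` it returns an empty list, which mirrors how a fault in one component can stop balances, online banking, and ATMs at the same time.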

5 What were the causes? RBS moved CA-7 maintenance and support offshore. The job offered 8-10 lakh Indian rupees (13-16 thousand USD), which may not have attracted top talent. During the software update, the CA-7 scheduler file was corrupted when offshore engineers applied the update to both the main and backup servers. RBS had also cut its onshore CA-7 resources. As a result, no CA-7 expert was left at RBS with enough experience to diagnose the issue and revert the corrupted update, and RBS had to call CA Technologies for help. As reported in The Register: RBS do use CA-7 and do update all accounts overnight on a mainframe via thousands of batch jobs scheduled by CA-7... Backing out of a failed update to CA-7 really ought to have been a trivial matter for experienced operations and systems programming staff, especially if they knew that an update had been made. That this was not the case tends to imply that the criticisms of the policy to "off-shore" also hold some water. (http://www.theregister.co.uk/2012/06/25/rbs_natwest_what_went_wrong/)

6 Root cause
1. The offshore team did not follow the proper CA-7 update requirements.
2. The system upgrade happened on a weekday without proper testing, which caused major business disruption.
3. The issue was not found until it was already on the RBS production server.
4. RBS had no requirements in place for handling this kind of technical emergency.
5. RBS did not have the resources to implement any corrective measure.
6. The delay in implementing the fix, together with the subsequent media coverage, prolonged the incident, as CA Technologies took extra time to avoid another blunder.
According to The Register, the error was made when backing out of an upgrade from CA-7 v11.1 to v11.3. The CA-7 upgrade took place at the weekend of 16/17th June and a problem was noticed on Monday, which prompted a back-out from the upgrade on Tuesday night. In the back-out, an "inexperienced operator" made the wrong move and the day's data was wiped from the system. This created the backlog. (http://www.theregister.co.uk/2012/06/28/rbs_job_cuts_and_offshoring_software_glitch/)
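Several of these root causes come down to having no safe, rehearsed back-out path. The sketch below shows one way a staged upgrade with a snapshot and rollback could look. It is an assumption-laden illustration, not how RBS or CA Technologies operate: `Server`, `healthy`, and `staged_upgrade` are hypothetical names, and the health check is deliberately simplistic.

```python
import copy


class Server:
    """Hypothetical stand-in for one scheduler host (not a CA-7 API)."""

    def __init__(self, name, version, schedule):
        self.name = name
        self.version = version
        self.schedule = schedule  # list of queued job names


def healthy(server):
    # Assumed health check: the schedule must still contain jobs.
    return bool(server.schedule)


def staged_upgrade(primary, backup, apply_update):
    """Upgrade the primary first; touch the backup only after it verifies."""
    snapshot = copy.deepcopy(primary.__dict__)  # saved before any change
    apply_update(primary)
    if not healthy(primary):
        primary.__dict__.update(snapshot)  # restore version and schedule
        return "rolled_back"
    apply_update(backup)
    return "upgraded"
```

The design point is that the backup stays on the old version until the primary has passed its check, so a bad update cannot corrupt both copies at once, and the snapshot keeps the queued work recoverable during a back-out.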

7 What was the issue? RBS payment processing ran on an IBM mainframe under software called CA-7. CA-7 is a job scheduling / workflow automation software package sold by CA Technologies; it was originally developed by UCCEL Corp and was formerly called UCC-7. CA-7 is used by many banks and financial institutions to run job schedules on IBM mainframes. The day before the incident, an upgrade was installed to the RBS CA-7 software and it malfunctioned, resulting in a total stoppage of payment processing jobs. Ref: http://www.ca.com/us/products/detail/ca-7-workload-automation.aspx

8 What was the damage? About £1.7 billion was wiped off the value of the taxpayers' stake in RBS after this incident. Shares slumped 9.1 per cent to 227.7p, meaning over £1.7 billion was wiped off taxpayers' 82 per cent stake in the bank. (http://www.dailymail.co.uk/money/markets/article-2165332/NatWest-glitch-RBS-shares-fall-insuiders-claim-Indian-technician-cause-meltdown.html#ixzz2ipXZm3LK)
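The quoted figures can be cross-checked with simple arithmetic. The sketch below uses only the three numbers reported above (9.1 per cent, 227.7p, £1.7 billion); the variable names are illustrative, and because the inputs are rounded the results are approximate.

```python
price_after_p = 227.7   # share price after the fall, in pence
drop = 0.091            # reported fall of 9.1 per cent
loss_gbp_bn = 1.7       # reported loss on the taxpayers' stake, in GBP billions

# Price before the fall: the new price is (1 - drop) of the old one.
price_before_p = price_after_p / (1 - drop)

# If a 9.1 per cent fall removed about 1.7bn, the stake was worth loss / drop.
stake_before_bn = loss_gbp_bn / drop
stake_after_bn = stake_before_bn - loss_gbp_bn

print(f"price before: about {price_before_p:.1f}p")
print(f"stake before: about £{stake_before_bn:.1f}bn, after: about £{stake_after_bn:.1f}bn")
```

Under these assumptions the shares traded at roughly 250p before the fall, and the taxpayers' stake dropped from just under £19 billion to about £17 billion.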

9 Questions?

