
1 University of Southern California Center for Systems and Software Engineering Software Classic Disasters CS 577b Software Engineering II Supannika Koolmanojwong April 4, 2011

2 Outline
IT Project Management: Infamous Failures, Classic Mistakes, and Best Practices
Recovering IT in a Disaster: Lessons from Hurricane Katrina
Top 10 Worst Practices
04/04/2011 © 2011 USC-CSSE

3 IT Project Management: Infamous Failures, Classic Mistakes, and Best Practices
R. Ryan Nelson, MIS Quarterly Executive Vol. 6 No. 2 / June 2007
Retrospectives: project postmortems or post-implementation reviews
99 retrospectives conducted in 74 organizations over the past 7 years
“Insanity: doing the same thing over and over again and expecting different results.” (Albert Einstein)

4 10 of the most infamous IT project failures
Large magnitude, over $100 million
One-half come from the public sector
–wasted taxpayer dollars
–lost services
The other half come from the private sector
–billions of dollars in added costs
–lost revenues
–lost jobs

5 1. Internal Revenue Service (IRS), 1999
PROJECT: Business Systems Modernization
–Launched in 1999 to upgrade the agency’s IT infrastructure and more than 100 business applications
$8 billion modernization project with a team of vendors
A complex project that overwhelmed the management capabilities of both vendor and client
The most expensive systems development “fiasco” in history, with delays costing the U.S. Treasury tens of billions of dollars per year
The ability to collect revenue, conduct audits, and go after tax evaders was severely compromised

6 2. Federal Aviation Administration, 1996
PROJECT: Advanced Automation System (AAS); the FAA’s effort to modernize the nation’s air traffic control system
Estimated to cost $2.5 billion ($1.5 billion was wasted)
Numerous delays and cost overruns, blamed on both the FAA and the primary contractor, IBM
Technical complexity of the effort, bad resource estimation, ineffective requirements control
“For example, they wanted the system to have only 3 seconds of downtime a year. But to get the data to prove that requirement had been met would have taken about 10 years” (later changed to 5 minutes of downtime)
Instead of admitting the problem, IBM turned AAS into a research project
The project collapsed

7 3. Federal Bureau of Investigation
PROJECT: “Trilogy”; four-year, $500M overhaul of the FBI’s antiquated computer system
Ill-defined requirements that changed dramatically after 9/11 (the agency's mission switched from a criminal to an intelligence focus)
$170 million project was abandoned altogether
The bureau found 400 problems with early versions of the troubled software but never told the contractor
The bureau went ahead with a $17 million testing program even though the software would have to be scrapped

8 4. McDonald’s, 2001
PROJECT: “Innovate”; digital network for creating a real-time enterprise
Planned to spend $1 billion over five years
Objective: to better serve customers by using information and communications technologies to monitor the quality of products and services
Executives in company headquarters would have been able to see how soda dispensers and frying machines in every store were performing, at any moment
Would need $1 billion for infrastructure, and $zillions to maintain and upgrade
After two years and $170M, the fast food giant threw in the towel

9 5. Denver International Airport, 1994
PROJECT: Baggage-handling system
It took 10 years and at least $600 million to figure out that big muscles, not computers, can best move baggage
The baggage system, designed and built by BAE Automated Systems Inc., launched, chewed up, and spit out bags so often that it became known as the “baggage system from hell”

10 6. AMR Corp., Budget Rent A Car Corp., Hilton Hotels Corp., Marriott International Inc., 1992
PROJECT: “Confirm”; reservation system for hotel and rental car bookings
Was supposed to be a leading-edge, comprehensive travel industry reservation program combining airline, rental car, and hotel information
The project collapsed after four years and $125 million in development, when it became clear that Confirm would miss its deadline by as much as two years
Major problems surfaced when Hilton tested the system; an 18-month delay followed and the problems could not be resolved

11 7. Bank of America, 1988
PROJECT: “MasterNet”; trust accounting system
Hardware problems caused the Bank of America (BofA) to lose control of several billion dollars of trust accounts
All the money was eventually found in the system, but all 255 people in the entire Trust Department were fired, as all the depositors withdrew their money
This is a classic case study on the need for risk assessment, including people, process, and technology-related risk
BofA spent $60M to fix the $20M project before deciding to abandon it altogether
BofA fell from being the largest bank in the world to No. 29
CRACK stakeholder problems, bad modular design, focus on competing with competitors - but ready for transition

12 8. Kmart, 2000
PROJECT: IT systems modernization
$1.4 billion IT modernization effort aimed at linking its sales, marketing, supply, and logistics systems
18 months later, cash-strapped Kmart cut back on modernization, writing off the $130 million it had already invested in IT
Four months later, it declared bankruptcy
Problems ranged from failing to allocate enough money and manpower to not clearly establishing the IT project's relationship to the organization's business

13 9. London Stock Exchange, 1993
PROJECT: “Taurus”; paperless share settlement system
£800 million, against an original budget of £6 million
Abandoned after 10 years of development
Based on software by Vista Concepts, US, for database management
Although very good for on-line real-time processing, it could not handle distributed data processing or batch processing
LSE tried to modify Vista by rewriting almost 60% of it, which led to hidden bugs and long delays
Grew from a settlement-only system into a full “share registration and transfer system”

14 10. Nike, 2000
PROJECT: Integrated enterprise software
$400 million spent installing ERP, CRM, and SCM, the full complement of analyst-blessed integrated enterprise software
Caused a major inventory glitch: over-produced some shoe models and under-produced others
Profits dropped by $100 million

15 Classic Mistakes
Behind schedule? Add more people!
Want to speed up development? Cut testing!
A new version of the OS becomes available during the project? Time for an upgrade!
Key contributor aggravating the rest of the team? Wait until the end of the project to fire him!

16 Classic Mistakes: People
Undermined motivation
–productivity and quality
Individual capabilities of the team members or their working relationships
Failure to take action to deal with a problem employee
Adding people to a late project
–pouring gasoline on a fire

17 Classic Mistakes: Process
BDUF – Big Design Up Front
Underestimation and overly optimistic schedules: under-scoping the project, undermining effective planning, and shortchanging requirements determination and/or quality assurance
–Poor estimation also puts excessive pressure on team members, leading to lower morale and productivity
Insufficient risk management
Contractor failure: outsourcing and offshoring

18 Classic Mistakes: Product
FAA’s modernization effort, where the goal was 99.99999% reliability, which is referred to as “the seven nines”
Requirements gold-plating
Feature creep
–the average project experiences about a 25% change in requirements over its lifetime
Developer gold-plating: trying out new technology whether or not it is required in the product
Research-oriented development
Silver-bullet syndrome
Overestimated savings from new tools or methods
Switching tools in the middle of a project
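To put the “seven nines” goal in perspective, the yearly downtime budget implied by an availability target can be computed directly. The sketch below is illustrative only: the function names are hypothetical and it assumes a 365-day year. Seven nines leaves roughly 3 seconds of downtime per year, which matches the FAA requirement quoted earlier, while five nines leaves a little over 5 minutes.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years ignored (assumption)


def nines(count: int) -> float:
    """Availability percentage with `count` nines, e.g. 3 -> 99.9."""
    return 100.0 * (1.0 - 10.0 ** -count)


def allowed_downtime_seconds(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability target."""
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return SECONDS_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for n in (3, 5, 7):
        budget = allowed_downtime_seconds(nines(n))
        print(f"{n} nines: {budget:.2f} seconds of downtime per year")
```

The tighter the budget, the longer a system must be observed before anyone can claim the target was met, which is the measurement problem the FAA quote points at.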

19 A Meta-Retrospective of 99 IT Projects
Mistakes by category: process (45%), people (43%), product (8%), technology (4%)
–project managers should be experts in managing processes and people
Scope creep didn’t make the top ten mistakes
–as long as the project manager pays attention to it
Contractor failure has been climbing in frequency in recent years
If the project managers had focused their attention on better estimation and scheduling, stakeholder management, and risk management, they could have significantly improved the success of the majority of the projects studied

20 Avoid classic mistakes through best practices
1. Avoiding Poor Estimating and/or Scheduling
–Cost overrun, %, %,
–Schedule overrun, %, %.
–Cone of uncertainty: multiply the “most likely” single-point estimate by an optimistic factor to get the lower bound (optimistic estimate) and by a pessimistic factor to get the upper bound (pessimistic estimate)
–Capital One: 100% cushion at the beginning of the feasibility phase, 75% cushion in the definition phase, 50% cushion in design, 25% cushion at the beginning of construction
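The phase-based cushions can be sketched as a simple lookup. This is a minimal illustration, not Capital One's actual method: it assumes the cushion is added on top of the most likely estimate, and the names `PHASE_CUSHION` and `padded_estimate` are hypothetical.

```python
# Cushion per phase, taken from the slide; later phases carry less uncertainty.
PHASE_CUSHION = {
    "feasibility": 1.00,
    "definition": 0.75,
    "design": 0.50,
    "construction": 0.25,
}


def padded_estimate(most_likely: float, phase: str) -> float:
    """Return the most likely estimate plus the cushion for the given phase.

    Assumption: the cushion is additive, so a 50% cushion on 100 gives 150.
    """
    if phase not in PHASE_CUSHION:
        raise ValueError(f"unknown phase: {phase!r}")
    return most_likely * (1.0 + PHASE_CUSHION[phase])


if __name__ == "__main__":
    for phase in PHASE_CUSHION:
        print(phase, padded_estimate(100.0, phase))
```

The shrinking cushion mirrors the cone of uncertainty: the same single-point estimate deserves a wider margin early in the project than at the start of construction.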

21 Avoiding Poor Estimating and/or Scheduling
Valuable approaches to improving project estimation and scheduling
–timebox development: shorter, smaller projects are easier to estimate
–creating a work breakdown structure to help size and scope projects
–retrospectives to capture actual size, effort and time data for use in making future project estimates
–a project management office to maintain a repository of project data over time
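One way retrospective data can feed future estimates is to scale a new estimate by the average overrun seen on past projects. The sketch below is only one possible approach, with hypothetical function names and made-up history; a real project office would likely segment the data by project type and size.

```python
from statistics import mean


def overrun_ratio(history: list[tuple[float, float]]) -> float:
    """Average of actual/estimated over past projects.

    `history` holds (estimated, actual) pairs in the same unit,
    e.g. person-months captured during retrospectives.
    """
    if not history:
        raise ValueError("no historical data to calibrate against")
    return mean([actual / estimated for estimated, actual in history])


def calibrated_estimate(raw_estimate: float, history: list[tuple[float, float]]) -> float:
    """Scale a fresh estimate by the historical overrun ratio."""
    return raw_estimate * overrun_ratio(history)


if __name__ == "__main__":
    # Hypothetical data: three past projects, each of which ran over.
    past = [(100.0, 150.0), (200.0, 260.0), (50.0, 70.0)]
    print(calibrated_estimate(80.0, past))
```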

22 Avoiding Ineffective Stakeholder Management
Ineffective stakeholder management is the second biggest cause of project failure
Have to know
–who has influence over others
–who has direct control of resources
–stakeholder level of interest
–stakeholder degree of support/resistance

23 Avoiding Insufficient Risk Management
Risk identification, analysis, prioritization, risk-management planning, resolution, and monitoring
Methods / tools
–a prioritized risk assessment table
–a top-10 risks list
–interim retrospectives
–appointing a risk officer
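A prioritized risk table and a top-10 list can be sketched in a few lines. The slide does not say how risks are ranked; the example assumes the common convention of risk exposure = probability of loss × size of loss, and the `Risk` class, `top_risks` function, and sample entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    probability: float  # chance the risk materializes, between 0 and 1
    loss: float         # estimated cost if it does, in any consistent unit

    @property
    def exposure(self) -> float:
        """Risk exposure: probability times loss (assumed ranking metric)."""
        return self.probability * self.loss


def top_risks(risks: list[Risk], limit: int = 10) -> list[Risk]:
    """Return the `limit` risks with the highest exposure, largest first."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)[:limit]


if __name__ == "__main__":
    register = [
        Risk("Key vendor misses delivery", 0.5, 100.0),
        Risk("Requirements change after sign-off", 0.1, 1000.0),
        Risk("Test environment unavailable", 0.9, 10.0),
    ]
    for risk in top_risks(register):
        print(f"{risk.exposure:8.1f}  {risk.name}")
```

Re-running the ranking at each interim retrospective keeps the list current as probabilities and losses are revised.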

24 Avoiding Insufficient Planning
Ensure the following
–Clear roles and responsibilities
–Resource allocation
–Schedule / timeline
–Follow project policies, plans, and procedures

25 Avoiding Shortchanging Quality Assurance
When a project falls behind schedule, the first two areas that often get cut are testing and training
Teams cut corners by eliminating test planning, eliminating design and code reviews, and performing only minimal testing
Suggestions:
–agile development, joint application design sessions, automated testing tools, and daily build-and-smoke tests

26 Avoiding Weak Personnel and/or Team Issues
Get the right people assigned to the project from the beginning
Between 1999 and 2006, the retrospectives reported an increasing number of problems with distributed, inter-organizational, and multi-national teams
–reduction in face-to-face team meetings, time-zone barriers, and language and cultural issues

27 Avoiding Insufficient Project Sponsorship
Not only getting top management support, but identifying the right sponsor
From the beginning!


29 Outline
IT Project Management: Infamous Failures, Classic Mistakes, and Best Practices
Recovering IT in a Disaster: Lessons from Hurricane Katrina
Top 10 Worst Practices

30 Hurricane Katrina

31 Recovering IT in a Disaster: Lessons from Hurricane Katrina
Iris Junglas, Blake Ives, MIS Quarterly Executive Vol. 6 No. 1 / Mar 2007
On August 29, Hurricane Katrina destroyed a data center and communications infrastructure at the Pascagoula and Gulfport, Mississippi, operations of the Ship Systems sector of Northrop Grumman Corporation
Also put a second data center out of commission in a shipyard near New Orleans
20,000 employees in ship construction
Caused over US$1 billion in damage for the company
Brought two of the nation’s largest shipyards to a standstill

32 Recovering IT in a Disaster
How to adapt when the business continuity plan and the public infrastructure prove inadequate
Reexamine our processes for preparing disaster plans
Processes for assessing preparedness and response after a disaster or a near-disaster

33 Northrop Grumman Corporation
Products: electronics, aerospace, and shipbuilding
Customers: government and commercial customers worldwide
Major business:
–Ship construction: large military vessels
–Revenue: US$5.7 billion in 2005
–Customers: DoD and Navy
–12,900 employees in Mississippi
–7,100 employees in New Orleans

34 Preparation for Hurricane
Hurricanes are nothing new to the shipbuilding industry
–September 04 – Hurricane Ivan
–July 05 – Hurricane Dennis
A bigger one is heading in
–August people dead, over US$1 billion in damage in Florida

35 Preparation for Hurricane
Data
–Data backups were sent to Iron Mountain (information management services)
–Second backup in Dallas
Servers
–powered off
–wrapped in plastic
New backup generator in a secure location
Only one extranet kept alive (crucial to the Navy and DoD)
Human
–Left the area

36 The storm smashed
NGC facilities were in the storm’s path
Communication failed
Extensive damage to the shipyards and nearby communities
Emergency command center at the Dallas office, where a newly assembled emergency team was formed

37 Damages
Collected digital images of the damage
In Mississippi, lost
–1,500 PCs, 200 servers, 300 printers, 600 data input devices, and hundreds of two-way radios
–communications closets, routers, switches, fiber and copper cables and wires
–LAN / WAN / MAN no longer worked
In New Orleans
–Infrastructure was still in place
–AC systems were not working, so servers shut down automatically
A week after the storm, communication lines went down again because cars were driving over them

38 First things first
Not about restoring computer systems, but restoring human resources
But most of the 20,000 employees were out of contact
Tools
–Press releases
–Corporate web site (67,000 hits in the weeks after the storm)
–Toll-free call-in number
Payroll through Wal-Mart and Western Union

39 Restoring IT infrastructure
Electronic communication was nonexistent because the public communication infrastructure was down
BlackBerry communication could be used intermittently
Two-way radios, walkie-talkies
Key members used satellite phones
–Required line-of-sight access to satellites
Later on, wireless communication was used

40 Building new data center
Hardware acquisition
Incompatibilities between software and the new hardware environment
Inaccessible or hard-to-find system documentation, e.g. license keys, server names, addressing schemes, login IDs

41 Restoring data and applications
Some firms found that their backup data was partially unreadable
NGC had 2 backups: Iron Mountain and Dallas
Lost some data on desktops or local machines
Two weeks after Katrina, NGC had a new data center with essential systems up and running
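Unreadable backups are usually discovered only at restore time. One way to catch the problem earlier is to record a checksum for every file when the backup is written and re-check it periodically. The sketch below is a generic illustration, not something described in the Katrina case study; the function names and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(backup_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum at backup time."""
    return {
        str(p.relative_to(backup_dir)): sha256_of(p)
        for p in sorted(backup_dir.rglob("*"))
        if p.is_file()
    }


def find_damaged(backup_dir: Path, manifest: dict[str, str]) -> list[str]:
    """Return files that are missing or whose contents no longer match."""
    damaged = []
    for relative_path, expected in manifest.items():
        target = backup_dir / relative_path
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(relative_path)
    return damaged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "payroll.db").write_bytes(b"original contents")
        manifest = build_manifest(root)
        (root / "payroll.db").write_bytes(b"corrupted contents")  # simulate damage
        print(find_damaged(root, manifest))
```

Storing the manifest separately from the backup media, for instance at the second site, means one damaged copy cannot silently invalidate its own checksums.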

42 Disaster preparedness
Common mistake: organizations prepare only for disasters specific to their domain
–financial institutions prepare for IT failures
–hospitals for pandemics
–airlines for technical failures and sabotage
An alternative approach: consider a broader spectrum of generic disaster types
–economic, information, physical, human resource, reputation, psychopathic, and natural disasters
Identify common characteristics of each disaster category, then construct the plan

43 IT disaster preparedness framework
Frameworks provide generic objectives, measurements, and guidelines for establishing IT disaster preparedness; they emphasize developing an IT continuity plan, identifying and allocating critical resources, executing a business impact analysis, and maintaining, testing, and training on the plan
COBIT (Control Objectives for Information and Related Technology)
–For operational IT and business managers
–Focus on three core elements of IT governance: IT as an asset, IT-related risks, and IT control structures
ITIL (IT Infrastructure Library)
–focus is to improve the efficiency and effectiveness of IT services delivered to customers within the enterprise
–de facto standard for IT service management

44 IT disaster preparedness framework
COBIT (Control Objectives for Information and Related Technology)
ITIL (IT Infrastructure Library)

45 Lessons Learned
1. Keep Data and Data Centers Out of Harm’s Way
2. Don’t Assume the Public Infrastructure Will Be Available
3. Plan for Civil Unrest
4. Assume Some People Will Not Be Available
5. Leverage Your Suppliers as Critical Team Members

46 Lessons Learned
6. Expect the Unexpected
7. Get Prepared – Crisis portfolio
8. Establish a Strong Leadership Position
9. Empower Decision Makers on the Team
10. Exploit Fresh-Start Opportunities

47 Outline
IT Project Management: Infamous Failures, Classic Mistakes, and Best Practices
Recovering IT in a Disaster: Lessons from Hurricane Katrina
Top 10 Worst Practices

48 Worst Practices
Capers Jones, "Our Worst Current Development Practices," IEEE Software, vol. 13, no. 2, pp , Mar
Project failures
–terminated because of cost or schedule overrun
–experienced schedule or cost overruns in excess of 50 percent of initial estimates
–resulted in client lawsuits for contractual noncompliance
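The three failure criteria above can be expressed as a small check. This is an illustrative reading of the slide, not code from the cited article: the function names and parameters are hypothetical, and "in excess of 50 percent" is interpreted as strictly greater than 0.50.

```python
def overrun_fraction(estimated: float, actual: float) -> float:
    """Overrun relative to the initial estimate, e.g. 0.25 for 25% over."""
    return (actual - estimated) / estimated


def is_project_failure(
    terminated_for_overrun: bool,
    estimated_cost: float,
    actual_cost: float,
    estimated_months: float,
    actual_months: float,
    client_lawsuit: bool,
    threshold: float = 0.50,
) -> bool:
    """Apply the three criteria: termination, >50% overrun, or a lawsuit."""
    cost_over = overrun_fraction(estimated_cost, actual_cost) > threshold
    schedule_over = overrun_fraction(estimated_months, actual_months) > threshold
    return terminated_for_overrun or cost_over or schedule_over or client_lawsuit


if __name__ == "__main__":
    # Hypothetical project: on budget, but 12 planned months became 20.
    print(is_project_failure(False, 100.0, 100.0, 12.0, 20.0, False))
```

Applying any such check requires recorded initial estimates and actuals for each project.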

49 Worst Practice #1
No historical software-measurement data
Lack of historical data blinds stakeholders to the realities of software development
Need to check on schedule, cost, progress, performance

50 Worst Practice #2
Rejection of accurate estimates
The lack of accurate estimates is the root cause of the rest of the worst practices, including:
–inability to perform return-on-investment calculations
–susceptibility to false claims by tool and method vendors
–software contracts that are ambiguous and difficult to monitor

51 Worst Practices #3 & 4
Failure to use automated estimating tools and automated planning tools
50 commercial software-cost estimating tools
–Checkpoint, COCOMO, Estimacs, Price-S, or Slim
100 project-planning tools on the market
–Microsoft Project, Primavera, Project Manager’s Workbench, or Timeline
Combining estimating and planning tools leads to accurate and realistic outcomes that are not easily overridden by clients or executives

52 Worst Practices
5 & 6 - Excessive, irrational schedule pressure and creep in users’ requirements
7 & 8 - Failure to monitor progress and to perform risk management
–“90 percent completion”
9 & 10 - Failure to use design reviews and code inspections

