
ERCOT Project Update ERCOT Outage Evaluation Phase 2 (SCR745) TDTWG May 7, 2008.


1 ERCOT Project Update ERCOT Outage Evaluation Phase 2 (SCR745) TDTWG May 7, 2008

2 PR60006_01 Phase 2 ERCOT Update - Overview

Background:
SCR 745: To achieve improved Market performance and reliability through a reduction of ERCOT Retail Systems unplanned outages. This effort was planned to be implemented in two subprojects:
  PR60006_01: ERCOT Outage Evaluation Phase I and Phase II
    Phase I, NAESB and Proxy Clustered (Delivered 02/2007, Goal Achieved)
    Phase II, Paperfree Clustered environment with File Server Redundancy and High Availability
  PR60006_02: Phase III, Database Clustered environment (Cancelled per recommendations at 04/02/2008 TDTWG)

Phase II Status:
  02/10/2007: Implemented Veritas clustered solution; resulted in rollback due to unsuccessful failover.
  03/08/2008: Implemented Polyserve clustered solution; resulted in rollback due to performance and stability issues (this would have delivered Redundancy and Failover).
  05/07/2008: Seeking recommendations from TDTWG for Next Steps.

3 PR60006_01 Phase 2 ERCOT Update - Next Steps

Recommendations from HP for Performance improvement will require Architectural changes, server rebuilds, and testing.

ERCOT Recommends pursuing one of the following Options:
1) Place project “On Hold” due to the following (preferred):
     Stabilization of SAN Switch Replacement Project (Polyserve known issue with loss of connectivity to SAN)
     Test Environment Lock down until December 2008 due to Ts and Cs, MarkeTrak, and Nodal
     Resource constraints due to Ts and Cs, MarkeTrak, and Nodal
     Eliminate additional Finance charges by placing project on Hold
     Allow project to move forward in 2009 with implementation that will deliver Failover capabilities (High Availability and Redundancy Goal of SCR)
2) Close project and complete effort as O & M:
     Additional funding will be required for remaining efforts
     Total Project estimated at $1M, approved by Board in 2005
     Committed approximately $885K; will require Board approval for additional funding

4 PR60006_01 Phase 2 ERCOT Update - Outages

Retail Transaction Processing Unplanned Outages by # of Incidents*

Year    NAESB   Seebeyond / TIBCO   Paperfree   Siebel   TML   Retail Databases
2004       15                   8           5        3     6                  7
2005        8                   6           2        1     1                  0
2006        2                   1           2        0     0                  1
2007        3                   4           4        1    11                  2
2008        1                   0           3        0     6                  0
Total      29                  19          16        5    24                 10

* Based on IT Incident Report on 04/02/2008 and Metrics in SCR745

5 PR60006_01 Phase 2 ERCOT Update - Outages

Retail Transaction Processing Unplanned Outages by Approx. # of Minutes*

Year    NAESB   Seebeyond / TIBCO   Paperfree   Siebel    TML   Retail Databases
2004     6778                5842         434      450     80               5600
2005      528                 104         648      120    120                  0
2006       55                 540         847        0      0                502
2007      232                 310         680      210   1031                820
2008      131                   0         492        0    743                  0
Total    7724                6796        3101      780   1974               6922

* Based on IT Incident Report and SCR Metrics

6 PR60006_01 Phase 2 ERCOT Update - PF Outage Details (3yrs)

PaperFree Availability Metrics Prior to March 2008 as a result of 2007 Intermediate Resolutions
  Previous logged incident for PaperFree file server: 02/2007.
  Until March 2008, the Paperfree Application was 100% available due to intermediate solutions (meeting SCR Goal for reliability).

Columns: Issue Date; Duration (mins); SLA Impacted; Application Impacted; Issue Description; Root Cause; Service Impact; Service Impact Detail

9/25/06; 829; Retail; Paperfree; Paperfree File Server not responding; Infrastructure; Outage; Unplanned Outage
10/2/06; 18; Retail; Paperfree; Paperfree File Server network outage; Infrastructure; Outage; Unplanned Outage
1/3/07; 130; Retail; Paperfree; Memory failure in the clustered environment; Infrastructure; Outage; Unplanned Outage
1/5/07; 270; Retail; Paperfree; Problem pulling data from NAESB; Infrastructure; Outage; Unplanned Outage
1/8/07; 195; Retail; Paperfree; Attempted to replace the Paperfree architecture as identified by the on-going Paperfree issues analysis; Infrastructure; Outage; Unplanned Outage
2/7/07; 85; Retail; Paperfree; Connectivity issue between application and SAN; Infrastructure; Outage; Unplanned Outage
3/19/08; 147; Retail; Paperfree; SAN Hardware failure; Infrastructure; Outage; Unplanned Outage
3/20/08; 105; Retail Market Degradation Issues Post SCR745 Phase 2 solution; Polyserve Application/PF; Outage; Unplanned Outage
3/22/08; 240; Retail Market Rollback from SCR745 Phase 2 implementation; Polyserve Application/PF; Outage; Unplanned Outage
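
The incident durations on this slide can be cross-checked against the yearly Paperfree minutes reported on slide 5 (847 for 2006, 680 for 2007, 492 for 2008). The following is a minimal sketch of that arithmetic, not part of the original presentation; the function name minutes_by_year and the two-digit-year handling are assumptions made for illustration.

```python
from collections import defaultdict

# (issue date, duration in minutes), copied from the incident rows on this slide
incidents = [
    ("9/25/06", 829),
    ("10/2/06", 18),
    ("1/3/07", 130),
    ("1/5/07", 270),
    ("1/8/07", 195),
    ("2/7/07", 85),
    ("3/19/08", 147),
    ("3/20/08", 105),
    ("3/22/08", 240),
]

def minutes_by_year(rows):
    """Sum outage minutes per calendar year (hypothetical helper)."""
    totals = defaultdict(int)
    for date, minutes in rows:
        # Dates are M/D/YY; assume every two-digit year is 20YY.
        year = 2000 + int(date.rsplit("/", 1)[1])
        totals[year] += minutes
    return dict(totals)

print(minutes_by_year(incidents))
# prints {2006: 847, 2007: 680, 2008: 492}
```

The per-year sums agree with the Paperfree figures for 2006 through 2008 in the minutes table, which suggests the detail rows are the source of those totals.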

7 PR60006_01 Phase 2 ERCOT Update - TDTWG Recommendations Discussion

