Backup Methods For a Hot Site Dieter W. Storr Los Angeles Times 23 August 2005
B/R Methods: Existing Backup Method; Experiences; Mirroring or Replicating; Fast Copy of Data; Proposals and Costs; Future Technology; Lessons Learned
Existing Backup Method: From disk (databases), copy to 3490 / / VTS. Then, copy to (cartridge).
ADABAS Back-up at LA Times
B/R Methods. Source: Companies that relied on tape or on third-party provider found in many cases they had difficulty meeting their recovery time objectives.
B/R Methods (Source: SearchSecurity.com, 15 Apr 2004): Flaws in tape-based data backup may be leaving enterprises without key information and could lead to legal exposure under emerging laws such as Sarbanes-Oxley, say data backup and recovery experts.
B/R Methods (Source: SearchSecurity.com, 15 Apr 2004): In a survey of 500 IT departments completed … found that as many as 20% of routine, nightly backups fail to capture all data; 40% of IT managers had been unable to recover data from a tape when they needed it; more than 23% sought to use data stored on tape backups more than 20 times in a year.
B/R Methods: Are tapes really so bad? LA Times experiences?
Tape Problems, 1 November 2002: Six tape drive errors. Impact: delay.
Tape Problems, 24 March 2003: Only two channel paths per tape controller were provided. Impact: slow restore time.
Tape Problems, 5 October 2003: 3590 tape drives were not defined to DFSMS (SMS). Impact: ADABAS restore and application test cancelled.
Tape Problems, 6 December 2003: VTS problems with GDG datasets. Impact: end-user functions couldn't be tested.
Tape Problems, 5 August 2004: Restore jobs had to wait for an input tape that was being used by another restore job. Impact: delay.
Tape Problems, 30 October 2004: Packages didn't arrive in time, due to a thunderstorm that affected FedEx delivery. Impact: major delay.
Tape Problems, 30 October 2004: Automated tape library experienced unit address problems during the restore process. Impact: delay.
Tape Problems, 30 October 2004: VTS logical tapes were not shipped to Wood Dale (HSM level 2, SAR level 2). Impact: delay.
Tape Problems, 30 October 2004: Confusion about when to load DRP1 and DRP2 tapes, before or after IPL. Impact: delay.
Tape Problems, 30 October 2004: ICIS libraries were not backed up to tape. Impact: application tests were not possible.
Tape Problems, 8 December 2004: Load problems; tapes were loaded before IPL and not after IPL. Impact: major delay.
Tape Problems, 8 December 2004: Experienced problems when trying to restore MIG1 data, e.g. DRADABC0 job. Impact: major delay.
Tape Problems, 8 December 2004: Recalled tapes were sent by FedEx to SunGard; one damaged package arrived without tapes. Impact: DATA was restored one generation back (-1) while the system was at generation (0).
Tape Problems, 21 March 2005: Level 2 tapes for VTS were not being sent off-site (although they were on the list). Impact: application team couldn't test all data.
Tape Problems, 5 August 2005: Cartridges ejected, not found: DSS8370W - TMS SHOWS TAPE N00318 OUT OF AREA DRP1,SLOT. Impact: delay.
Time Warner employee data missing (CNN, New York, May 2, 2005, 5:51 PM EDT): Time Warner Inc. said Monday that data on 600,000 current and former employees stored on computer backup tapes was lost by an outside storage company and that the Secret Service is now investigating.
Lost Backup Tape Held Ameritrade Client Data (LA Times, Wednesday, April 20): … package was damaged during shipping between vendors … fourth tape is still missing … The tapes may have included customers' Social Security numbers …
Info On 3.9M Citigroup Customers Lost (CNN.com, Monday, June 6, 2005): Citigroup, the nation's biggest financial services company, said that UPS lost the tapes while shipping them to a credit bureau in Texas.
Costs for Tape Backups: SunGard recovery services; offsite tape storage; tape handling; shipping per test; special extra pick-ups. Yearly: $150,000.
Costs: Not being able to restore for one day: $$ ??? Last December: 2 weeks to manually rebuild (?) customer tables. Does it make sense to restore more than 2 days back?
Costs, example: 20 employees x $140 per day x 10 days = $28,000. And they couldn't work on other projects. $140 is based on $51,100 yearly income.
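The labor-cost example above can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and dividing the yearly income by 365 calendar days is an assumption inferred from the fact that $51,100 / 365 gives exactly the $140 quoted on the slide.

```python
def daily_rate(yearly_income: float, days_per_year: int = 365) -> float:
    """Daily cost of one employee (assumption: income spread over 365 days)."""
    return yearly_income / days_per_year


def rebuild_labor_cost(employees: int, rate_per_day: float, days: int) -> float:
    """Labor cost of a manual rebuild: employees x daily rate x days."""
    return employees * rate_per_day * days


rate = daily_rate(51_100)                 # figure from the slide
cost = rebuild_labor_cost(20, rate, 10)   # 20 employees, 10 working days
print(f"daily rate: ${rate:,.0f}, rebuild cost: ${cost:,.0f}")
```

The figure covers direct labor only; the lost work on other projects mentioned on the slide is not included.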
Quantitative Risk Analysis, Single Loss Expectancy: SLE = Single Loss Expectancy; EF = Exposure Factor, for example 50% or 0.50; AV = Asset Value, for example $1,000,000. SLE = EF x AV, so SLE = 0.5 x $1,000,000 = $500,000.
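The Single Loss Expectancy formula above is simple enough to express directly. The sketch below uses a hypothetical function name and adds a range check on the exposure factor, which is an assumption of this example rather than something stated on the slide.

```python
def single_loss_expectancy(exposure_factor: float, asset_value: float) -> float:
    """SLE = EF x AV, with EF given as a fraction between 0 and 1."""
    if not 0.0 <= exposure_factor <= 1.0:
        raise ValueError("exposure factor must be between 0 and 1")
    return exposure_factor * asset_value


# Values from the slide: EF = 50%, AV = $1,000,000
sle = single_loss_expectancy(0.5, 1_000_000)
print(f"SLE = ${sle:,.0f}")
```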
B/R Methods, Reducing tapes: Stacking datasets to cartridges; using Delta Save Facility from ADABAS; using Forward Index Compression (FIC) from ADABAS; using larger block size for 3590 tapes = 256K, supported by ADABAS.
Delta Save Facility (DSF)
B/R Methods, Forward Index Compression (Rochester Gas & Electric): Space savings: Normal Index 37% - 55%, Upper Index 21% - 69%. Within an index block, the part of the index value that is identical to the forward part of the previous index value is suppressed.
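The idea described above (suppressing the leading part of a value that repeats the previous value) is commonly known as front or prefix compression. The following is a generic sketch of that technique on sorted strings, not the actual ADABAS index block format; the function names and the sample values are hypothetical.

```python
def front_compress(sorted_values):
    """Store each value as (shared prefix length with previous value, remaining suffix)."""
    compressed = []
    previous = ""
    for value in sorted_values:
        shared = 0
        limit = min(len(previous), len(value))
        while shared < limit and previous[shared] == value[shared]:
            shared += 1
        compressed.append((shared, value[shared:]))
        previous = value
    return compressed


def front_decompress(compressed):
    """Rebuild the original values from (prefix length, suffix) pairs."""
    values = []
    previous = ""
    for shared, suffix in compressed:
        value = previous[:shared] + suffix
        values.append(value)
        previous = value
    return values


index_values = ["SMITH", "SMITHSON", "SMYTHE"]  # hypothetical index entries
packed = front_compress(index_values)
print(packed)  # [(0, 'SMITH'), (5, 'SON'), (2, 'YTHE')]
```

The savings depend on how much consecutive values overlap, which is consistent with the wide ranges quoted on the slide.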
B/R Methods: IBM Magstar 3494 / Virtual Tape Server (VTS) at LA Times and SunGard
B/R Methods, VTS problems at LA Times: Completion code A78 RC 18. We switched from VTS to cartridges.
B/R Methods, VTS problems at Virginia Information Technologies Agency: Ran into the same problem in 2003/2004, system completion code A78 RC 18. "We … converted … to 3490/3590 physical tapes." Problem solved.
B/R Methods, Disk to Disk: Mirroring (hardware, software); Replicating (software)
B/R Methods, Enterprise Server: Enterprise Server; UNIX; NT / 2000 / XP; Hot Site
B/R Methods, Open System: Hot Site
B/R Methods: Marty Stewart, Disaster Recovery Manager, AnMed Health: "…we'd rather have a server that's running slower than having no server at all."
Disk Mirroring, Benefits: Asynchronous disk mirroring can provide better physical protection by supporting extended physical distances. No loss of committed transactions in synchronous storage (mirroring/RAID) on a CPU failure.
Disk Mirroring, Limitations: No protection from data corruption. With asynchronous mirroring, the secondary site is not guaranteed to be transactionally consistent. Client applications must be restarted after a failure and need to be aware of the failure.
Disk Mirroring, Limitations: Synchronous mirroring and RAID devices can add overhead to application performance. Redundant or specialized high-availability hardware and software can be expensive and restricted to use for backup purposes only.
Disk Mirroring, Limitations: The secondary copy of the data is not available for use, which means low hardware utilization. Everything on disk must be replicated; there is no selectivity of data replication.
Example for Disk Mirroring: Main platform (S/390, UNIX) and back-up / hot site (S/390, UNIX), EMC 5700, SRDF remote mirrored and synchronized over an OC-3 link.
B/R Methods: Can we buy used Enterprise Servers? Yes, and inexpensively. The operating system is free for D/R. Search for sellers of used mainframes.
Dedicated line broadband speeds and prices (Source: prices updated 12 May 2005): T1 (24 DS0 lines), average cost $400-$650/mo.; T3 (28 T1s), average cost $6,000-$16,000/mo.; OC3 (100 T1s), average cost $20,000-$45,000/mo.; OC12 (4 OC3s), no price; OC48 (4 OC12s), no price; OC192 (4 OC48s), no price.
Peer-to-Peer Remote Copy Extended Distance (PPRC-XD): PPRC = 60 miles; PPRC-XD = continent. ESS Shark = IBM ESS DASD; HDS also supports PPRC. ESS Shark FlashCopy; also see TimeFinder from EMC.
External Back-up Systems, Fast Copy of Data: Snapshot: no data movement; a virtual copy is made by copying pointers. Copy process: the physical copy is made asynchronously from the logical copy, with no impact on applications working on the original data.
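The "virtual copy by copying pointers" idea can be illustrated with a toy in-memory volume. This is only a conceptual sketch under stated assumptions: the `Volume` class and its methods are hypothetical and do not model FlashCopy, TimeFinder, or any real storage controller.

```python
class Volume:
    """Toy volume: a list of references to immutable data blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def snapshot(self):
        # Copies only the references (pointers); no block data is moved.
        return tuple(self.blocks)

    def write(self, index, data):
        # An update replaces the pointer in the source volume only;
        # the snapshot keeps pointing at the old block.
        self.blocks[index] = data


volume = Volume([b"block-0", b"block-1", b"block-2"])
snap = volume.snapshot()            # instant logical copy
volume.write(1, b"block-1-new")     # application keeps updating the source

# The physical backup can now be written from the snapshot at leisure.
backup = b"".join(snap)
print(backup)
```

Because the snapshot shares unchanged blocks with the source, taking it is fast regardless of volume size; only the later physical copy moves data.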
External Back-up Systems, Fast Copy of Data: Specific hardware required; software works only with the hardware; works on volume level; some snapshot-only tools also work on dataset level.
Snapshot & Physical Copy: IBM: hardware Enterprise Storage Server, software FlashCopy. EMC²: hardware Symmetrix Remote Data Facility, software EMC TimeFinder.
Flash Copy
How It Works (Source: SAG): The source data is normally in read / update mode. During a pre-defined time window between Suspend and Resume it is read only (update requests are queued) while the snapshot is taken; the physical backup is then made from the snapshot. Example: ADADBS TRANSACTIONS SUSPEND,TTSYN=60,TRESUME=120
Replication, Benefits: Warm standby systems can be configured over a Wide Area Network, providing protection from site failures. Ability to swap more quickly to the standby system in the event of failure, as the backup database is already on-line. Data corruption is typically not replicated, as transactions are logically reproduced rather than I/O blocks mirrored.
Replication, Benefits: Automatic switch-over for clients using a switching mechanism; no client restart needed. Originating applications are minimally impacted, as replication takes place asynchronously after commit of the originating transaction. The warm standby database is available for read-only operations, allowing better utilization of backup systems.
Replication, Benefits: Ability to resynchronize and easily switch back to the primary system when it becomes available, without loss of data.
Replication, Limitations: The warm standby system will be out of date by the transactions committed at the active database that have not yet been applied to the standby. Protection is limited to components supporting warm standby (e.g. DBMS data sources may be protected but file systems may not be supported).
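The trade-off described above (replication after commit, so the standby lags by whatever has not yet been applied) can be sketched with a simple queue. The classes below are hypothetical illustrations and are not the interface of any ADABAS or vendor replication product.

```python
from collections import deque


class Primary:
    """Active database: commits locally, then queues the transaction for shipping."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # committed but not yet applied at the standby

    def commit(self, key, value):
        self.data[key] = value              # commit completes without waiting
        self.pending.append((key, value))   # replication happens afterwards


class Standby:
    """Warm standby: applies queued transactions logically, in commit order."""

    def __init__(self):
        self.data = {}

    def apply_from(self, primary, max_transactions):
        applied = 0
        while primary.pending and applied < max_transactions:
            key, value = primary.pending.popleft()
            self.data[key] = value
            applied += 1
        return applied


primary, standby = Primary(), Standby()
for number in range(3):
    primary.commit(f"customer-{number}", "updated")

standby.apply_from(primary, max_transactions=2)
lag = len(primary.pending)  # transactions the standby would lose on failover now
print(f"standby is behind by {lag} transaction(s)")
```

If the primary site failed at this point, the one transaction still in the queue is exactly the exposure the limitation above refers to.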
Entire Transaction Propagator: The Entire Transaction Propagator allows for asynchronous data replication. Replicated data can be updated and synchronized with master data at user-specified intervals.
ADABAS Data Replication: Logical dissemination of ADABAS data to homogeneous or heterogeneous targets; near real-time propagation; event driven at the transaction level; implemented at the database/file level for Store, Delete and Update commands; replication rules defined through subscriptions; minimal impact on normal nucleus activity; strategic for enterprise data sharing; replaces Entire Transaction Propagator.
ADABAS Data Replication: Origin DBMS file on z/OS Image A; targets: field, field, DBMS, table on z/OS Image B, z/OS Image C, Unix Server D.
Possible Hot Site Solutions: Enterprise Server Los Angeles and own Enterprise Server Hot Site; Shark to Shark over OC3; EMC to EMC over OC3; OC3 converter; ESCON; FICON; fiber optic.
Costs for Tape Backups: SunGard recovery services; offsite tape storage; tape handling; shipping per test; special extra pick-ups. Yearly: $150,000.
Costs for Real Disaster: SunGard declaration fee; D/R site daily usage fee; office space daily usage fee; work group declaration fee; work group daily usage fee; LAN bridge declaration fee; LAN bridge daily usage fee. 30 days: $475,000.
Costs for Own Hot Site: Used IBM Z800-0X2 mainframe; used IBM 2105-F20 Shark storage; used IBM 3494 library, VTS and tape drives; 3rd party next-day HW maintenance; printer and terminal controller re-location costs; 3490 tape drive and controller re-location costs; other costs. 1st year: $520,000. Total after 5 years: $735,000.
Costs for Own Hot Site: 5 years SunGard = $750,000; 30 days real disaster = $475,000; 5 years own facility = $735,000.
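The comparison above can be reproduced from the figures on the cost slides. This is a rough sketch: the constant and function names are hypothetical, and adding the 30-day disaster fee on top of the yearly tape-backup costs is an assumption about how the two SunGard figures combine.

```python
SUNGARD_YEARLY = 150_000             # yearly tape backup and recovery services
SUNGARD_DISASTER_30_DAYS = 475_000   # fees for one 30-day real disaster
OWN_SITE_5_YEARS = 735_000           # own hot site, total after 5 years


def sungard_total(years: int, disasters: int = 0) -> int:
    """Vendor cost over a period, optionally including declared disasters."""
    return years * SUNGARD_YEARLY + disasters * SUNGARD_DISASTER_30_DAYS


no_disaster = sungard_total(5)
one_disaster = sungard_total(5, disasters=1)

print(f"SunGard, 5 years, no disaster:  ${no_disaster:,}")
print(f"SunGard, 5 years, one disaster: ${one_disaster:,}")
print(f"Own hot site, 5 years:          ${OWN_SITE_5_YEARS:,}")
print(f"Difference without disaster:    ${no_disaster - OWN_SITE_5_YEARS:,}")
print(f"Difference with one disaster:   ${one_disaster - OWN_SITE_5_YEARS:,}")
```

Without a declared disaster the two options cost about the same over five years; the gap opens only when the disaster fees apply.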
Restore Times (Min)
Benefits of Own Hot Site: Financial savings > $150,000 annually, providing an almost 5% ROI; reduced recovery time; reduced impact due to road and airport closures; elimination of reliance on external vendors; mainframe and open system can use the same facility.
Grid Computing: virtual machine; virtual memory; virtual storage; virtual I/O
Grid Computing: Resource Broker; Replica Catalog; Information Service; Computer Element(s); Storage Element(s); Hardware; Software; Locations; User Interface; Passkey
Grid Computing: BBC builds distributed grid for content sharing (Gridcast) /publicdocs/ppt/prompeg307.ppt
Grid Computing For Backup? Intra or Extra Grid? Pull or Push? Grid Software
Backup Methods Mostly Used by Other Companies (Source: DRJ Magazine): VTS; disk to disk, which is more and more common for enterprise storage servers and AIX server technology, for example.
B/R Methods, Problems for other companies: High third-party hot site costs, approx. $10,000 - $70,000 per month; restore time hours.
How Far is Far Enough? Alternate facility; offsite storage facility. Answer = 105 miles, according to the survey.
Lessons Learned: Distance is key; streets, bridges, tunnels, airports are closed; tape recovery is not effective; all applications are critical; inconsistent back-up is no back-up at all.
Lessons Learned: People-dependent processes do not suffice; two sites are not enough; people are hard to replace but information is irreplaceable.
…we should have an excellent HOT SITE!