Experiences with D/R Procedures


1 Experiences with D/R Procedures
Of ADABAS Data on Mainframes
Natural Conference, Boston
Dieter W. Storr, May 2004

2 Dieter W. Storr -- info@storrconsulting.com

3 Different Disaster Different Action
Unplanned downtime: machine outages, network outages, software failures
Disaster: site / data center loss, catastrophic failure

4 Leading Causes of Downtime
Source: DRJ Summer 2002, Volume 15, Number 3
Power outage
Storm damage
Flood
Terrorism / sabotage

5 Other Causes of Downtime
Fire
Earthquake
Computer crime

6 LA Times Downtime: Flood Damage
21 April 2002: Water flooded the Orange County facility after a 14-inch pipe that supplies the fire-sprinkler system burst. Half the facility stood in more than a foot of muddy water.
Affected areas: editorial, ad ops, IT, HR
ADABAS was not affected

7 LA Times Downtime: Bomb Alarm
14 June 2002: A bomb was believed to have been left in the Bank of America branch that is set into the Times Building.
Security swept the building; the DBAs observed the system from home.

8 LA Times Downtime: Bomb Alarm
29 July 2002: An intruder claimed to have a bomb and darted into the garage.
Security swept the building.
Operations (OP) stopped CA7, so PLOGCOPY couldn't start automatically. Two PLOGs filled up and ADABAS was locked.
The DBAs later started the PLCOPY jobs manually.

9 LA Times Downtime: Power Outage - 29 August 2002 (3:43 P.M.)
City (DWP) power grid: a flood leaked into a DWP transformer.
There were actually two spikes/outages: the first started the UPS switchover, which was interrupted by the second, which took the UPS down.

10 LA Times Downtime: Power Outage (cont'd)
The network was back in service after a short delay.
Our Unix-based servers were restarted and checked.
There was no evidence of damage to the Sybase Adaptive Server Enterprise (ASE, formerly Sybase SQL Server) servers.

11 LA Times Downtime: Power Outage (cont'd)
Mainframe recovery was delayed by corruption of the Hardware Management Console (HMC).
OP did a power-on reset, which restored the HMC.
Operations IPLed, and Technical Support proceeded with system checkout procedures.
Although the Enterprise Storage Server (ESS) had an error indicator, it was still up and did not add to any outages.
IBM reset the error indicator without impact.

12 LA Times Downtime: Power Outage (cont'd)
Started ADABAS servers manually: Parm Error 23, DIB block remained after an abnormal termination
Started all servers with IGNDIB=YES
18:25 ADABAS IS ACTIVE
No ADAN58 message

13 LA Times Downtime: ADAN58 Message (ADA71: ADAN5A)
ADAN58 BUFFER-FLUSH START RECORD DETECTED DURING AUTORESTART.
THE NUCLEUS WILL T E R M I N A T E AFTER AUTORESTART.
IN CASE OF POWER FAILURE, THE DATABASE MIGHT BE INCONSISTENT BECAUSE OF PARTIALLY WRITTEN BLOCKS.
O N L Y IN THIS CASE, REPAIR THE DATABASE BY RESTORE AND REGENERATE; OTHERWISE RESTART THE NUCLEUS.
ADAN5A: FILES MODIFIED DURING AUTORESTART: files

14 Power Failure During Buffer Flush
[Figure: an old block, the updated block, and the partially updated block left on disk when power fails during the buffer flush]

15 Nucleus Restart After Power Failure - IGNDIB=YES
<snip>
ADA User exit 2 active.
ADA PLOG2 closed. ADAP3X2P submitted.
ADAN PROTECTION-LOG PLOGR1 STARTED
ADAN NUCLEUS-RUN WITH PROTECTION-LOG 00677
ADAL :25:18 CLOGRS IS ACTIVE
ADAN ADABAS COMING UP
ADAN5A FILES MODIFIED DURING AUTORESTART:
ADAN5A
ADAN5A
ADAN5A
ADAN RUNNING WITH ASYNCHRONOUS BUFFERFLUSH
ADAN8Y FILE-LEVEL CACHING INITIALIZED
ADAN ADABAS DYNAMIC CACHING ENVIRONMENT ESTABLISHED.
ADAN A D A B A S V IS ACTIVE
ADAN MODE = MULTI I S O L A T E D
ADAN RUNNING WITHOUT RECOVERY-LOG
ADA User exit 8 active.

16 LA Times Downtime: Power Outage (cont'd)
Switched all PLOGs
Checked batch and online
There was no evidence of damage to any of the ADABAS components.

17 Other LA Times Disasters
1965: Watts riots
1971: Sylmar quake 6.5
1987: Whittier quake 5.9
1992: LA riots
1994: Northridge quake 6.7
6 Feb 1998: El Niño, flooding in B-1 computer room
15 April 1999: Power failure ‘news editing’

18 ADABAS Recovery: Command Log (CLOG) Failure - I/O Error
Restore or reallocate/format the CLOG
ADABAS will come up through Autorestart normally
No data loss if CLOG is not used

19 ADABAS Recovery: Protection Log (PLOG) Failure - I/O Error
Restore or reallocate/format the PLOG
Take a full back-up of the database
ADABAS will come up through Autorestart normally
Restart batch jobs
Restartable batch jobs = OK
Non-restartable batch jobs = check

20 ADABAS Recovery: TEMP and SORT Failure - I/O Error
Restore or reallocate/format the TEMP/SORT dataset
Different actions for the utilities
See the ADABAS Utilities manuals

21 ADABAS Recovery: DSIM Failure - I/O Error
Restore or reallocate/format a DSIM dataset
Different actions for the utilities
See the ADABAS Utilities manuals

22 ADABAS Recovery: Recovery Aid Dataset (RLOGR / RLOGM) Failure - I/O Error
Restore or reallocate/format an RLOG dataset
Prepare the RLOG dataset
ADARAI PREPARE RLOGSIZE / RLOGDEV….
Different actions for the utilities
See the ADABAS Utilities manuals
Take a full back-up of the database
This will start the first generation of the RLOG dataset

23 ADABAS Recovery: ASSO/DATA Failure - I/O Error
Copy PLOG twice - ADARES PLCOPY
Restore or reallocate/format DATA dataset(s)
Instead of reallocating/formatting and restoring all DATA volumes, system specialists can:
  Reallocate and format the new volume
  Restore the VTOC chain
  Restore and regenerate only files that were located on the failed volume
Otherwise, . . .

24 ADABAS Recovery: ASSO/DATA Failure - I/O Error
Restore entire database
ADASAV RESTORE [OVERWRITE = for GCB]
ADASAV RESTONL [OVERWRITE] include PLOG
Start nucleus with UTIONLY=YES
Regenerate updates from end of last save (SYN2)
ADARES REGENERATE PLOGNUM=xxx
ADARES FROMCP=SYN2,FROMBLK=xxx

25 ADABAS Recovery: ASSO/DATA Failure - I/O Error
Possible utilities need to be rerun (see ADARES):
ADALOD LOAD FILE=xxx
ADALOD UPDATE FILE=xxx
ADALOD UPDATE FILE=xxx,DDISN
ADAINV INVERT FILE=xxx,FIELD=xx
Lock files to rerun utilities
ADADBS OPERCOM LOCKU=xx
Unlock utility-only status
ADADBS OPERCOM UTIONLY=NO

26 ADABAS Recovery: ASSO/DATA Failure - I/O Error
Rerun the regenerate function for the relevant files
Unlock the regenerated files
ADADBS OPERCOM UNLOCKU=xx
Don’t repeat these steps if ADARES points out:
ADALOD LOAD FILE=nn
ADARES REGENERATE FILE=nn
ADADBS REFRESH FILE=nn
Nucleus is ready

27 ADABAS Recovery: WORK 1 Failure - I/O Error
Restore or reallocate/format the WORK dataset
Restore and regenerate the entire database to avoid inconsistencies: open transactions
See ASSO/DATA failure

28 ADABAS Recovery: WORK 2/3 Failure - I/O Error
End the database normally (ADAEND) to avoid open transactions in part 1 of WORK
Restore or reallocate/format the WORK dataset
Restart the database normally
If the database abends, restore and regenerate the entire database - see ASSO/DATA failure

29 ADABAS Recovery: Failure in Data Storage Blocks
//DDSIIN DD DSN=SAVE.SIBA….
// DD DSN=PLCOPY.LOG1…
// DD DSN=PLCOPY.LOG2…
//DDCARD DD *
ADARES REPAIR DSRABN=xxx-yyy
ADARES FILE=n1,n2,n3
Failure in DSST
ADADCK DSCHECK FILE=xxx
ADADCK REPAIR DS
CALL SAG ! !

30 ADABAS Recovery: Nucleus Ends With RC 77
Not restartable
No more space for Checkpoint File (CP)
Rename old WORK
Allocate/format new WORK with old space
Change high-used RABN and high-used ISN
Restart nucleus with new WORK and UTIONLY=YES
Nucleus is in “crippled mode” - no user has access
Expand the database
Stop the nucleus normally
Rename old WORK and restart the nucleus with old WORK (autorestart)

31 ADABAS Recovery: Nucleus Ends With RC 77
Not restartable
No more space for user files
Rename old WORK
Allocate/format new WORK with old space
Restart nucleus with new WORK and UTIONLY=YES
Nucleus is in “crippled mode” - no user access
Expand database
Stop nucleus normally
Rename old WORK and restart nucleus with old WORK (autorestart)

32 ADABAS Recovery: Nucleus Abends - Missed DE Values
Descriptor is marked in FDT as DE; the value doesn’t exist in ASSO, but does in DATA.
Check:
ADAICK ICHECK FILE=xxx[,NOOPEN]
ADAVAL VALIDATE FILE=xxx,DESCRIPTOR=yy
Solution 1:
ADAULD UNLOAD FILE=xxx,UTYPE=EXF
ADALOD LOAD FILE=xxx,LWP=yyyyK
Solution 2:
ADADBS RELEASE FILE=xxx,DESCRIPTOR=yy
ADAINV INVERT FILE=xxx,FIELD=yy,LWP=...
CALL SAG ! !

33 Back-up Possibilities
ADASAV to tape / disk, including Fast Dump Restore, DFDSS
Delta Save Facility (DSF)
Delta Save QDUMP (Legent)
Disk mirroring (hardware level)
FlashCopy of Enterprise Storage Server (ESS)
Peer-to-Peer Remote Copy Extended Distance (PPRC-XD)
OC-3 links two EMC disc arrays
Replication
Stand-by systems
Restore and Regenerate
Entire Transaction Server
Shark's remote copy capability, called Peer-to-Peer Remote Copy (PPRC), lets administrators replicate data between Shark systems up to 60 miles away from each other for disaster-recovery purposes. It works with OS/390, Unix, Windows NT and 2000, and NetWare. Linux support is available on request. PPRC is similar to EMC's Symmetrix Remote Data Facility, which works over ESCON and IP. PPRC works over ESCON. PPRC starts at $53,000 for systems with 1 to 2 terabytes of data; FlashCopy starts at $35,000 for systems with 2 terabytes; and Fibre Channel adapters that allow connectivity start at $10,000.

34 ADABAS Disaster Recovery
How to back-up
Collect recovery data
Restore w/o nucleus
Start nucleus w/ UTIONLY=YES
Regenerate w/ nucleus
Switch UTIONLY=NO
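The order of the recovery steps matters: the restore runs without the nucleus, the regenerate runs with the nucleus in utility-only mode, and users are let back in last. A minimal sketch of that ordering as a checklist follows. The step wording paraphrases the slide; the `RecoveryRunbook` class and its methods are hypothetical illustrations, not part of any ADABAS tooling.

```python
from dataclasses import dataclass, field

# Ordered recovery steps, paraphrased from the slide. The class and method
# names are hypothetical illustrations, not part of any ADABAS tooling.
STEPS = (
    "restore database without nucleus",
    "start nucleus with UTIONLY=YES",
    "regenerate with nucleus",
    "switch to UTIONLY=NO",
)


@dataclass
class RecoveryRunbook:
    done: list = field(default_factory=list)

    def next_step(self):
        """Return the next pending step, or None when recovery is complete."""
        return STEPS[len(self.done)] if len(self.done) < len(STEPS) else None

    def complete(self, step):
        """Mark a step done; refuse steps attempted out of order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.done.append(step)


runbook = RecoveryRunbook()
runbook.complete("restore database without nucleus")
print(runbook.next_step())
```

Attempting, say, the regenerate before the nucleus start raises `ValueError`, which is the kind of guard a people-dependent procedure lacks.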

35 ADABAS Back-up at LA Times
[Timeline figure, 21:00 to 12:00. Labels: Weekly; ADAP1BKF Online SAVE; ADAP1PLC (FEOFPL) PLOG Switch; ASSO / DATA / WORK / etc. DFDSS Full-Volume Back-up; BRM/ABARS Several Jobs (PDS, GDGs etc.); ADAP1BKO Copy Tapes; Pick-up by Recall]

36 Production Database Back-ups
ADASAV SAVE BUFNO=2,TTSYN=60
Record format : VB
Record length :
Block size :
BUFNO=30

37 Back-up to SMS Disk Pool
Run times are consistently at least 80% lower when writing to disk instead of cartridge
Run times are consistently around 60% lower when copying from disk to cartridge (compared with cartridge to cartridge)
DFSMShsm: automate your storage management tasks (SMS Production Storage Pool)

38 Back-up to Disk Pool
No cartridge errors
No cartridge drive errors
No cartridges get accidentally ejected from the silo
Smaller back-up window
Smaller maintenance windows
Less impact on application processes
Greater confidence that the data you need will be there when you need it

39 IBM Magstar 3494/Virtual Tape Server
Linear design frames
Configuration flexibility: SCSI, FC, ESCON, FICON; 3590, 3490E, VTS
High availability: dual robotics, dual library manager
More than 42 old 3490 cartridges fit on one new 3494 cartridge
Five 3390 volumes fit on one 3494 cartridge
One 3494 cartridge can be read into the VTS disk cache (RAID-5) in 45 seconds

40 Virtual Tape Concept
Virtual tape drives
  Appear as multiple 3490E tape drives
  3490E Media 1 and 2 support
  Shared / partitioned like real tape drives
Tape Volume Caching
  All data access is to cache
  Improves ‘mount’ performance
  LRU cache management
Volume Stacking
  Fully utilizes physical cartridge capacity
  Reduces physical cartridge requirement
  Reduces footprint requirement
[Figure: virtual drives 1 to n (addresses 180, 181, . . . 19F) with virtual volumes 1 to n in the Tape Volume Cache; logical volumes 1 to n stacked on Magstar 3590 cartridges, 30/60 GB capacity assuming 3:1 compression]

41 Performance Tests

42 Collecting Data For Recovery
Block Ranges SYN1 - SYN2 For ADASAV RESTORE
From ADASAV SAVE:
PROTECTION LOG PLOGNUM=64, SYN1=4695, SYN2=4698
From ADAREP:
SYN UTI :00: DUAL ADAP1BKF
SYNP 06 UTI :00: DUAL ADAP1BKF
SYN UTI :01: DUAL ADAP1BKF
SYNV 0A UTI :01: DUAL ADAP1BKF
SYNV 0A UTI :01: DUAL ADAP1BKF
SYNV 28 UTI :02: DUAL ADAP1PLC
SYNP 28 UTI :02: DUAL ADAP1PLC
<snip>
EOD ET :30: DUAL ADAPRREP
SYNS 53 ET :30: DUAL ADAP1REP
SYNV 28 UTI :30: DUAL ADAP1PLC
SYNP 28 UTI :30: DUAL ADAP1PLC

43 Collecting Data For Recovery
Block Ranges SYN2 - End For ADARES REGENERATE
From ADAREP:
SYN UTI :00: DUAL ADAP1BKF
SYNP 06 UTI :00: DUAL ADAP1BKF
SYN UTI :01: DUAL ADAP1BKF
SYNV 0A UTI :01: DUAL ADAP1BKF
SYNV 0A UTI :01: DUAL ADAP1BKF
SYNV 28 UTI :02: DUAL ADAP1PLC
SYNP 28 UTI :02: DUAL ADAP1PLC
<snip>
EOD ET :30: DUAL ADAPRREP
SYNS 53 ET :30: DUAL ADAP1REP
SYNV 28 UTI :30: DUAL ADAP1PLC
SYNP 28 UTI :30: DUAL ADAP1PLC

44 Collecting Data For Recovery
Dataset Name From Back-up Job (GDG) For ADASAV RESTORE
ADABAS.PRODOFFD.DB1.BACKUP.FULL.G0842V CATALOGED

45 Collecting Data For Recovery
Dataset Names From PLOG Copy Jobs (GDG)
Matching block numbers End
For ADASAV RESTORE and ADARES REGENERATE
DDSIAUS1 OUTPUT VOLUME=WRK015, SESSION NR=64
FROMBLK= , FROMTIME= :30:24
TOBLK= , TOTIME= :01:42
ADABAS.PROD.DB1.PLOG.COPY.G7170V00
FROMBLK= , FROMTIME= :02:08
TOBLK= , TOTIME= :30:03
ADABAS.PROD.DB1.PLOG.COPY.G7171V00
DDSIAUS1 OUTPUT VOLUME=WRK004, SESSION NR=64
FROMBLK= , FROMTIME= :30:25
TOBLK= , TOTIME= :30:33
ADABAS.PROD.DB1.PLOG.COPY.G7172V00
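Matching block numbers to PLOG copy datasets is mechanical once the FROMBLK/TOBLK ranges are collected: every copy of the session whose last block is at or beyond the SYN2 block is needed for the regenerate, in block order. The following is a minimal sketch of that selection. The dataset names, session 64 and SYN2 block 4698 come from the slides; the block ranges are invented for illustration because the transcript lost the real numbers, and `PlogCopy` and `copies_for_regenerate` are hypothetical names.

```python
from typing import NamedTuple


class PlogCopy(NamedTuple):
    dsn: str       # sequential dataset written by the PLOG copy job
    session: int   # nucleus session number (PLOGNUM)
    from_blk: int  # first PLOG block in this copy
    to_blk: int    # last PLOG block in this copy


def copies_for_regenerate(copies, session, from_blk):
    """Return the copies of `session` that hold blocks >= from_blk, in block order."""
    needed = [c for c in copies if c.session == session and c.to_blk >= from_blk]
    return sorted(needed, key=lambda c: c.from_blk)


# Dataset names and session 64 come from the slide; the block ranges are
# invented for illustration because the transcript lost the real numbers.
catalog = [
    PlogCopy("ADABAS.PROD.DB1.PLOG.COPY.G7172V00", 64, 9001, 9050),
    PlogCopy("ADABAS.PROD.DB1.PLOG.COPY.G7170V00", 64, 4000, 4900),
    PlogCopy("ADABAS.PROD.DB1.PLOG.COPY.G7171V00", 64, 4901, 9000),
]

for c in copies_for_regenerate(catalog, session=64, from_blk=4698):
    print(c.dsn)
```

The resulting order is the order in which the datasets would be concatenated as regenerate input.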

46 Recovery - Part 1 - W/O Nucleus
ADASAV RESTONL
<snip>
//RESTONL EXEC ADASAVRD
//DDREST1 DD DISP=SHR,BUFNO=30,
// DSN=ADABAS.PRODOFFD.DB1.BACKUP.FULL.G0842V00
//DDPLOG DD DISP=SHR,BUFNO=30,
// DSN=ADABAS.PROD.DB1.PLOG.COPY.G7170V00
//DDKARTE DD *
ADASAV RESTONL BUFNO=2,OVERWRITE
//REPORT EXEC ADAREP
//DDKARTE DD *
ADAREP NOFILE
//

47 Recovery - Part 2
Start the ADABAS nucleus with normal JCL (UTIONLY=YES)
<snip>
ADAN PROTECTION-LOG PLOGR1 STARTED
ADAN NUCLEUS-RUN WITH PROTECTION-LOG 00064
ADAL :20:29 CLOGRS IS ACTIVE
ADAN ADABAS COMING UP
ADAN RUNNING WITH ASYNCHRONOUS BUFFERFLUSH
ADAN8Y FILE-LEVEL CACHING INITIALIZED
ADAN ADABAS DYNAMIC CACHING ENVIRONMENT ESTABLISHED.
ADAN A D A B A S V IS ACTIVE
ADAN MODE = MULTI I S O L A T E D
ADAN RUNNING WITHOUT RECOVERY-LOG
ADA User exit 8 active.
ADA ADAP1PLC submitted.

48 Recovery - Part 2 - With Nucleus
ADARES REGENERATE
<snip>
//REGEN EXEC ADARES
//DDSIIN DD DISP=SHR,BUFNO=30,
// DSN=ADABAS.PROD.DB1.PLOG.COPY.G7170V00
// DD DISP=SHR,BUFNO=30,
// DSN=ADABAS.PROD.DB1.PLOG.COPY.G7171V00
// DD DISP=SHR,BUFNO=30,
// DSN=ADABAS.PROD.DB1.PLOG.COPY.G7172V00
//DDKARTE DD *
ADARES REGENERATE PLOGDBID=215,PLOGNUM=64
ADARES FROMCP=SYN2,FROMBLK=4698
ADARES TOCP=EOD,TOBLK=00000
(the TOCP/TOBLK statement is not needed)

49 Recovery - Part 3 - With Nucleus
Lock files to re-run utilities (see regenerate report)
ADADBS OPERCOM LOCKU=fnr
or SYSAOS: A / I / L / F
or modify command /F jobname,LOCKU=fnr
Unlock utility-only status for users
ADADBS OPERCOM UTIONLY=NO
or SYSAOS: A / I / L / U
or modify command /F jobname,UTIONLY=NO

50 Recovery - Part 3 - With Nucleus
Re-run the utilities - if necessary
ADALOD LOAD / UPDATE / DDISN
ADAINV INVERT FILE=xxx,FIELD=xx
Unlock files
ADADBS OPERCOM UNLOCKF=fnr
or SYSAOS: A / I / L / F / N
or modify command /F jobname,UNLOCKF=fnr

51 Delta Save Facility (DSF)

52 Delta Save Facility

53 Delta Save QDUMP (CCA - now: TSI)

54 Disk Mirroring: Benefits
Asynchronous disk mirroring can provide better physical protection by supporting extended physical distances.
No loss of committed transactions in synchronous storage (mirroring/RAID) on a CPU failure

55 Disk Mirroring: Limitations
No protection from data corruption introduced by the hardware / software
Secondary site is not guaranteed to be transactionally consistent, because data is moved at the disk/track/sector or bit level (in the case of asynchronous mirroring).
Client applications must be restarted after a failure and need to be aware of the failure

56 Disk Mirroring: Limitations
Synchronous mirroring and RAID devices can add overhead to application performance.
Redundant/specialized high-availability hardware/software can be expensive and restricted to use for backup purposes only.
Secondary copy of data is not available for use - low hardware utilization.
Need to replicate everything on disk, no selectivity of data replication

57 Example For Disk Mirroring
[Figure: main platform (S/390 and UNIX on EMC 5700) and back-up / hot site (S/390 and UNIX on EMC 5700), SRDF remote mirrored, synchronized, over OC-3 links, 12-15 miles apart]
The customer uses EMC on the S/390 and UNIX platforms. He runs four EMC 5700s (two for each platform) that are SRDF (remote mirrored) to their back-up/hot site. So two are at the source and two are at the target, and each is connected to its corresponding mate at the alternate site. The newer technology can handle all platforms in one frame, and I believe the new frame will hold up to around 28 TB of data.
The two EMCs at the source are connected to the target site via two OC-3 links. One OC-3 link carries the equivalent of about 100 T1s. Not much else there except that the mirroring is performed at the hardware level. They run synchronized, so that each write must complete before the next write is handled. I mention this because there are several modes in which the customer may choose to run it. Their two sites are separated by about 12 to 15 miles, I believe.
In any event, this has never failed them, while I have seen other, software-related models of mirroring fail. An example is Compaq's StorageWorks for their NT platform: they have had to write programs to confirm the data is the same at both sites.
Initially, EMC was practically the only game in town back in 1998. Since then Hitachi has come on the scene, and I would expect that to have caused EMC's prices to drop. So I would explore playing the vendors against each other when pricing. Also, if you don't go with the latest technology, you can pick up EMC 5700s pretty cheaply on the used market (at least, this is my understanding).
Costs: As with anything, there are one-time costs for frame, equipment, etc. and ongoing maintenance costs for any software or hardware.

58 Dedicated line broadband speeds and prices
T1 (24 DS0 lines): avg. cost $400 to $650/mo.
T3 (28 T1s): avg. cost $6,000 to $16,000/mo.
OC-3 (100 T1s): avg. cost $20,000 to $45,000/mo.
OC-12 (4 OC-3s): no price
OC-48 (4 OC-12s): no price
OC-192 (4 OC-48s): no price
Source: prices updated 16 March 2004
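To put these line types in perspective, the sketch below derives nominal line rates and estimates how long it would take to move the 61.4 GB of database data reported in the D/R test results of this presentation. The rates are the standard nominal figures (64 kbit/s per DS0, 8 kbit/s T1 framing, 44.736 Mbit/s for T3, 51.84 Mbit/s per OC-1), not numbers taken from the slide, and the estimate ignores protocol overhead and mirroring behaviour, so treat it as a rough lower bound.

```python
DS0_KBPS = 64          # one DS0 channel
T1_FRAMING_KBPS = 8    # framing overhead added to 24 DS0s
T3_MBPS = 44.736       # nominal T3 (DS3) line rate
OC1_MBPS = 51.84       # SONET base rate; OC-n is n times this


def t1_mbps():
    """Nominal T1 rate: 24 DS0 channels plus framing."""
    return (24 * DS0_KBPS + T1_FRAMING_KBPS) / 1000


def oc_mbps(n):
    """Nominal OC-n rate."""
    return n * OC1_MBPS


def transfer_hours(gigabytes, mbps):
    """Hours to move `gigabytes` (decimal GB) at the raw line rate, no overhead."""
    bits = gigabytes * 8e9
    return bits / (mbps * 1e6) / 3600


lines = [("T1", t1_mbps()), ("T3", T3_MBPS), ("OC-3", oc_mbps(3)),
         ("OC-12", oc_mbps(12)), ("OC-48", oc_mbps(48)), ("OC-192", oc_mbps(192))]

for name, rate in lines:
    print(f"{name:7s}{rate:10.3f} Mbit/s {transfer_hours(61.4, rate):8.2f} h for 61.4 GB")
```

At T1 speed the transfer takes days rather than hours, which is why the mirroring example relies on OC-3 links.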

59 Peer-to-Peer Remote Copy Extended Distance (PPRC-XD)
PPRC = 60 miles - PPRC-XD = continent
FlashCopy
IBM ESS (Shark) DASD; HDS also supports PPRC
CNT and Inrange support PPRC-XD for Shark storage
CNT has completed interoperability testing between its UltraNet Storage Director product and IBM's new Peer-to-Peer Remote Copy Extended Distance (PPRC-XD) disk mirroring software that runs on IBM's Shark, aka TotalStorage Enterprise Storage Server. UltraNet Storage Director extends the data replication capabilities of PPRC-XD beyond the metropolitan area to long distances, allowing sites to link data centres in separate cities, across the country, and across the world. Those using both can, we're told, achieve the best possible performance of data replication transfer rates over long and short distances. UltraNet products can further consolidate SAN and LAN traffic into a unified high-performance storage network, thus providing an infrastructure for disaster recovery, data protection, data replication, and data migration needs. In addition, CNT's UltraNet Storage Director is now fully supported by IBM for all versions of IBM's PPRC software.
Separately, Inrange announced that its IN-VSN 9801 Series has been enhanced with additional functionality and has completed interoperability testing with ESS for PPRC and new PPRC-XD software, enabling users to mirror IBM storage devices across the wide area network at continental distances with better performance than previously supported by Inrange. Working with Inrange's Virtual Storage Networking (IN-VSN) strategy, the 9801 allows joint Inrange and IBM customers to utilize PPRC and PPRC-XD software over several WAN interfaces and transport data at continental distances.
PPRC-XD is asynchronous: DEVICE END is reported to the host when the data is in the primary cache, and the data is then transferred asynchronously. So the performance penalty for synchronous mirroring is avoided. Later you can make a PPRC SYNCH PAIR from that: all changes are transferred until primary and secondary are in SYNCH, then normal PPRC runs. Good for copying over long distances, hence "extended distance". There is a Redbook on ESS PPRC-XD. Also see TimeFinder from EMC.

60 External Back-up Systems
Fast Copy of Data
Snapshot: no data movement; a virtual copy made by copying pointers
Copy process: physical copy made asynchronously from the logical copy; no impact on applications using the original data
Specific hardware required: the software works only with that hardware
Work on volume level: some snapshot-only tools also work on dataset level

61 Snapshot & Physical Copy
IBM
  Hardware: Enterprise Storage Server
  Software: FlashCopy
EMC2
  Hardware: Symmetrix Remote Data Facility
  Software: EMC TimeFinder

62 How It Works
Read only: update requests are queued
Pre-defined time window
[Figure: source data is read / update until Suspend, read only while the snapshot is taken, and read / update again after Resume; the physical backup is made from the snapshot]
Source: SAG

63 Replication: Benefits
Warm standby systems can be configured over a Wide Area Network, providing protection from site failures.
Ability to swap more quickly to the standby system in the event of failure, as the backup database is already on-line.
Data corruption is typically not replicated, as transactions are logically reproduced rather than I/O blocks mirrored.

65 Replication: Benefits
Automatic switch-over for clients using a switching mechanism; no client restart needed.
Originating applications are minimally impacted, as replication takes place asynchronously after commit of the originating transaction.
The warm standby database is available for read-only operations, allowing better utilization of backup systems.

66 Replication: Benefits
Ability to resynchronize and easily switch back to the primary system when it becomes available, without loss of data.

67 Replication: Limitations
The warm standby system will be out of date by the transactions committed at the active database that have not yet been applied to the standby.
Protection is limited to components supporting Warm Standby (e.g. DBMS data sources may be protected but file systems may not be supported).

68 Entire Transaction Propagator
The Entire Transaction Propagator allows for asynchronous data replication. Replicated data can be updated and synchronized with master data at user-specified intervals.

69 OS/390 Recovery Procedures
Prepared by the Mainframe Recovery Team
Recovering:
The OS/390 platform
The ABARS aggregates
The ADABAS databases

71 OS/390 D/R Times (SUNGARD)
About 2400 tapes
Shipping time from storage to the mainframe?
4 hours ahead for tape staging
OS/390 and ABARS aggregates: 5 hours planned, 7+ hours with problems
ADABAS databases: approx. 2-3 hours for tape restore and regenerate
Next test Nov 1: approx. 45 minutes from disk pool

72 Experiences From D/R Tests
Problems IPLing on an unfamiliar CPU (6 hours duration)
Initial setup (restore SYS.. libraries)
Pre-IPL procedures (restore Adabas, work, spool volumes, etc.)
Post-IPL procedures (DFHSM in disaster mode, etc.)
Application restores
Tape drive offline problems, Import MVSCAT typo errors, etc.
Recovered wrong volumes, generation errors
Initialize work volumes - conversion to SMS (DFSMShsm)
TMC recovery problems caused BRM recovery problems, too

73 Experiences From D/R Tests
Sent wrong cartridges with system dates to storage
Fewer tape channels at our offsite (2 instead of 4) = double restore time
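The "half the channels, double the time" observation can be turned into a quick planning estimate. The sketch below assumes restore time is inversely proportional to the number of parallel tape channels, which matches the slide's experience but is only a first-order model; the 2.5-hour baseline is an illustrative value picked from the 2-3 hour range quoted for tape restore and regenerate, and `restore_hours` is a hypothetical helper.

```python
def restore_hours(baseline_hours, baseline_channels, available_channels):
    """Scale a tape restore time, assuming it is inversely proportional
    to the number of parallel tape channels (a first-order model only)."""
    if available_channels <= 0:
        raise ValueError("need at least one tape channel")
    return baseline_hours * baseline_channels / available_channels


# Illustrative baseline: 2.5 hours with 4 channels at the home site.
print(restore_hours(2.5, baseline_channels=4, available_channels=2))  # 5.0
```

Checking the channel count at the recovery site before the test lets the recovery time objective be adjusted in advance rather than discovered during the restore.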

74 Experiences From D/R Tests
RESTONL abended with SB00, no PLOG restored; the Recovery Aid flag was on in the saved database.
REGENERATE deleted a file and pointed out that the ADALOD job had to be repeated, but the input dataset had not been saved
We did a full-volume restore (DFDSS), restored the database, and forgot to format the dual protection logs.
Missed protection logs
BRM restored wrong aggregates
Missing full-volume restores - (Database 2)
Missing volumes in Work Storage Pool - (Database 3)

75 Experiences From D/R Tests
BRM: Back-up and Recovery Manager
ABARS: Aggregate Back-up and Recovery Support (ABARS = not: Air conditioning and refrigeration industry services <smile>)
Recovered (-1) aggregates instead of (0) - (all databases)
Recovered only SOME files on aggregate (0) - (Database 1)
BRM/ABARS was not properly recovered (wrong version of BRM database)
Once those problems were resolved (several hours later), the ADABAS recovery ran smoothly.
5 databases (61.4 GB) restored and regenerated in 3.5 hours (tape/cart)
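The last figure gives an effective recovery throughput that is useful when sizing future tests. A minimal sketch of the arithmetic, using decimal units (1 GB = 1000 MB) and the 61.4 GB and 3.5 hours from the slide; `throughput` is a hypothetical helper name.

```python
def throughput(gigabytes, hours):
    """Return (GB per hour, MB per second) using decimal units."""
    gb_per_hour = gigabytes / hours
    mb_per_second = gigabytes * 1000 / (hours * 3600)
    return gb_per_hour, mb_per_second


gb_h, mb_s = throughput(61.4, 3.5)
print(f"{gb_h:.1f} GB/h, {mb_s:.2f} MB/s")  # 17.5 GB/h, 4.87 MB/s
```

This rate covers restore plus regenerate from tape, so it understates the raw tape speed and should only be compared with other end-to-end recovery runs.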

76 How Far is ‘Far Enough?’ (http://www.drj.com/articles/spr03/ html)
Alternate Facility
Offsite Storage Facility
Answer = 105 miles, according to the survey

77 Lessons Learned (http://www.drj.com/articles/spr02/1502-07.html)
Distance is key
Streets, bridges, tunnels, airports are closed
Tape recovery is not effective
All applications are critical
Inconsistent back-up is no back-up at all
People-dependent processes do not suffice
Two sites are not enough
People are irreplaceable; so is information

78 Lessons Learned (http://www.drj.com/articles/spr02/1502-07.html)
Companies that relied on tape or on a third-party provider found in many cases that they had difficulty meeting their recovery time objectives
All disasters are possible

79 Helpful Links
Software AG - ADABAS Recovery <Knowledge Center - ADABAS>
ADABAS Restart and Recovery (Operations Manual) <Knowledge Center - Product Documentation>
University of Arkansas - D/R Plan
Disaster Recovery Journal

80 Helpful Links
FlashCopy
Shark (ESS)
State of the Art Storage
EMC TimeFinder
Entire Transaction Propagator (SAG)

81 Thank you!
Questions?

