Progress Database Repair & Recovery Dan Foreman BravePoint, Inc.


1 Progress Database Repair & Recovery Dan Foreman BravePoint, Inc.

2 PUG Challenge 2012 Introduction- Dan Foreman Progress User since 1984 (V2.1) Speaker at many Progress User Conferences from 1990 to 2012

3 PUG Challenge 2012 Introduction- Dan Foreman Author of: Progress Performance Tuning Guide Progress Database Administration Guide Progress System Tables Guide Progress V10 Database Admin Jumpstart Book purchase allows free online access ProMonitor – Database Monitoring Tool ProD&L - Accelerated Dump/Load Utility Balanced Benchmark – Load testing tool

4 PUG Challenge 2012 Introduction - Who Are You Progress Version: V6, V7, V8, V9, V10.0*, V10.1*, V10.2* DB OS: Unix? Windows? Linux? Is there anything else? Largest Single Database? Highest Concurrent User Count?

5 PUG Challenge 2012 Special Request Mobile Phones on Mute Please!

6 PUG Challenge 2012 Goals Can I teach you database brain surgery in an hour? Note that DBAs have one big advantage that human brain surgeons don't… do you know what that is?

7 PUG Challenge 2012 Real Horror Stories Fortune 500 Company (sorry but they would not appreciate us sharing their name) DB Corruption February 23 Last Good Backup: January 11 Last Good AI Files: January 17 We facilitated a special version of rfutil that would ignore errors during the Roll Forward process

8 PUG Challenge 2012 Real Horror Stories Customer running SCO OpenServer We had told the customer to move to a more modern OS, i.e. Linux OS Problem – Can't mount the Disks; Discovered that backups to tape were not occurring (backup to disk was OK but couldn't see the disks) Had to boot using Knoppix to repair things – took forever to reinstall SCO

9 Real Horror Stories Fortune 1000 Company HP Server Admin outsourced to IBM Backups outsourced to 3rd party 3rd party stopped doing backups, unannounced, due to non-payment DB Corrupted Restoration impossible BravePoint 2012

10 PUG Challenge 2012 Preventive Maintenance Backups (yes, I know you think you have backups but have you tested one recently?) Test your Entire Recovery Plan

11 PUG Challenge 2012 Preventive Maintenance Warm Standby Database - A database on another machine with a recent copy of the production DB Also called a D/R Database This is easy to do in Progress...covered soon

12 PUG Challenge 2012 Preventive Maintenance Unix: don't log on as root unless you really need to Use O/S security to protect the DB, BI, and AI files from accidental, casual, or intentional deletion

13 PUG Challenge 2012 Preventive Maintenance Unix: Don't use kill -9 to terminate a Self Service Progress session; You might bring the database DOWN if you happen to kill a session that is holding a Latch ALWAYS have an up-to-date Structure (.st) file available - we will see why later

14 PUG Challenge 2012 Preventive Maintenance Monitor the BI file High Water Mark Monitor 'Delinquent' Transactions (Active Transactions longer than minutes) Monitor Large Transactions (A Client with a Large Number of concurrent Record Locks) longtrx*.p Progress program on the BravePoint Website to detect Delinquent Transactions

15 PUG Challenge 2012 Preventive Maintenance Use the -bithold parameter as an extra safeguard; Set to 50% of available BI Disk Space; even in V9 V9/V10 supports Terabyte sized BI Files but extent sizes are still limited to 2gb unless you use the EnableLargeFiles option on proutil and the file system must be 2gb enabled too
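
The 50% rule for -bithold is simple arithmetic, and a minimal Python sketch may make it concrete. It assumes that "available BI Disk Space" means the free space on the file system holding the BI extents and that -bithold is expressed in megabytes; the function names and the idea of reading free space with shutil.disk_usage are illustrative, not something from the slides.

```python
import shutil

MB = 1024 * 1024


def bithold_mb(free_bytes: int, fraction: float = 0.5) -> int:
    """Return a candidate -bithold value in MB.

    free_bytes is the space still available for the BI file;
    fraction is the share of it the BI file may use (50% on the slide).
    """
    if free_bytes < 0 or not 0 < fraction <= 1:
        raise ValueError("free_bytes must be >= 0 and 0 < fraction <= 1")
    return int(free_bytes * fraction) // MB


def bithold_for_path(bi_dir: str) -> int:
    """Assumption: the BI extents live on the file system containing bi_dir."""
    return bithold_mb(shutil.disk_usage(bi_dir).free)
```

With 10 GB free, bithold_mb(10 * 1024**3) gives 5120, i.e. a threshold of roughly 5 GB, which leaves the other half for Crash Recovery growth.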

16 PUG Challenge 2012 Preventive Maintenance Monitor the Area High Water Marks to avoid growing into the Variable Length Extent There is a Performance Hit, usually insignificant but sometimes not, when growing the Variable Extent A Single Variable Extent can limit some of the recovery options discussed later

17 PUG Challenge 2012 Quiz Question Who are the Smartest DBAs in the Room?

18 PUG Challenge 2012 Answer DBAs who enabled After Imaging on their Mission Critical Databases If you're not using AI, you probably shouldn't be responsible for your company's databases Management need convincing? Play Chicken with them

19 PUG Challenge 2012 After Imaging Who is currently not using AI? If not, why not? Is public flogging or humiliation required?

20 PUG Challenge 2012 After Imaging PSC docs say that AI offers protection against Disk failure Disk fails 5 minutes before the backup starts on the final day of your year end close No paper trail Ouch! Time to work on your resume (C.V.) After Image File(s) + Last (Good) Backup = State of DB at time of crash

21 PUG Challenge 2012 After Imaging - Why Use It? But you say… I have disk mirroring (also known as RAID 1) so I'm protected against a disk failure BUT Mirroring does NOT protect against all database evils

22 PUG Challenge 2012 After Imaging - Why Use It? True Horror Story #1 A DBA (logged on as root) FTP'd a test database into the directory where the production database resided... unfortunately they had the same name Disk Mirroring worked just fine… After Imaging would have probably saved the day

23 PUG Challenge 2012 After Imaging - Why Use It? True Horror Story #2 A user ran an archiving program on live data that wasn't ready to be archived Once again the mirroring performed perfectly AI might have improved the situation as it is possible to Roll Forward to a specific point in time

24 PUG Challenge 2012 After Imaging - Why Use It? True Horror Story #3 – Part 1 BI file hit the V8 2GB 1600 on a busy day (300+ users) Large Production Database was corrupted Progress Support Recommendation: dump & load or restore from backup which meant substantial down time or data loss

25 PUG Challenge 2012 After Imaging - Why Use It? True Horror Story #3 – Part 2 Fortunately the customer called me and I was able to temporarily patch the database until a D&L could be performed Irony: I had recommended AI to this customer over one year prior to this event

26 PUG Challenge 2012 After Imaging - Why Use It? Avoid probkup online issues Transaction Activity is Frozen while the BI File is Backed Up The I/O Overhead of disk/tape backup Possible Solution Use AI to maintain a Warm Spare DB Backup the Replicated Database

27 PUG Challenge 2012 After Imaging - Why Use It? Warm Standby (D/R) Database A Warm Standby DB is: A replicated Database on another Server DB can be brought online quickly in case of catastrophic failure of the production system It's warm because it is not 100% current… usually 2-15 minutes behind

28 PUG Challenge 2012 After Imaging - Why Use It? A HOT spare database is not possible using AI except with: Fathom Replication (oops… OpenEdge Replication) Replication Triggers SAN Mirroring Even these options don't guarantee zero loss of data

29 PUG Challenge 2012 After Imaging - Why Use It? Easy refreshing of a Report Server DB A Report Server DB is: A database on another server Used for reporting only To relieve the production system of the overhead imposed by reporting Doesn't require same level of hardware or Progress license

30 PUG Challenge 2012 Essential DB Monitoring Performed periodically to make sure you don't have hidden or unreported corruption

31 Essential DB Monitoring Corruption Checks proutil dbanalys probkup/procopy proutil dbrpr proutil dbscan (non-interactive dbrpr) proutil idxfix dbtool BravePoint 2012

32 Essential DB Monitoring -MemCheck AND ALL THE OTHER SIMILAR OPTIONS BravePoint 2012

33 PUG Challenge 2012 Database Log File Monitoring Check the Database log (.lg) file for errors DAILY. Look for words such as: kill* drastic warn* error system dead fatal abnormal exceed* fail* wrong unexpected* invalid died damage* overflow* violation insufficient missing disappear* corrupt* allow* attempt* cannot enough illegal beyond impossible increase unknown unable stop* ProMonitor supports automated log file monitoring
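
The daily keyword check lends itself to a small script. What follows is a minimal Python sketch, assuming the trailing * on a keyword means "any word starting with this stem" and that matching should be case-insensitive; the function names are hypothetical, and this is not how ProMonitor does it.

```python
import re

# Keyword list copied from the slide; a trailing * is treated as a prefix wildcard.
KEYWORDS = (
    "kill* drastic warn* error system dead fatal abnormal exceed* fail* wrong "
    "unexpected* invalid died damage* overflow* violation insufficient missing "
    "disappear* corrupt* allow* attempt* cannot enough illegal beyond impossible "
    "increase unknown unable stop*"
).split()


def build_pattern(keywords):
    """Compile one case-insensitive, whole-word regex from the keyword list."""
    parts = []
    for kw in keywords:
        if kw.endswith("*"):
            parts.append(re.escape(kw[:-1]) + r"\w*")  # stem plus any suffix
        else:
            parts.append(re.escape(kw))
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)


PATTERN = build_pattern(KEYWORDS)


def suspicious_lines(lines):
    """Return (line_number, text) pairs for lines containing any keyword."""
    return [
        (n, line.rstrip("\n"))
        for n, line in enumerate(lines, 1)
        if PATTERN.search(line)
    ]
```

To run it against a real log, open the .lg file and pass the file object to suspicious_lines; it iterates line by line, so large logs are not loaded into memory at once.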

34 PUG Challenge 2012 1124 Errors SYSTEM ERROR: wrong dbkey in block % probability of H/W problem Don't limit search to disks; consider: Disk Controllers, RAM (parity errors), Firmware, etc. Don't let the Hardware Technician blame Progress or the Application Don't let the Hardware Technician escape without fixing the problem

35 PUG Challenge 2012 1124 Stories Seagate Firmware on 2gb drives (mid-90s) HP Server/EMC SAN administered by HP HP/UX diagnostics showed no problems EMC diagnostics showed no problems Cause: Bad SAN Fabric Switch

36 1124 Stories Sometimes a Server reboot can (temporarily) fix a 1124 situation However this might be a situation where the hardware is in a bad-good-bad-good… cycle BravePoint 2012

37 PUG Challenge 2012 Database File System Full Use prostrct repair to relocate Extents to a place with more space Copy the Extent to the new location Update the Structure File (.st) to reflect the current location of the Extents (one good reason to have a current one) Run prostrct repair new.st Alternatively use prostrct add to add new Extents
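
The relocation steps can be laid out as a dry-run plan. This Python sketch only builds the command strings and executes nothing; the database name, extent paths, and the helper name are hypothetical, and the prostrct repair argument order (database, then structure file) should be checked against the documentation for your Progress version. It assumes the database is shut down while the extent is copied.

```python
import shlex


def relocate_extent_plan(db, old_extent, new_extent, new_st):
    """Return the ordered steps for moving one extent, as printable strings."""
    return [
        # 1. Copy the extent to the file system with more space.
        shlex.join(["cp", "-p", old_extent, new_extent]),
        # 2. Manual step: point the structure file at the new location.
        f"# edit {new_st} so the extent path reads {new_extent}",
        # 3. Let prostrct repair update the database with the new paths.
        shlex.join(["prostrct", "repair", db, new_st]),
    ]


if __name__ == "__main__":
    for step in relocate_extent_plan(
        "prod", "/db1/prod_7.d2", "/db2/prod_7.d2", "new.st"
    ):
        print(step)
```

Printing the plan first, rather than running it, leaves room to review each path before anything on disk is touched.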

38 PUG Challenge 2012 No Space for Before Image File Do Not run out of BI disk space if: More space isn't available elsewhere The BI extent(s) can't be relocated To perform Crash Recovery (part of the proutil truncate bi process), the BI file will grow; sometimes 2X or more

39 No Space for Before Image File If there is no space for the BI file to grow, there is no Crash Recovery If Crash Recovery partially completes but then crashes, the next Crash Recovery will create an even larger BI file!!! PUG Challenge 2012

40 No Space for Before Image File Force Access (-F) is the only option (if you don't have AI enabled) Even having AI enabled is problematic Crash Recovery Notes are also written to the AI logs You can't do rfutil aimage empty during Crash Recovery!! This is why the -bithold parameter is so important PUG Challenge 2012

41 Corrupt Database Blocks What Kind of Block? Index or Data Block Type 2 (Index = IX) Block Type 3 (Record Manager = RM) Use proutil dbrpr to get the Block Type Or if the Block is in a Storage Area dedicated to Indexes or Tables, you automatically know

42 Corrupt IX Block Options Try rebuilding the Index that the IX Block belongs to Try to Truncate the Area Reformat the block as a Free Block (proutil dbrpr) If it is an RM block, see the next set of slides PUG Challenge 2012

43 Corrupt RM Blocks Reformat the block as a Free Block (proutil dbrpr) Replace the Block with the same Block from another DB (restored from a backup) proutil dbrpr 5. Dump Block (from the good DB) 4. Load Block (into the bad DB) Don't forget to backup the DB fi

44 PUG Challenge 2012 RM Block Transplant How can I tell if the block has changed since the last probkup? The Block Update Counter stored in the Block Header If not using probkup, the DBKEY from the AI Logs can be obtained with aimage scan verbose but you REALLY need to be motivated

45 PUG Challenge 2012 Emergency Dump Front and Back 4GL Dump for each customer by custnum (until you hit the bad spot) for each customer by custnum descending Assumes the damage is limited in scope
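
The front-and-back idea is easiest to see as control flow. The sketch below is a Python simulation, not 4GL: keys stands in for the index order (custnum ascending), and read_record is a hypothetical callable that raises when it reaches a damaged record, much as the for each would stop with an error.

```python
def front_and_back_dump(keys, read_record):
    """Dump from the front until a read fails, then from the back until a read fails.

    Returns (front_records, back_records, unread_keys); back_records is
    returned in ascending key order.
    """
    front, back = [], []

    # Pass 1: ascending order, stop at the first bad spot.
    i = 0
    while i < len(keys):
        try:
            front.append(read_record(keys[i]))
        except Exception:
            break
        i += 1

    # Pass 2: descending order, stop before re-reading what pass 1 covered.
    j = len(keys) - 1
    while j > i:
        try:
            back.append(read_record(keys[j]))
        except Exception:
            break
        j -= 1

    back.reverse()
    return front, back, keys[i:j + 1]
```

The third return value is the stretch neither pass could read. If the damage really is limited in scope, that list is short, and those keys are the candidates for the other dump methods.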

46 Emergency Dump If the Primary Index is damaged, try dumping using a non-Primary Index Or try fixing the index with proutil idxfix Indexless Binary Dump proutil -C dump -index 0 Only works for Type 2 Areas BravePoint 2012

47 PUG Challenge 2012 Emergency Dump RECID Dump Doesn't require an Index Very Slow on a Large Database If one table per Area, perhaps not so bad Usually the Last, Last, Last Resort

48 PUG Challenge 2012 Deleted/Damaged Extents First, backup the remnant of the DB This may seem like a useless step but if the last backup is defective you may need to repair the broken DB and that's difficult to do if it's deleted The backup gives you time to: Prepare a plan of action Call outside resources (like me) for help Calm down, take a Xanax & lock the door Prepare a new Resume (C.V.)

49 PUG Challenge 2012 Deleted/Damaged AI Extents If an AI Extent is deleted, simply disable AI and... What? Not running AI? Disable AI (rfutil aimage end) Fix the issue that caused the lost Extent Recreate the Extent with prostrct add Restart AI (rfutil aimage begin) If this doesn't work, next slide

50 PUG Challenge 2012 Deleted/Damaged AI Extents Method #2 Disable AI with rfutil aimage end. You may get an error message regarding the missing AI Extent but typically AI is still disabled Truncate the BI file with proutil truncate bi. You may get an error regarding the missing AI Extent but typically the BI file is still truncated

51 PUG Challenge 2012 Deleted/Damaged AI Extents Method #2 continued Remove all AI Extents with prostrct remove Recreate the original AI Extents with prostrct add Restart After Imaging with rfutil aimage begin Reformat the truncated BI file with proutil bigrow
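
Method #2 is a fixed sequence of utility calls, so it can be written down as a dry-run plan. This Python sketch only assembles the command strings in the order the two slides give; the database name, the structure file name, the extent count, and the bigrow cluster count are placeholders, and the exact argument forms are assumptions to verify against the documentation for your release.

```python
def rebuild_ai_plan(db, ai_extent_count, add_st, bi_clusters):
    """Return the Method #2 steps as command strings, in slide order."""
    plan = [
        f"rfutil {db} -C aimage end",    # may complain about the missing extent
        f"proutil {db} -C truncate bi",  # may complain too, per the slide
    ]
    # Assumption: prostrct remove drops one AI extent per call.
    plan += [f"prostrct remove {db} ai"] * ai_extent_count
    plan += [
        f"prostrct add {db} {add_st}",   # add_st lists the original AI extents
        f"rfutil {db} -C aimage begin",
        f"proutil {db} -C bigrow {bi_clusters}",
    ]
    return plan


if __name__ == "__main__":
    for cmd in rebuild_ai_plan("prod", 3, "ai.st", 16):
        print(cmd)
```

Keeping the plan as data makes it easy to paste into a runbook and tick off each step, including the ones expected to print an error.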

52 PUG Challenge 2012 Deleted/Damaged BI Extents Force access with -F V8.2 and later -F only works on proutil truncate bi (and promon and proshut) If you Force access, consider the DB damaged! Forcing access THROWS AWAY the BI file Unfortunately, sometimes -F is the only option

53 PUG Challenge 2012 Deleted/Damaged BI Extents Force Access with -F continued Forcing Access sets the Tainted Flag in the Master Block Even if you fix the Tainted Flag (idxbuild), consider the DB damaged! Dump & Load (this is if there is no AI recovery option) If you can't get into the database with -F or any other way, try the Read Only (-RO) option

54 PUG Challenge 2012 Deleted/Damaged Database Extents

55 PUG Challenge 2012 Deleted/Damaged DB Extents Restore the DB and BI from Backup Apply the AI files Re-enable AI BI Grow Done! That didn't work? Next slide
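
The restore-and-roll-forward path can be sketched the same way, as a dry-run plan in Python. The prorest and rfutil roll forward invocations are written from memory of the standard utilities and should be checked against your version's documentation; the backup path, AI file names, and cluster count are hypothetical, and the AI files are assumed to be listed in the order they were written.

```python
def restore_and_roll_forward_plan(db, backup, ai_files, bi_clusters):
    """Return the recovery steps from the slide as command strings."""
    plan = [f"prorest {db} {backup}"]  # restore the DB and BI from backup
    # Apply each AI file in sequence; the order matters.
    plan += [f"rfutil {db} -C roll forward -a {ai}" for ai in ai_files]
    plan += [
        f"rfutil {db} -C aimage begin",           # re-enable AI
        f"proutil {db} -C bigrow {bi_clusters}",  # BI Grow
    ]
    return plan


if __name__ == "__main__":
    steps = restore_and_roll_forward_plan(
        "prod", "/backup/prod.bck", ["/ai/prod.a1", "/ai/prod.a2"], 16
    )
    for cmd in steps:
        print(cmd)
```

Rehearsing this list on a test machine is one way to act on the earlier advice to test the entire recovery plan, not just the backup.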

56 PUG Challenge 2012 Deleted/Damaged DB Extents Use prostrct unlock if the deleted Extent was Empty (above the High Water Mark) prostrct unlock -extents will recreate missing Extents (and set the Tainted flag) However unlock also changes the time stamps on the AI files and they can't be used any longer

57 PUG Challenge 2012 Deleted DB Extents – Extent Transplant This technique is for Extents that contain Data except the Schema Area .d1 Extent (contains Master Block) Restore a copy of the deleted Extent from a Backup or other source The Extents' Last Opened time stamps won't match Use prostrct unlock to sync the time stamps (broken in some versions)

58 PUG Challenge 2012 Deleted DB Extents – Extent Transplant The data in the Extent might not match but… Use the -miracle option to re-create the Data Loss of the Schema Area .d1 extent is usually not recoverable Loss of a High Water Mark extent is also usually not recoverable

59 Deleted DB Extents – Extent Transplant This is why small fixed extents may still be a good idea PUG Challenge 2012

60 Deleted DB Extents If the DB Broker is still running #1 DON'T Shutdown the Database That closes the database extents and you won't be able to re-open them If a Client that is still connected to the DB can access the Progress Editor, just dump the Database from the Dictionary Even if they can't access the Editor, put dict.p (renamed as a menu item) into their PROPATH

61 PUG Challenge 2012 Deleted DB Extents - Unix If the DB Broker is still running #2 Warm Boot the System ASAP Don't Shut Down the DB First Run fsck (*ix only) and it can probably recover the deleted Extent Why? On Unix a file is not absolutely deleted until every process that has it open is gone (the Broker still has it open)

62 PUG Challenge 2012 Lost .db File In V9 and later it is relatively easy to restore the .db file prostrct builddb Requires an up-to-date Structure File (remember that point from the Preventive Maintenance list?)

63 PUG Challenge 2012 Sources of Help Progress Documentation Progress DB Administration Guide PSC Kbase (i.e. Krapbase) PSC Tech Support My mobile phone: For those weekend emergencies when you need expert assistance This is not a free call

64 PUG Challenge 2012 Conclusion Thank you for coming More details can be found in my Progress Database Administration Guide Publications are available at Do we have time for Questions?

