Progress Database Repair & Recovery

Slides:



Advertisements
Similar presentations
MFA for Business Banking – Security Code Multifactor Authentication: Quick Tip Sheets Note to Financial Institutions: We are providing these QT sheets.
Advertisements

B3: Putting OpenEdge Auditing to Work: Dump and Load with (Almost) No Downtime David EDDY Senior Solution Consultant.
Chapter 4 Memory Management Basic memory management Swapping
Burt King We will cover: Essentials --No command line needed here (mott) What is SQL Server How does it come to life What are the.
Strength. Strategy. Stability. The Application Profiler.
DB-13: Database Health Checks How to tell if you’re heading for The Wall Richard Shulman Principal Support Engineer.
1 How Healthy is Your Progress System? ( Progess DB Best Practices) Dan Foreman BravePoint, Inc.
FlareCo Ltd ALTER DATABASE AdventureWorks SET PARTNER FORCE_SERVICE_ALLOW_DATA_LOSS Slide 1.
DataBase Administration Scheduling jobs Backing up and restoring Performing basic defragmentation and index rebuilding Using alerts Archiving.
VMware Data Recovery Presented by Kroll Ontrack at WI Area VMware User’s Group Presented by Kroll Ontrack at WI Area VMware User’s Group.
1 PUG Challenge EU 2014 Click to edit Master title style PUG Challenge EMEA 2014 – Dusseldorf, Germany Common Database Problems Common Database Solutions.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 12: Managing and Implementing Backups and Disaster Recovery.
File Systems Implementation. 2 Recap What we have covered: –User-level view of FS –Storing files: contiguous, linked list, memory table, FAT, I-nodes.
Chapter 12 File Management Systems
Week:#14 Windows Recovery
RMAN Restore and Recovery
Backup and Recovery (2) Oracle 10g CAP364 1 Hebah ElGibreen.
Backup and Recovery Part 1.
Oracle9i Database Administrator: Implementation and Administration
Transaction log grows unexpectedly
Oracle backup and recovery strategy
1. Preventing Disasters Chapter 11 covers the processes to take to prevent a disaster. The most prudent actions include Implement redundant hardware Implement.
DB Audit Expert v1.1 for Oracle Copyright © SoftTree Technologies, Inc. This presentation is for DB Audit Expert for Oracle version 1.1 which.
NovaBACKUP 10 xSP Technical Training By: Nathan Fouarge
MOVE-4: Upgrading Your Database to OpenEdge® 10 Gus Björklund Wizard, Vice President Technology.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 12: Managing and Implementing Backups and Disaster Recovery.
70-293: MCSE Guide to Planning a Microsoft Windows Server 2003 Network, Enhanced Chapter 14: Problem Recovery.
November 2009 Network Disaster Recovery October 2014.
Administration etc.. What is this ? This section is devoted to those bits that I could not find another home for… Again these may be useless, but humour.
Transactions and Reliability. File system components Disk management Naming Reliability  What are the reliability issues in file systems? Security.
LAN / WAN Business Proposal. What is a LAN or WAN? A LAN is a Local Area Network it usually connects all computers in one building or several building.
Administering Windows 7 Lesson 11. Objectives Troubleshoot Windows 7 Use remote access technologies Troubleshoot installation and startup issues Understand.
1 Chapter 12 File Management Systems. 2 Systems Architecture Chapter 12.
Top Performance Enhancers Top Performance Killers in Progress Dan Foreman Progress Expert
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 12: Managing and Implementing Backups and Disaster Recovery.
Module 7. Data Backups  Definitions: Protection vs. Backups vs. Archiving  Why plan for and execute data backups?  Considerations  Issues/Concerns.
Strength. Strategy. Stability.. Progress Performance Monitoring and Tuning Dan Foreman Progress Expert BravePoint BravePoint
Preventing Common Causes of loss. Common Causes of Loss of Data Accidental Erasure – close a file and don’t save it, – write over the original file when.
Mark A. Magumba Storage Management. What is storage An electronic place where computer may store data and instructions for retrieval The objective of.
Progress Database Admin 1 Jeffrey A. Brown - Technical Support Consultant
XP Practical PC, 3e Chapter 6 1 Protecting Your Files.
Week 3 Lecture 1 The Redo Log Files and Diagnostic Files.
IT Database Administration Section 09. Backup and Recovery Backup: The available options Full Consistent (cold) Backup Database shutdown, all files.
14 Copyright © 2005, Oracle. All rights reserved. Backup and Recovery Concepts.
Recovery Log Notes Gus Björklund, Progress Dan Foreman, Bravepoint 2014 Americas PUG Challenge Westford, MA June 8 – June
High Availability in DB2 Nishant Sinha
IT1001 – Personal Computer Hardware & system Operations Week7- Introduction to backup & restore tools Introduction to user account with access rights.
Matthew Glenn AP2 Techno for Tanzania This presentation will cover the different utilities on a computer.
Common Database Problems Common Database Solutions Mike Furgal Managed Database Service EMEA PUG Challenge 2015, Copenhagen, Denmark 4 – 6 November, 2015.
18 Copyright © 2004, Oracle. All rights reserved. Backup and Recovery Concepts.
6 Copyright © 2007, Oracle. All rights reserved. Performing User-Managed Backup and Recovery.
14 Copyright © 2005, Oracle. All rights reserved. Backup and Recovery Concepts.
Common Database Problems Common Database Solutions Mike Furgal PROGRESS Bravepoint – Database Services.
GCSE Computing: A451 Computer Systems & Programming Topic 3 Software System Software (2) Utility Software.
Log Shipping, Mirroring, Replication and Clustering Which should I use? That depends on a few questions we must ask the user. We will go over these questions.
OE REPLICATION AKA FATHOM REPLICATION. WHO AM I Currently with Eaton Corp as a Sr. Progress DBA for the past 12 years Started Programming with Progress.
Advance startup options Shift Restart. Restart options.
OST TO PST How to deal with the problem of OST when Outlook gets terminated abruptly ?
WHEN DATABASE CORRUPTION STRIKES Presented by Steve Stedman Founder/Owner of Stedman Solution, LLC.
Dial-In Number: 1 (631) Webinar ID: FHC Tech Talk Automation and Efficiency Series Talk #1 Carbonite automated backup.
Database Repair & Recovery
Know About MS Access Database
Jonathan Walpole Computer Science Portland State University
Introduction to Computers
Oracle9i Database Administrator: Implementation and Administration
Database Corruption Advanced Recovery Techniques
Prepared by Jaroslav makovski
Performing Database Recovery
Chapter 5 The Redo Log Files.
Database Backup and Recovery
Presentation transcript:

Progress Database Repair & Recovery Dan Foreman BravePoint, Inc. Email: danf@prodb.com

Introduction- Dan Foreman Progress User since 1984 (V2.1) Speaker at many Progress User Conferences from 1990 to 2012 PUG Challenge 2012

Introduction- Dan Foreman Author of: Progress Performance Tuning Guide Progress Database Administration Guide Progress System Tables Guide Progress V10 Database Admin Jumpstart Book purchase allows free online access ProMonitor – Database Monitoring Tool ProD&L - Accelerated Dump/Load Utility Balanced Benchmark – Load testing tool PUG Challenge 2012

Introduction - Who Are You Progress Version: V6, V7, V8, V9, V10.0*, V10.1*, V10.2* DB OS: Unix? Windows? Linux? Is there anything else? Largest Single Database? Highest Concurrent User Count? PUG Challenge 2012

Special Request Mobile Phones on Mute Please! PUG Challenge 2012

Goals Can I teach you database brain surgery in an hour? Note that DBAs have one big advantage that human brain surgeons don’t…do you know what that is? PUG Challenge 2012

Real Horror Stories Fortune 500 Company (sorry but they would not appreciate us sharing their name) DB Corruption February 23 Last Good Backup: January 11 Last Good AI Files: January 17 We facilitated a special version of rfutil that would ignore errors during the Roll Forward process PUG Challenge 2012

Real Horror Stories Customer running SCO OpenServer We had told the customer to move to a more “modern” OS, i.e. Linux OS Problem – Can’t mount the Disks; Discovered that backups to tape were not occurring (backup to disk was OK but couldn’t see the disks) Had to boot using Knoppix to repair things – took forever to reinstall SCO PUG Challenge 2012

Real Horror Stories Fortune 1000 Company HP Server Admin outsourced to IBM Backups outsourced to 3d party 3d party stopped doing backups, unannounced, due to non-payment DB Corrupted Restoration impossible BravePoint 2012

Preventive Maintenance Backups (yes, I know you think you have backups but have you tested one recently?) Test your Entire Recovery Plan PUG Challenge 2012

Preventive Maintenance Warm Standby Database - A database on another machine with a recent copy of the production DB Also called a D/R Database This is easy to do in Progress...covered soon PUG Challenge 2012

Preventive Maintenance Unix: don’t logon as root unless you really need to Use O/S security to protect the DB, BI, and AI files from accidental or casual or intentional deletion PUG Challenge 2012

Preventive Maintenance Unix: Don’t use kill -9 to terminate a Self Service Progress session; You might bring the database DOWN! if you happen to kill a session that is holding a Latch ALWAYS have an up-to-date Structure (.st) file available - we will see why later PUG Challenge 2012

Preventive Maintenance Monitor the BI file High Water Mark Monitor 'Delinquent' Transactions (Active Transactions longer than 30-60 minutes) Monitor Large Transactions (A Client with a Large Number of concurrent Record Locks) longtrx*.p Progress program on the BravePoint Website to detect Delinquent Transactions PUG Challenge 2012

Preventive Maintenance Use the -bithold parameter as an extra safeguard; Set to 50% of available BI Disk Space; even in V9 V9/V10 supports Terabyte sized BI Files but extent sizes are still limited to 2gb unless you use the EnableLargeFiles option on proutil and the file system must be 2gb enabled too PUG Challenge 2012

Preventive Maintenance Monitor the Area High Water Marks to avoid growing into the Variable Length Extent There is a Performance Hit, usually insignificant but sometimes not, when growing the Variable Extent A Single Variable Extent can limit some of the recovery options discussed later PUG Challenge 2012

Quiz Question Who are the Smartest DBAs in the Room? PUG Challenge 2012

Answer DBAs who enabled After Imaging on their Mission Critical Databases If you’re not using AI, you probably shouldn’t be responsible for your company’s databases Management need convincing? Play Chicken with them PUG Challenge 2012

After Imaging Who is currently not using AI? If not, why not? Is public flogging or humiliation required? PUG Challenge 2012

After Imaging PSC docs say that AI offers protection against Disk failure Disk fails 5 minutes before the backup starts on the final day of your year end close No paper trail Ouch! Time to work on your resume (C.V.) After Image File(s) + Last (Good) Backup = State of DB at time of crash PUG Challenge 2012

After Imaging - Why Use It? But you say…”I have disk mirroring (also known as RAID 1) so I’m protected against a disk failure” BUT Mirroring does NOT protect against all database evils PUG Challenge 2012

After Imaging - Why Use It? True Horror Story #1 A DBA (logged on as root) FTP’d a test database into the directory where the production database resided... unfortunately they had the same name Disk Mirroring worked just fine….. After Imaging would have probably saved the day PUG Challenge 2012

After Imaging - Why Use It? True Horror Story #2 A user ran an archiving program on live data that wasn’t ready to be archived Once again the mirroring performed perfectly AI might have improved the situation as it is possible to Roll Forward to a specific point in time PUG Challenge 2012

After Imaging - Why Use It? True Horror Story #3 – Part 1 BI file hit the V8 2GB limit @ 1600 on a busy day (300+ users) Large Production Database was corrupted Progress Support Recommendation: dump & load or restore from backup which meant substantial down time or data loss PUG Challenge 2012

After Imaging - Why Use It? True Horror Story #3 – Part 2 Fortunately the customer called me and I was able to temporarily patch the database until a D&L could be performed Irony: I had recommended AI to this customer over one year prior to this event PUG Challenge 2012

After Imaging - Why Use It? Avoid probkup online issues Transaction Activity is Frozen while the BI File is Backed Up The I/O Overhead of disk/tape backup Possible Solution Use AI to maintain a Warm Spare DB Backup the Replicated Database PUG Challenge 2012

After Imaging - Why Use It? Warm Standby (D/R) Database A Warm Standby DB is: A replicated Database on another Server DB can an be brought online quickly in case of catastrophic failure to the production system It’s ‘warm’ because it is not 100% current…usually 2-15 minutes behind PUG Challenge 2012

After Imaging - Why Use It? A HOT spare database is not possible using AI except with: Fathom Replication (oops…OpenEdge Replication) Replication Triggers SAN Mirroring Even these options don’t guarantee zero loss of data PUG Challenge 2012

After Imaging - Why Use It? Easy refreshing of a Report Server DB A Report Server DB is: A database on another server Used for reporting only To relieve the production system of the overhead imposed by reporting Doesn’t require same level of hardware or Progress license PUG Challenge 2012

Essential DB Monitoring Performed periodically to make sure you don’t have hidden or unreported corruption PUG Challenge 2012

Essential DB Monitoring Corruption Checks proutil dbanalys probkup/procopy proutil dbrpr proutil dbscan (non-interactive dbrpr) proutil idxfix dbtool BravePoint 2012

Essential DB Monitoring -MemCheck AND ALL THE OTHER SIMILAR OPTIONS BravePoint 2012

Database Log File Monitoring Check the Database log (.lg) file for errors DAILY. Look for words such as: kill* drastic warn* error system dead fatal abnormal exceed* fail* wrong unexpected* invalid died damage* overflow* violation insufficient missing disappear* corrupt* allow* attempt* cannot enough illegal beyond impossible increase unknown unable stop* ProMonitor supports automated log file monitoring PUG Challenge 2012

1124 Errors SYSTEM ERROR: wrong dbkey in block 99.9999% probability of H/W problem Don’t limit search to disks; consider: Disk Controllers, RAM (parity errors), Firmware, etc. Don’t let the Hardware Technician blame Progress or the Application Don’t let the Hardware Technician escape without fixing the problem PUG Challenge 2012

1124 Stories Seagate Firmware on 2gb drives (mid-90’s) HP Server/EMC SAN administered by HP HP/UX diagnostics showed no problems EMC diagnostics showed no problems Cause: Bad SAN Fabric Switch PUG Challenge 2012

1124 Stories Sometimes a Server reboot can (temporarily) fix a 1124 situation However this might be a situation where the hardware is in a bad-good-bad-good… cycle BravePoint 2012

Database File System Full Use prostrct repair to relocate Extents to a place with more space Copy the Extent to the new location Update the Structure File (.st) to reflect the current location of the Extents (one good reason to have a current one) Run prostrct repair new.st Alternatively use prostrct add to add new Extents PUG Challenge 2012

No Space for Before Image File Do Not run out of BI disk space if: More space isn’t available elsewhere The BI extent(s) can’t be relocated To perform Crash Recovery (part of the proutil truncate bi process), the BI file will grow; sometimes 2X or more PUG Challenge 2012

No Space for Before Image File If there is no space for the BI file to grow, there is no Crash Recovery If Crash Recovery partially completes but then crashes, the next Crash Recovery will create an even larger BI file!!! PUG Challenge 2012

No Space for Before Image File Force Access (-F) is the only option (if you don’t have AI enabled) Even having AI enabled is problematic Crash Recovery Notes are also written to the AI logs You can’t do rfutil aimage empty during Crash Recovery!! This is why the –bithold parameter is so important PUG Challenge 2012

Corrupt Database Blocks What Kind of Block? Index or Data Block Type 2 (Index = IX) Block Type 3 (Record Manager = RM) Use proutil dbrpr to get the Block Type Or if the Block is in a Storage Area dedicated to Indexes or Tables, you automatically know PUG Challenge 2012

Corrupt IX Block Options Try rebuilding the Index that the IX Block belongs to Try to Truncate the Area Reformat the block as a Free Block (proutil dbrpr) If it is an RM block, see the next set of slides PUG Challenge 2012

Corrupt RM Blocks Reformat the block as a Free Block (proutil dbrpr) Replace the Block with the same Block from another DB (restored from a backup) proutil dbrpr 5. Dump Block (from the good DB) 4. Load Block (into the bad DB) Don’t forget to backup the DB fi PUG Challenge 2012

RM Block Transplant How can I tell if the block has changed since the last probkup? The Block Update Counter stored in the Block Header If not using probkup, the DBKEY from the AI Logs can be obtained with aimage scan verbose but you REALLY need to be motivated PUG Challenge 2012

Emergency Dump ‘Front and Back’ 4GL Dump for each customer by custnum (until you hit the bad spot) for each customer by custnum descending Assumes the damage is limited in scope PUG Challenge 2012

Emergency Dump If the Primary Index is damaged, try dumping using a non-Primary Index Or try fixing the index with proutil idxfix Indexless Binary Dump proutil <db> -C dump <table> -index 0 Only works for Type 2 Areas BravePoint 2012

Emergency Dump RECID Dump Doesn’t require an Index Very Slow on a Large Database If one table per Area, perhaps no so bad Usually the Last, Last, Last Resort PUG Challenge 2012

Deleted/Damaged Extents First, backup the “remnant” of the DB This may seem like a useless step but if the last backup is defective you may need to repair the broken DB and that’s difficult to do if it’s deleted The backup gives you time to: Prepare a plan of action Call outside resources (like me) for help Calm down, take a Xanax & lock the door Prepare a new Resume (C.V.) PUG Challenge 2012

Deleted/Damaged AI Extents If an AI Extent is deleted, simply disable AI and... What? Not running AI? Disable AI (rfutil aimage end) Fix the issue that caused the lost Extent Recreate the Extent with prostrct add Restart AI (rfutil aimage begin) If this doesn’t work, next slide PUG Challenge 2012

Deleted/Damaged AI Extents Method #2 Disable AI with rfutil aimage end. You may get an error message regarding the missing AI Extent but typically AI is still disabled Truncate the BI file with proutil truncate bi. You may get an error regarding the missing AI Extent but typically the BI file is still truncated PUG Challenge 2012

Deleted/Damaged AI Extents Method #2 continued Remove all AI Extents with prostrct remove Recreate the original AI Extents with prostrct add Restart After Imaging with rfutil aimage begin Reformat the truncated BI file with proutil bigrow PUG Challenge 2012

Deleted/Damaged BI Extents Force access with -F V8.2 and later -F only works on proutil truncate bi (and promon and proshut) If you Force access, consider the DB damaged! Forcing access THROWS AWAY the BI file Unfortunately, sometimes –F is the only option PUG Challenge 2012

Deleted/Damaged BI Extents Force Access with –F continued Forcing Access sets the ‘Tainted Flag’ in the Master Block Even if you fix the Tainted Flag (idxbuild), consider the DB damaged! Dump & Load (this is if there is no AI recovery option) If you can’t get into the database with -F or any other way, try the Read Only (-RO) option PUG Challenge 2012

Deleted/Damaged Database Extents PUG Challenge 2012

Deleted/Damaged DB Extents Restore the DB and BI from Backup Apply the AI files Re-enable AI BI Grow Done! That didn’t work?, next slide PUG Challenge 2012

Deleted/Damaged DB Extents Use prostrct unlock if the deleted Extent was Empty (above the High Water Mark) prostrct unlock -extents will recreate missing Extents (and set the Tainted flag) However unlock also changes the time stamps on the AI files and they can’t be used any longer PUG Challenge 2012

Deleted DB Extents – Extent Transplant This technique is for Extents that contain Data except the Schema Area .d1 Extent (contains Master Block) Restore a copy of the deleted Extent from a Backup or other source The Extent’s ‘Last Opened’ time stamps won’t match Use prostrct unlock to sync the time stamps (broken in some versions) PUG Challenge 2012

Deleted DB Extents – Extent Transplant The data in the Extent might not match but… Use the -miracle option to re-create the Data  Loss of the Schema Area .d1 extent is usually not recoverable Loss of a High Water Mark extent is also usually not recoverable PUG Challenge 2012

Deleted DB Extents – Extent Transplant This is why small fixed extents may still be a good idea PUG Challenge 2012

Deleted DB Extents If the DB Broker is still running #1 DON’T Shutdown the Database That ‘closes’ the database extents and you won’t be able to re-open them If a Client is still connected to the DB can access the Progress Editor, just dump the Database from the Dictionary Even if they can’t access the Editor, put dict.p (renamed as a menu item) into their PROPATH PUG Challenge 2012

Deleted DB Extents - Unix If the DB Broker is still running #2 Warm Boot the System ASAP Don’t Shut Down the DB First Run fsck (*ix only) and it can probably recover the deleted Extent Why? On Unix a file is not absolutely deleted until every process that has it open is gone (the Broker still has it open) PUG Challenge 2012

Lost .db File In V9 and later it is relatively easy to restore the .db file prostrct builddb Requires an up-to-date Structure File (remember that point from the Preventive Maintenance list?) PUG Challenge 2012

Sources of Help Progress Documentation Progress DB Administration Guide dba@peg.com PSC Kbase (i.e. Krapbase) PSC Tech Support My mobile phone: +1 541-908-3437 For those weekend emergencies when you need expert assistance This is not a free call PUG Challenge 2012

Conclusion Thank you for coming More details can be found in my Progress Database Administration Guide Publications are available at www.BravePoint.com danf@prodb.com Do we have time for Questions? PUG Challenge 2012