Why You Should Periodically Review Your Backups

Introduction

So like every good Oracle DBA, you’ve created a few databases, decided that if they were worth creating and using they might also be worth backing up, and created some backup scripts that do ample logging. All hunky dory, right? You shouldn’t need to revisit this, right?

WRONG!!!

OK, my apologies for those who are visually impaired or have other issues dealing with blinking text, I just had to put that in there. 🙂

So what can go wrong in such a scenario? Well, depending on how you architected it, quite a lot. Let me walk you through a few examples I’ve come across recently. If you have any other good examples, then feel free to add them as a comment and I’ll incorporate them too!

Hard-Coded Script Example 1

You’ve created your PROD database, and you created a script to back it up – call it backup_PROD.sh for the sake of the discussion.

(So tempting to put that blinking text here again, but I’ll refrain in the following sections!)

WRONG!!!

What can be wrong with that simple scenario? Probably not much if it stays that simple, but let me tell you – it never does! What happens if you create 5 databases, or 10, or 100? Well now you copy the script and rename it for each one, and you get 5, or 10, or 100 scripts. It’s a maintenance nightmare! Far better if there is no need for difference between the backups to pass the ORACLE_SID into the script as a parameter.

Hard-Coded Script Example 2

Your databases are all running in ARCHIVELOG mode, and you decide to put all tablespaces in backup mode (so you can split the mirrors they’re on) using another script that does this:

WRONG!!!

For a start, there’s no need to put each tablespace in backup mode individually, unless you’re going to backup that tablespace individually. Use an ALTER DATABASE BEGIN BACKUP; command instead. Secondly, you have hard-coded the tablespace names in the script. What happens every time you add another tablespace? You have to edit the script! Guaranteed that will be forgotten somewhere along the way. If you want to use ALTER TABLESPACE commands instead for some reason, then query the data dictionary to get the tablespace names and you’ll never have to worry about editing the script again.

At one customer site, there was an even bigger issue. They called the same script for each database – the problem with that came when they added another database which had MOSTLY the same tablespaces and didn’t change the script. So now when they split the mirrors, one of the tablespaces was NOT in backup mode. That might SEEM to work, but as most seasoned Oracle professionals will understand, there is always the chance that you will end up backing up fractured blocks, and that tablespace backup will not be worth last week’s news.

This is an important consideration to be aware of, and one most storage vendors will try to say can never happen. Unfortunately, they are just plain wrong. Storage vendors back up operating system blocks, but most aren’t aware that an Oracle block can be made up of a number of operating system blocks. Of course, as was pointed out by the inestimable Mogens Norgaard, storage vendors should probably continue in this mindset as it does generate more money for us troubleshooting screwed-up databases! That was tongue in cheek by the way (well, maybe not from Mogens but from me!) 😉

Shutting Down the Database for Backup Example 1

Your Production database is running in ARCHIVELOG mode, as is most often the way you want it to be so you can perform point in time recovery if needed. If you’re not running in ARCHIVELOG mode, of course, you can only recover (most of the time, anyway) to the point in time when the last backup was done. I say “most of the time” because I have been able to recover some lost tables from a database in NOARCHIVELOG mode because the same tables had the same data in UAT, but that’s a pretty rare occurrence.

Your backup script was written back in the day when you were setting things up, and at that stage you ran in NOARCHIVELOG mode to save generating lots of unwanted archive logs. So of course you shut the database down and did a cold backup.

WRONG!!!

Well, right for the time when the script was built, but wrong now. If you’re running in ARCHIVELOG mode, you should always be performing hot backups (i.e. backups done while the database is running) so your poor users aren’t unnecessarily kicked off their application.

Shutting Down the Database for Backup Example 2

Another example from a customer – they had a database where they had to load some external data, and there was no way to repeat the run to load it (don’t ask why, that’s not the point of this one). So the way they decided to work around that was to:

  • Shut the database down
  • Split the mirrors
  • Restart the database
  • Backup the mirror
  • While the backup is running, do the work to load the data
  • Once the unrepeatable data is loaded, shut the database down again
  • Split the mirror again
  • Do another backup
  • Restart the database
  • Remirror

It’s a heck of a lot of work, but it got them to where they needed to be.

WRONG!!!

The problem was when users got back on to the application (which is internet facing and can be used from anywhere), there were some performance issues – periods of slowdown, locks and even timeouts, where users had to repeat their transactions. When it was discovered the issues mostly occurred shortly after restart, the application vendor recommended pre-warming the cache by selecting from some of their reference tables. All well and good if we were back in the last century! But this database was running in ARCHIVELOG mode! So the far easier solution is to follow a Cary Millsap maxim – “the easiest work to do is no work”. The answer? Don’t shut the database down! All you need to do in this scenario is put the database in backup mode with an ALTER DATABASE BEGIN BACKUP; command, do the necessary unrepeatable work and do an ALTER DATABASE END BACKUP; command. In that way, the database caches are not cold, there’s no need to pre-warm the caches, and there should be no performance issue as a result.

Conclusion

There are so many more scenarios I’ve seen where issues around backups arise purely and simply because no-one has gone back and revisited the backup strategies once they were put in place. This is something you should do regularly – certainly whenever you add new databases to your backup, and probably more often than that. Remember – the last place you want to find an issue in your backup strategy is when a recovery fails!

Pete

After 22 years of working at Oracle in just about every role except Marketing and Support, I am now working as a Senior Managed Services Consultant with Data Intensity, specializing in Oracle Database technology, High Availability and Disaster Recovery solutions. I am also a member of the OakTable Network, and have presented at RMOUG Training Days, Hotsos Symposia, Oracle OpenWorld conferences, and other user group events. I have co-authored the Expert Oracle Enterprise Manager 12c and Practical Oracle Database Appliance books published by Apress, and am one of the authors of the Building Database Clouds in Oracle Database 12c book published by Addison Wesley.

2 Comments:

  1. Why use ALTER DATABASE BEGIN/END BACKUP when there’s RMAN?

    • Because for this customer they were splitting the mirror then taking a snapshot at the OS level. Yes, you can do the same with RMAN, but for the requirement here a single SQL statement was less change to their existing configuration.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.