Bruce Clarke, my good friend and former boss, and I have been collaborating recently on RMAN backup to Data Domain deduplication storage arrays. In the process I have learned from Bruce (whom I gratefully acknowledge for the technical content of this post, including the scripts shown below) about a very valuable approach to RMAN backup.
Most DBAs do full RMAN backups of all databases which are small enough to fit into a backup window. Full backups have several advantages:
- They are very simple to run.
- They optimize the restore operation, as no redundant blocks need to be applied.
- They reduce dependencies in terms of the number of backup pieces that need to be cataloged and maintained in order to do a given restore, thereby reducing risk.
However, fulls have two strong disadvantages:
- They waste space since the entire database's content is stored in each full backup.
- They are time-consuming to run, because all active blocks in the database must be transferred to the backup target.
The first disadvantage is neatly avoided by the use of a deduplication array, such as Data Domain. My recent testing indicates that this is certainly true. More on this in a later post.
The second disadvantage is unavoidable at present. (Future developments in source-based deduplication on the Data Domain side may mitigate this later on, but I will ignore that for now.) For most DBAs, the trade-off is acceptable as long as the full backup fits into the available backup window. This is a moving target as well, since recent developments in storage networking (such as 10 GbE) have increased the available bandwidth, and with it the size of database that can fit into a backup window.
Unfortunately, the size of the typical database is also exploding as we all know. Thus, the issue remains: For a very significant number of large databases, the full backup cannot fit into the backup window. For these databases, the DBA has no real choice: He or she must use RMAN incremental backup.
Incremental backup (absent incremental update, which I will discuss later) provides the DBA with the option to only back up blocks which have been updated since the last backup. This provides a dramatic savings in terms of backup time, but with a significant manageability cost:
- The number of backup pieces which must be cataloged and maintained increases significantly, making the dependencies greater and thus increasing the risk of a failed restore.
- The restore time increases, because the same block may be updated many times and thus stored in several incremental backups, each of which must be restored and applied in turn.
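For reference, a conventional (non-merging) incremental strategy boils down to this pair of RMAN commands — a sketch only, with channel and format clauses omitted:

```
backup incremental level 0 database;   # periodic baseline: all active blocks
backup incremental level 1 database;   # daily: only blocks changed since the last backup
```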
Because of these issues, Oracle introduced incremental update for RMAN with 10g. There is an excellent article on the way this feature works in the Oracle documentation on OTN. Effectively, incremental update applies the incremental backup to the previous full backup. This neatly solves all of the issues of RMAN backup, except one: Instead of having a set of full backups, you end up with a single rolling full backup. In other words, once you apply the incremental backup to the previous full backup, you no longer have that previous full backup. Instead, you have a new full backup as of the point in time when you took the incremental backup.
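The incrementally updated backup pattern itself is just two RMAN commands, the same pair the script below issues (the tag name here is only an example):

```
# the first run with a new tag creates level-0 image copies;
# subsequent runs create level-1 incrementals against that tag
backup incremental level 1 for recover of copy with tag 'orcl_full' database;
# apply the previous incremental to the image copies, rolling the full forward
recover copy of database with tag 'orcl_full';
```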
This issue can be solved, obviously, by making a copy of the full backup prior to the incremental update operation. However, making this copy is time consuming, causes I/O, and takes up space.
Enter Bruce's solution: By using the Data Domain fast copy feature, you can make a copy of the previous full backup very quickly, and with no additional space. Further, the blocks which will be applied to the full backup copy have already been stored on the Data Domain in the incremental backup. Thus, they take up very little space as well. The end result is tantamount to RMAN backup nirvana:
- You can take your backups in a fraction of the time it would require to do a full backup.
- At the end of the process, you have a set of full backups for each time you run your RMAN backup.
- All backups are fully deduplicated, thus saving you lots of space.
- Restore operations are optimized as there are no dependencies other than a single full backup, and thus no redundant blocks to apply.
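The fastcopy step can be sketched as follows. All names here are hypothetical stand-ins for the script's variables — `sysadmin@dd690a`, the paths, and the date suffix are illustrations, not values from a real system; the copy itself would be issued over ssh to the Data Domain, so only the command string is built and shown:

```shell
# Hypothetical values standing in for the script's variables:
BACKUP_DDSYS="sysadmin@dd690a"                   # assumed Data Domain admin account
FASTCOPYPATH="/backup/oracle/full-orcl-01Nov09"  # DD-relative path of the just-updated full
DATE_SUFFIX="08Nov09"
NEWDIR="${FASTCOPYPATH}-retain-${DATE_SUFFIX}"   # space-free retained snapshot of the full

# The actual copy is run via ssh on the Data Domain; shown here as the command string only:
FASTCOPY_CMD="ssh ${BACKUP_DDSYS} filesys fastcopy force source ${FASTCOPYPATH} destination ${NEWDIR}"
echo "${FASTCOPY_CMD}"
```

Because the fastcopy only duplicates metadata on the deduplicated filesystem, the retained copy consumes almost no additional space.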
The scripts to accomplish this are below. The usual caveats apply: this is provided as-is, with no warranty or support obligations. You are on your own in terms of all that. However, the solution does work, and you should definitely give it a try. Also, note that this is Data Domain copyrighted material; I have received permission from Bruce Clarke, the author of this script, to publish it here.
Example script for weekly backups
This was created and tested on Sun Solaris, so slight changes may be required for other platforms.
#!/bin/sh -f
# version 1.0 - Copyright (C) 2009, Data Domain, Inc - Nov, 2009
#
# Perform backup of an Oracle database using the
# incremental update procedure
#
# This script produces periodic full backups by applying
# the requested incremental to the existing full, creating an updated
# full in case recovery is required.
#
# Arguments: $1 - "full" or "incremental"
# $2 - instance name to be backed up - Default: "orcl"
#
# specifying "full" starts a new "tag" thus creating a new full sequence
#
# The script also keeps everything in separate subdirectories. Because
# the underlying storage is assumed to be deduplicated, the actual storage
# consumed is similar to just running a series of incrementals
#
# Archivelogs and controlfiles are also backed up in the sequence
#
# directory naming scheme being followed
# $BACKUP_DIR :: path to make backup directory
# $BACKUP_DIR/full-$INSTANCE-date :: full backup directory for today
#
# IMPORTANT NOTE: THIS SCRIPT IS PROVIDED WITHOUT WARRANTY OR SUPPORT.
# FEEDBACK IS ALWAYS WELCOME; HOWEVER, IT SHOULD BE CLEARLY UNDERSTOOD
# THAT THIS SCRIPT SHOULD ONLY BE USED AFTER AN EXPERIENCED ORACLE DBA
# CLEARLY UNDERSTANDS ALL STEPS IN IT, HAS MADE ANY NECESSARY
# MODIFICATIONS, AND HAS PERFORMED SUFFICIENT TESTING TO BE ASSURED OF
# ITS PROPER OPERATION IN THEIR PARTICULAR ENVIRONMENT
#
# edit variables below as necessary
#
#DRYRUN=1 # Remove this line to allow script to run
# with this line, the script will only echo
# operations it would have performed
BACKUP_MOUNT="/dd/dd690a" # where "backup" directory on DD is mounted
BACKUP_DD_PATH="/backup/oracle"
BACKUP_DDSYS="sysa[email protected]" # used to perform ssh fastcopy after
# completion
INSTANCE="orcl"
RMANID="sys"
RMANPW="oracle"
RMANCAT="NOCATALOG" # Modify if a catalog is being used
RETENTIONLOCK=no # "yes" means set the atime to lock the files; otherwise don't
DELETE_OLDER=21 # how old a directory needs to be before it expires
# only relevant if RETENTIONLOCK is set to "yes"
MAXSIZE="10 g" # Maximum backup filesize - smaller allows replication to start
# sooner
# only relevant to archivelog backupsets. All other backups are "copy"
#### END OF VARIABLES
#### CHANGING VARIABLES BELOW THIS LINE WILL ALTER BEHAVIOR OF THIS SCRIPT
#
INCR_DATE="date_plus_3_weeks.pl" # script that prints a future date for
# "touch -a -t <date> <file>" to set a lock date
case $1 in
full|FULL)
NEWBACKUP=Y
INCRBACKUP=N
;;
incremental|INCREMENTAL)
INCRBACKUP=Y
NEWBACKUP=N
;;
*)
echo "$0 - required argument \"full\" or \"incremental\" not specified"
exit 1
;;
esac
if [ "X$2" != "X" ]
then
INSTANCE=$2
fi
echo "-------------------------------------------"
if [ ${NEWBACKUP} = "Y" ]
then
echo "Performing New Level 1 image backup of Instance \"${INSTANCE}\" "
else
echo "Performing Incremental Level 1 image backup of Instance \"${INSTANCE}\" "
fi
echo "-------------------------------------------"
DATE_SUFFIX=`/usr/bin/date '+%d%b%y'` # %d avoids the space-padded day that %e would embed in directory names
BACKUP_DIR=${BACKUP_MOUNT}${BACKUP_DD_PATH}
FULL_BACKUP_DIR=${BACKUP_DIR}/full-${INSTANCE}-${DATE_SUFFIX}
INCR_BACKUP_DIR=${BACKUP_DIR}/incr-${INSTANCE}-${DATE_SUFFIX}
# set the backup directory that we care about for today
if [ ${NEWBACKUP} = Y ]
then
THIS_BACKUP_DIR=${FULL_BACKUP_DIR}
FASTCOPYPATH=${BACKUP_DD_PATH}/full-${INSTANCE}-${DATE_SUFFIX}
else
THIS_BACKUP_DIR=${INCR_BACKUP_DIR}
FULL_BACKUP_DIR=`cat ${BACKUP_DIR}/most_recent_full`
# derive fastcopypath
FASTCOPYPATH=""
Y=${FULL_BACKUP_DIR}
X=$Y
Z=""
while [ ${X} != "backup" ]; do
X=`basename ${Y}`
Y=`dirname ${Y}`
if [ x$Z = "x" ]
then
Z=$X
else
Z=$X/$Z
fi
done
FASTCOPYPATH=$FASTCOPYPATH/$Z
fi
#
if [ ! X${DRYRUN} = X ]
then
set -vn
fi
# check if directory exists
if [ ! -d ${BACKUP_DIR} ]
then
echo "Directory \"${BACKUP_DIR}\" not presently available - aborting"
exit 2
fi
# directory already exists? If so, delete it -- OPTIONAL
if [ -d ${THIS_BACKUP_DIR} ]
then
echo "Directory \"${THIS_BACKUP_DIR}\" exists: Deleting"
# rm -rf ${THIS_BACKUP_DIR}
fi
# Make sure we can create the new directory
mkdir -p ${THIS_BACKUP_DIR}
if [ $? != 0 ]
then
echo "Failed to create \"${THIS_BACKUP_DIR}\" - aborting"
exit 3
fi
# Delete old backups - OPTIONAL
# Shouldn't be necessary if RMAN is maintaining the Retention policy
#find ${BACKUP_DIR} -name 'full*' -type d -mtime +${DELETE_OLDER} -exec rm -rf {} \;
# Get the appropriate TAG for use with RMAN. Create a new one if full, else use the existing one
if [ ${NEWBACKUP} = "Y" ]
then
echo "${INSTANCE}_${DATE_SUFFIX}" > ${FULL_BACKUP_DIR}/tagfile
fi
# Now set it here
TAGTODAY=`cat ${FULL_BACKUP_DIR}/tagfile`
# Echo the tag for the record
echo "-----------------------------------------------"
echo "TAG that will be used is ${TAGTODAY}"
echo "-----------------------------------------------"
# create the subdirectory that holds just the archivelogs
mkdir -p ${THIS_BACKUP_DIR}/archivesonly
#
# Now finally get down to running the backup
#
# The command we want RMAN to run is:
# backup incremental level 1 for recover of copy with tag ${TAGTODAY} database;
rman target=${RMANID}/${RMANPW}@${INSTANCE} ${RMANCAT} << FULLBACK
run {
configure controlfile autobackup on;
configure controlfile autobackup format for device type disk to '${THIS_BACKUP_DIR}/%F';
allocate channel c1 device type disk format '${THIS_BACKUP_DIR}/%U';
allocate channel c2 device type disk format '${THIS_BACKUP_DIR}/%U';
allocate channel c3 device type disk format '${THIS_BACKUP_DIR}/%U';
backup incremental level 1 for recover of copy with tag ${TAGTODAY} database ;
release channel c1 ;
release channel c2 ;
release channel c3 ;
allocate channel c1 device type disk format '${THIS_BACKUP_DIR}/archivesonly/%U' maxpiecesize ${MAXSIZE};
allocate channel c2 device type disk format '${THIS_BACKUP_DIR}/archivesonly/%U' maxpiecesize ${MAXSIZE};
allocate channel c3 device type disk format '${THIS_BACKUP_DIR}/archivesonly/%U' maxpiecesize ${MAXSIZE};
backup tag ${TAGTODAY} archivelog all delete all input;
recover copy of database with tag ${TAGTODAY};
}
quit
FULLBACK
# Done
echo Image backup of instance \"${INSTANCE}\" complete
echo --------------------------------------------
echo " Deleting expired backups from RMAN catalog"
echo --------------------------------------------
# Delete expired backups
rman target=${RMANID}/${RMANPW}@${INSTANCE} ${RMANCAT} << DELEXP
delete noprompt expired backup;
delete noprompt expired archivelog all;
delete noprompt expired copy;
quit
DELEXP
#
# Fastcopy the full directory we just updated so it is retained
NEWDIR=${FASTCOPYPATH}-retain-${DATE_SUFFIX}
ssh ${BACKUP_DDSYS} "filesys fastcopy force source ${FASTCOPYPATH} \
destination ${NEWDIR} "
# now bring the second copy into the RMAN catalog
#
rman target=${RMANID}/${RMANPW}@${INSTANCE} ${RMANCAT} << CATHERE
catalog start with '${BACKUP_MOUNT}${NEWDIR}';
yes
quit;
CATHERE
# record the backup directory so the incremental can find it
#
if [ ${NEWBACKUP} = Y ]
then
rm -f ${BACKUP_DIR}/most_recent_full
echo ${FULL_BACKUP_DIR} > ${BACKUP_DIR}/most_recent_full
fi
#
# If Retention Lock is going to be used, set the access time on the new files
#
if [ ${RETENTIONLOCK} = "yes" ]
then
RETAINUNTIL=`${INCR_DATE}`
export RETAINUNTIL
echo --------------------------------------------
echo Locking file until ${RETAINUNTIL}
echo --------------------------------------------
find ${FULL_BACKUP_DIR} -exec touch -a -t ${RETAINUNTIL} \{\} \;
fi
# Script complete
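The directory naming the script produces can be exercised on its own; the mount point, path, and instance name below are the example values from the script, not required ones:

```shell
INSTANCE="orcl"
DATE_SUFFIX=$(date '+%d%b%y')             # e.g. 08Nov09
BACKUP_DIR="/dd/dd690a/backup/oracle"     # BACKUP_MOUNT + BACKUP_DD_PATH
FULL_BACKUP_DIR="${BACKUP_DIR}/full-${INSTANCE}-${DATE_SUFFIX}"
INCR_BACKUP_DIR="${BACKUP_DIR}/incr-${INSTANCE}-${DATE_SUFFIX}"
echo "${FULL_BACKUP_DIR}"
echo "${INCR_BACKUP_DIR}"
```

A typical week would then invoke the script as `./backup.sh full orcl` on Sunday and `./backup.sh incremental orcl` on the other days (the script filename here is hypothetical).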
Perl script that provides a date string three weeks out
This script is called by the weekly full script shown above. The desired retention should be altered here, or else the script parameterized and the retention passed as an argument.
#!/usr/bin/perl
use strict;
use DateTime;
my $dt2 = DateTime->today();   # get today's date
$dt2->add(weeks => 3);         # add the three-week retention period
# print as zero-padded CCYYMMDDhhmm so the result can be fed to "touch -a -t"
print $dt2->strftime('%Y%m%d%H%M'), "\n";
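The retention-lock step then feeds such a future date to touch to stamp file access times. A minimal sketch, using a fixed hypothetical CCYYMMDDhhmm date and a temp file rather than real backup pieces:

```shell
RETAINUNTIL="203001010000"                 # CCYYMMDDhhmm - hypothetical lock date
TMPFILE=$(mktemp)                          # stand-in for a backup piece
touch -a -t "${RETAINUNTIL}" "${TMPFILE}"  # push the access time into the future
TOUCH_STATUS=$?                            # 0 if the timestamp was accepted
# ls -lu "${TMPFILE}" would display the access time just set
rm -f "${TMPFILE}"
```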
For backups + dedupe, take a look at Avamar. Amazing bit of technology from EMC.
Dedupe takes place at the host, backups don't take all that long if you've got a bit of spare cpu and memory.
Posted by: Joshua Morast | July 14, 2010 at 06:15 PM
Hi,
wouldn't it be even faster if you use block change tracking? I've done this in almost all my RMAN databases as described on this blog http://www.beyondoracle.com/2008/11/24/fast_incremental_backups/
The problem for me is that I need even more speed for my terabyte database backups...
Regards
Terry
Posted by: Terry Lewis | November 15, 2010 at 07:04 AM
This is an interesting idea, and I've been testing it on dumb storage while our new Data Domain comes online. One issue I've been running into is that rman recovers the fastcopy (a physical copy, in my setup) rather than the original copy. In other words, I'm seeing the following:
1. I run the initial backup incremental level 1 for recover of copy... and rman creates the first set of datafile copies in destination X with tag Y
2. I copy that set of datafile copies from X to destination Z
3. I catalog the files in destination Z
4. I run backup incremental level 1 for recover of copy as in step 1.
5. I run recover copy of database with tag Y and notice RMAN recovers the set of files in destination Z, rather than those in destination X. In effect, it recovers the set of datafiles I want to preserve.
Not sure how to get around this; I suspect the problem is that the files in the step 3 copy are also tagged Y (rman catalog start with doesn't seem to allow one to change the tags). Any thoughts (other than falling back to user-managed backups)?
Rob
Posted by: Rob | January 27, 2011 at 10:30 AM
Thanks for useful information. I have a question. We have set up "two" NFS mount points off basically one identical dd storage to have more visibility. One, /proj/fra, is for Oracle FRA which is the fastcopy source, and the other, /proj/fradumps/, is for the fastcopy destination. Oracle sends backups to FRA subdirectories which are /proj/fra/databasename/backupset/DATE, datafile, etc. Basically I intend to dump daily backups from there to /proj/fradumps/database/backupset/DATE. By the way, I figured that the fastcopy command requires dd-relative paths instead of OS paths for source and destination. What would be the source & destination path for the fastcopy command in my case?
Posted by: Ubee67 | March 18, 2011 at 11:31 AM
hi,
I encountered the exact same issue as that of Rob above. RMAN recovers wrong set of backup files. Any idea?
Posted by: Ubee Kwon | May 31, 2011 at 09:36 AM
I had the same issue as Rob in that RMAN merged into the backups in destination Z instead of destination X. My workaround was NOT to catalog the Z backups, as they are needed only in case of recovery, not during backup. My other issue was about retention policy. My company policy is to keep backups for 14 days, but with any RMAN retention policy other than 'redundancy 1', RMAN keeps all the backups since database creation under the merged incremental scheme. To resolve it I had to add "until time 'sysdate -x'" to the 'recover copy' command so that RMAN marks backups older than 14 days as obsolete. This makes the whole fastcopy story unnecessary in my opinion.
Posted by: Ubee67 | October 06, 2011 at 11:12 AM
Interesting idea and thanks for taking the time to share the information.
I've done a lot of testing on this and have a few things to comment on this.
1. The script given above has some flaws in it. For example the code below...
----------------------------------------------
if [ ${NEWBACKUP} = Y ]
then
rm -f ${BACKUP_DIR}/most_recent_full
echo ${FULL_BACKUP_DIR} > ${BACKUP_DIR}/most_recent_full
fi
----------------------------------------------
This code implies that your most recent full is your last FULL backup and not the last full that you have recovered using the incremental merge. We should always update the most_recent_full with the recent incremental merge full that we have just recovered, so that the next incremental will be applied to this and NOT to the first FULL you have taken.
2. The other thing is about the discussion above about cataloging the backups and the way the incremental is applied to merge it to a full. Not cataloging your rman backups, I believe is an ugly solution. To be able to catalog your backups, you can do the following.
Make two fastcopies of your recent full: one with a DBNAME-retain.date extension and the other with a DBNAME-nextdaycopy.date extension. Catalog DBNAME-retain.date first and then DBNAME-nextdaycopy.date. The order here is very important: catalog the -retain copy first and then -nextdaycopy. Now update your most_recent_full file with the location of "DBNAME-nextdaycopy-date". Please realize that the tag for both these copies is still the same. Now that you have two fulls cataloged in your RMAN catalog, the next time you apply the incremental it will be applied to the last cataloged full backup, which means DBNAME-nextdaycopy. This way, you still have your DBNAME-retain.date in your catalog.
This should resolve the issue with cataloging your backups.
Posted by: Vish | April 07, 2012 at 12:54 PM
Also, using dNFS (if your Oracle version is 11.2) will improve the performance of these RMAN backups by at least 10-20%.
Posted by: Vish | June 29, 2012 at 01:45 PM