Bruce Clarke, my good friend and former boss, and I have been collaborating recently on RMAN backup to Data Domain deduplication storage arrays. In the process I have learned from Bruce (whom I gratefully acknowledge for the technical content of this post, including the scripts shown below) about a very valuable approach to RMAN backup.
Most DBAs do full RMAN backups of all databases which are small enough to fit into a backup window. Full backups have several advantages:
- They are very simple to run.
- They optimize the restore operation, as no redundant blocks need to be applied.
- They reduce dependencies in terms of the number of backup pieces that need to be cataloged and maintained in order to do a given restore, thereby reducing risk.
However, fulls have two strong disadvantages:
- They waste space since the entire database's content is stored in each full backup.
- They are time-consuming to run, because all active blocks in the database must be transferred to the backup target.
The first disadvantage is neatly avoided by the use of a deduplication array, such as Data Domain. My recent testing indicates that this is certainly true. More on this in a later post.
The second disadvantage is unavoidable at present. (Future developments in source-based deduplication on the Data Domain side may mitigate this later on, but I will ignore that for now.) For most DBAs, the trade-off is acceptable as long as the full backup fits into the available backup window. This is something of a moving target, since recent developments in storage networking (such as 10 GbE) have increased the available bandwidth, and with it the size of database that can fit into a backup window.
Unfortunately, the size of the typical database is also exploding, as we all know. Thus, the issue remains: for a very significant number of large databases, the full backup cannot fit into the backup window. For these databases, the DBA has no real choice: he or she must use RMAN incremental backup.
Incremental backup (absent incremental update, which I will discuss later) provides the DBA with the option to only back up blocks which have been updated since the last backup. This provides a dramatic savings in terms of backup time, but with a significant manageability cost:
- The number of backup pieces which must be cataloged and maintained increases significantly, creating more dependencies and thus increasing the risk of a failed restore.
- The restore time increases, because the same block may have been updated many times and stored in several separate incremental backups, so it must be restored multiple times.
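To make the dependency cost concrete, here is a rough sketch of the arithmetic. It assumes one level 0 backup followed by one differential level 1 incremental per day and ignores archive logs; the function name `backups_needed` is made up for this illustration.

```shell
#!/bin/sh
# Hypothetical illustration: count the backups a restore depends on.
# Assumes one level 0 backup followed by one differential level 1
# incremental per day (archive logs are ignored).
backups_needed() {
    # $1 = number of days since the level 0 backup
    expr 1 + "$1"
}

for day in 0 1 3 6; do
    echo "day ${day}: `backups_needed ${day}` backup(s) must all be intact"
done
```

Under these assumptions, a restore on the sixth day after the full depends on seven separate backups, any one of which can break the chain.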
Because of these issues, Oracle introduced incremental update for RMAN with 10g. There is an excellent article on the way this feature works in the Oracle documentation on OTN. Effectively, incremental update applies the incremental backup to the previous full backup. This neatly solves all of the issues of RMAN backup, except one: Instead of having a set of full backups, you end up with a single rolling full backup. In other words, once you apply the incremental backup to the previous full backup, you no longer have that previous full backup. Instead, you have a new full backup as of the point in time when you took the incremental backup.
This issue can be solved, obviously, by making a copy of the full backup prior to the incremental update operation. However, making this copy is time consuming, causes I/O, and takes up space.
Enter Bruce's solution: By using the Data Domain fast copy feature, you can make a copy of the previous full backup very quickly, and with no additional space. Further, the blocks which will be applied to the full backup copy have already been stored on the Data Domain in the incremental backup. Thus, they take up very little space as well. The end result is tantamount to RMAN backup nirvana:
- You can take your backups in a fraction of the time it would require to do a full backup.
- At the end of the process, you have a set of full backups for each time you run your RMAN backup.
- All backups are fully deduplicated, thus saving you lots of space.
- Restore operations are optimized as there are no dependencies other than a single full backup, and thus no redundant blocks to apply.
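As a minimal sketch of one backup cycle, the snippet below only prints the commands rather than running them. The RMAN statements and the fastcopy syntax are the ones used in Bruce's script further down, in the same order; the tag and paths are placeholder values, and the function names are invented for this example.

```shell
#!/bin/sh
# Sketch only: prints the commands for one backup cycle instead of running them.
# The tag and paths below are placeholder values, not taken from a real system.
TAG="orcl_example"
SRC="/backup/oracle/full-orcl-example"
DST="${SRC}-retain-example"

# Emit the two RMAN statements that implement incremental update for a tag
rman_commands() {
    cat <<EOF
backup incremental level 1 for recover of copy with tag $1 database;
recover copy of database with tag $1;
EOF
}

# Emit the Data Domain command that copies the updated full without moving data
fastcopy_command() {
    echo "filesys fastcopy force source $1 destination $2"
}

echo "1. RMAN (take the incremental, then roll the image copy forward):"
rman_commands "${TAG}"
echo "2. Data Domain (keep a space-efficient copy of the updated full):"
fastcopy_command "${SRC}" "${DST}"
```

Each cycle therefore leaves behind one more retained full, while only the changed blocks were actually sent to the array.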
The scripts to accomplish this are below. The caveats all apply: This is provided as is with no warranty or support obligations. You are on your own in terms of all that. However, the solution does work, and you should definitely give it a try. Also, note that this is Data Domain copyrighted material. However, I have received permission from Bruce Clarke, the author of this script, to publish it here.
Example script for weekly backups
This was created and tested on Sun Solaris so slight changes for other platforms might be required.
#!/bin/sh -f
# version 1.0 - Copyright (C) 2009, Data Domain, Inc - Nov, 2009
#
# Perform backup of an Oracle database using the
# incremental update procedure
#
# This script maintains periodic full backups by relying on the fact that
# a requested incremental is applied to the existing full, creating an
# updated full in case recovery is required.
#
# Arguments: $1 - "full" or "incremental"
# $2 - instance name to be backed up - Default: "orcl"
#
# specifying "full" starts a new "tag" thus creating a new full sequence
#
# The script also keeps everything in separate subdirectories. Because
# the underlying storage is assumed to be deduplicated, the actual storage
# consumed is similar to just running a series of incrementals
#
# Archivelogs and controlfiles are also backed up in the sequence
#
# directory naming scheme being followed
# $BACKUP_DIR :: path to make backup directory
# $BACKUP_DIR/full-$INSTANCE-date :: full backup directory for today
#
# IMPORTANT NOTE: THIS SCRIPT IS PROVIDED WITHOUT WARRANTY OR SUPPORT.
# FEEDBACK IS ALWAYS WELCOME, HOWEVER, IT SHOULD BE CLEARLY UNDERSTOOD THAT
# USE OF THIS SCRIPT SHOULD ONLY BE MADE AFTER AN EXPERIENCED ORACLE DBA
# CLEARLY UNDERSTANDS ALL STEPS IN THIS SCRIPT AND HAS MADE NECESSARY
# MODIFICATIONS AND PERFORMED SUFFICIENT TESTING TO BE ASSURED OF ITS
# PROPER OPERATION IN THEIR PARTICULAR ENVIRONMENT
#
# edit variables below as necessary
#
#DRYRUN=1 # Remove this line to allow script to run
# with this line, the script will only echo
# operations it would have performed
BACKUP_MOUNT="/dd/dd690a" # where "backup" directory on DD is mounted
BACKUP_DD_PATH="/backup/oracle"
BACKUP_DDSYS="[email protected]" # used to perform ssh fastcopy after
# completion
INSTANCE="orcl"
RMANID="sys"
RMANPW="oracle"
RMANCAT="NOCATALOG" # Modify if a catalog is being used
RETENTIONLOCK=no # "yes" sets the atime to lock the files; anything else leaves them unlocked
DELETE_OLDER=21 # how old a directory needs to be before it expires
# only relevant if RETENTIONLOCK is set to "yes"
MAXSIZE="10 g" # Maximum backup filesize - smaller allows replication to start
# sooner
# only relevant to archivelog backupsets. All other backups are "copy"
#### END OF VARIABLES
#### CHANGING VARIABLES BELOW THIS LINE WILL ALTER BEHAVIOR OF THIS SCRIPT
#
INCR_DATE="date_plus_3_weeks.pl" # script to give us a date in the future for
# "touch -atime <date> <file> to give us a lock date
case $1 in
full|FULL)
NEWBACKUP=Y
INCRBACKUP=N
;;
incremental|INCREMENTAL)
INCRBACKUP=Y
NEWBACKUP=N
;;
*)
echo "$0 - required argument \"full\" or \"incremental\" not specified"
exit 1
;;
esac
if [ "X$2" != "X" ]
then
INSTANCE=$2
fi
echo "-------------------------------------------"
if [ ${NEWBACKUP} = "Y" ]
then
echo "Performing New Level 1 image backup of Instance \"${INSTANCE}\" "
else
echo "Performing Incremental Level 1 image backup of Instance \"${INSTANCE}\" "
fi
echo "-------------------------------------------"
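# %d (zero-padded day) rather than %e, which pads single-digit days with a
# space and would put a space into the directory names below
DATE_SUFFIX=`/usr/bin/date '+%d%b%y'`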
BACKUP_DIR=${BACKUP_MOUNT}${BACKUP_DD_PATH}
FULL_BACKUP_DIR=${BACKUP_DIR}/full-${INSTANCE}-${DATE_SUFFIX}
INCR_BACKUP_DIR=${BACKUP_DIR}/incr-${INSTANCE}-${DATE_SUFFIX}
# set the backup directory that we care about for today
if [ ${NEWBACKUP} = Y ]
then
THIS_BACKUP_DIR=${FULL_BACKUP_DIR}
FASTCOPYPATH=${BACKUP_DD_PATH}/full-${INSTANCE}-${DATE_SUFFIX}
else
THIS_BACKUP_DIR=${INCR_BACKUP_DIR}
FULL_BACKUP_DIR=`cat ${BACKUP_DIR}/most_recent_full`
# derive fastcopypath
FASTCOPYPATH=""
Y=${FULL_BACKUP_DIR}
X=$Y
Z=""
while [ ${X} != "backup" ]; do
X=`basename ${Y}`
Y=`dirname ${Y}`
if [ x$Z = "x" ]
then
Z=$X
else
Z=$X/$Z
fi
done
FASTCOPYPATH=$FASTCOPYPATH/$Z
fi
#
if [ ! X${DRYRUN} = X ]
then
set -vn
fi
# check if directory exists
if [ ! -d ${BACKUP_DIR} ]
then
echo "Directory \"${BACKUP_DIR}\" not presently available - aborting"
exit 2
fi
# directory already exists? If so, delete it -- OPTIONAL
if [ -d ${THIS_BACKUP_DIR} ]
then
echo "Directory \"${THIS_BACKUP_DIR}\" exists: Deleting"
# rm -rf ${THIS_BACKUP_DIR}
fi
# Make sure we can create the new directory
mkdir -p ${THIS_BACKUP_DIR}
if [ $? != 0 ]
then
echo "Failed to create \"${THIS_BACKUP_DIR}\" - aborting"
exit 3
fi
# Delete old backups - OPTIONAL
# Shouldn't be necessary if RMAN is maintaining the Retention policy
#find ${BACKUP_DIR} -name 'full*' -type d -mtime +${DELETE_OLDER} -exec rm -rf {} \;
# Get the appropriate TAG for use with RMAN. Create a new one if full, else use the existing one
if [ ${NEWBACKUP} = "Y" ]
then
echo "${INSTANCE}_${DATE_SUFFIX}" > ${FULL_BACKUP_DIR}/tagfile
fi
# Now set it here
TAGTODAY=`cat ${FULL_BACKUP_DIR}/tagfile`
# Echo the tag for the record
echo "-----------------------------------------------"
echo "TAG that will be used is ${TAGTODAY}"
echo "-----------------------------------------------"
# create the subdirectory that holds just the archivelogs
mkdir -p ${THIS_BACKUP_DIR}/archivesonly
#
# Now finally get down to running the backup
#
# The command we want RMAN to run is:
# backup incremental level 1 for recover of copy with tag ${TAGTODAY} database;
rman target=${RMANID}/${RMANPW}@${INSTANCE} ${RMANCAT} << FULLBACK
run {
configure controlfile autobackup on;
configure controlfile autobackup format for device type disk to '${THIS_BACKUP_DIR}/%F';
allocate channel c1 device type disk format '${THIS_BACKUP_DIR}/%U';
allocate channel c2 device type disk format '${THIS_BACKUP_DIR}/%U';
allocate channel c3 device type disk format '${THIS_BACKUP_DIR}/%U';
backup incremental level 1 for recover of copy with tag ${TAGTODAY} database ;
release channel c1 ;
release channel c2 ;
release channel c3 ;
allocate channel c1 device type disk format '${THIS_BACKUP_DIR}/archivesonly/%U' maxpiecesize ${MAXSIZE};
allocate channel c2 device type disk format '${THIS_BACKUP_DIR}/archivesonly/%U' maxpiecesize ${MAXSIZE};
allocate channel c3 device type disk format '${THIS_BACKUP_DIR}/archivesonly/%U' maxpiecesize ${MAXSIZE};
backup tag ${TAGTODAY} archivelog all delete all input;
recover copy of database with tag ${TAGTODAY};
}
quit
FULLBACK
# Done
echo Image backup of instance \"${INSTANCE}\" complete
echo --------------------------------------------
echo " Deleting expired backups from RMAN catalog"
echo --------------------------------------------
# Delete expired backups
rman target=${RMANID}/${RMANPW}@${INSTANCE} ${RMANCAT} << DELEXP
delete noprompt expired backup;
delete noprompt expired archivelog all;
delete noprompt expired copy;
quit
DELEXP
#
# Fastcopy the full directory we just updated so it is retained
#
NEWDIR=${FASTCOPYPATH}-retain-${DATE_SUFFIX}
ssh ${BACKUP_DDSYS} "filesys fastcopy force source ${FASTCOPYPATH} \
destination ${NEWDIR} "
# now bring the second copy into the RMAN catalog
#
rman target=${RMANID}/${RMANPW}@${INSTANCE} ${RMANCAT} << CATHERE
catalog start with '${BACKUP_MOUNT}${NEWDIR}';
yes
quit;
CATHERE
# record the backup directory so the incremental can find it
#
if [ ${NEWBACKUP} = Y ]
then
rm -f ${BACKUP_DIR}/most_recent_full
echo ${FULL_BACKUP_DIR} > ${BACKUP_DIR}/most_recent_full
fi
#
# If Retention Lock is going to be used, set the access time on the new files
#
if [ ${RETENTIONLOCK} = "yes" ]
then
RETAINUNTIL=`date_plus_3_weeks.pl`
export RETAINUNTIL
echo --------------------------------------------
echo Locking file until ${RETAINUNTIL}
echo --------------------------------------------
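# touch -t expects [[CC]YY]MMDDhhmm, so the helper is assumed to print a
# zero-padded YYYYMMDD date, to which midnight (0000) is appended here
find ${FULL_BACKUP_DIR} -name '*' -exec touch -a -t ${RETAINUNTIL}0000 \{\} \;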
fi
# Script complete
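The loop that derives FASTCOPYPATH for an incremental run is the least obvious part of the script, so here is a standalone sketch of the same idea for experimentation. The function name `derive_fastcopy_path` and the error handling are my additions, not part of Bruce's script; the sample path simply combines the default BACKUP_MOUNT and BACKUP_DD_PATH values with a made-up date.

```shell
#!/bin/sh
# Standalone version of the path-derivation loop, for experimentation.
# Walks up the directory tree until it reaches the "backup" component
# and prints the path from there down, with a leading slash.
derive_fastcopy_path() {
    rest=$1
    rel=""
    comp=""
    while [ "${comp}" != "backup" ]; do
        # Give up if we reach the top without finding "backup"
        if [ "${rest}" = "/" ] || [ "${rest}" = "." ]; then
            return 1
        fi
        comp=`basename "${rest}"`
        rest=`dirname "${rest}"`
        if [ -z "${rel}" ]; then
            rel=${comp}
        else
            rel=${comp}/${rel}
        fi
    done
    echo "/${rel}"
}

# prints /backup/oracle/full-orcl-05Nov09
derive_fastcopy_path /dd/dd690a/backup/oracle/full-orcl-05Nov09
```

In other words, the mount point on the database host is stripped off, leaving the path as the Data Domain system itself sees it, which is what the fastcopy command needs.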
Perl script that provides a date string three weeks out
This script is called by the weekly full script shown above. The actual desired retention should be altered here or else the script parameterized and the desired retention passed as an argument.
#!/usr/bin/perl
use strict;
use DateTime;
# get today's date
my $dt2 = DateTime->today();
# move it three weeks into the future
$dt2->add(weeks => 3);
# print as zero-padded YYYYMMDD so the result is unambiguous
print $dt2->ymd(''), "\n";
Interpreting the results
Below are some space reporting results from the Data Domain system used in the development of the script shown above. The first output was taken before the script was invoked with the “incremental” option and the second after it completed. Please note that the database was open but undergoing little change, so a significant amount of the incremental backup storage was consumed in backing up the handful of archive logfiles that had been created. Nonetheless, the report shows that 4.5GB of additional data was introduced to the Data Domain but only an additional 16MB of space has been consumed.
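Taking the two figures quoted above at face value, the reduction for this one incremental run can be worked out with a line of shell arithmetic. This is only a rough sketch: it assumes 1GB = 1024MB, and the function name `reduction_ratio` is made up for the example.

```shell
#!/bin/sh
# Figures quoted in the text: 4.5GB written, 16MB of new space consumed.
# Assumes 1GB = 1024MB; expr does integer division, so the ratio is rounded down.
WRITTEN_MB=4608      # 4.5GB expressed in MB
CONSUMED_MB=16

reduction_ratio() {
    # $1 = MB written by the backup, $2 = MB of new space consumed
    expr "$1" / "$2"
}

echo "Approximate reduction for this run: `reduction_ratio ${WRITTEN_MB} ${CONSUMED_MB}`:1"
```

That works out to roughly 288:1 for this particular run, which reflects how little the database was changing at the time rather than a ratio to expect in general.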