In my past few posts, I have explored the risks and benefits
of snapshot technologies from both NetApp and EMC. This series has covered:
- Part 1: The nature of snapshots and their benefits to the Oracle user
- Part 2: Snapshot performance overhead
- Part 3: Writable snapshots
In this post, the last of this series, I will discuss the manner in which a snapshot can
consume so much space that it will cause writes to the active file system to
fail, as well as the mechanisms which NetApp and EMC have created to avoid this fate.
Yes, it is true. You can get an ENOSPACE error when you are
using a metadata approach for creating snapshots, which is the way WAFL manages snapshots on a NetApp filer. Recall a couple of
posts ago, when I included this diagram:
Note that the additional blocks required by the snapshot are
invading the free space in the active file system. It is actually the light-colored
blocks (the “before” images of the blocks) which are held by the snapshot. At
NetApp, we used to have debates over whether the snapshot occupied the space, or
whether it was the active file system that did so. Whatever. The effect is
exactly the same. The storage space cost of a snapshot is equal to the number
of blocks which have been updated since the creation of the snapshot. Thus, you
can think of the storage space overhead of snapshots in this way:
From this diagram, you see that we are running a file system
that is about 70% full. We have another 10% of snapshot overhead. This leaves
roughly another 15% of headroom before the file system runs out of space.
Absent space reservations, you could do this:
All available space has now been fully occupied by snapshot storage
overhead, even though there has been no increase in the amount of data in the
active file system. This is because we kept this snapshot around too long: A
sufficient number of blocks were updated after creating the snapshot to exhaust
all empty space. The next write to this file system will get an ENOSPACE error. This includes updates to files already in the active file system, even updates that require no additional space to be allocated.
Hence the common NetApp heuristic: “Old snapshots are dear; new
snapshots are cheap.”
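To make the accounting concrete, here is a minimal Python sketch of a shared block pool. It is a toy model under stated assumptions, not WAFL's actual implementation: the class `SharedPoolVolume`, its fields, and the block counts are all hypothetical, and the only rule it encodes is the one described above, namely that the first update to a block after the snapshot pins the "before" image and consumes one free block. The code uses `errno.ENOSPC`, which is how POSIX systems spell the out-of-space error.

```python
import errno


class SharedPoolVolume:
    """Toy model: active file system and snapshot share one pool of blocks."""

    def __init__(self, total_blocks, active_blocks):
        self.total = total_blocks
        self.active = active_blocks   # blocks holding live data
        self.snapshot_held = 0        # "before" images pinned by the snapshot
        self.has_snapshot = False
        self.rewritten = set()        # live blocks already updated since the snapshot

    def free(self):
        return self.total - self.active - self.snapshot_held

    def create_snapshot(self):
        # Metadata-only operation: no blocks are copied at creation time.
        self.has_snapshot = True
        self.rewritten.clear()

    def overwrite(self, block_no):
        """Update an existing block in place; the file itself does not grow."""
        if self.has_snapshot and block_no not in self.rewritten:
            # The old image must be kept for the snapshot, so the new image
            # needs a block from the shared free space.
            if self.free() == 0:
                raise OSError(errno.ENOSPC, "no free block for the new image")
            self.snapshot_held += 1
            self.rewritten.add(block_no)
        # Otherwise the old image is not pinned and the net space use is unchanged.


# Hypothetical numbers: 100-block volume, 70 blocks of live data.
vol = SharedPoolVolume(total_blocks=100, active_blocks=70)
vol.create_snapshot()
for block in range(30):
    vol.overwrite(block)      # each first-time update pins one "before" image
print(vol.free())             # 0: the data did not grow, yet the pool is full

try:
    vol.overwrite(30)         # an update to an already-allocated block
except OSError as err:
    print(err.errno == errno.ENOSPC)   # True
```

Note that the amount of live data stays at 70 blocks throughout. The failure comes entirely from the age of the snapshot, measured in updated blocks.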
This was a depressingly common issue at NetApp while I was
there, particularly with storage administrators who migrated to NetApp NAS from
a more traditional SAN storage environment (typically EMC). Those folks would
behave like good storage professionals: They would utilize all available space.
They regarded free space as wasted space. Further, these folks tended to think
that if they had created an Oracle datafile of 100 GB in size, then that file was
locked down and in place. They regarded a storage device returning an ENOSPACE error as a result of an update to that file as naughty, irrational, and strange.
For these well-behaved storage professionals, the good habits
they had developed in the SAN context were a formula for disaster when dealing
with NetApp snapshots in a NAS context. By running with little or no free
space, they allowed no headroom for the snapshot overhead. Thus, ENOSPACE
errors were common.
I used to refer to snapshots as having a “dark side”. This is
the dark side I was talking about. The space allocated to a datafile is no
longer guaranteed. Once you make a snapshot, writes to that file can run out of
space anyway, even though the file is already fully allocated in the file system.
This led NetApp to introduce the notion of space reservations.
The architect of this concept was Bruce Gordon, the SAN marketing guy hired by
Rich Clifton during the 2000 to 2001 period. I will readily admit that I
fiercely resisted this concept. Basically, what space reservations do is
simple. If there are not enough free blocks in the file system to completely
duplicate all of the existing data, then the snapshot creation fails. An
illustration will help. With space reservations, if you had this:
You could not create a snapshot at all. You do not have enough
free space to duplicate the existing data. You must either free some space or
add capacity. Assuming you add capacity, you could at that point create a
snapshot:
Snapshot overhead then begins to invade the reserved space.
As you begin to accumulate updated blocks, the snapshot overhead looks like
this:
Since you have reserved enough space to duplicate all of the
data that existed at the time of the creation of the snapshot, theoretically an
ENOSPACE error is impossible.
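The reservation rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as I have stated it, with hypothetical function names and block counts. It is not the actual Data ONTAP logic, and it ignores any overhead from snapshots that already exist.

```python
def snapshot_allowed(total_blocks, active_blocks):
    """Rule as described above: free space must cover a full copy of the live data."""
    free_blocks = total_blocks - active_blocks
    return free_blocks >= active_blocks


def blocks_to_add(total_blocks, active_blocks):
    """Extra capacity needed before the rule would permit a snapshot."""
    return max(0, 2 * active_blocks - total_blocks)


# Hypothetical volume: 100 blocks, 70 of them holding live data.
print(snapshot_allowed(100, 70))   # False: 30 free blocks cannot cover 70
print(blocks_to_add(100, 70))      # 40
print(snapshot_allowed(140, 70))   # True: 70 free blocks cover the worst case
```

In the worst case every one of the 70 live blocks is overwritten once, which pins 70 "before" images. With 140 blocks in total, that worst case exactly fills the volume and no more.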
I said previously that I resisted this concept. I used to tell
Bruce Gordon that as far as I was concerned, he was an EMC plant. Why? Because
space reservations destroy the one primary benefit of snapshots: Space
efficiency.
Go all the way back to my first post in this series. I stated
that the gold standard for Storage Layer Instantaneous Copy (SLIC) technologies
is BCVs. BCVs have lots and lots of advantages. They have absolutely no
performance penalty. They work beautifully. They have only one downside: They
require another set of disks. Before space reservations, snapshots did not. By providing the same basic
functionality as BCVs (instantaneous copy) without the storage overhead of
another set of disks, snapshots became the best way to do the job of Oracle database instantaneous hot backup.
With space reservations, the cost of snapshots became effectively
the same as BCVs. In that case, BCVs win. They do not have the
performance issues that metadata-based snapshots do. (This performance trade-off is discussed in detail in Part 2 of this series.) Removing the cost advantage of snapshots over BCVs was a major erosion
in NetApp’s core value proposition.
But, as Bruce Gordon said, “No customer will ever have an
ENOSPACE error on my watch.” Bruce attempted to establish a principle that space
would always be reserved such that a snapshot could never exhaust the active
file system free space.
Unfortunately, FlexClones, covered in detail in my previous
post, violate this principle. That is because FlexClones create another write
thread. Remember that each write thread has the potential to double the space
requirements, by overwriting every block in the snapshot. That was
illustrated by the following diagram from my previous post:
Note how FlexClone increases the space requirements by
adding another set of “after” image blocks to the mix. Simply reserving space
for one set of additional blocks is now insufficient. You would now need to
reserve space for two. Thus FlexClones make the following scenario possible:
You are now out of space again. The next write will get an
ENOSPACE error.
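The arithmetic behind that scenario can be sketched as follows. This is a simplified worst-case model with hypothetical names and numbers, assuming that every write thread (the active file system, plus each clone) eventually overwrites every block held by the snapshot.

```python
def worst_case_blocks(snapshot_blocks, write_threads):
    """One shared set of "before" images plus one full set of
    "after" images for each write thread."""
    return snapshot_blocks * (1 + write_threads)


def shortfall(total_blocks, snapshot_blocks, write_threads):
    """Blocks missing if every write thread overwrites everything."""
    return max(0, worst_case_blocks(snapshot_blocks, write_threads) - total_blocks)


# Hypothetical volume sized for one write thread: 70 blocks of data,
# 70 blocks reserved, 140 blocks in total.
print(shortfall(140, 70, write_threads=1))   # 0: the reservation holds
print(shortfall(140, 70, write_threads=2))   # 70: a clone adds a second write thread
```

A reservation sized for a single set of "after" images covers the snapshot alone. Add one clone and the worst case needs a second full set, which the reservation does not contain.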
EMC snapshots make all of this impossible. By using a reserved
LUN pool approach, EMC simply allocates the space required for the snapshot.
The snapshot space is not shared with the active file system space. Thus, it is
impossible for the active file system to receive ENOSPACE from a snapshot. The
following graphic illustrates this:
The snapshot space is contained within the RLP. It is not
shared with the active file system. Running out of space within the RLP will
cause the snapshot to become invalidated. But it will not affect the active
file system at all. An ENOSPACE error can never be returned to the active file system
with this design, unless the user exhausts the space in the active file system itself. Further, you decide how much space you want to allocate to the snapshot. Unlike WAFL-based snapshots, you are not writing a blank check for snapshot overhead, up to the full amount of data in the active file system. Rather, you can decide that the snapshot will only be allowed to take up 10% of that space if you want to. This adds discipline to the whole proposition of snapshot space overhead.
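Here is a minimal Python sketch of the separate-pool behavior just described. It is a toy model, not EMC's implementation: the class `SourceLunWithRlp`, its fields, and the sizes are hypothetical, and what happens to the pool space after a snapshot is invalidated is deliberately not modeled. The only behavior it encodes is that "before" images go to a dedicated pool, and that exhausting the pool costs you the snapshot rather than the write.

```python
class SourceLunWithRlp:
    """Toy model: snapshot "before" images live in a separate reserved pool."""

    def __init__(self, lun_blocks, rlp_blocks):
        self.lun_blocks = lun_blocks
        self.rlp_capacity = rlp_blocks   # size chosen by the administrator
        self.rlp_used = 0
        self.snapshot_valid = False
        self.copied = set()              # blocks whose old image is already in the pool

    def create_snapshot(self):
        self.snapshot_valid = True
        self.rlp_used = 0
        self.copied.clear()

    def overwrite(self, block_no):
        """Update an existing block; returns True because the write always lands."""
        if not 0 <= block_no < self.lun_blocks:
            raise IndexError("block outside the LUN")
        if self.snapshot_valid and block_no not in self.copied:
            if self.rlp_used == self.rlp_capacity:
                # Pool exhausted: the snapshot is sacrificed, not the write.
                self.snapshot_valid = False
            else:
                self.rlp_used += 1
                self.copied.add(block_no)
        return True


# Hypothetical sizes: 100-block LUN, pool capped at 10% of that.
lun = SourceLunWithRlp(lun_blocks=100, rlp_blocks=10)
lun.create_snapshot()
for block in range(10):
    lun.overwrite(block)
print(lun.snapshot_valid)    # True: the pool is full but the snapshot is intact
print(lun.overwrite(10))     # True: the write to the active data still succeeds
print(lun.snapshot_valid)    # False: the snapshot has been invalidated
```

The cap of 10 blocks is the "discipline" mentioned above: the cost of the snapshot is bounded by a number you picked, and exceeding it never surfaces as an error on the active data.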
Once again, it is for you as the customer to judge the relative
merits of these approaches. In my series on snapshots, I have attempted to
bring clarity to the debate between EMC and NetApp on the benefits and risks of
snapshots for Oracle database backup. Based upon the number of comments this
series has received, I think you are hearing me.
Future posts on this blog will cover how EMC NAS compares to
NetApp NAS for Oracle database storage.