
August 03, 2007


Comments

Odafe

Well Mr. Oracle,
I would say I am very impressed with your facts and figures. I am just a NetApp customer and would not have more experience than you.
I read it all through from bottom to top, following the trend with Oracle backup and recovery. I would like to send you an email: my email address is odafeuk@yahoo.co.uk. Let's talk more.

thanks.

Odafe

Jonathan Marianu

Hello, I have both NetApp and Symmetrix storage.

The database in question is RAC OLTP, and its workload is mostly random R/W.

I'd like to use snapshots as the primary recovery mechanism. I would keep 48 snapshots of the database file system.

Please correct me if I am in error.
On a COFW system, it seems to me that there is an inverse relationship between the number of snapshots and the random R/W performance of the primary LUN: each additional snapshot decreases performance linearly.


I would think a WAFL file system could sustain the same random R/W performance regardless of the number of snapshots.


On the other hand, a COFW system excels if I am using a short-lived snapshot to make a backup to tape and then delete the snapshot after the backup completes. Once that snapshot is deleted, there is no performance penalty.

A WAFL system still has the read penalty regardless of whether there are snapshots or not.

Is my thinking correct or am I in error?

Jonathan Marianu

Actually I see my error now.
A COFW system using multiple snapshots copies the original block to the RLP once and just updates the pointers to that block in the other snapshots.

So subsequent snaps do not impact performance as much as the first snap does.

When determining the impact on performance, the choice is not how many snaps to keep but whether to maintain a persistent snapshot during production hours at all.

Thoughts?

Response by TOSG:

The COFW penalty paid by a set of snapshots can be easily determined. Let's look at the simplest case: Two snapshots. In that case, the COFW penalty is equal to the number of blocks which are updated for the first time following the creation of either or both snapshots. The reason I say either or both is because many blocks will be shared in common by both snapshots.

It would be best to give you an example in the form of a block diagram, as I do in my blog. However, responding to comments does not afford me that luxury. Perhaps I will do so later in a post.

Suffice it to say, if the two snapshots share blocks (which will normally be the case for the vast majority of blocks in a file system or LUN), then a subsequent update to a shared block will incur only one COFW penalty for that block.

Only if the two snapshots do not share a particular block (because it was updated after the creation of the oldest snapshot, but before the creation of the newest snapshot) does the penalty get paid twice. This will occur once before the creation of the second snapshot, and once after it.

Inserts create another special case. A block inserted after a snapshot does not incur any penalty at all with respect to that snapshot. Thus, a block inserted after the first snapshot but updated after the second snapshot incurs one COFW penalty. And blocks inserted after the newest snapshot incur no penalty.

It seems confusing, but it really isn't. In practice, the creation of two snapshots in a file system or LUN where the vast majority of the blocks are shared will incur a COFW penalty which is extremely close to that of a single snapshot. If fewer of the blocks are shared, then it will be closer to that of two.
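The cases above can be sketched as a small counting model. This is only a toy illustration, not how any particular array implements COFW: the class `CofwVolume`, its methods, and the block ids are all hypothetical, and the model tracks nothing except how many copy penalties are paid.

```python
class CofwVolume:
    """Toy copy-on-first-write volume that only counts copy penalties."""

    def __init__(self, blocks):
        self.live = set(blocks)   # block ids currently allocated
        self.unpreserved = []     # per snapshot: blocks it still shares with the live volume
        self.penalties = 0        # number of before-image copies made so far

    def snapshot(self):
        # A new snapshot initially shares every allocated block.
        self.unpreserved.append(set(self.live))

    def insert(self, block):
        # A newly inserted block is held by no existing snapshot: no penalty.
        self.live.add(block)

    def update(self, block):
        needing = [s for s in self.unpreserved if block in s]
        if needing:
            # One copy serves every snapshot that still shares this block.
            self.penalties += 1
            for s in needing:
                s.discard(block)


def demo():
    """Run the four cases described in the text and return the penalty counts."""
    results = {}

    v = CofwVolume(range(4))          # block shared by both snapshots
    v.snapshot(); v.snapshot()
    v.update(0); v.update(0)          # second update is not a first write
    results["shared"] = v.penalties

    v = CofwVolume(range(4))          # block updated between the two snapshots
    v.snapshot(); v.update(1)
    v.snapshot(); v.update(1)
    results["not_shared"] = v.penalties

    v = CofwVolume(range(4))          # inserted after the first, updated after the second
    v.snapshot(); v.insert(9)
    v.snapshot(); v.update(9)
    results["inserted_between"] = v.penalties

    v = CofwVolume(range(4))          # inserted after the newest snapshot
    v.snapshot(); v.snapshot()
    v.insert(9); v.update(9)
    results["inserted_after_newest"] = v.penalties

    return results
```

Calling `demo()` gives one penalty for the shared block, two for the block updated between the snapshots, one for the block inserted between them, and none for the block inserted after the newest snapshot.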

In my experience multiple snapshots usually share the vast majority of the blocks in a file system or LUN, as I state above.

Let me know if this needs further clarification.

Regards,
Jeff

Jonathan Marianu

Let me restate and you can tell me if I have it correct.
The COFW feature consists of two components:
-an RLP block journal,
-one or more snapshot block maps.
When a write request occurs on the primary LUN, a copy of the original block is added to the RLP journal and the snapshot block maps are updated as appropriate. Regardless of the number of snapshots, the changed block is only written to the RLP journal once. When a snapshot is deleted, the blocks that only it references are deleted from the RLP journal. If this is the case, that seems efficient.

--------

Essentially correct. The RLP (Reserved LUN Pool, for those not familiar with that term) is designed to be smart enough to know when a before image of a given block is required by more than one snapshot. In that event, a single copy is stored in the RLP journal, and a pointer to that copy is stored in the snapshot block map for each snapshot that requires that block to be preserved. This is why a single COFW penalty is paid for multiple snapshots when a given block is updated.
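A minimal sketch of that arrangement, assuming a fixed-size LUN and a reference count per journal entry. The class `CofwLun` and all of its fields are hypothetical names chosen for illustration; the actual RLP data structures are not described here.

```python
class CofwLun:
    """Toy LUN with one before-image journal and a block map per snapshot."""

    def __init__(self, data):
        self.blocks = list(data)  # live contents of the primary LUN
        self.journal = {}         # copy id -> before image (stands in for the RLP journal)
        self.refs = {}            # copy id -> number of snapshot maps pointing at it
        self.maps = {}            # snapshot name -> {block index: copy id}
        self.pending = {}         # snapshot name -> block indices still shared with the LUN
        self._next_id = 0

    def snapshot(self, name):
        self.maps[name] = {}
        self.pending[name] = set(range(len(self.blocks)))

    def write(self, index, value):
        needing = [n for n, p in self.pending.items() if index in p]
        if needing:
            # Store the before image once, then point every snapshot at it.
            copy_id = self._next_id
            self._next_id += 1
            self.journal[copy_id] = self.blocks[index]
            self.refs[copy_id] = len(needing)
            for name in needing:
                self.maps[name][index] = copy_id
                self.pending[name].discard(index)
        self.blocks[index] = value

    def read_snapshot(self, name, index):
        # Unmapped blocks are still shared with the live LUN.
        copy_id = self.maps[name].get(index)
        return self.blocks[index] if copy_id is None else self.journal[copy_id]

    def delete_snapshot(self, name):
        # Free only the journal entries that no other snapshot references.
        for copy_id in self.maps.pop(name).values():
            self.refs[copy_id] -= 1
            if self.refs[copy_id] == 0:
                del self.refs[copy_id]
                del self.journal[copy_id]
        del self.pending[name]
```

For example, with snapshots `s1` and `s2` both taken before a write to block 1, the journal holds a single before image of that block with a reference count of two, and deleting `s1` leaves that entry in place for `s2`.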

Regards,
TOSG


Todd Bourne

I would guess that the RLP tracks whether a block is dirty using a bitmap indexed by data block offset.

Which raises further questions in my mind.
1. Do RLP systems offer variable block sizes?
2. Do differing work loads benefit from different block sizes?
3. Has there been any benchmarking done on this?

Thanks very much for a great series of articles.

Todd Bourne

-------------

Response by TOSG:

I am not aware of storage systems providing variable block sizes. A cursory examination of both Symmetrix and CLARiiON does not reveal this. As I recall, NetApp sticks to a 4 KB block size as well. If you have further information please let me know.

Paul Lewis-Borman

Hi Jeff,

Great blog. My question is essentially about restoring to a snapshot.

If you are using a snapshot as a rollback mechanism during an upgrade, say, and you need to roll back with a COFW system (EMC), does it rewrite the main LUN blocks from the RLP, or does it just repoint the changed blocks back to the RLP versions?

If the latter, then that means that after rolling back to a snapshot you end up with a more fragmented block layout than when you started. If the former, then there is a write penalty while the RLP blocks are rewritten to the main LUN.

With the NetApp approach, assuming you take regular snapshots anyway and hence have a degree of inherent fragmentation, there's effectively little difference after rolling back, plus the rollback itself would be more or less instant as no block re-writes are required.

Is that right?

View Jeff Browning's profile on LinkedIn

Disclaimer: The opinions expressed here are my personal opinions. I am a blogger who works at EMC, not an EMC blogger. This is my blog, and not EMC's. Content published here is not read or approved in advance by EMC and does not necessarily reflect the views and opinions of EMC.