
In this post, I will discuss the concept of thin provisioning and database storage. There has been a lot of confusion on this issue recently, and I think that we need to clear the air and get some things straight.

First, I would like to explain why thin provisioning is pretty much useless for Oracle datafiles right now. Then I will turn to how thin provisioning could be integrated into Oracle, at which point it would be very interesting indeed.

OK, to start, I should define what I mean by thin provisioning. Assume you are a storage administrator, and you have a bunch of customers who need to store tons and tons of data. Many of you folks reading this post are probably in that category. What consumers of data on networked storage devices tend to do is the following:

  1. Figure out how much space they need to store their data.
  2. Double that because they know the data will grow over time.
  3. Double it again for ducks.

The effect is that storage consumers tend to be very liberal in their demands for storage. This becomes a problem for the storage administrator because costs balloon and utilization on the array is very low. A formula for aggressively and stupidly wasting money in other words. The following graphic illustrates this:

[Figure: Thin1small]

What you would like to do is the following:

  1. Give the customer a block of storage which is the size they are demanding.
  2. Under the covers, allocate a more realistic amount of storage, closer to what you think they really need right away.
  3. Monitor the storage, and add physical space to the device when you need it, in order to meet the demands for what your customer really needs.

The following graphic shows how this would work:

[Figure: Thin2small]

Note that the light grey, plus the yellow, represents the total space on the device. The over-provisioned space is space which the array tells the customer he or she has, but which does not exist physically. It is then the storage administrator’s responsibility to make sure that physical space (i.e. disks) is added to the array in a timely manner in order to meet the customer’s expectations for space.
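To make the bookkeeping concrete, here is a minimal sketch of the idea in Python. It is a toy model, not how any particular array implements thin provisioning; the `ThinVolume` class and all of its method names are hypothetical and exist only for illustration. The volume reports one size to the host, backs blocks with real storage only on first write, and fails if the administrator has not added physical space in time.

```python
class ThinVolume:
    """Toy model of a thin-provisioned volume (hypothetical, for illustration)."""

    def __init__(self, advertised_blocks, physical_blocks):
        self.advertised_blocks = advertised_blocks  # size reported to the host
        self.physical_blocks = physical_blocks      # space that really exists
        self.allocated = set()                      # blocks backed by real storage

    def write(self, block):
        """Back a block with physical storage the first time it is written."""
        if not 0 <= block < self.advertised_blocks:
            raise IndexError("write beyond advertised size")
        if block not in self.allocated:
            if len(self.allocated) >= self.physical_blocks:
                # The shell game has been called: time to add disks.
                raise RuntimeError("out of physical space")
            self.allocated.add(block)

    def add_physical(self, blocks):
        """The storage administrator adds real capacity behind the volume."""
        self.physical_blocks += blocks

    def utilization(self):
        """Fraction of the physical space that is actually in use."""
        return len(self.allocated) / self.physical_blocks


# The customer asked for 1000 blocks; only 100 exist physically.
vol = ThinVolume(advertised_blocks=1000, physical_blocks=100)
for b in range(80):
    vol.write(b)
print(vol.utilization())  # 0.8
```

The point of the model is that physical consumption follows the blocks actually written, not the size the customer was promised.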

If this sounds like lying, you’re right. It certainly is. It’s a shell game. A very useful and cost effective shell game for many folks, though, given storage consumers’ propensity to ask for a lot of space they don’t need.

One assumption that is required in order for this to work at all, however, is the ability for the array to lie to the host operating system about the size of a file system or LUN which is served up to the host. For a file system like NFS or CIFS on a NAS box, this works well, as long as the data being stored is unstructured. Any given file is of a certain size, and that file’s space must be available for it to be stored. But the entire file system can certainly lie to the host and say its capacity is a terabyte, when it is really only 100 GB. No problem there, as long as the physical size is adequate to store the amount of data represented by the files actually in the file system.

The issue comes when you try to store structured data like Oracle. A DBA on an Oracle database will request the amount of space he or she thinks the database will need, with the same propensity for over-provisioning as any other consumer of data. The difference, though, is that the DBA will actually create datafiles which fill that space. And Oracle then makes a physical file on the device of that size, creates extents in this file, and writes zeros to it.
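A rough way to see the difference, as a sketch only: assume the array backs every block that has ever been written, and compare a file system that merely holds some files with a datafile that has been zero-filled to its full declared size. The function and the tuple layout below are hypothetical, made up for this illustration.

```python
def physical_blocks_needed(files):
    """Physical blocks the array must back for a list of files.

    Each entry is (declared_blocks, blocks_with_real_data, zero_filled).
    This is a simplified assumption, not a description of any real array.
    """
    total = 0
    for declared, used, zero_filled in files:
        # A zero-filled file has had every block written, so all of its
        # declared size must be backed; otherwise only written blocks count.
        total += declared if zero_filled else used
    return total


# 1000 blocks were requested in both cases, and 100 hold real data.
unstructured = [(1000, 100, False)]
datafile = [(1000, 100, True)]

print(physical_blocks_needed(unstructured))  # 100
print(physical_blocks_needed(datafile))      # 1000
```

In this model the zero-filled datafile leaves nothing for the array to hold back, which is the whole problem.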

There is a concept in Oracle of auto extension of files. This concept seems like it would align well with thin provisioning. And it would if DBAs used it. Problem is, extending a file is a very expensive operation. Again, because Oracle likes to lock down that file and zero it out completely. DBAs hate that. A huge performance hit kicking the database in the teeth at any unpredictable time. Simply because the datafile ran out of space. Not good. Not good at all.
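Here is a toy model of that trade-off, under the assumption stated above that every extension is zero-filled up front. The function name and parameters are hypothetical, and the model says nothing about how long the zeroing takes on real hardware; it only counts how often an extension is triggered and how many blocks get written as a side effect.

```python
def autoextend(size_blocks, needed_blocks, increment_blocks):
    """Grow a toy datafile until it can hold needed_blocks.

    Returns (new_size, extensions, blocks_zeroed).
    """
    extensions = 0
    blocks_zeroed = 0
    while size_blocks < needed_blocks:
        size_blocks += increment_blocks
        extensions += 1
        # Assumption: each extension is zero-filled before it can be used.
        blocks_zeroed += increment_blocks
    return size_blocks, extensions, blocks_zeroed


print(autoextend(1000, 1250, 100))  # (1300, 3, 300)
```

A small increment means more frequent interruptions; a large increment means fewer interruptions, but each one zeroes more space at once.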

DBAs avoid auto extending files like the plague for this reason. They will allocate the space they need, always, when they create the database. And future expansions in space will be made intelligently, carefully, and methodically. That’s the way DBAs think. Believe me, I know. I am one of them.

This makes thin provisioning completely useless nonsense for Oracle data. Anyone who tells you otherwise should be viewed with deep suspicion. I say this without any bias whatsoever, since my employer sells arrays that provide thin provisioning too. I am simply telling you the way it is here.

Now, how could thin provisioning be made to work with Oracle? That’s a very interesting question. It would require integration between the storage array and the Oracle kernel. Then Oracle could avoid zeroing out a file, and simply allow the array to provide the storage Oracle needs to store the blocks presently in the file. This would probably occur within ASM. (ASM stands for Automatic Storage Management, Oracle’s storage layer.) There is some discussion of that type of integration between storage and Oracle, but do not expect to see it anytime soon if ever.


Comments

So I'm no Oracle guy, but I do have a question regarding the following comment:

"The difference, though, is that the DBA will actually create datafiles which fill that space. And Oracle then makes a physical file on the device of that size, creates extents in this file, and writes zeros to it."

So what would the effect be in the above scenario if NetApp were to turn on their deduplication capability on the volume, with or even without thin provisioning enabled? Given that the file has a lot of zeros in it, there are a ton of duplicate blocks, which means that these blocks can easily be shared.

Any comment?

Network Appliance's Deduplication (A-SIS) is very interesting, but not particularly applicable in this space. The primary use case is LAN B2D (i.e. CIFS/NFS on a low-cost storage subsystem - or on a SnapVault target).

A-SIS is run on a manual or scheduled basis, and deduplicates at the 4K block level. So, if you have a bunch of zeros, the data would be deduplicated when A-SIS was run, but then not again until it was run next. Those zeros are going to change pretty rapidly.
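As a rough illustration of what fixed-size block deduplication does with a zero-filled file, here is a small Python sketch. It is not A-SIS's actual algorithm; fingerprinting each 4K block with SHA-256 and counting distinct fingerprints is an assumption made purely for the example, and `unique_blocks` is a made-up helper.

```python
import hashlib

BLOCK_SIZE = 4096  # 4K granularity, as described above


def unique_blocks(data):
    """Count the distinct 4K blocks in a byte string by fingerprint."""
    digests = set()
    for offset in range(0, len(data), BLOCK_SIZE):
        block = data[offset:offset + BLOCK_SIZE]
        digests.add(hashlib.sha256(block).digest())
    return len(digests)


# A freshly zeroed 1 MiB "datafile": 256 blocks, all identical.
zeroed = bytes(BLOCK_SIZE * 256)
print(unique_blocks(zeroed))  # 1

# Once the database writes distinct data into 10 of those blocks,
# they no longer match the zero block (or each other).
buf = bytearray(zeroed)
for i in range(10):
    buf[i * BLOCK_SIZE] = i + 1
print(unique_blocks(bytes(buf)))  # 11
```

The zeroed file collapses to a single block right after a dedup pass, but every block the database subsequently fills stops being shareable until the next pass, and then only if it happens to match something else.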

Network Appliance's approach is innovative and different - analogous to a trash compactor (run periodically) rather than a shredder (run in real time) à la DataDomain and Avamar.

So - if used on a production filesystem, the filesystem needs to be big enough for the "pre-deduped" content, which then goes through compression. In the Oracle on NFS case, this wouldn't help, since the filesystem would then need to be shrunk. It would help only in the case that Jeff describes, though - very large scale, very oversubscribed. There, where some databases would be growing fast and others stagnant, you could "compact" and be oversubscribed with minimal risk - so long as someone was watching the henhouse.

So if not on production (i.e. not applicable where the poster suggests), then what about on a B2D target? It's arguable which de-dupe approach works better - NetApp style, DataDomain style or EMC (Avamar) style. NetApp is definitely the lowest initial cost (NearStore license), DataDomain is in the middle, and Avamar is the most expensive, but then again, the true price of B2D is all about the compression one can achieve.

Compression/dedupe rates are all about two things: 1) granularity (the smallest chunk of data that can be marked as duplicate); and 2) the domain size (i.e. how much data is being de-duped). Avamar wins on both counts - variable chunk size, as small as 12 bytes (vs. 4K), and it deduplicates the entire customer dataset (vs. a 1-4TB filesystem) - and it's for that reason that customers regularly see 50:1 compression ratios with Avamar.

I thought EMC did thin provisioning on the Celerra head not the array?


"I say this without any bias whatsoever, since my employer sells arrays that provide thin provisioning too."

Re: "thin provisioning on the Celerra head, not the array" - all the thin provisioning mechanisms on the market today need to do some sort of "encapsulation" of the block backend.

A Celerra head is the same idea as a FAS filer "head" - a NAS device (that can do block devices as files in filesystems). EMC and NetApp take different approaches to the back end - NetApp uses JBOD, and the filer head does the block functions (like RAID, BCS, etc.). EMC uses an array backend - so all block functions are handled by the array. Each does certain things well. Personally, I like the way that NetApp uses disk signatures rather than bus locations to identify disks. But the FC implementation and clustering model of EMC is superior. Celerra + backend is very analogous to a FAS array.

The idea that DBAs don't and won't use Oracle auto-extend with thin provisioning is erroneous. 3PAR has many thin provisioning customers who have happily used auto-extend for years and have never reported the slightest example of this "expensive" performance hit. 3PAR Thin Provisioning customers routinely use Oracle auto-extend to reap the benefits of thin provisioning. One Fortune 10 company has scores of TBs of thin provisioned volumes presented to transactional Oracle databases in auto-extend mode. The claim is completely spurious and without foundation with respect to 3PAR’s product, though I cannot speak for other implementations.

Geoff Hough
Director, Product Marketing
3PAR

For additional information, Compellent discusses its thin provisioning support for Oracle here, http://www.drunkendata.com/?p=1371.

Hi,

I might be late in this debate... but better late than never.

I'm a bit surprised by your position. I have not yet had time to dig into thin provisioning, but I'm considering doing it soon for a large Oracle/DMX customer (>100TB of Oracle data). More than 90% of all DB denials of service in 2006 were related to data files or archive logs filling up. Data file space below high water mark + max(archivelog) vs Oracle allocated space (data files + file systems) shows a ratio of 1:2, and still all Oracle data files are in autoextend in very large tbs (>200GB). Despite those precautions, there is at least one denial of service a day because they are basically unable to predict the business activity accurately enough.

Agreed, this might be a bit extreme but we really have a case for thin provisioning (if technically feasible).

Cheers

Christian

I have to agree with the 3PAR guy Geoff: Thin Provisioning on 3PAR works fine and as advertised. We use 3PAR extensively at my company, with Oracle 10g and ASM, and it works. If the customer agrees to use TPVV and ASM, we allow them to use up to the contracted amount of storage and charge them less, and the auto-extend is not a painful process at all. However, we do autoextend in large chunks, generally about 100 MB at a time. Is there a performance penalty in doing this? Yes, somewhat, but only during those moments when an autoextend occurs, which is not that frequent. Perhaps it just works faster in an ASM environment compared to a regular filesystem. But if you're paying a lot less and are willing to take the very slight performance hit, it's a good tradeoff.
