Acknowledgements to Bart Sjerps for this content.
There seems to be a lot of confusion on licensing when customers consider running Oracle databases on VMware. Part of the confusion is caused by Oracle on purpose (classic FUD) by suggesting licensing is more expensive on VMware than on physical servers. The reality couldn't be more different - I strongly believe that many customers can actually *save* on database licenses by going virtual. But to understand how to achieve this, you need to know a few things - I hope I can clear this up in a short explanation. I will keep the discussion to Oracle database licenses and ignore application/middleware etc. for now.
License models
Customers typically license their basic database by one out of three options:
- License by CPU (core) - the more CPU cores, the more licenses are needed. A processor core factor applies, depending on the type of CPU; it can be 0.25, 0.5, 0.75 or 1.0.
- License by named user - the more named users, the more licenses are needed. Neither the number of CPUs nor the total number of databases matters. Typically one license pack per 25 users.
- Enterprise License - the customer negotiates a contract for the whole company and afterwards can deploy as many databases on as many servers/CPUs as they want.
If a customer uses the second or third option (named user or enterprise license), then it does not matter if they run virtual or physical. But there are also no license savings possible without re-negotiating their contracts. I don't want to go as far as suggesting to customers to change their license models so we leave this as-is for now.
In my experience, most enterprise customers use either cpu licensing or enterprise contracts. Some have different licensing methods for different business units. Oracle can be very creative in customer-specific contracts so expect to find a different situation for each individual customer.
But let's assume CPU licensing for the sake of this discussion.
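To make the CPU model concrete, here is a minimal sketch of how the number of processor licenses follows from the core count and the core factor. The function name is mine, and rounding fractions up to a whole license is an assumption you should check against the customer's actual contract.

```python
import math


def processor_licenses(cores: int, core_factor: float) -> int:
    """Processor licenses needed: cores times core factor.

    Assumption: a fractional result is rounded up to a whole license.
    """
    # round() first so floating-point noise cannot push the result up by one
    return math.ceil(round(cores * core_factor, 6))


# 16 cores with core factor 0.5 (the Intel value used later in this post)
print(processor_licenses(16, 0.5))   # 8
# 16 cores with core factor 1.0
print(processor_licenses(16, 1.0))   # 16
# 6 cores with core factor 0.75 -> 4.5, rounded up
print(processor_licenses(6, 0.75))   # 5
```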
Maintenance & support
Users typically buy the CPU licenses but then have to pay maintenance for the time they use the licenses. Yearly maintenance cost is about 25% of license (list price). I have no information on typical discounts. I expect customers to get at least 50% discount off the price list (but only on licenses, not on maintenance AFAIK).
Database Edition and options
The plain database license comes in 3 versions (for servers):
- Standard Edition One - Maximum 2 processors, no options allowed. Only used for testing and very small deployments
- Standard Edition (SE) - Maximum 4 processors, no options allowed. Only used for smaller sizes and workloads (but stay tuned)
- Enterprise Edition (EE) - No limitations and on top of EE, you can have many licensed features. Most customers will use this, at least for production databases
On top of the basic Database license, most customers use a set of options, each requiring additional licenses per CPU. The most common options are:
- Real Application Clusters (RAC) - allows many servers running the same database (active-active clustering) to allow scale out performance and high availability.
- Real Application Clusters One Node - same but one database can only run actively on one node. For high availability only.
- Active Data Guard - remote replication using log shipping. Note that standard Data Guard is free, but Active Data Guard allows the standby database to be opened for read-only purposes and offers some extra features.
- Partitioning - allows tables to be split up in smaller chunks. Absolutely required when running large databases and no downtime can be tolerated. Eases administration work and offers some performance benefits.
- Real Application Testing - allows workloads to be recorded and re-played on another database to do performance and functionality testing
- Advanced Compression - allows database blocks to be compressed - requiring less storage and boosting performance (in most cases).
- Diagnostics Pack / Tuning Pack - provides automated reports. Oracle AWR (Automatic Workload Repository - a performance reporting tool) is part of Diagnostics Pack.
In my experience, nearly all customers have partitioning. Most customers have tuning/diagnostics pack. Some customers have RAC. Some customers have the other options. There are more options available but these are the most common.
Many customers have 3 or more options - sometimes the options cost more than the base database license - especially if they use RAC they will have most of the other options, too.
Running on a cluster
If a database runs on a cluster, then Oracle assumes the database can make use of any processor in the cluster. This is independent of what kind of cluster is used (so can be MSCS, HP MCSG, VMware, Oracle RAC, etc).
This is basically the foundation for all FUD and confusion. For example, say you deploy a VMware farm (cluster) of 16 servers, and the virtual machines run all kinds of stuff (file/print, Exchange, apps, etc etc) and only one tiny virtual machine in the corner, with only one virtual CPU, runs a small Oracle database. You would expect only to pay for one CPU core - but Oracle's reasoning is that this tiny VM can be dynamically moved (vMotion) to all nodes in the cluster and on any processor. Therefore, all CPUs have to be fully licensed by Oracle. So in this case, running the single database on a (small) physical server would be cheaper than running on a VM in the farm.
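The farm example can be put in numbers with a small sketch. The cores per host (8) and the size of the small physical server (4 cores) are assumptions of mine for illustration only; the 16 servers come from the example above.

```python
def cluster_licensable_cores(hosts: int, cores_per_host: int) -> int:
    """Cores that count for licensing under the cluster rule described above:
    every core on every host, no matter how few vCPUs the database VM has."""
    return hosts * cores_per_host


# 16-host farm; 8 cores per host is an assumed value
farm = cluster_licensable_cores(hosts=16, cores_per_host=8)
# One small stand-alone physical server; 4 cores is an assumed value
standalone = cluster_licensable_cores(hosts=1, cores_per_host=4)

print(farm)                # 128 cores to license for a single 1-vCPU VM
print(standalone)          # 4 cores to license on the physical server
print(farm // standalone)  # 32 times as many cores in the mixed farm
```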
Total cost of the stack
In a typical database server deployment, the cost of the database licensing is far greater than the cost of the hardware + OS licenses combined. I have no hard numbers but I assume the average DB license cost (plus options) is 10 times larger than the cost of the server + OS.
So a $5,000 server would typically require $50,000 on licenses. Then because maintenance is 25% yearly, the total cost of licenses over a 3 to 5 year period is even higher - so for a 5 year TCO the total license cost might be $75,000 (assumption - could also be closer to $100,000 - and no, I didn't make a mistake with an extra zero, Oracle *really* is this expensive).
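How the total moves with the assumptions can be sketched as below. This is a rough model only: it assumes maintenance is charged at 25% of list price per year and that any discount applies to the license purchase alone, as I described above. The function and parameter names are made up for this example.

```python
def license_tco(list_price: float, years: int,
                maintenance_rate: float = 0.25,
                license_discount: float = 0.0) -> float:
    """Rough license TCO: one-time purchase plus yearly maintenance.

    Assumptions: maintenance is a fixed fraction of LIST price per year,
    and the discount applies to the purchase only, not to maintenance.
    """
    purchase = list_price * (1.0 - license_discount)
    maintenance = list_price * maintenance_rate * years
    return purchase + maintenance


# $50,000 at list price, 5 years, no discount
print(license_tco(50_000, 5))                        # 112500.0
# Same, with a 50% discount on the license purchase
print(license_tco(50_000, 5, license_discount=0.5))  # 87500.0
# 3-year horizon with the same 50% discount
print(license_tco(50_000, 3, license_discount=0.5))  # 62500.0
```

Depending on the discount and the time horizon you land somewhere in the same ballpark as the figures above, which is all this back-of-the-envelope model is meant to show.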
Utilization
It is very hard to size a typical Oracle database based application. There are no good methods or calculations to figure out how much CPU power, disk I/O and memory is needed to run a given app. So historically, project teams size their database servers for peak loads, and because they cannot predict how big the peak load is, they double the resources "just in case". The end result is that most database servers are way oversized in terms of CPU and memory.
Most physically deployed database servers average about 10-15% CPU load (or less). However, they will peak to higher loads at certain times, such as Monday morning when many users log in, or when month/quarter/year-end batch processing is started, etc.
Then, the utilization numbers can be influenced by other tasks of the processors. Some common causes of "artificially high" CPU loads on database servers:
- CPU is involved in storage mirroring (i.e. Host Level Mirroring - using Oracle ASM or a Unix volume manager)
- CPU is involved in file transfers over the IP network
- Backup (non-serverless, using CPU, Network and I/O bandwidth)
- Customers run the application server on the same machine driving up CPU load - This can drive up CPU load from 10% to 90% or more !!
- Same for Middleware and Enterprise Service Buses (Think Oracle BEA, IBM WebSphere, SAP NetWeaver, etc)
- A bunch of monitoring/management agents burn CPU cycles (Tivoli, BMC, HP OpenView, CA, etc). Each agent may consume only 1% but add it up and you have another 5-10% overhead.
- Administrators generate database dumps / exports and run their own reports, scripts and tools. They run ad-hoc queries as well that should not be on production.
- Poorly tuned database servers cause paging and other CPU overhead - hard to diagnose but driving up CPU and I/O significantly.
- Database admin tasks (table reorganizations, (re)building indexes, converting tablespaces, ...)
- And so on...
All of these cause the processors, expensively licensed for database processing, to do other stuff.
So if a server is running at 15% utilization, then the utilization caused by the database workload itself might only be 10% and the rest caused by other stuff (whether needed or not).
Needless to say, Oracle likes customers to use their expensively licensed CPUs for other tasks because it forces them to buy additional CPUs sooner and therefore drives Oracle's license revenues.
Isn't life great for an Oracle rep? ;-)
Number of databases
Most customers run many databases. For the average enterprise customer that I visit, 100+ databases is a normal number. A big global that I visited runs 3000+ Oracle databases worldwide (and this is only the scope of this specific project team). Imagine the cost of licensing all these databases on all individual servers...
Why so many? Well, customers do not like to share multiple applications on one database (and often this is not even supported). So if you run SAP ERP, Oracle JD Edwards, your own banking app and a few others, they all require their own production database.
For each production database, you might find an acceptance environment, test system, development server, maybe a staging area to load data into the data warehouse, maybe a firefighting environment, a standby for D/R, a training system and so on. Customers will rarely share production environments on the same server (unless virtualized or at least with workload management segregation). Sometimes they share a few databases for non-prod on a server. So for, say, 100 databases, the average customer runs between 30 and 50 (physical) servers.
Power of big numbers
It does not require rocket science to understand that many of these databases do not require peak performance at the same time. A development system typically drives workload during daytime (when developers are coding new application features). A data warehouse runs queries during the day and loads in the evening. For a production system it depends on the business process. An acceptance system might sit idle for weeks and then suddenly peak for a few days preparing for a new version deployment into the live production system. And so on.
So what if you could share resources across databases - without influencing code levels, security, stability and so on?
If that were possible - you would not size for "peak load times two" anymore. You would size for what you expect and assume an average utilization of, say, 70% over the whole landscape. If one database needs extra horsepower, there is enough available in the landscape.
How much license cost would you save by bringing down the number of CPUs so that utilization goes up from 10% to 70%?
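A back-of-the-envelope answer, as a sketch: it assumes the workload scales linearly with core count and that cores of the old and new servers are equally fast, both of which are simplifications. The starting point of 100 cores is just an illustrative number.

```python
import math


def cores_needed(current_cores: int, current_util: float,
                 target_util: float) -> int:
    """Cores needed to carry the same work at a higher average utilization.

    Assumption: work scales linearly with cores and all cores are equal.
    """
    busy_cores = current_cores * current_util  # cores' worth of real work
    # round() first so floating-point noise cannot add a spurious core
    return math.ceil(round(busy_cores / target_util, 9))


before = 100  # illustrative number of licensed cores today
after = cores_needed(before, current_util=0.10, target_util=0.70)

print(after)                        # 15
print(f"{1 - after / before:.0%}")  # 85% fewer cores to license
```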
What would be the effect on power, cooling, floor space, hardware investments, time-to-market?
What would be the business advantage of not limiting production performance of a single server, by whatever was sized during initial deployment? Risk avoidance?
What would be the business advantage of solving future performance issues by just adding the latest and greatest Intel server to the cluster and moving the troubled database over with vMotion?
Wasn't this exactly why we started server virtualization in the first place about 8 years ago? And why EMC acquired VMware?
Wouldn't you think the average Oracle sales rep is scared to death when his customer starts considering running his databases on a virtual (cloud) platform? Would it make sense for him to drive his customers mad with FUD around licensing, support issues and whatever he can think of to prevent his customers going this way? Even threatening to drop all support if they continue to go in that direction?
If Oracle is scared of losing license revenue, wouldn't you think there is a huge potential for savings for our customers here?
The journey to the private database cloud
So how should we deal with this?
A few starting points:
- Oracle supports VMware. Period. Any other claim of Oracle reps can be taken with a grain of salt (to be more specific: it's nonsense).
- Oracle does NOT certify VMware. Then again, Oracle does not certify anything except their own hard- and software. But IMO, support is all you need and the discussion around certification leads nowhere.
- Oracle might ask the customer to recreate issues on a physical server if they suspect problems with the hypervisor. Isn't it great that we can do this easily with Replication Manager? ;-)
- Oracle only supports Oracle RAC on VMware for one specific version (11.2.0.2). Any other version with RAC is not recommended on VMware because of support issues. Expected to change in the future.
- Both EMC and VMware offer additional support guarantees for customers deploying Oracle on VMware. So where Oracle pulls back, EMC and VMware will fix any issue anyway.
- Performance is no longer an issue. With vSphere 5, a single virtual machine can have 32 virtual processors, 1 TB RAM and drive 1 million IOPS. Only the most demanding workloads would not fit in this footprint. But with customers running hundreds of databases, maybe we should start with the 95% + that DO fit and make significant savings there. By the time we're done, VMware will have vSphere 6 and who knows what happens then.
How to get around the licensing issue
As I said, Oracle requires licenses for all servers in a cluster. So how do you limit the number of licenses? By deploying an Oracle-only VMware cluster. Only run Oracle databases here. No apps, no middleware, no fileservers, and try to move everything off that does not relate to database processing. No host replication, no storage mirroring, etc.
Say you have a legacy environment with 10 servers, each with 16 cores, so you have 160 cores licensed with Oracle EE and a bunch of options. Average CPU load is 15% but let's assume 20% to be conservative.
I claim that a single VMware cluster with 3 servers each with 32 cores will easily do the job. Now we have 3 * 32 = 96 cores to be licensed. 96/160 = 0.6 = 60% so we saved 40% on licensing right away. Probably the average CPU load on the whole cluster will still be much less than 70% so we can gradually add a bunch more databases until we average out on 70%.
If the old system was not running Intel x86 but SPARC, PA-RISC or POWER CPUs, then the processor factor was probably 1.0 or 0.75. Intel has 0.5. So for 96 cores (Intel) you would need to pay for 48 full licenses. That is another 33% savings coming from a 0.75 factor, or 50% coming from 1.0.
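The arithmetic of this example can be checked with a short sketch. The server counts, core counts, the 20% load and the core factors are the ones used above; the helper names are mine, and rounding up to whole licenses is an assumption.

```python
import math


def licenses(servers: int, cores_per_server: int, core_factor: float) -> int:
    """Processor licenses for a group of identical servers (rounded up)."""
    return math.ceil(round(servers * cores_per_server * core_factor, 6))


def saving(old: int, new: int) -> float:
    """Fraction of licenses saved when going from old to new."""
    return 1 - new / old


new_cluster = licenses(3, 32, 0.5)      # 48 licenses for 96 Intel cores
old_intel = licenses(10, 16, 0.5)       # 80 licenses if the old boxes were Intel
old_risc_075 = licenses(10, 16, 0.75)   # 120 licenses at core factor 0.75
old_risc_100 = licenses(10, 16, 1.0)    # 160 licenses at core factor 1.0

print(f"{saving(old_intel, new_cluster):.0%}")     # 40%
print(f"{saving(old_risc_075, new_cluster):.0%}")  # 60%
print(f"{saving(old_risc_100, new_cluster):.0%}")  # 70%

# Sanity check on load: 160 cores at 20% is 32 cores' worth of work,
# which spread over 96 cores gives the new cluster's average utilization.
print(f"{(160 * 0.20) / 96:.0%}")                  # 33%
```

So the new cluster would sit at roughly a third average utilization, which is where the headroom for adding more databases up to 70% comes from.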
The savings of 40% on licensing will easily justify an investment in a nice new EMC storage infrastructure with EFDs, FAST-VP and all other goodies. Do you think the customer will push us hard for a $0.01 lower GB price when competing with HDS or NetApp if we just saved them millions in Oracle licenses?
But the story does not end here.
Additional savings
Let's assume the customer needed high availability and scale-out performance and was running Oracle RAC. RAC is the most expensive licensed option and you need to license at least two nodes for a two-node cluster. But VMware allows for HA (High Availability clustering) as well. Using VMware HA instead of RAC, you would have to fail over and recover the database in case of an outage - if the customer cannot tolerate this then he needs to stick with RAC (only for mission critical databases!). But most customers can live with 5 minutes of downtime in case a server CPU fails and in that case, replacing RAC with VMware HA can save them another big bunch of dollars.
Let's assume that with virtualization you justified the investment in a nice EMC infrastructure with Flash drives to replace the competitive gear. Now the Oracle cluster is no longer limited by storage I/O's and can drive more workload out of the same 3 VMware servers in the cluster. But you can also replace host mirroring (where applicable). You can implement snapshot backups to get the I/O load away from the production servers. You removed the middleware and apps stuff from the database servers - reducing CPU utilization and allowing even more headroom for DB consolidation - all without buying extra licenses from Oracle.
You have a customer who wants even more?
What if they create TWO database clusters for VMware? One for production (running Oracle Enterprise Edition (EE) with all the options they need) and one for Non-prod (running Oracle Standard Edition (SE) without options - good enough for test/dev and smaller, non-mission critical workloads). I bet the number of non-prod databases will be much more than prod. By removing the expensive options AND moving from Enterprise to Standard Edition, you saved another ton of money on Oracle licensing as SE is much cheaper than EE. But be aware - the devil is in the details and using Standard Edition is not for the faint-of-heart (for example, you could no longer clone a partitioned database to a SE enabled server because of the missing license and functionality). Still if the customer is keen on saving as much as possible, then this might be the final silver bullet...
Do they run a huge Enterprise Data warehouse? Carefully find out if they have troubles with it and see if you can position Greenplum - saving another bag of money and speeding up their BI queries. But be careful, in an Oracle-religious shop it might backfire on you...
Reality Check
I have already had this discussion with a few enterprise customers, and found that although the story is easy in theory, the reality is different. If a customer already has the 160 CPU licenses purchased from Oracle, then the Oracle rep will not happily give money back in return for the shelfware licenses. So in that case the customer can only save on maintenance and support. But having enough licenses on the shelf, he would not have to purchase any more for the next 5 to 10 years. So talk cost avoidance instead of immediate savings. And again, if they are licensed by user or have a site license, then saving on licenses will be a tough discussion. Still, the savings on power/cooling/hardware/floorspace would be significant enough to proceed anyway.
And don't forget the other benefits of private cloud, which we all know how to position: they are no different for Oracle than for other business applications.
Final thought
For this to work you need a customer that is willing to work with you and be open on how they negotiated with Oracle, and a team of DB engineers to work with you to make it happen. If internal politics cause significant roadblocks then you will get nowhere.
It's not an easy sell but the rewards can be massive. We're only just starting to figure out how to convince customers and drive this approach. Feedback welcome and let me know if you need support.
Resources
The online Everything Oracle at EMC community has lots of information on this subject. See in particular this presentation which I co-presented (with Sam Lucido) at this year's VMworld.
Landing page with Oracle's price list: http://www.oracle.com/us/corporate/pricing/price-lists/index.html
Download "US Oracle Technology Commercial Price List" for the database license document. Read the fine print because it's not always as simple as it seems.
Processor Core Factor information: http://www.oracle.com/us/corporate/contracts/processor-core-factor-table-070634.pdf