I have noticed that a large number of comments have been submitted to the recent post on Alessandro Perilli’s blog which links to my recent post on Oracle’s support policy for VMware virtualization. Many, if not most, of these comments are actually comments directed at my post, not Alessandro’s post. Therefore, I will respond to them here. I am also attempting to post this as a comment on Alessandro’s blog, but that has not appeared on his site as of yet, and I wanted my response to be available.
To respond to several of the points raised by these comments:
Why would anyone want to run Oracle on a single vCPU under VMware?
Not sure how to parse this one. VMware ESX 3.5 allows 4 vCPUs per VM. Our performance testing was done with 4-vCPU VMs (2 VMs per server on an 8-core box with 2 quad-core processors). There were 4 servers in the HA cluster, so the configuration had a total of 32 physical cores, all of which were allocated to vCPUs on 8 VMs. This is laid out pretty thoroughly in the reference architecture published on Powerlink and EMC.com. You can find the EMC.com version here.
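For anyone who wants to check the math on that configuration, here is a quick sketch in Python. The variable names are mine and purely illustrative; the numbers are the ones given above.

```python
# Configuration figures as stated in the post; names are illustrative only.
SERVERS = 4              # hosts in the VMware HA cluster
SOCKETS_PER_SERVER = 2   # quad-core processors per host
CORES_PER_SOCKET = 4
VMS_PER_SERVER = 2
VCPUS_PER_VM = 4         # the ESX 3.5 per-VM limit mentioned above

physical_cores = SERVERS * SOCKETS_PER_SERVER * CORES_PER_SOCKET
total_vms = SERVERS * VMS_PER_SERVER
total_vcpus = total_vms * VCPUS_PER_VM

# A ratio of 1.0 means every vCPU is backed by its own physical core,
# i.e. the CPUs are not overcommitted.
vcpus_per_core = total_vcpus / physical_cores

print(f"physical cores: {physical_cores}")
print(f"VMs: {total_vms}, vCPUs: {total_vcpus}")
print(f"vCPUs per physical core: {vcpus_per_core:.1f}")
```

The point of the ratio is that nothing in this test bed ran on a single vCPU, and no physical core was shared between vCPUs.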
In terms of why you would want to run Oracle in a virtualized environment using VMware, that is pretty well laid out in my blog. But to summarize:
- We found better performance in a virtualized environment using VMware HA cluster vs. Oracle RAC. I will deal with the fairness of that comparison in my next point.
- The cost of VMware HA cluster is less than Oracle RAC, and VMware is appropriate for many customers. Not all, but many. For those customers, RAC would also work, but is, again, vastly more expensive. I lay out the usage cases where VMware HA cluster works as an alternative to RAC in another item below.
- Manageability is higher with VMware HA cluster than RAC. Believe me, I should know. I run both routinely in our testing environment.
Why should I believe that performance of VMware HA cluster is higher than RAC when VMware will not allow third parties to benchmark their product?
Well, I am a third party, and the program I manage publishes performance benchmark results of both virtualized and physically booted Oracle production environments. These are available on Powerlink. The actual performance results are pretty confidential and proprietary, but I tend to open the kimono on this blog, as you have seen. We are using a TPC-C-like workload run under Quest Benchmark Factory for Databases. We do not claim that this is a published and audited TPC-C result, obviously. However, we have lots of experience running this.
I can tell you no one was more surprised by the VMware performance result than I was. We are still continuing to actively profile performance of both physically booted and virtualized Oracle environments. (In fact, we are beginning to profile OVM as well.) I had no particular axe to grind on this either way. The results were simply what they were.
RAC and VMware HA cluster are not comparable, are they?
Of course they are. Both Oracle Cluster Ready Services (the underlying technology behind Oracle RAC) and VMware HA cluster are cluster software products. They do basically the same sort of thing: provide high availability for applications. They simply do so in different ways. RAC provides one single database image across multiple physically booted servers. VMware HA cluster provides failover of VMs across the physical servers in the cluster. Both are intended to protect application uptime and client access by providing a high availability solution.
Granted, they have different levels of HA. I would consider RAC to be a fault-tolerant technology, in that the physical loss of a node will not result in database downtime (but may result in loss of client access for sessions connected to that node). VMware HA cluster is a high availability product. A brief downtime is inevitable when a VM is being rebooted on a surviving node of the cluster after node failure. In our experience, VMware HA cluster works pretty well at getting the VMs back up and running, though.
In the end, it depends on what the customer needs, the level of HA being one of the issues. I cover that more in a later section.
Isn't RAC free with SE?
This is true, but very misleading. RAC is free with SE when the total CPU core count in the entire cluster is 2. This is a trivial cluster. Beyond that, EE is required.
And that is where the cost savings come in. EE is required for basically two products in most customer configurations:
- RAC
- Data Guard
RAC also carries the RAC upcharge above the cost of EE, making it the most expensive Oracle database software product, by far.
Assuming Data Guard is not required (and storage vendors have been competing with Data Guard using products like MirrorView and RecoverPoint for many years), then it comes down to RAC.
Assuming you can provide HA in another manner, then RAC is not required, and therefore SE can be used instead of EE, at 25% of the cost, plus savings on the RAC upcharge. I think you see the point.
The bottom line is that VMware HA cluster provides cost savings, assuming the customer scenario allows this product to be used. Which is what I cover next.
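To make the licensing arithmetic above concrete, here is a rough sketch in Python. It works in relative units only (EE = 1.0) and uses the 25% figure for SE from above. The size of the RAC upcharge is left as a parameter, since I have not given a number for it here; the 0.5 in the example is a hypothetical value for illustration, not a price. The sketch also ignores VMware licensing and support costs, so treat it as a view of the Oracle side of the bill only.

```python
EE_COST = 1.0    # Enterprise Edition, used as the unit of cost
SE_COST = 0.25   # Standard Edition at 25% of EE, as stated above


def ee_with_rac(rac_upcharge: float) -> float:
    """Relative cost of EE plus the RAC option (upcharge in EE units)."""
    if rac_upcharge < 0:
        raise ValueError("rac_upcharge must be non-negative")
    return EE_COST + rac_upcharge


def savings_fraction(rac_upcharge: float) -> float:
    """Fraction of the EE + RAC license cost saved by running SE instead."""
    rac_total = ee_with_rac(rac_upcharge)
    return (rac_total - SE_COST) / rac_total


# Lower bound: savings from the edition difference alone, no upcharge.
print(f"no upcharge: {savings_fraction(0.0):.0%} saved")

# HYPOTHETICAL upcharge of half the EE price, for illustration only.
print(f"hypothetical 0.5 upcharge: {savings_fraction(0.5):.0%} saved")
```

Whatever the actual upcharge is, the edition difference alone sets the floor on the savings, and any upcharge only widens the gap.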
Not all customers can use VMware for an HA solution, and need RAC instead. Right?
Not actually in the comments, but important. Again, EMC is a strong supporter of RAC. We run it in our lab, and we will continue to do so. No one would say that RAC does not work as well as VMware HA cluster or that it is not a great product. It is. But it is expensive and complex.
What RAC provides is two things:
- A single database image
- True fault tolerance
These are both great advantages, but not all customers need this. For example, I personally visited a very large Fortune 100 company. For confidentiality reasons, I will not use the name here, but believe me, they are a household name. This customer had many single-instance database servers running on 1U and 2U unclustered machines throughout their datacenter. The cost to manage these servers was immense, in the 9 figures per year. Yeah, that's a lot of jack.
The customer wished to consolidate all of these servers into a database cloud. I postulate that this could be done with either RAC or VMware HA cluster. But consider:
- What is the level of HA provided by the current configuration? Very low. Each physical server is presently a single point of failure. VMware HA cluster would be a big improvement.
- Does the customer need a single database image? Not at all. These servers are already islands of data.
The simplest and quickest way for this customer to consolidate the servers in this scenario, given the choice between RAC and VMware HA cluster, is VMware HA cluster. It is also the least expensive of those two alternatives.
I suspect that many, many customers are in this situation. For many applications VMware HA cluster is a viable high availability and consolidation solution for production Oracle databases. I would certainly not put a large ERP system on VMware. Nor the billing application for a large telco. But many, many applications could be put up on VMware just fine.
This is in many ways comparable to NAS vs. SAN. I am employed by EMC, so it may surprise you to hear me say this, but remember I came to EMC from NetApp. In my experience (not a scientific sample, but still) approximately 90% of the Oracle databases running in the world could be stored on an NFS server with absolutely no change in the client experience or uptime of the database. The same is probably true with VMware. Perhaps the percentage is higher or lower. I certainly have less experience with VMware at this point than with NAS. But time will tell.
BTW, it would help me if folks could post comments to content written by me on my own blog, rather than having to run around and find it elsewhere. Just a thought...