In my last post I contrasted the use of Oracle RAC with VMware HA Cluster for creating a database cloud. In that post, I promised to describe further the suitability of the VMware HA Cluster solution as opposed to Oracle RAC. That is the purpose of this post.
There are two areas of concern when evaluating the use of VMware HA Cluster for creating an Oracle database cloud. These are:
- Scalability
- Availability
Analysts like to use the so-called "magic diagram" to describe the market space for a given technology or solution. This is very interesting in this case. The following diagram is useful:
Database Cluster Technology Magic Diagram
Think of this graphic as conceptual, but it makes an important point. At the bottom of Availability scale are databases that you don't even back up. Everyone has a few of these. One example is a staging database for temporary storage of summary data which is then inserted into a data mart or data warehouse. If this database fails, you will simply recreate it and rerun the job.
At the top of the same scale is your company's ERP, web catalog or online trading application. If this type of database is down for any length of time (even a few seconds), money is lost, or, even worse, hard legal requirements are not met and liability results. In other words, these databases absolutely have to be up and running at all times.
In the middle part of the scale is everything else. These databases have various levels of HA requirements. The SLA may require no more than 2 minutes of downtime in a single month. Or perhaps several hours of downtime can be handled over a weekend. And up to half an hour of unplanned downtime may be OK in a six month period. Again, the requirements vary greatly.
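To make these SLA figures easier to compare, here is a minimal sketch that converts a downtime budget into an availability percentage. The function name `availability_percent` is my own, and the 30-day month and 182-day half year are assumptions made only for the arithmetic.

```python
MINUTES_PER_DAY = 24 * 60


def availability_percent(downtime_minutes: float, period_days: float) -> float:
    """Return the availability (in percent) implied by a downtime budget."""
    total_minutes = period_days * MINUTES_PER_DAY
    return 100.0 * (1.0 - downtime_minutes / total_minutes)


# 2 minutes of downtime in a month (assumed to be 30 days)
strict = availability_percent(2, 30)

# 30 minutes of unplanned downtime in six months (assumed to be 182 days)
relaxed = availability_percent(30, 182)

print(f"2 min per month:      {strict:.4f}%")
print(f"30 min per six months: {relaxed:.4f}%")
```

Under these assumptions the first budget works out to roughly 99.995% availability and the second to roughly 99.989%, which shows how quickly a seemingly small downtime allowance turns into a demanding target.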
The Scalability scale is similar. At the bottom of the scale (to the left in this case) are databases that are very small and have small I/O requirements. A small database stored on a personal laptop or PDA would be an example. At the top of this scale are databases which are many terabytes in size, have thousands of online users at all times, and where every user has to have the ability to see every row in every table in the entire database. This kind of workload is referred to as a "scale-up". These databases exist in virtually every enterprise of any significant size, and frequently they are also the databases which have high HA requirements. Thus, there is a lot of commonality between these two scales. Again, the company ERP, web catalog, or online trading application may be multiple terabytes in size, have a large number of online users, high I/O demands, and the requirement that all users have access to all of the data in the database, making federating this database impractical or impossible. After all, very large databases with high I/O requirements are probably pretty important, or we wouldn't be spending all the money, time and effort required to maintain them.
In the middle of this scale are databases that have various levels of scalability requirements. Many of these fall into the "scale-out" category, where the scalability demands can be met by creating a large number of separate databases. In my experience, this may be true for a couple of reasons:
- The database workload is naturally partitionable because a given class of users only needs to see a subset of the total database data. This is a federated database solution. The classic example of this is the software as a service (SaaS) customer. For example, I presented to a customer that publishes a software package for managing hair and nail salons. They have over 5,000 salons in North America that use their software. Obviously, no salon needs to be able to see the database data of any other salon, making the workload very easily partitionable. New users can either be added to an existing database, or a brand new database can be created.
- The customer has created a mess. I encounter this fairly frequently, even in very large enterprise accounts. Instead of creating a large database cloud, they have created a large number of small projects. Now they are experiencing poor utilization, poor manageability, high cost and data center sprawl. This is the classic "server hog" scenario, in which each separate group developing a software project wants their own small data center, consisting of servers, network switches, storage, power, cooling and the like. The result is that the overall costs are very high, including the Oracle license cost. Oddly, this is the customer that Oracle most often uses as an example for the suitability of RAC, although I find it far less compelling than the scale-up scenario, for reasons which will shortly become clear.
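The naturally partitionable case from the first bullet can be sketched as a simple directory that assigns each tenant (a salon, in that example) to a database, reusing an existing database while it has room and creating a new one otherwise. This is only an illustration: the class `TenantDirectory`, the `salon_db_N` naming and the per-database limit are all hypothetical and not taken from any real product.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TenantDirectory:
    """Maps each tenant (e.g. a salon) to the database that holds its data."""

    max_tenants_per_db: int  # hypothetical capacity limit per database
    assignments: Dict[str, str] = field(default_factory=dict)
    counts: Dict[str, int] = field(default_factory=dict)

    def add_tenant(self, tenant_id: str) -> str:
        """Place a tenant in an existing database if possible, else a new one."""
        if tenant_id in self.assignments:
            return self.assignments[tenant_id]
        for db, tenant_count in self.counts.items():
            if tenant_count < self.max_tenants_per_db:
                break  # found an existing database with spare capacity
        else:
            # No database has room (or none exists yet): create a new one.
            db = f"salon_db_{len(self.counts) + 1}"
            self.counts[db] = 0
        self.counts[db] += 1
        self.assignments[tenant_id] = db
        return db


directory = TenantDirectory(max_tenants_per_db=2)
for salon in ["salon-a", "salon-b", "salon-c"]:
    print(salon, "->", directory.add_tenant(salon))
```

Because no tenant ever needs to read another tenant's rows, the lookup is the only coordination required, which is what makes this workload a good fit for many small databases.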
Mapping this set of concepts onto our magic diagram, we can see the following:
Database Cluster Technology Magic Diagram with Products
Bear in mind this is my opinion about the way I see the market. Others will probably disagree, which I welcome. But here is my take: Oracle RAC owns the high end where availability and scalability are both very high. And it should. Oracle RAC provides unparalleled uptime with great scalability. This comes at a cost, however.
At the bottom end, you have MySQL, Microsoft Access, and other user-oriented databases. (While both Oracle and Microsoft have personal versions of their high-end databases, I have always considered them developmental tools. For a simple user to create an address book with Oracle would be like swatting a fly with a nuclear bomb.)
In the middle you have a pitched battle among a variety of products, including Oracle SE, Microsoft SQL Server and IBM DB2. This is actually the bulk of the database market, as I show in the diagram. And this is where VMware HA Cluster actually helps Oracle substantially (a fact which I think many at Oracle have yet to fully appreciate).
Having configured both Oracle RAC and VMware HA Cluster many times, I can tell you that the manageability advantages of VMware HA Cluster are very strong. With VMware HA Cluster, you do not need to establish ssh equivalence with passwordless authentication as you do with RAC, for example. Granted, if you do this often (as I do), this is no big deal. For the occasional user, it is mind-numbingly technical, requiring intimate knowledge of the various ssh files (known_hosts, authorized_keys, id_dsa.pub and the like). And this is only one of a number of similarly technical issues that must be mastered in order to install and configure Oracle RAC.
To configure VMware HA Cluster, you simply establish a port on a vSwitch on each node in the cluster, create the cluster, and then drag and drop the nodes into the cluster. Believe me, those three steps (all accomplished using the VIC GUI) are far easier than creating a cluster with Oracle RAC.
Which gets to my point. Given the alternatives, and judging things like cost, manageability, and the like, unless you are either a very high-end customer (which Oracle owns as I have said), or simply slavishly committed to Oracle (these undoubtedly exist as well), you would be crazy not to try to create a moderately HA database solution with something other than Oracle RAC. Thus, you would tend to go down either the Microsoft SQL Server (with MSCS) or IBM DB2 (with the HA option) path, both of which are available at significantly less cost than Oracle RAC. And this is where VMware HA Cluster can actually help Oracle out. By giving Oracle another HA solution with simpler management and lower cost, VMware HA Cluster lets Oracle compete successfully with Microsoft SQL Server and DB2 in this space. Which is, again, the bulk of the database market. Thus, VMware HA Cluster helps Oracle to move down market, which in my experience is where a lot of the interesting money is, as well as the next generation of new customers for Oracle.
Where VMware HA Cluster falls down is, not surprisingly, in these two areas: availability and scalability. First, in the area of availability, Oracle RAC provides absolutely no downtime on the database as long as at least one node in the cluster survives. With VMware HA Cluster, the loss of a node in the cluster will result in the VMs on that node being rebooted onto one or more surviving nodes in the cluster. This is downtime. Granted, only a couple of minutes of downtime, but downtime nonetheless. For many users with many databases, this is acceptable. If your database requires absolutely no downtime, then Oracle RAC is your solution. Otherwise, VMware HA Cluster may be a viable option.
The other area is scalability. VMware HA Cluster cannot scale any given database very high. Since each database must run as a single instance on a separate virtualized OS image, the size of the database in terms of I/O is limited to the maximum that you can achieve in a VMware VM. Presently, this is fairly modest. The following chart compares our physically booted RAC solution to our VMware HA Cluster solution in terms of scalability. As you can see, the RAC solution scales far higher in terms of a single database image.
| Solution | Instances | CPUs | Memory | Database Size |
| --- | --- | --- | --- | --- |
| Physically booted Oracle RAC | 4 | 8 per node | 24 GB per node | 2000 warehouses in one TPC-C database |
| VMware HA Cluster | 8 | 4 per instance | 12 GB per instance | 250 warehouses per TPC-C database (8 databases total) |
Again, in the scale-out scenario this is no big deal. As the scalability requirements of any given database customer are fairly modest, and each database customer can be placed on a separate server, this works fine. It does not provide for a scale-up scenario though. If you have a database that is many terabytes in size, has high I/O requirements, and where each user of the database must be able to see every row in the database, then you have a scale-up and you need Oracle RAC.
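The two rows of the chart above can be totaled with a little arithmetic. The sketch below is only an illustration; the `Config` class is my own, and it assumes one instance per node in the first row, which the chart does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One row of the scalability chart."""

    instances: int
    cpus_each: int           # CPUs per node or per instance
    memory_gb_each: int      # memory per node or per instance
    warehouses_per_db: int   # TPC-C warehouses in each database
    databases: int           # number of separate databases

    def total_cpus(self) -> int:
        return self.instances * self.cpus_each

    def total_memory_gb(self) -> int:
        return self.instances * self.memory_gb_each

    def total_warehouses(self) -> int:
        return self.databases * self.warehouses_per_db


# First row: 4 instances sharing one database (assumes one instance per node).
single_database = Config(4, 8, 24, 2000, 1)
# Second row: 8 instances, each with its own database.
eight_databases = Config(8, 4, 12, 250, 8)

for name, cfg in [("one database", single_database), ("eight databases", eight_databases)]:
    print(name, cfg.total_cpus(), cfg.total_memory_gb(), cfg.total_warehouses())
```

Under that assumption both rows add up to the same 32 CPUs, 96 GB of memory and 2000 warehouses. The difference is the largest single database: 2000 warehouses in one case against 250 in the other, which is exactly the scale-up versus scale-out distinction.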
Where VMware HA Cluster shines is in several areas:
- In a SaaS scale-out, VMware HA Cluster (and VMware in general) provides very convenient tools for creating new database servers. You can provision a new server almost instantly. I have tried to accomplish this with RAC, and it is quite a bit more difficult.
- In a mixed environment, such as the server hog scale-out (where many versions of Oracle and different operating systems have been used), you can easily virtualize a given database project in place, without an expensive, time-consuming and risky migration. You have the Windows 2000 version of Oracle 8i running in production? No problem. You can p2v that puppy into a VMware environment, in the process getting all the advantages of mobility, manageability and improved utilization that VMware provides. Doing the same thing in RAC would require you to migrate all of your existing projects to a single target set of OS and Oracle versions, as RAC does not allow for any heterogeneity. This is a daunting prospect for many customers in this situation. VMware HA Cluster provides a quicker, simpler and cheaper path to a database cloud in this case.
- In both scale-out scenarios, VMware actually provides a more natural management of a given database image than RAC does. Assuming SaaS, you would like to be able to back up and restore a given database customer's data without any disruption of the other customers' data. If the customer's data is stored in a single database, this is very easy and natural. If it is consolidated within a larger RAC database, this becomes more difficult, especially if you want to do a point-in-time recovery. Ever tried to do a tablespace point-in-time recovery? I have. Great fun, believe me. The same thing is true with things like Data Guard. Data Guard is a log shipping / log apply product. You do not get to pick and choose which log you want to apply to the target database. If each customer's data is stored in a separate physical database, this works great. If a large number of database customers are consolidated, this may create issues. The same sorts of issues exist with the server hog scale-out as well.
To conclude: Both Oracle RAC and VMware HA Cluster have their place as HA solutions within the Oracle production database space. Oracle RAC owns the high end in terms of scalability and availability. VMware HA Cluster can significantly help Oracle to address the mid-market, where the cost and manageability issues of Oracle RAC make that product overkill. Both products have advantages and disadvantages, and each has its place. Oracle would do well to evaluate the benefits of working with VMware and enabling its product to move down market and compete more effectively with the likes of Microsoft SQL Server.