I had a very successful breakout session at VMworld this year, presenting to a total of around 200 people. The topic was:
Creating an Internal Oracle Database Cloud Using vSphere
I will now share much of the substance of my session here on this blog, for those of you who could not attend VMworld.
The first section (which will constitute this first post) was on the coming data explosion that will soon hit (indeed is already hitting) Oracle DBAs all over the industry.
Many of you are familiar with the Joe Tucci keynote from EMC World this spring. In that talk, Joe cited the recent IDC / Dataquest study on the coming worldwide data explosion, which can be simply stated as follows:
Year    Worldwide Data Set
2009    0.8 Zettabytes
2020    35 Zettabytes
Here a Zettabyte is 1024 Exabytes, an Exabyte is 1024 Petabytes, and a Petabyte is 1024 Terabytes. A Zettabyte is therefore approximately 10^21 bytes.
That's a data explosion of 44x in approximately 10 years.
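For those who like to check the arithmetic, here is a minimal Python sketch of the numbers above. The variable names are my own, and the compound annual rate is an extra figure derived from the two data points rather than something quoted in the keynote.

```python
# Unit ladder as described in the post: each step up is a factor of 1024.
TERABYTE = 1024 ** 4            # bytes
PETABYTE = 1024 * TERABYTE
EXABYTE = 1024 * PETABYTE
ZETTABYTE = 1024 * EXABYTE      # 2**70 bytes, on the order of 10**21

# The two data points from the table above, in Zettabytes.
data_2009_zb = 0.8
data_2020_zb = 35.0

# Overall growth factor between the two years (rounds to about 44x).
growth_factor = data_2020_zb / data_2009_zb

# Derived figure (not from the keynote): the compound annual growth rate
# that would produce this factor over the 2009-2020 span.
years = 2020 - 2009
annual_growth = growth_factor ** (1 / years) - 1

print(f"1 Zettabyte = {ZETTABYTE:.3e} bytes")
print(f"Growth factor: {growth_factor:.2f}x over {years} years")
print(f"Implied compound annual growth: {annual_growth:.1%}")
```

Note that the table's endpoints are actually 11 years apart, which is why the post says "approximately 10 years."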
What Joe does not point out is the shift from unstructured data to structured data. Presently, 90% of worldwide data is unstructured. However, almost all of the sources of new data which are causing the explosion are structured. (These include social networking, blogging, Twitter, and all of the Web 2.0 content, plus sources like e-readers, PDAs, smart phones, medical / dental digital imaging, online security, smart energy metering and the like.) The result is this graphic showing the relationship between structured and unstructured data going into the future:
Undoubtedly much of this data will end up being owned by the so-called Database 2.0 vendors. Certainly, that will be true for the bulk of the social networking and Web 2.0 content, where those vendors have their roots. But this is not true for all of this data. Look again at the sources of data listed above. Is it likely that a medical or dental image (which may have critical consequences for actual patient care and confidentiality) will end up on a relatively untested and emerging Database 2.0 technology? I regard that as very unlikely. It is much more likely that this data will end up in the known-to-be-reliable, good-enough technology in the Database 1.0 space we know as Oracle.
The same thing is true for smart metering. Again, the output of these meters will be used to calculate actual customer bills, not to mention determining the energy output for homes and businesses. Is the external cloud Database 2.0 environment ready to absorb this data? I doubt that. Folks like Progress Energy (my energy provider) are going to place that on Oracle, most likely.
Even making the incredibly depressing (for Oracle) prediction that Oracle's market share in the database market falls by 50% in the next 10 years, the result for Oracle can be calculated simply as follows:
44 x (9 / 2)
or about 200X. This means that Oracle data volume worldwide will grow by roughly 200X in the next 10 years, by my conservative estimate.
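The same back-of-the-envelope estimate can be sketched in a few lines of Python. The function and parameter names are hypothetical, and reading the factor of 9 as structured data's share rising from about 10% to about 90% of the total is an assumption on my part: the formula above uses the 9 without spelling that out.

```python
def oracle_growth_estimate(overall_growth, structured_share_growth, share_retained):
    """Multiply the three factors from the estimate above.

    overall_growth: growth factor for all worldwide data (44 in the post).
    structured_share_growth: assumed growth in structured data's share of
        the total (9 in the post; interpreted here as 10% -> 90%).
    share_retained: fraction of its current market share Oracle keeps
        (0.5 for the pessimistic 50% drop).
    """
    return overall_growth * structured_share_growth * share_retained


estimate = oracle_growth_estimate(44, 9, 0.5)
print(f"Estimated growth in Oracle data volume: about {estimate:.0f}x")
```

Changing `share_retained` to 1.0 shows the figure if Oracle's share held steady, which makes it easy to see how much of the result depends on each assumption.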
I compare this to the movie The Princess Bride, one of my favorite films by Rob Reiner. In this film, as many of you know, the main character Westley leaves his beloved Buttercup to seek his fortune. Very soon, he is captured by the Dread Pirate Roberts, who makes him his cabin boy. During his years of service, the Dread Pirate Roberts tells Westley the same thing every day:
"Hello, Westley. I'll probably kill you in the morning."
I maintain that this is what the impending data explosion means to Oracle DBAs. The day will come when Oracle DBAs live in the "Westley State". Coming to work every day, they will look at their screens and their Oracle database will grin back, saying to them, in effect:
"Hello, DBA. I'll probably kill you in the morning."
Those DBAs who survive this challenge will do so because they have absorbed a very different way of thinking about their Oracle database environments. That will require many changes, among them the willingness to scale their database infrastructure in a completely new and radical way.
More on this later. In upcoming posts, I will explain how I think that change will come about, what choices the Oracle DBA has to meet this challenge, and so forth.
Finally, many of you are coming to Oracle OpenWorld (OOW) this month. I will be there as well, and will be co-presenting on the subject of RMAN backup with my good friend and former manager, Bruce Clarke. Bruce was my second boss at NetApp, and is now with Data Domain. Please come by the EMC booth and say hello. I will post the logistics of my talk on this blog.
Also, a big thanks to Steve Tout with VMware, who co-presented with me at VMworld and provided my customer case study. Steve is responsible for all identities which exist on VMware.com and VMworld.com. Please see his blog as well. It contains much of the content he presented at VMworld.