The concept of managing multiple hypervisors in the data center isn't new--companies have been doing it, or thinking about doing it, for some time. Changes in licensing schemes and other events bring this issue to the forefront as customers look to avoid new costs. VMware recently acquired DynamicOps, a cloud automation/orchestration company with support for multiple hypervisors, as well as for Amazon Web Services. A hypervisor vendor investing in multihypervisor support puts the topic back in the spotlight.
The arguments for multiple hypervisors are typically cost-based. If price weren't a factor, very few data centers would run anything other than VMware. VMware is the 800-pound gorilla in the virtualization market, with a slew of robust features that are time-tested with production workloads. That being said, cost is an issue--and VMware is at the top end of that scale.
The other side of the cost argument for multiple hypervisors involves leverage and lock-in. The leverage argument is based on pitting hypervisor vendors against one another in a pricing war to maximize discounts. From a lock-in perspective, the thinking is that using multiple hypervisors helps companies avoid being locked in and, therefore, beholden to one vendor.
On the surface, these arguments are sound and a multihypervisor environment makes sense. The problem is that the arguments exist in a vacuum; when you expand your view to the big picture, they begin to fall apart. Once each argument is examined in that broader context, we see that a multihypervisor environment is rarely the better choice.
The cost argument is the most common, so we’ll start there. This argument is based on hypervisor licensing and support contract costs, and is therefore capex-related. There's no arguing that capital can be saved by choosing less expensive hypervisors for the workloads that don’t require advanced management or reliability features. Opex is where the issue lies. At a minimum, utilizing two hypervisors requires two separate IT skill sets, deployment methods and management models. This is an ongoing expense that will quickly outweigh the capex savings. You'll need data relevant to your business to back this up: salary data for the responsible admins, training costs to ramp staff up on new systems, additional hiring requirements, and so on.
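To make the capex-versus-opex comparison concrete, here is a minimal sketch of the arithmetic in Python. Every name and every number in it is a hypothetical placeholder, not data from any vendor or customer; the point is only to show which of your own figures (licensing savings, admin salaries, training, ramp-up work) need to go into the comparison.

```python
from dataclasses import dataclass


@dataclass
class SecondHypervisorCase:
    """Hypothetical inputs for adding a second, cheaper hypervisor.

    All fields are illustrative assumptions; replace them with your own data.
    """
    annual_license_savings: float  # licensing/support saved per year
    extra_admin_fte: float         # added admin effort, in full-time equivalents
    loaded_admin_salary: float     # fully loaded annual cost of one admin
    annual_training_cost: float    # recurring training for the second platform
    one_time_rampup_cost: float    # initial training, tooling, deployment work

    def annual_opex_overhead(self) -> float:
        # Ongoing cost of carrying a second skill set and management model.
        return (self.extra_admin_fte * self.loaded_admin_salary
                + self.annual_training_cost)

    def net_savings(self, years: int) -> float:
        # Positive means the second hypervisor pays for itself over the period.
        yearly = self.annual_license_savings - self.annual_opex_overhead()
        return years * yearly - self.one_time_rampup_cost


# Made-up figures purely for illustration.
case = SecondHypervisorCase(
    annual_license_savings=50_000,
    extra_admin_fte=0.5,
    loaded_admin_salary=110_000,
    annual_training_cost=5_000,
    one_time_rampup_cost=20_000,
)
print(f"Annual opex overhead: {case.annual_opex_overhead():,.0f}")
print(f"Net savings over 3 years: {case.net_savings(3):,.0f}")
```

With these placeholder inputs the overhead exceeds the licensing savings, so the net figure is negative. Different inputs can of course flip the result, which is exactly why the numbers have to come from your own environment.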
While the cost argument makes sense from an isolated capex perspective, the leverage argument holds little weight on its own. Actually deploying a multihypervisor environment based on gaining vendor leverage isn't necessary. If required, vendors can be pitted against one another during the sales cycle without deploying more than one product--think playing poker. For the most part, hypervisor vendors are well aware that the competition has become stiff, and pricing plays a big role.
Like the leverage argument, the case for deploying multiple hypervisors to avoid lock-in holds little to no weight. It’s true that using multiple hypervisors prevents you from being beholden to one vendor, but you still incur the additional operational costs of managing both. Additionally, you’ll still suffer the same pain if you choose to ditch one of the vendors completely, although possibly on a smaller scale. The virtual machines and/or apps will have to be moved onto whichever hypervisor is staying around. You’d also need to bring in a new hypervisor to maintain a multihypervisor-to-avoid-lock-in environment. After all, if such an environment made sense before, why would it have changed?
The most common deployment for multiple hypervisors is VMware as the production system, with Microsoft Hyper-V or Citrix XenServer running test/development systems. VMware is chosen for the performance and features needed for production, while another hypervisor is chosen to lower cost in another environment. Applications are developed and tested on one system, then migrated to the production system. This is a dangerous idea. In the Marines we used a saying: "Train like we fight, fight like we train." This applies well outside of combat, too. A system thoroughly tested on one hypervisor has not been properly vetted for another. Testing should be done on identical systems, down to firmware revision.
Regardless of which hypervisor is chosen, most environments will incur the least cost and gain the most overall benefits from a single hypervisor deployment. Standardizing on a single hypervisor reduces complexity and configuration points and, therefore, opex. Ensure that you look at the big picture when making hypervisor decisions, as it’s easy to get wrapped up in myopic views that lead to poor decisions.
Disclaimer: In my primary role I work closely with several hypervisor vendors. This post is not intended as an endorsement of any of the vendors or products mentioned.