Non-Uniform Memory Access (NUMA) is becoming increasingly commonplace on the next generation of very powerful servers. This is nothing new in the AMD product line: Opteron is a NUMA architecture, and the performance boost it delivered catapulted AMD ahead of the curve in the mid-2000s. Intel has been trying to catch up for quite some time, and the latest generation of Intel Xeon Nehalem processors sports not only NUMA but better virtualization assist (VT-x) as well.
What does this mean for virtualization applications on the latest VMware incarnation? Serious performance increases for NUMA-equipped systems.
Before we start on getting the most out of your hot new Nehalem rig or that brand new HP DL585 Opteron-equipped server, let's review what makes NUMA different.
NUMA is the logical successor to Symmetric Multiprocessing. In Symmetric Multiprocessing (SMP), multiple processors and cores are tied to a single memory controller. Each processor has uniform, or symmetric, access to all of the available memory. Access to memory is limited, though, because all CPUs share a common bus with a fixed amount of bandwidth.
In NUMA architectures, memory resources are allocated specifically to different processors and groups of cores (multiple buses). The most common way to do this is to build a memory controller into each socket. The next step is to connect these processors with a high-speed interface: AMD did it first with HyperTransport, and the new Nehalem-based Xeon 3500 and 5500 series use a technology called QuickPath. Each processor reaches its own memory directly and reaches memory attached to another socket across that interface, which is why access times are non-uniform. The end result is a huge increase in performance for Intel chips: in preliminary AnandTech benchmarks against Penryn-based systems, as much as a 100% improvement in processing and a drop of a third in memory latency.
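To make the "non-uniform" part concrete, here is a minimal sketch in Python of how average memory latency depends on how often a processor finds its data in the memory attached to its own socket. The function name and the latency figures are illustrative assumptions, not measurements from any Opteron or Nehalem system.

```python
def average_latency_ns(local_fraction: float, local_ns: float, remote_ns: float) -> float:
    """Average access latency when some accesses cross the inter-socket link."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


# Assumed, made-up figures: 60 ns for local memory, 100 ns for remote memory.
for fraction in (1.0, 0.5, 0.0):
    print(fraction, average_latency_ns(fraction, 60.0, 100.0))
```

The more of a workload's memory sits on the local node, the closer the average gets to the local figure, which is the whole point of keeping a virtual machine and its memory together.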
This is great news for virtualization, especially because our friends at VMware have built NUMA support directly into vSphere in a transparent and self-optimizing fashion. It works like this:
- The NUMA scheduler places each guest virtual machine on a particular home node containing processor and memory resources.
- When memory is allocated to the virtual machine, it is allocated from the home node.
- The NUMA scheduler moves virtual machines to different home nodes whenever it is advantageous to do so, for example when allocating more memory to a machine on its current node would violate locality.
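The three steps above can be sketched as a toy model in Python. This is only an illustration under stated assumptions: the class `ToyNumaScheduler`, its "most free memory wins" placement rule, and the `grow` method are hypothetical and are not VMware's actual algorithm or API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class VM:
    name: str
    mem_mb: int
    home: Optional[int] = None  # id of the home node, once placed


class ToyNumaScheduler:
    """Hypothetical model of home-node placement; not VMware's algorithm."""

    def __init__(self, node_free_mb: Dict[int, int]):
        self.free = dict(node_free_mb)  # node id -> free memory in MB

    def place(self, vm: VM) -> int:
        # Step 1: pick a home node (assumed rule: the node with the most free memory).
        node = max(self.free, key=self.free.get)
        if self.free[node] < vm.mem_mb:
            raise MemoryError(f"no single node can hold {vm.name}")
        # Step 2: allocate the VM's memory from its home node.
        self.free[node] -= vm.mem_mb
        vm.home = node
        return node

    def grow(self, vm: VM, extra_mb: int) -> int:
        # Growing in place keeps all of the VM's memory local.
        if self.free[vm.home] >= extra_mb:
            self.free[vm.home] -= extra_mb
            vm.mem_mb += extra_mb
            return vm.home
        # Step 3: growing in place would break locality, so release the old
        # allocation and re-home the VM on a node that fits the new size.
        self.free[vm.home] += vm.mem_mb
        vm.mem_mb += extra_mb
        return self.place(vm)


sched = ToyNumaScheduler({0: 8192, 1: 16384})
web = VM("web", 4096)
db = VM("db", 10240)
sched.place(web)
sched.place(db)
sched.grow(web, 4096)  # node 1 is now too full, so "web" gets a new home
print(web.home, sched.free)
```

A real scheduler also weighs CPU load and the cost of moving memory between nodes; the sketch only tracks free memory to keep the idea visible.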
There are a few things to keep in mind to get the best performance out of the NUMA scheduler. First, make sure NUMA is turned on in the BIOS. Almost all of these machines have a memory interleaving setting: when interleaving is turned on, memory banks are interleaved and memory access becomes uniform, which means that the NUMA scheduler can't operate and NUMA is effectively disabled.
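A small Python sketch may help show why interleaving defeats the scheduler. It assumes a simplified machine where memory is handed out in fixed-size chunks; the function names, the chunk granularity, and the node sizes are all hypothetical, and real hardware interleaves at a granularity set by the platform.

```python
from typing import Set


def node_for_chunk(chunk: int, num_nodes: int, chunks_per_node: int,
                   interleaved: bool) -> int:
    """Return which node's memory backs a given chunk of physical memory."""
    if interleaved:
        # Interleaving on: consecutive chunks rotate across all nodes.
        return chunk % num_nodes
    # Interleaving off (NUMA mode): each node owns one contiguous range.
    return chunk // chunks_per_node


def nodes_touched(first_chunk: int, count: int, num_nodes: int,
                  chunks_per_node: int, interleaved: bool) -> Set[int]:
    """Set of nodes that back a contiguous run of chunks."""
    return {
        node_for_chunk(c, num_nodes, chunks_per_node, interleaved)
        for c in range(first_chunk, first_chunk + count)
    }


# Assumed layout: 2 nodes, 1024 chunks each, and an 8-chunk allocation.
print(nodes_touched(0, 8, 2, 1024, interleaved=False))
print(nodes_touched(0, 8, 2, 1024, interleaved=True))
```

With interleaving off, the allocation stays on one node and can be kept local to a home node. With interleaving on, even a small allocation is spread across every node, so there is no local memory left for the scheduler to prefer.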