In the never-ending game of leapfrog between processing, memory, and I/O, the network has become the new server bottleneck. Today's 10 GbE server networks simply cannot keep up with the processor's insatiable demand for data. The result is expensive servers equipped with the latest CPUs that draw power at an astounding rate while their massively parallel cores, running at gigahertz clock speeds, sit busily doing nothing.
Giant multi-core processors with underfed network I/O are starved of the data they need to keep processing. Architects at the hyperscale data centers have recognized the need for higher network bandwidth and have jumped to the new 25, 50, and even 100 GbE networks to keep their servers fed with data.
It's important to understand why these industry leaders have decided to adopt faster networks. There are three reasons why I/O has become the server bottleneck and faster networks are needed:
- CPUs with more cores need more data to feed them
- Faster storage needs faster networks
- Software-defined everything (SDX) uses networking to save money
Multi-core CPUs need fast data pipes
Despite the many predictions of the imminent demise of Moore's Law, chip vendors have continued to rapidly advance processor and memory technologies. The latest x86, POWER, and ARM CPUs offer dozens of cores and deliver hundreds of times the processing capability of the single-core processors available at the start of the century. For example, this summer IBM announced the POWER9 architecture, slated to be available in the second half of 2017 with 24 cores.
Memory density and performance have advanced rapidly as well. Today's advanced processor cores demand more data than ever to keep them fed, and you don't want the CPU-memory subsystem, the most expensive component of the server, sitting idle waiting for data. It's simple math, sketched below, that a faster network pays for itself through improved server efficiency.
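To make that math concrete, here is a back-of-the-envelope sketch. The server cost, NIC upgrade cost, and utilization figures are illustrative assumptions rather than published numbers, but the shape of the argument holds: a modest spend on faster I/O buys back a disproportionate amount of expensive CPU time.

```python
# Back-of-the-envelope sketch: when does a faster NIC pay for itself?
# All prices and utilization figures below are illustrative assumptions,
# not vendor quotes.

server_cost = 20_000.0     # assumed cost of a dual-socket server (USD)
nic_upgrade_cost = 400.0   # assumed extra cost of a 25 GbE NIC over 10 GbE (USD)

# Assume the 10 GbE link leaves the CPUs waiting on data 15% of the time,
# and the 25 GbE link reduces that stall to 5%.
utilization_10gbe = 0.85
utilization_25gbe = 0.95

# Effective cost per unit of useful CPU work:
effective_cost_10gbe = server_cost / utilization_10gbe
effective_cost_25gbe = (server_cost + nic_upgrade_cost) / utilization_25gbe

print(f"Effective cost per unit of work @10GbE: ${effective_cost_10gbe:,.0f}")
print(f"Effective cost per unit of work @25GbE: ${effective_cost_25gbe:,.0f}")
# With these assumptions the faster NIC wins: roughly $21,500 versus
# $23,500 per unit of fully utilized server capacity.
```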
Faster storage needs faster networks
Just a few short years ago, the vast majority of storage was based on hard-disk drive technology -- what I like to call “spinning rust” -- with data access times of around 10 milliseconds and supporting only around 200 I/O operations per second (IOPS).
Today's advanced flash-based solid-state drives access data at least 100 times faster, and a single NVMe flash drive can deliver around 30 Gbps of bandwidth and over one million IOPS. With new technologies like 3D XPoint and ReRAM just around the corner, access times are set to drop by another factor of 100.
These lightning-quick access times mean that servers equipped with solid-state storage need at least 25 Gbps networks to take full advantage of the available performance; a 10 Gbps connection leaves two thirds of a single NVMe drive's bandwidth stranded, as the quick calculation below shows. The same holds when accessing data from all-flash arrays from traditional storage vendors like EMC, NetApp, and Pure Storage, where an enormous amount of performance can be trapped in the centralized array, warranting 50 or even 100 Gbps networks.
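The stranded-bandwidth claim is straightforward arithmetic, shown here using the roughly 30 Gbps figure quoted above for a single NVMe drive; the link speeds compared are simply the standard Ethernet rates discussed in this article.

```python
# Quick arithmetic behind the "stranded bandwidth" claim, using the
# figures from the text (a single NVMe drive at ~30 Gbps).

nvme_drive_gbps = 30.0   # approximate bandwidth of one NVMe flash drive

for link_gbps in (10.0, 25.0, 50.0):
    usable = min(link_gbps, nvme_drive_gbps)
    stranded = max(nvme_drive_gbps - link_gbps, 0.0)
    print(f"{link_gbps:>4.0f} GbE link: {usable:.0f} Gbps usable, "
          f"{stranded:.0f} Gbps ({stranded / nvme_drive_gbps:.0%}) stranded")

# A 10 GbE link leaves 20 of the drive's 30 Gbps -- two thirds -- unreachable,
# while 25 GbE captures most of it and 50 GbE leaves headroom for multiple
# drives or an all-flash array.
```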
Software-defined everything
The last driver for higher-performance server connectivity is the trend toward software-defined everything (SDX). This was perhaps best explained by Albert Greenberg of Microsoft in his ONS keynote presentation, where he said, "To make storage cheaper we use lots more network!" He went on to explain that to make Azure Storage scale, they use RoCE (RDMA over Converged Ethernet) at 40 Gb/s to achieve "massive COGS savings."
The key realization here is that with software-defined storage, the network becomes a vital component of the solution and the key to achieving the cost savings available with industry-standard servers. Instead of buying purpose-built networking, storage, or database appliances engineered for five-nines reliability, an SDX architecture takes a fundamentally different approach.
Start with off-the-shelf servers offering three-nines reliability and engineer a software-defined system that achieves five-nines reliability through mirroring, high availability, and erasure coding. Instead of a high-bandwidth backplane to stripe data across disks, you simply use the network to stripe data across multiple servers. Of course you need a lot more network to do this, but the cost savings, scalability, and performance benefits are dramatic, as the sketch below illustrates.
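Here is a minimal sketch of that reliability argument, assuming (simplistically) independent server failures; the replica counts and the replicated_availability helper are purely illustrative, not a description of any particular SDX product.

```python
# Minimal sketch: replicate data across several "three-nines" servers and
# compute the availability of the replicated set, assuming independent
# failures (a simplifying assumption).

def replicated_availability(single_server_availability: float, replicas: int) -> float:
    """Data is unavailable only if every replica is down at the same time."""
    p_down = 1.0 - single_server_availability
    return 1.0 - p_down ** replicas

three_nines = 0.999
for n in (1, 2, 3):
    a = replicated_availability(three_nines, n)
    print(f"{n} replica(s): {a:.9f} availability")

# One copy gives 0.999; two independent copies already exceed five nines
# (0.999999); three copies push it further still. Real systems also have
# to handle correlated failures and rebuild traffic, which is exactly
# where the extra network bandwidth goes.
```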
Buying the most powerful server possible has always been a wise investment, and that is truer than ever with powerful multi-core processors, faster solid-state storage, and software-defined everything architectures. But taking full advantage of these server and software components requires more I/O and network bandwidth.