Solid-state storage in the form of flash memory continues its march into every nook and cranny of the storage industry. As it becomes the dominant storage medium, it's become clear to me that future solid-state storage systems won't follow the large monolithic or dual-controller modular array models that have dominated the data center for the past decade or more. Instead, the systems that take best advantage of solid-state storage will adopt some sort of scale-out architecture.
Many industry observers have noticed that while each generation of Intel processors has delivered more compute power than its predecessor through a combination of faster clock rates and additional cores, each generation of disk drives has gotten bigger but not faster. In fact, this growing performance gap is frequently used as a justification for flash-based solid-state drives (SSDs). After all, if your disk drives can't deliver data fast enough to keep your servers busy, adding some flash can speed up your applications.
Since the controllers in almost all of today's storage systems are based on the same processors as your servers, the processor/disk performance gap has let manufacturers add CPU-intensive features like thin provisioning, snapshots and replication while also having each generation of controllers manage more capacity. A modular array with a petabyte of storage would have been unthinkable just a few years ago, but most vendors' products can manage that much today.
As vendors have added SSD support to their existing storage systems, they've discovered that, for the first time in years, the processors in those systems are running short on compute power. The problem is that the amount of processing power a controller needs isn't a function of the capacity it manages but of the number of IOPS the storage it manages can deliver. The 1,000 disk drives a typical modular array can manage deliver a total of somewhere between 100,000 and 200,000 IOPS, while a single typical MLC SSD can deliver 20,000 to 40,000 IOPS on its own. Put more than a handful of SSDs in an array designed for spinning disks, and the bottleneck will quickly shift from the disk drives to the controller.
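To put rough numbers on that shift, here's a back-of-envelope sketch in Python. The IOPS figures are the ranges cited above; the controller's processing budget is my own assumption for illustration, not any vendor's specification.

```python
# Back-of-envelope arithmetic, not a benchmark: roughly how many SSDs does it
# take to saturate a controller that was sized for spinning disks? The IOPS
# ranges come from the figures above; the controller budget is an assumption.

HDD_IOPS = 150                 # mid-range figure for a single spinning disk
SSD_IOPS = 30_000              # midpoint of the 20,000-40,000 IOPS MLC range

# A controller sized for ~1,000 disk drives has roughly this much IOPS headroom
# (the 100,000-200,000 range cited above).
controller_iops_budget = 1_000 * HDD_IOPS

ssds_to_saturate = controller_iops_budget / SSD_IOPS
print(f"Controller budget: {controller_iops_budget:,} IOPS")
print(f"SSDs needed to saturate it: about {ssds_to_saturate:.0f}")
# Roughly five SSDs -- a handful of flash drives moves the bottleneck
# from the media to the controller.
```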
Just as flash has forced us to start thinking about storage costs in dollars per IOP as well as dollars per gigabyte, storage system designers now have to think in terms of CPU cycles per IOP rather than CPU cycles per gigabyte or CPU cycles per spindle.
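Here's a minimal illustration of that change in units, again with assumed rather than measured numbers; the core count, clock rate, drive counts and capacities are hypothetical, chosen only to show how the per-gigabyte and per-IOP views diverge.

```python
# A minimal sketch of the metric shift, using assumed (not measured) numbers:
# the same controller CPU budget looks generous against a shelf of disks but
# thin against a shelf of flash, which is why cycles-per-IOP is the ratio
# that matters.

CONTROLLER_CYCLES_PER_SEC = 8 * 2 * 10**9       # assume 8 cores at 2 GHz

disk_shelf  = {"iops": 1_000 * 150, "tb": 1_000 * 2.0}   # 1,000 x 2 TB disks
flash_shelf = {"iops": 50 * 30_000, "tb": 50 * 0.4}      # 50 x 400 GB SSDs

for name, shelf in (("disk", disk_shelf), ("flash", flash_shelf)):
    cycles_per_iop = CONTROLLER_CYCLES_PER_SEC / shelf["iops"]
    print(f"{name:5s}: {shelf['iops']:>9,.0f} IOPS, "
          f"{cycles_per_iop:>9,.0f} CPU cycles available per IOP")
```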
Look at the latest crop of all-solid-state or clean-slate hybrid array designs from companies like Pure Storage, Nimble, NexGen, Tegile or Tintri, and you'll notice they aren't traditional scale-up designs that support four or more drive shelves from a single set of controllers. Instead, these vendors have limited expandability to make sure they have enough CPU to manage the storage in each system. This also ensures that they have CPU cycles for features like compression and data deduplication that reduce the cost/capacity gap between flash and disk storage.
Clearly, if we're going to have all-solid-state or even substantially solid-state arrays that manage more than 50 or so SSDs, those systems are going to need more compute horsepower. The easiest way to deliver that is a scale-out architecture. The next-generation vendors that do support significant expansion, such as Kaminario, SolidFire, Whiptail and XtremIO, use a scale-out architecture that adds compute power as it adds storage capacity. Vendors that don't scale out are relying on host storage management features like vSphere Storage DRS and Windows Server 2012's Storage Spaces to make managing multiple independent storage systems easier.
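To see why adding compute alongside capacity matters, here's one more rough sketch. The per-node drive count and CPU figures are hypothetical, but they show how the cycles-per-IOP ratio holds steady as nodes are added, where a fixed dual-controller design would see it shrink with every SSD.

```python
# A minimal sketch, with hypothetical per-node figures, of why scale-out
# sidesteps the controller bottleneck: every node added brings its own CPU
# along with its SSDs, so cycles-per-IOP stays flat as capacity grows.

NODE_SSD_IOPS = 10 * 30_000         # assume 10 MLC SSDs per node
NODE_CPU_CYCLES = 8 * 2 * 10**9     # assume 8 cores at 2 GHz per node

for nodes in (1, 2, 4, 8):
    total_iops = nodes * NODE_SSD_IOPS
    cycles_per_iop = (nodes * NODE_CPU_CYCLES) / total_iops
    print(f"{nodes} node(s): {total_iops:>9,} IOPS, "
          f"{cycles_per_iop:,.0f} CPU cycles per IOP")
```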
I have seen the future and it is scale-out. Not just for files and big data, but for everyone.