Back in 2012, in the pages of Network Computing, author Howard Marks said he "had seen the future of solid-state storage, and it was scale-out." Marks' underlying question, at the time, was "which scale-out storage architecture is best?" Today, there's no question: it's any architecture that allows you to meet your needs today and to easily scale for future requirements, without compromises.
Still, achieving that goal is not easy. Even though all-flash storage has emerged to deliver the performance (high IOPS, low latency) that modern business applications require, coping with the exponential growth in mission-critical data remains difficult.
The reality of scaling has given rise to the terms scale-up and scale-out, which describe different approaches to growth. The problem is that scale-up and scale-out were first defined in computing, not in storage. In co-opting those terms for storage, vendors and users have created their own definitions, muddling the matter. It's time to put those many and varied definitions aside and instead consider how to scale smartly.
Simply put, what any enterprise needs is the ability to scale its storage to its corporate requirements -- easily, instantaneously, and non-disruptively, with no interruption to data center operations or performance. Further, scaling should be achievable in increments, so that capacity and performance requirements can be met at any time without having to overprovision.
Let's look at the challenges you may face, keeping the end game in view: high performance at any capacity.
Know the challenges of scaling
Can you scale instantaneously and without disruption?
In a perfect world, you would be able to scale with no downtime and no degradation in storage performance. The problem is that adding physical storage to an existing array nearly always requires taking the system down. However, there are approaches that are entirely non-disruptive.
Understand the risks of losing a node
In some scale-out implementations, if one array goes down, the entire cluster can go down. That's the outcome of a design that spreads metadata across all the arrays -- a design that is prevalent in the enterprise storage sector. However, if each node operates independently, but still adds capacity and performance within a single namespace, then the loss of a single node does not bring down the remaining nodes.
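To make the distinction concrete, here is a minimal, purely illustrative Python sketch of the two failure behaviors. The class names, capacities, and the assumption that a shared-metadata design fails as a unit are hypothetical simplifications for the example, not any vendor's implementation.

```python
# Illustrative-only model of the two failure behaviors described above.
# Class and attribute names are hypothetical, not any vendor's API.

from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    capacity_tb: float
    online: bool = True

@dataclass
class SharedMetadataCluster:
    """Metadata is striped across every array: one node down, cluster down."""
    nodes: list = field(default_factory=list)

    def available_capacity_tb(self) -> float:
        if any(not n.online for n in self.nodes):
            return 0.0  # the whole namespace is unavailable
        return sum(n.capacity_tb for n in self.nodes)

@dataclass
class IndependentNodeCluster:
    """Each node serves its own data within a single namespace."""
    nodes: list = field(default_factory=list)

    def available_capacity_tb(self) -> float:
        # Only the failed node's share is lost; the rest keeps serving I/O.
        return sum(n.capacity_tb for n in self.nodes if n.online)

nodes = [Node("array-1", 100), Node("array-2", 100), Node("array-3", 100)]
nodes[1].online = False  # simulate losing one array

print(SharedMetadataCluster(nodes).available_capacity_tb())   # 0.0
print(IndependentNodeCluster(nodes).available_capacity_tb())  # 200.0
```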
What is the real capacity?
Vendors publish numbers for the total capacity their solutions support, but those numbers are often misleading. Frequently, a significant portion of the total capacity goes to metadata, caching, or other system processes, sharply reducing the capacity actually available to applications.
Another thing to be aware of is the difference between effective and usable capacity. Some vendors publish effective capacities, some publish usable, and others provide both. Usable capacity is the amount of storage available after system overhead, such as RAID parity space or system cache, is taken into account. Effective capacity is an estimate of what the available storage will be once deduplication and compression are factored in. The problem with that number is that the effects of deduplication and compression are entirely application-dependent; no two deployments will see the same reduction. When selecting a solution, be sure you know what the actual usable capacity will be in a configuration running your applications.
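As a back-of-the-envelope illustration of why these distinctions matter, here is a short Python sketch. The overhead percentages and the 3:1 data-reduction ratio are assumptions chosen for the example, not figures from any product.

```python
# Back-of-the-envelope capacity math with assumed numbers; your overhead
# percentages and data-reduction ratio will differ by vendor and workload.

raw_tb = 100.0          # advertised ("total") capacity
system_overhead = 0.20  # metadata, cache, spare space (assumed 20%)
raid_parity = 0.125     # parity overhead (assumed, roughly RAID-6-like)

usable_tb = raw_tb * (1 - system_overhead) * (1 - raid_parity)

# "Effective" capacity applies a data-reduction ratio that is entirely
# application-dependent; 3:1 here is an optimistic assumption, not a given.
data_reduction_ratio = 3.0
effective_tb = usable_tb * data_reduction_ratio

print(f"Usable:    {usable_tb:.1f} TB")     # ~70.0 TB
print(f"Effective: {effective_tb:.1f} TB")  # ~210.0 TB, only if 3:1 holds
```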
Understand architectural limitations
The traditional definition of scale-out, as used in reference to adding compute nodes, implies the ability to add any number of nodes with no practical limit. That is not always true of storage.
The architecture of your storage system dictates its maximum capacity. While you can grow capacity and performance by adding nodes to a storage cluster, every vendor's implementation limits the number of nodes that can be added. Often scale-out systems can cluster only a handful of identical arrays. Other implementations require users to add nodes in pairs rather than as individual systems.
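One way to think about these constraints is as a pre-purchase sanity check. The sketch below is hypothetical: the node limit, pairs-only rule, and identical-model rule are illustrative parameters, not a specific vendor's limits.

```python
# Hypothetical pre-purchase sanity check: does a planned expansion fit the
# cluster's architectural limits? Limits and names are illustrative only.

def can_expand(current_nodes: int, nodes_to_add: int, current_model: str,
               new_model: str, max_nodes: int = 8, pairs_only: bool = True,
               identical_models_only: bool = True) -> tuple[bool, str]:
    if current_nodes + nodes_to_add > max_nodes:
        return False, f"exceeds the {max_nodes}-node cluster limit"
    if pairs_only and nodes_to_add % 2 != 0:
        return False, "this architecture only accepts nodes in pairs"
    if identical_models_only and new_model != current_model:
        return False, "cluster requires identical array models"
    return True, "expansion fits within the architecture's limits"

print(can_expand(4, 2, "gen2-array", "gen2-array"))  # fits
print(can_expand(4, 1, "gen2-array", "gen2-array"))  # rejected: pairs only
print(can_expand(6, 4, "gen2-array", "gen3-array"))  # rejected: node limit
```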
Overcome the limitations of older technology
Look at the storage arrays in your current environment. If they are several years old, you may not be able to scale capacity by adding a current-generation array, even if one is available. That goes back to the problem of exact duplication: if you have an older-generation array and need to scale your storage, you may have no choice but to supplement it with the same model of array. The only practical alternative may be rip-and-replace: upgrading to a current model.
There are no simple answers to the challenge of scaling storage effectively. But focusing on arbitrary definitions of terms, such as scale-up and scale-out, too often places the emphasis on managing a process rather than achieving objectives. It may sound like a pat question, but the best question to ask may be: How will my storage needs change in the coming two to three years, and how can I best get there from here?