The one constant in data storage over the years is the inexorable increase in the volume of data that needs to be stored. While the industry at large has done an exceptional job keeping pace with volume by consistently driving down the cost per GB, other factors become more limiting as data continues growing. After all, as Albert Einstein reputedly said, “Only two things are infinite—the universe and human stupidity.” Dubious, to be sure, but the point is that storage capacity is not endless, as systems administrators everywhere readily attest; this is one of the biggest reasons computational storage (CS) is ripe for a surge into the mainstream. The other is growth in edge computing—where capacity, footprint, power, and lifespan are all factors.
Market research firm Gartner has acknowledged the growing importance of CS adoption in its Hype Cycle for Storage and Data Protection Technologies, 2022. The firm also notes that more than 40 percent of enterprise storage will be deployed at the edge in the next few years, and it predicts that by 2026, large enterprises will triple their unstructured data capacity stored as file or object storage on-premises, at the edge, or in the public cloud.
To date, the volume problem has been addressed simply by adding raw capacity, but scale-out is not a sustainable long-term strategy, even if hard drives are capacious and cheap. Put them into a data center where they belong, and the costs increase. Retain them and their contents, and the operational costs soon mount. Managing that data only gets harder as it scales into petabytes and beyond.
But the real problem with the endless expansion of devices is asymmetry. Storage keeps growing, but the computing and networking resources attached to it can't keep up. Storing data and extracting value from it become unwieldy as input/output bottlenecks crop up. The massive storage pool soon demands additional racks, compute, and networking resources: spiraling costs on the one hand, dwindling data utility on the other. Reduced to its lowest common denominator, the issue comes down to one thing: commodity flash storage has reached its limits. Raw capacity, in other words, can only get you so far before it collapses under its own weight.
That is where CS steps into the fray. Building aspects of the computer directly into the storage device, the solid-state drive (SSD), addresses the demand for computing and networking resources head-on, reducing I/O load and traffic. That means processor and network resources can be put to alternate (and presumably better) use, supporting improved infrastructure performance and efficiency.
CS is what it sounds like—storage with built-in data processing capabilities. It moves some of the ever-growing load of I/O processing to storage, reducing demand on the other computing staples: compute, memory, and networking. As storage at the edge grows, it becomes apparent why “intelligent” storage will be favored. As the industry looks for ways to improve lifespan, reduce infrastructure deployed, increase uptime, and add capacity, the features enabled by CS become essential to achieving those goals. Out of the gate, the key feature is hardware-based compression operating transparently to the host. Adopting this feature into the data pipeline flips the conventional perception of compression from being a performance limiter into a performance accelerator while at the same time reducing complexity throughout the architecture.
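To make the "accelerator" claim concrete, here is a rough back-of-the-envelope sketch in Python. The figures (a 2.5:1 compression ratio, write amplification easing from 3.0 to 2.0, a 10 PB endurance rating) are hypothetical illustrations, not measured or vendor-quoted numbers; the point is simply that when the drive compresses data before it reaches the flash, fewer physical bytes are programmed per host write.

# Back-of-the-envelope sketch with assumed, hypothetical figures: inline
# drive compression means each host byte lands as fewer physical bytes on
# the NAND, so the same media budget absorbs more host writes.
def host_writable_pb(rated_pbw, base_wa, comp_wa, comp_ratio):
    # Physical program budget implied by the endurance rating
    # (rated host petabytes written times the assumed write amplification).
    nand_budget = rated_pbw * base_wa
    # With compression, each host byte becomes 1/comp_ratio physical bytes,
    # multiplied by the (now lower) garbage-collection write amplification.
    return nand_budget / (comp_wa / comp_ratio)

# Hypothetical drive rated for 10 PB of host writes at a write amplification
# of 3.0; 2.5:1 compressible data and the extra free space ease write
# amplification to 2.0.
print(host_writable_pb(10, 3.0, 2.0, 2.5))  # roughly 37.5 PB of host writes

Under those assumptions, the same drive absorbs several times more host writes before wearing out, which is where the lifespan and uptime gains come from.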
CS SSDs are packaged as standard drives: no special software, application configuration, or drivers are needed, and they install like any other SSD, adding no complexity to the experience. With processors embedded in the drive, the benefits of offloading the server CPU from burdensome tasks such as compression and encryption are immediate. Drive-based transparent compression can increase usable capacity more than four-fold while reducing latency and alleviating CPU and memory bottlenecks. With compression shifted off the host, the CPU is freed for higher-priority tasks while still delivering the overall benefit of a reduced storage footprint with lower power and cooling requirements.
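As a rough illustration of the footprint math (again using assumed numbers, not measured results), the sketch below counts how many drives are needed to hold a fixed logical dataset with and without a 2.5:1 transparent compression ratio, and the power that difference represents.

import math

# Hypothetical sizing example: drives (and watts) needed to hold a fixed
# logical dataset, with and without transparent drive-side compression.
def drives_needed(logical_tb, drive_tb, compression_ratio):
    # Each drive effectively holds drive_tb * compression_ratio of host data.
    return math.ceil(logical_tb / (drive_tb * compression_ratio))

dataset_tb, drive_tb, watts_per_drive = 1000, 8, 12      # assumed figures
plain = drives_needed(dataset_tb, drive_tb, 1.0)         # 125 drives
compressed = drives_needed(dataset_tb, drive_tb, 2.5)    # 50 drives
print(plain - compressed, "fewer drives,",
      (plain - compressed) * watts_per_drive, "W less to power and cool")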
When applied to the edge, the same benefits are valued even more: greater capability within tight power budgets, longer lifespan, and lower maintenance expectations, all while accelerating response times for crucial applications such as Internet of Things (IoT) devices and 5G infrastructure.
When it comes down to it, drives with CS technology that are fully compatible with existing server and software deployments far surpass the utility of the ordinary commodity SSDs used today. Although programmable options offer additional flexibility for specific use cases, their deployment complexity often outweighs the benefits. When the device is a simple, standard install, like the mainstream options already available, the benefits are easy to realize and require no skills beyond those of an ordinary IT professional.
The bottom line? Modern computational storage is an easy-to-use solution, ready to help you do more with less when it comes to storage. Its advantages are clear, and CS is well positioned to become a mainstream standard.
JB Baker is VP of Marketing at ScaleFlux.