We once had a religious conflict about whether SAN or NAS was the best storage solution. Each side touted its benefits, resulting in entrenched IT customer bases locked into one form or the other.
The sad thing is that, though many of the claims were true, no single type of storage solution fit everybody's needs.
Now we are seeing changes in that polarized situation. First, people have recognized that how data is presented may be very different from how data is stored. Second, scaling out storage has strained both the NAS and SAN models to the limit. Third, object storage is affecting large storage deployments, especially in the cloud. As a result, we are witnessing a fusion of the best of these elements into a single converged storage solution.
Some of the early attempts in this area, such as Pillar Data's Axiom, carved a single pool of blocks into block or NAS virtual arrays. This removed the need for separate appliances, but it did not allow multiple access methods to the same data.
Recently we've seen a flurry of next-wave converged storage solutions dressed up with phrases like "data plane" and "software-defined storage." They allow a given storage element to be handled as a block, a file, or an object.
It’s worth discussing the differences between modes.
- Block IO dates back to the start of computing. Each server has its own map of the data, and the storage system sees instructions framed with physical addresses of disk blocks. The server's file stack translates the file address into a disk block address. Sharing is achieved by building SANs.
- NAS moves the file stack to a NAS appliance. The server sends the address of the data within the file to that appliance, which runs its file stack to translate the address into disk blocks. Sharing is intrinsic, since the appliance owns the data.
- Object storage is like NAS, except that it uses a database to find addresses. Each data block is replicated on several appliances for integrity. This allows for much larger data spaces and flattens access to data. In addition, the database approach allows for extensive file tagging.
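To make the three modes concrete, here is a minimal sketch of where the address translation happens in each one. Every class and name in it is hypothetical, the block size is an assumption, and a plain dictionary stands in for the object store's database. It is a toy model, not how any real array, filer, or object store is built.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


class FileStack:
    """Hypothetical file stack: turns (file, byte offset) into a disk block number."""
    def __init__(self, extents):
        self.extents = extents            # file name -> list of disk block numbers

    def to_block(self, name, offset):
        return self.extents[name][offset // BLOCK_SIZE]


class BlockArray:
    """Block IO: the storage system only understands block numbers."""
    def __init__(self, blocks):
        self.blocks = blocks              # block number -> bytes

    def read(self, lba):
        return self.blocks[lba]


class NasAppliance:
    """NAS: the appliance owns both the file stack and the blocks."""
    def __init__(self, file_stack, array):
        self.file_stack = file_stack
        self.array = array

    def read(self, name, offset):
        return self.array.read(self.file_stack.to_block(name, offset))


class ObjectStore:
    """Object storage: an index (standing in for the database) finds replicas and tags."""
    def __init__(self, appliances):
        self.appliances = appliances      # one dict per appliance
        self.index = {}                   # key -> {"where": [...], "tags": {...}}

    def put(self, key, data, replicas=2, **tags):
        n = len(self.appliances)
        first = len(self.index) % n       # naive placement, illustration only
        where = [(first + i) % n for i in range(min(replicas, n))]
        for a in where:
            self.appliances[a][key] = data
        self.index[key] = {"where": where, "tags": tags}

    def get(self, key):
        first_replica = self.index[key]["where"][0]
        return self.appliances[first_replica][key]

    def find(self, **tags):
        return sorted(k for k, entry in self.index.items()
                      if all(entry["tags"].get(t) == v for t, v in tags.items()))


array = BlockArray({7: b"hello", 9: b"world"})
stack = FileStack({"a.txt": [7, 9]})

# Block IO: the server runs the file stack itself, then sends a block number.
block_result = array.read(stack.to_block("a.txt", BLOCK_SIZE))

# NAS: the server sends (file, offset) and the appliance does the translation.
nas_result = NasAppliance(stack, array).read("a.txt", BLOCK_SIZE)

# Object storage: look up by key, or search by tag.
store = ObjectStore([{}, {}, {}])
store.put("photo-1", b"jpeg bytes", owner="alice")
obj_result = store.get("photo-1")
tagged = store.find(owner="alice")
```

The block and NAS paths return the same bytes; the only difference is which side of the wire runs the file stack. The object path skips block addresses entirely and goes through the index.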
Each mode has its own access protocols. SANs have Fibre Channel and iSCSI. NAS has CIFS and NFS, and object storage has REST and SOAP.
Converged storage is an object storage approach, with the access protocols abstracted from the storage. This allows software written to any mode to access any data, with the converged storage appliance handling translation.
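As a rough sketch of that abstraction, the toy class below keeps every piece of data as an object and exposes object, file, and block style reads over it. The class, its method names, and the 512-byte block size are all assumptions made for illustration; real products put actual protocols (iSCSI, NFS, REST, and so on) in front of the store rather than Python methods.

```python
BLOCK_SIZE = 512  # assumed logical block size


class ConvergedStore:
    """Hypothetical backing store: all data lives as objects keyed by name."""
    def __init__(self):
        self._objects = {}

    # Object-style access: put/get a whole object by key.
    def put_object(self, key, data):
        self._objects[key] = bytes(data)

    def get_object(self, key):
        return self._objects[key]

    # File-style access: read a byte range within a named file.
    def read_file(self, path, offset, length):
        return self._objects[path][offset:offset + length]

    # Block-style access: read one fixed-size block of a volume.
    def read_block(self, volume, lba):
        start = lba * BLOCK_SIZE
        return self._objects[volume][start:start + BLOCK_SIZE]


store = ConvergedStore()
payload = bytes(range(256)) * 8           # 2048 bytes of sample data
store.put_object("vol0", payload)

as_object = store.get_object("vol0")
as_file = store.read_file("vol0", 512, 4)
as_block = store.read_block("vol0", 1)
```

All three calls reach the same stored bytes; only the addressing differs, which is the translation the appliance is handling.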
These solutions also clear up issues with performance between modes. SAN and NAS have typically been quite close in performance, with an edge of perhaps 3 percent for Fibre Channel SAN. Object storage has lagged, largely because of the carryover of some misconceptions from the early days of development. For instance, to load-balance the system, each 64 KB block in an object went to a different appliance. This guaranteed the least efficient use of drive IO. Drive-centric thinking has corrected that, and object storage in these new appliances should be as fast as the other means of access.
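A back-of-the-envelope count shows why the 64 KB scattering hurt. The sketch below only counts how many separate chunk reads it takes to fetch one object; the 64 MiB object size and the 4 MiB alternative chunk size are illustrative assumptions, not figures from any product.

```python
KIB = 1024
MIB = 1024 * KIB


def chunk_requests(object_size, chunk_size):
    """Number of separate chunk reads needed to fetch a whole object."""
    return -(-object_size // chunk_size)  # integer ceiling division


object_size = 64 * MIB                                # assumed object size
small_chunks = chunk_requests(object_size, 64 * KIB)  # one 64 KB piece per appliance
large_chunks = chunk_requests(object_size, 4 * MIB)   # assumed larger chunk size
```

Under these assumptions the 64 KB layout needs 1,024 small reads scattered across appliances, against 16 larger sequential reads, so each drive spends far more of its time seeking than transferring data.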
Who is providing converged storage? EMC recently unveiled ViPR, a software solution that will be sold on both EMC and third-party hardware. NetApp Inc. also has a converged product, but it has no object storage capability and seems more like Axiom. Ceph, an open-source converged storage stack, is rapidly gaining acceptance. It allows storage to be built for less than $100 per terabyte; that should stop dead any arguments about capacity costs.
EMC will build a comprehensive infrastructure of tools around ViPR, but it won't come close to Ceph-based solutions in terms of price. Ceph will have a complete infrastructure pretty soon, but it's worth remembering that many tools aren't needed when storage is really cheap. Commercial off-the-shelf (COTS) hardware can add more than 10 times the capacity for the cost of an EMC dedupe tool.
Red Hat Inc. also has converged storage. Its inexpensive code will give Ceph some competition. With large scaleouts needing a low cost per terabyte, COTS hardware with either Red Hat or Ceph will be hard to beat.
Converged storage will absorb the other methods. In going this route, EMC is gambling with the crown jewels of its business, and it could be Linuxed by the open-source community.
After converged storage, GPU-assisted storage streams appear to be the next over-the-horizon concept.