SMART Storage Systems recently announced the availability of flash storage on the memory channel of a server. At first blush, this seems to be just another place in the storage food chain where flash devices can be inserted, but it is actually much more. There are a number of immediate applications for this new technology, and it also has profound implications for the future of storage as the distinction between main memory and primary storage vanishes.
But before we explore the implications, let's take a closer look at SMART's product. The company said that its ULLtraDIMM product, which contains its own NAND flash memory, can be plugged into a standard DDR3 DIMM socket, where it can be accessed directly through a server memory channel. DDR3 is a DRAM interface specification, so the server implicitly assumes that it is accessing DRAM over the memory channel. A DIMM is a dual in-line memory module that typically contains DRAM chips; making flash acceptable to the server over this interface required memory channel expertise. That is why SMART (which SanDisk announced in July it is acquiring) partnered with Diablo Technologies.
Diablo's contribution is what it calls Memory Channel Storage (MCS), an interface to flash and other non-volatile memory. SMART adds its Guardian technology, which improves flash performance in various ways, including extending NAND endurance to manage the physics of flash and handling error management. ULLtraDIMM is the first flash memory to sit directly on the memory channel, and no faster access to flash is possible than through this approach. That is why SMART calls it the “final frontier” for storage latency. The next closest has been PCI-e flash storage; other implementations of flash, such as arrays, have even higher latencies. Local, persistent storage connected to the memory bus eliminates the arbitration and data contention on the I/O hub that these other implementations have to accept.
Placing flash storage on the memory channel takes advantage of hardware that has long been tuned for CPU performance. Memory controllers in processors are designed to achieve the absolute lowest latency possible in accessing data for applications, and they are massively parallel, which is essential for scaling memory bandwidth as the number of memory modules increases. Because the links to the processor are direct, the result is the lowest latency possible -- usually less than 5 microseconds for a write. That's surprisingly close to DRAM, which is inherently faster.
Critically important is that this architecture enables deterministic, i.e., predictable, latency. That means that even as the queue depth increases to hundreds or thousands of outstanding requests, flash on the memory channel can still deliver consistent latency for each request. (Naturally, response times may lengthen if the device's available bandwidth is maxed out.) A minimal sketch of how one might check this consistency follows.
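To make "consistent latency" concrete, here is a minimal sketch in plain Linux C of one way to test it: issue many reads, time each one, and compare the median to the 99th percentile. The device path /dev/ulltra0 and the 4 KB request size are illustrative assumptions, not vendor specifics.

```c
/* Sketch: measure per-request read latency and report median vs. p99.
 * Device path and request size are hypothetical placeholders. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define REQUESTS 10000
#define REQ_SIZE 4096

static int cmp(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

int main(void)
{
    static double lat_us[REQUESTS];
    char buf[REQ_SIZE];
    int fd = open("/dev/ulltra0", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    for (int i = 0; i < REQUESTS; i++) {
        struct timespec t0, t1;
        off_t off = (off_t)(rand() % 1024) * REQ_SIZE; /* random 4 KB block */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (pread(fd, buf, REQ_SIZE, off) != REQ_SIZE) { perror("pread"); return 1; }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        lat_us[i] = (t1.tv_sec - t0.tv_sec) * 1e6 +
                    (t1.tv_nsec - t0.tv_nsec) / 1e3;
    }
    close(fd);

    qsort(lat_us, REQUESTS, sizeof lat_us[0], cmp);
    /* Deterministic latency should show a p99 close to the median. */
    printf("median: %.1f us, p99: %.1f us\n",
           lat_us[REQUESTS / 2], lat_us[(int)(REQUESTS * 0.99)]);
    return 0;
}
```

Note that this serial loop does not itself exercise queue depth; a fuller test would issue requests concurrently (from multiple threads or via asynchronous I/O) and check that the percentile gap stays narrow as outstanding requests pile up.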
The architecture also has other benefits. Both DRAM and flash can be accessed in a blend on the memory channel (in fact, some DRAM is required). Depending upon requirements, flash devices can be presented as block storage devices, as if they were hard disk drives (HDDs), or as memory-mapped devices, as sketched below.
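For a concrete picture of those two access modes, here is a minimal sketch in plain Linux C that reads the same region once through ordinary block I/O and once through a memory mapping. Again, the device path /dev/ulltra0 is a hypothetical placeholder, not SMART's actual naming.

```c
/* Sketch: two ways an application might reach flash exposed on the
 * memory channel under Linux. The device path is hypothetical. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    int fd = open("/dev/ulltra0", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    /* 1. Block-device style: read a 4 KB "sector" through ordinary
     *    I/O calls, exactly as if the module were an HDD or SSD. */
    if (pread(fd, buf, sizeof buf, 0) != (ssize_t)sizeof buf) {
        perror("pread"); close(fd); return 1;
    }

    /* 2. Memory-mapped style: map the same region into the address
     *    space and touch it with plain loads -- no further system
     *    calls on the data path. */
    void *p = mmap(NULL, sizeof buf, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); close(fd); return 1; }

    printf("first byte via pread(): %u, via mmap(): %u\n",
           (unsigned char)buf[0], *(unsigned char *)p);

    munmap(p, sizeof buf);
    close(fd);
    return 0;
}
```

The practical difference is that the memory-mapped path lets an application touch persistent data with plain loads and stores, skipping the block layer entirely.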
Potential Applications
High-performance, demanding applications that require characteristics such as low latency, consistent latency, and scalability would seem like good candidates for flash storage modules attached to the memory channel.
One example is high-frequency trading in financial services. With such an ultra-high-performance application, why would one not use all DRAM instead of a combination of DRAM and flash? The combination could be significantly less expensive, and the small loss of performance may be more than offset by the lower price; the back-of-envelope sketch below illustrates the kind of savings involved.
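The arithmetic is simple but worth seeing. In this sketch, every number (the per-gigabyte prices, the working-set size, and the 10% "hot data in DRAM" split) is a hypothetical placeholder, not a quote for any actual product.

```c
/* Back-of-envelope cost comparison for a fixed working set split
 * between DRAM and flash. All figures are hypothetical placeholders. */
#include <stdio.h>

int main(void)
{
    const double working_set_gb = 512.0;
    const double dram_per_gb    = 10.0;  /* hypothetical $/GB */
    const double flash_per_gb   = 1.0;   /* hypothetical $/GB */
    const double dram_fraction  = 0.10;  /* keep the hottest 10% in DRAM */

    double all_dram = working_set_gb * dram_per_gb;
    double blended  = working_set_gb * dram_fraction * dram_per_gb
                    + working_set_gb * (1.0 - dram_fraction) * flash_per_gb;

    printf("all-DRAM: $%.0f, DRAM+flash blend: $%.0f (%.0f%% of all-DRAM)\n",
           all_dram, blended, 100.0 * blended / all_dram);
    return 0;
}
```

Under these assumed prices, the blended configuration costs roughly a fifth of the all-DRAM one, which is the kind of gap that can justify a modest latency penalty.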
Other applications include increasing transactions per second for a database application, increasing the number of virtual machines per node, servicing more virtual desktop users faster, and serving as a platform for big data analytics, such as part of a Hadoop ecosystem.
Personally, I think in-memory computing for real-time business will be a big beneficiary of this approach. For example, SAP HANA has very powerful, sophisticated technology that currently employs DRAM as if it were primary storage and not just memory. Still, DRAM is much more expensive than flash. If I were SAP or one of the other in-memory computing suppliers, I would be dancing in the streets: Economics 101 price elasticity says that market potential goes up dramatically as price decreases.