With an apparently insatiable need for I/O performance and near-line storage, the industry has produced a couple of flash products that use a DIMM model in the search for speed.
Using the memory bus as a means to access flash makes for a low-latency pathway and creates a storage technology faster than any PCIe solution on the market. This is what SanDisk and Diablo did with their UltraDIMM technology, and it allows a tighter coupling between the app and data with an option to bypass the OS storage stack.
Using UltraDIMMs as a flash accelerator in place of a PCIe solution is attractive, and is one reason IBM recently announced servers with UltraDIMMs. These designs achieve five-microsecond latency, and IBM’s target market for the technology is in-memory databases, where latency is a crucial issue. Most importantly, that latency is sustained consistently under high workloads and with the memory scaled out to large configurations, which is important in financial applications. PCIe devices start at 20-microsecond latencies and slow down under high I/O loads, with much more erratic latencies that can reach milliseconds.
Sequential read performance is 1GB/sec per UltraDIMM and increases almost linearly with the number of DIMMs installed. At 400GB, the DIMM’s capacity is smaller than that of PCIe flash accelerators, but capacity can be increased by installing multiple units, to as much as 12.8TB in the IBM systems, for instance. In a test with 10,000 VDIs, SanDisk reports having doubled the number of VMs it could service in a server compared with PCIe SSD, which more than halved the cost of the deployment.
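The scaling arithmetic above can be sketched in a few lines. This is only an illustration built on the figures quoted in this article (400GB and 1GB/sec per DIMM), assuming 1TB = 1,000GB. The function names and the `scaling_efficiency` parameter are hypothetical, and the 0.95 default is a placeholder for "almost linear," not a measured value.

```python
# Figures quoted in the article for a single UltraDIMM.
DIMM_CAPACITY_GB = 400
DIMM_SEQ_READ_GB_PER_SEC = 1.0


def dimms_for_capacity(total_gb, per_dimm_gb=DIMM_CAPACITY_GB):
    """Smallest number of DIMMs whose combined capacity reaches total_gb."""
    return -(-total_gb // per_dimm_gb)  # integer ceiling division


def aggregate_read_gb_per_sec(dimm_count, scaling_efficiency=0.95):
    """Estimated sequential read rate for dimm_count DIMMs.

    scaling_efficiency is an assumed placeholder for "almost linear"
    scaling; 1.0 would mean perfectly linear.
    """
    return dimm_count * DIMM_SEQ_READ_GB_PER_SEC * scaling_efficiency


if __name__ == "__main__":
    count = dimms_for_capacity(12_800)  # 12.8TB expressed in GB
    print(count, "DIMMs needed for 12.8TB")
    print(aggregate_read_gb_per_sec(count), "GB/sec estimated sequential read")
```

By this arithmetic, the 12.8TB maximum corresponds to 32 DIMMs of 400GB each.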
The SanDisk-Diablo approach is relatively easy to integrate into servers, though BIOS extensions and drivers are required.
Viking Technology took a different tack. It has created a DRAM backed up by flash memory. Viking's ArxCis product is effectively a non-volatile DRAM. As such, atomic reads and writes take the same time as in normal DRAM, and block I/O is completely avoidable. It’s possible to operate on the memory with standard CPU commands and write a single byte if desired.
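To make the contrast with block I/O concrete, here is a minimal sketch that changes one byte in two ways: a block-style read-modify-write and a direct store into a memory-mapped region. It is an analogy only. An ordinary temporary file stands in for the persistent region, the 4,096-byte block size is an assumption, and none of the function names come from Viking's software.

```python
import mmap
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size for the block-I/O path


def write_byte_block_style(path, offset, value):
    """Change one byte by reading and rewriting the whole enclosing block."""
    block_start = (offset // BLOCK_SIZE) * BLOCK_SIZE
    with open(path, "r+b") as f:
        f.seek(block_start)
        block = bytearray(f.read(BLOCK_SIZE))
        block[offset - block_start] = value
        f.seek(block_start)
        f.write(block)
    return len(block)  # bytes moved in each direction


def write_byte_mapped(path, offset, value):
    """Change one byte with a single store into a memory-mapped region."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as region:
            region[offset] = value  # one byte-granular store
            region.flush()
    return 1  # bytes the program touched


def demo():
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, bytes(2 * BLOCK_SIZE))  # two zero-filled blocks
        os.close(fd)
        moved_block = write_byte_block_style(path, 5000, 0xAB)
        moved_mapped = write_byte_mapped(path, 5001, 0xCD)
        with open(path, "rb") as f:
            data = f.read()
        return moved_block, moved_mapped, data[5000], data[5001]
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The block path moves 4,096 bytes each way to change one byte, while the mapped path touches only the byte in question. On real NVDRAM the mapped store would go straight to memory rather than through a file system.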
ArxCis-NV has built-in logic to detect power failure or turn-off, and copies the DRAM contents into flash built onto the same DIMM. When power is restored, the data is returned to the DRAM on the DIMM. This makes the memory persistent NVDRAM, so it is a distinct use case from the UltraDIMM.
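The save-and-restore behavior described above can be modeled with a toy class. This is a conceptual sketch only: the class and method names are hypothetical, and the real work is done by hardware logic on the DIMM, not by host software.

```python
class NvDimmModel:
    """Toy model of a DRAM DIMM backed by on-board flash."""

    def __init__(self, size):
        self.dram = bytearray(size)   # volatile working memory
        self.flash = bytes(size)      # non-volatile backup copy
        self.powered = True

    def write(self, offset, data):
        if not self.powered:
            raise RuntimeError("DIMM is powered off")
        if offset < 0 or offset + len(data) > len(self.dram):
            raise ValueError("write outside DIMM address range")
        self.dram[offset:offset + len(data)] = data

    def read(self, offset, length):
        if not self.powered:
            raise RuntimeError("DIMM is powered off")
        return bytes(self.dram[offset:offset + length])

    def power_loss(self):
        """On power failure, copy DRAM into flash; DRAM contents are then lost."""
        self.flash = bytes(self.dram)
        self.dram = bytearray(len(self.dram))
        self.powered = False

    def power_restore(self):
        """On power-up, copy the saved image from flash back into DRAM."""
        self.dram = bytearray(self.flash)
        self.powered = True


if __name__ == "__main__":
    dimm = NvDimmModel(64)
    dimm.write(0, b"ledger")
    dimm.power_loss()
    dimm.power_restore()
    print(dimm.read(0, 6))
```

From the application's point of view, the data written before the outage is still at the same address afterward, which is what distinguishes NVDRAM from ordinary DRAM.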
Viking has partnered with SuperMicro to match motherboards. BIOS changes allow register and cache protection, while the OS and other software need changes to recognize persistence. Currently, this fits appliance designs really well, since the code is proprietary. We can expect to see the technology in storage systems soon.
An NVDRAM has major implications for the operating system and the BIOS if it is to be used to its full capability. At its simplest, it can be a RAMDisk just like the UltraDIMM, with even higher throughput and much faster random access. A bit of work on the OS can make it word-addressable, which is even faster, with latencies around 30 nanoseconds.
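For a sense of scale, the latency figures quoted in this article can be compared directly. The sketch below uses only those three numbers. The function names are hypothetical, and the operations-per-second figure assumes strictly serialized accesses, which real systems do not do.

```python
# Latencies quoted in the article, expressed in nanoseconds.
ULTRADIMM_NS = 5_000      # five microseconds
PCIE_FLASH_NS = 20_000    # best-case PCIe flash, 20 microseconds
NVDRAM_WORD_NS = 30       # word-addressable NVDRAM


def speedup(slower_ns, faster_ns):
    """How many times lower the faster latency is."""
    return slower_ns / faster_ns


def serial_ops_per_sec(latency_ns):
    """Upper bound on operations per second if accesses never overlap."""
    return 1_000_000_000 / latency_ns


if __name__ == "__main__":
    print("UltraDIMM vs PCIe flash:", speedup(PCIE_FLASH_NS, ULTRADIMM_NS))
    print("NVDRAM vs UltraDIMM:", round(speedup(ULTRADIMM_NS, NVDRAM_WORD_NS)))
    print("Serialized UltraDIMM ops/sec:", serial_ops_per_sec(ULTRADIMM_NS))
```

On these figures, the UltraDIMM's latency is a quarter of the best-case PCIe number, and word-addressable NVDRAM is roughly two orders of magnitude lower again.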
Going the next step and melding the non-volatile memory into applications requires changes to compilers and link editors, and apps need modification to recognize the persistent space as mapped to variables etc. This is a major effort requiring multiple vendors to work together.
The Storage Networking Industry Association (SNIA) has set up an NVDIMM SIG to focus efforts on the system and software changes required. "NVDIMM brings new capabilities to systems, and is a mechanism for enabling next-generation storage-class memories," Adrian Proctor, the SIG chair and VP of marketing for Viking, said in an interview.
The flash-based DIMM products from Viking and SanDisk are useful, even powerful, steps along a path to new persistent memory technologies. These flash-based DIMMs offer, on the one hand, a performance boost for existing server solutions, and on the other, a new type of tool that promises to change how systems are built and how they recover from failures and, ultimately, to provide a super-fast alternative to file and block I/O.