I concluded in a recent blog entry, "Do We Need Primary Storage Deduplication?", that primary storage deduplication can bring significant value to the data center and that it is becoming a must for suppliers to provide. In our own testing on two different deduplication platforms, we are seeing an almost 70 percent reduction in capacity requirements on real-world data sets.
Over the next couple of months we will be profiling companies that offer primary storage deduplication. The players in this market fall into three groups: companies offering deduplication as a module that storage manufacturers can integrate into their platforms; companies providing complete storage systems in which deduplication just happens to be one of the capabilities; and deduplication companies that offer complete storage system capabilities (meaning that deduplication is their lead). Our first company to be profiled, GreenBytes, falls in between those last two categories. Deduplication gets a lot of attention from them, but other capabilities make the unit compelling as well.
GreenBytes, founded in 2007, makes an OpenSolaris-based storage appliance. The company does not use ZFS's deduplication capabilities; instead, it developed its own technology and was shipping it long before ZFS added deduplication to its feature set. The majority of the GreenBytes IP is not dependent on OpenSolaris, so the company is not as susceptible as others to changes that Oracle could bring to the OpenSolaris community. That IP manifests itself in the GB-X series, a high-performance, SSD-accelerated, inline deduplication storage system. It provides both file services (CIFS and NFS) and block storage (via iSCSI).
A typical GB-X deployment starts as part of a backup evaluation. Then, as users experience the capabilities and performance of the unit, they begin to examine its application in primary storage. Early implementations include hosting VMware images and, of course, home directories. As the comfort level with the technology increases, so does the number of use cases.
There are three key ingredients to the GreenBytes solution: the GreenBytes File System (GBFS), the hardware platform and the management environment. GBFS handles all deduplication and compression inline. Inline deduplication has the advantage of never having to store redundant data. This avoids the temporary capacity that would otherwise be allocated to hold data until it is deduplicated, and it can improve performance by minimizing the amount of write activity. The GreenBytes systems also leverage solid state storage to hold deduplication metadata and to act as a cache for read and write operations.
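To make the inline idea concrete, here is a minimal Python sketch of a block store that fingerprints and compresses data on the write path, so a duplicate block is never stored a second time. This is not how GBFS is implemented; GreenBytes has not published those details here. The class name `InlineDedupStore`, the fixed block size, the SHA-256 fingerprint and the use of zlib are all our own assumptions, chosen only for illustration.

```python
import hashlib
import zlib


class InlineDedupStore:
    """Toy block store (hypothetical, not GBFS) that dedupes on write."""

    def __init__(self, block_size=4096):
        self.block_size = block_size  # assumed fixed-size blocks
        self.blocks = {}  # fingerprint -> compressed block (stands in for disk)
        self.files = {}   # file name -> ordered list of fingerprints (metadata)

    def write(self, name, data):
        fingerprints = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fp = hashlib.sha256(block).hexdigest()
            # Inline step: only a block we have never seen is compressed
            # and stored; a duplicate costs one metadata entry and no write.
            if fp not in self.blocks:
                self.blocks[fp] = zlib.compress(block)
            fingerprints.append(fp)
        self.files[name] = fingerprints

    def read(self, name):
        # Rebuild the file by decompressing each referenced block in order.
        return b"".join(
            zlib.decompress(self.blocks[fp]) for fp in self.files[name]
        )

    def unique_blocks(self):
        return len(self.blocks)


# Tiny 4-byte blocks keep the example readable.
store = InlineDedupStore(block_size=4)
store.write("a", b"AAAABBBBAAAA")  # blocks: AAAA, BBBB, AAAA
store.write("b", b"BBBBCCCC")      # blocks: BBBB, CCCC
print(store.unique_blocks())       # 5 logical blocks written, 3 stored
```

The fingerprint lookup sits on the critical path of every write, which is why keeping the deduplication metadata on fast media such as solid state storage matters in a real system. The sketch ignores deletes, overwrites and hash-collision handling.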