I spent several days in Miami recently at the Object Storage Summit with storage industry analysts and vendors, including Cleversafe, Scality, DataDirect Networks (DDN), Nexsan and Quantum. We spent a lot of time talking not just about how the various object storage systems work, but also about what the vendors have to do to move object storage further into the mainstream. The use cases are there, but vendors must make application integration easier for enterprise customers.
Like most emerging technologies, object storage found initial acceptance in a select set of vertical markets, such as Web application vendors, cloud service providers, high-performance computing, and media and entertainment. These organizations have lots of files that get created (think Shutterfly, for example), and rather than modify those files in place, their workflows maintain each version of each object to allow for different methods of reuse.
I was a bit more surprised at the level of success the object storage vendors were having in the intelligence community. A couple of the vendors spoke (in generalities of course) about how the data collected from keyhole satellites and Predator drones is stored and processed on object platforms.
Object storage has been less successful in the commercial space, which is a shame. When I was teaching backup seminars last year, users regularly complained that incremental backups of their NAS systems took days to complete. The backup application needed that long to walk the file system and figure out which of the millions of files had changed and therefore needed to be backed up, no matter how little new data there actually was.
If those users could find a way to migrate old, stale files off the production server to an object storage system, they'd dramatically speed up their nightly incremental backups and reduce the size of their weekly full backups by 60% to 90%.
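To make the idea concrete, here is a minimal sketch of how a script might identify migration candidates by walking a file tree and flagging files that have not been modified for some time. The function name `find_stale_files`, the `max_age_days` threshold and the use of modification time as the staleness test are all assumptions for illustration; a real migration tool would likely also consider access times, open files and site policy.

```python
import os
import time


def find_stale_files(root, max_age_days, now=None):
    """Walk `root` and return (path, size) for files not modified in `max_age_days`.

    Hypothetical helper: staleness is judged by modification time only.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if info.st_mtime < cutoff:
                stale.append((path, info.st_size))
    return stale
```

Note that this scan is exactly the kind of full tree walk that makes incremental backups slow. The point of migrating stale files is that the walk over old data then happens once, during migration, rather than every night.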
The best part is that the object storage system itself never needs to be backed up, which can save the organization a bundle in opex. Object systems use replication or advanced dispersal coding to protect the data against hardware or site failures. Object storage systems also create a new object every time a user modifies a file, keeping the old version around as long as the organization's retention policy requires, so the object store doesn't need backups to protect our data from the users, either.
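The versioning behavior described above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the class `VersionedStore` and its `put`, `get` and `expire` methods are hypothetical names, not any vendor's API, and the retention rule (drop superseded versions older than a cutoff, always keep the latest) is one simple reading of a retention policy.

```python
import time


class VersionedStore:
    """Toy model: every put creates a new version instead of overwriting."""

    def __init__(self):
        self._objects = {}  # key -> list of (version, timestamp, data)

    def put(self, key, data, timestamp=None):
        ts = time.time() if timestamp is None else timestamp
        history = self._objects.setdefault(key, [])
        version = history[-1][0] + 1 if history else 1
        history.append((version, ts, bytes(data)))
        return version

    def get(self, key, version=None):
        history = self._objects[key]  # raises KeyError for unknown keys
        if version is None:
            return history[-1][2]  # latest version
        for v, _ts, data in history:
            if v == version:
                return data
        raise KeyError(f"{key!r} has no version {version}")

    def expire(self, retention_seconds, now=None):
        """Drop superseded versions older than the retention window."""
        now = time.time() if now is None else now
        cutoff = now - retention_seconds
        for history in self._objects.values():
            latest = history[-1]
            kept = [entry for entry in history[:-1] if entry[1] >= cutoff]
            history[:] = kept + [latest]  # the latest version always survives
```

Because a user's edit never destroys the previous version, recovering from an accidental overwrite is a matter of reading an older version rather than restoring from a backup.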
A major factor limiting object storage's acceptance in the corporate market is that each object storage vendor has its own SOAP or REST-based API for getting data in and out of the system. This means companies and independent software vendors (ISVs) need to customize their applications for each storage platform.
One interesting development is that vendors are adding support for the Amazon S3 API in addition to their native API. For object storage to take off in the corporate market, there has to be a standard interface for application vendors to write to.
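A short sketch shows why a common interface matters to application writers. Everything here is hypothetical: `ObjectStore`, `put_object` and `get_object` are illustrative names that loosely echo S3-style operations, and the two backends are stand-ins (one in memory, one on a local directory) rather than real storage platforms. The assumption is simply that an application coded against one interface can switch platforms without changes.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class ObjectStore(ABC):
    """Hypothetical minimal interface an application would code against."""

    @abstractmethod
    def put_object(self, bucket, key, data):
        ...

    @abstractmethod
    def get_object(self, bucket, key):
        ...


class InMemoryStore(ObjectStore):
    """Stand-in backend that keeps objects in a dictionary."""

    def __init__(self):
        self._data = {}

    def put_object(self, bucket, key, data):
        self._data[(bucket, key)] = bytes(data)

    def get_object(self, bucket, key):
        return self._data[(bucket, key)]


class DirectoryStore(ObjectStore):
    """Stand-in backend that maps bucket/key to files under a root directory."""

    def __init__(self, root):
        self._root = Path(root)

    def put_object(self, bucket, key, data):
        path = self._root / bucket / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(bytes(data))

    def get_object(self, bucket, key):
        return (self._root / bucket / key).read_bytes()


def archive_report(store, name, body):
    """Application code: depends only on the ObjectStore interface."""
    store.put_object("reports", name, body)
    return store.get_object("reports", name)
```

The `archive_report` function runs unchanged against either backend. Without a shared interface, each new storage platform would require its own version of that function.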
DDN takes an API-agnostic approach with its WOS (Web Object Scaler) platform, which supports a native high-performance API as well as the Amazon S3 and CDMI (Cloud Data Management Interface) object APIs. The company also integrates WOS with clustered file systems such as GPFS and Lustre, which are common in the HPC world, and with Hadoop's HDFS, an integration that provides the persistence that big data file systems have lacked.
Object storage is the solution for large organizations drowning in tens or hundreds of petabytes of unstructured data. If vendors can make application integration easier, the enterprise market may open up.