Unfortunately, cloud storage is often thought of as a big holding tank where you dump data in case you need it some day, but the truth is that even today you can do much more with it. Assuming you want the cloud to be more than a data dumping ground somewhere on the internet, what should you expect cloud storage software to do?
As we discussed in my prior entry, Cloud Storage Arms Race, cloud storage software can come as exactly that: software that you load onto your servers and storage to create your own cloud. Or it can be built into the turnkey cloud storage systems that are available. Either way, what this software can do and how it does it become a critical decision point when selecting cloud storage.
One of the first capabilities to look for is dispersion. This is the ability of your cloud software to leverage the fact that it is on a global network and to disperse data throughout that network. The most obvious use of this capability is data protection. For example, you may want critical data distributed to four different data centers. What makes this important is that some of these software solutions can dial the number of copies up or down based on business policies.
This can be, as cited above, to provide an extra layer of protection, or it might be to manage the popularity of a file. For example, let's say you just released the latest version of your software. You might want to dial up the number of copies and access points for a period of time to meet the initial download demand, then, as the initial wave dies down, dial the copies down to a primary copy and a DR copy. When the next version of your product is ready for release, you may want to move the prior version off all the primary access points and just keep one copy in an archive.
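To make the dial-up, dial-down idea concrete, here is a minimal sketch of a copy-count policy in Python. It is not any vendor's actual API. The stage names, the `POLICY` table, and the `plan_placement` function are all hypothetical, and the copy counts simply mirror the software release example above.

```python
# Hypothetical policy table: lifecycle stage -> how many copies to keep.
# The stage names and counts are assumptions taken from the release example,
# not settings from any real cloud storage product.
POLICY = {
    "launch": {"copies": 4, "archive_only": False},      # meet initial download demand
    "steady": {"copies": 2, "archive_only": False},      # primary copy + DR copy
    "superseded": {"copies": 1, "archive_only": True},   # single archived copy
}


def plan_placement(stage, data_centers, archive_site):
    """Return the list of sites that should hold a copy at this stage.

    data_centers is an ordered list of candidate sites, with the primary
    first and the DR site second (an assumption of this sketch).
    """
    rule = POLICY[stage]
    if rule["archive_only"]:
        # Pull the file off every primary access point and keep one archive copy.
        return [archive_site]
    return data_centers[: rule["copies"]]


def placement_changes(current, target):
    """Work out which copies to create and which to delete."""
    to_add = [site for site in target if site not in current]
    to_remove = [site for site in current if site not in target]
    return to_add, to_remove
```

Moving a file from the launch stage to the steady stage would then keep the first two sites and drop the rest, which is the kind of decision you would want the cloud software to make for you based on the policy rather than by hand.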
Tied closely to this is the ability of cloud software to be geographically aware. For example, you may want to make sure that users on the West Coast download that software from a West Coast data center, users in Europe from a European-based data center, and so on. To do that, the software needs to interface with the network to route users to the geographically closest storage point.
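As a rough illustration of geographic awareness, the sketch below picks the closest data center for a user by straight-line (great-circle) distance. The site names, coordinates, and function names are hypothetical, and real products may route on network measurements rather than map distance, so treat this as a stand-in for the idea rather than a description of how any particular system works.

```python
import math

# Hypothetical data centers with approximate (latitude, longitude) in degrees.
DATA_CENTERS = {
    "us-west": (37.77, -122.42),  # San Francisco area
    "us-east": (40.71, -74.01),   # New York area
    "europe": (50.11, 8.68),      # Frankfurt area
}

EARTH_RADIUS_KM = 6371.0


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points, haversine formula."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_data_center(user_location, holding=None):
    """Pick the closest site to the user.

    holding optionally limits the choice to sites that currently hold a
    copy of the file; by default every known site is considered.
    """
    candidates = list(holding) if holding else list(DATA_CENTERS)
    return min(candidates, key=lambda name: distance_km(user_location, DATA_CENTERS[name]))
```

With these assumed coordinates, a user in Seattle would be sent to `us-west` and a user in Paris to `europe`. If the file has been dialed down so that only `us-east` and `europe` hold a copy, passing those names as `holding` sends the Seattle user to `us-east` instead.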