Symantec just released its 2010 State of the Data Center Survey. In the survey, respondents were asked to rate their disaster recovery plan, and only 12 percent rated it as excellent. Even if you add in the 27 percent who thought their plan was "pretty good," that leaves 61 percent who thought their plans were less than pretty good. Still, the choice of "pretty good" struck me. Who wants to execute a recovery from a "pretty good" DR plan?
What causes a disaster recovery plan to be pretty good or worse? My first guess is that there is too much data to deal with. Let's get rid of some of that. As we discussed in our article Archiving Basics, a solid archive plan should help eliminate a large portion of the problem. It will also eliminate some of the complexity that gets built into the backup process when people use backups for long-term retention of data.
My second guess is that there are lots of extra copies of data being made. I have seen data centers keeping no fewer than six copies of their most critical data. They are snapshotting it, doing internal application backups of it, backing it up with some sort of third-party but application-specific backup tool, and backing it up with an enterprise backup application, more than likely to a disk-based backup target that makes its own copy of itself. This is not to mention all the replication processes going on: applications are replicating, storage is replicating and backup devices are replicating. Isn't this too much?
With all of these extra copies of data being made, it's no wonder that we are all running out of storage space or at least struggling with how to manage it. I hope that the hard drive suppliers come out with 8TB hard drives in a hurry and that the dedupe vendors uncover the secrets of quad-phased deduplication.
The answer is STOP. You really don't need another copy of data. You need one copy that is real-time enough to meet your emergency recovery needs, and you need one that provides some minimal point-in-time granularity. Remember, long-term retention should be the sole domain of the archive. These copies should then be replicated off-site in case something goes wrong at the original site. Ideally, all of this can be provided by one process. If not, the copies should be managed as part of an overall backup workflow.
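To make that rule of thumb concrete, here is a minimal sketch in Python of how you might audit an inventory of data copies against it. Everything in it is an assumption for illustration: the `Copy` record, the role names `realtime`, `point_in_time` and `archive`, and the `audit` function are hypothetical and do not come from any backup product. The idea is simply to keep one copy per required role, prefer the one that is already replicated off-site, and flag the rest for review.

```python
from dataclasses import dataclass

# Hypothetical roles: the two copies the article says you actually need.
# Copies with the role "archive" are left alone, since long-term
# retention belongs to the archive.
REQUIRED_ROLES = ("realtime", "point_in_time")


@dataclass(frozen=True)
class Copy:
    name: str
    role: str                 # "realtime", "point_in_time", or "archive"
    replicated_offsite: bool


def audit(copies):
    """Compare an inventory of copies with the one-copy-per-role rule."""
    by_role = {}
    for c in copies:
        by_role.setdefault(c.role, []).append(c)

    missing_roles, extra_copies, needs_offsite = [], [], []
    for role in REQUIRED_ROLES:
        candidates = by_role.get(role, [])
        if not candidates:
            missing_roles.append(role)
            continue
        # Stable sort: copies already replicated off-site come first.
        ranked = sorted(candidates, key=lambda c: not c.replicated_offsite)
        keeper, extras = ranked[0], ranked[1:]
        if not keeper.replicated_offsite:
            needs_offsite.append(keeper.name)
        extra_copies.extend(c.name for c in extras)

    return {
        "missing_roles": missing_roles,
        "extra_copies": extra_copies,
        "needs_offsite": needs_offsite,
    }


# Made-up inventory, loosely modeled on the pile of copies described earlier.
inventory = [
    Copy("storage snapshot", "point_in_time", False),
    Copy("application backup", "point_in_time", False),
    Copy("enterprise backup", "point_in_time", True),
    Copy("storage replication", "realtime", True),
]

report = audit(inventory)
for key, value in report.items():
    print(key, value)
```

In this made-up inventory, the enterprise backup is kept as the point-in-time copy because it already goes off-site, and the snapshot and application backup show up as candidates to retire. Whether they can really go is a judgment call the script cannot make for you.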