Last week EMC blogger Chuck Hollis started a bit of a firestorm when he questioned the marketing position some vendors are taking that local snapshots make traditional backups unnecessary. As you might expect, a series of EMC haters, employees of other vendors and industry analysts jumped in to add fuel to the fire. I think Hollis did ask a valid question: As technologies advance, what is a backup?
The only reason we back up in the first place is to be able to restore data to the condition it was in before some unfortunate event. To a first approximation, lots of things serve as backups. In fact, the most common source of saved data used to recover from an unfortunate event is the set of copies Microsoft Office automatically saves as you work. I end up reverting to an auto-saved copy of a document about once a month, when my poor PC collapses under the strain of 700 open apps and browser tabs and I have to reboot it.
So if any copy can be a backup, the real question is: What is a sufficient backup, and where do snapshots fit in my backup plan? To be sufficient, my backup architecture has to satisfy restore and recovery requests in a reasonable period of time with as little work by me as possible. It also has to have as little impact on users and application performance as possible. As long as I can meet my users' restore requests, I am happy to call it a sufficient backup. From where I sit, backups don't have to be created by backup software, and they don't have to be in some special backup format.
That impact on users and applications is where many copy-on-write snapshot systems break down. With some snapshot systems, like VMware's, keeping multiple snapshots on disk for several days can slow system performance significantly, and can even crash all the virtual machines using a data store if that data store fills up with snapshot data. So for me to consider snapshots as a backup, they'd better be good snapshots, and that usually means redirect-on-write rather than copy-on-write snapshots.
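To make the difference concrete, here is a toy model of the two approaches as they are usually described. It is a sketch under my own simplifying assumptions, not any vendor's implementation: the class names, the in-memory "blocks" and the I/O counters are all made up for illustration. The copy-on-write volume has to read the old block and copy it aside the first time a block changes after a snapshot, while the redirect-on-write volume just writes the new data somewhere else and updates a pointer.

```python
class CopyOnWriteVolume:
    """Toy model: old data is copied aside before the first overwrite."""

    def __init__(self, blocks):
        self.live = list(blocks)
        self.snapshot = None      # block index -> preserved old data
        self.reads = 0            # data-block reads caused by writes
        self.writes = 0           # data-block writes

    def take_snapshot(self):
        self.snapshot = {}

    def write(self, index, data):
        if self.snapshot is not None and index not in self.snapshot:
            old = self.live[index]        # extra read of the old block
            self.reads += 1
            self.snapshot[index] = old    # extra write to the snapshot area
            self.writes += 1
        self.live[index] = data           # the write the application asked for
        self.writes += 1

    def read_snapshot(self, index):
        # Unchanged blocks are still served from the live volume.
        return self.snapshot.get(index, self.live[index])


class RedirectOnWriteVolume:
    """Toy model: new data goes to a new location; old blocks stay put."""

    def __init__(self, blocks):
        self.store = list(blocks)                  # append-only block store
        self.live_map = list(range(len(blocks)))   # logical -> physical
        self.snap_map = None
        self.reads = 0
        self.writes = 0

    def take_snapshot(self):
        self.snap_map = list(self.live_map)        # metadata copy only

    def write(self, index, data):
        self.store.append(data)                    # one data write
        self.writes += 1
        self.live_map[index] = len(self.store) - 1

    def read_snapshot(self, index):
        return self.store[self.snap_map[index]]


def run(volume):
    """Take a snapshot, then overwrite block 0 twice and block 1 once."""
    volume.take_snapshot()
    for index, data in [(0, "x"), (1, "y"), (0, "z")]:
        volume.write(index, data)
    return volume.reads, volume.writes


cow_io = run(CopyOnWriteVolume(list("abcd")))
row_io = run(RedirectOnWriteVolume(list("abcd")))
print("copy-on-write     (reads, writes):", cow_io)   # (2, 5)
print("redirect-on-write (reads, writes):", row_io)   # (0, 3)
```

In this toy run the same three application writes cost two reads and five writes with copy-on-write, against three writes with redirect-on-write. The model ignores metadata updates, caching and snapshot deletion, so treat the numbers as an illustration of where the extra I/O comes from rather than as a benchmark.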
The majority of restore requests are for single files, or groups of related files, that users have accidentally deleted or corrupted. Since the users are the cause of this data destruction, one of my key criteria for backups is that the backup copy has to be outside the control of ordinary users and their applications. After all, a user who decides they don't want a file might make sure to delete every copy of that file they can find.