There seems to be much discussion around backups lately. One of the big issues is whether legacy backup
platforms stand a chance against new solutions dedicated to server
virtualization, but what I'd like to discuss in this post is whether the entire backup process itself can be
eliminated by the intelligent use of snapshots, replication, compression
and data deduplication.
Elimination of backups really comes down to two key issues: can you
technically accomplish the goal and do you really want to accomplish the
goal? On the technical side, I'd have to say that we are pretty much there.
Primary storage is becoming more resilient, and as primary storage
vendors add the data services I mentioned above, I think the job can
technically be accomplished. Many systems support an unlimited number of
snapshots and/or copies of data by leveraging deduplication. Most also replicate data to a remote site, so you are
protected against a single-site disaster taking out all of your backup
copies along with your primary store.
The first challenge to this approach is that you are counting
on a single metadata table to keep track of all the data
interrelations. If that table gets corrupted, then most if not all of
your point-in-time copies may become unreadable. The other
challenge is that if your redundant copies of data all live on the same
storage system and that whole system fails, you have a problem.
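To make the metadata dependency concrete, here is a toy sketch of a deduplicating snapshot store. It is not how any particular vendor implements it; the class name `DedupStore`, the tiny block size and the in-memory dictionaries are all assumptions made for illustration.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: unique blocks plus one metadata table."""

    def __init__(self):
        self.blocks = {}     # fingerprint -> block bytes (each block stored once)
        self.snapshots = {}  # snapshot name -> ordered list of fingerprints

    def take_snapshot(self, name, data, block_size=4):
        """Split data into blocks, store new blocks, record the block order."""
        refs = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            fp = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(fp, block)  # duplicates are not stored again
            refs.append(fp)
        self.snapshots[name] = refs

    def read_snapshot(self, name):
        """Rebuild a snapshot; impossible without its metadata entry."""
        return b"".join(self.blocks[fp] for fp in self.snapshots[name])


store = DedupStore()
store.take_snapshot("monday", b"AAAABBBBAAAA")
store.take_snapshot("tuesday", b"AAAABBBBCCCC")
```

Two snapshots of twelve bytes each end up as only three stored blocks, which is the appeal. The catch is that every snapshot is nothing more than an entry in `snapshots`: if that one table is lost or corrupted, the blocks are still on disk but there is no record of how to reassemble any point-in-time copy.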
I know that the chances of either of the above scenarios happening are
relatively small, but there is a chance, and isn't that what backups are
for? To cover yourself just in case something unlikely happens? Also, if you
did go with a replication option, it should help you recover from those
two situations, but you still need a way to get that data back. If
it requires a full restore across a WAN connection, you may be out of
business before the recovery completes. While you could replicate to a
second unit locally and then to a third at the DR site, doesn't that begin to sound
just like a backup?
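A rough back-of-the-envelope calculation shows why a full restore over a WAN is worrying. The figures below (10 TB of data, a 100 Mbps link, 80% usable throughput) are hypothetical numbers chosen for illustration, not measurements, and `restore_hours` is just a helper name I made up.

```python
def restore_hours(data_tb, link_mbps, efficiency=0.8):
    """Estimate hours to pull data_tb (decimal terabytes) over a WAN link.

    efficiency is an assumed fraction of the link's nominal rate that is
    actually usable once protocol overhead and contention are accounted for.
    """
    bits_to_move = data_tb * 1e12 * 8
    usable_bits_per_second = link_mbps * 1e6 * efficiency
    return bits_to_move / usable_bits_per_second / 3600


hours = restore_hours(10, 100)
print(f"{hours:.0f} hours, or about {hours / 24:.1f} days")
```

Under those assumptions the restore takes roughly 278 hours, or about 11.6 days. Compression and deduplication on the wire would shorten that, but the order of magnitude is the point.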
One situation that replication would not protect you from is the deduplication or snapshot
engine producing a silent error that does not appear right away, while
the engine continues to report that it is working correctly. The damaged
data would be replicated just as faithfully as the good data. If this happens, you may not know you have a problem until months later. I have yet to see this occur in a real environment and have
only heard anecdotal stories of even the possibility. Also, most
deduplication processes have self-check code to help prevent it from
occurring, but it's better to be safe than sorry.
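One way to catch a silent error sooner is to keep checksums that are computed and stored independently of the storage engine, then periodically compare them against what the system hands back. The sketch below is a minimal illustration of that idea rather than any product's self-check code; the function names and the in-memory `files` dictionary are assumptions.

```python
import hashlib


def fingerprint(data):
    """SHA-256 digest of a blob, used as its independent checksum."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(files):
    """Record a checksum for every file at the time it is known to be good."""
    return {name: fingerprint(data) for name, data in files.items()}


def verify(files, manifest):
    """Return sorted names that are missing or no longer match the manifest."""
    return sorted(
        name
        for name, digest in manifest.items()
        if name not in files or fingerprint(files[name]) != digest
    )


files = {"orders.db": b"order data", "mail.pst": b"mail data"}
manifest = build_manifest(files)

files["mail.pst"] = b"mail dat\x00"  # simulate a silent single-byte corruption
damaged = verify(files, manifest)    # only mail.pst is reported
```

The manifest only helps if it lives somewhere other than the system being checked, which is really the same argument as keeping a copy of your data on a separate platform.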
There are steps you can take to reduce your exposure to any of these
failures, even if the likelihood of them actually
happening is slim. Using snapshots and replication as a primary and even secondary
recovery point is certainly acceptable. For me though, there is
something comforting in knowing that your data is on a separate platform
(disk or tape) managed and protected by a separate process.