Network Computing is part of the Informa Tech Division of Informa PLC


Deduplication's Replication Mode

According to every deduplication supplier that I talk to, replication
has a high attach rate for deduplication products. In most cases over
50 percent of their systems are sold with the replication module or
capabilities enabled. Over the next couple of entries I'll review some
of the specific vendors' claims and name names as they relate to
replication. If you're in the dedupe space and I have not spoken to you,
please reach out to me so I can include you in the conversation.

While moving backup jobs to a remote site electronically is a key
capability for deduplication products, it should not be your sole
method of DR. Keep in mind that the data at the remote site is in a
backup format and needs to be recovered to DR servers to be of value.
Moving this data from the disk deduplication device to the production
server takes time, and that time may push you outside of your recovery
service level agreement. Still, for many data centers, a data set that
goes off-site inexpensively a few hours after the local backup
completes may be all they can afford, and it may still represent a huge
improvement in recoverability.
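
To make the restore-time concern concrete, here is a minimal back-of-the-envelope sketch. The function names and the sample figures (10 TB of data, 400 MB/s sustained restore throughput, a 4-hour recovery objective) are assumptions for illustration only, not measurements from any product.

```python
def restore_hours(data_tb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy data_tb terabytes at throughput_mb_s MB/s.

    Uses decimal units (1 TB = 1,000,000 MB) and assumes the throughput
    is sustained for the whole restore, which is optimistic.
    """
    data_mb = data_tb * 1_000_000
    return data_mb / throughput_mb_s / 3600


def meets_rto(data_tb: float, throughput_mb_s: float, rto_hours: float) -> bool:
    """True if the estimated restore fits inside the recovery objective."""
    return restore_hours(data_tb, throughput_mb_s) <= rto_hours


# Hypothetical example: 10 TB restored at 400 MB/s against a 4-hour objective.
print(f"{restore_hours(10, 400):.1f} hours")  # roughly 6.9 hours
print(meets_rto(10, 400, 4))                  # False
```

Even with the replicated copy already sitting at the DR site, the restore in this example misses a 4-hour objective by almost three hours.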

There is one exception to the recover-first problem: server
virtualization. Since some of the appliance-based devices present
themselves as disk targets via CIFS or NFS, you could mount server
images via NFS at the DR site and be back in production. None of the
appliance-based systems bill themselves as primary storage, so the
intent would be to use a capability like VMware Storage VMotion to
move those images quickly to production storage. This concept is worth
an article all by itself and something I will dive into later.

While some of the deduplication vendors that I spoke with are
relatively new to adding replication capabilities to their solutions,
all of them seem to have something. Some of the deduplication providers
are delivering replication via a basic file system replication
technique. They leverage the fact that deduplication only writes unique
blocks, and they use file system replication to identify those writes
and then replicate them across the wire. While this certainly works
from a "point A to point B" perspective, it does cause some problems
when you are trying to do many-to-one or cascaded replication.
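
The idea that only unique blocks need to cross the wire can be sketched in a few lines. This is a generic illustration, not any vendor's implementation: the `DedupStore` class, the `replicate` function, the fixed 4 KB block size, and the use of SHA-256 fingerprints are all assumptions made for the example. Note that it compares fingerprints against the target directly, which is a different approach from the file system replication described above.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size; real products vary


class DedupStore:
    """Toy deduplicating store: each unique block is kept once."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block bytes
        self.files = {}   # file name -> ordered list of fingerprints

    def write(self, name, data):
        """Store data and return the fingerprints of blocks that were new."""
        recipe, new = [], []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            if fp not in self.blocks:
                self.blocks[fp] = block
                new.append(fp)
            recipe.append(fp)
        self.files[name] = recipe
        return new

    def read(self, name):
        """Rebuild a file from its list of fingerprints."""
        return b"".join(self.blocks[fp] for fp in self.files[name])


def replicate(source, target, name):
    """Copy one file to the target, sending only blocks the target lacks.

    Returns the number of block payload bytes that had to be sent.
    """
    sent = 0
    for fp in source.files[name]:
        if fp not in target.blocks:
            target.blocks[fp] = source.blocks[fp]
            sent += len(source.blocks[fp])
    target.files[name] = list(source.files[name])
    return sent


local, remote = DedupStore(), DedupStore()
monday = b"A" * 8192 + b"B" * 4096   # 12 KB, but only two unique blocks
tuesday = monday + b"C" * 4096       # one new block since Monday

local.write("monday", monday)
print(replicate(local, remote, "monday"))   # 8192
local.write("tuesday", tuesday)
print(replicate(local, remote, "tuesday"))  # 4096
```

Because the target is checked by fingerprint, a second source replicating into the same `remote` store would also send only the blocks it does not already hold, which is the behavior you would want in a many-to-one layout.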

How the vendor does deduplication, the old inline vs. post-processing
debate, also affects how replication works. Most vendors will agree
that both methods have their strong and weak points. It's how they take
advantage of the strengths and design around the weaknesses that
matters. For example, an inline system, or even an adaptive inline
system, should be able to replicate data either as it is written to the
device or as soon as a specific backup stream ends and its file is
closed. In typical post-process deduplication, the entire backup has to
complete before deduplication occurs. Replication then occurs as unique
blocks are identified and written to disk.
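
The timing difference can be expressed as a very simple model. This is a deliberately rough sketch: the function name, the mode labels, and the sample stream end times are hypothetical, and it only covers the "replicate at file closure" variant of inline, not replication while data is still being written.

```python
def replication_start_hour(stream_end_hours, mode):
    """Earliest hour (after the backup window opens) that replication can begin.

    stream_end_hours: hour at which each backup stream closes its file.
    mode: "inline" replicates once the first stream closes;
          "post-process" waits for the entire backup to complete.
    """
    if mode == "inline":
        return min(stream_end_hours)
    if mode == "post-process":
        return max(stream_end_hours)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical backup with three streams finishing at different times.
streams = [1.5, 3.0, 6.0]
print(replication_start_hour(streams, "inline"))        # 1.5
print(replication_start_hour(streams, "post-process"))  # 6.0
```

In this example the inline system has a 4.5-hour head start on getting data off-site, though how much that matters depends on available bandwidth and how much of each stream is unique.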
