eiskra
Enthusiast
Posts: 25
Liked: never
Joined: Mar 07, 2012 11:54 pm
Full Name: Edward Iskra
Contact:

Poor Dedupe on Exchange DAG

Post by eiskra » Apr 25, 2012 3:34 pm

I am getting very poor dedupe on a pair of Exchange servers running DAG in a single backup job and am looking for suggestions.
Am opening a ticket on this, as well.

Config: two Exchange 2010 servers in the environment. EXC-01 does OWA, but otherwise, the two servers each run all Exchange roles.
EXC-01 hosts mailbox DB A, and has a DAG copy of mailbox DB B.
EXC-02 hosts mailbox DB B, and has a DAG copy of mailbox DB A.

The total size of the volumes on each server is 810GB; I'd expect to get SUBSTANTIAL deduplication between the two.

Veeam reports:
Data size - 1620 GB
Dedupe Ratio - 85%
Compress Ratio - 72%
Backup size - 1010 GB

This was on a single new job running a clean full backup of both servers successfully last night. The dedupe ratio is VERY poor.

If I merely forced the drives to blank out all unused sectors, I'd expect to get a better dedupe ratio from deduplicating the blank space!
Actual data: 1300 GB (1620 GB minus the free space)
Implied dedupe ratio - 80%
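
The figures above can be sanity-checked with a little arithmetic. This is a minimal sketch that assumes each reported ratio is the fraction of data remaining after that stage, and that the two stages multiply; that reading is an assumption, not something confirmed in this thread, though it lands close to the reported backup size. All values are in GB and the function names are hypothetical.

```python
def estimated_backup_size(data_gb, dedupe_ratio, compress_ratio):
    """Estimate backup size, assuming each ratio is the fraction left after that stage."""
    return data_gb * dedupe_ratio * compress_ratio


def blank_space_ratio(data_gb, used_gb):
    """Dedupe ratio expected if only the unused (blank) space were eliminated."""
    return used_gb / data_gb


# Figures quoted in the post (assumed to be GB).
estimate = estimated_backup_size(1620, 0.85, 0.72)
implied = blank_space_ratio(1620, 1300)

print(f"Estimated backup size: {estimate:.0f} GB")    # about 991 GB vs. 1010 reported
print(f"Blank-space-only dedupe ratio: {implied:.0%}")  # about 80%
```

Under that reading, the reported 85% is worse than what eliminating free space alone would give, which is the point being made here.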

Regardless, the extreme duplicate nature of these servers should get me much closer to 50% than the current 85%.
Actual sizes of the two mailstores on EXC-01: 391 GB, 197 GB.
Actual sizes of the two mailstores on EXC-02: 391 GB, 197 GB.
I'd love to see closer to 588GB getting knocked off the backup - and that doesn't even consider that all the application/OS files should be identical, as well.
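
For reference, the best case being hoped for can be worked out from the mailstore sizes. This sketch assumes the second copy of each mailstore dedupes perfectly and that free space is also eliminated; both are idealizations used only to show where the "closer to 50%" expectation comes from. Values are in GB and the variable names are hypothetical.

```python
DATA_GB = 1620           # total size of both servers' volumes
USED_GB = 1300           # data size minus free space, as quoted in the post
MAILSTORES_GB = [391, 197]  # each mailstore exists once per server

# If every mailstore copy deduped perfectly, one full set would drop out.
duplicate_gb = sum(MAILSTORES_GB)
ideal_remaining_gb = USED_GB - duplicate_gb
ideal_ratio = ideal_remaining_gb / DATA_GB

print(f"Duplicate mailstore data: {duplicate_gb} GB")       # 588 GB
print(f"Ideal remaining data: {ideal_remaining_gb} GB")      # 712 GB
print(f"Ideal dedupe ratio: {ideal_ratio:.0%}")              # about 44%
```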

The job settings are Incremental, with Synthetic Fulls weekly, but again, this was a simple Full backup run.
Inline dedupe is enabled; Compression is Optimal, Optimized for Local target.

Should I be looking to tweak the job settings - perhaps switch it to Optimize for WAN, decreasing the cluster size for dedupe analysis?
Should I be looking at the Exchange servers themselves, making some sort of change to the drives to allow for better recognition of duplicate data (alignment issues? These servers were created the same way, so that's unlikely...)?

Am also open to other recommendations on best practices for backing up Exchange 2010 w/DAG.

Vitaliy S.
Product Manager
Posts: 22970
Liked: 1555 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Poor Dedupe on Exchange DAG

Post by Vitaliy S. » Apr 26, 2012 2:14 pm

eiskra wrote:Should I be looking to tweak the job settings - perhaps switch it to Optimize for WAN, decreasing the cluster size for dedupe analysis?
Yes, switching to WAN target will result in the maximum deduplication ratio, because in this case Veeam will pick up a 256 KB data block (the WAN target block size) for every changed block in the Exchange VM.

As for general recommendations on Exchange Server backup strategy, please search these forums, as there are lots of existing threads.

Hope this helps!

eiskra
Enthusiast
Posts: 25
Liked: never
Joined: Mar 07, 2012 11:54 pm
Full Name: Edward Iskra
Contact:

Re: Poor Dedupe on Exchange DAG

Post by eiskra » Apr 26, 2012 5:26 pm

If I switch the existing job to WAN, will it create a new full backup, or will it perform a new incremental with more granular block tracking?
I need to know if I need to provide an extra TB of storage to re-start the chain!

eiskra
Enthusiast
Posts: 25
Liked: never
Joined: Mar 07, 2012 11:54 pm
Full Name: Edward Iskra
Contact:

Re: Poor Dedupe on Exchange DAG

Post by eiskra » Apr 26, 2012 5:32 pm

Vitaliy S. wrote: As for general recommendations on Exchange Server backup strategy, please search these forums, as there are lots of existing threads.
I appreciate the help, but with all due respect, I have not found these "existing threads."
Searching on "Exchange" and "Exchange DAG" finds lots of threads about specific Exchange problems (for example, how to deal with snapshot removal delays, cluster failovers, concerns about corrupt DBs being backed up, et cetera), but none with recommendations on strategies, even after digging through ten pages of results.

Searching on Exchange "best practice" reduces the result set, but still, no love.

I have found several comments in threads saying that there ARE many threads on Exchange best practices and/or strategies, but never a reference to those threads.

I would appreciate knowing what search terms will get me some hits - the existing threads are obviously in people's memories, but perhaps are too far back to find with my generic search queries...

I wish the FAQs included better coverage of Exchange.

Thanks for any direction you can provide.

Vitaliy S.
Product Manager
Posts: 22970
Liked: 1555 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: Poor Dedupe on Exchange DAG

Post by Vitaliy S. » Apr 26, 2012 9:53 pm

Edward, I believe you should start a new full to apply new block size settings.

As for Exchange best practices, here is a brief summary of the recommendations on the forums:

1. Enable "application-aware image processing"
2. Truncate Exchange logs if you do not do this with other tools
3. If you experience VSS timeouts with Exchange 2010, try to apply the recommendations in this thread.

In addition to this, if you haven't seen our recorded webinar that covers all the aspects of Exchange backup and recovery with Veeam, here is a direct link to it.

Hope this helps!

tsightler
VP, Product Management
Posts: 5418
Liked: 2240 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Poor Dedupe on Exchange DAG

Post by tsightler » Apr 27, 2012 1:56 am

Also, Veeam best practices won't really have any impact on your dedupe results, as best practices are all about how to most reliably back up and restore your environment. Veeam dedupe operates with a fixed block size, with the smallest block being 256K when using WAN target. That means that an entire 256K chunk of disk must be identical between nodes before it will be deduplicated. With technology like DAG, while the logical data within the datastores is the same, the layout of that data on disk can be very, very different.

As a contrived example, a 256K block on node 1 might contain 64 mails that are each 4K. Even if those 64 emails are all stored in a single 256K block on the other node, they must be stored in the identical order; otherwise it would not be a "duplicate" block. Log replay solutions like those used with DAG replication simply do not create identical source-level blocks in this way. Certainly you would expect to see some dedupe (binary attachments, etc.), but if you're expecting it to be 50% you will likely be disappointed even after switching to smaller blocks, although you should see some improvement.
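
The contrived example can be simulated in a few lines. This is only an illustrative sketch of generic fixed-block deduplication using SHA-256 block hashes; it is not Veeam's actual implementation, and the helper names and the random 4K "mails" are assumptions made for the demo.

```python
import hashlib
import random

BLOCK_SIZE = 256 * 1024   # fixed dedupe block size (the WAN target size mentioned above)
RECORD_SIZE = 4 * 1024    # one contrived 4K "mail"


def make_records(count, seed):
    """Generate `count` distinct pseudo-random 4K records (stand-ins for mails)."""
    rng = random.Random(seed)
    return [rng.getrandbits(RECORD_SIZE * 8).to_bytes(RECORD_SIZE, "big")
            for _ in range(count)]


def block_hashes(disk):
    """Hash every fixed-size block of a disk image."""
    return [hashlib.sha256(disk[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(disk), BLOCK_SIZE)]


def unique_block_count(*disks):
    """Number of distinct blocks a fixed-block dedupe engine would have to store."""
    seen = set()
    for disk in disks:
        seen.update(block_hashes(disk))
    return len(seen)


# 256 mails of 4K each fill exactly four 256K blocks.
records = make_records(256, seed=1)
node1 = b"".join(records)

# A byte-identical copy dedupes completely.
identical_copy = b"".join(records)

# Same mails, different on-disk order (as log replay might produce).
shuffled = records[:]
random.Random(2).shuffle(shuffled)
node2 = b"".join(shuffled)

print(unique_block_count(node1, identical_copy))  # 4: every block is a duplicate
print(unique_block_count(node1, node2))           # 8: no block matches
```

Both "nodes" hold exactly the same mails, yet once the order differs none of the 256K blocks match, so nothing is deduplicated.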

tgiphil
Influencer
Posts: 17
Liked: never
Joined: Nov 15, 2011 11:46 pm
Full Name: Phil Garcia
Contact:

Re: Poor Dedupe on Exchange DAG

Post by tgiphil » Apr 27, 2012 3:40 am

Tom is correct. While the Exchange servers may store the same data, they may be internally organized on disk differently. And thus, Veeam will not be able to dedupe this.

eiskra
Enthusiast
Posts: 25
Liked: never
Joined: Mar 07, 2012 11:54 pm
Full Name: Edward Iskra
Contact:

Re: Poor Dedupe on Exchange DAG

Post by eiskra » Apr 30, 2012 9:02 pm

Thanks, Vitaliy. On your suggestions:
1. Enable "application-aware image processing"
Already done
2. Truncate Exchange logs if you do not do this with other tools
Already using Veeam truncate.
3. If you experience VSS timeouts with Exchange 2010, try to apply the recommendations in this thread.
Had issues months ago, resolved using that thread.
In addition to this, if you haven't seen our recorded webinar that covers all the aspects of Exchange backup and recovery with Veeam, here is a direct link to it.
On my list of things to watch.

eiskra
Enthusiast
Posts: 25
Liked: never
Joined: Mar 07, 2012 11:54 pm
Full Name: Edward Iskra
Contact:

Re: Poor Dedupe on Exchange DAG

Post by eiskra » Apr 30, 2012 9:05 pm

tgiphil wrote:Tom is correct. While the Exchange servers may store the same data, they may be internally organized on disk differently. And thus, Veeam will not be able to dedupe this.
Thanks Tom, and tgiphil, for confirming.

After making space for and running a full backup with WAN optimization, I got ratios of 86% dedupe and 73% compression. The subsequent incremental was 99% dedupe and 61% compression. Given that the changes are almost exclusively in the mailstores, and it's Exchange 2010, even the compression isn't that great a number.
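
To put the two full runs side by side, here is a rough comparison. It assumes each reported ratio is the fraction of data remaining after that stage and that the stages multiply; that interpretation is an assumption rather than something confirmed in this thread, and the function name is hypothetical.

```python
def remaining_fraction(dedupe_ratio, compress_ratio):
    """Fraction of source data left after dedupe and then compression (assumed to multiply)."""
    return dedupe_ratio * compress_ratio


local_full = remaining_fraction(0.85, 0.72)  # first full, Local target
wan_full = remaining_fraction(0.86, 0.73)    # second full, WAN target

print(f"Local target full: {local_full:.3f}")
print(f"WAN target full:   {wan_full:.3f}")
print(f"Difference: {(wan_full - local_full) * 100:+.1f} points")
```

Read that way, the smaller block size made no meaningful difference on the full backup, which is consistent with the on-disk layout explanation given earlier in the thread.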

I suppose the next step is to move from Optimal to Best compression, and see how much space that saves - versus the extra load/runtime.

Am open to suggestions for Exchange config changes that might improve the odds of dedupe (perhaps by forcing an identical DAG database layout to the original mailstore, if something like that is possible... which I doubt.)
