Recommended Backup Job Settings for EMC Data Domain

by brupnick » Thu Nov 06, 2014 8:01 pm

Good afternoon-

Now that v8 with DD Boost support has been released, are there new setting recommendations when using a Data Domain as the storage target? I've come across the following, but there might be more:

Storage Target Settings
  • Align backup file data blocks (Yes/No)
  • Decompress backup data blocks before storing (Yes/No)
Backup Job Settings
  • Enable inline deduplication (Yes/No)
  • Compression level
  • Storage optimization
Thank you!
brupnick
Expert
 
Posts: 196
Liked: 13 times
Joined: Sat Feb 05, 2011 5:09 pm
Location: New York, USA
Full Name: Brian Rupnick

Re: Recommended Backup Job Settings for Data Domain

by v.Eremin » Fri Nov 07, 2014 8:53 am

The storage target settings are the following:
Align backup file data blocks: NO
Decompress backup data blocks before storing: YES

Backup job settings:
Enable inline deduplication: YES
Compression level: Optimal
Storage optimization: Local target

Thanks.
v.Eremin
Veeam Software
 
Posts: 13409
Liked: 985 times
Joined: Fri Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

Re: Recommended Backup Job Settings for Data Domain

by brupnick » Fri Nov 07, 2014 12:25 pm

Thanks, Vladimir!
brupnick
Expert
 
Posts: 196
Liked: 13 times
Joined: Sat Feb 05, 2011 5:09 pm
Location: New York, USA
Full Name: Brian Rupnick

Re: Recommended Backup Job Settings for Data Domain

by v.Eremin » Fri Nov 07, 2014 12:30 pm

You're welcome. Feel free to contact us if additional clarification is needed. Thanks.
v.Eremin
Veeam Software
 
Posts: 13409
Liked: 985 times
Joined: Fri Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

Re: Recommended Backup Job Settings for Data Domain

by v.Eremin » Fri Nov 07, 2014 2:38 pm

I've edited my answer, so please double-check the provided recommendations (specifically, the part about compression level). Thanks.
v.Eremin
Veeam Software
 
Posts: 13409
Liked: 985 times
Joined: Fri Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin

Re: Recommended Backup Job Settings for Data Domain

by Gostev » Fri Nov 07, 2014 7:45 pm

Lately our architects have started recommending the Local target (16+ TB backup files) storage optimization setting for deduplicating storage. This significantly improves restore performance without impacting backup performance too much. Please consider this as well. Thanks!
Gostev
Veeam Software
 
Posts: 21428
Liked: 2358 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Recommended Backup Job Settings for Data Domain

by jj281 » Mon Nov 10, 2014 5:54 pm

A couple questions... isn't this contradictory?

Decompress backup data blocks before storing: YES

Compression level: Optimal

Wouldn't this take extra CPU cycles on the Gateway to decompress, not to mention the wasted cycles on the Proxy to do it in the first place...? Also, if you're utilizing DDBoost, wouldn't it be better to do no compression, since the only blocks processed by the DD (vs Proxy/Gateway) that would eat CPU cycles (for Local Compress) would be the changed blocks?

Also, what's the point of Inline deduplication (again with DDBoost)? So it isn't transmitted to the Gateway from the Proxy (Transport)?? If that's the case, the assumption is the transmission of the block is the resource to be concerned about not the compute on the proxy right?
jj281
Novice
 
Posts: 9
Liked: never
Joined: Mon Aug 18, 2014 9:34 pm
Full Name: Jonathan Leafty

Re: Recommended Backup Job Settings for Data Domain

by tsightler » Mon Nov 10, 2014 6:51 pm

jj281 wrote:A couple questions... isn't this contradictory?

Decompress backup data blocks before storing: YES
Compression level: Optimal

Wouldn't this take extra CPU cycles on the Gateway to decompress, not to mention the wasted cycles on the Proxy to do it in the first place...? Also, if you're utilizing DDBoost, wouldn't it be better to do no compression, since the only blocks processed by the DD (vs Proxy/Gateway) that would eat CPU cycles (for Local Compress) would be the changed blocks?


Certainly you can manually disable compression on your jobs if you like; indeed, this will slightly lower the amount of CPU used by the proxy and repository for compression/decompression. However, in most real-world deployments, proxy/repo CPU is rarely the bottleneck, especially when using the "Optimal" compression setting. In far more cases network bandwidth is the bottleneck, and not just bandwidth, but the extra overhead of sending and receiving 2-3x as much data across the network. That's at least 2x as many CPU interrupts and 2x as much data being copied around by the network drivers, and all of that uses CPU as well.

In most environments the benefit of using compression between the agents is worth the extra CPU overhead, especially if the CPU capacity is otherwise available and especially in environments with 1GbE networks between proxy and gateway. Veeam's Optimal compression uses the LZ4 algorithm, which is designed for high throughput and is very light on CPU, especially on the decompression side (a single decompression thread on a single core can decompress GB/s). So indeed, while the overall CPU usage might go up some, the 2-3x bandwidth savings are worth it for the vast majority of environments. This effectively turns a gateway with a single 10GbE port into a gateway with 20-30Gb/s of inbound bandwidth.
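
To put numbers on the 10GbE point, here is a back-of-the-envelope sketch. It assumes the 2-3x compression ratio quoted in this post and ignores protocol overhead; the function name is just for illustration and is not anything from Veeam.

```python
def effective_inbound_gbps(link_gbps: float, compression_ratio: float) -> float:
    """Source-data rate a link can carry when traffic is compressed by the given ratio.

    Assumption: the ratio is constant and protocol overhead is ignored.
    """
    if link_gbps <= 0 or compression_ratio <= 0:
        raise ValueError("link speed and compression ratio must be positive")
    return link_gbps * compression_ratio


# A single 10GbE port at the 2x and 3x ratios mentioned in the post.
for ratio in (2.0, 3.0):
    rate = effective_inbound_gbps(10, ratio)
    print(f"10GbE at {ratio:.0f}x compression carries about {rate:.0f} Gb/s of source data")
```

Real ratios depend heavily on the data being backed up, so treat the result as an upper-level estimate rather than a guarantee.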

But of course every environment is different, and you may have plenty of bandwidth even without using compression on the jobs, and perhaps you are CPU constrained instead, in which case, yes, disabling compression at the job level might be beneficial. That's the problem with generic "one size fits all" recommended settings, and it is why the settings are there in the first place. If the exact same options worked perfectly for every environment you wouldn't need those knobs. :D

jj281 wrote:Also, what's the point of Inline deduplication (again with DDBoost)? So it isn't transmitted to the Gateway from the Proxy (Transport)?? If that's the case, the assumption is the transmission of the block is the resource to be concerned about not the compute on the proxy right?


I personally have no problem with disabling dedupe in Veeam, and I've changed that in the best practice papers and deployment guides I've written or had input on. But, it really makes very little difference to the backup process itself as the Veeam dedupe engine is very, very CPU light. Leaving it on can reduce the amount of traffic that has to be processed by DDboost overall and may reduce CPU and bandwidth very slightly. I always use this simple example:

Block1 -- AA:BB:CC:DD:EE:FF
Block2 -- BB:AA:DD:CC:FF:EE
Block3 -- AA:BB:CC:DD:EE:FF

So, using a simplistic explanation, DDboost will recognize that the same 6 unique data patterns appear in each block and reduce the data down to those. This will occur whether Veeam dedupe is enabled or not; however, if Veeam dedupe is disabled, DDboost has to analyze the contents of the third block and thus use CPU to do it. On the other hand, if Veeam dedupe is enabled, DDboost never even sees that third block, because Veeam would already have recognized it as an exact duplicate of Block1 and never written it to the repository for DDboost to process in the first place. The total data savings is exactly the same either way, but DDboost has less work to do because Veeam has already eliminated that block from the stream.
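
The three-block example can be played through in a few lines. This is only a toy model of the explanation above, not how Veeam or DD Boost are implemented: "source-side" dedupe drops exact duplicate blocks, "target-side" dedupe works on the colon-separated patterns, and both function names are made up for the illustration.

```python
blocks = {
    "Block1": "AA:BB:CC:DD:EE:FF",
    "Block2": "BB:AA:DD:CC:FF:EE",
    "Block3": "AA:BB:CC:DD:EE:FF",
}


def source_side_dedupe(blocks):
    """Toy stand-in for Veeam dedupe: drop blocks identical to an earlier block."""
    seen = set()
    kept = {}
    for name, content in blocks.items():
        if content not in seen:
            seen.add(content)
            kept[name] = content
    return kept


def target_side_dedupe(blocks):
    """Toy stand-in for the appliance: return (patterns examined, unique patterns stored)."""
    unique = set()
    examined = 0
    for content in blocks.values():
        for pattern in content.split(":"):
            examined += 1
            unique.add(pattern)
    return examined, unique


examined_off, stored_off = target_side_dedupe(blocks)
examined_on, stored_on = target_side_dedupe(source_side_dedupe(blocks))

print("Veeam dedupe off:", examined_off, "patterns examined,", len(stored_off), "stored")
print("Veeam dedupe on: ", examined_on, "patterns examined,", len(stored_on), "stored")
```

In this model the stored result is identical in both cases; only the amount of data the target side has to examine changes.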

In previous documents I always recommended just leaving Veeam dedupe on, since it had almost no negative impact and could have a slight positive impact in saved bandwidth, but I've more recently started telling people to turn it off, mainly because it just confuses people and leads to long discussions about something that will overall have very little impact one way or the other. There can also be a slight benefit to turning it off, as there is less metadata, which can lead to fewer read operations from the DD during job startup, but once again, this is usually a minor impact.
tsightler
Veeam Software
 
Posts: 4800
Liked: 1756 times
Joined: Fri Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

Re: Recommended Backup Job Settings for Data Domain

by jj281 » Tue Nov 11, 2014 4:25 pm

Thanks for the detailed explanation, it does help. I know we're talking about slight degrees of resource consumption, but it's nice to know the reasoning and the more deep-dive aspects of Veeam.
jj281
Novice
 
Posts: 9
Liked: never
Joined: Mon Aug 18, 2014 9:34 pm
Full Name: Jonathan Leafty

Re: Recommended Backup Job Settings for EMC Data Domain

by BeThePacket » Sun Jan 25, 2015 7:01 pm

Why would Veeam architects recommend Local target (16+ TB backup files) for storage optimization to a DD appliance? The block size of this option is 8MB vs LAN target (512KB) or WAN target (256KB). Last I checked, the smaller the block size, the better the dedupe rate, which is what those of us with Data Domain products really want in order to get the most out of our investment.

The "change advanced settings to recommended for repository type" prompt is also extremely annoying when creating or modifying a job, since IMO its suggestion is completely wrong.
BeThePacket
Lurker
 
Posts: 2
Liked: never
Joined: Sun Jan 25, 2015 6:48 pm
Full Name: Nathan Nieblas

Re: Recommended Backup Job Settings for EMC Data Domain

by Gostev » Sun Jan 25, 2015 11:24 pm

BeThePacket wrote:Last I checked, the smaller the block size of a file the better the dedupe rate, which is something for those of us with Data Domain products really want to get the most out of our investment.

That is absolutely correct, but you are wrongly applying the B&R block size to the Data Domain dedupe engine, where it plays no role.

Regardless of the B&R block size, Data Domain will dedupe Veeam backup files with much smaller blocks (of variable length, btw), getting you the best dedupe ratio possible. Because of that, a smaller B&R block size will have no impact on Data Domain dedupe efficiency. Reading with a larger block size on the B&R side helps restore performance, though.

Without Data Domain in the picture (e.g. when backing up to raw disk), for best dedupe ratio you indeed would want to go with small block sizes, as B&R will be the only dedupe engine.
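
A toy sketch of why the source block size stops mattering once a finer-grained dedupe engine sits behind it. Everything here is a simplification: the byte string, the block sizes of 8 and 4, and the fixed 4-byte target segments are made-up stand-ins (the appliance uses variable-length segments), and the backup data is assumed to land uncompressed with source-side dedupe off.

```python
def split(data: bytes, size: int) -> list:
    """Cut data into consecutive fixed-size chunks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def unique_bytes(chunks) -> int:
    """Bytes left after keeping one copy of each distinct chunk."""
    return sum(len(c) for c in set(chunks))


def target_stored(data: bytes, source_block: int, target_segment: int = 4) -> int:
    """Bytes kept by a target that re-segments the written stream on its own."""
    written = b"".join(split(data, source_block))  # backup file as written, uncompressed
    return unique_bytes(split(written, target_segment))


data = b"ABCDEFGH" + b"ABCDXXXX" + b"EFGHABCD"

# Source block size is the only dedupe granularity (e.g. raw disk).
print("source-only, large blocks:", unique_bytes(split(data, 8)), "bytes")
print("source-only, small blocks:", unique_bytes(split(data, 4)), "bytes")

# Target re-segments the stream, so the source block size drops out.
print("with target dedupe, large blocks:", target_stored(data, 8), "bytes")
print("with target dedupe, small blocks:", target_stored(data, 4), "bytes")
```

With source-only dedupe the smaller block size wins, which matches the raw disk case; with the target doing its own segmentation both block sizes end up storing the same amount.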
Gostev
Veeam Software
 
Posts: 21428
Liked: 2358 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Recommended Backup Job Settings for EMC Data Domain

by BeThePacket » Mon Jan 26, 2015 1:36 am

So you're saying that, regardless of the block size used for the backup file stored on the DD, the space savings will be the same? What about the initial VBK? It would be great for EMC to validate the job settings being proposed/used.
BeThePacket
Lurker
 
Posts: 2
Liked: never
Joined: Sun Jan 25, 2015 6:48 pm
Full Name: Nathan Nieblas

Re: Recommended Backup Job Settings for EMC Data Domain

by Gostev » Mon Jan 26, 2015 3:35 pm

Correct. And we did validate this as a part of the mandatory certification testing that EMC requires all backup vendors to perform.
Gostev
Veeam Software
 
Posts: 21428
Liked: 2358 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

[MERGED] Data Domain Backup Storage Optimization

by jkowal99 » Tue May 05, 2015 1:05 pm

Hello,
I'm trying to get a feel for setting up the best backup job for backing up to a Data Domain 200 device from Veeam. In the advanced settings, under Storage, there are some storage optimization options. The "recommended" option for the DD appliance is "Local target (16 TB + backup files)". The description says "Lowest deduplication ratio and larger incremental backups. Recommended for jobs producing full backup files larger than 16 TB". My question is: if the VMs I'm backing up aren't anywhere near 16 TB, not even 1 TB, should I be choosing a different option with better deduplication? Thanks
jkowal99
Lurker
 
Posts: 1
Liked: never
Joined: Tue May 05, 2015 12:57 pm
Full Name: Jeremy Kowalczyk

Re: Recommended Backup Job Settings for EMC Data Domain

by foggy » Tue May 05, 2015 2:48 pm

Jeremy, please review the thread above for recommended settings and some deeper considerations. Should answer your questions. Thanks!
foggy
Veeam Software
 
Posts: 14885
Liked: 1092 times
Joined: Mon Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
