brupnick
Expert
Posts: 196
Liked: 13 times
Joined: Feb 05, 2011 5:09 pm
Full Name: Brian Rupnick
Location: New York, USA
Contact:

Recommended Backup Job Settings for EMC Data Domain

Post by brupnick »

Good afternoon-

Now that v8 with DD Boost support has been released, are there new setting recommendations when using a Data Domain as the storage target? I've come across the following, but there might be more:

Storage Target Settings
  • Align backup file data blocks (Yes/No)
  • Decompress backup data blocks before storing (Yes/No)
Backup Job Settings
  • Enable inline deduplication (Yes/No)
  • Compression level
  • Storage optimization
Thank you!
veremin
Product Manager
Posts: 20270
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by veremin » 1 person likes this post

The storage target settings are the following:
Align backup file data blocks: NO
Decompress backup data blocks before storing: YES

Backup job settings:
Enable inline deduplication: YES
Compression level: Optimal
Storage optimization: Local target

Thanks.
brupnick
Expert
Posts: 196
Liked: 13 times
Joined: Feb 05, 2011 5:09 pm
Full Name: Brian Rupnick
Location: New York, USA
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by brupnick »

Thanks, Vladimir!
veremin
Product Manager
Posts: 20270
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by veremin »

You're welcome. Feel free to contact us if additional clarification is needed. Thanks.
veremin
Product Manager
Posts: 20270
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by veremin »

I've edited my answer, so please double-check the provided recommendations (specifically, the part about compression level). Thanks.
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by Gostev »

Lately our architects started recommending Local (16+ TB backup files) storage optimization for deduplicating storage. This significantly improves restore performance without impacting backup performance too much. Please consider this as well. Thanks!
jj281
Novice
Posts: 9
Liked: never
Joined: Aug 18, 2014 9:34 pm
Full Name: Jonathan Leafty
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by jj281 »

A couple questions... isn't this contradictory?

Decompress backup data blocks before storing: YES

Compression level: Optimal

Wouldn't this take extra CPU cycles on the Gateway to decompress, not to mention the wasted cycles on the Proxy to do it in the first place...? Also, if you're utilizing DDBoost, wouldn't it be better to do no compression, since the only blocks processed by the DD (vs Proxy/Gateway) that would eat CPU cycles (for Local Compress) would be the changed blocks?

Also, what's the point of Inline deduplication (again with DDBoost)? So it isn't transmitted to the Gateway from the Proxy (Transport)?? If that's the case, the assumption is the transmission of the block is the resource to be concerned about not the compute on the proxy right?
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by tsightler » 5 people like this post

jj281 wrote:A couple questions... isn't this contradictory?

Decompress backup data blocks before storing: YES
Compression level: Optimal

Wouldn't this take extra CPU cycles on the Gateway to decompress, not to mention the wasted cycles on the Proxy to do it in the first place...? Also, if you're utilizing DDBoost, wouldn't it be better to do no compression, since the only blocks processed by the DD (vs Proxy/Gateway) that would eat CPU cycles (for Local Compress) would be the changed blocks?
Certainly you can manually disable compression on your jobs if you like; indeed, this will slightly lower the amount of CPU used by the proxy and repository for compression/decompression. However, in most real-world deployments, proxy/repo CPU is rarely the bottleneck, especially when using the "optimal" compression setting. In far more cases network bandwidth is the bottleneck, and not just bandwidth, but the extra overhead of sending and receiving 2-3x as much data across the network. That's 2x as many CPU interrupts and 2x as much data being copied around by the network drivers, and all of that uses CPU as well.

In most environments the benefit of using compression between the agents is worth the extra CPU overhead, especially if the CPU capacity is otherwise available and especially in environments with 1GbE networks between proxy and gateway. Veeam's optimal compression uses the LZ4 algorithm, which is designed for high throughput and is very light on CPU, especially on the decompression side (a single decompression thread on a single core can decompress GB/s). So indeed, while the overall CPU usage might go up some, the bandwidth savings of 2-3x is worth it for the vast majority of environments. This effectively turns a gateway with a single 10GbE port into a gateway with 20-30 Gb/s of inbound bandwidth.
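The 20-30 Gb figure above is plain multiplication, and a rough sketch in Python makes the assumption explicit. The function name is made up for illustration, and the model ignores protocol overhead and any CPU limit on the gateway:

```python
def effective_inbound_gbps(link_gbps: float, compression_ratio: float) -> float:
    """Rate of uncompressed source data a link can absorb when the
    traffic crossing it is compressed by the given ratio.

    Simplified model: no protocol overhead, no CPU ceiling.
    """
    if link_gbps <= 0 or compression_ratio <= 0:
        raise ValueError("link speed and compression ratio must be positive")
    return link_gbps * compression_ratio


# A single 10GbE port with the 2-3x reduction quoted in the post.
for ratio in (2.0, 3.0):
    rate = effective_inbound_gbps(10.0, ratio)
    print(f"10GbE at {ratio:.0f}x compression: about {rate:.0f} Gb/s of source data")
```

The same function shows why the effect matters most on 1GbE links, where the uncompressed ceiling is ten times lower.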

But of course every environment is different, and you may have plenty of bandwidth even without using compression on the jobs, and perhaps you are CPU constrained instead, in which case, yes, disabling compression at the job level might be beneficial. That's the problem with generic "one size fits all" recommended settings and it's why the settings are there in the first place. If the exact same options worked perfectly for every environment you wouldn't need those knobs. :D
jj281 wrote:Also, what's the point of Inline deduplication (again with DDBoost)? So it isn't transmitted to the Gateway from the Proxy (Transport)?? If that's the case, the assumption is the transmission of the block is the resource to be concerned about not the compute on the proxy right?
I personally have no problem with disabling dedupe in Veeam, and I've changed that in the best practice papers and deployment guides I've written or had input on. But, it really makes very little difference to the backup process itself as the Veeam dedupe engine is very, very CPU light. Leaving it on can reduce the amount of traffic that has to be processed by DDboost overall and may reduce CPU and bandwidth very slightly. I always use this simple example:

Block1 -- AA:BB:CC:DD:EE:FF
Block2 -- BB:AA:DD:CC:FF:EE
Block3 -- AA:BB:CC:DD:EE:FF

So, using a simplistic explanation, DDboost will recognize that there are 6 unique data patterns in each block and reduce those down. This will occur whether Veeam dedupe is enabled or not; however, if Veeam dedupe is disabled, DDboost has to analyze the contents of the third block and thus use CPU to do it. On the other hand, if Veeam dedupe is enabled, DDboost never even sees that third block, because Veeam would already have recognized it as an exact duplicate of Block1 and never written it to the repository for DDboost to process. The total data savings is exactly the same either way, but DDboost has less work to do because Veeam has already eliminated that block from the stream.
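The three-block example can be played through in a few lines of Python. This is only a toy model: the two functions are hypothetical stand-ins, not how Veeam or DD Boost are actually implemented, and "work" is simply counted as the number of patterns the target has to examine.

```python
import hashlib

# The three example blocks from the post above.
blocks = {
    "Block1": "AA:BB:CC:DD:EE:FF",
    "Block2": "BB:AA:DD:CC:FF:EE",
    "Block3": "AA:BB:CC:DD:EE:FF",
}


def source_side_dedupe(blocks):
    """Drop any block whose entire content was already seen
    (toy stand-in for whole-block inline dedupe at the source)."""
    seen = set()
    kept = {}
    for name, data in blocks.items():
        digest = hashlib.sha256(data.encode("ascii")).hexdigest()
        if digest in seen:
            continue  # exact duplicate, never sent to the target
        seen.add(digest)
        kept[name] = data
    return kept


def target_side_work(blocks):
    """Return (patterns examined, unique patterns stored) for a toy
    sub-block dedupe engine at the target."""
    examined = 0
    unique = set()
    for data in blocks.values():
        for pattern in data.split(":"):
            examined += 1
            unique.add(pattern)
    return examined, len(unique)


dedupe_off = target_side_work(blocks)
dedupe_on = target_side_work(source_side_dedupe(blocks))
print("source dedupe off (examined, stored):", dedupe_off)
print("source dedupe on  (examined, stored):", dedupe_on)
```

In this model the stored result is 6 unique patterns either way, while the target examines 18 patterns with source dedupe off and 12 with it on, which is the point made in the post.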

In previous documents I always recommended just leaving Veeam dedupe on, since it had almost no negative impact and could have a slight positive impact in saved bandwidth, but I've more recently started telling people to turn it off, mainly because it just confuses people and leads to long discussions about something that will overall have very little impact one way or the other. There can also be a slight benefit to turning it off, as there is less metadata, which can lead to fewer read operations from the DD during job startup, but once again, this is usually a minor impact.
jj281
Novice
Posts: 9
Liked: never
Joined: Aug 18, 2014 9:34 pm
Full Name: Jonathan Leafty
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by jj281 »

Thanks for the detailed explanation, it does help. I know we're talking about slight degrees of resource consumption, but it's nice to know the reasoning and the more deep-dive aspects of Veeam.
BeThePacket
Lurker
Posts: 2
Liked: never
Joined: Jan 25, 2015 6:48 pm
Full Name: Nathan Nieblas
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by BeThePacket »

Why would Veeam architects recommend Local target (16+ TB backup files) for storage optimization to a DD appliance? The block size of this option is 8MB vs LAN target (512KB) or WAN target (256KB). Last I checked, the smaller the block size of a file the better the dedupe rate, which is something those of us with Data Domain products really want in order to get the most out of our investment.

The "change advanced settings to recommended for repository type" prompt is also extremely annoying when creating or modifying a job, since IMO its suggestion is completely wrong.
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by Gostev »

BeThePacket wrote:Last I checked, the smaller the block size of a file the better the dedupe rate, which is something those of us with Data Domain products really want in order to get the most out of our investment.
That is absolutely correct, but you are wrongly applying B&R block size to the Data Domain dedupe engine, where it plays no role.

Regardless of B&R block size, Data Domain will dedupe Veeam backup files with much smaller blocks (of variable length btw), getting you the best dedupe ratio possible. Because of that, smaller B&R block size will have no impact on Data Domain dedupe efficiency. Reading with larger block size on B&R side helps restore performance though.

Without Data Domain in the picture (e.g. when backing up to raw disk), for best dedupe ratio you indeed would want to go with small block sizes, as B&R will be the only dedupe engine.
BeThePacket
Lurker
Posts: 2
Liked: never
Joined: Jan 25, 2015 6:48 pm
Full Name: Nathan Nieblas
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by BeThePacket »

What you're saying is that regardless of the block size a backup file is being stored as on the DD, the space savings will be the same? What about the initial VBK? It would be great for EMC to validate the job settings being proposed/used.
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by Gostev »

Correct. And we did validate this as a part of the mandatory certification testing that EMC requires all backup vendors to perform.
jkowal99
Lurker
Posts: 1
Liked: never
Joined: May 05, 2015 12:57 pm
Full Name: Jeremy Kowalczyk
Contact:

[MERGED] Data Domain Backup Storage Optimization

Post by jkowal99 »

Hello,
Trying to get a feel for setting up the best backup job for backing up to a Data Domain 200 device from Veeam. In the advanced settings, under storage, there are some storage optimization options. The "recommended" option for the DD appliance is "Local target (16 TB + backup files)". The description says "Lowest deduplication ratio and larger incremental backups. Recommended for jobs producing full backup files larger than 16 TB". My question is: if the VMs I'm backing up aren't anywhere near 16 TB, not even 1 TB, should I be choosing a different option with better deduplication? Thanks
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by foggy » 1 person likes this post

Jeremy, please review the thread above for recommended settings and some deeper considerations. Should answer your questions. Thanks!
jsprinkleisg
Service Provider
Posts: 26
Liked: 4 times
Joined: Dec 09, 2009 9:59 pm
Full Name: James Sprinkle
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by jsprinkleisg »

Gostev wrote:Lately our architects started recommending Local (16+ TB backup files) storage optimization for deduplicating storage. This significantly improves restore performance without impacting backup performance too much.
Doesn't the larger block size result in reduced restore performance for operations such as FLR and Instant VM recovery, as seems to be the case in this post?
BeThePacket wrote: The "change advanced settings to recommended for repository type" prompt is also extremely annoying when creating or modifying a job, since IMO its suggestion is completely wrong.
Though the prompt may not be completely wrong, I agree it is annoying, especially if the same advanced settings are not optimal in all situations.

Veeam's documentation on this subject is confusing. KB1956 links to KB1745, which is not consistent with the user guide and the UI prompt, while the white paper makes no mention of the advanced job settings at all. I would certainly like to see more consistency among Veeam's documentation.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Recommended Backup Job Settings for Data Domain

Post by foggy »

jsprinkleisg wrote: Doesn't the larger block size result in reduced restore performance for operations such as FLR and Instant VM recovery, as seems to be the case in this post?
Reading with a larger block requires fewer IOPS, which results in increased restore performance (compared to using a smaller block on the same storage). Still, even with the recommended settings, restore from dedupe storage is expected to be slower than restore from raw storage with its corresponding recommended setting, simply because of deduplication (which is discussed in the referred post).
jsprinkleisg wrote:Veeam's documentation on this subject is confusing. KB1956 links to KB1745, which is not consistent with the user guide and the UI prompt, while the white paper makes no mention of the advanced job settings at all. I would certainly like to see more consistency among Veeam's documentation.
As you can see from this thread, recommendations might change over time, so I believe we need to update the KB. Thanks!
SE-1
Influencer
Posts: 22
Liked: 5 times
Joined: Apr 07, 2015 1:42 pm
Full Name: Dirk Slechten
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by SE-1 » 1 person likes this post

Hello

We also use Veeam in combination with a DD2500.

We also had the setting at Local target (16+ TB backup files).

At a certain point we noticed that our incremental backups were very large.

This was the total of all our jobs

Processed data (GB): 11587.1
Transferred data (GB): 1859.2
Change ratio: 16.05%

At another Veeam environment we were hitting a 22% change ratio per day...

As we could not understand where this was coming from, we opened a case.
It took support about a month to figure out that it was caused by the Local target (16+ TB backup files) setting.
After we changed the setting to LAN target, our incrementals dropped to 5-6%.

Processed data (GB): 12015.4
Transferred data (GB): 656.4
Change ratio: 5.46%
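For anyone who wants to run the same check on their own job statistics, the change ratio quoted here is simply transferred data divided by processed data. A minimal sketch, with a function name chosen for illustration only:

```python
def change_ratio_percent(processed_gb: float, transferred_gb: float) -> float:
    """Transferred data as a percentage of processed data."""
    if processed_gb <= 0:
        raise ValueError("processed data must be positive")
    return 100.0 * transferred_gb / processed_gb


# Figures reported in this post, before and after switching to LAN target.
before = change_ratio_percent(11587.1, 1859.2)
after = change_ratio_percent(12015.4, 656.4)
print(f"Local target (16+ TB): {before:.2f}%")
print(f"LAN target:            {after:.2f}%")
```

Both results agree with the reported 16.05% and 5.46%.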

At another Veeam environment we made the same change and incrementals dropped to 3-4%.

The problem with the huge incrementals was that our backup copy jobs also had to process almost 2 TB of incrementals instead of 656 GB...

We have also disabled compression in the backup job, as the DD does the compression. Compression on compression can't be good...
We have also disabled the "Decompress backup data blocks before storing" setting on the repository. There is no point in enabling it, as there is no compression in the job...
Inline deduplication is also disabled, as the DD does the deduplication.

There is a lot of discussion about DD in combination with Veeam and there are a lot of recommendations.
But every tweak you make seems to have a negative impact on something else :(
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by Gostev » 1 person likes this post

Indeed, it's really just a normal "pick any two" situation (out of 3 critical characteristics). Most things in the world work like that... not that you did not go through this when choosing your backup storage for example, which was the "pick any two" situation between Cost, Capacity and Performance (and you chose the first two). Now, you are effectively trying to get Performance back with Cost staying the same - which means you need to sacrifice Capacity! So, it all makes perfect sense ;)

That said, we are constantly making changes and improvements to get as close as possible to the ideal balance of all 3 characteristics. For example, we will reduce the block size of 16+ TB setting in v9 based on lots of real-world data we have collected from the customer environments. And, there is one other optimization in the pipeline that I am really hoping will make it into v9, which can be a game changer with dedupe appliances.
btanguay
Influencer
Posts: 18
Liked: 2 times
Joined: Nov 28, 2014 6:58 pm
Full Name: Benoit Tanguay
Location: Boucherville, QC, Canada
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by btanguay »

Hi,

I'm having the same issue as everybody here. But after a discussion with EMC, I heard that v9 should bring some improvement in restore performance for FLR and Instant Recovery by using multistream instead of single-stream with DD Boost. Does anybody have more info on that? I've already checked the v9 post on the Veeam website, but there is not much information for now.
Thanks
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by foggy »

Benoit, please stay tuned for future announcements.
jpveen
Novice
Posts: 3
Liked: never
Joined: Sep 11, 2015 8:33 am
Full Name: Jan
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by jpveen »

Hi all,
Backing up using Veeam v8 with DDBoost and CIFS to DD produces very large files on the DD when backing up very large VMs (12 TB full backup / 1.5 TB incremental on a single VM).

Data Domain MTree replication suffers from those large files, as MTree replication works per file and cannot use multiple replication threads for a single file (proven and verified by EMC support on DD OS 5.5).
In the environment with these large VMs, this leads to too large a replication lag...

The question is how to reduce those file sizes, i.e. how to get Veeam to split the backups into smaller files, for example by telling Veeam to create files with a max size of 100GB. Does anybody know how to achieve this? Will changing the "storage optimization" from Local target to LAN or WAN target help achieve this? Or is there a (hidden) setting somewhere to set a max backup file size in Veeam?

Thanks for your thoughts on this!
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by foggy »

There is no such setting; however, to reduce the size of increment files you can switch the job to use smaller data blocks, at the cost of some processing overhead.
jpveen
Novice
Posts: 3
Liked: never
Joined: Sep 11, 2015 8:33 am
Full Name: Jan
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by jpveen »

Ok, that was also my idea, however I can't find anywhere which storage optimization target setting (blocksize) results in which filesize.
The only thing I can find is the standard blocksize list Local16: 8MB, Local: 1MB, LAN: 512KB, WAN: 256KB....
But what will be the corresponding (max/split) filesizes?
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by foggy »

There is no such estimate; everything depends on the pattern of changes within a VM. With a smaller block size you would not need to copy, say, an entire 1MB block if only 1KB has changed in it, but only 256KB (if you switch to WAN target, for example).
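To get a feel for how the change pattern and block size interact, here is a rough Python sketch. It uses the block sizes listed earlier in the thread and assumes every changed offset dirties its whole block; the function and the sample offsets are invented for illustration, and compression and dedupe are ignored, so the numbers are upper bounds before data reduction.

```python
# Block sizes per storage optimization setting, as listed earlier in the thread (KB).
BLOCK_SIZE_KB = {"Local16": 8192, "Local": 1024, "LAN": 512, "WAN": 256}


def incremental_kb(changed_offsets_kb, block_kb):
    """KB that must be copied if each changed offset (given in KB from the
    start of the disk) marks its entire containing block as changed."""
    dirty_blocks = {offset // block_kb for offset in changed_offsets_kb}
    return len(dirty_blocks) * block_kb


# One small change at an arbitrary offset, as in the 1KB example above.
one_small_change = [100]
# A few scattered small changes (hypothetical offsets).
scattered_changes = [100, 5000, 9000, 20000]

for name, size in BLOCK_SIZE_KB.items():
    single = incremental_kb(one_small_change, size)
    scattered = incremental_kb(scattered_changes, size)
    print(f"{name:8s} single change: {single:5d} KB   scattered: {scattered:6d} KB")
```

Small, scattered changes are the worst case for large blocks, which fits the larger incrementals reported earlier in the thread with the 16+ TB setting.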
tdewin
Veeam Software
Posts: 1775
Liked: 646 times
Joined: Mar 02, 2012 1:40 pm
Full Name: Timothy Dewin
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by tdewin »

jpveen wrote:Ok, that was also my idea, however I can't find anywhere which storage optimization target setting (blocksize) results in which filesize.
The only thing I can find is the standard blocksize list Local16: 8MB, Local: 1MB, LAN: 512KB, WAN: 256KB....
But what will be the corresponding (max/split) filesizes?
Every job creates its own fileset. There is no splitting of files over a certain threshold. In short, if you want more and smaller files, in this version you can do so by splitting your source data up into more jobs. Changing the block size could potentially help you store less data, but it doesn't make "more files".
jpveen
Novice
Posts: 3
Liked: never
Joined: Sep 11, 2015 8:33 am
Full Name: Jan
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by jpveen »

Ok, but a job smaller than 1 VM is not possible ;-(
So 1 VM with a 5TB VMDK will create a 5TB file.... :-((
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by foggy »

You could split the backup by backing up each VM disk (if there are several) in a different job; however, this is not typically recommended, as it leads to restoration issues.
tunturk
Lurker
Posts: 2
Liked: never
Joined: Dec 29, 2015 9:26 am
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by tunturk »

Dear All

I read all the posts but I can't decide on the compression level for our test platform. Can you tell me which is the correct compression level configuration, "Dedupe-friendly", "None" or "Optimal", for our Data Domain DDBoost target? :D

Many Thanks
veremin
Product Manager
Posts: 20270
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Recommended Backup Job Settings for EMC Data Domain

Post by veremin »

You should be ok with using Optimal level. Thanks.