Comprehensive data protection for all workloads
mrholm
Expert
Posts: 170
Liked: 15 times
Joined: Apr 20, 2018 8:12 am
Full Name: Mats Holm
Contact:

Veeam Copy Jobs and Data Domain Synthetic Full

Post by mrholm »

Hi
We are working with Dell on a Veeam environment that writes copy jobs to a Data Domain. The setup is not giving us the expected amount of compression, and it seems that Veeam is storing too much data on the Data Domain.
The environment is VMware with around 650 VMs and around 500 TB of storage used.
The backup jobs write daily backups to Windows repositories.
We run daily copy jobs with Veeam to a Data Domain 9300 and store 30 restore points + 8 weeklies + 12 monthlies.

We have an ongoing support ticket with Dell because the Data Domain is filling up much faster than expected given the number of VMs and the retention set in the environment.
One thing we discussed is how Veeam handles copy jobs and synthetic fulls. Please explain the following scenario:

Day 1: Server1 (1 TB in size) is copied to the Data Domain, which stores a full copy (VBK) of 1 TB in logical size (ignoring the compression ratio).
Days 2-30: only incrementals (VIB) are written to the Data Domain; with a 5% change rate, each incremental will be 50 GB in size.
Day 31: the full and the oldest incremental are merged to create a new full. This is done using the DD Boost API and the synthetic full feature.
After day 31, Veeam will create a new full copy based on the GFS settings, and this is also done using the DD Boost API and the synthetic full feature.
This behavior repeats every day and every week (a rough sketch of these sizes follows below).
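To make the scenario concrete, here is a minimal Python sketch of the logical sizes described above. The figures (1 TB full, 5% daily change, 30 restore points) come from the scenario itself; the simple "ideal vs. append-style" merge comparison is an assumption for illustration only, not a statement of how the DD Boost synthetic full works internally.

Code: Select all

# Rough sketch of the logical sizes in the scenario above.
# Assumptions (illustrative only): 1 TB full, 5% daily change rate,
# 30 restore points, and change that only overwrites existing data
# (the VM does not grow).
full_size_tb = 1.0                              # day 1: full copy (VBK)
change_rate = 0.05
incremental_tb = full_size_tb * change_rate     # 50 GB per day (VIB)
restore_points = 30

# Days 2-30: incrementals only
chain_logical_tb = full_size_tb + (restore_points - 1) * incremental_tb
print(f"Chain logical size before the first merge: {chain_logical_tb:.2f} TB")

# Day 31: the oldest incremental is merged into the full.
# If changed blocks overwrite the blocks they supersede, the merged full
# stays at 1 TB; if the merge appends changed blocks instead of reusing
# freed space, the file grows by roughly one incremental per merge.
print(f"Ideal merged full: {full_size_tb:.2f} TB, "
      f"append-style merged full: {full_size_tb + incremental_tb:.2f} TB")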

The blocks changed in VMware and backed up by Veeam in the incremental (identified by CBT) include all changed blocks (new, changed and deleted), but what does Veeam do with the changed blocks from the incremental during the copy merge: are only the new and changed blocks inserted into the file, together with deleting blocks that are no longer needed?
A 5% change rate doesn't mean the VM is getting bigger; it can mean that data was only overwritten. If that is the case, we will end up with full copies on the Data Domain that are bigger than the source full backups, where we regularly run active fulls to "zero out" any effect like this.

Please explain how Veeam handles this and whether the effect we see is expected.

Would using the option "Read the entire restore point from source backup instead of synthesizing it from increments" do anything to the "running copy chain", or will the GFS points always be synthesized? I have read some posts saying that this option would be similar to an active full.
//Mats
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by HannesK »

Hello,
if you see more data used than expected, then the change rate is higher or the compression rate is lower than expected. One customer example I remember was a large server with encrypted data that had been forgotten during planning.

Synthetic full always works the same way (for backup copy jobs and for normal backup jobs): www.veeam.com/kb1932 (animation of synthetic full backup)

Changing to "read the entire restore point" does not change anything on a deduplication appliance from a disk usage perspective. The full backups are still the same. Well, they are created on different days because the synthetic full with GFS does not happen on the day most people expect it (see http://rps.dewin.me/ and choose "manual run" and simulate). But that's not relevant from a disk usage perspective.

For a 500 TB source with your configured retention and 5% change rate, I would expect about 1575 TB (roughly 1.6 PB) of disk usage on plain cheap disks with ReFS / XFS. With Data Domain it might be a little bit less, let's say 1200 TB (or let's be optimistic, say 1 PB).
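For comparison, a very rough back-of-the-envelope sum of what gets kept logically under this retention could look like the sketch below. All ratios are illustrative assumptions and this is not the formula used by rps.dewin.me; block cloning (ReFS/XFS) or appliance deduplication shares most blocks between the fulls, which is why actual disk usage comes out far below the logical figure.

Code: Select all

# Back-of-the-envelope logical data for the retention in this thread.
# All ratios are illustrative assumptions; this is NOT the rps.dewin.me formula.
source_tb = 500
change_rate = 0.05      # assumed daily change rate
reduction = 0.5         # assumed compression before data hits the repository

full_tb = source_tb * reduction
daily_incr_tb = source_tb * change_rate * reduction

dailies, weeklies, monthlies = 30, 8, 12

# Worst case: every GFS point stored as an independent full
logical_tb = full_tb + dailies * daily_incr_tb + (weeklies + monthlies) * full_tb
print(f"Logical data kept under retention: ~{logical_tb:.0f} TB")
# Block cloning or appliance dedupe shares blocks between the fulls,
# so physical disk usage is far lower than this logical figure.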

Best regards,
Hannes
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by foggy »

mrholm wrote: Jul 02, 2020 1:24 pm The blocks changed in VMware and backed up by Veeam in the incremental (identified by CBT) include all changed blocks (new, changed and deleted), but what does Veeam do with the changed blocks from the incremental during the copy merge: are only the new and changed blocks inserted into the file, together with deleting blocks that are no longer needed?
Hi Mats, for performance reasons, when performing merges on a forever forward incremental chain to a DDBoost repository (and in fact any dedupe appliance), Veeam B&R appends data to the VBK file rather than reusing unused space inside it (i.e. blocks deleted inside the guest OS). However, when GFS retention is enabled, the full backup is basically re-created each time a GFS restore point is offloaded, reducing the size to what it actually is (similar to what a synthetic or active full would do) and preventing it from growing infinitely. This could explain some unexpected data growth, as GFS points would take more space than one might assume.
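To illustrate the append behavior described above, here is a small conceptual sketch (my own toy model, not Veeam's actual VBK format or merge code). Each merge appends the increment's blocks to the full instead of overwriting the blocks they supersede, so the running full grows by roughly one increment per merge even though the VM itself does not grow, until a GFS offload, compact, or active full rebuilds it from live blocks only.

Code: Select all

# Toy model only - not Veeam's VBK format or merge algorithm.
# A full is a list of blocks; a forever-forward merge on a dedupe
# appliance appends the increment's blocks rather than reusing space.
def merge_append(full_blocks, increment_blocks):
    return full_blocks + increment_blocks        # superseded blocks stay in the file

def rebuild(full_blocks, live):
    # GFS offload / compact / active full: keep only the live blocks.
    return [full_blocks[offset] for offset in live.values()]

vm_blocks = 1000                                 # VM size in "blocks"
full = list(range(vm_blocks))                    # initial full copy
live = {block: block for block in range(vm_blocks)}   # logical block -> file offset

for day in range(30):                            # 30 merges, 5% change per day
    changed = list(range(50))                    # the same 50 blocks change each day
    offset = len(full)
    full = merge_append(full, changed)
    for n, logical_block in enumerate(changed):
        live[logical_block] = offset + n         # new copy supersedes the old one

print(f"VM: {vm_blocks} blocks, running full after 30 merges: {len(full)} blocks")
print(f"Rebuilt full: {len(rebuild(full, live))} blocks")
# -> the running full is about 2.5x the VM, the rebuilt full is back to 1000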
mrholm
Expert
Posts: 170
Liked: 15 times
Joined: Apr 20, 2018 8:12 am
Full Name: Mats Holm
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by mrholm »

We see a lot of large servers whose GFS points are larger than the actual data size of the server, and after changing this to "read the entire restore point" they are no longer that large and stay small.
One is 1.93 TB in size, but its GFS points are 3.77 - 4.27 TB in size. If it is the case that GFS points grow and the only way to change that is to read everything from source, then doesn't the whole idea of synthetic fulls fail?
Is there a way for me to show what I see in a picture?

//M
mrholm
Expert
Posts: 170
Liked: 15 times
Joined: Apr 20, 2018 8:12 am
Full Name: Mats Holm
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by mrholm »

And why is there no option to run "read the entire restore point" for all copies? Right now I have to enable GFS for this to work.
On one server without GFS, the data size of the VM is 3.98 TB, but the running full is up at 4.73 TB.
So if I don't want GFS points, the only way to "clear out" and get the full back to its normal size is to run a true full copy?
mrholm
Expert
Posts: 170
Liked: 15 times
Joined: Apr 20, 2018 8:12 am
Full Name: Mats Holm
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by mrholm »

Some more info: I have two Notes servers; the first one is 1.76 TB in size (VM size) and the other one is 2.04 TB.
Looking at these servers' GFS points, I see that the smallest one for the first server is 2.22 TB and for the second server 2.03 TB.
So doing GFS copies to a Data Domain does not seem like a good idea if you don't want to end up buying a lot of storage, since Veeam keeps writing GFS points that big?
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by foggy »

mrholm wrote: Sep 09, 2020 8:52 am So if I don't want GFS points, the only way to "clear out" and get the full back to its normal size is to run a true full copy?
The compact operation will also help bring the full backup size back to the original.
mrholm
Expert
Posts: 170
Liked: 15 times
Joined: Apr 20, 2018 8:12 am
Full Name: Mats Holm
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by mrholm »

Running a compact operation against a Data Domain with around 500 TB of front-end VM data stored (almost 5 PB logical) is not possible.
So basically you are saying that a GFS point will sometimes (by design) be bigger than the VM itself, due to the way Veeam builds the data for it?
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by foggy »

Running a compact operation against a Data Domain with around 500 TB of front-end VM data stored (almost 5 PB logical) is not possible.
Right, but you use GFS/synthetic fulls, which take comparable time.
So basically you are saying that a GFS point will sometimes (by design) be bigger than the VM itself, due to the way Veeam builds the data for it?
Yes, that is expected behavior.
Zach123
Enthusiast
Posts: 40
Liked: 3 times
Joined: Jun 04, 2019 12:36 am
Full Name: zaki khan
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by Zach123 »

Hi Team

I know it's an old thread but just had a query.

We are using VBR version 11, DD6900 with DDBoost.

We were told that running a synthetic full will consume less space on the DD than running an active full. Is that true?

I was under the impression that a synthetic full would only improve backup speed and reduce network load, but should not have an impact on the storage consumed on the Data Domain.
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by HannesK »

We were told that running a synthetic full will consume less space on the DD than running an active full. Is that true?
That makes zero sense to me, as the data blocks are the same. Do you have a source (link) with more details on that statement?
Zach123
Enthusiast
Posts: 40
Liked: 3 times
Joined: Jun 04, 2019 12:36 am
Full Name: zaki khan
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by Zach123 »

Hi

Thanks for responding. The statement came from one of the technical guys and not directly from Veeam. It was stated that "according to the Veeam Restore Point Simulator, an active full backup uses 50% more capacity on the backup target (i.e. the Data Domain)".

I am not sure which simulator is being referred to here, as the simulator below (the one I know about) does not have an option to specify the storage type (dedupe or non-dedupe).
http://rps.dewin.me/
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by HannesK »

Hello,
the restore point simulator is built for the recommended reference architecture. That means "plain stupid disks" and not deduplication appliances. As you already noted, it doesn't have any selection for storage types.

I'm not sure what the "technical guys" are referring to. The restore point simulator cannot be the source of that statement :-)

Sometimes active full can be faster than synthetic full. That depends on the infrastructure. I don't see how it can take up more space as there is "global" deduplication.

Best regards,
Hannes
Mildur
Product Manager
Posts: 9848
Liked: 2607 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by Mildur »

I don't see how it can take up more space as there is "global" deduplication.
Would Veeam encryption of backup files change that?
Yes, Veeam is aware of the blocks before encryption, but the Data Domain is not.
If Veeam creates a new active full, each block of the VBK is a standalone block, not connected to the old ones.

It would normally get global dedupe on the Data Domain, but because it's a new block when encryption is used, from the Data Domain's point of view it is not the same data. Therefore it needs to be stored separately and global dedupe cannot be used.
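As a toy illustration of why encryption defeats appliance-side deduplication (a simplified model, not DD Boost's fingerprinting or Veeam's actual cipher): the same plaintext block encrypted under two different session keys yields two different byte streams, so the content fingerprints no longer match and the appliance has to store both copies.

Code: Select all

# Toy model of content-based dedupe vs. encrypted backups.
# Simplified for illustration; not the real DD Boost or Veeam algorithms.
import hashlib, os

def fingerprint(block: bytes) -> str:
    # A dedupe appliance identifies duplicate blocks by a content hash.
    return hashlib.sha256(block).hexdigest()

def toy_encrypt(block: bytes, key: bytes) -> bytes:
    # Stand-in stream cipher: XOR with a SHA-256-derived keystream.
    stream = hashlib.sha256(key).digest() * (len(block) // 32 + 1)
    return bytes(b ^ s for b, s in zip(block, stream))

block = b"same guest OS data " * 4             # identical data in two fulls

# Unencrypted: identical blocks -> identical fingerprints -> deduplicated
print(fingerprint(block) == fingerprint(block))                    # True

# Encrypted with different session keys -> different bytes -> no dedupe
key_old, key_new = os.urandom(32), os.urandom(32)
print(fingerprint(toy_encrypt(block, key_old)) ==
      fingerprint(toy_encrypt(block, key_new)))                    # False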
Product Management Analyst @ Veeam Software
Mildur
Product Manager
Posts: 9848
Liked: 2607 times
Joined: May 13, 2017 4:51 pm
Full Name: Fabian K.
Location: Switzerland
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by Mildur »

I have found the answer in the best practice guide and a Veeam KB article, which could lead to the 50% more capacity claim. As I thought, job-level encryption will be an issue for global deduplication.

https://bp.veeam.com/vbr/VBP/3_Build_st ... ation.html
Hardware assisted encryption is available for EMC DataDomain via DDBoost, but must be configured in the integration specific repository configuration. If enabled on the job level data reduction efficiency will be significantly degraded.
And from the kb article:

https://www.veeam.com/kb1745
Encryption
Encryption will create random data at the backup targets; as a result, the deduplication storages will not work effectively. It is recommended not to use encryption with deduplication storages.
Product Management Analyst @ Veeam Software
HannesK
Product Manager
Posts: 14844
Liked: 3086 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Veeam Copy Jobs and Data Domain Synthetic Full

Post by HannesK »

Yes, with Veeam encryption, storage consumption would grow on a deduplication appliance. But I don't see encryption mentioned as a topic in this thread.