CSmith686
Novice
Posts: 4
Liked: never
Joined: Jun 14, 2023 4:17 pm

CBT and DFSR

Post by CSmith686 »

Hello everyone. Hoping someone here has some knowledge of VMware CBT and Microsoft's DFSR. We have been using Exagrid as our backup infrastructure since April of this year. Since then we have noticed that incremental backups range between 500GB and 1TB in size. My synthetic full backups have decreased since April (from 24TB to 19TB), but my incrementals still run between 500GB and 1TB, and we are running out of space.

The VM in question is running Windows Server 2019 as a file server configured with DFSR. The Veeam backup job has CBT enabled.

Working with Veeam (Case #06260675) and my Exagrid rep, we suspect DFSR and CBT may be the cause. From what I know about DFSR, it uses something called RDC to break a file into "chunks" and send only the differences to the remote server. This is similar to how CBT works.

We also have a department moving data to a cloud hosted storage provider. The process consists of exporting on-prem data as a zip file, extracting the data, and moving it to a staging location to be uploaded to the cloud provider. I can see this causing an increase in changed blocks. If I understand correctly, when a 250GB file is downloaded and then moved/copied to a new location, CBT will see this as a 500GB change in blocks, correct? This download will also get staged in DFSR, causing another change in blocks.

Thanks, and I look forward to your replies and comments.
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria

Re: CBT and DFSR

Post by HannesK » 1 person likes this post

Hello,
CBT / changed block tracking helps to find changes faster. The changes could also be found by doing a full scan, but a full scan would take longer. So I would take CBT out of the question and focus on "changed blocks".

If you download a 250GB file, that causes at least 250GB of change. If you copy it, that's another 250GB, correct. For a move it depends... on the same file system it would not really "move", but just change the "logical location" in the master file table / file system index. If you move to a different file system, that would also add 250GB of changed blocks, yes.
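To make that arithmetic concrete, here is a minimal Python sketch of the rules above. It is only a lower-bound estimate: the `FileOp` type and the function names are made up for illustration, and it ignores metadata updates, journaling, and anything else the file system writes on the side.

```python
from dataclasses import dataclass

@dataclass
class FileOp:
    """One file operation inside the guest (hypothetical helper type)."""
    kind: str                      # "download", "copy", or "move"
    size_gb: float                 # size of the file involved
    same_filesystem: bool = True   # only relevant for "move"

def changed_gb(op: FileOp) -> float:
    """Lower-bound estimate of changed blocks (in GB) caused by one operation."""
    if op.kind in ("download", "copy"):
        # New data is written to disk, so every block of the file changes.
        return op.size_gb
    if op.kind == "move":
        # Same file system: only the index entry changes, data blocks stay put.
        # Different file system: the data is rewritten at the destination.
        return 0.0 if op.same_filesystem else op.size_gb
    raise ValueError(f"unknown operation kind: {op.kind!r}")

def total_changed_gb(ops) -> float:
    """Sum the lower-bound estimates for a sequence of operations."""
    return sum(changed_gb(op) for op in ops)

if __name__ == "__main__":
    ops = [FileOp("download", 250), FileOp("copy", 250)]
    print(total_changed_gb(ops))
```

A download followed by a copy of the same 250GB file adds up to 500GB here, while a move within one file system adds nothing.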

High change rates are often caused by defragmentation or in-guest deduplication. Remote Differential Compression (RDC) might be relevant, but it does not sound logical to me why that mechanism would move more data around than needed.

Best regards,
Hannes
CSmith686
Novice
Posts: 4
Liked: never
Joined: Jun 14, 2023 4:17 pm

Re: CBT and DFSR

Post by CSmith686 »

Thanks for the reply.

I forgot to mention we have disabled all defrag jobs and OS deduplication. I'm not 100% convinced it's related to DFSR but this is the only other process that would produce changes in blocks.
david.domask
Veeam Software
Posts: 1226
Liked: 322 times
Joined: Jun 28, 2016 12:12 pm

Re: CBT and DFSR

Post by david.domask » 1 person likes this post

Hi @CSmith686,

More or less you're right. Even though the actual data likely didn't change, DFSR does a bunch of stuff that the hypervisor will see as "changed", even if nothing changed. I think this happens even without RDC, but don't quote me.

https://learn.microsoft.com/en-us/windo ... l-machines

I know the section is about Azure virtual machines, but the 2nd bullet point is quite relevant even for Hyper-V/VMware VMs and real-iron boxes. This is a good use case for Veeam Agent for Windows IMO, and depending on how your replication schedule is set, consider just backing up the replication partners to keep the primary server free to serve your users. File level restores will be like introducing "new" files into the replication folders anyway, so there isn't a lot of sense in trying to keep the entire environment going.

Similarly, consider NAS Backup; it can do a lot to help here also, but there are instance costs to consider. Probably I will make someone grumpy with this statement, but for DFSR, I would just use Veeam Agent for Windows and back it up once, maybe twice a day if there's usually a huge churn in the data. It's replicating anyways, so really you just need the backups for the real "it hits the fan" moments and maybe when someone nuked something on the prod.
David Domask | Product Management: Principal Analyst
dali@iae.nl
Enthusiast
Posts: 71
Liked: 13 times
Joined: Jan 17, 2022 10:31 am
Full Name: Da Li

Re: CBT and DFSR

Post by dali@iae.nl » 1 person likes this post

Since I see you use Exagrid, you must know exactly how the Landing Zone and Retention Zone work and have the best practices in place. That matters especially in combination with synthetic fulls, which are a killer for any deduplication appliance (unless you have configured the Fast Cloning feature on the Exagrid, which is present in the latest Exagrid firmware). If you haven't, it is better to take, for example, a new weekly full backup, which lands on the fast Landing Zone and is then deduplicated to the Retention Zone.
The question is where your capacity issue is: on the Landing Zone or on the Retention Zone.
And where do you measure the incrementals: from the GB reported in the job or from the Exagrid console?

So it depends on which Exagrid firmware you are on and how the jobs are configured.
Seve CH
Enthusiast
Posts: 69
Liked: 32 times
Joined: May 09, 2016 2:34 pm
Full Name: JM Severino
Location: Switzerland

Re: CBT and DFSR

Post by Seve CH » 1 person likes this post

Hi
If I recall correctly, DFSR copies every single changed file to a staging area. Then it does its differential thing and sends the changes to the replication partners. Once the changes are replicated, the staged file is eventually deleted. You most probably have a huge amount of changed blocks there, hence the big incrementals.
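As a rough illustration of why that adds up, here is a small Python sketch of the model described above. It is not DFSR's actual algorithm: the function name and the `staging_ratio` parameter are assumptions for illustration, and the model simply assumes each changed file is written once more to the staging area on top of the in-place modification.

```python
def estimated_changed_bytes(changed_files, staging_ratio=1.0):
    """Estimate changed bytes per backup interval under a simple staging model.

    changed_files: iterable of (file_size_bytes, bytes_modified_in_place).
    staging_ratio: assumed fraction of the file size written to staging
                   (1.0 = a full copy; lower it if staged copies are smaller).
    """
    total = 0
    for file_size, modified in changed_files:
        staged = int(file_size * staging_ratio)  # extra write into staging
        total += modified + staged               # in-place change + staged copy
    return total

if __name__ == "__main__":
    MB = 1024 * 1024
    # 1,000 files of 100 MB each, with only 1 MB actually modified per file.
    files = [(100 * MB, 1 * MB)] * 1000
    print(estimated_changed_bytes(files) / MB, "MB changed (model estimate)")
```

Under these assumptions a small edit to a large file costs roughly the whole file in changed blocks, which is the effect the backup sees.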

I do not know your use case for DFSR but in most cases it's a bad design decision (it only synchronizes on file close, so you may have files that stay open for months which are never synchronized).

Best regards
Seve
CSmith686
Novice
Posts: 4
Liked: never
Joined: Jun 14, 2023 4:17 pm

Re: CBT and DFSR

Post by CSmith686 »

Thank you all for your input.

We were able to reduce our incrementals to an acceptable level. We discovered several Conflict/Deleted and PreExisting folders in DFSR that collectively held several terabytes of data that was no longer needed, and we reduced the overall cache of some folders. This reduced the size of our System Volume Information folder from 3.9 TB to 2.1 TB.
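For anyone wanting to hunt for similar space hogs, a minimal Python sketch along these lines could total the bytes under a list of candidate folders and sort them largest first. The function names are made up for illustration, and the folder paths are whatever you pass in; reading protected locations such as System Volume Information will typically need elevated rights.

```python
import os

def dir_size_bytes(root: str) -> int:
    """Total size in bytes of all files under root (unreadable files are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size
            except OSError:
                # File vanished or access denied: leave it out of the total.
                continue
    return total

def largest_first(roots):
    """Return (path, size_bytes) pairs sorted from largest to smallest."""
    sizes = [(root, dir_size_bytes(root)) for root in roots]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    import sys
    # Usage: python folder_sizes.py <folder> [<folder> ...]
    for path, size in largest_first(sys.argv[1:]):
        print(f"{size / 1024**3:10.2f} GiB  {path}")
```

Running it before and after a cleanup gives a quick comparison of how much each folder shrank.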

We also created a location outside of DFSR for the department that is moving data to a cloud provider. This also helped in reducing the overall incremental size.

Fast Cloning is enabled and we have upgraded to the latest firmware on the Exagrids.