Hi,
We have a customer with a number of VMs in our VMware environment.
These VMs are quite big and hold a lot of data: 4-10 TB on each VM.
They have enabled Deduplication within the OS on these servers, which run Windows Server 2012 R2 and later.
We use Veeam Backup & Replication to take nightly backups through vSphere.
The backups are quite big, and I'm wondering how Veeam CBT works with VMs that have Deduplication enabled within the OS.
How will Veeam consider changed blocks on these VMs?
Are blocks generally changed when a Deduplication Optimization job has run and finished? I guess it depends on the dedup ratio and so on.
Perhaps the scheduled times of the Dedup Optimization and the Veeam backup jobs are also a factor?
That's a lot of questions, but it can sometimes be a bit tricky to get the whole picture.
Hopefully someone can explain this better and help me figure it all out.
/Fredrik
-
- Service Provider
- Posts: 12
- Liked: 1 time
- Joined: Nov 13, 2018 12:27 pm
- Full Name: Fredrik Örtenholm
- Contact:
-
- Service Provider
- Posts: 372
- Liked: 120 times
- Joined: Nov 25, 2016 1:56 pm
- Full Name: Mihkel Soomere
- Contact:
Re: Windows DeDup and CBT
When a file is added to the file system, it gets picked up by CBT.
When a file is optimized, the metadata changes get picked up by CBT. If the file consumed extra space in the chunk store, that gets picked up by CBT too. If you have thin VMDKs, the original file *may* be UNMAP-ed and shrink the VMDK. The original file is excluded by BitLooker (as it no longer exists).
When garbage collection runs, the new chunk store parts get picked up by CBT. If you have thin VMDKs, the old chunk store files *may* be UNMAP-ed and shrink the VMDK. The original chunk store file is excluded by BitLooker, as it no longer exists.
It's all about the timing of when Veeam picks up these changes. You could, for example, tune dedup to start just after the workday (and let it run overnight, depending on your change rate) and let Veeam run in the morning, when dedup has finished and there's the least amount of data to be collected, i.e. all files have been optimized, GC has run and everything is in its "final" state.
The GC part is the worst, as by default every 4th GC is a full one. Even if only one byte can be cleaned up in a chunk store file (1 GB each), the whole 1 GB file is copied (and picked up by CBT).
I would recommend you disable full GC, as the default one is usually good enough: https://support.microsoft.com/en-us/hel ... cause-perf
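To put a rough number on the full GC effect, here is a back-of-the-envelope sketch in Python. It only assumes what is described above: chunk store files of about 1 GB that are copied whole when anything in them is cleaned up. The function name and the example figures (a 4 TB chunk store, 10% of files touched) are made up for illustration, not measured values.

```python
import math


def full_gc_changed_gb(chunk_store_gb, fraction_rewritten, file_size_gb=1.0):
    """Estimate how much data CBT would flag after a full GC.

    Assumption: any chunk store file that GC touches is copied whole,
    so the changed amount is (files touched) x (file size), no matter
    how few bytes were actually reclaimed.
    """
    total_files = math.ceil(chunk_store_gb / file_size_gb)
    files_rewritten = math.ceil(total_files * fraction_rewritten)
    return files_rewritten * file_size_gb


# Hypothetical example: 4 TB chunk store, full GC touches 10% of the files.
print(full_gc_changed_gb(4096, 0.10))
```

Under those assumed numbers, roughly 410 GB would land in the next incremental even if GC freed almost nothing.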
-
- Veteran
- Posts: 643
- Liked: 312 times
- Joined: Aug 04, 2019 2:57 pm
- Full Name: Harvey
- Contact:
Re: Windows DeDup and CBT
I have little to add to DonZoomik's answer but I do want to offer a different frame of reference for you:
>How will Veeam consider changed blocks on these VMs?
Veeam determines nothing. The way CBT works is that, as an application developer, you record a ChangeID with your last backup, and on the next backup you send that most recent ChangeID. VMware checks this against a hash table for the CBT data and returns all blocks changed since that time.
It sounds pedantic, but it's important to understand this to know where to look. No backup application, unless it rolls its own CBT solution (which Veeam does not), has control over what the hypervisor returns as an answer. I've fought this with countless clients and VMware/MS, and while I understand the thought process, we have to accept that this is coming from the respective hypervisor, not the backup application (if the backup app is doing it right).
So when you see strangeness in CBT, look to the hypervisor.
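To make the ChangeID flow concrete, here is a toy Python model. It is not the vSphere API (there, the corresponding call is `QueryChangedDiskAreas`); `ToyDisk`, `incremental_backup` and the epoch counter are invented for illustration only. The point it shows is that the tracking lives on the disk/hypervisor side, and the backup side only stores an ID and asks "what changed since this?".

```python
class ToyDisk:
    """Toy stand-in for a hypervisor-side virtual disk with change tracking."""

    def __init__(self):
        self.epoch = 0              # bumped every time a change ID is handed out
        self.last_write_epoch = {}  # block index -> epoch of its last write

    def write(self, block):
        # Every guest write is recorded, whatever caused it: user data,
        # dedup optimization or garbage collection all look the same here.
        self.last_write_epoch[block] = self.epoch

    def current_change_id(self):
        # Hand out an ID meaning "now" and start a new tracking epoch.
        change_id = self.epoch
        self.epoch += 1
        return change_id

    def query_changed_blocks(self, change_id):
        # Blocks written after the given change ID was issued.
        return sorted(b for b, e in self.last_write_epoch.items() if e > change_id)


def incremental_backup(disk, last_change_id):
    """Backup side: ask the disk what changed, then remember the new ID."""
    changed = disk.query_changed_blocks(last_change_id)
    return changed, disk.current_change_id()


disk = ToyDisk()
disk.write(1)
disk.write(2)
saved_id = disk.current_change_id()   # recorded with the (full) backup

disk.write(2)                         # e.g. a user edits a file
disk.write(5)                         # e.g. dedup rewrites a chunk store file
changed, saved_id = incremental_backup(disk, saved_id)
print(changed)
```

Note that `incremental_backup` has no way to tell why block 5 changed. It just receives the list, which is why the place to look is the guest and the hypervisor rather than the backup job.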
-
- Service Provider
- Posts: 12
- Liked: 1 time
- Joined: Nov 13, 2018 12:27 pm
- Full Name: Fredrik Örtenholm
- Contact:
Re: Windows DeDup and CBT
Thanks for a well-written explanation.
It all gets a little bit clearer to me how this works.