Discussions specific to the VMware vSphere hypervisor
Locked
Reimold
Enthusiast
Posts: 34
Liked: 1 time
Joined: Sep 07, 2009 11:58 am
Full Name: Dirk Reimold
Contact:

VMware CBT bug KB 2090639

Post by Reimold » Oct 27, 2014 9:53 am

Hello,

I have just read about the CBT bug that affects all ESXi versions, and I have a few questions about it:

- The KB document tells me "A virtual machine may be at risk if the vmdk file was extended to a size greater than 128 G." Does this mean that a VM with an initial disk size greater than 128 GB would not be affected? (Example: a VM disk of 200 GB expanded to 300 GB.)

- How can I tell if I have a corrupt backup of a VM affected by the CBT bug? Will Instant VM Recovery fail, or will just some files on the restored disk be unreadable? Example: a file server with 2 disks (40 GB, and 700 GB expanded to 900 GB). Will a SureBackup job be able to recognize a corrupt disk?

- Is there a way to bypass wrong CBT information without shutting the VM down to disable CBT? I am thinking of creating a new backup job, as this would result in reading and backing up the whole disk again.

We have a lot of VMs with disks greater than 128 GB that were expanded over the past few years.

Thank you for your comments

Dirk

MrSpock
Enthusiast
Posts: 34
Liked: 1 time
Joined: Apr 24, 2009 10:16 pm
Contact:

Re: VMware CBT bug KB 2090639

Post by MrSpock » Oct 27, 2014 10:19 am

Let me add one question to Dirk's list:

- Will the CBT be automatically reset if I manually make an "Active Full" backup?

Best regards,

Johan

Vitaliy S.
Product Manager
Posts: 22056
Liked: 1368 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: VMware CBT bug KB 2090639

Post by Vitaliy S. » Oct 27, 2014 11:29 am

Hi Dirk and Johan,
Reimold wrote: - The KB document tells me "A virtual machine may be at risk if the vmdk file was extended to a size greater than 128 G." Does this mean that a VM with an initial disk size greater than 128 GB would not be affected? (Example: a VM disk of 200 GB expanded to 300 GB.)
I have passed this question to our QC team and will let you know after we perform these tests.
Reimold wrote: - How can I tell if I have a corrupt backup of a VM affected by the CBT bug? Will Instant VM Recovery fail, or will just some files on the restored disk be unreadable? Example: a file server with 2 disks (40 GB, and 700 GB expanded to 900 GB). Will a SureBackup job be able to recognize a corrupt disk?
Yes, it is recommended to configure and run SureBackup jobs for all mission critical VMs you protect with Veeam B&R server. These jobs will allow you to detect all problems with the boot procedure.
Reimold wrote: - Is there a way to bypass wrong CBT information without shutting the VM down to disable CBT? I am thinking of creating a new backup job, as this would result in reading and backing up the whole disk again.
CBT data should be reset; our support team should have instructions on how to do that.
MrSpock wrote: - Will the CBT be automatically reset if I manually make an "Active Full" backup?
CBT will not be reset in this case. A new active full backup should create a valid restore point, but if you are affected by this issue you need to reset CBT first.

Thanks!

Reimold
Enthusiast
Posts: 34
Liked: 1 time
Joined: Sep 07, 2009 11:58 am
Full Name: Dirk Reimold
Contact:

Re: VMware CBT bug KB 2090639

Post by Reimold » Oct 27, 2014 11:36 am

Vitaliy S. wrote: Yes, it is recommended to configure and run SureBackup jobs for all mission critical VMs you protect with Veeam B&R server. These jobs will allow you to detect all problems with the boot procedure.
But I am not talking about boot disks here. In most cases my boot disks are between 40 and 60 GB, but data disks have often grown beyond 128 GB. Will SureBackup detect CBT-bug problems on those disks too?

Thank you

Dirk

Vitaliy S.
Product Manager
Posts: 22056
Liked: 1368 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: VMware CBT bug KB 2090639

Post by Vitaliy S. » Oct 27, 2014 12:36 pm

Dirk, no, I believe you will need to check that you have all recent data on these disks manually.

Reimold
Enthusiast
Posts: 34
Liked: 1 time
Joined: Sep 07, 2009 11:58 am
Full Name: Dirk Reimold
Contact:

Re: VMware CBT bug KB 2090639

Post by Reimold » Oct 27, 2014 12:56 pm

Vitaliy S. wrote:Dirk, no, I believe you will need to check that you have all recent data on these disks manually.
So this would mean that I cannot trust any backup I have made of my bigger VMs during the past years, and that the statement from Gostev's recent newsletter, "but only SureBackup can guarantee you the ability to recover.", does not hold true when we talk about bigger file and database servers.

Is there any indication of how often this bug corrupts a VM backed up with Veeam? Are we talking about any VM that has expanded disks, or only a small percentage?

Thank you

Dirk

cffit
Expert
Posts: 338
Liked: 34 times
Joined: Jan 20, 2012 2:36 pm
Full Name: Christensen Farms
Contact:

Re: VMware CBT bug KB 2090639

Post by cffit » Oct 27, 2014 1:24 pm 6 people like this post

I agree with the others on here. The weekly email that brought this topic up was good to inform us of the issue, but it lacked specifics. Seeing as this is such a critical issue, I think an in-depth explanation of what it affects and how to resolve it would be important. Thanks

JeremyS132
Novice
Posts: 8
Liked: never
Joined: Feb 07, 2014 2:40 pm
Full Name: Jeremy Schwarzrock
Contact:

Re: VMware CBT bug KB 2090639

Post by JeremyS132 » Oct 27, 2014 1:25 pm

cffit wrote: I agree with the others on here. The weekly email that brought this topic up was good to inform us of the issue, but it lacked specifics. Seeing as this is such a critical issue, I think an in-depth explanation of what it affects and how to resolve it would be important. Thanks
I have to agree with this statement.

maddog2050
Lurker
Posts: 1
Liked: 2 times
Joined: Mar 21, 2012 4:45 pm
Full Name: Adam Stirk
Contact:

Re: VMware CBT bug KB 2090639

Post by maddog2050 » Oct 27, 2014 1:27 pm 2 people like this post

Hi,

In the community forum digest that came out highlighting this bug, one of the methods of disabling CBT was using PowerCLI. Is this a proven and supported way of disabling CBT? VMware states that you should power off the VM, disable CBT and then power the VM back on. Looking at the script, all it does is disable CBT and then create and remove a snapshot.

Thanks

Adam

namiko78
Expert
Posts: 113
Liked: 4 times
Joined: Mar 03, 2011 1:49 pm
Full Name: Steven Stirling
Contact:

[MERGED] Surebackup and CBT bug

Post by namiko78 » Oct 27, 2014 1:47 pm

Regarding Gostev's forum message (posted below): if I had a D: drive that was expanded and thus affected by the bug, would SureBackup catch this? Would only the D: drive fail to come back online, or does it mean the entire VM would not recover?

>>>>

Unfortunately, I also have some not so good news to share. Earlier this month, VMware quietly published a KB article about a rather terrible CBT bug that exists in all versions of ESX(i) since changed block tracking functionality was first introduced. We have been working directly with VMware to confirm the exact scope of the issue and update the KB article with more details. But the main point is that your backups and replicas for all VMs that had their virtual disk size expanded beyond 128 GB at some point may be unrecoverable. We are working on a hot fix for both 7.0 Patch 4 and 8.0 code branches that will reset CBT automatically upon detecting source virtual disk size change. Meanwhile, I recommend manual CBT reset for all VMs that had their virtual disks expanded at some point by disabling CBT (the following Veeam job run will re-enable CBT automatically). Perhaps, just disabling CBT on all VMs with a PowerCLI script might be the best idea - but keep in mind that the following job runs will take much longer, so it is best to do this before the weekend.

These kinds of issues always make me stress the importance of SureBackup. Many users consider setting up SureBackup jobs to be a low priority when compared to actual backups - however, only SureBackup is able to catch these kinds of issues. Interestingly enough, a lot of people seem to recognize the importance of backup integrity testing, in fact our Backup Validator tool seems to be very popular. I do agree that integrity checks are important, in fact I have dedicated the entire VeeamON breakout session to "classic" data corruption issues. However, integrity checks will not detect corruption issues similar to the above. And yet, these sorts of issues are much more common. I cannot stress this enough, especially in light of enhancements we are adding to our Backup Validator tool in v8. These enhancements are based on your feedback, but they do not mean that Backup Validator is the future. It has its use in detecting storage level corruptions, but only SureBackup can guarantee you the ability to recover.

Stoo
Service Provider
Posts: 5
Liked: never
Joined: Aug 23, 2013 8:42 am
Full Name: Stu P.
Contact:

Re: VMware CBT bug KB 2090639

Post by Stoo » Oct 27, 2014 2:06 pm

Has anyone actually come across this bug 'in the wild' yet and have personal experience of how it manifests?

Keen to know whether this will trash the structure and file headers of the entire 128 GB+ disk, making it effectively unusable/unmountable. Or, theoretically, if I'm able to use the Windows guest File-Level Restore wizard (which creates vmdk mount points in C:\veeamflr on my backup server) and it successfully enumerates the entire drive's contents and directory structure, should I be in the clear?

Reimold
Enthusiast
Posts: 34
Liked: 1 time
Joined: Sep 07, 2009 11:58 am
Full Name: Dirk Reimold
Contact:

Re: VMware CBT bug KB 2090639

Post by Reimold » Oct 27, 2014 2:47 pm 1 person likes this post

I opened a ticket with VMware this morning and just got a call from their support:

- they will check whether this problem only affects disks that are expanded from under 128 GB to a size greater than 128 GB, and get back to me
- there is no fix available in the near future
- there are not many support requests about the CBT bug, and only one real case is linked to that KB document
- to disable CBT the VM has to be powered off - no other way is available

jklimo
Lurker
Posts: 1
Liked: never
Joined: Apr 30, 2014 2:20 pm
Full Name: Johnathan Klimo
Contact:

Re: VMware CBT bug KB 2090639

Post by jklimo » Oct 27, 2014 3:00 pm

MrSpock wrote:Let me add one question to Dirk's list:

- Will the CBT be automatically reset if I manually make an "Active Full" backup?

Best regards,

Johan
Great question. I've been researching this as well, and haven't found clarity/details yet re: CBT when Veeam completes Active Full backup jobs.

John

Khue
Enthusiast
Posts: 63
Liked: 3 times
Joined: Sep 26, 2013 6:01 pm
Contact:

Re: VMware CBT bug KB 2090639

Post by Khue » Oct 27, 2014 3:17 pm

Reimold wrote: - to disable CBT the VM has to be powered off - no other way is available
Data Protector (HP Product) used to have CBT issues all the time which would require you to disable and re-enable CBT. You may want to check, but I'd imagine you also have to delete the existing change block database (*.ctk files).
jklimo wrote: I've been researching this as well, and haven't found clarity/details yet re: CBT when Veeam completes Active Full backup jobs.
Just a guess based on my statement above, but I would imagine CBT would have to be completely turned off and then re-enabled by the job. The takeaway is that the change block tracking database gets fubar'd and the only workaround is to wipe it and start from scratch. I would imagine that when you add blocks to the vmdk past 128 gigs, the CBT database has to go through some process of growth, and VMware's method of adding that to the existing database is destructive.

dzeleski
Novice
Posts: 3
Liked: 1 time
Joined: Sep 12, 2014 2:54 am
Full Name: Dylan Zeleski
Contact:

Re: VMware CBT bug KB 2090639

Post by dzeleski » Oct 27, 2014 3:23 pm 1 person likes this post

Reimold wrote: - to disable CBT the VM has to be powered off - no other way is available
As far as I am aware, you can set CBT to disabled, snapshot the VM, and remove the snapshot, and that should reset CBT (this can be done with PowerCLI). There are a few blogs stating this is the case; I'm waiting on my call back from Veeam now.

I know for a fact that a storage vMotion will reset CBT as well.

Code: Select all

# Reset CBT by disabling it, then creating and removing a snapshot
# so that the configuration change takes effect on the running VM.
$choice = Read-Host 'Press 1 to select a VM or Press 2 to select a Cluster.'

switch ($choice)
{
    1 {
        $vmName = Read-Host 'Please type in a VM name to reset CBT on.'
        $vms = Get-VM -Name $vmName
    }
    2 {
        $clusterName = Read-Host 'Please type in a Cluster name to reset CBT on.'
        $vms = Get-Cluster -Name $clusterName | Get-VM
    }
    default {
        Write-Host 'Invalid input, exiting...'
        $vms = @()
    }
}

# Handle each VM on its own so a cluster selection is processed one VM at a time.
foreach ($vm in $vms)
{
    $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
    $spec.ChangeTrackingEnabled = $false
    $vm.ExtensionData.ReconfigVM($spec)

    $snap = $vm | New-Snapshot -Name 'Disable CBT'
    $snap | Remove-Snapshot -Confirm:$false
}
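
For anyone trying to work out which VMs to look at first, here is a minimal sketch, not a tested or supported procedure. It assumes PowerCLI is loaded and that you are already connected with Connect-VIServer. It lists disks that are currently larger than 128 GB on VMs where CBT is enabled. It cannot tell whether a disk was ever expanded, so the output is only a list of candidates; the 128 GB threshold comes from the KB wording quoted earlier in this thread.

```powershell
# Sketch only: list disks currently larger than 128 GB on VMs that have CBT enabled.
# Assumes VMware PowerCLI is loaded and Connect-VIServer has already been run.
# Current size says nothing about past expansion, so treat the results as candidates.
$thresholdGB = 128

Get-VM |
    Where-Object { $_.ExtensionData.Config.ChangeTrackingEnabled } |
    Get-HardDisk |
    Where-Object { $_.CapacityGB -gt $thresholdGB } |
    Select-Object @{ Name = 'VM'; Expression = { $_.Parent.Name } }, Name, CapacityGB |
    Sort-Object VM, Name
```

The script is read-only: it does not change CBT settings or create snapshots.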
