VMware CBT bug KB 2090639


by Reimold » Mon Oct 27, 2014 9:53 am

Hello,

I have just read about the CBT bug that affects all ESXi versions, and I have a few questions about it:

- The KB document says "A virtual machine may be at risk if the vmdk file was extended to a size greater than 128 G." Does this mean that a VM with an initial disk size greater than 128 GB would not be affected? (Example: a VM disk of 200 GB expanded to 300 GB.)

- How can I tell if I have a corrupt backup of a VM affected by the CBT bug? Will Instant VM Recovery fail, or will just some files on the restored disk be unreadable? Example: a file server with 2 disks (40 GB, and 700 GB expanded to 900 GB). Will a SureBackup job be able to recognize a corrupt disk?

- Is there a way to bypass wrong CBT information without shutting the VM down to disable CBT? I am thinking of creating a new backup job, as this would result in reading and backing up the whole disk again.

We have a lot of VMs with disks greater than 128 GB that were expanded over the past few years.

Thank you for your comments

Dirk
Reimold
Enthusiast
 
Posts: 33
Liked: 1 time
Joined: Mon Sep 07, 2009 11:58 am
Full Name: Dirk Reimold

Re: VMware CBT bug KB 2090639

by MrSpock » Mon Oct 27, 2014 10:19 am

Let me add one question to Dirk's list:

- Will the CBT be automatically reset if I manually make an "Active Full" backup?

Best regards,

Johan
MrSpock
Enthusiast
 
Posts: 34
Liked: 1 time
Joined: Fri Apr 24, 2009 10:16 pm

Re: VMware CBT bug KB 2090639

by Vitaliy S. » Mon Oct 27, 2014 11:29 am

Hi Dirk and Johan,

Reimold wrote: - The KB document says "A virtual machine may be at risk if the vmdk file was extended to a size greater than 128 G." Does this mean that a VM with an initial disk size greater than 128 GB would not be affected? (Example: a VM disk of 200 GB expanded to 300 GB.)

I have passed this question to our QC team and will let you know after we perform these tests.

Reimold wrote: - How can I tell if I have a corrupt backup of a VM affected by the CBT bug? Will Instant VM Recovery fail, or will just some files on the restored disk be unreadable? Example: a file server with 2 disks (40 GB, and 700 GB expanded to 900 GB). Will a SureBackup job be able to recognize a corrupt disk?

Yes, it is recommended to configure and run SureBackup jobs for all mission-critical VMs you protect with the Veeam B&R server. These jobs will allow you to detect any problems with the boot procedure.

Reimold wrote: - Is there a way to bypass wrong CBT information without shutting the VM down to disable CBT? I am thinking of creating a new backup job, as this would result in reading and backing up the whole disk again.

The CBT data should be reset; our support team should have instructions on how to do that.

MrSpock wrote: - Will the CBT be automatically reset if I manually make an "Active Full" backup?

CBT will not be reset in this case, but the new active full backup should create a new valid restore point. However, if you're affected by this issue, you need to reset CBT first.

Thanks!
Vitaliy S.
Veeam Software
 
Posts: 20357
Liked: 1178 times
Joined: Mon Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: VMware CBT bug KB 2090639

by Reimold » Mon Oct 27, 2014 11:36 am

Vitaliy S. wrote: Yes, it is recommended to configure and run SureBackup jobs for all mission-critical VMs you protect with the Veeam B&R server. These jobs will allow you to detect any problems with the boot procedure.


But I am not talking about boot disks here. In most cases my boot disks are between 40 and 60 GB, but the data disks have often grown beyond 128 GB. Will SureBackup detect problems caused by the CBT bug on those disks too?

Thank you

Dirk

Re: VMware CBT bug KB 2090639

by Vitaliy S. » Mon Oct 27, 2014 12:36 pm

Dirk, no, I believe you will need to check that you have all recent data on these disks manually.

Re: VMware CBT bug KB 2090639

by Reimold » Mon Oct 27, 2014 12:56 pm

Vitaliy S. wrote:Dirk, no, I believe you will need to check that you have all recent data on these disks manually.


So this would mean that I cannot trust any backup I have made of my bigger VMs during the past years, and the statement from Gostev's recent newsletter, "but only SureBackup can guarantee you the ability to recover", does not hold true when we talk about bigger file and database servers.

Is there any indication of how often this bug corrupts a VM backed up with Veeam? Are we talking about any VM that has expanded disks, or only a small percentage?

Thank you

Dirk

Re: VMware CBT bug KB 2090639

by cffit » Mon Oct 27, 2014 1:24 pm (6 people like this post)

I agree with the others on here. The weekly email that brought this topic up was good for informing us of the issue, but it lacked specific details. Since this is such a critical issue, I think an in-depth explanation of what it affects and how to resolve it would be important. Thanks
cffit
Expert
 
Posts: 338
Liked: 33 times
Joined: Fri Jan 20, 2012 2:36 pm
Full Name: Christensen Farms

Re: VMware CBT bug KB 2090639

by JeremyS132 » Mon Oct 27, 2014 1:25 pm

cffit wrote: I agree with the others on here. The weekly email that brought this topic up was good for informing us of the issue, but it lacked specific details. Since this is such a critical issue, I think an in-depth explanation of what it affects and how to resolve it would be important. Thanks


I have to agree with this statement.
JeremyS132
Novice
 
Posts: 8
Liked: never
Joined: Fri Feb 07, 2014 2:40 pm
Full Name: Jeremy Schwarzrock

Re: VMware CBT bug KB 2090639

by maddog2050 » Mon Oct 27, 2014 1:27 pm (2 people like this post)

Hi,

In the community forum digest that came out highlighting this bug, one of the methods of disabling CBT was using PowerCLI. Is this a proven and supported way of disabling CBT? I ask because VMware's stated procedure is to power off the VM, disable CBT, and then power the VM back on. Looking at the script, all it does is disable CBT and then create and remove a snapshot.

Thanks

Adam
maddog2050
Lurker
 
Posts: 1
Liked: 2 times
Joined: Wed Mar 21, 2012 4:45 pm
Full Name: Adam Stirk

[MERGED] SureBackup and CBT bug

by namiko78 » Mon Oct 27, 2014 1:47 pm

Regarding Gostev's forum message (posted below), if I had a D: drive that was expanded and thus affected by the bug, would SureBackup catch this? Would only the D: drive fail to come back online, or does it mean the entire VM would not recover?

>>>>

Unfortunately, I also have some not so good news to share. Earlier this month, VMware quietly published a KB article about a rather terrible CBT bug that exists in all versions of ESX(i) since changed block tracking functionality was first introduced. We have been working directly with VMware to confirm the exact scope of the issue and update the KB article with more details. But the main point is that your backups and replicas for all VMs that had their virtual disk size expanded beyond 128 GB at some point may be unrecoverable. We are working on a hot fix for both 7.0 Patch 4 and 8.0 code branches that will reset CBT automatically upon detecting a source virtual disk size change. Meanwhile, I recommend a manual CBT reset for all VMs that had their virtual disks expanded at some point by disabling CBT (the following Veeam job run will re-enable CBT automatically). Perhaps just disabling CBT on all VMs with a PowerCLI script might be the best idea, but keep in mind that the following job runs will take much longer, so it is best to do this before the weekend.

These kinds of issues always make me stress the importance of SureBackup. Many users consider setting up SureBackup jobs to be a low priority when compared to actual backups; however, only SureBackup is able to catch these kinds of issues. Interestingly enough, a lot of people seem to recognize the importance of backup integrity testing; in fact, our Backup Validator tool seems to be very popular. I do agree that integrity checks are important; in fact, I have dedicated the entire VeeamON breakout session to "classic" data corruption issues. However, integrity checks will not detect corruption issues similar to the above. And yet, these sorts of issues are much more common. I cannot stress this enough, especially in light of enhancements we are adding to our Backup Validator tool in v8. These enhancements are based on your feedback, but they do not mean that Backup Validator is the future. It has its use in detecting storage level corruptions, but only SureBackup can guarantee you the ability to recover.
namiko78
Expert
 
Posts: 110
Liked: 4 times
Joined: Thu Mar 03, 2011 1:49 pm
Full Name: Steven Stirling

Re: VMware CBT bug KB 2090639

by Stoo » Mon Oct 27, 2014 2:06 pm

Has anyone actually come across this bug 'in the wild' yet and have personal experience of how it manifests?

Keen to know whether this will trash the entire 128 GB+ disk's structure and file headers, making it effectively unusable/unmountable. Or, theoretically, if I'm able to use the Windows guest File-Level Restore wizard (which creates vmdk mount points in C:\veeamflr on my backup server) and it successfully enumerates the entire drive's contents and directory structure, should I be in the clear?
Stoo
Service Provider
 
Posts: 5
Liked: never
Joined: Fri Aug 23, 2013 8:42 am
Full Name: Stu P.

Re: VMware CBT bug KB 2090639

by Reimold » Mon Oct 27, 2014 2:47 pm (1 person likes this post)

I opened a ticket with VMware this morning and just got a call from their support:

- They will check whether this problem only affects disks that were expanded from under 128 GB to a size greater than 128 GB, and get back to me.
- There is no fix available in the near future.
- There are not many support requests about the CBT bug, and only one real case is linked to the KB document.
- To disable CBT, the VM has to be powered off; no other way is available.

Re: VMware CBT bug KB 2090639

by jklimo » Mon Oct 27, 2014 3:00 pm

MrSpock wrote:Let me add one question to Dirk's list:

- Will the CBT be automatically reset if I manually make an "Active Full" backup?

Best regards,

Johan


Great question. I've been researching this as well, and haven't found clarity/details yet re: CBT when Veeam completes Active Full backup jobs.

John
jklimo
Lurker
 
Posts: 1
Liked: never
Joined: Wed Apr 30, 2014 2:20 pm
Full Name: Johnathan Klimo

Re: VMware CBT bug KB 2090639

by Khue » Mon Oct 27, 2014 3:17 pm

Reimold wrote:- to disable CBT the VM has to be powered off - no other way is available


Data Protector (an HP product) used to have CBT issues all the time, which would require you to disable and re-enable CBT. You may want to check, but I'd imagine you also have to delete the existing change block database (*.ctk files).

jklimo wrote: I've been researching this as well, and haven't found clarity/details yet re: CBT when Veeam completes Active Full backup jobs.


Just a guess based on my statement above, but I would imagine CBT would have to be completely turned off and then re-enabled by the job. The takeaway is that the change block tracking database gets FUBAR'd and the only workaround is to wipe it and start from scratch. I would imagine that when you add blocks to the vmdk past 128 GB, the CBT database has to go through some process of growth, and VMware's method of adding that to the existing database is destructive.
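To see what state a VM's change tracking is in before and after a reset, something like the following could be used. It is only a sketch under assumptions: PowerCLI is loaded, a `Connect-VIServer` session already exists, and `fileserver01` is a hypothetical VM name. It reports the VM-level CBT flag and lists the `-ctk.vmdk` files that vSphere records in the VM's file layout; it does not validate the contents of those files.

```powershell
# Sketch only: show the CBT flag and the change tracking (-ctk.vmdk) files of one VM.
# 'fileserver01' is a hypothetical VM name; replace it with a real one.
$vm = Get-VM -Name 'fileserver01'

# VM-level flag: True when change tracking is enabled in the VM configuration.
$vm.ExtensionData.Config.ChangeTrackingEnabled

# Files that vSphere has recorded for this VM, filtered to the change tracking files.
$vm.ExtensionData.LayoutEx.File |
    Where-Object { $_.Name -like '*-ctk.vmdk' } |
    Select-Object Name, Size
```

The recorded file layout may lag behind what is actually on the datastore, so treat the list as an indication rather than proof.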
Khue
Enthusiast
 
Posts: 60
Liked: 3 times
Joined: Thu Sep 26, 2013 6:01 pm

Re: VMware CBT bug KB 2090639

by dzeleski » Mon Oct 27, 2014 3:23 pm (1 person likes this post)

Reimold wrote:- to disable CBT the VM has to be powered off - no other way is available


As far as I am aware, you can set CBT to disabled, snapshot the VM, and remove the snapshot, and that should reset CBT (this can be done with PowerCLI). There are a few blogs stating this is the case; I'm waiting on my callback from Veeam now.

I know for a fact that a storage vMotion will reset CBT as well.

Code:
# Disables CBT on one VM or on every VM in a cluster, then creates and
# removes a snapshot. Assumes an existing Connect-VIServer session.
$choice = Read-Host 'Press 1 to select a VM or press 2 to select a cluster'

switch ($choice)
{
    '1' {
        $name = Read-Host 'Please type in a VM name to reset CBT on'
        $vms = Get-VM -Name $name
    }
    '2' {
        $name = Read-Host 'Please type in a cluster name to reset CBT on'
        $vms = Get-Cluster -Name $name | Get-VM
    }
    default {
        Write-Host 'Invalid input, exiting...'
        $vms = @()   # nothing to process
    }
}

# The same reconfiguration spec is used for every VM: change tracking off.
$spec = New-Object VMware.Vim.VirtualMachineConfigSpec
$spec.ChangeTrackingEnabled = $false

foreach ($vm in $vms)
{
    $vm.ExtensionData.ReconfigVM($spec)
    # Creating and removing a snapshot is intended to make the change
    # take effect without powering the VM off.
    $snap = $vm | New-Snapshot -Name 'Disable CBT'
    $snap | Remove-Snapshot -Confirm:$false
}
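Before resetting CBT everywhere, it may help to see which VMs are candidates. The following is a minimal sketch, not a tested procedure: it assumes PowerCLI is loaded and a `Connect-VIServer` session already exists, and it lists disks larger than 128 GB on VMs that currently have CBT enabled. The 128 GB threshold is taken from the KB wording quoted earlier in this thread. The script cannot tell whether a disk was ever expanded, so the output is only a starting point for manual review.

```powershell
# Sketch only: list disks over 128 GB on VMs that have CBT enabled.
# Assumes PowerCLI is loaded and Connect-VIServer has already been run.
$thresholdGB = 128

Get-VM |
    Where-Object { $_.ExtensionData.Config.ChangeTrackingEnabled } |
    Get-HardDisk |
    Where-Object { $_.CapacityGB -gt $thresholdGB } |
    Select-Object @{ Name = 'VM'; Expression = { $_.Parent.Name } },
                  Name,
                  CapacityGB,
                  Filename |
    Sort-Object VM, Name
```

Piping the result to `Export-Csv` would give a list that can be checked against change records for past disk expansions.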
dzeleski
Novice
 
Posts: 3
Liked: 1 time
Joined: Fri Sep 12, 2014 2:54 am
Full Name: Dylan Zeleski
