Discussions specific to the VMware vSphere hypervisor
willn
Influencer
Posts: 15
Liked: never
Joined: May 07, 2012 1:47 pm
Full Name: William Nelson
Contact:

Re: There's a new CBT bug in ESXi 6

Post by willn » Nov 16, 2015 2:03 pm

I am not sure if it happens every time. I have restored Exchange mailbox items as of Friday.

mdmd
Enthusiast
Posts: 37
Liked: 1 time
Joined: Jan 06, 2014 10:29 am
Full Name: Mike
Contact:

Re: There's a new CBT bug in ESXi 6

Post by mdmd » Nov 16, 2015 2:38 pm

We run a reverse-incremental backup scheme for all our 100+ VMs and use 6.0 U1.

I had to restore our SharePoint front end a couple of weeks ago, and all went fine. I just ran a chkdsk on that VM, and everything reported fine. The VM was 100+ GB.

We have also done various file-level and Exchange mailbox restores without a hitch.

dellock6
Veeam Software
Posts: 5653
Liked: 1589 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: There's a new CBT bug in ESXi 6

Post by dellock6 » Nov 16, 2015 4:03 pm 1 person likes this post

The problem with this bug is that it is not consistent: it "may" report wrong blocks when reading CBT information, which is probably why you (and many others) have not seen the error. The suggestion to disable CBT entirely until there's a patch from VMware is meant to avoid even that small chance of getting caught by the bug.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2019
Veeam VMCE #1
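
The failure mode described in the post above (CBT occasionally reporting the wrong set of changed blocks) can be illustrated with a minimal Python sketch. This is a toy model, not how ESXi or Veeam are implemented: disks are plain lists of block contents, the CBT answer is a set of block indices, and the names `incremental_with_cbt`, `true_changes`, and `buggy_changes` are hypothetical.

```python
# Toy model (assumption): a disk is a list of block contents and
# CBT is a set of indices of blocks changed since the last backup.

def incremental_with_cbt(backup, disk, changed_blocks):
    """Copy only the blocks that change tracking reports as changed."""
    updated = list(backup)
    for i in changed_blocks:
        updated[i] = disk[i]
    return updated

previous_backup = ["a0", "b0", "c0", "d0"]
current_disk = ["a0", "b1", "c1", "d0"]  # blocks 1 and 2 changed

true_changes = {1, 2}   # what correct tracking would report
buggy_changes = {1}     # hypothetical: the change to block 2 was lost

good = incremental_with_cbt(previous_backup, current_disk, true_changes)
bad = incremental_with_cbt(previous_backup, current_disk, buggy_changes)

# Blocks in the "bad" backup that no longer match the real disk.
stale = [i for i, (b, d) in enumerate(zip(bad, current_disk)) if b != d]

print(good == current_disk)  # True
print(stale)                 # [2]
```

Because the job trusts the tracking data, the stale block goes unnoticed, and most restores still look fine unless they happen to touch that block.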

lightsout
Expert
Posts: 211
Liked: 55 times
Joined: Apr 10, 2014 4:13 pm
Contact:

Re: There's a new CBT bug in ESXi 6

Post by lightsout » Nov 16, 2015 4:25 pm

I used this bit of PowerShell to disable CBT on all of my jobs:

Code:

Get-VBRJob | ? {$_.JobType -eq "Backup" } | Set-VBRJobAdvancedViOptions -UseChangeTracking $false | Out-Null

I've heard from VMware we'll see a patch for it this month.

rccl_ecain
Influencer
Posts: 12
Liked: 14 times
Joined: Jan 06, 2015 10:46 pm
Full Name: Ethan Cain
Location: Miramr, FL
Contact:

Re: There's a new CBT bug in ESXi 6

Post by rccl_ecain » Nov 16, 2015 4:52 pm

This part of the KB:

" this causes the change tracking information of I/Os that occur during snapshot consolidation to be lost."

Seems to be suggesting that the VM itself may be corrupted because of the consolidation process, whereas the incorrect data (corrupt data) is written back into the VM. I have a few VMs that seem to be experiencing this type of issue, where an Oracle DB server becomes corrupt after a backup. Has anyone else seen this? If this shows up more, the only choice I have is to disable all backups.
Retired 'Cloud Admiral'. Might actually be on a ship.

fourg
Novice
Posts: 4
Liked: 1 time
Joined: Sep 16, 2015 10:11 pm
Full Name: Brent F
Contact:

Re: There's a new CBT bug in ESXi 6

Post by fourg » Nov 16, 2015 4:57 pm 1 person likes this post

lightsout wrote:

Code:

Get-VBRJob | ? {$_.JobType -eq "Backup" } | Set-VBRJobAdvancedViOptions -UseChangeTracking $false | Out-Null

I came here for this, thank you kind sir!

ndolson
Influencer
Posts: 13
Liked: 2 times
Joined: Jan 08, 2015 3:56 pm
Full Name: Neal
Contact:

Re: There's a new CBT bug in ESXi 6

Post by ndolson » Nov 16, 2015 7:13 pm

rccl_ecain wrote:This part of the KB:

" this causes the change tracking information of I/Os that occur during snapshot consolidation to be lost."

Seems to be suggesting that the VM itself may be corrupted because of the consolidation process, whereas the incorrect data (corrupt data) is written back into the VM.
That was my concern...you can't roll back into the parent VMDK what no longer exists. :?

jmerfeld
Influencer
Posts: 16
Liked: 1 time
Joined: Apr 05, 2013 12:59 am
Full Name: James Merfeld
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by jmerfeld » Nov 16, 2015 7:15 pm

I opened a case with VMware about this issue. Here is the response from them. I guess CBT stays disabled until then.
VMware Support wrote: With regards to your question on when a patch is released on the issue, it was advised a tentative release should be by December 2nd week and no exact date was given.

But a workaround on the issue is suggested on http://kb.vmware.com/kb/2136854

Any questions, do reply and I will be glad to help.

Gostev
SVP, Product Management
Posts: 24169
Liked: 3299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: There's a new CBT bug in ESXi 6

Post by Gostev » Nov 17, 2015 1:14 am

rccl_ecain wrote:This part of the KB:

" this causes the change tracking information of I/Os that occur during snapshot consolidation to be lost."

Seems to be suggesting that the VM itself may be corrupted because of the consolidation process, whereas the incorrect data (corrupt data) is written back into the VM.
No, you misunderstood this part. It talks specifically about change tracking information, which goes into a separate file (CTK). Your issue reminds me of one of the "early" vSphere 6 issues, but assuming you are fully patched to the vSphere 6 U1a level, there are no known issues matching your description.

tinto1970
Enthusiast
Posts: 78
Liked: 25 times
Joined: Sep 26, 2013 8:40 am
Full Name: Alessandro Tinivelli
Location: Bologna, Italy
Contact:

Re: There's a new CBT bug in ESXi 6

Post by tinto1970 » Nov 17, 2015 8:07 am

lightsout wrote: I've heard from VMware we'll see a patch for it this month.
so soon?
great!!! :evil:
Alessandro Tinivelli aka Tinto
@tinto1970

v.eremin
Product Manager
Posts: 16325
Liked: 1343 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: There's a new CBT bug in ESXi 6

Post by v.eremin » Nov 17, 2015 10:15 am 2 people like this post

lightsout wrote:I used this bit of PowerShell to disable CBT on all of my jobs.
For those of you using both backup and replication jobs the script has to be modified slightly:

Code:

Get-VBRJob | ? {$_.JobType -eq "Backup" -or $_.JobType -eq "Replica"} | Set-VBRJobAdvancedViOptions -UseChangeTracking $false | Out-Null

Thanks.

cerberus
Enthusiast
Posts: 53
Liked: 4 times
Joined: Aug 28, 2015 2:45 pm
Full Name: MD
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by cerberus » Nov 17, 2015 6:24 pm

I've read this thread from start to end, and I just want to clarify on the impact of disabling CBT.

Currently we are in the process of upgrading to vSphere 6.0 U1a and I noticed this CBT bug thread. Our nightly Veeam backup job is set up for "Reverse incremental" because we offload to tape and require the daily tape jobs to have the full backup. This works great. We are doing SAN-based backup, so all traffic goes over the Fibre Channel network.

Now with this CBT bug, we're currently not affected by it because we have not upgraded our production environment to vSphere 6 (only our Lab), so when we do upgrade I need to uncheck "Use changed block tracking data" and also uncheck "Enable CBT for all protected VMs automatically" under Advanced Settings for the backup to disk job?

How does Veeam back up the blocks inside my VMs with CBT disabled for a job that is Reverse Incremental? Is this now doing a full backup over the FC network?

I am trying to avoid a full backup as they are lengthy and resource intensive. At this point we're probably going to wait until VMware fixes the bug before we upgrade production with vSphere 6.

[EDIT] Bummer, I can't seem to edit my own post. I have my answer: without CBT, Veeam has to read the entire disk and figure out which blocks have changed (hence the increase in disk I/O). Please ignore my rookie question.

cffit
Expert
Posts: 338
Liked: 34 times
Joined: Jan 20, 2012 2:36 pm
Full Name: Christensen Farms
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by cffit » Nov 17, 2015 9:07 pm

Can someone help explain what kind of impact to expect when you turn CBT off? Say, for instance, tonight I would normally run an incremental backup and I turn CBT off. Will that essentially take the same amount of time to back up as an Active Full backup does? Or more?

Do you need to start with a full backup job that has CBT turned off before you can do an incremental with CBT off?

I'd just like to know what kind of impact to expect, and I realize it can vary. Just some thoughts would be appreciated.

Gostev
SVP, Product Management
Posts: 24169
Liked: 3299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by Gostev » Nov 17, 2015 9:28 pm 1 person likes this post

Wow, some coincidences are simply unbelievable... believe it or not, I came to this topic with the sole purpose of sharing an excellent blog post on this exact topic. And what do I see? Your post requesting this exact information! What are the chances?!

Well, there you go > Image-level backups can be done even without CBT

Kudos to our very own dellock6, his blog post could not be more timely.

emachabert
Veeam Vanguard
Posts: 370
Liked: 164 times
Joined: Nov 17, 2010 11:42 am
Full Name: Eric Machabert
Location: France
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by emachabert » Nov 17, 2015 9:57 pm 1 person likes this post

Wow, clicking the link given by Anton, reading Luca's blog post and then clicking the link to Josh's blog -> can't believe what I've just read. WTF !
Veeamizing your IT since 2009/ Vanguard 2015,2016,2017,2018,2019

cffit
Expert
Posts: 338
Liked: 34 times
Joined: Jan 20, 2012 2:36 pm
Full Name: Christensen Farms
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by cffit » Nov 17, 2015 10:34 pm

I assume that when changing from CBT to non-CBT backups, you aren't really covered until you have done a full backup without CBT enabled, correct? Having a full backup that was done with CBT enabled and then incrementals without CBT enabled doesn't eliminate the risk, given that the full backup is always going to impact every incremental after it?

And if that's the case, how does this affect backup copies since they never really do a full backup after the first backup?

Gostev
SVP, Product Management
Posts: 24169
Liked: 3299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by Gostev » Nov 17, 2015 11:36 pm

Your assumption is incorrect, moreover I have already answered this question on the second page ;)

Anyway, here is the extended version of my answer from this week's forum digest for your convenience:
Gostev wrote:Disabling CBT is sufficient to both prevent and remediate the issue, because this will make jobs physically compare latest state of disk in backup or replica with its actual state, and transfer any non-matching blocks over as a part of incremental backup (along with actually changed blocks), thus fixing any corruption that may already be in place.
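
The behaviour described in the quote above (read every block, compare it with what is already in the backup, transfer whatever does not match) can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the block layout, the use of SHA-256, and the function names `block_hash` and `incremental_without_cbt` are hypothetical and say nothing about Veeam's actual implementation.

```python
import hashlib

def block_hash(data: bytes) -> str:
    """Hash one block; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(data).hexdigest()

def incremental_without_cbt(backup_blocks, disk_blocks):
    """Read every disk block and copy those whose hash differs from the backup."""
    updated = list(backup_blocks)
    transferred = []
    for i, data in enumerate(disk_blocks):
        if block_hash(data) != block_hash(updated[i]):
            updated[i] = data
            transferred.append(i)
    return updated, transferred

disk = [b"a0", b"b1", b"c1", b"d1"]
# Block 2 is stale from an earlier run that missed a change;
# block 3 changed since the last run.
backup = [b"a0", b"b1", b"c0", b"d0"]

updated, transferred = incremental_without_cbt(backup, disk)

print(transferred)      # [2, 3]
print(updated == disk)  # True
```

The stale block is picked up together with the genuinely new change, which is why no separate full backup is needed to repair earlier damage. The cost is that every block of the source disk has to be read.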

alex1002
Enthusiast
Posts: 25
Liked: 1 time
Joined: Jan 27, 2015 6:17 pm
Full Name: Alex
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by alex1002 » Nov 18, 2015 3:20 am

Today I tried Veeam in my lab and tried to restart it with CBT off. The CPU usage on the ESXi servers went so high that it caused VMware to halt VMs. The alarms were reporting high CPU usage. Everything came to a complete halt while backups were running. Is this normal behaviour?

Gostev
SVP, Product Management
Posts: 24169
Liked: 3299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by Gostev » Nov 18, 2015 1:19 pm 1 person likes this post

No, this is not normal, as calculating hashes requires little to no CPU resources. And this is the only extra operation that the proxy does when CBT is disabled, compared to incremental runs with CBT enabled.

cffit
Expert
Posts: 338
Liked: 34 times
Joined: Jan 20, 2012 2:36 pm
Full Name: Christensen Farms
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by cffit » Nov 18, 2015 7:24 pm

If I wanted to check our backups by booting up a SureBackup job and then running chkdsk, do I need to run the full chkdsk (which will take forever) or for this purpose can a more limited chkdsk scan be done with some of the command switches?

Thanks!

Gostev
SVP, Product Management
Posts: 24169
Liked: 3299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by Gostev » Nov 18, 2015 7:44 pm

I don't believe you need a full scan, because it is designed for finding physical disk issues (specifically, bad sectors). Regular CHKDSK run, which checks if actual file allocation on disk matches NTFS MFT expectations, is usually enough to pick up major inconsistencies and corruptions. Thanks!

alex1002
Enthusiast
Posts: 25
Liked: 1 time
Joined: Jan 27, 2015 6:17 pm
Full Name: Alex
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by alex1002 » Nov 19, 2015 6:59 pm

Gostev wrote:No, this is not normal, as calculating hashes requires little to no CPU resources. And this is the only extra operation that proxy does when CBT is disabled, comparing to incremental runs with CBT enabled.
What do you suggest?

Gostev
SVP, Product Management
Posts: 24169
Liked: 3299 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by Gostev » Nov 19, 2015 7:14 pm 3 people like this post

To open a support case for troubleshooting, of course :)

BLWL
Enthusiast
Posts: 28
Liked: 38 times
Joined: Jan 27, 2015 7:24 am
Full Name: Bjorn L
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by BLWL » Nov 23, 2015 10:57 am

For those of you who disabled CBT, how has that affected your source storage?

We run NetApp MetroCluster storage across two sites. Veeam is using storage snapshots. Before I disabled CBT, we could see 100% max disk utilization during backups for short periods. No complaints from users, though.

After disabling CBT, we now see much longer periods, 2-6 hours, of 100% disk utilization. Both sites are affected, and some VMs are having problems. Moving jobs has helped a bit, but I believe I still need to adjust settings in Veeam.

Would decreasing the number of concurrent tasks be the most suitable thing to adjust?


Thanks,
/BLWL

foggy
Veeam Software
Posts: 17908
Liked: 1506 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by foggy » Nov 23, 2015 12:52 pm

Yes, you can play with reducing concurrent tasks limit to decrease the load on primary storage.
But a far better instrument to control storage load is our Backup I/O Control functionality.

sphilp
Enthusiast
Posts: 36
Liked: 9 times
Joined: May 28, 2009 7:52 pm
Full Name: Steve Philp
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by sphilp » Nov 25, 2015 4:17 pm 3 people like this post

VMware appears to have issued a patch for this, it came through Update Manager an hour or so ago for us.

Patch ID: ESXi600-201511401-BG
KB Article: http://kb.vmware.com/kb/2137546

I'm curious about what the "recovery" process for our backups and replication should be. Is the suggestion that we run an Active Full against each job to get a "known good" set of backups after the VMware patch has been applied?

sthorpe
Lurker
Posts: 1
Liked: never
Joined: Nov 25, 2015 4:33 pm
Full Name: Steve Thorpe
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by sthorpe » Nov 25, 2015 4:46 pm

Even though VMware has posted a patch, our backups (synthetic full) would still be 'inconsistent': if we were hit by the bug, we wouldn't know that we had missed any blocks.
We will still need to perform a full backup. Does this reset CBT at that point?

FYI, turning off CBT on one of our 'real life servers' increased the backup time from 13 mins to 3 hours 7 mins. It also meant that rather than reading 40 GB of 'changed data', it read the whole 2.1 TB disk, increasing I/O on the backend SAN. Turning off CBT in the real world is not a workable solution.

Steve
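
For a rough sense of scale, the figures in the post above can be turned into ratios with a short Python calculation. It assumes the sizes are gigabytes and terabytes in decimal units, and the derived throughput is only an average over the whole job (setup overhead included), not a measured read rate.

```python
# Figures reported in the post; decimal units are an assumption.
cbt_minutes = 13
no_cbt_minutes = 3 * 60 + 7   # 3 hours 7 minutes
cbt_read_gb = 40
no_cbt_read_gb = 2100         # 2.1 TB

data_ratio = no_cbt_read_gb / cbt_read_gb
time_ratio = no_cbt_minutes / cbt_minutes

# Average throughput over the whole job, in MB/s.
cbt_rate = cbt_read_gb * 1000 / (cbt_minutes * 60)
no_cbt_rate = no_cbt_read_gb * 1000 / (no_cbt_minutes * 60)

print(f"data read: {data_ratio:.1f}x, duration: {time_ratio:.1f}x")
print(f"average rate: {cbt_rate:.0f} MB/s with CBT, {no_cbt_rate:.0f} MB/s without")
```

On these numbers the job read about 52 times more data but ran about 14 times longer, so the extra cost shows up mostly as sustained read load on the source storage.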

ar2015
Lurker
Posts: 1
Liked: never
Joined: Nov 25, 2015 5:23 pm
Full Name: Adam Richards
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by ar2015 » Nov 25, 2015 5:35 pm

Hi,

Will CBT need to be reset to continue with good backups once the patch has been applied? Gostev's post ("by Gostev » Sat Nov 14, 2015 4:41 pm") suggests there may be an issue with full backups as they utilise CBT. So if there are potentially missing blocks in existing CBT data, will a new full backup be susceptible to the same issue? Alternatively, does CBT get reset at the point a full is made?

cerberus
Enthusiast
Posts: 53
Liked: 4 times
Joined: Aug 28, 2015 2:45 pm
Full Name: MD
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by cerberus » Nov 25, 2015 5:49 pm

sphilp wrote:VMware appears to have issued a patch for this, it came through Update Manager an hour or so ago for us.

Patch ID: ESXi600-201511401-BG
KB Article: http://kb.vmware.com/kb/2137546

I'm curious about what the "recovery" process for our backups and replication should be. Is the suggestion that we run an Active Full against each job to get a "known good" set of backups after the VMware patch has been applied?
I see the patch now too, wohoo :)

Off-topic, but does anyone here know: if we are using a non-VMware ESXi 6.0 image (in my case, Dell-ESXi-6.0U1-3073146-A01 for my ESXi hosts), can we still apply this patch released by VMware, or do we have to wait for Dell?

neuvoja
Service Provider
Posts: 1
Liked: never
Joined: Nov 16, 2015 5:04 pm
Full Name: Jani Neuvonen
Location: Finland
Contact:

Re: [KB2136854] There's a new CBT bug in ESXi 6

Post by neuvoja » Nov 25, 2015 5:54 pm

You do not have to wait for Dell; they usually only update the custom ISO with major updates. Just use VUM to apply the patch to your hosts.
