Discussions specific to the VMware vSphere hypervisor
NightBird
Service Provider
Posts: 172
Liked: 32 times
Joined: Apr 28, 2009 8:33 am
Location: Strasbourg, FRANCE

Re: There's a new CBT bug in ESXi 6

Post by NightBird » Nov 14, 2015 7:23 pm

If we disable the use of CBT data on the backup job, does Veeam retrieve the missing blocks, or do we need to do an active full backup? (I think Veeam will retrieve the missing disk blocks.)

Gostev
SVP, Product Management
Posts: 24092
Liked: 3278 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: There's a new CBT bug in ESXi 6

Post by Gostev » Nov 14, 2015 8:29 pm 3 people like this post

Yes, as without CBT we will physically compare the latest state of the disk in the backup with its actual state, and transfer any non-matching blocks into the incremental backup file.
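To illustrate the idea of comparing the backed-up disk state with the live one block by block, here is a minimal Python sketch. It is not Veeam's implementation: the function name `changed_blocks`, the in-memory byte strings and the tiny 4-byte block size are all assumptions chosen for readability.

```python
from typing import List


def changed_blocks(backup: bytes, current: bytes, block_size: int = 4) -> List[int]:
    """Return the indexes of blocks whose contents differ between two images.

    Hypothetical helper: a real product would read from disk and likely
    compare checksums, but the principle (no reliance on CBT) is the same.
    """
    changed = []
    # Cover the longer image so that growth of the disk is also detected.
    length = max(len(backup), len(current))
    for offset in range(0, length, block_size):
        old_block = backup[offset:offset + block_size]
        new_block = current[offset:offset + block_size]
        if old_block != new_block:
            changed.append(offset // block_size)
    return changed


# Usage: only the second block (index 1) differs.
backup_image = b"AAAABBBBCCCC"
live_image = b"AAAAXBBBCCCC"
print(changed_blocks(backup_image, live_image))
```

The trade-off is that every block has to be read from production storage, which is why such a run takes longer than a CBT-based incremental.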

jamerson
Expert
Posts: 311
Liked: 18 times
Joined: May 01, 2013 9:54 pm
Full Name: Julien

Re: There's a new CBT bug in ESXi 6

Post by jamerson » Nov 15, 2015 8:55 pm

Let's hope VMware comes out with new patches this week.
All I can say is it's not good!
Thank you guys for informing us.

alex1002
Enthusiast
Posts: 25
Liked: 1 time
Joined: Jan 27, 2015 6:17 pm
Full Name: Alex

Re: There's a new CBT bug in ESXi 6

Post by alex1002 » Nov 15, 2015 9:35 pm

Is Microsoft or someone else paying them off to purposely cripple their products?
What's going on, VMware?

ashleyw
Service Provider
Posts: 154
Liked: 20 times
Joined: Oct 28, 2010 10:55 pm
Full Name: Ashley Watson

Re: There's a new CBT bug in ESXi 6

Post by ashleyw » Nov 15, 2015 11:49 pm

So, just so we can understand the risks here...
In the development space we often take snapshots of VMs while they are running and then revert back to the snapshots at a later stage - would this also be impacted?

If we disable CBT on our farm, then surely the amount of IOPS involved in comparing what has changed on a VM will have a huge impact on our storage layer (based on 400 VMs on approx. 60TB), to the point where backup cycles are no longer viable?

If we had a primary SAN capable of SAN snapshots and were able to ship the SAN snapshots to another device (like ZFS snapshot send) I'm guessing then at least we'd be in a safe position as the changed blocks would be handled at a SAN layer rather than at a VMware layer?

I really hope this issue gets fixed urgently as the risk of silent corruption is potentially significant.

gerdesj
Lurker
Posts: 2
Liked: 1 time
Joined: Nov 21, 2012 3:02 pm
Full Name: Jon Gerdes

Re: There's a new CBT bug in ESXi 6

Post by gerdesj » Nov 16, 2015 12:11 am

I provided a little feedback to the TID:
----------------------------8<------------------------------
You give indications of a problem but no definitive descriptions of the implication of what will really happen. Your customers will be using various backup solutions - yours, BEX, Veeam, Acronis, etc etc - what are the implications?

Man up and describe exactly what symptoms might occur for all your major solution providers. Bugs happen but this TID is rubbish and not what I would expect from VMware.
----------------------------8<------------------------------

We know what to do here: Gostev has spelt it out already. The real losers are those who don't keep abreast of events ("hmm data, he no restore") and the big boys who depend on CBT to keep backups within a finely tuned window.

jelloir
Novice
Posts: 5
Liked: 1 time
Joined: May 07, 2015 11:45 pm

Re: MAJOR: new CBT issue on vsphere6.

Post by jelloir » Nov 16, 2015 3:41 am

ashleyw wrote: and we've just brainstormed this through:
- Synthetic fulls merge the incrementals so they are potentially bad.
Can Veeam elaborate on this, please?

We use forward incremental forever and a copy job to Cloud Connect. I presume that merged [CBT-based] increments may be corrupt, so what impact does this have on the full VBK file? Will simply disabling CBT ensure consistent backups from that point forward? And what impact, if any, is there on copies of this data sent to Cloud Connect?

Regards

James

BLWL
Enthusiast
Posts: 25
Liked: 37 times
Joined: Jan 27, 2015 7:24 am
Full Name: Bjorn L

Re: There's a new CBT bug in ESXi 6

Post by BLWL » Nov 16, 2015 6:57 am

Thanks Gostev for the heads-up in your always awesome newsletter!

Just curious: does anyone know if disabling CBT per VM is a workaround?

http://kb.vmware.com/kb/1031873

joergr
Expert
Posts: 386
Liked: 39 times
Joined: Jun 08, 2010 2:01 pm
Full Name: Joerg Riether

Re: There's a new CBT bug in ESXi 6

Post by joergr » Nov 16, 2015 12:50 pm 3 people like this post

No, CBT will be re-enabled at the next backup run. So I suggest disabling CBT in Veeam. That way, you will be 100% safe. You don't need to do a full backup or change anything else, because Veeam will (when CBT is turned off in Veeam) compare every single block and replace every block it finds that doesn't match. So let's say you disable CBT in Veeam today and you have an incremental backup scheduled for tonight: tonight's backup will take longer but will be 100% OK.

Best,
Joerg

dkvello
Service Provider
Posts: 93
Liked: 12 times
Joined: Jan 01, 2006 1:01 am
Full Name: Dag Kvello
Location: Oslo, Norway

Re: There's a new CBT bug in ESXi 6

Post by dkvello » Nov 16, 2015 12:53 pm

Is there any known way to figure out if a particular environment is affected?

What are the "requirements" for this bug to appear?

I have several R6 environments without seeing this problem, so there have to be some particular attributes in a vSphere 6 environment that cause this.

lrosa
Influencer
Posts: 15
Liked: 2 times
Joined: Dec 11, 2012 9:11 pm
Full Name: Luigi Rosa
Location: Italy

Re: There's a new CBT bug in ESXi 6

Post by lrosa » Nov 16, 2015 12:55 pm

Thank you joergr, that's exactly what I needed to know.


Luigi

joergr
Expert
Posts: 386
Liked: 39 times
Joined: Jun 08, 2010 2:01 pm
Full Name: Joerg Riether

Re: There's a new CBT bug in ESXi 6

Post by joergr » Nov 16, 2015 12:58 pm

dkvello: As per Anton's research, it looks like it could occur during heavy I/O write operations from within the guest, but I suggest waiting until we have much more intel before drawing any conclusions. Until then, I'd suggest either downgrading to ESXi 5.5 or disabling CBT from within Veeam.

Best,
Joerg

joergr
Expert
Posts: 386
Liked: 39 times
Joined: Jun 08, 2010 2:01 pm
Full Name: Joerg Riether

Re: There's a new CBT bug in ESXi 6

Post by joergr » Nov 16, 2015 12:58 pm

lrosa wrote: Thank you joergr, that's exactly what I needed to know.
You are very welcome ;)

Gostev
SVP, Product Management
Posts: 24092
Liked: 3278 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: There's a new CBT bug in ESXi 6

Post by Gostev » Nov 16, 2015 1:50 pm 1 person likes this post

dkvello wrote: I have several R6 environments without seeing this problem, so there have to be some particular attributes in a vSphere 6 environment that cause this.
Do you know for sure there is no issue? Did you try to restore a VM and run chkdsk, at least? I am naturally curious, because it will help to understand the scope (whether corruption due to the issue happens every time, or only under certain conditions). chkdsk is typically the best tool among those readily available to quickly detect corruption at the file system level, but it may not detect every corruption every time; that depends on the actual corruption pattern.

We used a more advanced approach based on one of our existing auto tests. This one constantly writes a very large file (100GB+) in the guest, and then compares the MD5 hash of the resulting file on the production VM with the hash of the same file in a VM restored from an incremental backup. Such a test will obviously detect even single-bit corruptions every time.
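The hash comparison described above can be sketched roughly as follows in Python. This is only an illustration of the principle, not the actual Veeam auto test: the function names `md5_of_stream` and `md5_of_file`, the chunk size, and the in-memory streams used in the usage example are all assumptions.

```python
import hashlib
import io


def md5_of_stream(stream, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex MD5 digest of a binary stream, read in chunks.

    Chunked reading keeps memory use flat even for a 100GB+ test file.
    """
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def md5_of_file(path: str) -> str:
    """Convenience wrapper (hypothetical) for a file on disk."""
    with open(path, "rb") as handle:
        return md5_of_stream(handle)


# Usage: stand-ins for the file on the production VM and on the restored VM.
production = io.BytesIO(b"payload written in the guest")
restored = io.BytesIO(b"payload written in the guest")
if md5_of_stream(production) == md5_of_stream(restored):
    print("restored file matches production")
else:
    print("MISMATCH: restored file differs from production")
```

In a real check, `md5_of_file` would be run inside each guest against the same test file and the two digests compared afterwards.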

joergr
Expert
Posts: 386
Liked: 39 times
Joined: Jun 08, 2010 2:01 pm
Full Name: Joerg Riether

Re: There's a new CBT bug in ESXi 6

Post by joergr » Nov 16, 2015 2:01 pm 2 people like this post

I was thinking about that for a while, Anton ;-) Wouldn't it be a nice thing (yeah, here comes a feature idea) to have a very simple chkdsk-with-return-code test script built right into the SureBackup assistant? Let's say you just checkmark a SureBackup option, call it something like "check disk integrity at guest level with chkdsk", and bang: a chkdsk script is executed when a Win* OS is found in the VM.
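A minimal sketch of what such a return-code check could look like, written in Python purely for illustration. The exit code meanings follow Microsoft's chkdsk documentation; the function names and the idea of calling this from a SureBackup test script are assumptions, not an existing Veeam feature.

```python
import subprocess
from typing import Tuple

# Exit codes as documented for chkdsk.
CHKDSK_EXIT_CODES = {
    0: "No errors were found.",
    1: "Errors were found and fixed.",
    2: "Disk cleanup was performed, or was skipped because /f was not specified.",
    3: "Could not check the disk, or errors were not fixed.",
}


def interpret_chkdsk_exit_code(code: int) -> Tuple[bool, str]:
    """Map a chkdsk exit code to (passed, message). Only 0 counts as a pass."""
    message = CHKDSK_EXIT_CODES.get(code, "Unknown exit code %d." % code)
    return code == 0, message


def run_chkdsk(drive: str = "C:") -> Tuple[bool, str]:
    """Run a read-only chkdsk on a Windows guest (hypothetical wrapper)."""
    result = subprocess.run(["chkdsk", drive], capture_output=True, text=True)
    return interpret_chkdsk_exit_code(result.returncode)


# Usage: interpreting a code without actually running chkdsk.
passed, message = interpret_chkdsk_exit_code(3)
print(passed, message)
```

A verification script built on this would exit non-zero whenever `passed` is false, so the test for that VM is flagged as failed.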

Best,
Joerg
