mkretzer
Expert
Posts: 545
Liked: 121 times
Joined: Dec 17, 2015 7:17 am
Contact:

Forum Digest: CBT bug - Snapshot revert issues

Post by mkretzer » Aug 26, 2019 4:56 am 1 person likes this post

Hello,

regarding backup corruption after snapshot revert (see forum digest from today):

What is the correct workaround? Reset CBT after revert?

I always thought CBT is reset automatically after snapshot revert!

Markus

oliverL
Enthusiast
Posts: 60
Liked: 6 times
Joined: Nov 11, 2016 8:56 am
Full Name: Oliver
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by oliverL » Aug 26, 2019 7:31 am

If the CBT information were reset automatically, that would/should force an Active Full backup, because no CBT information is available, right?
I assume that an Active Full can be the only workaround (if you assume that you have corrupt incrementals), as it reads the disk from beginning to end...

mcz
Expert
Posts: 287
Liked: 53 times
Joined: Jul 19, 2016 8:39 am
Full Name: Michael
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by mcz » Aug 26, 2019 8:08 am

I'd say it wouldn't create an active full, it only reads the whole disk and then compares the contents with the latest backup to only write the increments. In the end, you would still have a vib but without using the faster way of fetching the delta blocks via CBT.

mcz
Expert
Posts: 287
Liked: 53 times
Joined: Jul 19, 2016 8:39 am
Full Name: Michael
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by mcz » Aug 26, 2019 8:10 am

I'm wondering how this "bug" affects replication... I mean if you start a (powered off) replica, the next replication pass would revert the snapshot to make sure the last replication pass is in place (even if you hadn't done anything). Now does Veeam use CBT for the target VM? I assume that it doesn't and only uses CBT for the source VM - can anyone confirm that?

oliverL
Enthusiast
Posts: 60
Liked: 6 times
Joined: Nov 11, 2016 8:56 am
Full Name: Oliver
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by oliverL » Aug 26, 2019 8:58 am

mcz wrote:
Aug 26, 2019 8:08 am
I'd say it wouldn't create an active full, it only reads the whole disk and then compares the contents with the latest backup to only write the increments. In the end, you would still have a vib but without using the faster way of fetching the delta blocks via CBT.
From the digest we know that Veeam gets invalid CBT data, but does Veeam have logic to do a full read of the disk for the comparison when the CBT data is invalid? I mean, on the Veeam side such logic is now needed to ensure backups are OK (without using SureBackup).

And the VMware KB states that the CBT API might not even throw an error. You could run into this problem without noticing it.

Also, if I restore such a corrupt backup, how will this affect the restored VM? Will it boot and look fine while under the hood it isn't? Will SureBackup detect such a corrupt backup?

mcz
Expert
Posts: 287
Liked: 53 times
Joined: Jul 19, 2016 8:39 am
Full Name: Michael
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by mcz » Aug 26, 2019 9:18 am

Well, if you reset CBT or disable CBT usage in the backup job settings, Veeam would read the whole disk to check "manually" which blocks have changed - so this functionality already exists. From my personal experience I can tell you that a corruption caused by invalid CBT data may not be noticed even if you run a SureBackup job. It might be that the VM boots fine and just has corruption within some files which are not part of the OS or used by any other process. Even letting Veeam check all the data blocks wouldn't help, because the blocks within the backup file are fine - the corruption is caused by, e.g., not all changed blocks having been fetched and therefore written to the backup file.

So all in all, to be sure that you're backing up correct data: reset CBT or disable the usage of it in the job settings.
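
To illustrate why a health check of the backup file cannot catch this kind of corruption, here is a minimal Python sketch. It is a toy model only and not how Veeam stores or verifies data: the block layout, the function names and the `reported_by_cbt` list are all assumptions made for the example. It shows a backup whose stored blocks all pass their checksums while the backup still differs from the source, because one changed block was never reported.

```python
import hashlib

BLOCK_COUNT = 8  # hypothetical tiny disk, for illustration only


def checksum(block: bytes) -> str:
    """Digest stored next to each block in the toy backup file."""
    return hashlib.sha256(block).hexdigest()


def incremental_backup(source, backup, changed_blocks):
    """Copy only the blocks that the (simulated) CBT query reported."""
    for index in changed_blocks:
        data = source[index]
        backup[index] = (data, checksum(data))


def backup_file_is_healthy(backup):
    """Storage-level check: every stored block matches its stored digest."""
    return all(checksum(data) == digest for data, digest in backup.values())


def backup_matches_source(source, backup):
    """Ground truth that a backup-side check alone cannot compute."""
    return all(backup[i][0] == source[i] for i in range(len(source)))


# Initial full backup: every block is copied.
source = [bytes([i]) * 4 for i in range(BLOCK_COUNT)]
backup = {}
incremental_backup(source, backup, range(BLOCK_COUNT))

# Blocks 2 and 5 change, but the invalid CBT data mentions only block 2.
source[2] = b"new2"
source[5] = b"new5"
reported_by_cbt = [2]
incremental_backup(source, backup, reported_by_cbt)

healthy = backup_file_is_healthy(backup)
consistent = backup_matches_source(source, backup)
print("backup file healthy:", healthy)
print("backup matches source:", consistent)
```

In this model the stale block 5 is a perfectly valid old block, so nothing inside the backup file looks wrong; only a comparison against the live disk would reveal the gap.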

myFist
Enthusiast
Posts: 28
Liked: 4 times
Joined: Nov 29, 2017 1:06 pm
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by myFist » Aug 26, 2019 10:43 am

Would periodic active fulls help to avoid the corruption?

Andreas Neufert
Veeam Software
Posts: 3745
Liked: 665 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by Andreas Neufert » Aug 26, 2019 11:32 am

Correction: If CBT is corrupt, then a CBT reset as described here https://kb.vmware.com/s/article/2139574 is the workaround. Within Veeam it will trigger a snap and scan backup, which reads 100% of the data and then works according to the selected backup method to store an active/synthetic full or incremental. However, you have to trigger an active full to reset the CBT within the backup chain, so that you do not depend on older restore points.

mcz
Expert
Posts: 287
Liked: 53 times
Joined: Jul 19, 2016 8:39 am
Full Name: Michael
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by mcz » Aug 26, 2019 12:11 pm

@Andreas: Why would you need to do an active full if CBT was corrupted? Your current restore points may have corrupted data, but if you stop using CBT, the next (incremental) pass would be fine because Veeam would then back up incrementals as it should. If Veeam hasn't backed up some blocks because of the invalid CBT result, then it would detect the changes during the read of the whole disk. That means after that run, data would be consistent - even without an active full.

Have a look at this forum thread: vmware-vsphere-f24/there-s-a-new-cbt-bu ... ml#p168402
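
The argument above can be sketched with a small Python model. This is a simplified illustration under stated assumptions, not Veeam's actual chain format: a restore point is modelled as the full backup plus all increments up to that point, and the helper names (`restore`, `cbt_increment`, `full_scan_increment`) are hypothetical. In this toy model, a full-scan incremental makes the latest restore point consistent again, while the restore point created with the bad CBT data stays wrong.

```python
def restore(full, increments, upto):
    """Rebuild the disk image as of restore point `upto` (0 = the full)."""
    image = list(full)
    for increment in increments[:upto]:
        for index, data in increment.items():
            image[index] = data
    return image


def cbt_increment(source, reported):
    """Incremental that trusts the list of blocks reported by CBT."""
    return {index: source[index] for index in reported}


def full_scan_increment(source, full, increments):
    """Read every block and keep those that differ from the latest restore point."""
    latest = restore(full, increments, len(increments))
    return {i: block for i, block in enumerate(source) if block != latest[i]}


disk = ["a0", "b0", "c0", "d0"]
full = list(disk)
increments = []

# Run 1: blocks 1 and 3 change, but the invalid CBT data reports only block 1.
disk[1] = "b1"
disk[3] = "d1"
state_at_run1 = list(disk)
increments.append(cbt_increment(disk, [1]))

# Run 2: CBT is not used; block 2 has changed in the meantime.
disk[2] = "c2"
increments.append(full_scan_increment(disk, full, increments))

point1_ok = restore(full, increments, 1) == state_at_run1
point2_ok = restore(full, increments, 2) == disk
print("restore point 1 consistent:", point1_ok)
print("restore point 2 consistent:", point2_ok)
print("blocks written in run 2:", increments[1])
```

The second increment picks up block 3, which the first run missed, in addition to the genuinely new change in block 2. Whether a real product behaves exactly like this depends on its implementation; the sketch only shows that the reasoning is self-consistent.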

tinto1970
Enthusiast
Posts: 81
Liked: 28 times
Joined: Sep 26, 2013 8:40 am
Full Name: Alessandro Tinivelli
Location: Bologna, Italy
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by tinto1970 » Aug 26, 2019 1:05 pm 1 person likes this post

Good day everyone,
concerned with this newly discovered issue, I started searching for the tool to reset CBT that I had been using some years ago. I have found it, but I am no longer able to make it run. It was called "CBT reset tool" (it had this name in ASCII art within the code) and it was nice because it allowed me to specify which VM I wanted to have CBT reset on.

In the VMware KB there is another script, called ResetCBT

https://kb.vmware.com/s/article/2139574

this one has a problem: it only operates on all VMs and it asks
"Continue to Reset CBT
PRESS CONTROL + C TO REJECT, ANY OTHER KEY TO ACCEPT"

I don't know what happened, but I pressed Ctrl+C (slowly) and it started without my consent! And it started the modification on the VM without finishing it.

I have found this other script

https://www.unitrends.com/blog/vmware-c ... ell-cmdlet

It seems to work fine, it allows resetting CBT only on the desired VM, it has other tools, and the instructions are pretty clear.

Hope this helps, any comments or corrections are appreciated.
Alessandro Tinivelli aka Tinto
@tinto1970

lemtargatwing
Service Provider
Posts: 23
Liked: 3 times
Joined: Jul 28, 2017 2:48 pm
Full Name: Kyle Witte
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by lemtargatwing » Aug 26, 2019 2:36 pm

Just to make sure I am reading and following correctly in my own head, let me ask a few questions.

The bug in question has to do with reverting snapshots while CBT is enabled, correct?

I manage backups for an MSP with several engineers that may or may not be aware of this (even if I tell them about it). So let's say one of our engineers takes a snapshot outside of Veeam and then reverts it.

Several of our environments won't really be able to handle turning CBT off on all of the backup jobs. It'd be too much strain on their network and/or disks.

So to fix this if I find out it happened (I am made aware of all snapshot activity by engineers per company policy), what option do I do to fix this?

Do I take an active full? If I read correctly this won't reset CBT, right?

Do I reset CBT and THEN take an active full?

Do I just have to disable CBT backups for this one VM now?

How about prevention? I can post an internal KB to our company for taking and reverting snapshots if necessary. I'm already posting one now saying to just not do it and to rely on the backups themselves, but in case we have to (hopefully not), I'd like to have a plan.

To prevent this, could I just have my engineers disable the job, disable CBT at the VMware level, then take a snapshot? Veeam should turn CBT back on afterwards right? But at that point I would think it'd be reset.

tinto1970
Enthusiast
Posts: 81
Liked: 28 times
Joined: Sep 26, 2013 8:40 am
Full Name: Alessandro Tinivelli
Location: Bologna, Italy
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by tinto1970 » Aug 26, 2019 2:47 pm 2 people like this post

it's quite easy if i understood well

For the Future:

solution a: don't use snapshots any longer, take a backup instead

solution b: if you have to revert a snapshot in a VM, then reset CBT data for that VM

that's all.

For the Past:

reset CBT for all VMs if you are not sure if reverts have been performed or not
Alessandro Tinivelli aka Tinto
@tinto1970

lemtargatwing
Service Provider
Posts: 23
Liked: 3 times
Joined: Jul 28, 2017 2:48 pm
Full Name: Kyle Witte
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by lemtargatwing » Aug 26, 2019 3:01 pm

Thank you for that. After a cup of coffee and reading everything again that all makes sense!

pizzim13
Enthusiast
Posts: 94
Liked: 6 times
Joined: Apr 21, 2011 7:37 pm
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by pizzim13 » Aug 26, 2019 5:47 pm

tinto1970 wrote:
Aug 26, 2019 2:47 pm
it's quite easy if i understood well

For the Future:

solution a: don't use snapshots any longer, take a backup instead

solution b: if you have to revert a snapshot in a VM, then reset CBT data for that VM

that's all.

For the Past:

reset CBT for all VMs if you are not sure if reverts have been performed or not
Can someone from Veeam validate this post?

veremin
Product Manager
Posts: 16778
Liked: 1404 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by veremin » Aug 26, 2019 6:37 pm 1 person likes this post

Looks valid. Thanks!

Excerpt from the latest community digest (in case you've missed it):
Gostev wrote:This just in. We've been troubleshooting one backup corruption issue seen internally in one of our labs, where all signs pointed to a possible VMware changed block tracking (CBT) bug. Eventually, this was tracked down to a revert snapshot operation on the protected VM, following which CBT API started to return invalid data. So we've opened a support case with VMware Support, and after 2 months their conclusion was that this corruption is "by design" and is due to the fact that CBT API does not support reverting snapshot on a VM. They even published the official support KB article about this. I'm still trying to wrap my head around their response, but my first reaction is that it makes little sense? I would argue ESXi should then simply reset CBT on a VM following snapshot revert operation, or even just start returning an error – instead of providing invalid CBT information, as if nothing happened? So this week, we'll be escalating this issue through our VMware Alliance channel as the next step. Normally, I would wait until we get another opinion there, however I had to share what we know so far immediately - since this issue leads to backup corruption.

Bottom line: don't revert VM snapshots, and better yet - don't use VM snapshots in production environments at all. They impact performance, they overfill datastores - and now this. Instead, just use Quick Backup (or VeeamZIP) to create out-of-band restore points as needed - and do a full VM restore if you need to rollback. Do keep in mind that Veeam can use CBT for restores as well, which makes VM rollback blazing fast even for biggest VMs.

driley
Influencer
Posts: 20
Liked: 2 times
Joined: Jan 04, 2017 4:49 pm
Full Name: Dennis Riley
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by driley » Aug 26, 2019 8:20 pm

Does this affect replica failover and then the "undo" of the failover back to production?
D.

FrancWest
Expert
Posts: 107
Liked: 10 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by FrancWest » Aug 26, 2019 8:48 pm 2 people like this post

sigh, the use of snapshots is daily practice in our organisation. Before installing updates or doing upgrades of installed software we take a snapshot and install the update. If all is working fine, we delete the snapshot. If there's an issue, we revert back to the snapshot and start over. Sometimes we restore the VM from backup if we forgot to take a snapshot. I think this use of snapshots is not uncommon, so I find it a bit strange that this CBT issue comes up only now. Shouldn't it have been detected way earlier? Also, this would mean that a VM that was restored from backup might have been corrupted, but it's possible it hasn't been noticed yet since no critical files were affected.

This really seems a huge bug in vmware and should get way more attention. It also means that when using forward incremental backups some of them might be useless since they might contain corrupt VMs...

oh well... :-(

Gostev
SVP, Product Management
Posts: 24610
Liked: 3458 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by Gostev » Aug 26, 2019 8:57 pm 1 person likes this post

FrancWest wrote:
Aug 26, 2019 8:48 pm
It also means that when using forward incremental backups some of them might be useless since they might contain corrupt VMs...
Unfortunately, as with a number of previous VMware CBT issues, Active Fulls are also affected by this bug from the moment when this CBT issue is triggered. In other words, using forever incremental does not introduce additional risks here compared to backup modes with periodic Active Fulls.

However, despite what VMware KB article says, we (Veeam) actually believe that the scope of the issue could be a bit smaller. According to our own testing, simply reverting VM snapshot does not break CBT - one additional action is also required. We kept stressing these findings with VMware Support, but they largely ignored our observations, and as you can see they also didn't include any other variables in the official KB article. So at this time, as a partner we must stick to their official conclusion.

We are getting another opinion from our friendly alliance folks at VMware though. It may take longer than usual due to VMworld U.S. happening right this moment, but I will surely keep everyone here posted on any significant updates.

FrancWest
Expert
Posts: 107
Liked: 10 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by FrancWest » Aug 26, 2019 9:26 pm

Can you mention which additional action is required to trigger the bug?

Gostev
SVP, Product Management
Posts: 24610
Liked: 3458 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by Gostev » Aug 26, 2019 9:38 pm 1 person likes this post

Well, if I could - I would do that right in the previous post, and even in the digest, trust me. However, I decided I'd rather not, because we did stress these findings with VMware Support - so at this point, it may come across as me publicly arguing with the official VMware position documented in the KB article, which clearly explains that reverting snapshots is not supported by CBT in principle. And by doing that, potentially endangering our joint customers due to providing wrong info. From my previous experience with VMware, this kind of stuff can get very political real quick. In the end, no one wants to have their partner telling a different story... and I can relate to that myself, because we do have such issues with our own partners occasionally!

FrancWest
Expert
Posts: 107
Liked: 10 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by FrancWest » Aug 26, 2019 9:41 pm

Ok, thanks for the explanation, I understand your position.

mcz
Expert
Posts: 287
Liked: 53 times
Joined: Jul 19, 2016 8:39 am
Full Name: Michael
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by mcz » Aug 27, 2019 6:58 am

Instead, just use Quick Backup (or VeeamZIP) to create out-of-band restore points as needed - and do a full VM restore if you need to rollback. Do keep in mind that Veeam can use CBT for restores as well, which makes VM rollback blazing fast even for biggest VMs.
The only drawback of using full VM restore + CBT is that you can only use it once! I had a very unlucky situation 2 years ago when I had to investigate a very severe database issue. I did a quick rollback using Veeam's full VM restore and started the investigation. After a while I noticed that I was on the wrong path and wanted to revert again (using the same method), but was very surprised that Veeam didn't do the delta restore and instead wrote the whole disk (500 GB), which took a while because of a bottleneck. I didn't understand the situation, and after talking to a Veeam person I learned that CBT only works when the VM is powered on - which isn't the case when Veeam does a full VM restore. So that means if you need it once (the restore) then it's perfect; if you probably have to restore several times then it could be painful. In such a case, a snapshot would be better, but of course now it's a bad idea since we know about this possible corruption.

Just wanted to share this information in case somebody else gets into a similar situation.

chrislove
Lurker
Posts: 1
Liked: 1 time
Joined: Dec 20, 2017 12:56 pm
Full Name: Chris Love
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by chrislove » Aug 27, 2019 7:57 am 1 person likes this post

The notes in the VMware KB suggest this issue only occurs if you enable CBT on a virtual machine which already has existing snapshots or am I missing something? Despite the KB saying that CBT does not support the revert snapshot operation.

Note: Ensure that there are no snapshots on the virtual machine before enabling change tracking. If you create snapshots before enabling CBT, the QueryChangedDiskAreas API might not return any error or the data returned by QueryChangedDiskAreas might be incorrect.

Spex
Enthusiast
Posts: 33
Liked: 1 time
Joined: May 09, 2012 12:52 pm
Full Name: Stefan Holzwarth
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by Spex » Aug 27, 2019 8:38 am

I'm not sure I understand everything. Can Veeam please confirm or update the following.

Reset cbt to a good state:
- Doing an active full does not reset cbt.
- Turning off cbt in the backupjob, doing active full and turning cbt on again in the backupjob does NOT reset cbt at vmware too.

What are the options to reset cbt (per vm)?

If I reset cbt at vmware, the next incr backup will read the whole disk and compare blocks with existing backups.
Since the existing backup may be wrong, do I have to do an active full to be safe?

micoolpaul
Service Provider
Posts: 32
Liked: 7 times
Joined: Jun 29, 2015 9:21 am
Full Name: Michael Paul
Contact:

[MERGED] SureReplica & CBT

Post by micoolpaul » Aug 27, 2019 9:41 am

Morning,

Just after a bit of information from Veeam if possible regarding the announcement of CBT issues with reverting snapshots. SureReplica reverts the snapshot once it has finished testing doesn't it? So therefore if we used a replica as a source for a backup, the CBT would be invalid wouldn't it? Just making sure we structure our backups around this unfortunate announcement.

Thanks
-------------
Michael Paul
Veeam Certified Engineer

Gostev
SVP, Product Management
Posts: 24610
Liked: 3458 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by Gostev » Aug 27, 2019 12:27 pm

chrislove wrote:
Aug 27, 2019 7:57 am
The notes in the VMware KB suggest this issue only occurs if you enable CBT on a virtual machine which already has existing snapshots or am I missing something? Despite the KB saying that CBT does not support the revert snapshot operation.

Note: Ensure that there are no snapshots on the virtual machine before enabling change tracking. If you create snapshots before enabling CBT, the QueryChangedDiskAreas API might not return any error or the data returned by QueryChangedDiskAreas might be incorrect.
No, this is just a general note that is unrelated to the discussed issue. But in any case, as you can see from the corresponding Veeam UI label, we do perform this check before enabling CBT on a VM. This has been in place since we first released CBT support in v4.

Gostev
SVP, Product Management
Posts: 24610
Liked: 3458 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Forum Digest - Snapshot Revert issues

Post by Gostev » Aug 27, 2019 12:41 pm 2 people like this post

OK, so things are getting really interesting here!

One of the Veeam support folks who has been around forever suddenly recalled that we had already been through this very topic with VMware 8 years ago! There was a CBT bug with reverting VM snapshots back then, which was fixed by VMware - and moreover, in exactly the way I suggested it should be fixed (by returning an error, which in turn forces Veeam to perform an incremental backup using the entire image scan approach, and establish the new reference point for CBT to use). Which in turn means that reverting snapshots IS actually correctly handled by CBT API, contrary to the most recent KB article? And perhaps there are other variables indeed?

Here's the original VMware support KB article from 2011 > KB1021607
Symptoms wrote:Reverting a snapshot for a virtual machine that has Changed Block Tracking (CBT) enabled to a snapshot older than its last incremental backup can cause inconsistencies in incremental backups of that virtual machine.
Resolution wrote:This issue is resolved in vSphere 4.1 and vSphere 4.0 Update 3. Rather than potentially providing incomplete data, a change ID obtained before the snapshot revert is now correctly considered as being invalid.
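
The behaviour described in that resolution (a change ID obtained before the revert is rejected, and the backup falls back to scanning the whole image) can be sketched in Python as follows. This is a hypothetical model for illustration only: the `ToyTracker` class, its epoch/sequence change IDs and the `plan_read` helper are assumptions, and real vSphere change IDs are opaque values whose internal structure is not modelled here.

```python
class StaleChangeIdError(Exception):
    """Raised when a change ID predates a snapshot revert."""


class ToyTracker:
    """Hypothetical change tracker; not the real vSphere CBT implementation."""

    def __init__(self):
        self.epoch = 0      # bumped on every snapshot revert
        self.sequence = 0   # bumped on every tracked write
        self.log = []       # (sequence, block_index) pairs

    def write(self, block_index):
        self.sequence += 1
        self.log.append((self.sequence, block_index))

    def current_change_id(self):
        return (self.epoch, self.sequence)

    def revert_snapshot(self):
        # After a revert the recorded history no longer describes the disk,
        # so older change IDs are invalidated instead of being answered.
        self.epoch += 1
        self.log.clear()

    def query_changed(self, change_id):
        epoch, sequence = change_id
        if epoch != self.epoch:
            raise StaleChangeIdError("change ID predates a snapshot revert")
        return sorted({index for seq, index in self.log if seq > sequence})


def plan_read(tracker, change_id, block_count):
    """Use tracked changes when possible, otherwise read every block."""
    try:
        return tracker.query_changed(change_id), "cbt"
    except StaleChangeIdError:
        return list(range(block_count)), "full scan"


tracker = ToyTracker()
last_backup_id = tracker.current_change_id()
tracker.write(3)
tracker.write(1)
before_revert = plan_read(tracker, last_backup_id, 6)

tracker.revert_snapshot()
after_revert = plan_read(tracker, last_backup_id, 6)

print(before_revert)
print(after_revert)
```

The design point is that an explicit error is safe: the caller loses the speed of CBT for one run but never receives an incomplete list of changed blocks.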

lando_uk
Expert
Posts: 303
Liked: 22 times
Joined: Oct 17, 2013 10:02 am
Full Name: Mark
Location: UK
Contact:

Re: Forum Digest: CBT bug - Snapshot revert issues

Post by lando_uk » Aug 27, 2019 2:16 pm

This whole thing sounds fishy to me. I would think in most large environments, snapshot reverts are BAU activities. They teach you this way of working on the basic courses with no warning of "reverting may corrupt your backups".

There's no info in the KB about the different snapshot formats, VMFSsparse or SEsparse - previous bugs(features) sometimes only relate to one or the other.

I think this is too big a deal to be a real thing after all these years...

tinto1970
Enthusiast
Posts: 81
Liked: 28 times
Joined: Sep 26, 2013 8:40 am
Full Name: Alessandro Tinivelli
Location: Bologna, Italy
Contact:

Re: Forum Digest: CBT bug - Snapshot revert issues

Post by tinto1970 » Aug 29, 2019 12:45 pm

is it a fault of my memory or in

https://kb.vmware.com/s/article/71155

the following note has been added in the last 2 days, without changing the "last updated" field?

"Note: Ensure that there are no snapshots on the virtual machine before enabling change tracking. If you create snapshots before enabling CBT, the QueryChangedDiskAreas API might not return any error or the data returned by QueryChangedDiskAreas might be incorrect."
Alessandro Tinivelli aka Tinto
@tinto1970

mma
Service Provider
Posts: 97
Liked: 18 times
Joined: Dec 22, 2011 9:12 am
Full Name: Marcel
Location: Lucerne, Switzerland
Contact:

Re: Forum Digest: CBT bug - Snapshot revert issues

Post by mma » Aug 29, 2019 1:06 pm 2 people like this post

So, I was looking for some information about this suspicious CBT problem.

A quick google search showed an older thread here on veeam forums.
vmware-vsphere-f24/vmware-snapshots-cbt ... 20792.html

Sounds legit to me, so let's test it. (Veeam 9.5 Update 4b, vCenter 6.5 U1g, ESXi 6.0 EP4)

What did I do?
- create a new job with option "use changed block tracking data" enabled
- deploy a new VM from template, assign to job

Tasks:
  1. create active full backup and check if cbt will be used
  2. create incremental backup and check if cbt will be used
  3. create snapshot CBT1 and revert
  4. create incremental backup, check if cbt will be used
  5. create incremental backup, check if cbt will be used
  6. create snapshot CBT2
  7. create incremental backup, check if cbt will be used
  8. revert to snapshot CBT2 (the one before the backup)
  9. create incremental backup, check if cbt will be used
Result:
  1. CBT used
  2. CBT used
  3. done
  4. CBT not used - "CBT data is invalid, failing over to legacy incremental backup" - Veeam reads whole disk
  5. CBT used
  6. done
  7. CBT not used - "CBT data is invalid, failing over to legacy incremental backup" - Veeam reads whole disk
  8. done
  9. CBT not used - "CBT data is invalid, failing over to legacy incremental backup" - Veeam reads whole disk
Conclusion?
It works like it should. Did I miss something?

Regards
Marcel
