DrColonel wrote: Their response did seem a bit suspect to me. I had already reset CBT on our affected VMs so it was no longer an immediate issue, but I thought their response was interesting. Can you elaborate or refer me to a past post showing why it's incorrect so that I can let them know that the info they're giving out about this issue is wrong?
The actual VMware KB article that this thread refers to is really all you need. Also, it's important to note that VMware KB2090639 has changed significantly since it was originally released. At first it claimed that the issue only affected one specific call to QueryChangedDiskAreas(): the call with "*", a special form that should return a list of all blocks that have been used/allocated in the entire VMDK. This call is useful for full backups because it saves time by skipping disk areas that were never written and thus obviously hold no data. This information led some vendors that didn't use this specific call to claim they were not impacted by this bug.
However, the KB article has since been updated and now notes that any call to this API can return inconsistent data if the VMDK has been expanded beyond the given thresholds. They have also added some information about those thresholds; specifically, the Q&A section now includes the following:
Are virtual machines grown in smaller increments affected?
The amount of space the virtual disk is extended is not relevant, the increment of space by which a virtual disk is extended is not relevant.
Virtual machine is affected when the disk is grown past the 128G boundary in absolute size. The issue is triggered at other sizes which are a power of 2 from 128G up. For example: 256G, 512G, and 1024G.
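To make the boundary rule quoted above concrete, here is a rough Python sketch. It is not from the KB: the function name crossed_boundaries is made up for illustration, "128G" is assumed to mean 128 GiB, and the exact comparison (a boundary counts as crossed when the old size is at or below it and the new size is strictly above it) is my reading of "grown past".

```python
GIB = 1024 ** 3
FIRST_BOUNDARY = 128 * GIB  # assumption: the KB's "128G" means 128 GiB


def crossed_boundaries(old_bytes, new_bytes):
    """Return the boundaries (128G, 256G, 512G, ...) that a disk grown
    from old_bytes to new_bytes has passed.

    Assumption: a boundary B is "passed" when old_bytes <= B < new_bytes.
    The size of the increment itself plays no role, only the absolute sizes.
    """
    crossed = []
    boundary = FIRST_BOUNDARY
    while boundary < new_bytes:
        if boundary >= old_bytes:
            crossed.append(boundary)
        boundary *= 2  # next power-of-2 multiple: 256G, 512G, 1024G, ...
    return crossed


# Growing 100G -> 120G stays below 128G, so nothing is crossed.
# Growing 100G -> 600G passes 128G, 256G and 512G.
```

Note that growing a disk in many small steps gives the same answer as one large step, which matches the KB's point that the increment is not relevant.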
A full backup that didn't use the QueryChangedDiskAreas() API at all should not be impacted, but incremental backups using CBT from that point on could still be impacted, and thus invalid, since the API would fail to return changed blocks located beyond the extended thresholds. Veeam uses QueryChangedDiskAreas() even during full backups to identify "used" blocks in a VMDK, so the issue would impact both full and incremental backups until the CBT data is reset.
So to clarify, the only "safe" thing to do is to reset CBT for any VM that is over 128GB, unless you have a full history of change control information for a VM and know it was never expanded.
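If you want to script that rule of thumb against an inventory export, a minimal sketch might look like the following. The function name should_reset_cbt and the sample inventory are hypothetical, and two things are assumptions on my part: that "128GB" means 128 GiB, and that the check is applied per virtual disk rather than to the VM's total size. It only makes the decision; it does not talk to vSphere or reset anything.

```python
GIB = 1024 ** 3
THRESHOLD = 128 * GIB  # assumption: "128GB" in the post means 128 GiB


def should_reset_cbt(disk_sizes_bytes, known_never_expanded=False):
    """Conservative rule: reset CBT if any disk is larger than the
    threshold, unless change-control history shows the VM's disks
    were never expanded."""
    if known_never_expanded:
        return False
    return any(size > THRESHOLD for size in disk_sizes_bytes)


# Hypothetical inventory: VM name -> (disk sizes, history known clean?)
inventory = {
    "vm-small": ([40 * GIB, 80 * GIB], False),
    "vm-large": ([60 * GIB, 300 * GIB], False),
    "vm-large-documented": ([500 * GIB], True),
}

to_reset = [
    name
    for name, (disks, clean_history) in inventory.items()
    if should_reset_cbt(disks, clean_history)
]
```

With the sample data, only "vm-large" ends up in to_reset: it has a disk over the threshold and no history proving it was never expanded.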
I would continue to monitor this thread as well as the VMware KB article, as it's obvious that information is still being discovered about this issue and we may not yet be at the final state.