I have a ticket open for this case (00674402) and I'm taking this to the forums because I feel information about this bug needs to come out, so that people can work around it and avoid data loss in a restore scenario.
To trigger this bug you have to:
1. Perform an Instant VM Recovery of your virtual machine. When configuring the Instant VM Recovery job, enable redirection of disk updates to the datastore you're planning on migrating the production VM to.
2. Power on the VM. At this point, the machine will be "live" and accepting user data. The machine will also not be covered by any backup, unless you take steps to ensure that it is.
3. At an appropriate time, perform a "Migrate to production" on the VM. Choose the same datastore as in step 1. (I'm not sure whether the two datastores actually need to be the same; I have not tested using two different production datastores.) Ensure that VMware Storage vMotion is used (not Quick Migration). There is a checkbox at the end labelled "Delete source VM files upon successful quick migration (does not apply to vMotion)". Set this checkbox however you like; it makes no difference.
4. Kiss your production data goodbye. Any data written between steps 2 and 3 - which could be several hours or even days while you wait for an appropriate service window - is gone. You'll of course still have the original backup that you spun up the Instant VM Recovery from, but anything written after that is irretrievable. What happens is that Veeam triggers a Storage vMotion of the machine. For some reason, perhaps because the redo logs are already on the destination datastore, it decides that the Storage vMotion is done after only a few seconds, even though the data is still on the vPower NFS datastore. At this point, Veeam DELETES your instant recovery VM because the Instant Recovery job is "done". That is most definitely not the desired behaviour for anyone in any scenario.
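The core of the failure is that Veeam treats the migration as complete while the VMDK files are still being served from the vPower NFS datastore. Before stopping an Instant Recovery session yourself, you can manually sanity-check where each disk actually resides: vSphere reports disk backing files in the `[datastore] folder/disk.vmdk` format, and vPower NFS datastores are typically named `VeeamBackup_<servername>`. Here is a minimal, hypothetical sketch of such a check; the VM and datastore names are made-up examples, and the naming prefix is an assumption you should verify against your own environment:

```python
# Sketch: check whether any virtual disk still resides on the vPower NFS
# datastore before the Instant Recovery VM is removed. vSphere backing
# file names look like "[datastore] folder/disk.vmdk". The datastore
# names below are illustrative examples, not values from my environment.
import re
from typing import List

# Assumption: vPower NFS datastores are named "VeeamBackup_<servername>".
VPOWER_NFS_PREFIX = "VeeamBackup_"

def datastore_of(backing_file_name: str) -> str:
    """Extract the datastore name from a '[datastore] path/file.vmdk' string."""
    match = re.match(r"\[([^\]]+)\]", backing_file_name)
    if not match:
        raise ValueError(f"unexpected backing file name: {backing_file_name!r}")
    return match.group(1)

def disks_still_on_vpower_nfs(backing_file_names: List[str]) -> List[str]:
    """Return the disks whose files are still served from vPower NFS."""
    return [name for name in backing_file_names
            if datastore_of(name).startswith(VPOWER_NFS_PREFIX)]

# Example: one disk has migrated, one is still on the vPower NFS datastore.
disks = [
    "[PROD-DS01] fileserver/fileserver.vmdk",
    "[VeeamBackup_veeamsrv] fileserver/fileserver_1.vmdk",
]
print(disks_still_on_vpower_nfs(disks))
# -> ['[VeeamBackup_veeamsrv] fileserver/fileserver_1.vmdk']
```

In a real environment you would feed this the backing file names from the vSphere client (or from an inventory API such as pyVmomi) rather than hard-coded strings; the point is simply that if any disk path still references the vPower NFS datastore, the data has not actually moved yet.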
Right now, I'm a bit wary of offering workarounds. I suggest that anybody planning on using this feature tests it out in their environment, and makes sure anybody in their organization who might do an Instant VM Recovery knows about this bug, until such time that Veeam releases a patch. Some possible workarounds (again, I take no responsibility for these; you will have to test them yourself to see if they work in your environment):
1. Don't use Instant VM Recovery, instead do a regular VM restore.
2. If you have to use Instant VM Recovery, do not redirect virtual disk updates.
3. If you have to redirect virtual disk updates - try using a different datastore for your migration destination and your disk updates. (UNTESTED)
4. If you have to put your virtual disk updates on the same datastore you plan on migrating to, use Veeam Quick Migration rather than Storage vMotion.
If your VMware installation doesn't have a license for Storage vMotion, you will not be bitten by this bug, because it only happens when using Storage vMotion.
Now, for me, there was no real data loss, because I happened to find this bug while demoing the software to a colleague who was preparing documentation for VM recovery procedures in our organization, so all I lost was some test data. But it might just as easily have been production data.
Still, I'm disappointed that Veeam Support has not been taking this bug seriously. The last response I got from Veeam is this:
As we discussed with engineers it is not actually a bug from Veeam side, it is more by design behavior. Because all steps from Veeam were done correctly according to the settings set for the jobs. We thing about warning message to notify user about consequences of these steps. In the next patches of Veeam we are gonna to add this notification.
In other words: Veeam will eat your data. By design.
Is it just me having too high expectations, or does anyone else find this kind of stance about this kind of bug... strange? Makes me wonder what other "design behaviours" are lurking below...