I'm not sure if anybody has already covered this, but I wanted to warn you guys about a scenario (probably very rare) in which Veeam becomes destructive and will actually power off and delete a production server. I know this first hand after an incident on Friday involving one of our engineers.
Here is the background info to hopefully explain the scenario and the setup.
We have 3 sites: London, Newbury and Wales.
We want to move SQL1 from Wales to London, so we set up a replication job. We wait until the replica is in a good state, power off SQL1, do a final replication, power on the replica at the remote site and reconfigure it, and all is happy. The replication job is deleted and all is forgotten. After the server is rehomed, we set up a new replication job from London to Newbury.
12 months later we have an issue with the replication job and a corrupt snapshot, and the engineer in question finds this KB: https://www.veeam.com/kb1773
(Within the Veeam console under Replicas, find the replica that you will be repairing and right-click it; from the context menu choose “Remove from replicas…”) No action was performed in VMware at all.
He clicks on the replica VM and then clicks to remove it from disk. (Just to note, I think he skimmed over the red text in the KB.)
What happens next, you wonder? Well, you would expect the replica in Newbury to be removed, ready for reseeding. You would be wrong.
Veeam actually recognises SQL1 in the London office as the replica, because it once was the replica many moons ago and Veeam is unaware that it no longer is. It then proceeds to power down the running production server and promptly deletes it.
I'm sure there are steps we could have taken to clean up the configs in Veeam to help prevent this. Personally, I would have expected Veeam to error out and say "sorry, the server is running, please shut it down first". It didn't; instead it was very helpful in shutting the server down for you and promptly deleting it.
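To make the kind of check I had in mind concrete, here is a minimal sketch in Python. It is not Veeam's API or behaviour: `VmRecord`, `UnsafeDeleteError` and `check_safe_to_delete` are hypothetical names made up for illustration, and the site and power-state values are the ones from our incident.

```python
from dataclasses import dataclass


@dataclass
class VmRecord:
    """Hypothetical inventory record; not a Veeam or VMware object."""
    name: str
    site: str
    powered_on: bool


class UnsafeDeleteError(Exception):
    """Raised when a delete request looks like it targets a live server."""


def check_safe_to_delete(vm: VmRecord, expected_replica_site: str) -> None:
    """Raise UnsafeDeleteError unless the VM looks like a dormant replica."""
    # A replica should live at the replication target site, not elsewhere.
    if vm.site != expected_replica_site:
        raise UnsafeDeleteError(
            f"{vm.name} is in {vm.site}, expected replica in {expected_replica_site}"
        )
    # A replica that was never failed over should not be running.
    if vm.powered_on:
        raise UnsafeDeleteError(f"{vm.name} is powered on; shut it down first")


# What the tool resolved as "the replica" in our case: production SQL1 in London.
stale_target = VmRecord(name="SQL1", site="London", powered_on=True)

try:
    check_safe_to_delete(stale_target, expected_replica_site="Newbury")
    outcome = "deleted"
except UnsafeDeleteError as err:
    outcome = f"refused: {err}"

print(outcome)
```

Either of the two checks on its own would have stopped the deletion in our case. The same two questions (is it where the replica should be, and is it powered off) are easy to add to a manual checklist too.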
We have put additional procedural steps in place on our side to help prevent this. If I have missed anything, or if there is a way to prevent a server deletion through security roles, please send me a link.
Just wanted to share this in case anybody else is using the software like we do, to help move servers around without using the failover feature.