bbricker
Enthusiast
Posts: 49
Liked: 23 times
Joined: Feb 10, 2012 8:43 pm

Over 40 VM backups deleted in backup job craziness

Post by bbricker »

This weekend I was modifying a backup job and during the process, VB&R deleted every retention point for every VM in the job, all 43 or so. I still don't understand why this happened and am hoping there is a logical explanation (or maybe just a bug?).

For starters, this is a job I have set up to run on about 43 VMs every Saturday, and it is set to keep 4 retention points.

I first edited this job back on Wednesday or Thursday to remove a single VM from the selection that was no longer needed. I then changed the setting for how long to keep deleted VMs' data to 1 day, since I didn't care to keep that 1 old VM's data at all. I expected it would get removed from the backup on Saturday.

On Saturday before the scheduled time for the job, I again edited it, this time to add a new VM to the selection list. Next, I manually started the job.

Immediately I realized that I had forgotten to set an exclusion on that new VM for its 3rd vmdk disk. So I right-click and stop the job so I can go in and fix that in the job properties. But the job just sits there saying "stopping" for a really long time. I go off to take care of some other things and come back to check on it a few minutes later. It is *still* saying "stopping", so I right-click to get the realtime statistics, and to my horror it is going through every single VM saying, "VM 'xxxxxx' is outdated and will be deleted".

It finishes a few seconds later before I can do anything about it (not that I could have), and sure enough, all retention points for every VM in the job are deleted. I go to the repository folder for that job, which confirms it, as it is basically an empty folder apart from a small .vbm. That's compared to the 2TB+ of VBK/VBR files that were there previously.

Anyone have any clues? I have attached a screenshot of it deleting all the VMs. The machine "XPTemp2" at the top of the list is the only VM that I removed from the job selection list. And I know I didn't accidentally remove all 43 VMs or something, because as soon as it was finished nuking all of my data, I just started it right up again and it ran fine.

[Image: screenshot of the job statistics showing the VMs being deleted]

Gostev
SVP, Product Management
Posts: 26701
Liked: 4276 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Over 40 VM backups deleted in backup job craziness

Post by Gostev »

Generally speaking, you do not want to set the deleted VM retention period to 1 day. This setting exists, with a large default, precisely to prevent immediate deletion and to give you time to react if a VM is accidentally deleted from the infrastructure. Or consider the other case: just 2 glitches in a row from the network/vCenter/ESXi that result in the complete infrastructure tree not being returned properly (this happens quite often), and the VM will be considered removed from the infrastructure and deleted accordingly because of the 1 day retention for deleted VMs.
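To make the risk above concrete, here is a minimal sketch of a "not seen for longer than the retention window" rule. This is not VB&R's actual logic, which is not documented in this thread; the function name `vms_to_purge`, the VM names, the dates, and the 30-day comparison value are all assumptions made up for illustration.

```python
from datetime import datetime, timedelta

def vms_to_purge(last_seen, now, retention_days):
    """Return VMs whose last successful sighting is older than the retention window.

    Hypothetical model only: assumes a VM's data is purged once the VM has
    been missing from the inventory for more than `retention_days`.
    """
    cutoff = now - timedelta(days=retention_days)
    return sorted(name for name, seen in last_seen.items() if seen < cutoff)

# Made-up scenario: a weekly job where two runs in a row failed to see
# "web01" because the infrastructure tree was not returned properly.
now = datetime(2013, 1, 19, 22, 0)
last_seen = {
    "web01": now - timedelta(days=14),  # missed by two glitchy weekly runs
    "db01": now,                        # seen in the current run
}

purge_short = vms_to_purge(last_seen, now, retention_days=1)
purge_long = vms_to_purge(last_seen, now, retention_days=30)
print(purge_short)  # web01 falls outside a 1-day window
print(purge_long)   # nothing falls outside a 30-day window
```

Under this assumed rule, a short window leaves no time to notice that the VM only appeared to be missing, while a longer window absorbs a couple of bad runs.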

As to what exactly happened here, you should open a support case and let our engineers look at the job's log files. I am sure there is a simple explanation.

Re: Over 40 VM backups deleted in backup job craziness

Post by bbricker »

Thanks Gostev, I will definitely open a case, just been really busy and it's easy to get all my thoughts typed out on here :)

So if I am understanding you correctly, you are saying that the "deleted VMs data retention period" option not only has to do with me manually removing a VM from the backup job selection list that I don't want backed up any more, but it also has to do with a VM that has *actually* been deleted from my vSphere infrastructure? (I guess that is obvious now that I think about it).

If that's the case, and the glitch you are talking about occurred, and is known to occur, then that seems pretty dangerous. Wouldn't it be a good idea then to not allow the user to pick 1 day if this is a known issue with VB&R just "thinking" that the VMs were deleted because of a communication failure from vSphere or the user's network? Seems maybe there need to be more fail-safes here. And yes, I understand that you are saying the "fail-safe" should just be me not setting the days to 1 :wink: And in reality, my jobs are all set to 3 days normally; I had just set it to 1 day because I had a bunch of data tied up in that old VM that I wanted removed on the next backup job.

foggy
Veeam Software
Posts: 19463
Liked: 1767 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: Over 40 VM backups deleted in backup job craziness

Post by foggy »

bbricker wrote:I had just set it to 1 day because I had a bunch of data tied up in that old VM that I wanted removed on the next backup job.
Note that in the case of reversed incremental backup mode this won't decrease the VBK file size, but will just mark all the blocks inside the file belonging to deleted/removed VMs as unused (so that they can be reused by some other data). NTFS does not allow "shrinking files", so in order to reduce the VBK file size, you need to perform an active full backup instead.
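The behavior described above (blocks marked unused and later reused, with the file never getting smaller) can be pictured with a toy model. This is only an illustrative sketch, not the real VBK format; the `BlockFile` class and the VM names other than "XPTemp2" are hypothetical.

```python
class BlockFile:
    """Toy container file: a list of block slots that never shrinks."""

    def __init__(self):
        self.blocks = []  # each slot holds the owning VM name, or None if unused

    def write(self, vm, count):
        """Store `count` blocks for `vm`, reusing unused slots before growing."""
        for _ in range(count):
            if None in self.blocks:
                self.blocks[self.blocks.index(None)] = vm  # reuse an unused slot
            else:
                self.blocks.append(vm)  # no unused slot left: the file grows

    def remove_vm(self, vm):
        """Mark the VM's blocks as unused; the number of slots stays the same."""
        self.blocks = [None if owner == vm else owner for owner in self.blocks]

    @property
    def size(self):
        return len(self.blocks)

    @property
    def unused(self):
        return self.blocks.count(None)


f = BlockFile()
f.write("XPTemp2", 4)
f.write("db01", 2)
size_before = f.size

f.remove_vm("XPTemp2")  # data is gone, but the file is no smaller
size_after_remove, unused_after_remove = f.size, f.unused

f.write("newvm", 3)     # new data lands in the freed slots
size_after_reuse, unused_after_reuse = f.size, f.unused
print(size_before, size_after_remove, unused_after_remove, size_after_reuse, unused_after_reuse)
```

In this model the only way to get a smaller file is to build a fresh one containing just the live blocks, which is roughly the role the active full backup plays in the post above.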

Re: Over 40 VM backups deleted in backup job craziness

Post by Gostev »

We actually had it limited to 7 days minimum originally (in v5), and the users were complaining about that because they wanted to set 1 day in some cases :D
