Sudden Poor Backup Performance

Post by **Khue** » Sep 20, 2018 11:42 am

Hello,

On Monday night, we apparently had some backup issues. Typically when I see backup issues, it's a one-off with Veeam and the next backup cycle fires off fine. Unfortunately, that's not the case this time.

I have 4 jobs that fire off one after another, starting around 8:00 PM. Jobs 1 and 2 contain 20 to 60 VMs each. Jobs 3 and 4 contain a single VM each, but those VMs are very large. The fulls for these jobs (not synthetic) fire off on Saturdays, and we do incrementals for the rest of the week. Job 1 starts at 8 PM and can back up anywhere from 50 to 200 GB a night out of 2.7 TB total. It usually takes about 30 to 45 minutes. Job 2 starts immediately after job 1, typically around 8:30 PM or 9 PM, and finishes around 10 PM or so. For the past 3 nights, job 1 has functioned flawlessly. Job 2 has not.

Job 2 processes 3-4 of the 60 VMs as usual, and then, immediately after the 3rd or 4th VM completes successfully, the job appears to just hang. I watched the job last night: around 9:05 PM, after the 3rd VM backed up successfully (it usually processes a few at a time), the job simply stopped making forward progress. I told the job to cancel "immediately", and after about an hour and a half of waiting for it to cancel, I decided to punt and reboot the Veeam B&R server.

Further details

Veeam Backup Server is Windows Server 2012 R2 running Veeam 9.5.0.823.
Veeam Backup Server has 2 10GbE uplinks in an LACP trunk. There are two VLANs down the trunk. One for the production network and one for the backup network.
Veeam Backup Server is also a proxy.
Veeam Backup Server backs up to a DD2500.
All VMs being backed up are in the same vCenter.
All VMs being backed up are in the same cluster.
Recently patched the PSC and vCSA responsible for the cluster to 6.0.30800.
Job 1 seems to work fine, as expected.
Job 2 successfully backs up the first 3-4 VMs in the group and then fails to back up the rest.
Job 3 was manually triggered and appears to be functioning as expected. Update: It processed 2.8 TB of data, read 286 GB, and transferred 286 GB in about 19 minutes.
Job 4 was manually triggered, backed up successfully within the expected timeframe, and grabbed the appropriate amount of data.
Average backup processing speeds are usually between 250 MB/s and 400 MB/s.
During the "stall time" on Job 2, the repository seems to be accessible and is exhibiting no discernible performance issues.
During the "stall time" on Job 2, the source seems to be accessible (vmfs volumes) and doesn't seem to be exhibiting any performance issues.
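
As a quick sanity check on the Job 3 figures above, the transfer volume and duration can be turned into an average rate and compared with the usual 250-400 MB/s range. This is only a rough sketch: the function name is made up for illustration, "about 19 minutes" is an approximation, and it assumes the reported gigs are binary (1 GB = 1024 MB), which may not match how the job statistics are actually counted.

```python
def throughput_mb_per_s(transferred_gb: float, minutes: float, binary: bool = True) -> float:
    """Average transfer rate in MB/s for a given volume and duration.

    Hypothetical helper for illustration only. `binary` selects
    1024 MB per GB (assumed here) versus 1000 MB per GB.
    """
    mb_per_gb = 1024 if binary else 1000
    return transferred_gb * mb_per_gb / (minutes * 60)


# Job 3 figures from the post: 286 GB transferred in about 19 minutes.
job3_rate = throughput_mb_per_s(286, 19)

# Usual processing speed range quoted in the post, in MB/s.
usual_low, usual_high = 250, 400

print(f"Job 3 average rate: {job3_rate:.0f} MB/s")
print("Within usual range:", usual_low <= job3_rate <= usual_high)
```

Under these assumptions Job 3 lands near the low end of the usual range, which fits with it behaving as expected while Job 2 stalls.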

I have a case open with Veeam (#0300479), but the only response I've received in the last 24 hours is a request to force network mode. I really don't want to force network mode right now. I want to understand why this one job is choking. Does anyone have any thoughts? I have looked through the error logs, but I have no idea what I am looking for, and I don't see any blatant or obvious "error" related messages.

Post by **foggy** » Sep 21, 2018 4:54 pm

Hi, this is not a valid case ID. Could you please share the correct one? Thanks!
