Availability for the Always-On Enterprise
velotime
Novice
Posts: 7
Liked: never
Joined: Aug 29, 2012 11:18 pm
Contact:

Veeam server 100% CPU Question

Post by velotime » Mar 21, 2018 3:39 am

We recently wrapped up an ESXi host and storage refresh, and now I'm beginning to update our Veeam architecture. I'm sure I'll have some questions about my options once I really get going; I'll post those in another thread later.

As of now I have this question.
Previously I had a Veeam backup server/proxy running on a VM. The repository was a Dell DAS device connected to a host over SAS and presented to the Veeam VM as a VMFS datastore. Performance was OK, 80-100 MB/s. That worked and backup windows were acceptable, so I never bothered trying to improve speed.

The new hosts do not support the Dell DAS device, so I took the old ESXi host, put Windows Server 2016 on it, created a new ReFS repository on the DAS, added it to the Veeam server's backup infrastructure, and pointed the backup jobs' storage target at it.
The first job started and the CPU on the Veeam VM is running at 100%, which looks like a good thing from what I have read; it never got past ~40% before. The backup job is running at 180 MB/s, which is also quite an improvement, and the 1 Gb/s NICs on the Veeam server and the repository Windows server are basically saturated. Not sure if this could even be improved without adding network bandwidth.

Question: that Veeam VM is set to run 4 concurrent tasks, and it sounds like that doesn't really serve any purpose here. Would it be best to just set it to 1 task to minimize the time the VMs spend running on snapshot? I can add CPU to the VM if that would help.

I'll be setting up Direct SAN Access and maybe use the new Windows 2016 server as the proxy, but until I get all that sorted I want to make sure my current setup is running as well as it can. The speed improvement was actually a bit of a surprise; I was expecting it to drop since I'm now sending data over the LAN.

thanks in advance!

velotime
Novice
Posts: 7
Liked: never
Joined: Aug 29, 2012 11:18 pm
Contact:

Re: Veeam server 100% CPU Question

Post by velotime » Mar 21, 2018 7:16 pm

Perhaps that was more info than necessary.

Simplified version.

If the proxy VM is at 100% with a saturated NIC, and the repository server's NIC is also saturated, is there any point to processing concurrent VMs in the backup jobs?

csydas
Expert
Posts: 139
Liked: 29 times
Joined: Jan 16, 2018 5:14 pm
Full Name: Harvey Carel
Contact:

Re: Veeam server 100% CPU Question

Post by csydas » Mar 21, 2018 7:52 pm

Hey velotime,

I think you just have to math it out: what's more efficient time-wise for you, doing each disk one at a time super fast, or doing several disks at a time pretty fast? There's probably a point of diminishing returns with the concurrent tasks, but I would imagine that scheduling smartly would solve it.

I'm doing napkin math at this point, but if you've got a bunch of appliances with tiny disks (like 20-50 GB) and then a few gargantuan servers like a 4 TB file server, in my mind it makes sense to schedule the big servers last and blow through all your smaller VMs first. You get the best of both worlds then: parallel processing on the VMs that will still back up reasonably fast, and then a saturated gigabit NIC for your gargantuan servers when they're the only ones running.
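To put rough numbers on that scheduling idea, here's the napkin math as a quick Python sketch. All VM sizes and the ~112 MB/s usable gigabit figure are illustrative assumptions, not measured Veeam stats:

```python
# Napkin math for scheduling backups over a shared, saturated 1 Gb/s
# link (~112 MB/s usable). All sizes and rates are made-up examples.

LINK_MBPS = 112  # assumed effective throughput of a saturated GbE NIC

small_vms_gb = [20, 30, 40, 50]  # appliance-class VMs
big_vms_gb = [4000]              # the gargantuan file server

def hours(gb):
    """Time to move `gb` gigabytes over the shared link, in hours."""
    return gb * 1024 / LINK_MBPS / 3600

# With one shared saturated pipe, total transfer time is fixed: the
# link moves the same number of bytes regardless of concurrency.
total_h = hours(sum(small_vms_gb) + sum(big_vms_gb))

# What scheduling changes is when each snapshot can be released.
# Small VMs first (splitting the link), big server last by itself:
small_done_h = hours(sum(small_vms_gb))  # all small snapshots closed
big_alone_h = hours(sum(big_vms_gb))     # full pipe for the big one

print(f"total: {total_h:.1f} h, small VMs done after {small_done_h:.2f} h")
```

The point being: concurrency doesn't shrink the total when the pipe is the bottleneck, but running the small VMs first gets their snapshots closed in the first fraction of the window instead of leaving them open alongside the big server's multi-hour run.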

velotime
Novice
Posts: 7
Liked: never
Joined: Aug 29, 2012 11:18 pm
Contact:

Re: Veeam server 100% CPU Question

Post by velotime » Mar 21, 2018 9:54 pm

Thanks for the reply. This is almost set up as you suggested: smaller VMs are in their own jobs, grouped by similar OSes or application groups, and the big monster file servers are in their own individual jobs.

I think I'll leave it as it is and focus on getting the old host set up as a proxy with Direct SAN Access to our new Nimble. I think the biggest performance gains will be had there anyway.

thanks again.

foggy
Veeam Software
Posts: 17097
Liked: 1397 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Veeam server 100% CPU Question

Post by foggy » Mar 23, 2018 12:59 pm

What are the bottleneck stats for your jobs now?

Lowering the number of concurrent tasks may actually increase the time a VM runs on snapshot, since it will mean less parallelism in data processing. If the current major bottleneck is the proxy, you can either add CPU to it or introduce more proxies into the setup to allow even more parallel tasks.
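One way to see the snapshot-lifetime point: a task processes one virtual disk, so a multi-disk VM benefits directly from extra task slots. A hypothetical sketch (the disk sizes and the assumption of roughly constant per-task throughput are mine, not from this thread):

```python
import math

# Sketch of why fewer concurrent tasks can lengthen snapshot time:
# one task processes one virtual disk, so a multi-disk VM's disks
# are moved in "waves" of at most `slots` at a time. Numbers are
# illustrative and assume ~constant per-task throughput.

def snapshot_open_s(disk_gb, n_disks, slots, mbps_per_task):
    """Seconds the VM's snapshot stays open: until its last disk
    (in the final wave) finishes."""
    waves = math.ceil(n_disks / slots)
    return waves * disk_gb * 1024 / mbps_per_task

# 4-disk VM, 100 GB per disk, ~45 MB/s per task stream (assumed):
serial = snapshot_open_s(100, 4, 1, 45)    # one task slot
parallel = snapshot_open_s(100, 4, 4, 45)  # four task slots

print(f"1 slot: {serial/3600:.1f} h on snapshot, "
      f"4 slots: {parallel/3600:.1f} h")
```

Under these assumptions the single-slot run holds the snapshot open four times longer, which is the parallelism foggy is describing. The picture changes if all the streams contend for one saturated pipe, as the earlier bottleneck discussion covers.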

velotime
Novice
Posts: 7
Liked: never
Joined: Aug 29, 2012 11:18 pm
Contact:

Re: Veeam server 100% CPU Question

Post by velotime » Mar 23, 2018 4:36 pm

Hi Foggy

Stats for my jobs last night were all similar to this.

Load: Source 98% > Proxy 53% > Network 11% > Target 0%

I did get my physical proxy set up with Direct SAN Access yesterday. Running some test jobs last night comparing Virtual Appliance (Hot-Add) and Direct SAN resulted in basically the same stats and processing speed. Since the bottleneck is the source, I think that is to be expected.

I'm not sure why I didn't review this first, but when I looked at the Nimble performance monitoring during the jobs last night, its interfaces are saturated as well. It's currently running iSCSI at 1 Gb/s until we get our 10 Gb switch infrastructure in place later this year.
So I think I need to work in the other direction and actually throttle this down; I'm pulling data off the SAN as fast as it can present it. ESXi host latency of course went up a bit during the backups, which is not what I want.

Does it sound like I'm looking at this the right way? I don't think there is a way to reduce the load on the SAN without throttling the repository. I was playing around with the network traffic shaping rules, but those don't appear to have any effect on the physical proxy's Direct SAN connection. I believe those only shape traffic between data movers, but I need to review that more.

thanks

foggy
Veeam Software
Posts: 17097
Liked: 1397 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Veeam server 100% CPU Question

Post by foggy » Mar 27, 2018 2:58 pm

Correct, network throttling rules affect communication between data movers. You should look at I/O control settings instead.

