We thought we had scaled our new deployment server pretty well, with a 24-core EPYC Rome, 256GB RAM and local disks.
However, watching the resource consumption during the initial full backup, it looks like we could have added even more juice.
What are you seeing as normal resource consumption? It seems to be only Veeam.Archiver.Proxy.exe that is running wild, consuming 100% CPU and 157GB of RAM on its own.
Hey Henrik,
During backup our proxies have a lot of work, so plenty of CPU and memory is needed. I believe you are running everything on one server? Are all jobs running at the same time? And how large is this job?
It is actually not that bad, to be honest. We can only fetch what the MSFT endpoints want to give us, and most others will tell you they see a maximum of 1 TB per day.
The Exchange online job has finished and further incremental runs are running quickly without any issues.
For OneDrive there is a completely different story.
The resource consumption on the proxy server is still very high while this job runs, 100% on CPU.
I see very few items/s, 2-4, even though we have now added 40 auxiliary backup accounts - so no visible increase or decrease.
The download speed seems to be a reflection of the size of those items: I can see rates of up to 40MB/s (350Mbps), but as long as there are very many small files to fetch, overall progress stays slow.
We are now at 236 out of 1840 objects after about 200 hours of running, with only 6.5TB fetched.
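To put those figures in perspective, here is a quick back-of-envelope check using only the numbers quoted above (6.5TB in roughly 200 hours, taking 3 items/s as the middle of the observed 2-4 range):

```python
# Rough throughput math for the OneDrive job figures quoted above.
fetched_bytes = 6.5e12          # ~6.5 TB fetched so far
elapsed_s = 200 * 3600          # ~200 hours of runtime
items_per_s = 3                 # observed 2-4 items/s, take the middle

avg_throughput = fetched_bytes / elapsed_s          # bytes/s
avg_item_size = avg_throughput / items_per_s        # bytes per item

print(f"average throughput: {avg_throughput / 1e6:.1f} MB/s")
print(f"implied average item size: {avg_item_size / 1e6:.1f} MB")
```

So the job averages around 9 MB/s against observed peaks of 40 MB/s, which suggests the limiter is the per-item rate rather than raw download bandwidth.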
How can we debug and increase the items/s?
Edit: We see in Cloud App Security console that the proxy server uses all 40 auxiliary accounts to download onedrive items.
There might be something blocking. We know that many small files can be a challenge (although we already have improvements for that...).
It would be best to create a support case at this point so our engineers can see if something is blocking.
As per forum rules, please post the case ID and follow-up after investigation here.
High CPU consumption during backup is normal for VBO. Worker threads will consume as many resources as are available to ensure faster operations.
I'm sure support engineers provided you the most accurate response according to the specific details of your case (which I'm obviously not aware of).
Will you please share your case ID so that we all could be on the same page?
Morten, no, due to the holiday season I have not been able to look further into it.
We changed the repository from ReFS to NTFS and started a new initial run.
Same issues so far: 160GB RAM usage, 100% CPU; the job has fetched 20TB and 9.2 million files and has been running for 640+ hours so far.
It only fetches around 3-4 items/s. I see the speed is OK when larger files are fetched.
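The numbers from this run are internally consistent with an item-rate-bound job. A sketch using only the figures above (20TB, 9.2 million files, 640 hours - the per-item rate is derived, not measured):

```python
# Check: does ~4 items/s explain the second run's numbers?
files = 9.2e6                 # files fetched so far
fetched_bytes = 20e12         # ~20 TB
elapsed_s = 640 * 3600        # 640+ hours of runtime

items_per_s = files / elapsed_s
avg_file = fetched_bytes / files
throughput = items_per_s * avg_file

print(f"items/s: {items_per_s:.1f}")
print(f"avg file size: {avg_file / 1e6:.1f} MB")
print(f"implied throughput: {throughput / 1e6:.1f} MB/s")
```

The derived rate of about 4 items/s matches the observed 3-4, and with an average file of roughly 2 MB that caps throughput near 9 MB/s no matter how much bandwidth is available - the items/s figure is the binding constraint.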
1: Reboot of backup server (repository and proxy)
2: Adjusting worker threads down from 64 (default) to 24 (same as the number of physical cores)
3: Adding another repository and a new job with only 73 objects for OneDrive only backup.
4: Running the initial backup - so far after 19 hours fetched 49/73 objects, 661GB and running at around 6-10 items/s
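One more way to look at the small test job above: if all 40 auxiliary accounts are downloading concurrently (as the Cloud App Security console observation suggested), the observed item rate implies a long round trip per request. This is a rough model, not a measurement:

```python
# If 40 accounts work in parallel and the job completes 6-10 items/s
# overall, each account finishes roughly one item every few seconds.
accounts = 40
items_per_s = 8        # middle of the observed 6-10 items/s range

seconds_per_item_per_account = accounts / items_per_s
print(f"each account completes roughly one item every "
      f"{seconds_per_item_per_account:.0f} s")
```

Several seconds per small file, per account, points at per-request overhead or service-side throttling rather than download bandwidth.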
Now what I have seen looking at the resource consumption:
CPU usage 100% on 24 cores (due to HTT, reported as overall 50% usage)
I also tried lowering this to 12 worker threads, resulting in 100% usage on 12 cores (due to HTT, reported as overall 25% usage).
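Those Task Manager percentages line up with each busy worker thread saturating one logical processor while HTT doubles the logical-core count. A quick check, assuming the 24 physical / 48 logical cores stated above:

```python
logical_cores = 48   # 24 physical cores with HTT enabled

def overall_usage(busy_threads, logical=logical_cores):
    # Each fully busy worker thread saturates one logical processor,
    # so overall usage is the fraction of logical processors pinned.
    return 100 * busy_threads / logical

print(overall_usage(24))  # 24 worker threads -> 50.0 (% overall)
print(overall_usage(12))  # 12 worker threads -> 25.0
```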
When it comes to items/s or speed, I see no particular change for the OneDrive backup. For another Exchange job, however, there is a drop in performance.
When it comes to RAM consumption:
When the job started, the consumption was low, at around 20GB, but it steadily increased to the same 160GB, which seems to be some kind of ceiling, reached after about 200GB of data had been fetched?
As for the speed, it started at around 30MB/s (250Mbps) and 20+ items/s, but decreased quickly.
The jobs report bottleneck as source and target.
The disk arrays report low consumption and response times.
This leads me to think:
1: As Polina stated, the proxy will use as much CPU as you enable worker threads for - but for what? What exactly makes the proxy threads consume 100% CPU at all times?
2: What is the reason for the extreme RAM consumption and limit at 160GB?
3: How is the repository code optimized for storing the data? Given that it is the proxy service running wild with CPU and RAM while disk activity stays low, I assume the repository code is OK?