Host-based backup of VMware vSphere VMs.
balma01
Certified Trainer
Posts: 91
Liked: 5 times
Joined: Jan 01, 2006 1:01 am
Contact:

repository tasks calculation

Post by balma01 »

How do I calculate the maximum number of concurrent tasks for a repository that uses per-VM backup files?
It depends on the disk type, speed, RAID level, CPUs, controller and so on, but is there a generic guideline for calculating the 'perfect' number of concurrent repository tasks?
wishr
Veteran
Posts: 3077
Liked: 453 times
Joined: Aug 07, 2018 3:11 pm
Full Name: Fedor Maslov
Contact:

Re: repository tasks calculation

Post by wishr »

Hello,

This article should help. Also, you may use forum search to find more details.

Thanks
balma01
Certified Trainer
Posts: 91
Liked: 5 times
Joined: Jan 01, 2006 1:01 am
Contact:

Re: repository tasks calculation

Post by balma01 »

Hi wishr and thanks
I read https://bp.veeam.com/ before opening the ticket and searched the forum, but I still have not found an answer.
I know that every write process consumes a task slot. So if we have a repository with per-VM backup files and back up 10 VMs in a single job, and we have plenty of proxy cores sending all the VMDKs to the repository, then the repository (if configured with at least 10 tasks) can write up to 10 VM files at the same time. Here is my question: how can I calculate the perfect number of tasks to maximize the repository performance, assuming plenty of cores and RAM are available on my repository server?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6724 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: repository tasks calculation

Post by Gostev » 1 person likes this post

I don't know if you can calculate it, because every storage system has a different IOPS capacity and interface bandwidth. I think the better approach to finding the "perfect number" of tasks would be to simply monitor backup storage load metrics at the beginning of the backup window to determine the number of tasks that caps its I/O capacity and/or starts to cause excessive I/O latency. Although not every storage device provides an IOPS monitoring dashboard, I believe you should at least always be able to measure storage access latency, and when the latency starts to spike, you will know you're at the I/O capacity.
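
As a rough illustration of the "watch for the latency spike" approach, here is a minimal Python sketch. It assumes you have already collected average write latency at several task settings by some means of your own. The function name `find_task_limit`, the `spike_factor` threshold and the sample numbers are all hypothetical, not anything Veeam provides.

```python
from typing import Dict, Optional


def find_task_limit(latency_ms_by_tasks: Dict[int, float],
                    spike_factor: float = 2.0) -> Optional[int]:
    """Return the largest task count whose latency stays at or below
    spike_factor times the latency measured at the lowest task count.

    The threshold is an arbitrary assumption; tune it to your storage.
    """
    if not latency_ms_by_tasks:
        return None
    task_counts = sorted(latency_ms_by_tasks)
    baseline = latency_ms_by_tasks[task_counts[0]]
    best = task_counts[0]
    for tasks in task_counts[1:]:
        if latency_ms_by_tasks[tasks] > spike_factor * baseline:
            break  # latency spiked: the previous setting was the last good one
        best = tasks
    return best


# Hypothetical measurements: concurrent tasks -> average write latency (ms).
samples = {2: 5.0, 4: 5.5, 6: 6.0, 8: 7.5, 12: 21.0, 16: 48.0}
print(find_task_limit(samples))  # 8
```

With these made-up samples, latency stays under twice the baseline up to 8 tasks and jumps at 12, so 8 would be the starting point to test further.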
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: repository tasks calculation

Post by HannesK » 1 person likes this post

Hello,
a rule of thumb is a 1.5 or 2 to 1 ratio for tasks/cores settings.

Example:
- your calculation tells you that you need 16 proxy cores (the recommended proxy core / task ratio is 1:1 or only minimal overbooking)
- that means 8 repository cores with a 2:1 ratio
- that means 16 tasks on the repository
- then see how it works and adjust if needed

Also keep in mind that the best practice guide is still wrong about per-machine / per-job chains and tasks (the responsible team has been informed). https://helpcenter.veeam.com/docs/backu ... ml?ver=110 has the correct information about tasks.

For calculation, I would just go with 2:1 (or 1.5:1 if you are more conservative). Keep it simple :-)
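
The rule of thumb above can be written down as a small calculation. This is only a sketch of one reading of the example (16 proxy cores, 8 repository cores, 16 repository tasks): it assumes the same ratio applies both to proxy cores per repository core and to tasks per repository core. The function name `size_repository` and the rounding choices are assumptions made for illustration.

```python
import math
from typing import Tuple


def size_repository(proxy_cores: int, ratio: float = 2.0) -> Tuple[int, int]:
    """Return (repository_cores, repository_tasks) for a proxy core count.

    Assumption: repository cores = proxy cores / ratio (rounded up),
    repository tasks = repository cores * ratio (rounded down).
    """
    if proxy_cores <= 0 or ratio <= 0:
        raise ValueError("proxy_cores and ratio must be positive")
    repo_cores = math.ceil(proxy_cores / ratio)
    repo_tasks = math.floor(repo_cores * ratio)
    return repo_cores, repo_tasks


print(size_repository(16, 2.0))  # (8, 16)
print(size_repository(16, 1.5))  # (11, 16)
```

Either way the result is a starting value; the advice to "see how it works and adjust if needed" still applies.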

Best regards,
Hannes
TinchoB
Enthusiast
Posts: 29
Liked: 3 times
Joined: Nov 17, 2020 9:49 pm
Full Name: Martin B
Contact:

Re: repository tasks calculation

Post by TinchoB »

Hello Hannes,
You say the BP guide is wrong about how tasks are consumed. So one question: when using per-VM backup chains, is it possible that VEEAM uses one repository task for every VM virtual disk (the same as proxy task consumption)?
In the BP guide it states one repo task per VM chain, but in practice, when I test with 16 proxy tasks and 6 repo tasks available, VEEAM only processes 6 virtual disks at a time (not 6 VMs at a time, limited by 16 disks in total).

Is that right?
Regards,

Tincho
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: repository tasks calculation

Post by HannesK »

Hello,
yes, one task is used per disk on proxy and repository (that's what the user guide states). It's the same for per-job chains or per-VM chains.
The user guide says: "For every disk of VMs added to the job, Veeam Backup & Replication creates a new task."
You can see the task allocations in various ways. The easiest ones are probably these three options:
- C:\ProgramData\Veeam\Backup\RTS.ResourcesUsage.log (I prefer this one)
- C:\Program Files\Veeam\Backup and Replication\Backup>Veeam.Backup.Manager.exe -SHOWREPOSITORYUSAGES
- C:\Program Files\Veeam\Backup and Replication\Backup>Veeam.Backup.Manager.exe -SHOWPROXYUSAGES
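
To make the per-disk accounting concrete, here is a small Python sketch. It is not a Veeam API; the function names and the example job are hypothetical, and it only models the statement that every virtual disk takes one task on both the proxy and the repository.

```python
from typing import Dict


def tasks_required(disks_per_vm: Dict[str, int]) -> int:
    """One task per virtual disk, for per-job and per-VM chains alike."""
    return sum(disks_per_vm.values())


def disks_in_flight(disks_per_vm: Dict[str, int],
                    proxy_tasks: int, repo_tasks: int) -> int:
    """Disks processed at once: limited by the scarcer of the two task pools."""
    return min(tasks_required(disks_per_vm), proxy_tasks, repo_tasks)


# Hypothetical job: 4 VMs with 4 virtual disks each.
job = {"vm1": 4, "vm2": 4, "vm3": 4, "vm4": 4}
print(tasks_required(job))                                  # 16
print(disks_in_flight(job, proxy_tasks=16, repo_tasks=6))   # 6
```

With 16 proxy tasks and 6 repository tasks, the model yields 6 disks in parallel, which matches the behaviour reported earlier in the thread.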

Best regards,
Hannes
balma01
Certified Trainer
Posts: 91
Liked: 5 times
Joined: Jan 01, 2006 1:01 am
Contact:

Re: repository tasks calculation

Post by balma01 »

Hi
Are you sure that one task is used per disk on the repository?
And, more importantly, are you sure that it's the same for per-job chains and per-VM chains?
If yes, what is the purpose of using a per-VM chains repo?
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: repository tasks calculation

Post by HannesK »

Hello,
2x yes :-) Did you see something different in your environment in your RTS.ResourcesUsage.log?


I have been using per-VM chains for the following use cases for many years. Maybe there are even more.
- Easier tape restore
- No 16TB files on wrongly formatted NTFS volumes
- More performance through parallel writes to storage
- Easier job management (put more VMs in one job)
- Resource usage with SOBR
- Easy deletion of VMs from backups
- Per VM accounting

Best regards,
Hannes

PS: as some people started throwing away their sizing formulas after realizing that tasks are "per disk": everything is fine. Please keep the formulas that have been working for many years.
balma01
Certified Trainer
Posts: 91
Liked: 5 times
Joined: Jan 01, 2006 1:01 am
Contact:

Re: repository tasks calculation

Post by balma01 »

Sorry,
where in the user guide can I see that one task is used per disk on the repository?
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: repository tasks calculation

Post by HannesK »

please see link and quote in my posts above :-)

post408745.html#p408745
post409981.html#p409981

In addition to the quote mentioned, steps 3 and 5 might be interesting.
TinchoB
Enthusiast
Posts: 29
Liked: 3 times
Joined: Nov 17, 2020 9:49 pm
Full Name: Martin B
Contact:

Re: repository tasks calculation

Post by TinchoB »

Thanks Hannes for the confirmation.
As you noted, per-VM chains have a lot of advantages over the per-job scheme. Let me add that since every VM has its own files, if one file somehow gets corrupted, the other ones are not affected.

With per-VM chains, should we also assume that one repo task is NOT one write stream? If I have a VM with 3 virtual disks writing to one repo file (per VM), that is one write stream, right?
I've experienced big variations in performance (or repository saturation) when some of the VMs in the backup job are not available for backup or when the VM processing order changes. My deduction is that this alters the write pattern on the repository and thus the performance. For example:
You have 6 repo tasks and per-VM chains.
+ 10 VMs to back up (1 big one with 5 disks and the other ones with 1 disk each).
+ If the big VM goes first in the processing order, VEEAM will start with 2 write streams (6 virtual disks in total for two VMs), with low repo load.
+ But if the first one is not available, VEEAM will start with 6 write streams (6 virtual disks for 6 VMs), increasing the load on the repository disk system and taking much more time to process due to disk I/O saturation.
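
The two starting situations in this example can be reproduced with a short Python model. It is a simplification, not how the Veeam scheduler is implemented: it assumes VMs are taken strictly in processing order, that each disk takes one repository task, and that each VM with at least one disk in flight counts as one write stream. The function name `initial_load` is hypothetical.

```python
from typing import List, Tuple


def initial_load(vms: List[Tuple[str, int]], repo_tasks: int) -> Tuple[int, int]:
    """Return (disks_started, write_streams) at the moment the job starts.

    vms is a list of (name, disk_count) in processing order.
    """
    free = repo_tasks
    disks_started = 0
    streams = 0
    for _name, disks in vms:
        if free == 0:
            break
        started = min(disks, free)   # remaining disks of this VM must wait
        if started > 0:
            streams += 1             # one per-VM chain = one write stream
            disks_started += started
            free -= started
    return disks_started, streams


small_vms = [(f"vm{i}", 1) for i in range(1, 10)]   # 9 VMs, 1 disk each
big_first = [("big", 5)] + small_vms                # big VM processed first

print(initial_load(big_first, repo_tasks=6))   # (6, 2)
print(initial_load(small_vms, repo_tasks=6))   # (6, 6)
```

Both runs use all 6 repository tasks, but the number of files being written in parallel differs by a factor of three.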

Is that possible?

If so, VEEAM should find a way to add some kind of extra control at the repository level when per-VM is enabled. For example, add a field where users could set the "max repo write streams in per-VM mode". This way one could first set the "max repo tasks" and then the "max repo write streams", and VEEAM would honor the combination of both. This could maximize storage performance (neither wasting nor saturating it).

Thanks !
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: repository tasks calculation

Post by HannesK »

Hello,
correct, one backup chain = one write stream to the disks.

Whether parallel streams are an advantage or a disadvantage depends on the storage. SMB / SOHO hardware often suffers from parallel writes. "Proper" block storage hardware with RAID controller caches benefits from parallel writes.

Best regards,
Hannes
TinchoB
Enthusiast
Posts: 29
Liked: 3 times
Joined: Nov 17, 2020 9:49 pm
Full Name: Martin B
Contact:

Re: repository tasks calculation

Post by TinchoB »

Hannes,
we have a Dell ME4 block storage with 12 HDDs, attached directly via 10G iSCSI SAN with jumbo frames enabled. A proper block storage. Using the DiskSpd tool, we found that 4 write streams is the best-performing option.

As I mentioned, we use per-VM chains for all the advantages you mentioned before.

The point is: as users, we have no way in VEEAM B&R to specify an additional control parameter, the maximum number of write streams.
The only control point is the repo task count (as you confirmed, 1 task = 1 VMDK disk), and if we set it to 4:
a) it will generate 4 write streams if we process VMs with just one disk - ok.
b) it will generate fewer streams if the VMs have more disks.

But if we set so few repo tasks, jobs will take much more time to complete, and our backup resources (server, repo, proxy, etc.) will be wasted.

To compensate for that, we have to find a compromise: increase the repo tasks to, for example, 8, and try different VM processing orders (to keep the write streams close to 4 or 5).

But as I mentioned, when the backup of some VM fails, the processing order changes, and sometimes this slows down the job. In some situations the job log says that Destination is the bottleneck (when in normal operation it says 1% Destination). My deduction is that this could be due to the increase in write streams caused by more VMs being processed at the same time.

If only we could set the repo tasks and also the write streams, we could set, for example, 16 tasks but limit them to no more than, for example, 4 write streams.
Perhaps it's not possible due to the VEEAM architecture, but maybe someone could take it into account.
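
To illustrate what such a combined limit would do, here is a purely hypothetical Python sketch. No such setting exists in the product according to this thread; the function name `admit` and its greedy in-order behaviour are assumptions made only to show the idea of capping tasks and write streams together.

```python
from typing import List, Tuple


def admit(vms: List[Tuple[str, int]], max_tasks: int,
          max_streams: int) -> List[Tuple[str, int]]:
    """Greedily start VMs in processing order while both limits hold.

    Returns (vm_name, disks_started) pairs; one started VM = one write stream.
    """
    free_tasks = max_tasks
    admitted: List[Tuple[str, int]] = []
    for name, disks in vms:
        if free_tasks == 0 or len(admitted) == max_streams:
            break
        started = min(disks, free_tasks)
        if started == 0:
            continue  # a VM without disks consumes neither a task nor a stream
        admitted.append((name, started))
        free_tasks -= started
    return admitted


single_disk_vms = [(f"vm{i}", 1) for i in range(1, 10)]
with_big_vm = [("big", 5)] + single_disk_vms

print(admit(with_big_vm, max_tasks=16, max_streams=4))
# [('big', 5), ('vm1', 1), ('vm2', 1), ('vm3', 1)]
print(admit(single_disk_vms, max_tasks=16, max_streams=4))
# [('vm1', 1), ('vm2', 1), ('vm3', 1), ('vm4', 1)]
```

In both orders the model never exceeds 4 write streams, even though 16 tasks are allowed, which is the behaviour the request describes.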

Regards.
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: repository tasks calculation

Post by HannesK »

Hello,
yes, that would be a complex task to re-write the engine / scheduler.

In the best case, tasks or streams settings should be removed from the UI completely. The software should figure it out on its own. But as mentioned before, that needs some more re-engineering :-)

Best regards,
Hannes