-
- Certified Trainer
- Posts: 91
- Liked: 5 times
- Joined: Jan 01, 2006 1:01 am
- Contact:
repository tasks calculation
How do I calculate the maximum number of concurrent tasks for a repository with per-VM backup files?
It depends on the disk type, speed, RAID level, CPUs, controller and so on, but is there a generic guideline for calculating the 'perfect' number of concurrent repository tasks?
-
- Veteran
- Posts: 3077
- Liked: 455 times
- Joined: Aug 07, 2018 3:11 pm
- Full Name: Fedor Maslov
- Contact:
-
- Certified Trainer
- Posts: 91
- Liked: 5 times
- Joined: Jan 01, 2006 1:01 am
- Contact:
Re: repository tasks calculation
Hi wishr and thanks
I read https://bp.veeam.com/ before opening this thread and searched the forum, but I still have not found an answer.
I know that every write process consumes a task slot. So if we have a repository with per-VM backup files and we back up 10 VMs in a single job, and we have plenty of proxy cores sending all the VMDKs to the repository, then the repository (if configured with at least 10 tasks) can write up to 10 VM files at the same time. Here is my question: how can I calculate the perfect number of tasks to maximize repository performance, assuming plenty of cores and RAM are available on my repository server?
-
- Chief Product Officer
- Posts: 31814
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: repository tasks calculation
I don't know if you can calculate it, because every storage has different IOPS capacity and interface bandwidth. I think a better approach to finding the "perfect number" of tasks would be to simply monitor backup storage load metrics at the beginning of the backup window to determine the number of tasks that caps its I/O capacity and/or starts to cause excessive I/O latency. Although not every storage device will provide an IOPS monitoring dashboard, I believe you should at least always be able to measure storage access latency - and when the latency starts to spike, you will know you're at the I/O capacity.
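The monitoring approach described above can be sketched as a small calculation. This is only an illustration, not a Veeam tool: the function name, the `spike_factor` threshold, and the sample numbers are all assumptions made up for the example. It takes latency measurements collected at different task counts and returns the highest task count reached before latency spikes relative to the lightest load.

```python
def find_saturation_point(samples, spike_factor=2.0):
    """Return the highest task count measured before latency spikes.

    samples: list of (concurrent_tasks, avg_write_latency_ms) pairs.
    A "spike" is assumed to mean latency above spike_factor times the
    latency measured at the lowest task count (an arbitrary threshold).
    """
    if not samples:
        raise ValueError("at least one measurement is required")
    ordered = sorted(samples)          # sort by task count
    baseline = ordered[0][1]           # latency under the lightest load
    best = ordered[0][0]
    for tasks, latency in ordered[1:]:
        if latency > baseline * spike_factor:
            break                      # latency spiked: storage is saturated
        best = tasks
    return best


# Made-up measurements, for illustration only.
measurements = [(2, 5.0), (4, 5.5), (8, 7.0), (12, 9.5), (16, 24.0), (24, 60.0)]
print(find_saturation_point(measurements))
```

With these invented numbers, latency stays under twice the baseline up to 12 tasks and jumps at 16, so 12 would be the starting value to try.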
-
- Product Manager
- Posts: 14844
- Liked: 3086 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: repository tasks calculation
Hello,
a rule of thumb is a 1.5 or 2 to 1 ratio for tasks/cores settings.
Example:
- your calculation tells you that you need 16 proxy cores (the recommended proxy core / task ratio is 1:1 or only minimal overbooking)
- that means 8 repository cores with a 2:1 ratio
- that means 16 tasks on the repository
- then see how it works and adjust if needed
Also keep in mind that the best practice guide is still wrong about per-machine / per-job chains on tasks (the responsible team is informed). https://helpcenter.veeam.com/docs/backu ... ml?ver=110 has the correct information about tasks.
For calculation, I would just go with 2:1 (or 1.5:1 if you are more conservative). Keep it simple.
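The rule of thumb above can be written down as a short calculation. This is a minimal sketch, assuming proxy tasks equal proxy cores (1:1) and repository tasks equal proxy tasks, as in the example; the function name is hypothetical, and rounding the core count up is an assumption of this sketch.

```python
import math


def repository_sizing(proxy_cores, tasks_per_repo_core=2.0):
    """Derive repository tasks and cores from the proxy core count.

    Assumes a 1:1 proxy core / task ratio and the same number of tasks
    on the repository; repository cores = tasks / ratio, rounded up.
    """
    if proxy_cores <= 0 or tasks_per_repo_core <= 0:
        raise ValueError("inputs must be positive")
    proxy_tasks = proxy_cores                  # 1:1 proxy core / task ratio
    repo_tasks = proxy_tasks                   # repository matches the proxies
    repo_cores = math.ceil(repo_tasks / tasks_per_repo_core)
    return {"proxy_tasks": proxy_tasks,
            "repo_tasks": repo_tasks,
            "repo_cores": repo_cores}


print(repository_sizing(16))        # 2:1 ratio from the example
print(repository_sizing(16, 1.5))   # more conservative 1.5:1 ratio
```

For 16 proxy cores this gives 16 repository tasks on 8 repository cores at 2:1, or 11 cores at 1.5:1. Treat the result as a starting point and adjust after observing real jobs.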
Best regards,
Hannes
-
- Enthusiast
- Posts: 29
- Liked: 3 times
- Joined: Nov 17, 2020 9:49 pm
- Full Name: Martin B
- Contact:
Re: repository tasks calculation
Hello Hannes,
You say the BP guide is wrong about how tasks are consumed. So one question: when using per-VM backup chains, is it possible that Veeam uses one repository task for every VM virtual disk (same as proxy task consumption)?
The BP guide states one repo task per VM chain, but in practice, when I test with 16 proxy tasks and 6 repo tasks available, Veeam only processes 6 virtual disks at a time (not 6 VMs at a time, limited by 16 disks in total).
Is that right?
Regards,
Tincho
-
- Product Manager
- Posts: 14844
- Liked: 3086 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: repository tasks calculation
Hello,
yes, one task is used per disk on proxy and repository (that's what the user guide states). It's the same for per-job chains or per-VM chains.
Quote from the user guide: "For every disk of VMs added to the job, Veeam Backup & Replication creates a new task."
You can see the task allocations in various ways. The easiest ones are probably these three options:
- C:\programdata\Veeam\Backup\RTS.ResourcesUsage.log (I prefer this one)
- C:\Program Files\Veeam\Backup and Replication\Backup>Veeam.Backup.Manager.exe -SHOWREPOSITORYUSAGES
- C:\Program Files\Veeam\Backup and Replication\Backup>Veeam.Backup.Manager.exe -SHOWPROXYUSAGES
Best regards,
Hannes
-
- Certified Trainer
- Posts: 91
- Liked: 5 times
- Joined: Jan 01, 2006 1:01 am
- Contact:
Re: repository tasks calculation
Hi
are you sure that one task is used per disk on the repository?
And, more importantly, are you sure that it's the same for per-job chains and per-VM chains?
If yes, what is the purpose of using a per-VM chains repository?
-
- Product Manager
- Posts: 14844
- Liked: 3086 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: repository tasks calculation
Hello,
2x yes. Did you see something different in your environment in your RTS.ResourcesUsage.log?
I have been using per-VM chains for the following use cases for many years. Maybe there are even more.
- Easier tape restore
- No 16 TB files on wrongly formatted NTFS volumes
- More performance through parallel writes to storage
- Easier job management (put more VMs in one job)
- Resource usage with SOBR
- Easy deletion of VMs from backups
- Per VM accounting
Best regards,
Hannes
PS: as some people started throwing away their sizing formulas after realizing that tasks are "per disk": everything is fine. Please keep the formulas that have been working for many years.
-
- Certified Trainer
- Posts: 91
- Liked: 5 times
- Joined: Jan 01, 2006 1:01 am
- Contact:
Re: repository tasks calculation
Sorry,
where in the user guide can I see that one task is used per disk on the repository?
-
- Product Manager
- Posts: 14844
- Liked: 3086 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: repository tasks calculation
please see link and quote in my posts above
post408745.html#p408745
post409981.html#p409981
In addition to the quote mentioned, steps 3 and 5 might be interesting.
-
- Enthusiast
- Posts: 29
- Liked: 3 times
- Joined: Nov 17, 2020 9:49 pm
- Full Name: Martin B
- Contact:
Re: repository tasks calculation
Thanks Hannes for the confirmation.
As you noted, per-VM chains have a lot of advantages over the per-job scheme. Let me add that, since every VM has its own files, if one file somehow gets corrupted, the others are not affected.
With per-VM chains, should we also assume that one repo task is NOT one write stream? If I have a VM with 3 virtual disks writing to one repo file (per VM), that is one write stream, right?
I have seen big variations in performance (or repository saturation) when some of the VMs in the backup job are not available for backup, or when the VM processing order changes. My deduction is that this alters the write pattern on the repository and thus the performance. For example:
You have 6 repo tasks and per-VM chains.
+ 10 VMs to back up (1 big one with 5 disks and the others with 1 disk each).
+ If the big VM goes first in the processing order, Veeam will start with 2 write streams (6 virtual disks in total for two VMs), with low repo load.
+ But if the big VM is not available, Veeam will start with 6 write streams (6 virtual disks for 6 VMs), increasing the load on the repository disk system, so the job takes much longer due to disk I/O saturation.
Could that be possible?
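The scenario above can be reproduced with a few lines of arithmetic. This is a simplified sketch, not how the Veeam scheduler is actually implemented: it assumes task slots are handed out disk by disk in VM processing order, and that every VM holding at least one slot is one write stream (one per-VM chain). The function name is hypothetical.

```python
def initial_write_streams(vm_disk_counts, repo_tasks):
    """Count the write streams active when the job starts.

    vm_disk_counts: number of virtual disks per VM, in processing order.
    repo_tasks: repository task slots (one slot per virtual disk).
    """
    free = repo_tasks
    streams = 0
    for disks in vm_disk_counts:
        if free <= 0:
            break                    # no task slots left
        if disks <= 0:
            continue                 # nothing to back up for this VM
        free -= min(disks, free)     # the VM takes as many slots as it can
        streams += 1                 # one per-VM chain = one write stream
    return streams


big_vm_first = [5] + [1] * 9         # the 5-disk VM is processed first
big_vm_missing = [1] * 9             # the 5-disk VM is not available

print(initial_write_streams(big_vm_first, repo_tasks=6))
print(initial_write_streams(big_vm_missing, repo_tasks=6))
```

Under these assumptions, the same 6 task slots produce 2 write streams in the first case and 6 in the second, which matches the numbers in the example.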
If so, Veeam should find a way to add some kind of extra control at the repository level when per-VM is enabled. For example, add a field where users could set the "max repo write streams in per-VM mode". This way one could first set the "max repo tasks" and then the "max repo write streams", and Veeam would honor the combination of both. This could maximize storage performance (neither wasting nor saturating it).
Thanks !
-
- Product Manager
- Posts: 14844
- Liked: 3086 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: repository tasks calculation
Hello,
correct, one backup chain = one write stream to the disks.
Whether parallel streams are an advantage or a disadvantage depends on the storage. SMB / SOHO hardware often suffers from parallel writes. "Proper" block storage hardware with RAID controller caches benefits from parallel writes.
Best regards,
Hannes
-
- Enthusiast
- Posts: 29
- Liked: 3 times
- Joined: Nov 17, 2020 9:49 pm
- Full Name: Martin B
- Contact:
Re: repository tasks calculation
Hannes,
we have a Dell ME4 block storage array with 12 HDDs, connected via direct 10G iSCSI SAN with jumbo frames enabled. A proper block storage. Using the DiskSpd tool, we found that 4 write streams is the best-performing option.
As I mentioned, we use per-VM chains for all the advantages you mentioned before.
The point is: as users, we have no way in Veeam B&R to specify an additional control parameter, the maximum number of write streams.
The only control point is the repo task setting (as you confirmed, 1 task = 1 VMDK disk), and if we set it to 4:
a) it will generate 4 write streams if we process VMs with just one disk - ok.
b) it will generate fewer streams if the VMs have more disks.
But if we set so few repo tasks, jobs will take way more time to complete, and our backup resources (server, repo, proxy, etc.) will be wasted.
To compensate for that, we have to find a compromise: increase repo tasks to, for example, 8, and try different VM processing orders (to keep write streams close to 4 or 5).
But as I mentioned, when some VM backups fail, the processing order changes, and sometimes this slows down job processing. In some situations the job log says Destination is the bottleneck (when in normal operation it says 1% Destination). My deduction is that this could be due to the increase in write streams caused by more VMs being processed at the same time.
If we could set the repo tasks and also the write streams, we could set, for example, 16 tasks but limit them to no more than, for example, 4 write streams.
Perhaps it's not possible due to the Veeam architecture, but maybe someone could take it into account.
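To make the proposal concrete, here is a sketch of how such a two-limit admission rule could behave. It is purely hypothetical: no such setting exists in Veeam B&R according to this thread, and the function and parameter names are invented for illustration. VMs are admitted in processing order until either the task limit or the write stream limit is reached.

```python
def admit_disks(vm_disk_counts, max_tasks, max_streams):
    """Return the number of disks started for each admitted VM.

    vm_disk_counts: virtual disks per VM, in processing order.
    max_tasks: repository task slots (one per virtual disk).
    max_streams: hypothetical cap on concurrent per-VM write streams.
    """
    free_tasks = max_tasks
    started = []
    for disks in vm_disk_counts:
        if free_tasks <= 0 or len(started) >= max_streams:
            break                        # one of the two limits is reached
        if disks <= 0:
            continue                     # nothing to back up for this VM
        take = min(disks, free_tasks)    # disks that fit in the free slots
        started.append(take)
        free_tasks -= take
    return started


# 16 tasks, at most 4 write streams (the numbers used in the post).
print(admit_disks([1] * 9, max_tasks=16, max_streams=4))
print(admit_disks([5, 4, 4, 4, 1], max_tasks=16, max_streams=4))
```

With single-disk VMs the stream cap is what stops admission at 4 VMs; with multi-disk VMs the 16 task slots are filled while still writing only 4 streams.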
Regards.
-
- Product Manager
- Posts: 14844
- Liked: 3086 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: repository tasks calculation
Hello,
yes, re-writing the engine / scheduler would be a complex task.
In the best case, the task or stream settings would be removed from the UI completely. The software should figure it out on its own. But as mentioned before, that needs some more re-engineering.
Best regards,
Hannes