Concurrent Tasks

Post by **joost1981** » Dec 02, 2011 2:59 pm

Hi there,

I have a question about concurrent tasks. We have one backup job which holds about 75 VMs. In version 5 it starts processing them one by one.
I was hoping that the new feature "concurrent tasks" would start processing multiple VMs at a time (from within one backup job). Is this not how "concurrent tasks" is designed, or did I just misunderstand this new feature? Or do we have a problem with our current setup?

Thank you in advance.

Cheers,
Joost Poulissen

Post by **Gostev** » Dec 02, 2011 3:08 pm

Hi Joost, within the job, each VM is processed sequentially according to the defined processing order. Each proxy can run multiple jobs at once; this is what this setting affects. Thanks!
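The behaviour described here (jobs run side by side, VMs inside a job run one after another) can be pictured with a small sketch. This is not Veeam code: `run_job`, `run_jobs` and `max_concurrent_tasks` are hypothetical names, and a thread pool merely stands in for the proxy's task slots as they are described in this thread.

```python
from concurrent.futures import ThreadPoolExecutor


def run_job(job_name, vm_names):
    """Process the VMs of one job strictly in their defined order."""
    processed = []
    for vm in vm_names:
        # Placeholder for the real work (snapshot, data transfer, commit).
        processed.append(vm)
    return job_name, processed


def run_jobs(jobs, max_concurrent_tasks):
    """Run up to `max_concurrent_tasks` jobs at once; each job takes one slot."""
    with ThreadPoolExecutor(max_workers=max_concurrent_tasks) as pool:
        futures = [pool.submit(run_job, name, vms) for name, vms in jobs.items()]
        return dict(future.result() for future in futures)


if __name__ == "__main__":
    # Hypothetical job layout: two jobs can overlap, VMs inside each cannot.
    jobs = {"job-a": ["vm1", "vm2", "vm3"], "job-b": ["vm4", "vm5"]}
    print(run_jobs(jobs, max_concurrent_tasks=2))
```

In this picture, raising `max_concurrent_tasks` does nothing for a single large job, because that job occupies only one slot.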

Post by **joost1981** » Dec 02, 2011 6:59 pm

Hi Gostev,

Thank you for your answer. The reason we have only one job is that we back up the complete VMware cluster. When a new VM is created in VMware, it is automatically backed up by Veeam.

Is this a feature worth implementing in a new version?

Thank you again!

Cheers,
Joost Poulissen

Post by **Gostev** » Dec 02, 2011 9:01 pm

Hi Joost, sure - everything is possible depending on demand. I can certainly see how this feature can be useful for monster jobs; however, customers typically prefer to create smaller jobs (to keep the backup sizes manageable) and run them in parallel - this is the use case v6 is tuned for.

I am not sure why the fact that you need to back up a vSphere cluster forces you to put all VMs in the same job. Most customers seem to prefer to organize jobs based on VM folders instead of infrastructure objects for added flexibility. Have you considered using VM folders for backup jobs?

Also, unlike previous versions, v6 should be able to fully load your backup infrastructure with sequential processing of VMs in the job, so you should be seeing massive improvements over v5 anyway.

Thanks!

Post by **joost1981** » Dec 02, 2011 10:55 pm

Hi Gostev,

Thank you for explaining this. Of course, using VM folders for backup is an option for us; we can redesign the jobs.
However, in the future our environment is expected to grow exponentially (to about 6 to 8 backup proxies and 400 to 500 VMs), and then running multiple VM backups simultaneously from one job, instead of creating a huge number of jobs, would be handy. Of course, it is not a requirement!

Thank you again for your help!

Joost.

Post by **mgambetti** » Dec 05, 2011 4:40 pm

So let me recap to check that I have understood correctly: if I need to run more parallel backups of VMs inside the SAME job, is the ONLY solution to set more concurrent tasks?
Or is the other solution to split the VMs across more jobs and use more proxies? Am I right?
thanks.

Post by **foggy** » Dec 05, 2011 4:50 pm

Please note that task = job here. So the only way to get more balanced backups is to split jobs into smaller ones, as VMs are processed sequentially within a job.

Post by **averylarry** » Dec 05, 2011 7:31 pm

Is there a reason for the sequential processing? Even if it only helps a little bit, I have to think it would be more efficient to process multiple VMs simultaneously. There's no way a sequential method will keep all phases (source, destination, target, CPU, ram, network, WAN) saturated.

On the other hand, if it messes with deduplication or compression or something, then I would understand. But I would argue that some people (myself included) have created smaller jobs EXCLUSIVELY for parallel processing.

Post by **andrewilmann** » Dec 05, 2011 7:44 pm

+1

We are about to split up our backup job just because of this too!

Post by **jamessa** » Dec 05, 2011 9:57 pm

+1 here too. This problem for us causes backup jobs to run for a long time when it hits a new 300GB VM. I know the answer could be separate jobs by folders or resource pools, but with multiple people working on our team VMs could and do get moved around between these quite often, which causes them to get duplicated in backups. This is a major problem for us that keeps us from doing multiple jobs per cluster.

Post by **Gostev** » Dec 06, 2011 12:08 am

averylarry wrote: There's no way a sequential method will keep all phases (source, destination, target, CPU, ram, network, WAN) saturated.

Thing is, you do not need to have ALL phases saturated - as soon as the "weakest" phase is saturated, it will become the bottleneck for the job - and as soon as this happens, it no longer matters that other phases are not saturated. Because there is a bottleneck for the job already, it is pointless to start processing of another VM in parallel within the same job.

The ONLY benefit of parallel processing within the job is that while one VM is still preparing for processing, another one can be processed. We are talking about 20-30 seconds of time that is otherwise simply "lost" with the sequential method. However, the price you would have to pay for these 20-30 seconds is that each VM lives on its snapshot twice as long (because with 2 VMs being processed in parallel, processing time for each one doubles), which in turn directly affects snapshot size, and so the snapshot commit process that impacts the actual VMs. This is exactly why finishing processing of every VM as fast as possible is so important.
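This trade-off can be put into numbers with a toy model. The sketch below is not Veeam code. It assumes a single shared bottleneck that one VM already saturates, equally sized VMs, a fixed preparation delay per batch, and a snapshot that stays open for exactly the duration of the data transfer. The function name, the sizes, the throughput and the 30-second preparation time are all made up for illustration.

```python
def job_times(n_vms, size_mb, bottleneck_mb_s, prep_s, concurrency):
    """Return (total job time, per-VM snapshot lifetime), both in seconds.

    VMs are processed in batches of `concurrency`. The VMs in a batch
    prepare together and then share the bottleneck throughput equally.
    """
    assert n_vms % concurrency == 0, "toy model: batches must be full"
    batches = n_vms // concurrency
    per_vm_rate = bottleneck_mb_s / concurrency  # fair share of the bottleneck
    snapshot_s = size_mb / per_vm_rate           # time each VM spends on its snapshot
    total_s = batches * (prep_s + snapshot_s)
    return total_s, snapshot_s


if __name__ == "__main__":
    # Two 100 GB VMs, a 100 MB/s bottleneck, 30 s of preparation.
    print("sequential:", job_times(2, 100_000, 100, 30, concurrency=1))
    print("parallel:  ", job_times(2, 100_000, 100, 30, concurrency=2))
```

With these made-up numbers the parallel run finishes 30 seconds earlier, but each VM's snapshot stays open for 2000 seconds instead of 1000. The model only holds while a single VM really does saturate the bottleneck.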

The devil is always in details - just because something sounds nice, it does not mean it will work nice in all aspects.

Please note however, that everything I said above applies to v6 mostly, as the previous versions were not nearly as good with saturating backup infrastructure with a single sequential job.

Post by **averylarry** » Dec 06, 2011 2:35 am

I just spent 20 minutes writing a response that was disjointed.

Let me just say this -- I'm a small business. I can't afford fancy fast storage. More than anything else -- I think my biggest point is simply -- if I have multiple proxies, why can't I use more than 1 at a time? I use "poor man's aggregation", where I have separate LUNs on my SAN that don't share iSCSI network ports. I have iSCSI ports and LUNs that aren't even getting used because it won't process multiple VMs at the same time.

More jobs? Doesn't that hinder deduplication? Not to mention increasing the administration and complexity for a small business.

Just for the record -- I still love Veeam, and in particular v6. Sometimes I think small businesses (and small budgets) are somewhat forgotten.

Post by **Gostev** » Dec 06, 2011 10:52 am

Point taken.

Post by **jamessa** » Dec 06, 2011 2:27 pm

Very interesting topic and it looks like there are pros and cons for both. I think in future versions there should be a way to allow it, maybe not a default option but one that is an advanced setting. We are also concerned about things like snapshots taking too long, but if we throw enough horsepower at the Veeam server and our backup target is fast enough it seems like a good option to let the more technically inclined people be able to tweak backups to run more than one at a time if they so wish.

JM2C

And btw Gostev, thanks for your participation in these forums!

Post by **vluzhkov** » Dec 19, 2011 1:14 pm

Gostev wrote: Thing is, you do not need to have ALL phases saturated - as soon as the "weakest" phase is saturated, it will become the bottleneck for the job - and as soon as this happens, it no longer matters that other phases are not saturated. Because there is a bottleneck for the job already, it is pointless to start processing of another VM in parallel within the same job.

You are making one main assumption: that the backup of a single machine will saturate at least one phase in every case. That is not right at all. As was said above, low-budget installations may have several different storage targets. High-budget installations have storage and network multipathing, many independent datastores (even Storage DRS-managed ones in vSphere 5), and high-performance networks and storage which, with CBT, can make data transmission time even shorter than the time spent preparing a VM for backup.
When you are trying to prevent saturation of specific phases, you use phase-specific concurrency limits - for proxy servers, for repository servers. These methods are already implemented. Really, I was very surprised, after reading the pre-v6 information, that backup jobs still run sequentially. And really disappointed. Even the free VMware Data Recovery has concurrency as one of its base features.
Nowadays, despite all the new features of v6, we still can't optimize backup performance without splitting jobs - the concurrency features are almost useless.
Jobs should not be presented as a method of managing backup queues - a new instance of any object should be created only when its properties differ from existing ones. For jobs these are storage policies (expiration, incremental/reverse incremental, etc.), application processing settings, schedule, proxy/destination - all the settings you specify when creating the job. There is no reason to create different jobs with the same settings, except for software limitations, which force you to manually micromanage things that should not require management at all, and add a huge overhead when you have to continually inspect job logs to distribute VMs.
Traditional backup software sometimes takes a sequential approach to processing data because it is tape-oriented - and again, that is not Veeam's case.
I usually don't write suggestions in support forums, as it has little to no effect, but in the case of Veeam, which I think is not a very big company, I hope these opinions can somehow reach the decision makers at Veeam...

Post by **tgiphil** » Dec 19, 2011 4:27 pm

Another +1.

Post by **sipo75** » Jan 16, 2012 5:13 pm

+1

Saturation? After five days of reading through the manual and the dozens of related posts I am still stuck with 3MB/s with v6 both with "Virtual Appliance with hot-add" and "Network" for initial backups.

If I could "thread" a job I could work around this issue.

Post by **mysidia** » Jan 19, 2012 3:20 pm

Gostev wrote: Thing is, you do not need to have ALL phases saturated - as soon as the "weakest" phase is saturated, it will become the bottleneck for the job - and as soon as this happens, it no longer matters that other phases are not saturated. Because there is a bottleneck for the job already, it is pointless to start processing of another VM in parallel within the same job.

I would like to see more concurrency in the Veeam backup processing. In my case, I am backing up large VMs with multiple separate VMDKs. On these VMs, each VMDK is on an entirely different storage array, for the purpose of balancing the load.

What happens is that the backup server's load is very low and the source is the bottleneck: each storage array tops out at around 30 MB/s of throughput under the load it already has. I have multiple source proxies, but Veeam nonetheless processes the job VMDK by VMDK.

This serial processing is highly inefficient. If the backup software efficiently processed several of the Virtual machine's virtual disks in parallel, instead of one by one, the VM would finish being backed up much sooner.
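A rough back-of-the-envelope sketch of this case is shown below. It is an illustration only, not a description of how Veeam schedules disks. It assumes every VMDK sits on its own array that delivers about 30 MB/s (the figure quoted above), and that everything downstream (proxy, network, target) can absorb a combined rate `downstream_mb_s`, which is a hypothetical parameter. The disk sizes are made up.

```python
def serial_seconds(disk_sizes_mb, per_array_mb_s):
    """Disks are read one after another, each at its array's rate."""
    return sum(size / per_array_mb_s for size in disk_sizes_mb)


def parallel_seconds(disk_sizes_mb, per_array_mb_s, downstream_mb_s):
    """All disks are read at once, each from its own array.

    Conservative approximation: every disk gets at most an equal share of
    the downstream capacity for the whole run, even after smaller disks
    have finished.
    """
    per_disk_rate = min(per_array_mb_s, downstream_mb_s / len(disk_sizes_mb))
    return max(size / per_disk_rate for size in disk_sizes_mb)


if __name__ == "__main__":
    disks_mb = [300_000, 300_000, 300_000]  # three 300 GB VMDKs (made up)
    print("serial:  ", serial_seconds(disks_mb, 30))
    print("parallel:", parallel_seconds(disks_mb, 30, downstream_mb_s=100))
```

Under these assumptions the serial run takes 30000 seconds and the parallel run 10000 seconds. If the downstream side could only take 60 MB/s in total, the parallel figure would rise to 15000 seconds.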

Post by **tsightler** » Jan 20, 2012 2:31 am

sipo75 wrote: +1
Saturation? After five days of reading through the manual and the dozens of related posts I am still stuck with 3MB/s with v6 both with "Virtual Appliance with hot-add" and "Network" for initial backups.

While I agree that having multiple streams (one per VMDK) might be nice, in your case, 3MB/s is unbelievably slow. It would be best to attempt to figure out why this is the case. My laptop running VMs inside of nested ESX with a regular laptop drive provides, on average 10-20MB/s, and that's backing up to itself, so both reads and writes are on the same, relatively slow laptop hard disk (not an SSD). Backing up to a remote repository over my home wireless is around 15MB/s. There has to be something very, very wrong with your setup.

Post by **MartinSvec** » Apr 18, 2012 5:38 pm

Hello,

another +1 from me (Vitaliy, thanks for redirecting me to this thread). I am evaluating Veeam v6 for our new vSphere 5 cloud environment with dozens (eventually hundreds to thousands) of VMs that are relatively small and have minimal CBT changes. With a redundant 10GE iSCSI SAN, enterprise storage arrays managed by Storage DRS and a decent backup server, it's impossible for me to utilize the backup infrastructure efficiently with a number of jobs that corresponds to the logical groups of VMs. About 10-30% of VM backup time is spent waiting for quiescing, snapshots and appliance reconfigurations, and during the actual backup there are still plenty of CPU/bandwidth/storage resources available. I was really surprised that despite all the v6 scalability features, the level of concurrency is primarily limited by the number of jobs.

Although we clearly cannot back up hundreds of VMs with only one job, and so there will always be some concurrency, I still consider this limitation important when deciding whether Veeam is suitable for our environment. It must be taken into account when planning the structure of vSphere folders and when planning the capacity and partitioning of backup servers; it's necessary to ensure a uniform distribution of VMs between multiple jobs to maintain the backup window length, etc.
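The "uniform distribution of VMs between multiple jobs" mentioned above is the kind of chore that ends up being scripted by hand. A minimal sketch follows. It assumes VM size is a fair proxy for backup time (which it may not be with CBT) and uses a greedy largest-first heuristic. The function name and VM names are hypothetical, and nothing here talks to Veeam or vSphere.

```python
import heapq


def split_into_jobs(vm_sizes_gb, n_jobs):
    """Greedy split: give the largest remaining VM to the currently smallest job."""
    heap = [(0, index) for index in range(n_jobs)]  # (total size so far, job index)
    jobs = [[] for _ in range(n_jobs)]
    largest_first = sorted(vm_sizes_gb.items(), key=lambda item: item[1], reverse=True)
    for name, size in largest_first:
        total, index = heapq.heappop(heap)          # job with the least data so far
        jobs[index].append(name)
        heapq.heappush(heap, (total + size, index))
    return jobs


if __name__ == "__main__":
    # Made-up inventory, sizes in GB.
    vms = {"db01": 300, "file01": 200, "app01": 120, "dc01": 60, "web01": 40, "web02": 40}
    for number, job in enumerate(split_into_jobs(vms, 2), start=1):
        print(f"job {number}: {job} ({sum(vms[name] for name in job)} GB)")
```

The split has to be recomputed whenever VMs are added or grow, which is the management overhead this post is complaining about.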

So, if it's not an architectural issue of the backup engine, I vote for an option to enable and tune concurrency inside jobs. Let users decide which jobs - serial or parallel - are better in their environment.

Thank you

Martin Svec

Post by **acasanovanex** » Jun 16, 2012 2:22 pm

Another +1

Post by **TimJWatts** » Aug 05, 2012 1:56 pm

[merged]

Hi folks,

Sorry - probably a newbie question...

My early tests with Veeam Backup suggest that if you set up a single backup job with a lot of VMs in, then the system backs up each VM sequentially.

Is that correct?

Assuming so, presumably to get some parallelisation, you need several backup jobs, each with a subset of the VMs, running concurrently?

A single backup job with Direct SAN enabled is dumping at a claimed 50MB/s - that's about half of the theoretical max (gig link to very (as in several miles away) remote linux backup store). I suspect a couple of jobs in parallel might get closer to full link utilisation, especially if using 2 proxies running as VMs on separate ESX hosts.

Cheers!

Tim

Post by **TimJWatts** » Aug 06, 2012 2:11 pm

Thank you Mod - I did not look back this far in the search listing - but just the ticket

Post by **masonit** » Jan 28, 2013 7:56 am

Hi!

Is this something you plan to implement? For us it is a big downside that all jobs run serially. It would be a lot easier to manage the backup window with parallel processing.

\Masonit

Post by **Guido** » Feb 06, 2013 9:58 am

I have 4 proxies (hotadd) and configured one backup job for 20 VMs. The VMs are processed one by one. Is that normal? Is that because I selected "Automatic Selection" instead of "use the backup proxy servers specified below"?

How do I know how many proxy servers are used at a given time?

Post by **Vitaliy S.** » Feb 06, 2013 10:05 am

Guido wrote: How do I know how many proxy servers are used at a given time?

You can either review job session or use Veeam ONE to monitor proxy server load.

Post by **dellock6** » Feb 06, 2013 1:13 pm

Well, if you have only 1 job, it will only involve one proxy, regardless of how many of them you have in place.

Luca.

Post by **ptmartin** » Feb 11, 2013 3:03 pm

Hello All -
I have what seems to me to be a basic question, so forgive me if the answer is obvious...

When a backup job is being processed that contains multiple VMs, why is it that only one VM is processed at a time? It seems to me that if more than a single VM is processed then the backup time would be reduced... potentially substantially if several machines can be processed at the same time. If it is because of storage (that is the only reason I can think of), perhaps being able to process VMs that are on different LUNs would at least be a good option to have.

Just curious ... I cannot see anywhere in the interface for a regular backup job where I can select how many VMs are processed.
I have that option in a SureBackup job, why don't I have it in a standard backup job?

Thanks,
Paul

Post by **tscott** » Feb 14, 2013 7:48 pm

I have one VM proxy with 2 vCPUs x 2 vCores, so I have the recommended amount to run 2 concurrent tasks.

I have one backup job that is backing up my datastores with multiple VMs. When I run the job it only backs up 1 VM at a time. I'm assuming this is normal?

If so, should I have multiple backup jobs? One for each datastore? And then would I get concurrent backups?

Thanks

Post by **rbrambley** » Feb 14, 2013 8:02 pm

Yes that is normal

task = job in Veeam speak.

Running 2 jobs at the same time would be 2 concurrent tasks.

Veeam processes 1 VM at a time in a job.

If you only have 2 datastores then yes, 1 job per datastore would be a good way to go.
