jyarborough
Veeam ProPartner
Posts: 31
Liked: never
Joined: Apr 03, 2010 2:23 am
Full Name: John Yarborough
Contact:

VeeamAgent.exe multi-threading

Post by jyarborough »

I've noticed that my backups are not running quite as fast as I had hoped. When I opened up task manager on the Veeam Backup server during a backup job, I noticed that "VeeamAgent.exe *32" was constantly hovering around 25%. Since this is a 4x vCPU virtual machine, this makes it seem to me that the VeeamAgent.exe is not multi-threading very well. I might be talking out of school so forgive me if I am overlooking something :) Without further testing I can't say for sure but I am assuming that the VeeamAgent.exe process is probably doing the deduplication checks and compression tasks which would both require a significant amount of CPU. Couldn't this be spread across multiple cores at the same time somehow? With this logic, it seems like adding cores to the server isn't necessarily going to improve performance if you are running a single job at a time. Instead, putting it on faster cores would be the only way to get improved performance.

Can someone confirm what I am seeing? Scenario is one Virtual Appliance backup job running against a couple servers with a fair amount of data (around 100GB). While the job is running the VeeamAgent.exe stays at about 25%. I then fire off a second job which has the same basic setup and I now see a second VeeamAgent.exe taking 25%. This seems like a little bit of a bottleneck because each backup seems to only be able to use 1 core at any given time but might be a known limitation. I've never really developed high end applications so this might be the "norm".

Thanks for any additional info!
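
For reference, 25% is exactly what one fully busy thread looks like on a 4 vCPU machine. A minimal sketch of that arithmetic in Python (the function names are hypothetical, chosen only for illustration):

```python
def single_thread_ceiling(logical_cpus: int) -> float:
    """Percent of total CPU that one fully busy thread shows in Task Manager."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return 100.0 / logical_cpus


def busy_cores(total_cpu_percent: float, logical_cpus: int) -> float:
    """Convert a whole-machine CPU percentage into 'cores worth' of work."""
    return total_cpu_percent / 100.0 * logical_cpus


if __name__ == "__main__":
    print(single_thread_ceiling(4))  # one saturated thread on 4 vCPUs
    print(busy_cores(25, 4))         # one job at 25%
    print(busy_cores(50, 4))         # two jobs at 25% each
```

A process pinned at this ceiling is consistent with one hot thread, but it does not by itself say whether that thread is busy compressing or mostly waiting on storage.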
TrevorBell
Veteran
Posts: 364
Liked: 17 times
Joined: Feb 13, 2009 10:13 am
Full Name: Trevor Bell
Location: Worcester UK
Contact:

Re: VeeamAgent.exe multi-threading

Post by TrevorBell »

Hi,

I have just done a test backup on my Win2k8 VM Veeam server. I then ran perfmon, and I can see that it does fluctuate between 1 and 4 cores, with 29 threads.

Are you using Server 2003 or 2008?

thanks

Trev
jyarborough

Re: VeeamAgent.exe multi-threading

Post by jyarborough »

Just using Task Manager's Performance tab, I do see it going across all the cores, however it still hovers at 25%. I do see it jump up occasionally, but I think 30% was the highest I saw. My server is Windows Server 2008 R2 Standard with 4 vCPU and 4GB of RAM. Maybe multi-threading is not the correct terminology, but either way it does not seem to be utilizing all the available resources on my server.

Looking at perfmon a little, specifically at "Processor\% Processor Time" and "System\Processor Queue Length", I noticed a couple of things. With one job running, % Processor Time consistently stays at 25% with minor fluctuations up to about 30%. With two jobs running it goes up to 50% and fluctuates up to just under 60%. With Processor Queue Length, I am seeing an average queue depth of 0.6 while 2 backups are running.
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: VeeamAgent.exe multi-threading

Post by Vitaliy S. »

Hello John,

Try using the Best compression option, which requires more CPU resources. In that case you should see higher CPU utilization and better performance rates.

Thank you!
jyarborough

Re: VeeamAgent.exe multi-threading

Post by jyarborough »

For further clarification, I added "Thread\% Processor Time\All Instances" to perfmon and cleaned it up a little to only display VeeamAgent. I am also seeing 29 threads and it seems like thread 4 (specifically 4#2) is the one that is maxing it out. One curious thing is that I would expect to see two of these threads going crazy, like 4#1 and 4#2 since I have two backups running at the same time but I only see one instance with this type of activity.
jyarborough

Re: VeeamAgent.exe multi-threading

Post by jyarborough »

Thanks for all the replies, I guess you guys are insomniacs too? :D Vitaliy, I did try Best compression and am seeing the exact same thing: VeeamAgent.exe *32 is averaging 25%. I also looked at it with 2 other jobs running and I had three instances of VeeamAgent.exe, each running at 25%. I'm thinking about trying to disable inline dedupe to see if that might help.
Gostev
Chief Product Officer
Posts: 31457
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: VeeamAgent.exe multi-threading

Post by Gostev »

The issue here looks to be production and/or backup storage speed.

Below is a 2 vCPU VM trying to handle the data an FC SAN throws at it, while backing up to another SAN LUN.
Very good multithreading, as you can see ;)

Image
bc07
Enthusiast
Posts: 85
Liked: never
Joined: Mar 03, 2011 4:48 pm
Full Name: Enrico
Contact:

Re: VeeamAgent.exe multi-threading

Post by bc07 »

How about multi-threading in a sense of multiple data streams?

At the moment Veeam does not seem to utilize the available bandwidth through network or SAN or what the SAN storage would be capable of with multi-threading data streams (SAN as source where the VM is).
Gostev

Re: VeeamAgent.exe multi-threading

Post by Gostev »

Assuming Veeam agent is not CPU-bound, it will utilize all available bandwidth. This is according to our own testing on extremely high-end FC8 SAN primarily designed for AV production (super fast).
bc07

Re: VeeamAgent.exe multi-threading

Post by bc07 »

That may be so in your lab, but it does not answer my question about multi-threaded data streams. When I talk about bandwidth I'm not talking about the processing rate Veeam shows but the real data throughput on the network card or the SAN adapter.
On SANs like the EMC Clariion CX3-10c (which is what we have), a single data stream does not get the full bandwidth the LUN/disks can provide. I'm not talking about the theoretical bandwidth of FC. Total CPU load is at 30-50% when the job runs.
Do a test: run a full backup of a VM without any compression and deduplication, then the same with both, and watch the throughput on the NIC or SAN adapter (depending on which method you use).
jyarborough

Re: VeeamAgent.exe multi-threading

Post by jyarborough »

I finally gave up on Veeam in favor of simple VMware Data Recovery and then using SAN to SAN replication. Every time we tried to troubleshoot a performance issue we were told something like the above: "on extremely high-performance FC8 SAN primarily designed for AV production" this runs very well. No kidding! In our environment (and I would assume a lot of other environments) we do not use this kind of equipment for backups; we use NAS and iSCSI to tier 3 storage. Backup Exec, Acronis, Amanda, VMware Data Recovery, NT Backup, everything else runs fine to this, but Veeam had issues that we could not seem to overcome.

Apparently Veeam makes great products based on the awards they win, but I guess our environment is too small to use it correctly or something. As soon as I can get a dual hex-core server with FC or 10G connectivity to ultra fast storage that is dedicated to backups, maybe I will re-evaluate it.
bc07

Re: VeeamAgent.exe multi-threading

Post by bc07 »

In our case the backup target storage is not the issue; I can write to it at 80MB/s with other applications.
Gostev

Re: VeeamAgent.exe multi-threading

Post by Gostev »

bc07, you can get full data throughput if you properly tune your iSCSI storage network; please refer to the sticky FAQ topic for more info. Here is the throughput screenshot posted by one of the users, for example. If you have an FC SAN, then reportedly some things worth trying are: updating HBA drivers, updating MPIO software, or removing MPIO software altogether (don't forget to reboot). In our testing we did not use MPIO, because it was known to cause performance issues starting from the early VCB days.

Please appreciate that we cannot get the data faster than the SAN is able to give it to us. As our FC8 SAN no-MPIO testing (and the end user screenshot above) shows, there are really no bottlenecks in our engine. However, we have no control over the actual SAN storage or the connection to it. Our product operates on a higher level, so there is nothing we can do from our side if the source storage connection is not properly tuned.

jyarborough, if you do not have good hardware for your Veeam server, then you should change the compression level in the advanced job settings to Low. This will give you backup performance identical to VMware Data Recovery (assuming you have installed Veeam Backup in a VM, are using the virtual appliance processing mode, and are backing up to the Veeam Backup VM's virtual disk like VMware DR does). It is true that the default values are tuned for decent multi-core hardware (a physical proxy server). In your case, though, the issue is likely the target storage performance. Anyway, I hope VMware DR will work well for your needs. Thanks for evaluating us.
bc07

Re: VeeamAgent.exe multi-threading

Post by bc07 »

The EMC SAN our VMs run from is connected with 4Gbit Fibre Channel. The target (backup server) has direct attached storage (SAS) and has a QLogic HBA installed. We don't use iSCSI for the VMs or the backup storage. The HBA has the latest firmware and drivers, and so do the SAN and switches. EMC PowerPath is installed, but it is not using multipath to the Clariion system.
If I use NBD as the backup method I get almost the same performance/throughput, maybe 2MB/s faster.

And you are still not answering my question about multi-threaded data stream or if Veeam is doing one or multiple data streams (similar to copying several files at the same time from the same storage).
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: VeeamAgent.exe multi-threading

Post by tsightler »

bc07 wrote:And you are still not answering my question about multi-threaded data stream or if Veeam is doing one or multiple data streams (similar to copying several files at the same time from the same storage).
I'm just a customer, but I'll provide my feedback here. As far as I can tell, Veeam does not perform multi-threaded data stream copies for a single job. We work around this by running multiple simultaneous jobs. It takes us about 4 jobs to hit 200MB/sec.

I'm not sure that vStorage API even supports multiple streams for a single VM, but I, and many other users, have asked for the ability to backup multiple VM's from a single job and this request has generally been panned by Veeam. I don't think they really understand that not all storage systems will provide their full performance from a single threaded read process (well, actually, they might, but only with massive read-ahead). With our older Equallogic SATA arrays we can't get more than 50-60MB/sec from a single backup, but if I run three simultaneous backup jobs I can push 150-180MB/sec so it's quite obvious the array is capable of pushing that much data, that my Veeam server is capable of processing that much data, and that my backup storage can handle that much data, but it requires three jobs to top out. This is one of the few places that vRanger (the product we came from) was a clear winner over Veeam.
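
The scaling described above can be put into a very rough model: each job is capped at some per-stream rate, and the total is capped by whatever the array or proxy can deliver overall. The sketch below is only an illustration; the 55MB/s per-stream figure is a midpoint of the 50-60MB/s quoted above, and the 200MB/s overall limit is an assumption, not a measured ceiling.

```python
import math


def aggregate_mb_per_s(jobs: int, per_stream_mb_s: float, overall_limit_mb_s: float) -> float:
    """Total throughput when every job is capped per stream and the sum is capped overall."""
    if jobs < 0:
        raise ValueError("jobs must not be negative")
    return min(jobs * per_stream_mb_s, overall_limit_mb_s)


def jobs_needed(target_mb_s: float, per_stream_mb_s: float) -> int:
    """Smallest number of parallel jobs whose combined streams reach the target rate."""
    return math.ceil(target_mb_s / per_stream_mb_s)


if __name__ == "__main__":
    PER_STREAM = 55.0      # assumed midpoint of the 50-60MB/s single-job rate
    OVERALL_LIMIT = 200.0  # hypothetical ceiling for array + proxy + target
    for jobs in range(1, 6):
        print(jobs, aggregate_mb_per_s(jobs, PER_STREAM, OVERALL_LIMIT))
    print("jobs for 200MB/s:", jobs_needed(200.0, PER_STREAM))
```

Real arrays rarely scale this linearly, so treat the output as an upper bound on what adding jobs could buy.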
Gostev

Re: VeeamAgent.exe multi-threading

Post by Gostev »

tsightler wrote:I don't think they really understand that not all storage systems will provide their full performance from a single threaded read process
No no, we definitely do understand this!
bc07

Re: VeeamAgent.exe multi-threading

Post by bc07 »

With all the hassle I have had so far trying to fit all VM backups into my backup window, I'm starting to think more and more about checking out PHDVirtual or vRanger.
Veeam has some nice features (compared to the competition products) but if I cannot get the backup done in time they are worthless.
Gostev

Re: VeeamAgent.exe multi-threading

Post by Gostev »

If a single job does not saturate your SAN, why not just run your jobs in parallel like Tom does?
bc07

Re: VeeamAgent.exe multi-threading

Post by bc07 »

The physical server where I'm running Veeam now has only one CPU with 4 cores. When the backup job is running, the CPU load is between 35% and 80% depending on which VM it is backing up and on which SAN LUN the VM is located. I could run two jobs at the same time but that would certainly max out the CPU. I don't know what impact a maxed-out CPU has on the backups (reliability and performance). I already moved the Exchange backup to a VM (Win7) as a Virtual Appliance, because a full Exchange backup has a processing speed of only 16MB/s no matter which mode I use (SAN, vApp, NBD) or where I put it, yet it consumes a lot of CPU.
It is really time consuming and hard work to find the best way to back up our environment, because some VMs are slower for different reasons and I have to find a way to separate them. It looks like I'll end up having the physical server back up the smaller VMs and then having vApps back up the bigger VMs separately (one vApp for each VM), like the database server, file server and Exchange server. This adds some complexity.
Gostev

Re: VeeamAgent.exe multi-threading

Post by Gostev »

I see. For weaker backup proxy servers, you can always consider reducing the compression level to Low, as most of the CPU load is produced by compression. I recently happened to do some performance testing on a very similar configuration (single Intel Q6600 CPU, also 4 cores), so I can give you some idea of the exact numbers. This was in direct SAN access mode with very fast storage (not a bottleneck). With the default Optimal compression level, full backup speed was around 50MB/s (CPU usage maxed out). After switching to the Low compression level, full backup speed went to 166MB/s (CPU maxed out) for the same VM. This was a single job, under the worst possible conditions (virtual disks had randomly generated content, so no dedupe, which otherwise helps performance noticeably).

Of course, Low compression level means significantly larger backups though.
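
To put the two rates above into backup-window terms, here is a small sketch of the arithmetic. The 100GB size is only an example, 1GB is taken as 1024MB, and the model assumes the proxy CPU is the only bottleneck, which is an assumption rather than something the test above guarantees for other environments.

```python
def transfer_seconds(size_gb: float, rate_mb_s: float) -> float:
    """Seconds needed to process size_gb at a steady rate, with 1GB = 1024MB."""
    if rate_mb_s <= 0:
        raise ValueError("rate_mb_s must be positive")
    return size_gb * 1024.0 / rate_mb_s


def speedup(old_rate_mb_s: float, new_rate_mb_s: float) -> float:
    """How many times faster the new rate is than the old one."""
    return new_rate_mb_s / old_rate_mb_s


if __name__ == "__main__":
    SIZE_GB = 100.0  # example size, not from the test above
    for label, rate in (("Optimal", 50.0), ("Low", 166.0)):
        minutes = transfer_seconds(SIZE_GB, rate) / 60.0
        print(f"{label}: {minutes:.1f} min at {rate:.0f}MB/s")
    print(f"speedup: {speedup(50.0, 166.0):.2f}x")
```

The trade-off is the one already stated: the faster run produces a larger backup file, so the target storage has to absorb more data.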
bc07

Re: VeeamAgent.exe multi-threading

Post by bc07 »

If the issue were only the CPU, that would be easy to fix. But in our environment the CPU load during backup (never maxed out with a single job) differs between VMs and between the LUNs they are located on, even on the same storage. That brings me back to the point I have mentioned: the EMC CX3-10c (where most of our VMs are stored) is designed for multi-threaded data streams, meaning applications that do not process byte by byte (or block by block) in a serial fashion. The system shines when you have multiple data streams at the same time.
Thus to max out the SAN I have to run multiple jobs at the same time, but the CPU performance is not there to support that on one system, and when I lower the compression the backup is slower (lower processing rate). The question is why the CPU load is sometimes only 30% and sometimes 80% when processing VMDKs.
tsightler

Re: VeeamAgent.exe multi-threading

Post by tsightler »

Why do you think that having a single job with multiple threads would be less CPU intensive than running two separate jobs? I think it would be virtually identical. The biggest advantage to have a multi-threaded architecture would be that all of the backups could use the same VBK file for maximum dedupe.

Our backup proxy is tremendously underpowered (an old dual CPU hyperthreaded system) and we still get tons of advantages with multiple jobs, mainly because, with CBT incremental runs, so much of the backup is "downtime" (i.e. time not spent transferring data but rather taking snapshots, indexing systems, freezing systems, and removing snapshots). We also offload some of our work because we back up to Linux servers.

I'd love for Veeam to support a "multithreaded" option for a single backup job, but it hasn't been a showstopper for us. With vRanger we had to be careful because running multiple threads would saturate the SAN and cause performance issues for the applications during the backup window (we're a 24x7 shop).
tsightler

Re: VeeamAgent.exe multi-threading

Post by tsightler »

Gostev wrote: No no, we definitely do understand this!
Well, I'm not sure this shows complete understanding since you believe it has to do with "tuning" or "older systems" but I guess I'll accept that you at least understand the concept.
Gostev

Re: VeeamAgent.exe multi-threading

Post by Gostev »

Well, you have just used the same words ;)
tsightler wrote:With our older Equallogic SATA arrays we can't get more than 50-60MB/sec from a single backup
It is also true that in many cases tuning the iSCSI network makes a single full backup job saturate your storage link, just like it happened to Joerg, who went from the most typical 40-45MB/s to 90MB/s on pretty regular EQL storage.

By the way, Joerg also did some cool testing with 10Gb EQL using NBD mode and a monster 4 x 6 core proxy server, and got over 500MB/s full backup performance (numbers from the network monitor). There was a very sharp and clear limit around this value, which we assumed was a service console reservation/limitation. Nevertheless, this again only proves that our engine does not really have any bottlenecks, and can do 500MB/s in a single thread if source and target are fast enough.

To me, the biggest benefit of parallel jobs comes from the fact that during certain periods of time (especially during an incremental backup pass), storage does not do anything at all (for example, while waiting for VSS freeze to complete, or for VM config files to be backed up). This is mostly why it makes sense to process more than one VM in parallel, using parallel jobs. And it does make sense for us to add parallel processing within the same job (fewer jobs = easier management, better dedupe).
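
The idle-time argument can be sketched with a toy model. Each VM is described by a pair of hypothetical numbers: seconds during which storage sits idle (snapshot, VSS freeze, config files) and seconds of actual data transfer. Run serially, every phase adds up; run in parallel, the idle phases can overlap, so the best case is bounded by total transfer time or by the longest single VM, whichever is larger. This is a best-case estimate under those assumptions, not a description of how any scheduler actually behaves.

```python
from typing import List, Tuple

# (idle_seconds, transfer_seconds) per VM; all values are made up for illustration
VmPhases = Tuple[float, float]


def serial_seconds(vms: List[VmPhases]) -> float:
    """Wall time when VMs are processed strictly one after another."""
    return sum(idle + transfer for idle, transfer in vms)


def parallel_lower_bound_seconds(vms: List[VmPhases]) -> float:
    """Best-case wall time when idle phases overlap and transfers share one link."""
    if not vms:
        return 0.0
    storage_busy = sum(transfer for _, transfer in vms)
    longest_vm = max(idle + transfer for idle, transfer in vms)
    return max(storage_busy, longest_vm)


if __name__ == "__main__":
    incremental_pass = [(120.0, 60.0), (120.0, 60.0), (120.0, 60.0)]
    print("serial:", serial_seconds(incremental_pass))
    print("parallel, best case:", parallel_lower_bound_seconds(incremental_pass))
```

The shorter the transfer phase relative to the idle phase, as on an incremental pass, the more parallel processing helps in this model.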
bc07

Re: VeeamAgent.exe multi-threading

Post by bc07 »

Tom, a single job with multi-threaded data streams would not use less CPU, but it would make things easier to set up and, as you said, it would go into the same dedupe store. With lowered compression at the same time it would also be faster (I know I can lower the compression now, but that would also make the backup per job slower).

Anton, that only shows that Veeam runs well with that storage system and that environment. Every storage system vendor manages IO requests coming from the host differently, and if you have the option you can optimize the storage system for different types of access/applications. Backup from DAS to DAS is probably the fastest way to back up, and Veeam would shine there.
But Veeam seems to care only for customers with storage systems that are optimized/configured for single-thread applications.
Not every customer knows the differences between single- and multi-thread data stream applications when they buy a new storage system (like we did). The EMC Clariion CX is optimized for multi-thread applications and designed to serve multiple hosts at the same time very well. Unfortunately there is no option for the customer to tune the system so that single-thread applications work better. Go ask another customer with a Clariion CX how a single backup job is doing, especially from a RAID group with more than 6 drives.
I read yesterday in the release notes of the CX firmware that single-thread applications cannot take advantage of more than 4 drives of any RAID group.
When I run IOmeter on this RAID group with multiple threads/workers I get much more throughput than with Veeam. That means the storage system is not slow in general, but Veeam is not optimized for it.
Or does Veeam think that customers who have performance issues with Veeam and their storage system will just buy another storage system?
Gostev

Re: VeeamAgent.exe multi-threading

Post by Gostev »

No. Reading back through my earlier response on this matter (referenced above), we do agree that this feature is needed because it will help with certain SAN storage devices, and we are looking to add this capability to the product down the road. Right now, owners of such storage systems can address their SAN design specifics by running multiple jobs in parallel... we definitely do not want to ask you to buy new storage ;)
tsightler

Re: VeeamAgent.exe multi-threading

Post by tsightler »

Gostev wrote:Well, you have just used the same words ;) It is also true that in many cases tuning iSCSI network makes single full backup job saturate your storage link, just like it happened to Joerg, who went from most typical 40-45MB/s to 90 MB/s on pretty regular EQL storage.
I used "older" because, out of our current storage systems, these units display the most pronounced version of this behavior. However, I've never owned a SAN that didn't have this behavior. SANs are generally tuned to provide minimal latency to multiple simultaneously active clients; you don't want a single client performing a large copy to have a detrimental effect on the dozens of other clients that might be accessing the storage.

You can generally get near top performance for sequential reads by performing very aggressive read-ahead, but for more random access, multiple threads are generally the only way to achieve maximum performance from any well designed SAN. Sure, a SAN with no real competition for access might be able to achieve near full bandwidth, but my SAN is generally running around 30-50% IOP load even before backups start. Without multiple threads I won't get the best throughput.
Tommi
Enthusiast
Posts: 54
Liked: never
Joined: Sep 23, 2010 12:12 pm
Full Name: Tommi Wassini
Contact:

Re: VeeamAgent.exe multi-threading

Post by Tommi »

I just wanted to throw in my 2 cents here.

I made a post some time ago about the same problem.

VeeamAgent.exe only runs at 25%.

While writing this, I have 2 jobs running, and both VeeamAgent.exe processes run at 25%, not more, not less.
My storage is almost idle, working at 6%.
I think it is strange that VeeamAgent.exe (in some cases) just "freezes" at 25%.
I have compression at Best too.
Stranger still, I HAVE SEEN VeeamAgent run higher, but that was with another job with other settings (normal compression).
Somehow Veeam doesn't "like" certain settings.

Just my feedback
Tommi
bc07

Re: VeeamAgent.exe multi-threading

Post by bc07 »

Veeam reads and writes data single-threaded. I ran some tests over the past few days and could reproduce the performance I get from Veeam (when doing a full backup) in IOmeter with Outstanding IO set to 1, a large disk size, and write cache disabled.
I tried to configure prefetch (read-ahead settings) on our Clariion, but I don't know which settings would be best.

That is a big issue when doing full backups of very big VMs (with more than 1TB of actual data), because you can run only one job per VM at a time.
It would help if Veeam would read/write multiple blocks at the same time, or maybe process more than one vmdk file of the same VM at the same time. Sure, you would need more processing power to keep the compression level high, but it is probably easier and cheaper to get the processing power (as a VM or physical server) than to mess with the storage system or buy another one.
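
As a generic illustration of "reading multiple blocks at the same time" (this is not how Veeam or the vStorage API works, just a sketch of the idea against an ordinary file), the Python below keeps several block reads in flight with a thread pool and reassembles them in order. The block size and worker count are arbitrary illustrative values.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from functools import partial

BLOCK_SIZE = 1024 * 1024  # 1 MiB per read; arbitrary illustrative value


def read_block(path: str, index: int) -> bytes:
    """Read one block by index through its own file handle."""
    with open(path, "rb") as handle:
        handle.seek(index * BLOCK_SIZE)
        return handle.read(BLOCK_SIZE)


def read_file_parallel(path: str, workers: int) -> bytes:
    """Keep up to `workers` block reads in flight; map() returns them in order."""
    size = os.path.getsize(path)
    block_count = (size + BLOCK_SIZE - 1) // BLOCK_SIZE
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return b"".join(pool.map(partial(read_block, path), range(block_count)))


if __name__ == "__main__":
    payload = os.urandom(3 * BLOCK_SIZE + 123)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        sample_path = tmp.name
    try:
        print(read_file_parallel(sample_path, workers=4) == payload)
    finally:
        os.remove(sample_path)
```

With `workers=1` this degenerates to the one-outstanding-IO pattern described above, which makes it a convenient way to compare the two access patterns against the same storage.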
tsightler

Re: VeeamAgent.exe multi-threading

Post by tsightler »

I don't know if the vStorage API supports it, but simply having more outstanding IOs queued would probably be of some help. I think Veeam at one time did do multi-block reads but disabled this due to a few corner cases that led to silent corruption of backups, but it's been a while so I may not be remembering that quite right.
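
Why more outstanding IOs would help follows from Little's law: throughput is roughly the number of requests in flight times the block size, divided by per-request latency. A minimal sketch, with a hypothetical 256KB block size and 5ms latency that are not measurements from any array in this thread:

```python
def throughput_mb_s(outstanding_ios: int, block_kb: float, latency_ms: float) -> float:
    """Upper-bound throughput if latency stays constant as queue depth grows."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    ios_per_second = outstanding_ios * 1000.0 / latency_ms
    return ios_per_second * block_kb / 1024.0


if __name__ == "__main__":
    BLOCK_KB = 256.0   # hypothetical read size
    LATENCY_MS = 5.0   # hypothetical per-request latency
    for depth in (1, 2, 4, 8):
        print(depth, throughput_mb_s(depth, BLOCK_KB, LATENCY_MS))
```

In practice latency rises as the queue deepens and the array or link eventually saturates, so the linear scaling here only holds at low queue depths.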