Background:
Veeam was installed in our environment many years ago by a previous employee, and everything has always seemed to work fine for us until I started digging.

Adjustments:
I started with NIC configurations on the hosts and quickly turned back to the job details. I found our jobs were running in NBD mode and figured that HAD to be the problem. I experimented with Hot Add and Direct SAN, found them both to be about the same speed, but decided to stick with Direct SAN to avoid the disk mount times. I added a second NIC (vmxnet3) to my Veeam server and isolated it to the iSCSI network. I also isolated all in-guest mounted LUNs to that same adapter and have confirmed that traffic is sticking to it.
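If anyone wants to double-check the same thing on their end, a rough sketch like this (Python with the psutil package installed on the Veeam VM; the adapter names are placeholders for whatever Windows calls your NICs) shows per-adapter throughput so you can see where the traffic is actually flowing:

```python
# Rough sketch: sample per-NIC byte counters to confirm which adapter
# carries the iSCSI/backup traffic. Assumes the 'psutil' package is
# installed; the adapter names below are placeholders for your own.
import time
import psutil

ADAPTERS = ["Ethernet0", "iSCSI"]  # placeholder Windows adapter names
INTERVAL = 5                       # seconds between samples

before = psutil.net_io_counters(pernic=True)
time.sleep(INTERVAL)
after = psutil.net_io_counters(pernic=True)

for nic in ADAPTERS:
    if nic not in after:
        print(f"{nic}: not found (check the adapter name)")
        continue
    sent = (after[nic].bytes_sent - before[nic].bytes_sent) / INTERVAL
    recv = (after[nic].bytes_recv - before[nic].bytes_recv) / INTERVAL
    print(f"{nic}: {sent / 1e6:.1f} MB/sec sent, {recv / 1e6:.1f} MB/sec received")
```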
Next I changed the job Storage settings, which were set to High compression with storage optimization set for WAN target. Our original backup target repo was across a 1Gb fiber link and I guess the original guy didn't trust it. I have since built a SAN locally as the new repo, with some backup jobs still running to the old repo over the 1Gb line, so I adjusted the main jobs to Optimal compression and Local Target storage optimization. STILL NO BACKUP SPEED DIFFERENCE! I created some new jobs and found that on the first full I could hit 50-70MB/sec, which is about the cap for my target repo (7K RPM disks for now), and I was seeing routine IOPS of 1200, which is about right at full write. However, as soon as the jobs go to incrementals, they slow back down to 3-7MB/sec for the duration of the job.
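Just to put those numbers next to each other, here is a quick sanity check (rough sketch; the 64KB average I/O size is purely an assumption for illustration, not something measured out of Veeam or the storage):

```python
# Quick sanity check relating IOPS and MB/sec on the target repo.
# The 64KB average I/O size is an assumption for illustration only.
IO_SIZE_KB = 64

def mb_per_sec(iops, io_size_kb=IO_SIZE_KB):
    return iops * io_size_kb / 1024.0

def iops_needed(mb_s, io_size_kb=IO_SIZE_KB):
    return mb_s * 1024.0 / io_size_kb

print(mb_per_sec(1200))   # ~75 MB/sec -- in the ballpark of the 50-70MB/sec full
print(iops_needed(5))     # ~80 IOPS -- all that a 5MB/sec incremental would need at that I/O size
```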
Support advised disabling application-aware image processing to speed the incrementals up, which it did, but only by about 5-8MB/sec, which is not worth it for us. They turned maximum concurrent tasks up to 4; still no difference. We did, however, create a new job (the old ones were modified clones) and let it do a full. Then, on the reverse incremental, it was still able to pull 20-30MB/sec while the old original is stuck at 7MB/sec, but I don't know if that is because it is only a test job that has run an incremental just once, versus the other job that has been running for about 2 weeks.
Support basically dropped me because they couldn't figure this out, and after days of hard-core digging through this forum, blog posts, Spiceworks threads, and VMworld presentations, I had more suggestions than they did. Any help at this point would be AWESOME!
Noted Performance:
I have watched the job for Exchange and several of my other VMs sit there pushing data at 700KB/sec (NO JOKE) for over 30 minutes, while the processor is at 30%, RAM at 55%, and network at 1%! I checked in Veeam ONE and the write latency of my target is 12ms and the read latency of the source is 2ms. During this backup slowness, I decided to bench my target NAS HARD to see if there was something deeper going on behind the scenes. Nope: a 4GB read/write test over iSCSI reported 100MB/sec read and 92MB/sec write WHILE the backup was stalling. WHAT GIVES?
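For anyone who wants to run a similar sequential test against their own target, a rough sketch like this does a 4GB write and read and reports MB/sec (the path is a placeholder for a volume on the iSCSI-mounted target; note that OS caching can flatter the read pass unless the test file is larger than RAM):

```python
# Rough sequential read/write bench against the target volume.
# The path below is a placeholder for the iSCSI-mounted drive.
# Note: the read pass can be flattered by OS caching unless the
# test file is larger than available RAM.
import os
import time

TEST_FILE = r"E:\bench.tmp"      # placeholder path on the target volume
SIZE_MB = 4096                   # ~4GB, matching the test described above
CHUNK = b"\0" * (1024 * 1024)    # 1MB blocks

start = time.time()
with open(TEST_FILE, "wb") as f:
    for _ in range(SIZE_MB):
        f.write(CHUNK)
    f.flush()
    os.fsync(f.fileno())
print(f"write: {SIZE_MB / (time.time() - start):.1f} MB/sec")

start = time.time()
with open(TEST_FILE, "rb") as f:
    while f.read(1024 * 1024):
        pass
print(f"read:  {SIZE_MB / (time.time() - start):.1f} MB/sec")

os.remove(TEST_FILE)
```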
My Environment:
Source SANs:
Nimble CS series with 10Gb links
HP LeftHand with aggregated 1Gb links
Target SANs:
The local backup target is FreeNAS 9.2 (latest build), hardcore tested and hardcore stable, so don't even try blaming that, please. It has been rebuilt over 6 times trying various methods and it is rock solid on HP gear. Its drawbacks are its 7K SATA disks and its 1GbE card. I'll address the card when I get 15K SAS drives in it. For now, I can push data to it at over 96MB/sec on sequential writes and about 115MB/sec on sequential reads at 1000MB transfer sizes.
Offsite is a Nimble CS series with 10GbE links.
vSphere Environment:
Keeping it simple for now, I'm only referring to one cluster; all jobs suffer, but let's keep it simple. HP C3000 blade chassis with 4 physical hosts. The last host was installed this year: 24 cores, 256GB RAM, all 10GbE. We upgraded to vSphere 5.5 recently but job performance has not changed.
Veeam:
7.0.0.839
I do understand there is a patch, but seeing how this has been going on for a very long time, I don't believe a patch is going to be the end-all fix for this problem. I will apply it if troubleshooting requires it, though.
Veeam is running as a VM. All target repos are presented as datastores to the cluster and then to the Veeam VM as drives (individual VMDKs), which are formatted as NTFS with a 4K block size in Windows 2008 R2. (I have read this could be a bottleneck, and that in-guest presentation of LUNs to Veeam is preferred?)
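If anyone wants to see the actual cluster size on their repo volumes, a quick sketch like this just parses fsutil output (assumes Python on the repo server and an elevated prompt; "E:" is a placeholder drive letter):

```python
# Quick check of the NTFS allocation unit (cluster) size on a repo volume.
# Assumes Windows, an elevated prompt, and Python; "E:" is a placeholder.
import subprocess

DRIVE = "E:"  # placeholder repo volume

output = subprocess.check_output(
    ["fsutil", "fsinfo", "ntfsinfo", DRIVE], text=True
)
for line in output.splitlines():
    if "Bytes Per Cluster" in line:
        print(line.strip())   # e.g. "Bytes Per Cluster : 4096"
```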
I appreciate any help and advice, and I can post screenshots and whatever else may be needed. After reading entire manuals and countless posts and blogs, I know my way around quite well now, but I am totally stumped by the continued poor performance of our environment.