b.vanhaastrecht
Service Provider
Posts: 833
Liked: 154 times
Joined: Aug 26, 2013 7:46 am
Full Name: Bastiaan van Haastrecht
Location: The Netherlands
Contact:

Copy job bottleneck and decompress

Post by b.vanhaastrecht »

Hi,

We have a single-box proxy and repository Veeam server: an HP DL380 Gen8 with 32GB RAM, 16 cores (32 with hyperthreading), 2x 10GbE LAN, 2x 10GbE iSCSI, and 3x D2700 with 18TB on 25x 15k 900GB SAS each. The local repository is fast for instant recovery etc., and we then push the backups with copy jobs through CIFS to a Quantum DXi 6702, a dedupe unit with 2x 10GbE.

We notice poor throughput on the copy jobs towards the DXi: 2Gbit/s / 200MB/s max, and fluctuating heavily. The bottleneck stats of all copy jobs show the proxy as the main problem:
Source 10%, Proxy 60%, Network 20%, Target 40%
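
For scale, the 200MB/s figure above can be set against the line rate of a single 10GbE link. This is only a rough sketch: it assumes decimal units (1 MB = 10^6 bytes, 1 Gbit = 10^9 bits) and ignores CIFS/TCP protocol overhead, and the function names are made up for illustration.

```python
def mb_per_s_to_gbit_per_s(mb_per_s: float) -> float:
    """Convert MB/s (10**6 bytes) to Gbit/s (10**9 bits)."""
    return mb_per_s * 8 / 1000


def link_utilization(mb_per_s: float, link_gbit: float) -> float:
    """Fraction of a link's nominal rate used by the given payload rate."""
    return mb_per_s_to_gbit_per_s(mb_per_s) / link_gbit


observed = 200.0  # MB/s, the cap reported in the post
print(f"{mb_per_s_to_gbit_per_s(observed):.1f} Gbit/s")
print(f"{link_utilization(observed, 10.0):.0%} of one 10GbE link")
```

By this estimate the payload rate is well below what one 10GbE port can carry, which fits the job statistics pointing at something other than the network.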

I need some help interpreting the bottleneck stats on the copy jobs:
1) Network, does it mean Proxy to Repository agent traffic (even on same box?), or repository agent to CIFS share?
2) Target, how is this measured?

Secondly, we notice very high CPU utilization on ALL 32 cores when decompress is enabled on the repository or the copy job: literally 80-100% on all cores while a vmdk disk is being copied. So the proxy showing up as the bottleneck is probably due to the CPU being stressed:
3) How come all CPU cores seem to be used when a single copy job is executed? (The proxy is set to 16 (of 32), and the copy job doesn't seem to respect this setting.)
4) How come CPU utilization for decompress is so much more intense than for compress on backup? We see nowhere near the same CPU utilization on backup (compression optimal / local target).

And lastly:
5) What's the difference between enabling decompress on the repository and enabling it on the copy job when backing up to a CIFS share?

Thanks in advance, regards,
Bastiaan
======================================================
Veeam ProPartner, Service Provider and a proud Veeam Legend
foggy
Veeam Software
Posts: 21071
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Copy job bottleneck and decompress

Post by foggy »

Bastiaan, "Proxy" in the case of a direct backup copy job is the Veeam data mover agent installed on the source backup repository, which compresses data prior to transfer to the target. In your case, since you're using CIFS, the target repository is also the same server, and the target data mover performing decompression is installed on it as well. That is why you see high CPU usage and why the proxy is being reported as the bottleneck.

Do you want to write uncompressed backups for the purpose of better deduplication on the target device? What compression settings do you specify for both regular backup and backup copy jobs?
b.vanhaastrecht wrote:1) Network, does it mean Proxy to Repository agent traffic (even on same box?), or repository agent to CIFS share?
"Network" here refers to the connection between source and target data movers, so in your case it is their local communication.
b.vanhaastrecht wrote:2) Target, how is this measured?
"Target" is the write speed to the target storage.
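
The four stages can be read off a stats line mechanically. The helper below is a hypothetical sketch, not part of any Veeam tooling: it only assumes the text format quoted in this thread ("Stage NN%" for Source, Proxy, Network and Target) and reports the stage with the highest load.

```python
import re

STAGES = ("Source", "Proxy", "Network", "Target")


def parse_bottleneck(line: str) -> dict[str, int]:
    """Extract {stage: percent} from a line such as
    'Source 10%, Proxy 60%, Network 20%, Target 40%'."""
    pairs = re.findall(r"(Source|Proxy|Network|Target)\s+(\d+)\s*%", line)
    stats = {name: int(pct) for name, pct in pairs}
    missing = [s for s in STAGES if s not in stats]
    if missing:
        raise ValueError(f"missing stages: {missing}")
    return stats


def busiest_stage(stats: dict[str, int]) -> str:
    """Return the stage with the highest reported load."""
    return max(stats, key=stats.get)


stats = parse_bottleneck("Source 10%, Proxy 60%, Network 20%, Target 40%")
print(busiest_stage(stats))  # Proxy
```

With the definitions above, a high "Proxy" value in this single-box setup points at the data movers' compression and decompression work on the server itself, not at the link to the DXi.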

Re: Copy job bottleneck and decompress

Post by b.vanhaastrecht »

Hi Alexander, thank you for your answers. It's clear now.
foggy wrote:Do you want to write uncompressed backups for the purpose of better deduplication on the target device? What compression settings do you specify for both regular backup and backup copy jobs?
Yes, we want to decompress before storing to the DXi dedupe unit.

Backup job (to local disks): Compression - Optimal / Optimize - Local Target / Repository - Align and decompress not selected
Copy job (to DXi dedupe): Compression - Auto / Repository - Align and Decompress selected

Re: Copy job bottleneck and decompress

Post by foggy »

Data block alignment should only be enabled for deduplicating storage that uses a constant block size, while the DXi seems to use a variable block size. So you can try to disable it and see whether this helps.

Re: Copy job bottleneck and decompress

Post by b.vanhaastrecht »

According to the best-practice guide for DXi & Veeam, "Align Backup File Data Blocks" should be enabled.

"DXi-Series_Configuration_and_Best_Practices_Guide_for_Veeam_Backup_&_Replication_[BPG00020A].pdf" http://www.quantum.com/iqdoc/doc.aspx?id=7908

So the job settings above regarding compression seem right? To us the CPU load on decompress seems abnormal: it's quite a powerful server with 32 cores @ 2.6GHz and 40MB L3 cache, and it's fully utilized by a single copy job with the settings mentioned above. Would you advise creating a support case for this? Other suggestions?

Re: Copy job bottleneck and decompress

Post by foggy »

Our internal testing showed a reduced dedupe ratio on storage using variable block size deduplication if block alignment was enabled. Anyway, you can just test and compare.

Also, you could try to configure a third server to act as a proxying server for the CIFS repository and take the decompression load off the proxy server. This will, however, increase the load on the network.

Have you checked which particular process(es) take most of the CPU resources? If it is a Veeam process, you can open a case with our technical guys; they will probably ask you to provide performance logs for investigation.

Re: Copy job bottleneck and decompress

Post by b.vanhaastrecht »

Ok, I understand. I've created a post in the Quantum community asking about the variable/fixed block size setting. http://stornextforum.com/forum/topics/v ... ign-or-not

We prefer a single-box solution as this box has multiple 10GbE connections. Again, it's a beast in all respects and should handle this load with ease.

It's VeeamAgent.exe that fills all CPU cores; I will create a support case. Thanks.

Re: Copy job bottleneck and decompress

Post by foggy »

Please keep us informed on how investigation is going. Btw, what is the support case number?

Re: Copy job bottleneck and decompress

Post by b.vanhaastrecht »

We've just got word back from Quantum, apparently the part about block alignment is wrong in their best practice guide:

"The escalation team have confirmed that the correct setting should be: Align backup file data blocks: No.
Our documentation is wrong.
We have raised bug number 40685 internally to get our Veeam Best Practices documents corrected."


We're currently testing what impact this option has on CPU load and throughput. If possible, this flaw in the guide should be made known to the Veeam community; there are a lot of DXi users in here.

Re: Copy job bottleneck and decompress

Post by foggy »

Thanks, Bastiaan, for getting back to us with this, much appreciated!

Re: Copy job bottleneck and decompress

Post by b.vanhaastrecht »

After deselecting "Align backup file data blocks", the CPU usage went down by about 20%. The dedupe ratio improved by a few percent. Network throughput to the DXi did not improve.

Re: Copy job bottleneck and decompress

Post by foggy »

How about the overall job performance? Has it improved? What are the bottleneck stats for this job now?

Re: Copy job bottleneck and decompress

Post by b.vanhaastrecht »

No, this has not improved. It does not matter whether a single copy job or multiple copy jobs run; an overall cap of 200MB/s applies.

Bottleneck stats: Load: Source 41% > Proxy 82% > Network 30% > Target 91%

CPU load on a single copy job has dropped by about 20%. When all jobs run (8 copy jobs), overall CPU load is 90-100% on all 32 cores.

Re: Copy job bottleneck and decompress

Post by foggy »

Now the primary bottleneck has moved to the target, which is more expected for dedupe storage (remember, backup copy is a synthetic activity, requiring lots of I/O).