KarmaKuma
Enthusiast
Posts: 32
Liked: 6 times
Joined: Feb 05, 2022 11:16 am

SMB Repo make Automatic Gateway sticky to current Task Proxy

Post by KarmaKuma »

Is it possible (through some Windows registry tweak or similar) to:
a) keep the proxy and the SMB/NFS gateway server for the current "per VM backup file" on the same machine,
and
b) keep all disks of a VM assigned to the same proxy instead of spreading the disks among the available proxies,
without explicitly specifying the same machine as preferred proxy and SMB/NFS gateway server?

In other words: have the proxy and gateway server roles locked together for all tasks belonging to the same "per VM backup file"...

When an SMB/NFS repo is configured with both the proxy preference and the gateway server set to "automatic", unnecessary network traffic can (but does not necessarily) be generated during task execution: every VM disk gets its own task, and these tasks in turn get assigned to proxies. If the proxy handling a task is NOT the SMB/NFS gateway server for the current "per VM backup file", it transfers all its data to the designated gateway server, which in turn writes the data to the repo. This proxy-to-gateway traffic is unnecessary and can slow things down (and eat away precious CPU time). It seems to happen with VMs that have more than one disk.

If, in the repo configuration, the same machine is assigned as both preferred proxy and gateway server for a specific repo, no additional network traffic is ever generated: all VM disks get assigned to the same proxy, which is also the gateway server and writes the data directly to the repo. But this configuration has its own drawbacks, especially regarding the service high availability that the "automatic" configuration gives you for free in multi-proxy, multi-repo setups.
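To make the difference concrete, here is a toy Python model of the two modes (my own illustration, not Veeam code; the round-robin proxy choice is just an assumption about how "automatic" per-disk load balancing might distribute tasks):

```python
from itertools import cycle

PROXIES = ["proxy-a", "proxy-b", "proxy-c"]
GATEWAY = "proxy-a"  # gateway server picked for this "per VM backup file"

def assign_tasks(disks, pinned=False):
    """Count proxy->gateway hops for one task per VM disk.

    pinned=False: "automatic" mode, disks spread across proxies
    (round-robin here, purely an assumption for illustration).
    pinned=True: preferred proxy == gateway server, so every disk
    is processed on the gateway and written straight to the repo.
    """
    rr = cycle(PROXIES)
    extra_hops = 0
    for disk in disks:
        proxy = GATEWAY if pinned else next(rr)
        if proxy != GATEWAY:
            # disk data crosses the network twice:
            # source -> proxy, then proxy -> gateway
            extra_hops += 1
        print(f"{disk}: task on {proxy}")
    print(f"extra proxy->gateway transfers: {extra_hops}")

disks = ["vm1-disk0", "vm1-disk1", "vm1-disk2"]
assign_tasks(disks)               # automatic: 2 of 3 disks take the detour
assign_tasks(disks, pinned=True)  # pinned: no proxy->gateway traffic
```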

Any options?
HannesK
Product Manager
Posts: 14322
Liked: 2890 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria

Re: SMB Repo make Automatic Gateway sticky to current Task Proxy

Post by HannesK »

Hello,
tasks / proxies / gateway servers are assigned "per disk" for load balancing. There is no reg key to make the assignment "per machine".

"can" slow things down

That sounds like a corner-case scenario. If the network is too slow for parallel processing, you can limit the gateway server to a single concurrent task. That means only one disk is processed at a time.
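If it helps to picture that setting, here is a minimal sketch of what a task limit of 1 does, assuming one task per disk as described above (a toy model, not how the scheduler is actually implemented):

```python
import threading
import time

# One semaphore slot == one concurrent task on the gateway server.
# With a limit of 1, disks are processed strictly one after another.
task_slots = threading.Semaphore(1)

def process_disk(disk):
    with task_slots:
        print(f"processing {disk}")
        time.sleep(0.1)  # stand-in for the actual data transfer
        print(f"finished {disk}")

threads = [threading.Thread(target=process_disk, args=(d,))
           for d in ("vm1-disk0", "vm1-disk1", "vm1-disk2")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```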

Best regards,
Hannes
KarmaKuma
Enthusiast
Posts: 32
Liked: 6 times
Joined: Feb 05, 2022 11:16 am

Re: SMB Repo make Automatic Gateway sticky to current Task Proxy

Post by KarmaKuma »

Thanks for the reply.

I just found out that this is indeed a corner-case scenario, and that the random bottlenecking that appeared in my environment has its roots somewhere else. It's the network =)

Well, my point about precious CPU time spent handling the gateway server's NIC-in/NIC-out traffic still stands, but it is probably negligible compared to the load-balancing gains achieved by splitting disks between proxies...

Root cause in my case:
I was backing up part of the VMs across an inter-DC switch mesh (8x 10GbE between the two DCs) from a metro storage cluster. The proxies unfortunately ended up sending the backup traffic to my Isilon/PowerScale backup target via SMB/NFS gateway servers over some (not all!) of the inter-DC links, links that were already quite saturated by reading VM disk data and pushing it via iSCSI to the hot-add proxies in the same job/task chain.

Reason / lesson learned through last night's testing:
A switch virtual chassis (a kind of advanced stacking, in my case full mesh) seems to keep traffic between devices connected to participating switches on the local/shortest inter-switch path (the direct switch-to-switch connection a-b) and does not distribute it across the other available inter-switch paths (e.g. the indirect connections a-to-b via c, via d, or via c-d). During a backup storm this can easily saturate one link while other available links remain idle...
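A back-of-the-envelope check with made-up numbers (the traffic split is an assumption, not a measurement from my environment) shows how quickly that single direct link fills up:

```python
LINK_GBPS = 10        # one direct switch-to-switch link in the mesh

iscsi_read_gbps = 6   # storage -> hot-add proxy (assumed figure)
smb_write_gbps = 6    # proxy -> SMB gateway -> PowerScale (assumed figure)

# Shortest-path forwarding pins both flows onto the same direct a-b
# link instead of spreading them across the other mesh paths:
load = iscsi_read_gbps + smb_write_gbps
print(f"direct link load: {load}/{LINK_GBPS} Gbps -> "
      + ("saturated" if load > LINK_GBPS else "ok"))
# prints: direct link load: 12/10 Gbps -> saturated,
# while the remaining inter-DC links sit idle
```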

Resolution:
Connect your devices wisely. I'll have to re-cable quite a lot of my devices and physically distribute ("load balance") each device class (ESX hosts, source storage, target storage, etc.) across all switches participating in a virtual chassis. Dedicating certain switches to ESX hosts, others to storage, etc. was a design mishap, I guess, one that did not show up until now. Vertical distribution vs. horizontal distribution, so to speak.

Tl;dr
Never mind, my bad =)