rreed
Veteran
Posts: 354
Liked: 72 times
Joined: Jun 30, 2015 6:06 pm
Contact:

Could we talk multiple streams for a moment please?

Post by rreed »

Morning everyone. Question, in version 8, and possibly earlier versions (don't know when it was introduced but thought I saw it in 7), we have the option of "Use multiple upload streams per job," w/ an increment counter that I believe goes to 100. The description indicates "Improves job performance through better utilization of high-latency links. Disable this option if you are running a large amount of concurrent jobs, or for networking equipment compatibility purposes."

Being somewhat of a networking guy, I got excited when I saw "use multiple upload streams per job," which I took to mean "multiple data streams," of which LAG can make great use. But the description throws me a bit; does it actually not mean multiple data streams? With the option checked and concurrent jobs running, each job w/ multiple VM's, at least some of those VM's w/ multiple HDD's, and multiple proxies w/ multiple CPU's, I would expect to see a fairly high amount of concurrency. But when watching the number of connections at my repositories (EMC and Dell dedupe devices) I only see around 2-5 concurrent connections. I would have expected to see around 10-15+ easily, if not getting into scores of connections. I guess what I'm looking for here is clarification on what "Use multiple upload streams per job" actually means.
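To illustrate why the number of streams matters for LAG: link aggregation typically pins each flow to one member link by hashing its addresses and ports, so a single stream never exceeds one link's bandwidth. The sketch below is a toy model only. The CRC32-based hash, the IP addresses, and the port numbers are assumptions made up for illustration, and real switches use their own vendor-specific hash policies.

```python
import zlib

def pick_link(src_ip, dst_ip, dst_port, src_port, n_links):
    """Map one flow to one LAG member link by hashing its addresses and ports.

    Toy hash (CRC32 of the tuple); real switch hash policies differ.
    """
    key = f"{src_ip}|{dst_ip}|{dst_port}|{src_port}".encode()
    return zlib.crc32(key) % n_links

def links_used(flows, n_links):
    """Return the set of member links that carry at least one of the flows."""
    return {pick_link(*flow, n_links) for flow in flows}

# One stream to a dedupe device: it always hashes to the same single link.
single = [("10.0.0.10", "10.0.0.50", 445, 50000)]

# Sixteen streams that differ only in source port can spread across links.
many = [("10.0.0.10", "10.0.0.50", 445, 50000 + i) for i in range(16)]

print(len(links_used(single, 2)))  # links carrying traffic with one stream
print(len(links_used(many, 2)))    # links carrying traffic with sixteen streams
```

With a 2x 1Gb LAG, the single stream stays on one member link no matter how busy it gets, which is why more concurrent streams are needed before the second link does any work.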

One other thing I've noticed is that historically backups become extremely unstable w/ a lot of failures, seemingly due to perceived network issues, w/ this feature enabled. Clear the check mark box and stability improves. The description does indicate "high-latency links" which we do not have; locally we're 10Gb inside our VMware environment, 10Gb to our Dell DR4100's, and 2x 1Gb each to our Data Domains. Two matching data centers, w/ a 500Mb L2 link b/t them, but no Veeam backup data traverses it, only Veeam <-> Proxy commands; each data center backs up its data locally to itself. I didn't consider myself to be running a high amount of concurrent jobs, probably 5-6 concurrent jobs, each w/ around 3-10 VM's. If this is truly meant for slow network links and/or 5-6 jobs running concurrently is a large amount of concurrent jobs, then that's fine, I get it. I've since staggered my jobs anyways, but still just don't use this feature since it seems more likely to cause problems than to get a network guy excited about efficient use of network capacity.

That's V8 and/or older. As for V9, I understand it's going to make great use of multiple data streams; again if I misunderstand the above please let me know. Otherwise I find myself a bit confused as to why we would tout this as a new feature if it's been there for the last 1-2 major revisions. I'm trying to find notes from the Veeam folks that give details on exactly what we mean by "multiple data streams" as we speak. Anyways, do we have some good details on exactly what and how V9 is going to use multiple data streams please? I need this both for my excitement as well as for serious network planning. I'm gathering up a couple of switches and old SAN's to insert as my first landing zone/staging area for backups and need to plan accordingly. Multiple data streams are the name of the game for LAG, so if we will have the capability to run a LOT of traffic down the highway, then we'll need as many lanes as we can throw at it to back up our modest 33TB or so of data every weekend. I'm sure larger places have a lot more. :D
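For the planning side, here is a back-of-envelope sketch of what 33TB per weekend means in line rate. The 48-hour window, decimal terabytes, and the idea that traffic spreads perfectly across links are all assumptions for illustration; dedupe, compression, and incrementals would change the real numbers.

```python
import math

def required_gbps(data_tb, window_hours):
    """Average line rate in Gbit/s needed to move data_tb (decimal TB) in window_hours."""
    bits = data_tb * 1e12 * 8
    seconds = window_hours * 3600
    return bits / seconds / 1e9

def min_links(data_tb, window_hours, link_gbps):
    """Fewest links of link_gbps needed, assuming traffic spreads perfectly across them."""
    return math.ceil(required_gbps(data_tb, window_hours) / link_gbps)

rate = required_gbps(33, 48)     # 33 TB over an assumed 48-hour weekend
print(f"{rate:.2f} Gbit/s")      # 1.53 Gbit/s sustained
print(min_links(33, 48, 1.0))    # 2 (1Gb links)
print(min_links(33, 48, 10.0))   # 1 (10Gb link)
```

So under these assumptions a single 1Gb stream cannot finish in the window on its own, while a 10Gb path has plenty of headroom.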
VMware 6
Veeam B&R v9
Dell DR4100's
EMC DD2200's
EMC DD620's
Dell TL2000 via PE430 (SAS)
alanbolte
Veteran
Posts: 635
Liked: 174 times
Joined: Jun 18, 2012 8:58 pm
Full Name: Alan Bolte
Contact:

Re: Could we talk multiple streams for a moment please?

Post by alanbolte » 1 person likes this post

Current implementation (v8) is only for traffic between Veeam data movers. Each job can only have one data stream to your deduplication device because it's writing a single file. In v9, the option for per-VM backup chains will allow you to write separate files for each VM in the job; because the job is writing multiple files to the deduplication device at the same time, it can use multiple streams.
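A rough way to picture the difference, as a simplified model rather than Veeam's actual scheduler: count one write stream per backup file being written. The function name and the `max_tasks` cap are hypothetical, and the model assumes every VM in a job is processed at the same time.

```python
def repository_write_streams(vms_per_job, per_vm_chains, max_tasks=None):
    """Estimate concurrent write streams to the repository.

    vms_per_job:   number of VMs in each concurrently running job.
    per_vm_chains: False -> one backup file (one stream) per job,
                   True  -> one backup file (one stream) per VM.
    max_tasks:     hypothetical cap on concurrent tasks; None means no cap.
    """
    streams = sum(vms_per_job) if per_vm_chains else len(vms_per_job)
    if max_tasks is not None:
        streams = min(streams, max_tasks)
    return streams

# Five concurrent jobs with 3-10 VMs each, similar to the setup in the first post.
jobs = [3, 5, 8, 10, 4]

print(repository_write_streams(jobs, per_vm_chains=False))               # 5
print(repository_write_streams(jobs, per_vm_chains=True))                # 30
print(repository_write_streams(jobs, per_vm_chains=True, max_tasks=16))  # 16
```

The per-job figure lines up with the 2-5 connections seen at the dedupe devices, regardless of how many proxies or CPUs sit in front of them.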
rreed
Veteran
Posts: 354
Liked: 72 times
Joined: Jun 30, 2015 6:06 pm
Contact:

Re: Could we talk multiple streams for a moment please?

Post by rreed »

Thanks Alan, excellent point about a given job only writing one file at a time! My grand assumption was that each proxy CPU meant a single data stream, so I never could figure out why (4) proxies each w/ (4) CPU's all running concurrently never gave (16) concurrent streams. :? I'm very much looking forward to the per-VM chain feature of v9. So how will the proxies work w/ a given multi-HDD VM at that point w/ per-VM chains? Currently a given multi-HDD VM will get different proxies assigned to it; will they each stream their respective HDD(s) into the same file concurrently?
alanbolte
Veteran
Posts: 635
Liked: 174 times
Joined: Jun 18, 2012 8:58 pm
Full Name: Alan Bolte
Contact:

Re: Could we talk multiple streams for a moment please?

Post by alanbolte »

Proxies are not important to the question of writing files to an SMB share on a Dell DR device (not sure what EMC device you're using, but the same is true of any SMB share or Data Domain Boost). Rather, all proxies send their data to the gateway (part of the repository settings), and the gateway writes files to the storage device. However, if you choose not to specify a gateway in the repository settings, it's common for one of the proxy servers you're using to be dynamically assigned the role.
rreed
Veteran
Posts: 354
Liked: 72 times
Joined: Jun 30, 2015 6:06 pm
Contact:

Re: Could we talk multiple streams for a moment please?

Post by rreed »

Something else I'd like to understand better is the Gateway server. I've looked around before but never found much on what it is or does. Currently we have it set to auto on all our repositories. Will the Gateway server's functionality change in v9? If we have, say, (4) proxies in a given data center, and job concurrency, will they all just choose a single one to be the gateway server? Or can/will they each become a Gateway server? Our old EMC DD's are just CIFS shares (SMB). No DD Boost needed in our environment. Old 620's and a 2200 per data center. CIFS shares on our Dell DR4100's as well. No OST or anything fancy.

I found this article floating around, any corroboration? http://www.virtualtothecore.com/en/veea ... up-chains/
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Could we talk multiple streams for a moment please?

Post by Gostev »

The gateway server is the server that runs the target data mover in cases when it cannot be run on the storage device itself, as with SMB shares and deduplication appliances (unlike backup repositories based on Windows or Linux servers with internal or DAS storage, where the data mover runs on the repository server itself).
mmonroe
Enthusiast
Posts: 75
Liked: 3 times
Joined: Jun 16, 2010 8:16 pm
Full Name: Monroe
Contact:

Re: Could we talk multiple streams for a moment please?

Post by mmonroe »

In v9, will this feature "per-VM backup chains" also be used with Backup Copy Jobs?
veremin
Product Manager
Posts: 20270
Liked: 2252 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: Could we talk multiple streams for a moment please?

Post by veremin »

As mentioned in the referenced blog post, it's a repository option, not a job one. Thanks.
mmonroe
Enthusiast
Posts: 75
Liked: 3 times
Joined: Jun 16, 2010 8:16 pm
Full Name: Monroe
Contact:

Re: Could we talk multiple streams for a moment please?

Post by mmonroe »

So the answer would be "yes" as long as the backup copy jobs are on repositories with this option? It looks like this would be a way to have multiple VM's processing at the same time with Backup Copy Jobs. I know that this has been requested in the past a few times. Nice.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Could we talk multiple streams for a moment please?

Post by foggy »

Yes, your understanding is correct.
rreed
Veteran
Posts: 354
Liked: 72 times
Joined: Jun 30, 2015 6:06 pm
Contact:

Re: Could we talk multiple streams for a moment please?

Post by rreed »

So how would the Gateway server come into play for basic CIFS shares then? Bear w/ my incessant questions please, I'm just trying to gain a thorough understanding of the underlying architecture.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Could we talk multiple streams for a moment please?

Post by Gostev »

Well, to be honest it's really not an "explain it to me in two words in a forum post" sort of topic. A proper explanation requires diagrams and takes about an hour of our VMCE class.

If you are genuinely interested, then you should first read our documentation to understand Veeam architecture, components and data flow... the above is impossible to grasp without knowing the underlying architecture. And I can assure you that as soon as you learn that, you will not need me to answer your question above at all ;)
Delo123
Veteran
Posts: 361
Liked: 109 times
Joined: Dec 28, 2012 5:20 pm
Full Name: Guido Meijers
Contact:

Re: Could we talk multiple streams for a moment please?

Post by Delo123 » 1 person likes this post

Anyway, should you use a Windows repository, do not use "CIFS" (which is the wrong word anyway) but select the server, then the disk; Veeam will take care of the "share", and it is much faster.
dellock6
Veeam Software
Posts: 6137
Liked: 1928 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Could we talk multiple streams for a moment please?

Post by dellock6 » 1 person likes this post

The like is for the CIFS bashing, appreciated ;)

It's SMB!!!
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
rreed
Veteran
Posts: 354
Liked: 72 times
Joined: Jun 30, 2015 6:06 pm
Contact:

Re: Could we talk multiple streams for a moment please?

Post by rreed » 2 people like this post

If I understand the attempt at correction(?), I also understand Windows servers to indeed be SMB, while Dell/EMC dedupe devices are Linux-based (not Windows) and refer to their local shares as CIFS protocol during setup of the shares (<--also their nomenclature). I just took their references and ran w/ them. I've also found during some test comparisons that adding a repository as a server and pointing to its drive locally does indeed run a bit faster than setting it up as a shared Windows (SMB) folder. Now if we can just get tape to run at speed across the network! But perhaps we're getting a bit off-topic here, albeit w/ great advice regarding repositories on Windows machines. <insert thumbs-up smiley here>

As to the "explain to me in two words" topic, well no, of course not! :wink: I'm not looking for a two-word answer; I'm after a discussion of multiple data streams as they relate to concurrent backups and link aggregation, plus more detail on what some features mean and what they're doing under the hood, etc. I have read over the architecture (didn't find much on the gateway server at the time) but have since downloaded some more documentation that I do need to read over. In parallel w/ that, I also like to have human conversations to help dig deeper and understand.