Host-based backup of VMware vSphere VMs.
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Strategies for full backups that run too long

Post by pshute »

We've been adding more and more VMs to our main backup job, and now it's no longer finishing overnight. I don't want it running during work hours, so what's the best way around it?

My first thought was to split it into two jobs and run each one on alternate days. That should do what we want, but then it becomes more difficult to tell which VMs are backed up and which ones aren't. Is there any other way around it?
Andanet
Service Provider
Posts: 40
Liked: 3 times
Joined: Jul 08, 2015 8:26 pm
Full Name: Antonio
Location: Italy
Contact:

Re: Strategies for full backups that run too long

Post by Andanet »

I think the best way is to split it into multiple jobs that run at the same time.
Concurrent jobs make better use of the write streams to your repository.
We back up over 1000 VMs in a backup window from 6:30 PM to 5:30 AM.
Antonio aka Andanet D'Andrea
Backup System Engineer Senior at Sorint.lab ¦ VMCE2021-VMCA2022 | VEEAM Legends 2023 | VEEAM VUG Italian Leader ¦
Eamonn Deering
Service Provider
Posts: 33
Liked: 4 times
Joined: Feb 29, 2012 1:42 pm
Full Name: EamonnD
Location: Dublin, Ireland
Contact:

Re: Strategies for full backups that run too long

Post by Eamonn Deering »

Some info on your setup would help.
Using SAN backup mode?
Merging taking a long time?
Dedup box?
Where is the bottleneck?
jmmarton
Veeam Software
Posts: 2092
Liked: 309 times
Joined: Nov 17, 2015 2:38 am
Full Name: Joe Marton
Location: Chicago, IL
Contact:

Re: Strategies for full backups that run too long

Post by jmmarton » 1 person likes this post

What version of VBR is it? If you're on 9.0, the Advanced Data Fetcher in 9.5 could help. Also, Eamonn's questions cover the additional details we'd need in order to make the best recommendations for your problem.

Joe
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

jmmarton wrote:What version of VBR is it? If you're on 9.0, the Advanced Data Fetcher in 9.5 could help. Also, Eamonn's questions cover the additional details we'd need in order to make the best recommendations for your problem.

Joe
Yes, we're still on v9. I'll upgrade it and see if it improves.
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

Eamonn Deering wrote:Some info on your setup would help.
Using SAN backup mode?
Merging taking a long time?
Dedup box?
Where is the bottleneck?
The job history says the bottleneck is "source", and shows that nearly 9 of the 17 hours are spent reading one disk of one particular machine, our Exchange server. How long should I expect it to take to read 1TB?

Not sure about the answers to the other questions, or how to find out.
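For a rough sense of scale, the read time for 1 TB at a given sustained rate is simple arithmetic (the rates below are illustrative, not measurements from this setup):

```python
def read_time_hours(size_tb, rate_mb_s):
    """Hours to read `size_tb` terabytes at a sustained rate in MB/s."""
    return size_tb * 1024 * 1024 / rate_mb_s / 3600  # TB -> MB, then seconds -> hours

# Illustrative sustained rates:
for rate in (35, 63, 120):
    print(f"{rate:>3} MB/s -> {read_time_hours(1, rate):4.1f} h per TB")
```

At around 35 MB/s sustained, 1 TB takes roughly 8-9 hours, which lines up with the figure above.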
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

Andanet wrote:I think the best way is to split it into multiple jobs that run at the same time.
Concurrent jobs make better use of the write streams to your repository.
We back up over 1000 VMs in a backup window from 6:30 PM to 5:30 AM.
It looks like just splitting one particular machine to its own job would fix the problem, but I'll have to test whether running them both concurrently will reduce the total time. I thought each job processed more than one VM at a time anyway. Is that not correct?
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Strategies for full backups that run too long

Post by foggy »

Look for the transport mode tag in the job session log ([nbd], [san], [hotadd], etc. - you should see it right after the proxy server name after selecting the particular VM in the list).
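If you have a lot of log lines to check, something like this would pull the tag out (a hypothetical helper, not part of Veeam):

```python
import re

def transport_mode(log_line):
    """Return the transport mode tag ([nbd], [san], [hotadd]) from a
    job session log line, or None if the line has no such tag."""
    m = re.search(r"\[(nbd|san|hotadd)\]", log_line)
    return m.group(1) if m else None

line = "18/01/2017 9:52:13 AM :: Using backup proxy VMware Backup Proxy for disk Hard disk 7 [nbd]"
print(transport_mode(line))  # -> nbd
```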
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

Like this? "18/01/2017 9:52:13 AM :: Using backup proxy VMware Backup Proxy for disk Hard disk 7 [nbd]"

This is configured in the proxy properties, correct? It's set to automatic there. We've never changed any settings there.

I see the Max Concurrent Tasks is set to 2. Would increasing that help anything?
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

jmmarton wrote:What version of VBR is it? If you're on 9.0, the Advanced Data Fetcher in 9.5 could help. Also, Eamonn's questions cover the additional details we'd need in order to make the best recommendations for your problem.

Joe
Now on v9.5, and last night's incremental took about as long as normal.
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: Strategies for full backups that run too long

Post by DaveWatkins » 1 person likes this post

pshute wrote:Like this? "18/01/2017 9:52:13 AM :: Using backup proxy VMware Backup Proxy for disk Hard disk 7 [nbd]"

This is configured in the proxy properties, correct? It's set to automatic there. We've never changed any settings there.

I see the Max Concurrent Tasks is set to 2. Would increasing that help anything?
If you're on 10Gb, NBD mode might be OK; otherwise, setting up hot-add would be an easy way to go. Without knowing how many VMs you back up or how fast your repository is, it's hard to make any recommendations. Simply increasing the concurrent tasks would likely speed you up massively, assuming your repo can handle it. NBD mode, if you're not on 10Gb, is the slowest way to back up machines. Add a new Windows VM, install the proxy role on it, and then you'll be able to use hot-add.

If your B&R server is physical and can be connected directly to your SAN via iSCSI or Fibre Channel, you could configure Direct SAN mode, which may be faster still. But I doubt you'll need to go that far to fix your current issue; increasing the concurrent tasks on both the proxy and the repository would likely be enough, assuming you have the resources to support it.

Answering some of the earlier questions about your architecture and what Veeam reports as the bottleneck will go a long way toward getting more good advice. At the moment we're guessing.
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

Our B&R server is physical and is connected to the ESX host via 4 teamed 1Gb/s connections. I'd never really looked at how it's connected before. The proxy is on the same machine.

Veeam was reporting this when it was set to 2 concurrent tasks - Source 99% > Proxy 12% > Network 1% > Target 0%. That was with a single VM with 3 disks. Overall speed was 47MB/s

I increased it to 4 concurrent tasks, and got this - Source 99% > Proxy 18% > Network 1% > Target 0% . Overall speed now 76MB/s

I increased it to 8 concurrent tasks and added another VM with another 5 disks, and got - Source 99% > Proxy 21% > Network 1% > Target 0% . Overall speed 84MB/s

I'm not convinced it's actually doing 8 disks at once, as some finish before others start. Certainly at least 4 at once. I changed it in the proxy properties - is there some other setting limiting it?

My tests are probably not representative of what I'll get for nightly backups, as the system is in use while I'm testing. My nightly backups have been getting an overall speed of about 63MB/s.

Can you please explain the bottleneck stats? What does source 99% mean? I know it means the source is the problem, but I don't know where the 99% comes from.

If I was to put the proxy on a VM on the ESX host, I assume it could read the disks much faster, but I'm not sure if I'd be able to give it the resources to handle it. I guess I'll just have to try it and see how it works out. I'll see how tonight's backup goes before I decide whether it's worth trying.
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

The throughput graphs on the job history were as high as 119MB/s. I'm not sure what they were before because it only shows a graph on the most recent job. Is that statistic saved anywhere?
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

A question has been asked here about putting a proxy on a VM. Our ESX cluster has three hosts. Would there only be a read speed increase for VMs on the same host as the proxy?
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Strategies for full backups that run too long

Post by foggy »

pshute wrote:Can you please explain the bottleneck stats? What does source 99% mean? I know it means the source is the problem, but I don't know where the 99% comes from.

If I was to put the proxy on a VM on the ESX host, I assume it could read the disks much faster, but I'm not sure if I'd be able to give it the resources to handle it.
Bottleneck "source" means that data cannot be retrieved from the storage any faster. Currently the source data reader is the slowest component in the data processing chain, while the other components could process more data but are just sitting and waiting for it. Giving the source the ability to deliver more data will result in an overall backup performance increase (and the bottleneck will probably shift to another component).
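As a rough illustration (a sketch with made-up numbers, not Veeam's actual code), the percentages can be read as the share of the sampled time each stage of the chain spent busy; the stage with the highest share is reported as the bottleneck:

```python
# Seconds each stage spent busy during a 60-second sample window
# (figures invented to mirror the stats quoted in this thread).
busy_seconds = {"source": 59.4, "proxy": 7.2, "network": 0.6, "target": 0.1}
total = 60.0

stats = {stage: round(100 * t / total) for stage, t in busy_seconds.items()}
bottleneck = max(stats, key=stats.get)
print(stats, "->", bottleneck)
# Source 99% > Proxy 12% > Network 1% > Target 0% -> source
```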
pshute wrote:The throughput graphs on the job history were as high as 119MB/s. I'm not sure what they were before because it only shows a graph on the most recent job. Is that statistic saved anywhere?
You can look up previous jobs' stats in the sessions History.
pshute wrote:A question has been asked here about putting a proxy on a VM. Our ESX cluster has three hosts. Would there only be a read speed increase for VMs on the same host as the proxy?
If the proxy VM has access to the shared storage, it will be able to use hotadd for all VMs stored there, regardless of the host they reside on.
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

foggy wrote:Bottleneck "source" means that data cannot be retrieved from the storage any faster. Currently the source data reader is the slowest component in the data processing chain, while the other components could process more data but are just sitting and waiting for it. Giving the source the ability to deliver more data will result in an overall backup performance increase (and the bottleneck will probably shift to another component).
OK, that makes sense - it's reading from source 99% of the time, and is always making other components wait. But how then did allowing more concurrent tasks manage to get more data from the host? How can the host send data more quickly just by doing multiple streams?
foggy wrote:You can look up previous jobs' stats in the sessions History.
I can see duration, processing rate, processed, read and transferred for all sessions, but they don't tell me everything that's on the graphs. I'm interested to see what maximum transfer rate we achieved, and whether there were periods of low speeds.

The processing rate seems a bit deceptive. It's the amount of data processed divided by the job duration, but swap files and deleted blocks don't actually get read, making the rate higher than the actual read rate. And all the statistics to do with actual read rates are unavailable for all but the most recent job.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Strategies for full backups that run too long

Post by foggy » 1 person likes this post

pshute wrote:But how then did allowing more concurrent tasks manage to get more data from the host? How can the host send data more quickly just by doing multiple streams?
Storage systems often perform better with multiple parallel streams instead of a single one.
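A toy demonstration of the effect (simulated I/O with made-up latency figures, nothing Veeam-specific): when each read spends most of its time waiting on latency, keeping several reads in flight raises aggregate throughput until the device saturates:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def read_block(_):
    time.sleep(0.05)  # pretend each 1 MB read costs 50 ms of latency
    return 1          # 1 MB "read"

def throughput(streams, blocks=40):
    """Aggregate MB/s with `streams` simulated reads in flight at once."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=streams) as ex:
        mb = sum(ex.map(read_block, range(blocks)))
    return mb / (time.perf_counter() - start)

print(f"1 stream:  {throughput(1):.0f} MB/s")
print(f"4 streams: {throughput(4):.0f} MB/s")
```

The same logic is why raising Max Concurrent Tasks moved your numbers, even though no single stream got any faster.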
pshute wrote:The processing rate seems a bit deceptive. It's the amount of data processed divided by the job duration, but swap files and deleted blocks don't actually get read, making the rate higher than the actual read rate. And all the statistics to do with actual read rates are unavailable for all but the most recent job.
Processing rate is calculated off the actually read data. But overall, I can see your point here.
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute » 2 people like this post

Now that we've increased the number of concurrent tasks, rearranged the order of the VMs so that the biggest ones start backing up first, and enabled deleted block skipping, our full backups are done in 8h20m, down from 17 hours. We now have a few hours of the night left that we can expand into. No need to try installing a proxy on the host for now. Thanks for your help with this.
pshute
Veteran
Posts: 254
Liked: 14 times
Joined: Nov 23, 2015 10:56 pm
Full Name: Peter Shute
Contact:

Re: Strategies for full backups that run too long

Post by pshute »

A possibly unrelated question: would enabling per-VM backups improve the speed when the bottleneck is the source? I think not, but I'm interested in enabling it anyway, so I'd like to know if there's an additional advantage.

The main reason I'm tempted to enable it is that our repository is short on space to do restores from tape. If I change to per-VM, will I be able to restore just a single machine's backup without having to have enough room for the whole backup? Are there any disadvantages to splitting the backups up, apart from having many more files in the repository?