V6 quirks/problems.

Post by **ashleyw** » Dec 06, 2011 10:40 pm

Hi,

We've just switched to V6 - we like it - nice job - it is by far the best solution out there - we love the multiple proxy architecture!

However, we have a few issues we've picked up!

UI
- It would be really useful to be able to see which proxy a job is running on (we have 2 proxies each with 2 threads) from the Job Summary screen of the B&R console.

Disable proxy on B&R console
- We want the ability to disable the proxy on the console - yes we know that we can manually select the proxies to exclude the B&R console proxy but we'd prefer the "disable proxy" option to be available for the B&R console proxy.

Scheduling Timeout when no available proxy
- We have split our load into 14 separate jobs (the same approach we took with v5) as we need to back up around 16TB of used space (roughly 300 VMs). Under v5, we split these jobs into 2 units of work (using PowerShell) and pushed each unit to a separate B&R engine so that two jobs could run simultaneously. Under v6 we thought we could schedule all 14 jobs to start at the same time and let the B&R scheduler run them on the multiple proxies as threads became available. That works, but a full run through these jobs takes a long time, and last night half of our jobs failed with the message "Could not allocate resources within allowed timeframe (43200 sec) Failed to start VM backup in the allowed time due to insufficient avaliable resources. Timeout: [43200 sec]". How do we disable or extend the timeout so that jobs do not fail when no proxy is available for more than 12 hours?

CIFS networking - security and performance
- Like many people, we use a CIFS target (in our case a high-speed NexentaStor ZFS whitebox) on a private 10GbE backup network, both for optimal performance and easy storage expansion and to avoid having the backup files abstracted under a VMFS/NTFS partition on top of our backup target device. We find we need to expose this whitebox to the B&R console as well as to the backup proxies, even though the data flows between the proxy and the storage device. Shouldn't the B&R console pass all storage requests to the proxies (including the initial verification of available CIFS paths) so that the console does not need access to the storage device? Also, does the use of a CIFS target remove the need for a repository server?
- To make use of an active/active 10GbE connection to our CIFS server, we present the same CIFS shares on two separate IPs on different subnets, i.e. \\10.0.2.23\veeambackups and \\10.0.3.23\veeambackups. Using CIFS we can't bond NICs together (to get 20GbE rather than 10GbE active/passive) via ethertrunk etc., as this is not supported on the ESX5i host directly. This means that under Veeam we have to set up 2 targets (one for each IP) and then manually assign jobs to alternate target IPs to make use of active/active CIFS connections. Is there no other way?

Scheduling job chains
- If all our jobs are set to enable synthetic fulls on Saturday evening, what happens if the start date of our jobs varies? Some of our jobs have large amounts of changes, so an individual job may take 1 hour one night and 3 hours another. If the jobs are queued up in a chain, they may start at, say, 23:00 on Saturday on some weeks but only at 01:00 on Sunday on others. If we have rollup days ticked for both Sat and Sun, then some jobs will roll up twice, which is not what we want. Could a change be made to v6 so that we could select multiple rollup days (in our case Fri+Sat+Sun) together with another option, "Don't attempt to rollup more than once per week"?
- We have a similar issue with active full backups when jobs can start on different days. How do we get around this? Perhaps one solution is to treat multiple jobs as a single logical unit (e.g. dailybackup consists of job1+job2+job3+job4+job5), so that the dailybackup unit has the start time, rollup, incremental and active full backup schedule rather than the individual jobs.
- In the past we have had to deal with a lot of these quirks using PowerShell, but we would prefer an out-of-the-box solution with v6 if possible.

Any advice would be much appreciated!
thanks

Post by **Gostev** » Dec 06, 2011 11:17 pm

Hi,

UI
This is not quite possible, as a job does not run on a proxy. Each VM in the job can run on a different proxy, depending on where the intelligent load balancing places it at runtime. We do provide this information in the per-VM statistics, though.

Disable proxy on B&R console
The missing ability to disable it is a UI bug that will be fixed.

Scheduling timeout when no available proxy
There is a registry key for that, but I do not have it handy (at home now).

CIFS networking - security and performance
- I believe that the UI does need to interact with the share as you go through the wizards. Yes, if the CIFS target is located in the same site, then there is no need for a repository server.
- Yes, there have to be 2 different backup repositories, one per IP.

Scheduling job chains
You should not be chaining the jobs with v6. Instead, order them using resource scheduling: limit the concurrent tasks on proxies or repositories, and set all jobs to start in the required order (18:00, 18:01, 18:02, etc.). Jobs that were started later will wait for all jobs that were started earlier to finish. With this in place, if the rollup is scheduled for Saturday and the job starts on Saturday, the rollup will still happen even if the actual processing is delayed until Sunday because no processing resources are available.

Thanks!
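
The staggered start times suggested above (18:00, 18:01, 18:02, etc.) can be generated rather than typed by hand. The following is only a minimal sketch in Python: the function name `staggered_starts` and the job names are hypothetical, and it merely computes the times. It does not talk to Veeam, so the resulting schedule would still have to be applied to each job by whatever means you normally use.

```python
from datetime import datetime, timedelta


def staggered_starts(first: str, job_names: list[str], step_minutes: int = 1) -> dict[str, str]:
    """Map each job name to a start time, spaced step_minutes apart.

    `first` is the start time of the first job in 24-hour HH:MM form.
    Jobs keep the order in which they are listed, so the first name gets
    the earliest slot. Times wrap around past midnight.
    """
    base = datetime.strptime(first, "%H:%M")
    return {
        name: (base + timedelta(minutes=index * step_minutes)).strftime("%H:%M")
        for index, name in enumerate(job_names)
    }


if __name__ == "__main__":
    # Hypothetical job names; 14 jobs as in the original post.
    jobs = [f"job{n}" for n in range(1, 15)]
    for name, start in staggered_starts("18:00", jobs).items():
        print(name, start)
```

The ordering matters more than the exact spacing: according to the reply above, a job started later waits for the jobs started before it, so one minute between jobs is enough to fix the sequence.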

Post by **ashleyw** » Dec 07, 2011 12:15 am

thanks! Excellent fast response - as per usual!

UI
My understanding is that VMs within a single job are processed sequentially so that a single job can currently have only a single VM being backed up at any one time. As that VM is being processed by a proxy, I can't see why it wouldn't be possible for the proxy currently handling the VM in the job to be shown on the job summary screen? The reason we ask for this is that it would make potential troubleshooting with the load balancing easier.

Scheduling timeout when no available proxy
I've done a registry search but couldn't find that value anywhere. It would be great if you can dig that out for us as I'm sure it would also apply to many other installations.

Scheduling job chains
Once we can change the scheduling timeout, then all our problems should go away (provided as you say the job schedule date/time rather than the actual time allocated to a proxy is used).

Well done to the Veeam team for such a solid release! Definitely the best yet!

Post by **ashleyw** » Dec 07, 2011 9:35 pm

Hi Gostev, were you able to dig out the registry change/hack to increase the scheduling timeout? (We had some more failures last night due to the same issue).

cheers

Post by **Gostev** » Dec 07, 2011 9:41 pm

Yes, going through my email and I was just about to post.

Depending on the hypervisor, create the corresponding value in the Veeam registry key:

Code:

VwVmReadyToBackupTimeout
HvVmReadyToBackupTimeout

For some weird reason, this is of REG_SZ type (string). For example, for 20 hours you should enter this:

Code:

20:0:0

Default is 12 hours.

Post by **ashleyw** » Dec 07, 2011 9:58 pm

great thanks!
So just to be clear it's

Code:

\HKEY_LOCAL_MACHINE\SOFTWARE\VeeaM\Veeam Backup and Replication\VwVmReadyToBackupTimeout

REG_SZ type - a value of 99:0:0 will give us a scheduling timeout of 99 hours (for VMware ESX5i hypervisors)?

Do the Veeam services need to be restarted for this default to be picked up?
It would be great if this setting could be adjusted via Tools > Options on the console.

Post by **DSmith** » Dec 07, 2011 11:18 pm

With any registry edit, it is always a good idea to restart the services. I have picked up your support ticket, please feel free to reply to it if you need anything or have any questions.

Post by **rhnb** » Dec 12, 2011 3:19 pm

Just a quick query about this registry key to increase the timeout.
I added the value as described ...
\HKEY_LOCAL_MACHINE\SOFTWARE\VeeaM\Veeam Backup and Replication\VwVmReadyToBackupTimeout
and gave it a value of 24:0:0 (i.e. 24 hours), but one of my jobs timed out after waiting for resources for the default 12 hours.
I had rebooted the server after the change so the new value should have been in effect.

Can I check the syntax please.
The type is REG_SZ and the value is
24:0:0
or does it have to be in quotes?
"24:0:0"

Post by **Gostev** » Dec 12, 2011 3:23 pm

No quotes.

Post by **rhnb** » Dec 12, 2011 3:33 pm

Hmmm - that's what I've got - no quotes.

VwVmReadyToBackupTimeout REG_SZ 24:0:0

any other suggestions?

Post by **Gostev** » Dec 12, 2011 4:55 pm

Please contact support for assistance.

Post by **ashleyw** » Feb 26, 2012 11:12 pm

[merged]

Hi,

Anton previously advised us to use the following registry key to extend the default time-outs for jobs waiting to start
"\HKEY_LOCAL_MACHINE\SOFTWARE\VeeaM\Veeam Backup and Replication\VwVmReadyToBackupTimeout" (REG_SZ)
and a value of say "99:0:0" without quotes would extend the time-outs from the default of 12 hours to 99 hours.

This last weekend - it was time for our active full backups so the jobs take substantially longer than the daily incrementals. We noticed the jobs timing out after 12 hours.

Has anyone else noticed this behaviour despite running with the registry key entry?
Has something changed with the latest patchset to break this?

cheers
Ashley

Post by **dellock6** » Feb 27, 2012 9:16 am

I have never had to use that registry key, but is it enough to set the value, or do the Veeam services also need to be restarted?
A possible workaround would be to change the schedule of the queued jobs, but I do not know if it's your case.

Post by **rhnb** » Feb 27, 2012 12:03 pm

Yes, I've got this problem. I've rebooted, restarted services but no joy. I was advised to open a call with support which I haven't done - yet. I had other things going on at the time and it slipped my mind, and I've another call in at the moment which is a lot more urgent.
For us it's an infrequent occurrence now we've juggled job timings around to get round it, so the default works, but it would be nice to get to the bottom of why it doesn't work, and then I could just set the jobs up to start at 18:01,18:02,18:03 etc.

Post by **foggy** » Feb 27, 2012 2:27 pm

Guys, values exceeding 24h should be set in the following format: D.HH:MM:SS. I.e., to set 99h timeout, create the value 4.03:00:00.
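
The D.HH:MM:SS rule above is easy to get wrong by hand, so here is a minimal Python sketch that builds the string from a duration and wraps it in the text of a .reg file. The helper names `format_timeout` and `reg_file_text` are hypothetical. Two assumptions to note: the sketch zero-pads every field (20:00:00 rather than the 20:0:0 shown earlier in the thread), and it does not account for the locale dependence mentioned later in this thread, so check the result on your own system before relying on it.

```python
from datetime import timedelta

# Key path and value names as quoted in this thread.
VEEAM_KEY = r"HKEY_LOCAL_MACHINE\SOFTWARE\VeeaM\Veeam Backup and Replication"
VMWARE_VALUE = "VwVmReadyToBackupTimeout"
HYPERV_VALUE = "HvVmReadyToBackupTimeout"


def format_timeout(delay: timedelta) -> str:
    """Render a timeout as HH:MM:SS, or D.HH:MM:SS once it reaches 24 hours."""
    total = int(delay.total_seconds())
    if total < 0:
        raise ValueError("timeout must not be negative")
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)
    clock = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
    # The day component is only written when it is needed.
    return f"{days}.{clock}" if days else clock


def reg_file_text(value_name: str, delay: timedelta) -> str:
    """Build the text of a .reg file that sets the REG_SZ timeout value."""
    return (
        "Windows Registry Editor Version 5.00\n\n"
        f"[{VEEAM_KEY}]\n"
        f'"{value_name}"="{format_timeout(delay)}"\n'
    )


if __name__ == "__main__":
    # 99 hours is 4 days and 3 hours, so this writes 4.03:00:00.
    print(reg_file_text(VMWARE_VALUE, timedelta(hours=99)))
```

Generating the text instead of writing to the registry directly keeps the sketch harmless to run anywhere. Importing the resulting file, and restarting the Veeam services as advised earlier in the thread, remain manual steps.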

Post by **lorengordon** » Feb 27, 2012 2:48 pm

ashleyw wrote: CIFS networking - security and performance
- To make use of an active/active 10GbE connection to our CIFS server, we present the same CIFS shares on two separate IPs on different subnets, i.e. \\10.0.2.23\veeambackups and \\10.0.3.23\veeambackups. Using CIFS we can't bond NICs together (to get 20GbE rather than 10GbE active/passive) via ethertrunk etc., as this is not supported on the ESX5i host directly. This means that under Veeam we have to set up 2 targets (one for each IP) and then manually assign jobs to alternate target IPs to make use of active/active CIFS connections. Is there no other way?

I'd like to second the request for a better model of CIFS repositories. Here's a clip of a feature request I sent to our CSM over the weekend:

Loren wrote: One of the items on my mind is improving the modeling of backup repositories...in particular CIFS/NFS repositories. By this, I mean that these storage devices often have multiple interfaces, and there is a maximum ingest rate for each interface as well as the device as a whole.

Right now, barring some cuteness with DNS resolution, each interface must be added as a separate backup repository, and the controls on number of jobs and bandwidth are per repository. For example, say I have a backup device with four 1GbE interfaces, but a maximum ingest rate of 2Gbps. If I add four repositories, one for each interface, then I start running into issues where I might exceed the device's maximum ingest rate. Additionally, I have to manually manage the backup jobs that are pointing at each interface to attempt to balance the load. That's quite a headache.

My feature request is this: Simply make interfaces a property of the backup repository. Both interfaces and the repository should have configurable maximum ingest rates. This enables an awful lot of possibilities for Veeam Backup to manage and distribute the load of the backup jobs across different interfaces transparently, and even allows the software to account for connectivity failures to individual interfaces (automatic failover).

Basically, this would get us network-level multipathing for CIFS/NFS repositories. This is my number 1 feature request right now.

Thanks!
-Loren
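
To make the feature request above concrete, here is a small Python sketch of a repository that owns several interfaces, each with its own ingest cap, plus a cap for the device as a whole. Everything in it is hypothetical (the class names, the `assign` method, the least-loaded placement rule and the Mbps figures); it illustrates the idea only and does not describe how Veeam models repositories.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Interface:
    address: str        # IP or host name the share is reachable on
    max_mbps: int       # ingest cap of this interface
    load_mbps: int = 0  # bandwidth currently reserved on it


@dataclass
class Repository:
    name: str
    max_mbps: int       # ingest cap of the device as a whole
    interfaces: list[Interface] = field(default_factory=list)

    def load_mbps(self) -> int:
        """Total bandwidth reserved across all interfaces."""
        return sum(iface.load_mbps for iface in self.interfaces)

    def assign(self, job_mbps: int) -> Optional[Interface]:
        """Reserve bandwidth on the least-loaded interface with headroom.

        Returns None when either the device cap or every interface cap
        would be exceeded, in which case the job would have to wait.
        """
        if self.load_mbps() + job_mbps > self.max_mbps:
            return None
        candidates = [
            iface for iface in self.interfaces
            if iface.load_mbps + job_mbps <= iface.max_mbps
        ]
        if not candidates:
            return None
        target = min(candidates, key=lambda iface: iface.load_mbps)
        target.load_mbps += job_mbps
        return target


if __name__ == "__main__":
    # Four 1GbE interfaces on a device that can ingest 2Gbps in total.
    repo = Repository("nas", 2000, [Interface(f"if{n}", 1000) for n in range(4)])
    for job in range(5):
        placed = repo.assign(500)
        print(job, placed.address if placed else "queued")
```

With the example from the request (four 1GbE interfaces, 2Gbps device limit), four 500Mbps jobs spread across the four interfaces and a fifth is held back by the device cap, which is the behaviour that four separate per-interface repositories cannot express.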

Post by **ashleyw** » Feb 28, 2012 1:06 am

foggy wrote: Guys, values exceeding 24h should be set in the following format: D.HH:MM:SS. I.e., to set 99h timeout, create the value 4.03:00:00.

Great! thanks a lot @foggy. I've made the change on our side and I'll see what happens on the next full backup. It would be great if this quirky setting could be modified via the UI under the advanced options to make it easier for everyone.

Post by **rhnb** » Feb 28, 2012 9:28 am

Thanks foggy. So, if I'd picked 23hrs 59 mins it would have worked!

Post by **Gostev** » Feb 28, 2012 9:54 am

Guess what, the format of this counter also depends on the system locale! I wake up in a cold sweat when I have dreams about this registry hack. What was the developer who chose to use a string value smoking? Why in the world is this not a normal DWORD counter in seconds, like every other timeout counter we have? So embarrassing!!

I will ask to change it to DWORD in the next release, and to increase the default. With daily jobs in mind, do you all think that 23 hours would be a good default value?

Post by **ashleyw** » Mar 05, 2012 12:45 am

thanks Anton, 24 hours would be a good default for most people - but we'd really like to see the settings changeable via the UI as well to prevent having to hunt out registry entries for this type of setting.

The only time this setting hits us is when monthly active full backups kick in, as these take significantly longer than our normal incrementals and synthetic fulls. In our case we'd like to see the default at 48 hours to prevent any failures with our backup size.
