-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
What's the Veeam Way: Confirm SOBR offload either in progress or completed
Currently we're monitoring for SOBR upload errors logged to the Windows Event log - this doesn't work well because errors are "expected", so it's hard to tell "expected" errors from "I'm broken and you need to fix me" errors. For example, we had 22 offload failures on one of our SOBRs in the last 24 hours... that's more than we usually see - but it doesn't tell us if human intervention is required.
What's "the Veeam way" for a monitoring system to confirm SOBR offload has completed, or is still in progress, or has real errors which need manual intervention to resolve?
We've got VSPC if that helps.
Thanks
-
- Chief Product Officer
- Posts: 32230
- Liked: 7592 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
The daily SOBR status email report provides a good summary.
@Egor Yakovlev please also check if some of those Windows Event log events should really be warnings, or not logged at all. Temporary connection issues might be better not mentioned at all, unless of course they are already logged only after lots of fighting and retries? It is just that 22 failures would indicate we're too spammy, unless there were actual major backup infrastructure or Internet access or object storage issues during those 24 hours.
Please include @veremin in review to understand what can be optimized.
-
- Product Manager
- Posts: 2597
- Liked: 715 times
- Joined: Jun 14, 2013 9:30 am
- Full Name: Egor Yakovlev
- Location: Prague, Czech Republic
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Sounds good, queued for investigation.
/Cheers!
-
- Product Manager
- Posts: 20677
- Liked: 2382 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Sure, we will have a call with Egor this week to review the current situation with offloading errors, warnings and reporting. Thanks!
-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Wow - what a response guys - thanks!!
While this thread isn't specifically about this case, you might find helpful background in Case #05930792 and Case #04800922. We don't normally open cases for this, but it's suboptimal to live with it as we have been.
The daily SOBR status email report looks like a good place for us to start, as an "OK / go look at it" indication - thanks.
If you want me to open a case with some logs, including Windows event logs, let me know.
Thanks
Alex
-
- Service Provider
- Posts: 47
- Liked: 2 times
- Joined: Jul 27, 2020 1:16 pm
- Full Name: SYK
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Hi
This may seem like overkill, but here is something we cobbled together based on similar conversations and suggestions on the forums.
This gives you information for each offload "task" (one for each backup "job"): name, ID, and the last time the job succeeded and actually sent data. The task name will sometimes change to "name of the SOBR Offload" depending on whether the task was independent or not, but the ID stays the same (I don't know of a great way to deal with that).
Code:
"%SystemRoot%\system32\WindowsPowerShell\v1.0\powershell.exe" -noprofile -command "import-module Veeam.Backup.PowerShell; $sobrOffload = [Veeam.Backup.Model.EDbJobType]::ArchiveBackup; $sessions =[Veeam.Backup.Core.CBackupSession]::GetByTypeAndTimeInterval($sobrOffload,'9/1/2022', (Get-Date).adddays(1)) ; $taskgroups = $sessions.gettasksessions() | where {($_.progress.TransferedSize -gt '0') -and($_.status -eq 'Success')} |group-object -property Name; $lastSuccessTasks = foreach ($Task in $Taskgroups) {$task.group | sort -property {$_.progress.stoptimelocal} | select -last 1 -Property JobName, Name, Status, @{l='EndTime';e={$_.progress.StopTimeLocal}}, @{l='Duration'; e={$_.progress.duration}}, @{l='TransferedSize (GB)'; e={$_.progress.TransferedSize/1GB}} }; 'Task Count:' ; ($lastsuccessTasks | measure-object).count ; $lastsuccessTasks| sort jobname | convertto-csv"
Forgive the ugly look. The simplest way to run it via our RMM daily was as a one-line CMD.
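A monitoring system could turn the CSV that this one-liner prints into an "OK / go look at it" signal by flagging tasks whose last successful offload is older than some threshold. The following is only a minimal sketch in Python under several assumptions: the function name `stale_offload_tasks`, the 24-hour threshold and the `EndTime` date format are all hypothetical (the real format depends on the server's locale), and the sample rows are made up for illustration.

```python
import csv
import io
from datetime import datetime, timedelta


def stale_offload_tasks(csv_text, now, max_age_hours=24,
                        time_format="%d/%m/%Y %H:%M:%S"):
    """Return (JobName, Name, EndTime) for tasks whose last success is too old.

    time_format is an assumption; adjust it to the server's locale.
    """
    lines = csv_text.splitlines()
    # Skip everything before the CSV header: the 'Task Count:' lines and any
    # '#TYPE ...' line that ConvertTo-Csv may emit.
    header = next((i for i, line in enumerate(lines)
                   if line.startswith('"JobName"')), None)
    if header is None:
        raise ValueError("no CSV header found in the command output")
    reader = csv.DictReader(io.StringIO("\n".join(lines[header:])))
    cutoff = now - timedelta(hours=max_age_hours)
    stale = []
    for row in reader:
        end_time = datetime.strptime(row["EndTime"], time_format)
        if end_time < cutoff:
            stale.append((row["JobName"], row["Name"], end_time))
    return stale


# Made-up sample output, shaped like what the one-liner prints.
sample = '''Task Count:
2
"JobName","Name","Status","EndTime","Duration","TransferedSize (GB)"
"SOBR1 Offload","Job A","Success","18/04/2023 07:43:27","00:10:00","1.5"
"SOBR1 Offload","Job B","Success","10/04/2023 23:40:47","00:05:00","0.2"
'''

for job, task, when in stale_offload_tasks(sample, datetime(2023, 4, 18, 12, 0, 0)):
    print(f"No successful offload since {when} for task {task} ({job})")
```

Note that a task which has never succeeded since the start date in the one-liner will not appear in the CSV at all, so a check like this would have to be combined with a list of expected task names to catch that case.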
-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Thanks very much sykerzner! I'll give that a go - certainly a great place to start 

-
- Product Manager
- Posts: 20677
- Liked: 2382 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Hey, Alex, we discussed the issue further, and in order to change the behavior or suggest something further we'd like to get the exact failure that got logged 22 times. This should help us to re-verify the logic behind the particular event. Thanks!
-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Hi,
I've uploaded both the Veeam logs and the Veeam Backup Windows event log (which is what we've been looking at) to Case #05930792.
Thanks!
-
- Product Manager
- Posts: 20677
- Liked: 2382 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Thanks for the reference, we will review the provided information and post back. Thanks!
-
- Product Manager
- Posts: 20677
- Liked: 2382 times
- Joined: Oct 26, 2012 3:28 pm
- Full Name: Vladimir Eremin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
We've contacted your support engineer recently.
Next week we will review the event logs and see whether the given error is logged with the necessary priority (error instead of a warning) and the necessary number of times. This will help us to understand if there is room for improvement.
I will update the topic once I have more information.
Thanks!
-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Just to share today's alerts from our monitoring, based on the event logs:
13 SOBR offload failures on OUR-SP-SOBR1 in last 24 hours. Offsite backups may be incomplete! Most recent 2023-04-18 07:43:27
2 SOBR offload failures on TENANT1-SOBR1 in last 24 hours. Offsite backups may be incomplete! Most recent 2023-04-17 22:39:21
2 SOBR offload failures on TENANT2-SOBR1 in last 24 hours. Offsite backups may be incomplete! Most recent 2023-04-17 09:01:52
2 SOBR offload failures on TENANT3-SOBR1 in last 24 hours. Offsite backups may be incomplete! Most recent 2023-04-18 04:59:46
We're looking to move over to the VSPC alerts, though integrating those into our systems / process is rather challenging.
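For readers who want to reproduce this kind of alert, here is a minimal Python sketch of the aggregation step only: counting failure events per SOBR within a 24-hour window and formatting lines like the ones above. How the events are read from the Windows event log is not shown, and the function name `summarize_failures` and the `(sobr_name, timestamp)` input shape are assumptions for illustration, not the monitoring setup described in this post.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summarize_failures(events, now, window_hours=24):
    """events: iterable of (sobr_name, datetime) offload failure events."""
    cutoff = now - timedelta(hours=window_hours)
    recent = defaultdict(list)
    for sobr, when in events:
        # Only failures inside the look-back window count towards the alert.
        if cutoff <= when <= now:
            recent[sobr].append(when)
    alerts = []
    for sobr in sorted(recent):
        times = recent[sobr]
        alerts.append(
            f"{len(times)} SOBR offload failures on {sobr} "
            f"in last {window_hours} hours. "
            f"Offsite backups may be incomplete! "
            f"Most recent {max(times):%Y-%m-%d %H:%M:%S}"
        )
    return alerts


# Made-up events: two inside the window, one too old to count.
events = [
    ("TENANT1-SOBR1", datetime(2023, 4, 17, 22, 39, 21)),
    ("TENANT1-SOBR1", datetime(2023, 4, 17, 20, 0, 0)),
    ("TENANT1-SOBR1", datetime(2023, 4, 15, 9, 0, 0)),
]
for line in summarize_failures(events, now=datetime(2023, 4, 18, 8, 0, 0)):
    print(line)
```

A raw count like this still cannot say whether anyone needs to intervene, which is the problem this thread is about.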
-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
It looks like there are various contributors to these message counts:
This warning seems to be due to an internal Veeam design / scaling issue.
Code:
Object storage cleanup failed: Timed out waiting for the backup files to be released, cancelling the job
This seems to be due to VBR trying to run too many offload jobs at the same time. In this case there appear to have been three instances of "SOBR Offload" plus an "SOBR-NAME Offload" for each SOBR running simultaneously (five total). Due to the required (but undocumented) concurrent task limit on the S3 repo (to avoid rate limit errors from the S3 vendor) this pushes the bottleneck back to the object storage repository S3-EXT-NAME being "unavailable". At least, that's my interpretation.
Code:
Resource not ready: object storage repository S3-EXT-NAME for SOBR-NAME Timed out waiting for backup infrastructure resources to become available (14400 sec)
Code:
18/04/2023 23:59:01 :: Removing checkpoint d4d19b02-323f-41cd-81fe-1bd7b354b1a2 from Capacity Tier...
19/04/2023 01:20:25 :: Checkpoint cleanup failed Details: HTTP exception: WinHttpQueryDataAvaliable: 12002: The operation timed out, error code: 12002
REST API error: 'S3 error: We encountered an internal error. Please retry the operation again later. Code: InternalError', error code: 500 Other: Detail: 'Could not find pool number 2269 in extent B-643390/O-f5db5c2dbe717ca6/S-1',
18/04/2023 23:40:30 :: Checkpoint cleanup failed Details: HTTP exception: WinHttpQueryDataAvaliable: 12002: The operation timed out, error code: 12002
18/04/2023 23:40:32 :: Object storage cleanup failed: HTTP exception: WinHttpQueryDataAvaliable: 12002: The operation timed out, error code: 12002
Shared memory connection was closed.
18/04/2023 23:40:32 :: Object storage cleanup failed: HTTP exception: WinHttpQueryDataAvaliable: 12002: The operation timed out, error code: 12002
Exception from server: HTTP exception: WinHttpQueryDataAvaliable: 12002: The operation timed out, error code: 12002
18/04/2023 23:40:47 :: Offload finished with warning at 18/04/2023 23:40:47
And other related transient errors from the S3. Up to a point, VBR should just accept these as normal, retry, and only report them as errors if they fail repeatedly.
We're plagued by this occasional error and can't find the cause. All AV exclusions are in place as the most aggressive exclusions possible, and applied to both the file path and file names. Windows Defender is uninstalled.
Code:
18/04/2023 07:43:27 :: Failed to offload backup. Error: Failed to call RPC function 'FcRenameFile': The process cannot access the file because it is being used by another process. Failed to rename file from [D:\Veeam\Backups\xxxxxxxxxxxxxxxxxxxxxxx\xxxxxxxxxxxxxxxxxxxxxx\xxxxxxxx.vbm.temp] to [D:\Veeam\Backups\xxxxxxxxxxxxxxxxxxxxxxx\xxxxxxxxxxxxxxxxxxxxxx\xxxxxxxx.vbm].
File 'D:\Veeam\Backups\xxxxxxxxxxxxxxxxxxxxxxx\xxxxxxxxxxxxxxxxxxxxxx\xxxxxxxx.vbm.temp' locked by 0 processes:.
File 'D:\Veeam\Backups\xxxxxxxxxxxxxxxxxxxxxxx\xxxxxxxxxxxxxxxxxxxxxx\xxxxxxxx.vbm' locked by 0 processes:.
18/04/2023 07:43:27 :: Failed to upload meta into master agent.
We find the "locked by 0 processes" very suspicious. Does that mean it's locked by zero Veeam processes, or zero processes in total (in which case the whole message could be wrong, as it means the file is NOT locked open as it says)?
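The "accept as normal, only report if repeated" idea can also be approximated on the monitoring side. Below is a minimal Python sketch, assuming the monitoring system already has the failure lines as plain text. The patterns are copied from the log excerpts in this thread, but which of them count as "transient", the names `classify` and `needs_attention`, and the repeat threshold are assumptions for illustration, not Veeam guidance.

```python
import re
from collections import Counter

# Patterns taken from log lines quoted in this thread. Treating them as
# "transient" is an assumption made for this example only.
TRANSIENT_PATTERNS = {
    "http_timeout": re.compile(r"WinHttp\w+: 12002"),
    "s3_internal_error": re.compile(r"Code: InternalError"),
    "resource_wait_timeout": re.compile(r"Timed out waiting for"),
}


def classify(line):
    """Return the name of the first transient pattern that matches, else None."""
    for name, pattern in TRANSIENT_PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def needs_attention(lines, repeat_threshold=5):
    """Escalate unknown failures, plus transient kinds seen too often."""
    counts = Counter()
    escalate = []
    for line in lines:
        kind = classify(line)
        if kind is None:
            escalate.append(line)  # unrecognised failure: a human should look
        else:
            counts[kind] += 1
    for kind, n in counts.items():
        if n >= repeat_threshold:
            escalate.append(f"{kind} occurred {n} times")
    return escalate


failures = [
    "Checkpoint cleanup failed Details: HTTP exception: "
    "WinHttpQueryDataAvaliable: 12002: The operation timed out, error code: 12002",
    "Object storage cleanup failed: HTTP exception: "
    "WinHttpQueryDataAvaliable: 12002: The operation timed out, error code: 12002",
    "Failed to offload backup. Error: Failed to call RPC function 'FcRenameFile'",
]
for item in needs_attention(failures):
    print(item)
```

With the default threshold, the two timeouts are swallowed and only the rename failure is escalated. Lowering `repeat_threshold` to 2 would also report the repeated timeouts.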
-
- Veteran
- Posts: 613
- Liked: 92 times
- Joined: Dec 20, 2015 6:24 pm
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
This is v11, a few common errors/warnings:
Error (not sure if this has to be an error as the blackout period was set on purpose):
20.04.2023 06:19:18 :: Processing xxxxx Error: Job was stopped due to backup window setting
Error (should this really be an error?):
08.04.2023 23:14:55 :: Processing xxxx Error: Stopped by job 'xxxx' (Backup)
Warning:
19.04.2023 22:27:46 :: Object storage cleanup failed: Failed to retrieve certificate from https://s3.dualstack.ap-southeast-1.amazonaws.com
Error (very common over all our different locations with buckets in different regions, not sure why the above is a warning and this an error):
19.04.2023 17:00:44 :: Processing xxxxx Error: Failed to retrieve certificate from https://s3.dualstack.ap-southeast-1.amazonaws.com
Error (random but very common, I guess it has to be an error, but as this happens only randomly we usually ignore it):
09.04.2023 05:00:35 :: Processing xxxxx Error: HTTP exception: WinHttpSendRequest: 12030: The connection with the server was terminated abnormally
, error code: 12030
Warning (happens a lot, we tweaked some settings in the past but without a 100% solution, so we just ignore it):
08.04.2023 19:45:31 :: Object storage cleanup failed: REST API error: 'S3 error: Please reduce your request rate.
Code: SlowDown', error code: 503
Other: HostId: 'xxxxxx
-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
I agree that "Job was stopped due to backup window setting" should not be an error. It's an indication that the system is working as designed / configured.
-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
Still causing drama several days a week...
Code:
Processing 0c31728d-5c3c-46fa-925f-9edbe89621b7 Error: Timed out waiting for backup infrastructure resources to become available (14400 sec)
These seem to be routine - and due to other offload jobs running. Bear in mind we've been told by support to limit concurrent jobs on the S3 repo to 2 to deal with another message - very likely this one:
Code:
REST API error: 'S3 error: Please reduce your request rate.
This whole design of "run LOADS of offload jobs, often at the same time, have them ignore that other jobs are already running, then log errors when they time out" just seems "highly suboptimal".
Oh!!!
Code:
Error: Backup file version mismatch: scale-out backup repository rescan is required.
Given this was in sync previously, and nothing other than VBR has touched either the performance or capacity tiers - it is "very disappointing" that this seems to keep happening, long after the upgrade to v12 was supposed to improve all this.
If a rescan really is required - why doesn't VBR queue one up and suspend all the offload jobs (which will likely fail anyway) until it's completed?
-
- Veteran
- Posts: 563
- Liked: 173 times
- Joined: Nov 15, 2019 4:09 pm
- Full Name: Alex Heylin
- Contact:
Re: What's the Veeam Way: Confirm SOBR offload either in progress or completed
The rescan has spat out a load of warnings like:
Code:
Failed to import backup Backup Copy xxxxxxxxx\yyyyyyy - zzzzzz Details: The existing index has a different backup id
These are presumably because the SP side is sulking about a tenant having built a new backup server and remapped the new backups to the old chain, having upgraded from "per-machine data single metadata" to "per-machine data per-machine metadata".
SPs need a system that works and is more reliable and less needy than this!
Thanks
Alex