Feature Improvement Request
-------------------------------------------
Based on our discussion with Veeam Support (Case #08103472), we understand that gateway server failover during SOBR operations is not supported by design.
The job selects an available gateway VM at the start of the task and continues processing through that gateway.
If the selected gateway encounters an issue, such as a disk-related problem, the task ultimately fails instead of automatically switching to another available gateway VM.
With that understanding, we would like to know:
In a multi-gateway server configuration, why does another available gateway not automatically take over when the active gateway becomes unavailable during job execution?
Additionally, are there any future plans or feature enhancements being considered to introduce automatic gateway failover for improved resiliency and operational continuity?
-
satyaibm007
- Service Provider
- Posts: 6
- Liked: never
- Joined: Feb 27, 2025 12:29 pm
- Full Name: Satyabrata Roy
-
david.domask
- Product Manager
- Posts: 3663
- Liked: 889 times
- Joined: Jun 28, 2016 12:12 pm
Re: Case #08103472 - SOBR Job failed due to /root partition 100% utilized on gateway server
Hi Satyabrata,
Correct, currently if the gateway experiences an aborting error, the task(s) assigned to that gateway will fail, and there is no mechanism for failover to another gateway within the same session. Aborting errors (like the out-of-space errors from the case) typically require user interaction to resolve, and Automatic selection (which I see is used in your situation) may result in non-ideal gateway selection.
We will discuss internally the options for handling this failure more gracefully, but in an ideal configuration it's best to specify suitable gateways for working with the Object Storage repository and to monitor for such issues. Our upcoming 13.1 includes hardware health / monitoring for the Veeam Infrastructure Appliances, which ought to help you keep on top of issues such as lack of space, and the appliance itself will have additional controls to help prevent / detect such out-of-space issues before they impact backup operations.
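As a rough illustration of the monitoring suggested above, here is a minimal sketch of a free-space check that could run on a Linux gateway server. It is not a Veeam tool or API: the function names, the 85% threshold, and the choice of `/` as the partition to watch are all assumptions for the example, and it uses only the Python standard library.

```python
import shutil


def partition_usage_percent(path="/"):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


def check_partition(path="/", warn_at=85.0):
    """Return ("WARNING" or "OK", percent used) for the given path.

    `warn_at` is a hypothetical threshold; pick one that fits your environment.
    """
    percent = partition_usage_percent(path)
    status = "WARNING" if percent >= warn_at else "OK"
    return status, percent


if __name__ == "__main__":
    # Assumed partition and threshold, chosen only for illustration.
    status, percent = check_partition("/", warn_at=85.0)
    print(f"{status}: / is {percent:.1f}% full")
```

Run on a schedule (for example from cron) and wired into whatever alerting is already in place, a check like this could flag a filling root partition before a job lands on that gateway. The percentage may differ slightly from what `df` reports, since `df` excludes blocks reserved for root from its calculation.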
David Domask | Product Management: Principal Analyst