-
- Novice
- Posts: 8
- Liked: never
- Joined: Aug 20, 2019 5:06 am
- Full Name: Mark Mathieson
- Contact:
Feature Request: smarter Cluster backups
Greetings all,
Are there any plans to make Veeam Agents smarter with respect to clusters and replicated data?
At this stage (Veeam 9.5 U4a), a fail-over cluster with physical RDMs needs to be backed up with an agent, as vSphere cannot snapshot the physical RDMs. The agent is loaded on each node, and the Change Block Tracker assesses the node with a view that all data is local. If a fail-over event occurs, and the RDMs are removed from the primary node and mounted against the secondary node, the agent on the secondary node sees the relocated RDMs as new data and tries to back it all up from scratch, blowing out the backup time and the repository size and duplicating all of the data within Veeam. Instead of doing an incremental backup in a couple of hours, our backup blows out to 4 days, and fills the repository with new data, requiring manual intervention to prevent catastrophic failure.
The agent should be able to recognise the relocated data as already existing in Veeam, then do incremental backups only, but it doesn't.
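To illustrate the behaviour described above, here is a toy Python sketch. It is not how Veeam tracks changes internally; the function name `blocks_to_back_up` and the keys used are made up for illustration. It only contrasts keeping the change-tracking baseline per (node, disk) with keeping it per disk, which is the difference being asked for.

```python
# Toy model only: this is NOT Veeam's implementation.
# It contrasts a change-tracking baseline stored per (node, disk)
# with one stored per disk, to show why a failover can look like
# "all new data" to the node that takes over.

def blocks_to_back_up(state: dict, key, current_blocks: dict) -> set:
    """Return the IDs of blocks that differ from the baseline stored
    under `key`, then record the current content as the new baseline."""
    known = state.get(key, {})
    changed = {b for b, content in current_blocks.items() if known.get(b) != content}
    state[key] = dict(current_blocks)
    return changed

disk = {0: "a", 1: "b", 2: "c"}  # block id -> content of one shared disk

# Baseline keyed per node: node2 has no baseline, so every block looks new.
per_node = {}
blocks_to_back_up(per_node, ("node1", "disk1"), disk)
after_failover_per_node = blocks_to_back_up(per_node, ("node2", "disk1"), disk)

# Baseline keyed per disk: it follows the disk, so nothing looks new.
per_disk = {}
blocks_to_back_up(per_disk, "disk1", disk)
after_failover_per_disk = blocks_to_back_up(per_disk, "disk1", disk)

print(len(after_failover_per_node), len(after_failover_per_disk))
```

In the per-node case the unchanged disk is read in full again after the failover; in the per-disk case only genuinely changed blocks would be picked up.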
Ironically, Update 4b is Storage Replica aware, and will only back up data once when it is actually replicated, but the Failover Cluster data that exists only once gets duplicated by Veeam.
I've seen various similar topics around regarding Microsoft Failover Clusters, AAG and DAGs, but nothing that quite covers this.
Thoughts?
-
- Product Manager
- Posts: 14914
- Liked: 3109 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: Feature Request: smarter Cluster backups
Hello,
and welcome to the forums.
"and the repository size and duplicating all of the data within Veeam"
that's not normal. Do you have a case number for that behavior?
"Ironically, Update 4b is Storage Replica aware"
can you describe what you mean? The agent does not see what happens on the storage. It only sees the shared volume mounted to the cluster.
"If a fail-over event occurs"
I guess you are talking about a Windows cluster failover event. Nothing changes in the VMware environment?
Best regards,
Hannes
-
- Novice
- Posts: 8
- Liked: never
- Joined: Aug 20, 2019 5:06 am
- Full Name: Mark Mathieson
- Contact:
Re: Feature Request: smarter Cluster backups
Hi Hannes,
Thanks for your response!
When you say "that's not normal", are you implying that the Agent should only create one copy of data, regardless of which Failover Cluster node the data is hosted on? We were advised by our vendor that having to do a full backup as a baseline each time the failover cluster changed nodes was entirely normal, so we never logged a job. We had been assured that this was normal behaviour.
The reference to Storage Replica is admittedly a little off-topic here: Storage Replica doesn't use RDMs, so it can be snapshotted, and therefore backed up agentless, so it's not really a discussion for this thread, as it won't use the Veeam agent. But the release notes state "support for Windows Server Storage Replica, including automatic exclusion of duplicate copies of data at backup time", so it seems that Veeam can identify data duplication in Storage Replica clusters, but not when using the Agent in Failover clusters.
Sorry, yes: the failover reference was talking about a Microsoft Failover Cluster fail-over, not the ESXi hosts. So if the MS Failover cluster moves the disk resources from the Windows Primary Node to the Windows Secondary Node, the Agent sees that as new data and tries to back it up from scratch.
This is problematic for two reasons: 1. It takes 4 days to do a full backup of 64TB across the wire via Agent, and 2. the backup repository has a maximum space of 120TB, so we don't have the space for two separate copies in the repository.
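As a rough check of the two figures above, here is a small Python sketch. It assumes decimal units (1 TB = 10^12 bytes), a steady transfer rate over exactly 4 days, and no compression or deduplication; the function names are hypothetical.

```python
# Back-of-the-envelope check of the figures quoted in the post.
# Assumptions: decimal units (1 TB = 1e12 bytes), a steady transfer
# rate, exactly 4 days per full backup, no compression or dedupe.

TB = 10**12

def average_throughput_mb_s(data_tb: float, days: float) -> float:
    """Average rate in MB/s needed to move `data_tb` terabytes in `days` days."""
    seconds = days * 24 * 60 * 60
    return data_tb * TB / seconds / 10**6

def fulls_that_fit(repo_tb: float, full_tb: float) -> int:
    """Number of full backups of size `full_tb` that fit in `repo_tb`."""
    return int(repo_tb // full_tb)

if __name__ == "__main__":
    print(round(average_throughput_mb_s(64, 4)))  # average MB/s over 4 days
    print(fulls_that_fit(120, 64))                # full copies that fit in 120 TB
```

Under these assumptions a 64TB full backup in 4 days works out to roughly 185 MB/s on average, and a 120TB repository holds only one such full, which is why a second full copy cannot fit.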
So, is this normal behaviour? If not, how do we fix it?!
-
- Product Manager
- Posts: 14914
- Liked: 3109 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: Feature Request: smarter Cluster backups
Hello,
"When you say "that's not normal", are you implying that the Agent should only create one copy of data, regardless of which Failover Cluster node the data is hosted on?"
yes. That's what I would expect from any backup solution. But it's a good point to add that to my upcoming blog post. I did the following test: failover from node 1 to node 2. The incremental backup was 350 MByte. Full backup was 13 GByte. After full shutdown of the cluster, the incremental was 3 GByte. There is no full backup if everything works as expected.
"We were advised by our vendor that having to do a full backup as a baseline each time the failover cluster changed nodes was entirely normal, so we never logged a job."
no idea - I'm not the vendor
Best regards,
Hannes
-
- Novice
- Posts: 8
- Liked: never
- Joined: Aug 20, 2019 5:06 am
- Full Name: Mark Mathieson
- Contact:
Re: Feature Request: smarter Cluster backups
Hi Hannes,
So you would expect Veeam to keep only a single copy of all cluster data, and for the Failover cluster to keep doing incremental backups, regardless of which Microsoft server node the data is hosted on?
If this is the expected behaviour, why is our Failover Cluster not behaving as expected, and instead attempting a Full backup every time the Microsoft Cluster fails over nodes? Should I log a Support Call and log this as a fault?
Incidentally, the Vendor in question is a Veeam partner, as recommended to us by Veeam for the Veeam implementation.
Cheers
-
- Product Manager
- Posts: 14914
- Liked: 3109 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: Feature Request: smarter Cluster backups
Hello,
yes and yes (as mentioned in my first post).
Please post the case number here for reference.
Best regards,
Hannes
-
- Novice
- Posts: 8
- Liked: never
- Joined: Aug 20, 2019 5:06 am
- Full Name: Mark Mathieson
- Contact:
Re: Feature Request: smarter Cluster backups
Hi Hannes,
Case # 03744935.
There should be full details in that job, but we're seeing some really odd behaviour in these backups. Like the Backup Repository reporting that it's backing up 158TB of Fileshare cluster, even though the cluster has only 54TB used with a total capacity of 80TB. It then stops and does a full in the middle of the week, for no apparent reason, but doesn't mark it as a Full, just an incremental. We have a SAN limitation of 120TB for a volume, so we are running out of space regularly if it tries to do a full backup on top of an existing full backup, because we just can't fit multiple full backups in the same repository. And every attempt it makes blows our backup out for 3 to 4 days.
Like I said: more details in the case, but let me know if you want any clarification.
Cheers,
Mark
-
- Enthusiast
- Posts: 40
- Liked: 3 times
- Joined: Jun 04, 2019 12:36 am
- Full Name: zaki khan
- Contact:
Re: Feature Request: smarter Cluster backups
Hi
I bet it is because of the "Per-VM backup" enabled on the cloud repository your backup is going to. We are facing a similar issue. The data on the shared disk is backed up and saved twice, once for each node. Each node is saved with an independent copy of the data on the shared disk.
-
- Expert
- Posts: 186
- Liked: 22 times
- Joined: Mar 13, 2019 2:30 pm
- Full Name: Alabaster McJenkins
- Contact:
Re: Feature Request: smarter Cluster backups
https://www.veeam.com/blog/windows-2019 ... agent.html
If you use agents and create an actual failover cluster job then, according to the blog post, even a repo with per-VM chains is ignored and one chain is made for all cluster members. Which makes sense.
So I would question if you properly set up an actual failover cluster backup job as documented in the blog post.
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Dec 13, 2019 7:29 pm
- Contact:
Re: Feature Request: smarter Cluster backups
I know this is an old thread, but I came across it in a search and I want to second it. I'm seeing this exact issue, and I've seen it with multiple file server clusters, too. It seems like if the backup so much as sneezes, or if there are non-existent random "cluster membership changes", it gives up on the backup chain entirely and starts a new full backup. Some of us are backing up 30-40TB data sources with this - an unexpected full backup in the middle of the week is a death sentence, both to the backups that need to run nightly while it runs, and to our backup repositories that need that space. How are we supposed to calculate our required repository storage when it can be nuked randomly? I put in a ticket for this issue, and the support person said that it would be fixed by upgrading to V10, and I upgraded, and it just happened to us again, with a different cluster.
This isn't even unexpected behavior, per this: https://helpcenter.veeam.com/docs/backu ... ml?ver=100
"In case a backup task within a Veeam Agent backup job that processes a cluster completes unsuccessfully, Veeam Agent for Microsoft Windows will create full backup of all shared disks of the cluster."
Why would this product be designed in such a way? Sorry to be so pointed, but I've been a Veeam evangelist since 2014, shouting it from the rooftops when I worked in an environment that was all virtual (I even sold my current org on Veeam), but now that I'm in an environment that has big clustered physical data sources, I am severely disappointed by how far behind the Veeam Agent for Windows is.
-
- Product Manager
- Posts: 14914
- Liked: 3109 times
- Joined: Sep 01, 2014 11:46 am
- Full Name: Hannes Kasparick
- Location: Austria
- Contact:
Re: Feature Request: smarter Cluster backups
Hello,
and welcome to the forums.
The user guide is not precise enough at this point. I will see whether we can improve that. Yes, there are some situations where a new full backup is required. But for most "unsuccessful" backups there is no new full backup. With V10 we improved it further.
As you mentioned that you have regular issues... could you please post the case number to see what we can improve (assuming that you are using V10)?
Thanks,
Hannes