foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by foggy »

dellock6 wrote:So it seems vddk 5.5 is a leftover of the installation Alexander?
Correct.
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

[MERGED] v8 Upd 2 UUID, replication and Hot-Add problems

Post by Yuki »

Case: 00917457

Hi all,
We have an issue with Veeam server/proxies failing to use Hot-Add mode and resorting to Network mode. (We use Veeam internally, we are a Veeam provider, and we manage retail Veeam installations at our clients.) All of this started after upgrading Veeam installs to v8 U2.

What happens is this: any proxy (even the Veeam server itself, if also used as a proxy) will fail to use Hot-Add mode for local VMs if this server/proxy has been replicated with Veeam. We replicate our Veeam server VM (which also works as a proxy) to another data center in another state for redundancy/recovery purposes. We also replicate our clients' VMs to their local systems or to remote systems. In a number of instances our clients' VMs that act as Veeam proxies also perform other important functions, so they want them replicated (protected). However, because Veeam checks UUIDs and finds those dark replicas, it now fails over to Network mode instead of Hot-Add. Network mode then gets used even for backups, which prolongs the backup window.

So far, the suggestion to us was to dedicate new VMs for proxies and not to replicate them. Of course this means increased hardware resource usage and licensing costs for us and our clients.

We did not experience any of these issues prior to the v8 U2 update, and I wanted to see if this is something that can be addressed with a patch or update. Is there any way to stop this behavior?
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by foggy »

Please review this thread for details on the reasons of the observed behavior. We will discuss the ability to exclude replica VMs from UUID check.
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Yuki »

foggy wrote:Please review this thread for details on the reasons of the observed behavior. We will discuss the ability to exclude replica VMs from UUID check.
I did review this and, to be honest, the lack of concern I'm seeing in this thread is alarming. Right now there is no solution for this and it is a problem. Even though the jobs are not failing outright, switching to Network mode means a serious performance hit and longer backup windows. At this time we have blocked updating Veeam B&R to v8 U2 across 80 companies with around 300 physical servers and 600-700 VMs. Unfortunately, we did not catch this earlier in smaller deployments, and v8 U2 made it into our provider environment (which is smaller than all of the retail licenses we manage combined, but the impact was still serious).

I really would like to see this issue escalated and a solution made available to those affected.
Gostev
Chief Product Officer
Posts: 31459
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Gostev » 1 person likes this post

OK, so let me step back a little and explain some essential details of hot add implementation in VDDK to help everyone understand this issue better.

Looking up the proxy VM by BIOS UUID is how VDDK has worked since its inception, so this is not something unique to VDDK 6.0. Why the lookup is performed by non-unique UUID instead of unique moRef is a separate question; however, the VDDK code tells us that this is done deliberately: given the proxy VM moRef passed in by the backup application, VDDK uses it to look up the UUID of the corresponding VM, and then looks up the proxy VM again based on that UUID. So perhaps there are good reasons for this logic, as the VMware support KB link hints. I will try to find out the reason from VMware directly.

But moving on. Ideally, the lookup should return a single VM, in which case there are no issues. How lookup results containing more than one VM are treated is where the differences between VDDK versions appear.

In VDDK 5.5 and earlier, when the lookup returns more than one VM, VDDK just goes ahead with the top one in the list as the hot add proxy. Most likely, the list is arranged by VM creation time (which would explain why proxy VM replicas have never caused issues before). At this point, all users with duplicated proxy UUIDs were divided into "lucky" and "unlucky" ones: for some hot add worked fine, and for others it failed because VDDK picked the wrong proxy VM. Users from the latter group would then create endless "hot add not working" topics on this forum, ending up in our support, which helped them fix non-unique BIOS UUIDs on cloned proxy VMs, thus resolving the whole issue.

VDDK 6.0, on the other hand, immediately fails out of hot add mode after encountering more than one VM with the same UUID - without even attempting to use the first VM in the list. As a result, all the users from "lucky" group, as well as anyone replicating their proxy VMs, are now guaranteed to have their hot add not working.
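
The lookup logic described above can be modelled with a small Python sketch. This is a toy illustration of the behavior as described in this post, not VDDK code: the `Vm` class, the `find_hot_add_proxy` function and its `strict` flag are hypothetical names, and the inventory order is simply whatever list is passed in.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Vm:
    moref: str      # unique managed object reference, e.g. "vm-101"
    bios_uuid: str  # BIOS UUID, not guaranteed to be unique


def find_hot_add_proxy(inventory: List[Vm], proxy_moref: str, strict: bool) -> Optional[Vm]:
    """Toy model of the proxy lookup described above (hypothetical, not VDDK code).

    strict=False mimics the VDDK 5.5 description: take the first UUID match.
    strict=True mimics the VDDK 6.0 description: give up on hot add when
    more than one VM shares the UUID.
    """
    # Step 1: resolve the moRef handed over by the backup application to a UUID.
    by_moref = {vm.moref: vm for vm in inventory}
    uuid = by_moref[proxy_moref].bios_uuid

    # Step 2: look the proxy up again, this time by the non-unique UUID.
    matches = [vm for vm in inventory if vm.bios_uuid == uuid]

    if len(matches) > 1 and strict:
        return None  # hot add is skipped; the job falls back to another mode
    return matches[0]


# Made-up inventory: a proxy, its replica with the same UUID, and another proxy.
inventory = [
    Vm("vm-101", "uuid-a"),
    Vm("vm-202", "uuid-a"),
    Vm("vm-303", "uuid-b"),
]
```

With this inventory, the non-strict lookup for `vm-202` returns `vm-101` (the "unlucky" case), while the strict lookup returns `None` for both `vm-101` and `vm-202`.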

Now, I expect the majority of the users facing this issue after upgrading to U2 are actually former "lucky group" members, because replicating backup proxy VMs is not very common (as there is no reason to protect proxy VMs, unless they also carry some other roles outside Veeam). For such users, a quick and dirty fix would be to simply ensure that all proxy VMs have a unique BIOS UUID by changing the duplicate ones. You can identify proxy VMs with duplicate UUIDs using Veeam logs, as explained earlier in this topic.
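
If you already have a list of VM names and their BIOS UUIDs (exported by whatever means you prefer), spotting the duplicates is a simple grouping exercise. The following is a minimal sketch under that assumption; `duplicate_bios_uuids` is a hypothetical helper and the sample names and UUIDs are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def duplicate_bios_uuids(vms: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return {normalised uuid: [vm names]} for every UUID used by more than one VM."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name, bios_uuid in vms:
        # Normalise case and separators so formatting differences do not hide a match.
        key = bios_uuid.lower().replace("-", "").replace(" ", "")
        groups[key].append(name)
    return {uuid: names for uuid, names in groups.items() if len(names) > 1}


# Made-up sample data: a proxy, its replica, and an unrelated proxy.
sample_vms = [
    ("proxy01", "4206aaaa-0000-0000-0000-000000000001"),
    ("proxy01_replica", "4206AAAA-0000-0000-0000-000000000001"),
    ("proxy02", "4206bbbb-0000-0000-0000-000000000002"),
]
dups = duplicate_bios_uuids(sample_vms)
```

Any VM name that shows up in `dups` shares its BIOS UUID with at least one other VM in the list.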

As far as the users replicating their backup proxy VMs, I don't currently see a quick solution, but will discuss this with R&D and QC tomorrow.
chrisdearden
Veteran
Posts: 1531
Liked: 226 times
Joined: Jul 21, 2010 9:47 am
Full Name: Chris Dearden
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by chrisdearden »

It may be possible that certain provisioning processes create VMs with duplicate BIOS UUIDs, for example clone from template vs. deploy from template. This could well lead to a situation where it appears that the proxies have not been cloned, when in fact they have.
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Yuki »

Gostev wrote:...... because replicating backup proxy VMs is not very common (as there is no reason to protect proxy VMs, unless they also carry some other roles outside Veeam).

As far as the users replicating their backup proxy VMs, I don't currently see a quick solution, but will discuss this with R&D and QC tomorrow.
Hi Anton,
Now I understand better why this was not seen as a "bug" by Veeam (even though in our view it did have negative impact). I do want to note (really just to note, not to start an argument) that replication of proxy VMs is not uncommon in many smaller environments. From what we've observed, it is quite common to see the Veeam proxy agent installed on "utility" or "auxiliary" servers that perform other important but not critical functions (AV management, DHCP servers, environment management workstations, etc.). In a DR scenario, restoring those simple but still important network functions from a replicated machine is much faster than rebuilding them, so there is no reason not to include them in the DR process.

As far as I know (and certainly in the past), excluding proxy VMs from replication/backups has never been mentioned in the documentation. Not placing proxy agents on systems performing other functions has also not been stressed in the documentation or pointed out as potentially having a negative effect on Veeam (beyond resource sharing considerations and system load in general). In fact, on several occasions when calling Veeam for support, tech agents have suggested adding agents to an existing system in the network environment, and never said a proxy should only go on its own VM or never be replicated.

It would be ideal to either change the current replica detection behavior or state in the official documentation that proxies can't be replicated without losing hot-add. Adding separate VMs is not hard, but does require additional resources and licenses.

Thank you for addressing this with R&D and QC! Much appreciated.

Chris - in our case it's not cloning, it's replica enumeration that is causing the problem.
chrisdearden
Veteran
Posts: 1531
Liked: 226 times
Joined: Jul 21, 2010 9:47 am
Full Name: Chris Dearden
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by chrisdearden »

In terms of reusing existing machines as proxies, there are other things to consider, for example the lack of CBT (we disable it due to potential issues). For me, that's a good enough reason not to share roles unless absolutely necessary due to hard resource/licence constraints.
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Yuki »

Hi Chris,
Yes, that part is well understood. All of these multi-function proxy VMs are typically very small due to the fact that they perform light functions as noted above (management tasks, running scripts, alerts, DHCP, DNS, AV management, etc). So their VMDK footprint is minimal.
Gostev
Chief Product Officer
Posts: 31459
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Gostev »

Yuki wrote:Now I understand better why this was not seen as a "bug" by Veeam (even though in our view it did have negative impact)
No, actually, that is not the reason we don't consider it a bug. We do recognize your use case as legitimate. Plus, overall, we just hate when something that was working before stops working after an upgrade, so we always do everything possible on our end to prevent this from happening, at least when the reason lies in our code.

In reality, this is not a bug by definition, because it's not something that we missed. It was exactly the opposite: we ran into this behavior change very early on in VDDK 6.0 testing, so early that we even had time to properly handle the new exceptions caused by this new limitation in the code. For example, all those cute self-explanatory notification messages quoted in this thread (including the one in the topic name) were first added in Update 2 specifically to account for this VDDK behavior change, to make sure that users understand what is preventing hot add from working. They did not even exist before Update 2.

That said, again, we do recognize that there are legitimate use cases affected by this new hot add limitation purposely introduced in VDDK 6.0, and based on the reaction from our users so far, we are actively looking at our [very limited] options on what we can do from our end. I am still waiting for a response from the VDDK PM at VMware with his assessment of the reason behind this change in VDDK 6.0. A lot will depend on what we hear back.

And I must say I am totally with you on the concept of using those tiny core infrastructure VMs as proxies, I would totally do the same every time regardless of losing CBT for processing those. I do know that it's uncommon due to various concerns, but I actually wish more people would do that... it just makes sense. Minus 1 VM is a lot of resources saved on permanent basis, and all you "pay" for that is just a few GB of extra sequential read I/O every 24 hours - that's nothing! But, I don't need to convince you I guess ;)
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Yuki »

Any chance to implement a feature similar to Network Address assignment rules that re-assigns UUIDs for replicas to ensure they are unique? Maybe even with an option to assign the original UUID back once the replica has been failed over to?
Gostev
Chief Product Officer
Posts: 31459
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Gostev »

This was actually the first option that I took off the table, as in my experience, changing the BIOS UUID may cause MASSIVE issues for applications installed in replica VMs. Some apps tie themselves to a particular computer via the BIOS UUID for various purposes, and when all of a sudden they find themselves starting on a different computer, they break. For one, if I am not mistaken, this may even cause Windows itself to ask for reactivation (depending on the license used).

So, this change means many customers will not be able to recover some of their apps in case of disaster, and there is basically nothing worse than failed recovery, as I am sure you agree.
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by tsightler »

VMware themselves acknowledge this as a bug right in the VDDK 6.0 release notes and say a future version will include a fix for this new behavior:
VDDK 6.0 searches for virtual machines by BIOS UUID.
VDDK 6.0.0 tries to find requested virtual machines by looking up their BIOS UUID, instead of by the MoRef provided, as in previous VDDK releases. Because a BIOS UUID is not guaranteed unique, unlike a MoRef, this new behavior may cause problems after cloning, out of place VM restore, and so forth. The problem happens when two VMs in a datacenter have the same BIOS UUID, with both backup and restore, whether the VMs are powered on or off. The error occurs when VixDiskLib_Open(Ex) fetches the virtual disk's block map. A fix has been identified and will be available in a future release.
Actually, I guess this isn't quite the same issue as it doesn't mention anything about hotadd, but it is weird that VMware is doing any type of selection via UUID which they admit may not be unique.
Gostev
Chief Product Officer
Posts: 31459
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Gostev »

Yes, this is a totally different issue, Tom.

What you quoted is a BUG that we found and reported ourselves during U2 testing, so I am very familiar with this one. It concerns looking up processed VMs by UUID (in other words, the VM you are trying to back up), and it impacts Direct SAN transport mode only. We have patched the quoted VDDK bug in U2, so this one is a non-issue.

This topic is about VDDK looking up the "hot add proxy" VM by UUID, in a different place in the VDDK code. As per my original post above, VDDK has used this lookup since the first version. The only change is an extra check added in VDDK 6.0 that prevents the hot add process from proceeding when multiple proxy VMs are found with the same UUID. It is unlikely to be a bug, because somebody at VMware has purposely added this limitation for whatever reason.
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by tsightler »

Yeah, I kinda figured that out after I read it, which is why I modified the post with my extra comment. I almost deleted it completely to avoid confusion (perhaps I should have), but I still found it interesting that there were so many changes in the VDDK that had to do with UUID.
danswartz
Veteran
Posts: 264
Liked: 30 times
Joined: Apr 26, 2013 4:53 pm
Full Name: Dan Swartzendruber
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by danswartz »

Interesting experience with this. I am using a Windows Server 2008 R2 VM as the host of the B&R software, which implicitly makes it the 'VMware backup proxy'. I replicate a couple of critical VMs, one of which is the proxy. I happened to be looking at the replication job log, and saw that it was using SAN mode for the source, but NBD mode for the destination. When I dug deeper, I saw:

5/22/2015 3:01:18 PM :: Unable to leverage hot add processing mode: All suitable backup proxy VMs have non-unique BIOS UUID
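
A line in this format can be picked out of exported job log text with a few lines of Python. This is only a sketch: it assumes every log line has the `<date> <time> <AM/PM> :: <message>` shape of the line quoted above, and `hot_add_warnings` is a hypothetical helper, not part of any Veeam tooling.

```python
from datetime import datetime
from typing import List, Tuple

HOT_ADD_WARNING = "Unable to leverage hot add processing mode"


def hot_add_warnings(log_text: str) -> List[Tuple[datetime, str]]:
    """Return (timestamp, message) for every hot add warning line in log_text."""
    hits = []
    for line in log_text.splitlines():
        # Assumed shape: "<date> <time> <AM/PM> :: <message>"
        stamp, sep, message = line.partition(" :: ")
        if not sep or HOT_ADD_WARNING not in message:
            continue
        when = datetime.strptime(stamp.strip(), "%m/%d/%Y %I:%M:%S %p")
        hits.append((when, message.strip()))
    return hits


# The first line is made up for this example; the second is the line quoted above.
sample = (
    "5/22/2015 3:01:10 PM :: Some other line\n"
    "5/22/2015 3:01:18 PM :: Unable to leverage hot add processing mode: "
    "All suitable backup proxy VMs have non-unique BIOS UUID\n"
)
hits = hot_add_warnings(sample)
```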

After looking at the proxy's vmx file, as well as the replica's vmx file, they are sure enough the same. So, is the lesson here (a subtle one, I think) that you can't replicate a proxy VM if you want to be able to use virtual appliance mode?
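
The vmx comparison described above can also be scripted. This minimal sketch assumes the BIOS UUID is stored on a `uuid.bios = "..."` line, as commonly seen in .vmx files; the function names and the sample file contents are made up for illustration.

```python
import re
from typing import Optional

# Matches a line such as: uuid.bios = "56 4d aa bb ..."
_UUID_BIOS = re.compile(r'^\s*uuid\.bios\s*=\s*"([^"]*)"', re.IGNORECASE | re.MULTILINE)


def bios_uuid_from_vmx(vmx_text: str) -> Optional[str]:
    """Extract the uuid.bios value as lowercase hex digits only, or None if absent."""
    match = _UUID_BIOS.search(vmx_text)
    if match is None:
        return None
    return re.sub(r"[^0-9a-fA-F]", "", match.group(1)).lower()


def same_bios_uuid(vmx_a: str, vmx_b: str) -> bool:
    """True when both .vmx texts carry the same, non-empty uuid.bios value."""
    a, b = bios_uuid_from_vmx(vmx_a), bios_uuid_from_vmx(vmx_b)
    return a is not None and a == b


# Made-up .vmx fragments for a proxy, its replica, and an unrelated VM.
proxy_vmx = 'displayName = "proxy01"\nuuid.bios = "56 4d aa bb cc dd ee ff-00 11 22 33 44 55 66 77"\n'
replica_vmx = 'displayName = "proxy01_replica"\nuuid.bios = "56 4D AA BB CC DD EE FF-00 11 22 33 44 55 66 77"\n'
other_vmx = 'displayName = "other"\nuuid.bios = "56 4d 00 00 00 00 00 00-00 00 00 00 00 00 00 01"\n'
```

Here `same_bios_uuid(proxy_vmx, replica_vmx)` is true and `same_bios_uuid(proxy_vmx, other_vmx)` is false.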
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by foggy »

danswartz wrote:So, is the lesson here (a subtle one, I think) that you can't replicate a proxy VM if you want to be able to use virtual appliance mode?
Correct, that is how the case currently stands, after upgrading to Veeam B&R v8 Update 2.
danswartz
Veteran
Posts: 264
Liked: 30 times
Joined: Apr 26, 2013 4:53 pm
Full Name: Dan Swartzendruber
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by danswartz »

Okay, good to know, thanks.
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Yuki » 1 person likes this post

Just got the weekly digest from Anton and saw this:
2. Hot add processing mode not available for backup proxy VMs with non-unique BIOS UUID. We have a pilot hot fix ready that returns VDDK behavior to pre-VDDK 6.0 state, and will decide on including it into Update 2a based on the results of testing one with some of the impacted customers.
We are watching this very closely and hope it is made available soon.

Thanks for listening!
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Yuki »

I believe our issue is being addressed with this update and latest digest said that customers affected by the issues in the patch are being given advanced access to it - can we also receive this patch?
ChuckS42
Expert
Posts: 189
Liked: 27 times
Joined: Apr 24, 2013 8:53 pm
Full Name: Chuck Stevens
Location: Seattle, WA
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by ChuckS42 »

When I built my proxy servers, I created one and polished it all up, then cloned it multiple times for the rest. Direct clones, not from a template.

I just upgraded to Update 2. Will I have this problem when backups run tonight?
Veeaming since 2013
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Yuki »

Depending on the rest of your setup, you are likely to be affected.
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by foggy »

Yuki wrote:I believe our issue is being addressed with this update and latest digest said that customers affected by the issues in the patch are being given advanced access to it - can we also receive this patch?
Please contact support with this request.
ChuckS42 wrote:I just upgraded to Update 2. Will I have this problem when backups run tonight?
You can check whether your hot add proxies' IDs are duplicated using the procedure suggested above in this thread.
ChuckS42
Expert
Posts: 189
Liked: 27 times
Joined: Apr 24, 2013 8:53 pm
Full Name: Chuck Stevens
Location: Seattle, WA
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by ChuckS42 »

Easier to just rebuild my proxies.
Veeaming since 2013
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by foggy »

Yes, if they are just proxies and do not run any other apps/services.
Gostev
Chief Product Officer
Posts: 31459
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by Gostev »

Yuki et al.

We ended up patching VDDK 6.0 in Update 2a to revert its handling of duplicate proxy UUIDs to that of VDDK 5.5.

Since my last post, we've done lots of extended stress testing internally on the same VMs using both VDDK 5.5 and VDDK 6.0. We found VDDK 5.5 logic of handling duplicate proxy UUIDs to be pretty random. Occasionally, we observed VDDK 5.5 try to hot add disks to a wrong VM, in which case hot add would fail. So, perhaps this is the very reason why some developer at VMware chose to close this bug by simply blocking hot add completely in this scenario. This is an arguable solution to the problem, especially when VDDK already has in hand the moRef of the proxy VM to attach disks to, so I am working with the VDDK PM on this issue for possible future improvements.

Nevertheless, Update 2a will patch VDDK 6.0 to revert its behavior to that of previous VDDK versions. This means that starting with Update 2a, hot add should not work any worse than it did with previous VDDK versions. Nor will it work any better, though, so occasional random hot add failures are still fully expected.
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: "Some hot add capable backup proxy VMs were skipped..."

Post by tsightler » 1 person likes this post

Gostev wrote:Since my last post, we've done lots of extended stress testing internally on the same VMs using both VDDK 5.5 and VDDK 6.0. We found VDDK 5.5 logic of handling duplicate proxy UUIDs to be pretty random. Occasionally, we observed VDDK 5.5 try to hot add disks to a wrong VM, in which case hot add would fail. So, perhaps this is the very reason why some developer at VMware chose to close this bug by simply blocking hot add completely in this scenario.
This is one of the most common reasons I see in the field for random hotadd failing over to NBD. Users never understand it because it will fail one time and then work the next, in some cases even when using the very same proxy. Then we go digging and find they have another proxy with the same UUID: during the "random" failure VMware tried to hotadd the disk to the wrong proxy, but the next time it connected it to the "correct" proxy. So I guess I'm just saying the QC testing matches the real world very well in this instance.
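
To see why a duplicate UUID produces intermittent rather than consistent failures under the pre-6.0 behavior, here is a toy calculation. It assumes, purely for illustration, that the order of same-UUID matches is arbitrary and that every ordering is equally likely; nothing in this thread confirms how VDDK actually orders them. `wrong_pick_ratio` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import permutations
from typing import Sequence


def wrong_pick_ratio(same_uuid_vms: Sequence[str], intended: str) -> Fraction:
    """Share of possible list orderings whose first entry is not the intended proxy."""
    orderings = list(permutations(same_uuid_vms))
    wrong = sum(1 for order in orderings if order[0] != intended)
    return Fraction(wrong, len(orderings))


# Two proxies sharing one UUID: half of all orderings put the wrong one first.
two_way = wrong_pick_ratio(["proxy-1", "proxy-2"], "proxy-1")
# Three VMs sharing one UUID: two thirds of all orderings put a wrong one first.
three_way = wrong_pick_ratio(["proxy-1", "proxy-2", "proxy-3"], "proxy-1")
```

Under that assumption, "take the first match" succeeds on some runs and fails on others with no configuration change in between, which matches the pattern described above.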