Gostev
SVP, Product Management
Posts: 24439
Liked: 3409 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Inconsistent Filesystems on restores when backing up with Hotadd

Post by Gostev » Aug 06, 2019 9:45 pm

mkretzer wrote:
Aug 06, 2019 4:24 pm
https://helpcenter.veeam.com/docs/backu ... l?ver=95u4
"Veeam Backup & Replication automatically sets the SAN Policy within each proxy to Offline Shared."
Right, after reading the link I recalled that this was implemented for Direct SAN Access transport mode (just as per the User Guide's section), so this is the correct setting for that case.

But I also stand corrected on offlineShared being a cause of the issue for hot add. After digging through Microsoft documentation on SAN policies, this setting does appear to cover SCSI bus disks too (and not just iSCSI, as I thought). So, my idea was completely incorrect... back to the drawing board.

Right now I am completely lost as to why "half of our volumes get mounted on the proxy (including getting a drive letter) while backup is running"... all possible ideas are now proven 100% wrong. Due to the offlineShared setting, they should have been kept offline and read-only.

mkretzer
Expert
Posts: 532
Liked: 113 times
Joined: Dec 17, 2015 7:17 am
Contact:

Post by mkretzer » Aug 07, 2019 4:20 am

What's even more confusing:
Your support provided a nice script that logs all the disks attached to a proxy.
It always shows one additional attached disk, even while an active full is running. But that one disk is offline in some backups and online in others.

I wonder: could that be another crazy Windows 2019 bug?

Post by Gostev » Aug 12, 2019 3:23 pm

No, definitely not. We're not seeing anything like that with any OS versions, including Server 2019... so this must be something environment-specific in your case.

Can you try to do this:
DISKPART > automount scrub
for the OS to "forget" all previously mounted volumes, and then see if these volumes start piling up again despite SANpolicy being set to offlineShared?

If they don't, then it would be an indication that this setting was possibly disabled at some point, which resulted in these volumes being automounted and memorized by the OS (after which having SANpolicy offlineShared will not make any difference).

If they still do, then clearly the next step should be to involve Microsoft, as we're not seeing such behavior in any of our labs.

Post by mkretzer » Aug 12, 2019 4:46 pm

@Gostev: Your support has sent me a long explanation. Seems like offline shared and offline all both don't always work!

Automount must be disabled as well. Also, this can very well hit other customers too, from what I understand.

Currently they are debating how to fix this permanently from your side... I think this is important information and at least a warning should be posted in the backup log!

Post by Gostev » Aug 12, 2019 10:08 pm

mkretzer wrote:
Aug 12, 2019 4:46 pm
Seems like offline shared and offline all both don't always work!
There must be some misunderstanding between you and support then, because R&D has confirmed with very comprehensive testing that offlineShared always works. With this setting, Windows will only mount the disks already "known" to the OS (meaning, those which were mounted to the OS at least once before Veeam backup proxy was even deployed). But, I already checked on this possibility with you earlier, and you have confirmed this absolutely cannot be the case.

So, I would still like to get to the bottom of this, and the test I suggested in my previous post would help us do so. It is also very safe, because you will be able to observe the issue easily through the presence of additional "known" drives in the registry following a hot add backup. Would you be open to doing this test?

As noted earlier, disabling automount has its own implications, and the last thing we want is fixing the headache with a guillotine. But, in order to find the best solution, we need to fully understand what is causing the issue in your particular environment.

Post by mkretzer » Aug 13, 2019 11:36 am

Gostev,

no. Did you read the document they wrote/provided:
"As it turned out, indeed, setting San Policy to either OfflineShared (what Veeam already does by default for any newly added server) or OfflineAll (what we, Support, did in some cases with ReFS volumes not feeling too well after being backed up in HotAdd) is not guaranteeing that the volumes will not be mounted."

In the analysis document they sent us they sound quite sure about the whole thing. They seem to know exactly what is wrong.

So can you please talk to support if there is anything unclear on your side? If so, I am happy to do further tests!

They also told us to:

1) Make sure the SAN Policy is OfflineShared (!)
2) Disable automount via Diskpart by running “automount disable”. Follow it up with “automount scrub”
3) Run mountvol /r (make sure to exit Diskpart first)

So again, not "OfflineAll"
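
For anyone who wants to script these three steps, here is a minimal Python sketch. It is not part of the support document: the helper names and the temp-file handling are assumptions, and only the DISKPART and mountvol commands themselves come from the steps above. It relies on `diskpart /s <file>` to run commands from a script file, and it would need an elevated prompt on the proxy.

```python
import os
import subprocess
import tempfile
from typing import List

# Steps 1 and 2: set the SAN policy, then disable and scrub automount.
DISKPART_STEPS = [
    "san policy=OfflineShared",
    "automount disable",
    "automount scrub",
    "exit",
]


def build_diskpart_script() -> str:
    """Return the DISKPART script text, one command per line."""
    return "\r\n".join(DISKPART_STEPS) + "\r\n"


def build_commands(script_path: str) -> List[List[str]]:
    """Return the command lines in order; mountvol runs only after DISKPART has exited."""
    return [["diskpart", "/s", script_path], ["mountvol", "/r"]]


def apply_on_proxy() -> None:
    """Run the steps (hypothetical helper; Windows only, elevated prompt required)."""
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, newline="") as f:
        f.write(build_diskpart_script())
        path = f.name
    try:
        for cmd in build_commands(path):
            subprocess.run(cmd, check=True)  # stop at the first failing step
    finally:
        os.remove(path)
```

Nothing runs until `apply_on_proxy()` is called, so the generated script can be reviewed first with `print(build_diskpart_script())`.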

Markus

Post by Gostev » Aug 13, 2019 1:25 pm

Yes, I've been in touch with support regarding this and cleared up their confusion. Also, the support case has been transferred to the team leader. Everyone is now in agreement that the next step should be the following test:

1. Make sure the SAN Policy is OfflineShared, and automount is enabled on the backup proxy (default Veeam backup proxy settings)
2. DISKPART > automount scrub OR CMD > mountvol /r (afaik, these do the same thing)
3. Perform some hot add backups of various VMs

Expected result: not a single backed up drive should be mounted to the backup proxy, even if automount is still enabled.
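
One way to check that expected result without eyeballing Disk Management: capture the output of DISKPART `list disk` on the proxy while a hot add job is running, and flag every hot-added disk that is not Offline. A minimal Python sketch follows. The function names, the assumption that disk 0 is the proxy's own disk, and the sample rows (sizes are made up) are illustrative only, and the parser assumes English-locale output, since labels may differ on localized Windows builds.

```python
import re
from typing import Dict, List

# Data rows look like "  Disk 1    Offline   200 GB   0 B"; the selected disk is prefixed with "*".
# The header row ("Disk ###") has no digits after "Disk", so it does not match.
_ROW = re.compile(r"^[ \t]*(?:\*[ \t]*)?Disk[ \t]+(\d+)[ \t]+(\S+)", re.MULTILINE)


def parse_list_disk(output: str) -> Dict[int, str]:
    """Map disk number to its Status column from 'list disk' output."""
    return {int(num): status for num, status in _ROW.findall(output)}


def unexpected_online(statuses: Dict[int, str], proxy_disks=frozenset({0})) -> List[int]:
    """Return disks that are not Offline and do not belong to the proxy itself."""
    return sorted(
        num for num, status in statuses.items()
        if num not in proxy_disks and status.lower() != "offline"
    )


# Made-up sample output for illustration only.
SAMPLE = """\
  Disk ###  Status         Size     Free     Dyn  Gpt
  --------  -------------  -------  -------  ---  ---
  Disk 0    Online          100 GB      0 B
  Disk 1    Offline         200 GB      0 B
  Disk 2    Online          500 GB      0 B        *
"""

print(unexpected_online(parse_list_disk(SAMPLE)))  # [2]
```

An empty list corresponds to the expected result above; anything else names the disks worth a screenshot.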

Post by mkretzer » Aug 13, 2019 5:39 pm

It gets confusing now: automount scrub leads to the system volume (including a drive letter!) and the two dynamic volumes (shown as foreign) of the VM being mounted, and the GPT volume which got corrupted the most being offline.

Testing with mountvol /r next

Post by mkretzer » Aug 13, 2019 5:51 pm

LOL nope. After another automount scrub and then mountvol /r the big GPT disk gets mounted again, now one of the dynamic disks is offline and the other dynamic one is still online.

Post by mkretzer » Aug 13, 2019 6:24 pm

It gets worse... We disabled automount:
DISKPART> SAN
SAN Policy : Offline Shared
DISKPART> automount
Automatic mounting of new volumes disabled.

Volumes still show up but get no letter!

Post by mkretzer » Aug 13, 2019 6:34 pm

Final test: automount off, offlineall, scrub. Volumes still show up but get no drive letter (should they not be "offline")....

Post by Gostev » Aug 14, 2019 2:15 pm

mkretzer wrote:
Aug 13, 2019 6:34 pm
Final test: automount off, offlineall
As per my request above, can you please test with automount on and offlineShared?
This would represent default settings following Veeam backup proxy installation.

Expected result with these settings:
All backed up disks appear as offline in the Disk Management snap-in.

If you see any other result, please post your respective screenshot.

Post by mkretzer » Aug 14, 2019 3:41 pm

Gostev,

That was the setting we tested yesterday. The result was as before: disks are not offline!! But they are not offline with offline all, automount off and scrub either! I will test again and provide a screenshot (how can I put a screenshot here??).

Markus

Post by mkretzer » Aug 14, 2019 4:08 pm

https://imgur.com/a/53xc5l3

Here you can see the chaos... Which volume is offline, which is not, and which gets a letter changes with every backup... That's why some backups are consistent and some are not...

Edit: To explain what is what:
Disk 0 is the proxy system drive
Disk 1 is a dynamic disk of the backed up VM which sometimes got corrupted (dedup/dynamic disk)
Disk 2 is the OS disk of the backed up VM which was even assigned a drive letter and which sometimes got corrupted as well
Disk 3 is a GPT disk with dedup which got corrupted the most (sometimes completely destroyed and unreadable)
Disk 4 is another dynamic disk which is the only one that is offline

@Gostev: can you submit this result to support or do I need to update the case as well?

Post by Gostev » Aug 14, 2019 9:53 pm

Yeah, clearly some weird stuff is going on there with this backup proxy. No worries about submitting this information to support, we have an internal thread going with all stakeholders.

For further troubleshooting, I have two options:

Option 1. To exclude the possibility that Windows OS itself is messed up on this particular proxy (or that some 3rd party software is acting up), it would help to deploy a brand new backup proxy by doing a clean OS install manually from the Windows distribution ISO. After that, without installing any 3rd party software, see if the issue repeats on that new backup proxy.

Option 2. Open a support case with Microsoft and have them troubleshoot this (we can open it on your behalf, but obviously it will require your time in case they need logs). Naturally, it would be better to do this after Option 1 and with the clean OS install. Since we're talking about a VM, maybe you can even create a snapshot once the OS is fully updated, right before installing the Veeam backup proxy role and doing the experiments, so that rollback to the clean state is easy.

In general I think we're pretty close to understanding the issue now, with all signs currently pointing at some issue with this specific backup proxy. Naturally, we never see backed up disks being Online in any of our labs.

Post by mkretzer » Aug 14, 2019 10:01 pm

Gostev,

OK, let's go for option 1. But: these proxies were freshly set up with Windows 2019 just 2-3 months ago. Nothing was done with them but installing the proxy component.
I am currently on vacation, I will ask my colleagues to do a fresh install on a new proxy. So we will:
- Install W2019, newest updates + VMware tools
- Create a snapshot
- Add the proxy to Veeam
- Test backups with our test job

ok?

Post by Gostev » Aug 15, 2019 5:37 pm

That is correct + remember not to install any 3rd party software too.

Post by mkretzer » Aug 17, 2019 4:21 am

Fresh install does not show the issue! All drives stay "offline" even with default settings.
But this does not make it "solved" for us.
1. We should find out what can cause this. Perhaps a remote session with Veeam support or some kind of analysis script to compare settings would be nice.
2. Veeam should prevent this from ever happening again with any customer. This should be possible: after the volumes are mounted and before reading data starts, Veeam should check the volume status. If one of the volumes to be backed up is "online", it should fail the backup right away.
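
A rough idea of what the check in point 2 could look like, as a Python sketch. This is not how Veeam does or will do it; the function names and the exception are hypothetical. It assumes the disk state is collected on the proxy with PowerShell `Get-Disk | Select-Object Number, IsOffline | ConvertTo-Json`, and that the caller knows which disk numbers belong to the proxy itself.

```python
import json
from typing import List, Set


class HotAddedDiskOnline(RuntimeError):
    """Hypothetical error: a hot-added disk is online on the proxy."""


def online_disks(get_disk_json: str, proxy_disks: Set[int]) -> List[int]:
    """Return numbers of non-proxy disks that are not offline."""
    data = json.loads(get_disk_json)
    if isinstance(data, dict):
        # ConvertTo-Json emits a bare object instead of an array for a single disk.
        data = [data]
    return sorted(
        d["Number"] for d in data
        if d["Number"] not in proxy_disks and not d["IsOffline"]
    )


def preflight(get_disk_json: str, proxy_disks: Set[int]) -> None:
    """Fail before any data is read if a hot-added disk is online."""
    bad = online_disks(get_disk_json, proxy_disks)
    if bad:
        raise HotAddedDiskOnline(
            f"disk(s) {bad} are online on the proxy; aborting before reading data"
        )
```

The same two functions could also feed a monitoring check instead of failing the job, for example by alerting whenever `online_disks(...)` is non-empty during a backup window.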

Until 2) is implemented, hot-add is too risky to use from my point of view! Which is bad for us, as backups with NBD are quite slow. I hope the performance issues of direct SAN at the start of backup are solved when we upgrade all our ESXi to 6.7U2...

Post by Gostev » Aug 18, 2019 9:57 pm

OK, thank you very much for confirming the issue is specific to that particular backup proxy server (and especially for spending your vacation time). What you're observing with the new backup proxy is consistent with what we're seeing in our test labs.

Since we're dealing with what is either a Windows OS or a 3rd party issue on the particular server, it would be best to engage Microsoft for further troubleshooting of the impacted server. Let us know if you're able to open a support case with Microsoft directly. We can do this as well, however we don't have a single system where this issue is reproducible, so we would have to immediately refer them to you anyway.

We will decide on the best course of action based on the conclusion from Microsoft. We need to know what exactly is happening on that server, in order to be able to design a reliable and bulletproof solution.

Unfortunately, it is a bit too late to include any sort of advanced new logic in v10, especially without understanding the issue we're fixing. For example, if those volume status checks turn out to sometimes return invalid results on certain system configurations, we risk breaking hot add for thousands of customers. So unless absolutely necessary, we really don't like to touch the code that we spent 10 years stabilizing, making sure it works reliably in 500K different environments... and at this time, we don't even know what's going on with the offending server. This makes it really hard to justify making any risky last minute changes.

However, for example bringing back the code that automatically disables automount will always remain an option for v10.

Post by mkretzer » Aug 19, 2019 9:41 am

Gostev,

Understandable. I will do this after my vacation.

I wonder: Would disabling automount prevent a volume getting corrupted even if the disk goes "online"?

Markus

Post by Gostev » Aug 19, 2019 4:05 pm

That's the current theory... we believe that Windows OS does not do things* to volumes until they are actually mounted to a drive letter (or a folder). And while it is best to just have all disks remain "Offline" as the lower-level protection, we believe that merely having them "Online" should not cause issues until they are actually mounted. As such, scrubbing and disabling automount at the time when backup proxy is deployed may provide another layer of protection.

*Things seem to be limited to special workloads like ReFS or Windows dedupe. Perhaps there's some transaction log that OS starts to automatically replay once the volume is mounted, or something along these lines. Because regular NTFS volumes don't seem to receive any modifications in any case.

Post by mkretzer » Aug 19, 2019 6:46 pm

Ok.

I think we will stay with NBD for now until we know what the real issue is. Perhaps there is something which we can implement in our monitoring to make sure it does not happen again.

Markus
