Host-based backup of VMware vSphere VMs.
gfdos.sys
Influencer
Posts: 10
Liked: 1 time
Joined: Nov 01, 2017 3:31 pm
Full Name: Gabriel Fischer
Contact:

Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

The B&R 9.5U1 announcement (https://www.veeam.com/kb2222) says:
Virtual hardware version 13 support. vSphere 6.5 introduces the new VM hardware version which increases some configuration maximums and adds ability to add NVMe controllers to a VM. This update adds ability to process such VMs.

Using Veeam Backup 9.5U2, a Windows VM with a virtual NVMe controller backs up fine.
However, with a replication job, the initial replication works, but subsequent incremental replications always fail.
I get that virtual NVMe is a new feature in vSphere 6.5, but if it doesn't work with Backup *AND* Replication, why call attention to it as a supported feature in the 9.5U1 announcement?

Warning -- for now, backup of VMs with virtual NVMe works but replication does not.
(Note: this is on vSphere 6.5U1 with the patch found here: https://kb.vmware.com/s/article/2151061?language=en_US
with Veeam Backup 9.5U2.)
foggy
Veeam Software
Posts: 21069
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by foggy »

Hi Gabriel, do you have a support case ID for this issue?

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys » 1 person likes this post

Case #02333485
A Level 2 engineer is looking at it. Apparently 2 other cases are open right now.
In my case and one other case the "solution" was to use SCSI, not NVMe.
Waiting to hear back on what they told the other case.

Figured I'd ask here and give the warning of what we are running into.

NVMe on top of HCI infrastructure is sweet, and gave us a nice performance boost. We are using DataCore VirtualSAN, so it is doing auto tiering of our SAS SSD and SAS SCSI, so why add extra "baggage" of scsi virtually?
Physical NVMe has a high cost, but if we already have the performance with HCI, it makes total sense. Only thing holding us back is the Replication needed for DR.

https://www.starwindsoftware.com/blog/i ... olutionary

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by foggy » 1 person likes this post

We were able to reproduce this internally; the issue is a failing Revert Snapshot operation. We will investigate and, if the issue is on the Veeam B&R side, address it in one of the next updates (possibly even the upcoming v9.5 Update 3).
Gostev
Chief Product Officer
Posts: 31455
Liked: 6646 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by Gostev »

Most likely some change in vSphere 6.5U1 or Veeam 9.5U2 is causing this... B&R 9.5U1 was indeed tested against VMs with NVMe controllers extensively, but it was done during plain vSphere 6.5 times. We will figure this out.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

Date    Veeam        VMware on HPE hosts            VMware on Dell Veeam host
7/20    9.5.0.823    6.5.0 5146846                  6.5.0
9/22    NVMe enabled
10/28   9.5.0.1038   6.5U1 5969303 (Oct) + patch    6.5U1 + patch

>Most likely some change in vSphere 6.5U1 or Veeam 9.5U2 is causing this...
Just for the record: We had the problem as soon as we enabled NVMe on vmware 6.5.0 5146846 on Veeam 9.5U1

>B&R 9.5U1 was indeed tested against VMs with NVMe controllers extensively
The "Full replica" works -- it's any incremental replica that fails.
I went back to my Veeam email logs, and replication failed on 9/22 --> at the first "replication incremental" after adding NVMe.
"Creating helper snapshot Error: Detected an invalid snapshot configuration.
Error: Detected an invalid snapshot configuration."

the next day the error changes to:
"Preparing replica VM Error: Detected an invalid snapshot configuration.
Error: Detected an invalid snapshot configuration."

every attempt after that:
"Deleting helper snapshot Error: Unable to access file since it is locked
Error: Unable to access file since it is locked"
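
The three messages above fail at different job stages, which is easier to see when the stage is split from the error text. The following is only a minimal sketch, assuming the messages arrive as plain text lines in the "Stage Error: message" shape quoted above; the function names are hypothetical and this is not a Veeam log format reference.

```python
def split_stage(line):
    """Split 'Stage Error: message' into (stage, message); None if no stage prefix."""
    stage, sep, error = line.partition(" Error: ")
    if not sep or not stage.strip():
        return None
    return stage.strip(), error.strip()


def failing_stages(lines):
    """Return unique (stage, message) pairs in the order they first appear."""
    seen = []
    for line in lines:
        # Drop surrounding whitespace and the quotation marks used in the post.
        parsed = split_stage(line.strip().strip('"'))
        if parsed and parsed not in seen:
            seen.append(parsed)
    return seen


# Lines copied from the messages quoted above (the repeated "Error:" lines
# carry no stage prefix and are skipped).
SAMPLE_LINES = [
    '"Creating helper snapshot Error: Detected an invalid snapshot configuration.',
    'Error: Detected an invalid snapshot configuration."',
    '"Preparing replica VM Error: Detected an invalid snapshot configuration.',
    'Error: Detected an invalid snapshot configuration."',
    '"Deleting helper snapshot Error: Unable to access file since it is locked',
    'Error: Unable to access file since it is locked"',
]

for stage, message in failing_stages(SAMPLE_LINES):
    print(f"{stage}: {message}")
```

All three stages are snapshot-related steps on the replica side, which fits the earlier note that the backup jobs are unaffected.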

Backup works just fine all along.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

I had 2 ideas for a workaround on this. What do you guys think?

Workaround 1:
Rhetorical question -- Would this NVMe issue with replication happen with physical hardware?
In order to do that you would have to use the Veeam Agent to do the backup and replication....

Is that a possible workaround -- installing the Veeam Agent on a VM with virtual NVMe?
After installing the Veeam Agent (I have never done backup that way), would I have to remove the VM from the backup/replica jobs and re-add it?

Workaround 2:
I presume these backup and replication jobs are using NBD mode.
All the VMs I'm backing up are stored on a DataCore "iSCSI" SAN. If I ran the jobs in Direct SAN access mode, would it avoid the issue of the "snapshots" being locked that the error message seems to suggest?

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by foggy »

The issue is in reverting a snapshot of the replica VM, so the mentioned workarounds are not relevant.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by foggy »

Btw, as an update, it is currently being investigated with VMware and has just been escalated to T2 of their support.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

I was reading this article: https://static.rainfocus.com/vmware/vmw ... 01nYO0.pdf
It is a presentation from VMworld 2017 entitled "A Deep Dive into vSphere 6.5 Core Storage Features and Functionality [SER1143BU]"

I got to slide 18, which talks about the differences between VMFS6 and VMFS5:
• Resources for VMs (blocks, file descriptors, etc.) on earlier VMFS versions were allocated on a per host basis (host-based block allocation affinity)
• Host contention issues arose when a VM/VMDK was created on one host, and then vMotion was used to migrate the VM to another host
• If additional blocks were allocated to the VM/VMDK by the new host at the same time as the original host tried to allocate blocks for a different VM in the same resource group, the different hosts could contend for resource locks on the same resource
• This change introduces VM-based block allocation affinity, which will decrease resource lock contention

I checked -- the datastores that the replicas are in (a VMware 6.5 VM on Server 2012 ReFS) are stored on *VMFS5*.
Any chance this is the issue? Ugly if it is because there is no easy conversion from VMFS5 to VMFS6 other than "get a new datastore (on additional storage you may not have), and then move the data from the old VMFS5 datastore to the new one".

I thought of this because VMFS6 comes WITH 6.5. So what about old VMFS5 datastores? From the text above, they could be having resource lock contention -- that's what the error sounds like.

What do you think?

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by foggy »

According to our QA, this doesn't look related; they were able to reproduce it on VMFS6 as well.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

Status update:
I heard back from the SE at Veeam: the case is now with the QA Department, and they will be creating a DCPN ticket to work with VMware toward a resolution.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

Any chance NVMe issues are addressed in the RTM of Veeam Update 3?
Any other status update on this?

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by foggy »

Definitely not in U3. The investigation is underway and VMware is involved, but there is not much progress to share.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

Here is the latest from Veeam about the NVMe controllers......

I asked our QA team for the update for the issue with NVMe controllers and this issue is still under research from VMware side. The issue with NVMe controllers will be fixed as soon as VMware confirms a bug. So unfortunately, there is no ETA at this point.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by Gostev »

All, we have identified a way to fix this issue and will include the fix in the next update. A patch for the existing version is a possibility too, if there are enough support cases requesting one. Thanks!

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by Gostev »

All, our support now has the hot fix for 9.5 U3 on hand. You can refer to the issue ID 127791. Thanks!

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

I'm told there is a hot fix for this, which we are going to try today.
Please send my thanks to the engineers on the Veeam and VMware sides on this.
Looks like it is a Veeam-only hot fix -- swapping out a DLL.
I'll check back in when I know more, and let you know whether it looks successful.

Re: Any Word on NVMe Replication (vSphere 6.5 + B&R 9.5U2)

Post by gfdos.sys »

I let it do its thing overnight. Looks like it is fixed!

For reference -- to enable virtual NVMe in VMs:
Requires ESXi 6.5
Verify that you have one of the following supported guest operating systems:
Windows 7 and 2008 R2 (hot fix required: https://support.microsoft.com/en-us/kb/2990941)
Windows 8.1, 2012 R2, 10, 2016
RHEL, CentOS, NeoKylin 6.5 and later
Oracle Linux 6.5 and later
Ubuntu 13.10 and later
SLE 11 SP4 and later
Solaris 11.3 and later
FreeBSD 10.1 and later
Mac OS X 10.10.3 and later
Debian 8.0 and later
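
The "and later" entries in the list above amount to a minimum-version table. A minimal sketch of that check follows, assuming dotted numeric version strings; the table is copied from the list, SLE 11 SP4 is encoded as (11, 4) purely for illustration, the Windows entries are left out because they are specific releases and not minimums, and the function names are hypothetical. The VMware Compatibility Guide remains the authority.

```python
# Minimum guest OS versions taken from the list above.
# Assumption: SLE service packs are treated as the second version component.
MIN_VERSIONS = {
    "RHEL": (6, 5),
    "CentOS": (6, 5),
    "NeoKylin": (6, 5),
    "Oracle Linux": (6, 5),
    "Ubuntu": (13, 10),
    "SLE": (11, 4),
    "Solaris": (11, 3),
    "FreeBSD": (10, 1),
    "Mac OS X": (10, 10, 3),
    "Debian": (8, 0),
}


def parse_version(text):
    """Turn '10.10.3' into (10, 10, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(os_name, version):
    """True/False against the table, or None when the OS is not listed."""
    minimum = MIN_VERSIONS.get(os_name)
    if minimum is None:
        return None
    return parse_version(version) >= minimum


print(meets_minimum("Ubuntu", "14.04"))
print(meets_minimum("Ubuntu", "13.04"))
print(meets_minimum("Windows", "10"))
```

Comparing tuples instead of strings avoids the usual trap where "13.4" sorts after "13.10" as text.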

Note: The guest operating system requires a driver to use the NVMe controller. See the VMware Compatibility Guide to verify support.

You can determine whether the controller is configured in the virtual machine configuration file. NVMe-related entries are similar to SCSI-related entries.
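
As a rough illustration of that check, here is a minimal sketch that scans .vmx text for controller entries. It assumes keys of the form `nvme0.present = "TRUE"` for a controller and `nvme0:0.fileName = "..."` for an attached disk, mirroring the usual `scsi0` entries; the sample text and function name are hypothetical, so compare against a real configuration file before relying on it.

```python
import re

# Assumed key shapes (verify against a real .vmx file):
#   <type><bus>.present = "TRUE"            -> controller
#   <type><bus>:<unit>.fileName = "x.vmdk"  -> disk attached to that controller
CONTROLLER_RE = re.compile(r'^(nvme|scsi)(\d+)\.present\s*=\s*"(\w+)"', re.IGNORECASE)
DISK_RE = re.compile(r'^(nvme|scsi)(\d+):(\d+)\.fileName\s*=\s*"([^"]+)"', re.IGNORECASE)


def summarize_controllers(vmx_text):
    """Return {controller_name: [disk file names]} for controllers marked present."""
    controllers = {}
    disks = []
    for raw in vmx_text.splitlines():
        line = raw.strip()
        match = CONTROLLER_RE.match(line)
        if match and match.group(3).upper() == "TRUE":
            controllers[f"{match.group(1).lower()}{match.group(2)}"] = []
            continue
        match = DISK_RE.match(line)
        if match:
            disks.append((f"{match.group(1).lower()}{match.group(2)}", match.group(4)))
    # Attach disks after the scan so entry order in the file does not matter.
    for name, file_name in disks:
        if name in controllers:
            controllers[name].append(file_name)
    return controllers


# Hypothetical sample, not taken from a real VM.
SAMPLE_VMX = '''
scsi0.present = "TRUE"
scsi0.virtualDev = "lsisas1068"
nvme0.present = "TRUE"
nvme0:0.fileName = "example.vmdk"
nvme0:0.present = "TRUE"
'''

print(summarize_controllers(SAMPLE_VMX))
```

A SCSI controller with an empty disk list in the result corresponds to the case mentioned below where the SCSI controller is no longer used and can be removed.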

-- FOR SUPPORTED WINDOWS VERSIONS --
I add the NVMe controller to the VM in ESXi, and check Windows Device Manager to verify that it is showing up and the driver has been installed.
Shut down the VM. (IF the drive is ENCRYPTED, make sure to: 1. have a backup before trying this; 2. have the drive unlock code handy. 3. This should work if you disable the encryption enforcement before the reboot, the same as you would for a BIOS update on a physical PC.)
Change the hard drives from the SCSI controller to NVMe.
(If you have nothing else using the SCSI controller, you can remove it.)
Boot the VM -- since the NVMe driver was preloaded, it boots. If you didn't make sure it was in Windows Device Manager first (first step), it WILL blue screen. Don't panic: switch it back to the SCSI controller and go back to the start of these instructions.
(I personally wait until I verify the NVMe is working before removing the SCSI driver.)

FYI -- this is reversible. I had to go back to the SCSI driver on these VMs until the issue was resolved, and the only disadvantage was that normal SCSI speed is inferior to NVMe.

On Linux distros, I did not have such easy luck, and it was less trouble to simply reinstall the Linux distro I was testing. It is possible that a similar process would work on Linux if the driver is installed, unless Linux does something about pairing the SCSI driver to the actual hard drive.

-=-
https://kb.vmware.com/s/article/2147714