Host-based backup of VMware vSphere VMs.
jklaw91
Influencer
Posts: 19
Liked: 2 times
Joined: Oct 01, 2018 3:15 pm
Full Name: John Lawrence
Contact:

Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by jklaw91 »

Three months ago we purchased a Dell EMC PowerStore 1200T, which is running the latest firmware (version 3.0.0.1), and installed the Veeam plug-in on our Veeam B&R server. We also have a Dell EMC Unity 550F, which we have been using with Veeam for 3 years. We have had no problems doing Instant Recovery of a VM from the Unity 550F. Works like a champ. However, Veeam Instant Recovery of a VM from a PowerStore 1200T SAN snapshot has been a nightmare. We have had Case 05597617 open with Veeam Support since 8/25. The Veeam rep has been helpful, but the problem persists. The Instant Recovery itself works great. The trouble is in the 2nd step, where the VM is migrated off the clone of the PowerStore snapshot: the snap datastore is not unmounted, which results in PDL (Permanent Device Loss) errors. The ESXi host then becomes unresponsive because it continuously retries the failed commands, blocking other commands. This results in widespread I/O timeouts and subsequent aborts. We also have cases open with VMware, Cisco, and Dell EMC.

We are running these versions/hardware/firmware:

Dell EMC PowerStore 1200T running Version: 3.0.0.1
VMware ESXi 7.0 Update 3g (Build 20328353)
Cisco UCS B200 M5 Blades running UCS Infrastructure and Blades 4.2(2c) with nfnic driver 5.0.0.34

Is anyone else using Dell EMC PowerStore 1200T with ESXi 7.0 and using Veeam Instant Recovery?
Thanks,

John
Origin 2000
Service Provider
Posts: 84
Liked: 20 times
Joined: Sep 24, 2020 2:14 pm
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by Origin 2000 »

I have a lot of customers with PowerStore, but sadly none of them are Veeam Enterprise Plus users, so no backup from SAN snapshots. But I have a few with Dell Compellent and NetApp (NFS).
I thought I had a good understanding of how backup from SAN snapshots works, but how is a SAN snapshot involved during a RESTORE? When performing an Instant Recovery (IR), the Veeam server presents an NFS-like datastore to the ESXi host and pulls the data from whatever repository holds the backup. I must admit that I use svMotion from vCenter to migrate the VM to production, not the "Migrate to Production" function within the Veeam console.

Can you explain where a Snapshot comes into play during the IR?

Regards,
Joerg
jklaw91
Influencer
Posts: 19
Liked: 2 times
Joined: Oct 01, 2018 3:15 pm
Full Name: John Lawrence
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by jklaw91 »

The SAN snapshot is used by Instant Recovery (which works just fine for both the Unity 550F and the PowerStore 1200T). The LUN snapshot containing the desired VM is cloned, the cloned LUN is presented to VMware and mounted as a datastore, and the VM can then be powered on in vSphere using Instant Recovery. The actual problem with a VM on a snap from the PowerStore 1200T (the Unity 550F works fine here) occurs when the VM is migrated to a production datastore from the temporary datastore backed by the clone of the SAN snapshot.
Thanks,

John
Andreas Neufert
VP, Product Management
Posts: 6749
Liked: 1408 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by Andreas Neufert »

This process is usually done by Storage vMotion, and if it fails we need to check with VMware as well. Potentially a VAAI issue or similar.

Anyway, without deeper analysis we cannot guess what happened. Please open a Veeam support ticket, upload the logs, and let the support team have a look at it. Please share the support ticket number here for reference.
jklaw91
Influencer
Posts: 19
Liked: 2 times
Joined: Oct 01, 2018 3:15 pm
Full Name: John Lawrence
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by jklaw91 »

As mentioned in my original post, we have had Case 05597617 open with Veeam Support since 8/25. We also have cases open with VMware, Cisco, and Dell EMC. The purpose of this post was to ask other Veeam B&R customers "Is anyone else using Dell EMC PowerStore 1200T with ESXi 7.0 and using Veeam Instant Recovery?" That would be helpful info for troubleshooting. Since you are VP of Product Dev, can you help escalate this case? The current Veeam Support rep has been great, but this issue has become very painful for us. We would appreciate it.
Thanks,

John
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by foggy »

Please continue working with support - you can ask your engineer to escalate the case to a higher tier if required.

Just to check, do you see any related warnings in the job session, or is it entirely 'green', as if all the datastores were actually unmounted? We recently had a similar case with another storage system: when the initiator group contains the initiators of all hosts in the cluster, the datastore automatically gets mounted to all of them, while Veeam B&R unmounts it from a single host only. In this case, creating separate initiator groups might be a workaround.
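To make the mismatch described above concrete, here is a minimal sketch in Python. It is only a toy model, not Veeam or storage-array code: the function name and the host names are hypothetical, and it assumes that mapping a volume to an initiator group mounts the datastore on every member, while the unmount is issued to one host only.

```python
def hosts_left_mounted(group_hosts, unmounted_host):
    """Return the hosts that still have the clone datastore mounted.

    Assumption: the clone is mapped to the whole initiator group, so every
    member mounts it, but only one host is told to unmount afterwards.
    """
    mounted = set(group_hosts)        # every host in the group mounts the clone
    mounted.discard(unmounted_host)   # only the selected host unmounts it
    return sorted(mounted)


# Shared initiator group with three hosts (hypothetical names).
cluster = ["esx01", "esx02", "esx03"]
print(hosts_left_mounted(cluster, "esx01"))    # ['esx02', 'esx03']

# Separate initiator group containing only the selected host.
print(hosts_left_mounted(["esx01"], "esx01"))  # []
```

With a shared group, the remaining hosts still hold the mount when the clone volume is deleted. With a per-host group, nothing is left behind, which is why separate initiator groups can work around the problem.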
jklaw91
Influencer
Posts: 19
Liked: 2 times
Joined: Oct 01, 2018 3:15 pm
Full Name: John Lawrence
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by jklaw91 »

Here are snippets of the Instant Recovery log (I changed the names of the VM and host). There was a warning.

10/16/2022 7:09:40 PM          SERVER1 has been recovered successfully
10/16/2022 7:09:40 PM          Waiting for user to start migration
10/16/2022 8:22:16 PM          Starting SERVER1 dismount
10/16/2022 8:22:17 PM          Connecting to host 10.10.10.10
10/16/2022 8:22:42 PM          Dismounting VMFS volume snap-21483685-P1200T-SQL-002 from host 10.10.10.10
10/16/2022 8:23:49 PM          Unbinding VeeamAUX-RESTORECLONE-2c09c40d-48a6-4370-8d04-ad370e281e5d-1665965279777 from FC adapter vmhba1 on host 10.10.10.10
10/16/2022 8:23:50 PM          Rescanning FC adapter vmhba1 on host 10.10.10.10
10/16/2022 8:40:50 PM Warning    Failed to update storage information: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
10/16/2022 8:40:55 PM          Deleting snapshot clone volume VeeamAUX-RESTORECLONE-2c09c40d-48a6-4370-8d04-ad370e281e5d-1665965279777
10/16/2022 8:40:55 PM          Unlocking storage snapshot
10/16/2022 8:40:55 PM          SERVER1 has been unmounted successfully
So does this match the other, similar case?
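For anyone comparing their own session output, a small sketch like the one below can flag where a session stalls. It is only an illustration: the timestamp format is assumed from the excerpt above, the sample lines are copied from it (with the warning text shortened), and the function names are hypothetical.

```python
from datetime import datetime, timedelta

# Timestamp layout assumed from the excerpt, e.g. "10/16/2022 8:23:50 PM".
STAMP_FORMAT = "%m/%d/%Y %I:%M:%S %p"

SAMPLE = [
    "10/16/2022 8:22:42 PM          Dismounting VMFS volume snap-21483685-P1200T-SQL-002 from host 10.10.10.10",
    "10/16/2022 8:23:50 PM          Rescanning FC adapter vmhba1 on host 10.10.10.10",
    "10/16/2022 8:40:50 PM Warning    Failed to update storage information",
    "10/16/2022 8:40:55 PM          Unlocking storage snapshot",
]


def parse_line(line):
    """Split one session line into (timestamp, message)."""
    parts = line.split(None, 3)  # date, time, AM/PM, remainder of the line
    stamp = datetime.strptime(" ".join(parts[:3]), STAMP_FORMAT)
    return stamp, parts[3].strip()


def long_gaps(lines, threshold=timedelta(minutes=5)):
    """Yield (gap, earlier message, later message) for slow consecutive steps."""
    entries = [parse_line(line) for line in lines if line.strip()]
    for (t0, m0), (t1, m1) in zip(entries, entries[1:]):
        if t1 - t0 >= threshold:
            yield t1 - t0, m0, m1


for gap, before, after in long_gaps(SAMPLE):
    print(gap, "|", before, "->", after)
```

On the sample lines this reports a single 17-minute gap, between the FC adapter rescan and the "Failed to update storage information" warning.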
Thanks,

John
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by foggy »

No, I meant the job session log available in the Veeam B&R console, not the job log file.
jklaw91
Influencer
Posts: 19
Liked: 2 times
Joined: Oct 01, 2018 3:15 pm
Full Name: John Lawrence
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by jklaw91 »

Foggy, where can I locate the job session log? It has been more than 24 hours so it does not show under Home. If I go to History I can see Quick Migration jobs info as well as the Instant Recovery under Restore session (which is the portion that I posted). I appreciate the help.
Thanks,

John
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by foggy »

You can right-click the job, select Statistics, and navigate between the job runs using the arrow keys.
popjls
Enthusiast
Posts: 55
Liked: 5 times
Joined: Jun 25, 2018 3:41 am
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by popjls »

I have a 3000T with Ent Plus. I can probably try this for you as well on our UCS setup. Admittedly, I have never used the instant restore from snapshot, so are you restoring to the same location or using the different settings method?
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by foggy »

Judging by the log excerpt shared above, this might be caused by a known issue recently reported to us by Dell R&D, which comes down to API behavior changes in the new PowerStore OS. To address the issue, a new plug-in update will be required to adjust to the new API output. We will work with Dell R&D on that, but until then we cannot claim full support for PowerStore 3.0.
jklaw91
Influencer
Posts: 19
Liked: 2 times
Joined: Oct 01, 2018 3:15 pm
Full Name: John Lawrence
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by jklaw91 » 1 person likes this post

Definitely would have been good to know. Is there a Veeam document which indicates compatibility with certain SAN vendors and firmware? I would like to consult it going forward, since we have both a Dell PowerStore 1200T and a Unity 550F. This was a very painful issue that caused downtime: PDL events made hosts unresponsive. And unfortunately, each attempt to test a different scenario caused more downtime and post-cleanup tasks.

Fortunately, we have been able to resolve our issue after working with Veeam, Dell, and VMware. The root cause turned out to be a configuration issue with the PowerStore 1200T: we had all hosts in a host group on the PowerStore. In current PowerStore code, a standalone volume cannot be presented to an individual host that is part of a host group. So during Instant Recovery, Veeam calls PowerStore to create a clone volume from the snapshot and present it to the selected host(s). (This is where the trouble starts.) Because of that restriction, the clone is presented to the host group and therefore to all hosts within it. Veeam then calls vCenter, which issues HBA rescan and VMFS mount commands to the host(s).

***It's unclear at this point whether Veeam is telling vCenter that all hosts in the cluster should rescan and mount the VMFS, or just the selected host. Can Veeam provide direction here?

Either way, the problem is that the clone is presented to all ESXi hosts and all of them mount the VMFS datastore at this point. Veeam then leverages Storage vMotion to migrate the VM to a production datastore. After the migration, Veeam instructs vCenter to unmount the datastore from the selected host only (vCenter does not command the other hosts to unmount, even though at this point they all have the datastore mounted). Then Veeam tells PowerStore to unmap and delete the clone volume. (This is where PDL events start to occur on the hosts that still have the datastore mounted.) Per VMware's instructions, we then had to SSH to each host (other than the one that received the unmount command) and manually issue commands to rescan, identify the snap volume in question, unmount it, detach it, place the host into maintenance mode, and reboot in order to fully remove the datastore for that snap. We had to do this for all hosts in the cluster.
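For reference, the manual per-host cleanup can be sketched as a command list. This is a rough sketch only, not the exact commands VMware gave us: the `esxcli` calls are the standard ones for these steps as far as I know, and the helper name and the device ID below are hypothetical placeholders. Verify each command against VMware's documentation for your ESXi build before running anything.

```python
def cleanup_commands(datastore_label, device_id):
    """Build the per-host command list for removing a stale snap datastore.

    Assumption: these are standard esxcli calls for each step; they are not
    confirmed to be the exact commands from the VMware support case.
    """
    return [
        "esxcli storage core adapter rescan --all",                    # rescan HBAs
        "esxcli storage filesystem list",                              # identify the snap volume
        f"esxcli storage filesystem unmount -l {datastore_label}",     # unmount the datastore
        f"esxcli storage core device set --state=off -d {device_id}",  # detach the device
        "esxcli system maintenanceMode set --enable true",             # enter maintenance mode
        "reboot",                                                      # reboot the host
    ]


# Datastore label taken from the log excerpt earlier in the thread;
# "naa.EXAMPLE" is a placeholder, not a real device ID.
for command in cleanup_commands("snap-21483685-P1200T-SQL-002", "naa.EXAMPLE"):
    print(command)
```

The list is printed rather than executed on purpose, so each step can be reviewed and run by hand on every affected host.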
Thanks,

John
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by foggy »

This is exactly the issue I mentioned first - the initiator group containing initiators of all hosts in the cluster.
"Is there a Veeam document which indicates compatibility with certain SAN vendors and firmware?"
We try to keep our system requirements up-to-date but there are cases when compatibility is intended but not actually provided (and sometimes there are situations where we don't have a timely opportunity to test against the actual storage/firmware version, which we try to avoid).
Chrispyyy
Enthusiast
Posts: 39
Liked: 9 times
Joined: Jan 04, 2023 6:28 pm
Full Name: Chrispyyy
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by Chrispyyy »

Having the exact same issue but with a 500T running the latest vCenter. When not in a host group within PowerStore, it works fine.

Does anyone involved with QA know if this has been resolved in the upcoming storage plug-in release?

Thanks.
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by foggy »

See this post for reference.
pmcgrail
Lurker
Posts: 1
Liked: never
Joined: Jan 27, 2024 2:12 pm
Full Name: Paul McGrail
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by pmcgrail »

This is still an issue. We suffered from this on 1/26/2024 using a PowerStore 1000T, VMware vSphere 7.03o, and the PowerStore plug-in 2.03.

Our servers are in a host group on the PowerStore. Are you saying that removing them from the PowerStore host group resolved the issue for you?

jklaw91
Influencer
Posts: 19
Liked: 2 times
Joined: Oct 01, 2018 3:15 pm
Full Name: John Lawrence
Contact:

Re: Errors when using Instant Recovery of VM from PowerStore 1200T SAN Snapshot

Post by jklaw91 »

For us, this problem was so painful that we never, ever wanted to encounter it again so yes, we no longer use host groups. We have only done 1 upgrade since we identified this problem (because it was so painful), but it went smoothly.
Thanks,

John