Discussions specific to the VMware vSphere hypervisor
Vitaliy S.
Product Manager
Posts: 23062
Liked: 1581 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Incredible slow Direc SAN restore

Post by Vitaliy S. » Mar 27, 2015 11:14 am

What about using Instant VM recovery and then use migration jobs to bring VM back to production?

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » Mar 31, 2015 4:18 pm

How about the feature working correctly? My issue is currently with Tier 3 support.

Vitaliy S.
Product Manager
Posts: 23062
Liked: 1581 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Incredible slow Direc SAN restore

Post by Vitaliy S. » Mar 31, 2015 5:03 pm

That doesn't mean I said to stop troubleshooting it with our support team; I was just suggesting all available options ;)

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » Mar 31, 2015 5:13 pm

I understand, and I apologize for the short reply. I am a bit frustrated. It seems like every time I go to test a new feature of the product I hit a wall. Small SAN restores complete, though large VMs do not. So far I've worked backwards from a 1.5TB VM to a 900GB VM, and it still fails. I just got off the phone with support about yet another issue: Linux FLR is not working. I get FIPS errors, and the appliance doesn't deploy. I've spent a lot of time convincing management this was the solution to go with, and we've invested significantly in the product: 102 Enterprise Plus licenses and counting. I do my best to self-resolve before I open a case so as not to waste an engineer's time. So I've got a lot riding on this product, and it is a critical application for us, as I am sure it is for everybody else here.

chjones
Expert
Posts: 106
Liked: 27 times
Joined: Oct 30, 2012 7:53 pm
Full Name: Chris Jones

Re: Incredible slow Direc SAN restore

Post by chjones » Apr 02, 2015 1:07 am

I can confirm as well that if the SAN disks are not brought online before a restore is attempted, the restore fails over to network mode with the warning:

"Unable to execute write operation using advanced transport, failing over to network mode..."

We re-ran the restore after bringing the disks online in Windows Disk Management, and the restores started directly over the SAN. This just doesn't seem like good design, since the Veeam Proxy disables automount on disks but can't use them in that state during a restore.
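
For anyone who would rather script the "bring the disks online" step than click through Disk Management, here is a minimal sketch. It only builds the text of a diskpart script and does not run anything. The helper name `build_diskpart_script` and the disk numbers are assumptions for illustration; check `list disk` on your own proxy first, and never initialize or format a VMFS LUN from Windows.

```python
def build_diskpart_script(disk_numbers):
    """Return diskpart script text that brings each listed disk online.

    disk_numbers are the Windows disk numbers of the SAN LUNs as shown
    by "list disk" in diskpart (placeholders here, not real values).
    """
    lines = []
    for number in disk_numbers:
        if not isinstance(number, int) or number < 0:
            raise ValueError(f"invalid disk number: {number!r}")
        # "select disk N" followed by "online disk" are standard diskpart commands.
        lines.append(f"select disk {number}")
        lines.append("online disk")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Disks 3 and 4 are hypothetical examples.
    print(build_diskpart_script([3, 4]), end="")
```

The output can be saved to a text file and run on the proxy with `diskpart /s <file>`. Whether the disks should stay online afterwards is a separate question that this sketch does not answer.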

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » Apr 02, 2015 1:50 pm

Case #00828648 has been open for just short of a month so far, with no resolution. Very large VMs fail on restore every time. Has anybody successfully restored a 900GB+ VM on version 8.0 Patch 1 using any transport method? I've run into just about every known bug so far, requiring hotfixes to fix hotfixes. We are fortunate to have a very capable collection of hardware, but I'm not sure why I am running into so many issues. I can admit to not being fully aware of all the nuances of the software features. However, at the core the configuration is good.

Environment:

Veeam Backup manager: vm @ 8 core 16gb ram and no proxy services, strictly a job manager.
proxy1: HP DL 580G7 4 procs @ 6 core + HT and 32gb ram. 4gb fiber hba to fabric. Teamed 1gE @ LACP. Win2k12R2
proxy2: HP BL460c g8: 2 procs @ 10 core + HT and 512gb ram and 2 @ 10gE for network and 2 @ 8gbps FC HBA. Win2k12R2 Blade chassis is 8x8gbps direct connected to SAN Frame and 40gE to the core network.

SAN:

3par 10800 fully loaded.
8 controllers
1700 drives.


Nick

Gostev
SVP, Product Management
Posts: 24907
Liked: 3612 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Incredible slow Direc SAN restore

Post by Gostev » Apr 02, 2015 7:42 pm

SyNtAxx wrote:I've run into just about every known bug so far
Looking at your case history, this indeed looks to be the case...
I've escalated this particular issue for you though.

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » Apr 02, 2015 9:53 pm

Gostev,

thank you.

WimVD
Service Provider
Posts: 53
Liked: 18 times
Joined: Dec 23, 2014 4:04 pm

Re: Incredible slow Direc SAN restore

Post by WimVD » Apr 03, 2015 9:43 am

SyNtAxx wrote:Has anybody successfully restored a 900gb+ vm successfully on version 8.0 Patch1 using any transport method? I've run into just about every known bug so far, requiring hotfixes to fix hotfixes.
I have restored 1TB+ disks using Instant VM restore and storage vMotion. Other methods failed for different reasons, as you mention.
Looking at your hardware, it worries me that you only restore at 90MB/s. At that rate restoring big disks just takes too long.
Granted, you can use instant restore to bring the VM online while restoring, but what if you only need to restore a single big disk?
A very common scenario with cryptolocker viruses these days...

I hope Veeam will make restore speed and stability a priority in the next patches/releases. But I also understand that they are very dependent on VMware in this case.
There is a very good possibility this is not (only) a Veeam issue but a VMware issue as well.
Knowing Veeam, I'm sure it will get resolved.
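
To put "takes too long" into numbers, here is a rough back-of-the-envelope sketch. It assumes a constant transfer rate and 1GB = 1024MB, and the function name `restore_hours` is made up for this example; real restores will vary with compression, zeroing, and contention.

```python
def restore_hours(size_gb, rate_mb_per_s):
    """Estimate restore duration in hours at a constant transfer rate.

    Assumes 1 GB = 1024 MB and ignores any ramp-up or overhead.
    """
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    seconds = size_gb * 1024 / rate_mb_per_s
    return seconds / 3600


if __name__ == "__main__":
    # Sizes and rates quoted in this thread: 900GB and 1.5TB VMs,
    # 90MB/s SAN restore versus roughly 200MB/s to a local array.
    for size_gb in (900, 1536):
        for rate in (90, 200):
            print(f"{size_gb} GB at {rate} MB/s: {restore_hours(size_gb, rate):.2f} h")
```

Under these assumptions a 900GB disk at 90MB/s needs roughly 2.8 hours of pure transfer time, and a 1.5TB disk close to 5 hours.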

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » Apr 07, 2015 9:31 pm

With such large drives becoming fairly common, things should work. I hesitate to use Instant VM Recovery (I have done it a few times with success), but I'm not sure how it will work for a high-IO server. I tried redirecting the change file to a SAN, but to be honest I don't recall the outcome of that test. Be that as it may, moving a very large VM to SAN via NFS can take quite a bit of time.

What would be nice to see, and I've requested it, is the ability to schedule automatic SAN snapshots via the storage integration/SAN explorer (if you have it licensed) and use those as recovery points. I know the user manual describes it as a very fast way to recover, and indeed it is, as I've recovered a test server in about 3 minutes. I'm still learning the finer details of the application, but I don't think you can currently auto-schedule SAN snaps to use as recovery points. They are manual at this point. That feature would be awesome for tier 1 applications. I can do a lot of that with an application I have called HP Recovery Manager, but it would be nice to have all functions in one recovery app.

Nick

WimVD
Service Provider
Posts: 53
Liked: 18 times
Joined: Dec 23, 2014 4:04 pm

Re: Incredible slow Direc SAN restore

Post by WimVD » Apr 10, 2015 12:11 pm

SyNtAxx wrote:With such large drives becoming fairly common, things should work. I hesitate to use Instant VM Recovery (I have done it a few times with success), but I'm not sure how it will work for a high-IO server. I tried redirecting the change file to a SAN, but to be honest I don't recall the outcome of that test.
Remember you don't need to bring the server online when using instant restore. You can just publish it and move the VM or a disk offline.

Delo123
Expert
Posts: 361
Liked: 109 times
Joined: Dec 28, 2012 5:20 pm
Full Name: Guido Meijers

Re: Incredible slow Direc SAN restore

Post by Delo123 » Apr 13, 2015 9:26 am

I remember using instant restore one time after a really bad SAN crash. During vMotion the instant recovery process stopped due to a decompression error, if I remember correctly. The backup was on a Windows deduplicated volume.
Maybe some things were not fully in order; however, what scared me the most is that the instant recovery job showed 5 retries (which of course also failed) and then shut down the VM. Since this was our only "live" recovery Exchange database, we actually had some data loss, because after the VM shutdown all changes were gone and there is no option to reboot (to recover data other than the 1 corrupt block). So I guess I will never again use it for a production restore where changes are important.

foggy
Veeam Software
Posts: 18348
Liked: 1575 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: Incredible slow Direc SAN restore

Post by foggy » Apr 21, 2015 1:30 pm

Guido, have you contacted support to investigate the reasons for this behavior?

Delo123
Expert
Posts: 361
Liked: 109 times
Joined: Dec 28, 2012 5:20 pm
Full Name: Guido Meijers

Re: Incredible slow Direc SAN restore

Post by Delo123 » Apr 21, 2015 2:29 pm

Actually we didn't. You can imagine it was quite a hard weekend, and in the end we knew something went wrong on the Windows deduplication target...
We did find the error in the Veeam KBs (or forums), confirming there was a read "error" from the repository. However, this doesn't really help.
What we learned from this is that SureBackup / Instant Recovery cannot be reliable enough unless it actually reads all of the data within the backup file, which of course is impossible to do practically.
That is where replicas come into play, of course....

I thought about opening a ticket later on to discuss keeping the changed data after an instant recovery fails (or skipping the bad block) without actually taking the VM down.
Actually I tried finding somebody at VMworld to talk to regarding this, and everybody at the booth pointed to Anton to discuss this. When I finally met him briefly at the Veeam party (greatest party at the show, as always) other things were more important :):) and I actually forgot about it.... But maybe it would be a good thing for you guys to discuss internally.

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » Apr 22, 2015 3:35 pm

I have an update on my failed SAN restores: we uncovered a bug in the SAN recovery method. A custom fix was created, and it will be released post Update 2.

I still have slow SAN restore speeds. Restoring to a local 5-disk RAID 5 array is 2x faster (200+MB/sec) than restoring to the 1700-drive array (89-120MB/sec). The local drive has no VAAI integration. I've tried disabling VAAI on a test ESXi host, but that yielded no improvement in speed. One thing I haven't tried yet is creating a new LUN exported only to the test ESXi server and disabling VAAI again.

-Nick
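
For anyone wanting to repeat the VAAI on/off test on a scratch host, the sketch below builds the esxcli command lines for the three block-storage VAAI primitives. It only produces strings and runs nothing. The option paths are the ones commonly documented by VMware for these primitives; the helper name `vaai_commands` is an assumption for this example, and you should verify the options against your ESXi version before using them.

```python
# Advanced settings that control the VAAI block primitives on an ESXi host.
VAAI_OPTIONS = (
    "/DataMover/HardwareAcceleratedMove",  # Full Copy (XCOPY)
    "/DataMover/HardwareAcceleratedInit",  # Block Zeroing (WRITE SAME)
    "/VMFS3/HardwareAcceleratedLocking",   # Atomic Test and Set (ATS)
)


def vaai_commands(enabled):
    """Return esxcli command lines that enable or disable the VAAI primitives."""
    value = 1 if enabled else 0
    return [
        f"esxcli system settings advanced set --int-value {value} --option {option}"
        for option in VAAI_OPTIONS
    ]


if __name__ == "__main__":
    # Print the commands that would disable VAAI for a test run.
    for command in vaai_commands(enabled=False):
        print(command)
```

Calling `vaai_commands(enabled=True)` gives the matching commands to turn everything back on after the test.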

chjones
Expert
Posts: 106
Liked: 27 times
Joined: Oct 30, 2012 7:53 pm
Full Name: Chris Jones

Re: Incredible slow Direc SAN restore

Post by chjones » May 04, 2015 3:08 am

Hi Nick,

I saw your posts in another topic I had been active in, http://forums.veeam.com/vmware-vsphere- ... 92-30.html, and you mentioned you had the same speed issues with SAN restores. You mention here that this is a bug and there is a fix. Is that confirmed with Veeam Support? I upgraded to 8.0 Patch 2 this morning and tested this issue, and I still see SAN restores running at about half the speed of Network and HotAdd restores.

If there is a confirmed fix for this I'll contact support and try and get it and confirm it works for me as well.

Thanks,

Chris

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » May 04, 2015 7:01 pm

Chris,

There was a bug that I stumbled across when restoring large VMs (900GB+) via the SAN method. The transport would hang/time out. There was a private patch released for that. I seriously doubt my configuration is to blame for slow SAN restores at this point. My blade cage is direct connected to the 3par v800 via 8 @ 8gbps ports. My physical SAN proxy is direct connected to our SAN directors to eliminate any edge switch issues. Results are consistent no matter what. I have a new colo site coming online with a new 3par array, and when I have a moment I'll test there. If it is still slow, then there seems to be an application issue, or at the very least an interaction issue at some level. I think if we keep up the heat we might get some additional exposure.

-Nick

chjones
Expert
Posts: 106
Liked: 27 times
Joined: Oct 30, 2012 7:53 pm
Full Name: Chris Jones

Re: Incredible slow Direc SAN restore

Post by chjones » May 05, 2015 4:08 am

Nick,

I agree that I don't believe it's your infrastructure. I can write data at up to 800-900MB/sec to a 3PAR Volume presented to my Veeam Proxy which is a Windows 2012 R2 Server on the same model HP Blade Server, and in the same blade enclosure, as my ESXi hosts, but a Veeam SAN Restore is not even a tenth of that speed.

I have a very similar setup to you. 2 x HP c7000 Blade Chassis, each with 2 x FlexFabric 10Gb/24-Port Modules (used to use it for Direct-Attach to the 3PAR for a flat SAN but now just use it for 10GbE Ethernet to our Cisco 6880 Core Switch) and 2 x HP 8Gb 20-Port Fibre Channel Modules that connect to 2 x HP CN3000B 16Gb Fibre Switches (rebadged Brocade Switches). The 3PAR 7400 (4-Node) has an 8Gb FC connection from each controller to each of the two fibre switches. All up each ProLiant BL460c Gen8 Blade sees 8 paths to the 3PAR.

SAN backups work great and are very fast (an Active Backup can run at over 400-500MB/sec if there is no other activity). Network restores are quick at over 140-150MB/sec ... however, SAN restores are down to 40-50MB/sec. I tested again this morning with a new thin provisioned volume and also a thick provisioned volume, using a VBK that was on the local 300GB SAS drives on one of the Gen8 Blades that acts as a Veeam Proxy (to take the Veeam Repository out of the equation for speed tests), and got the same results no matter whether the 3PAR volume was thick or thin.

Definitely something going on.

I'll open a case with Veeam ... I'm dreading explaining this issue and the many, many tests that have been done. But fingers crossed.
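
To make the gap easier to see when explaining it to support, here is a small sketch that expresses each measured rate as a fraction of a chosen baseline. The figures are the midpoints of the ranges quoted in this post, and `relative_speeds` is a made-up helper name, not part of any Veeam or VMware tool.

```python
def relative_speeds(rates_mb_per_s, baseline):
    """Return each rate divided by the rate of the named baseline entry."""
    base = rates_mb_per_s[baseline]
    if base <= 0:
        raise ValueError("baseline rate must be positive")
    return {name: rate / base for name, rate in rates_mb_per_s.items()}


if __name__ == "__main__":
    # Midpoints of the ranges quoted above, in MB/sec.
    measured = {
        "raw write to 3PAR volume": 850,
        "SAN backup": 450,
        "network restore": 145,
        "SAN restore": 45,
    }
    for name, ratio in relative_speeds(measured, "raw write to 3PAR volume").items():
        print(f"{name}: {ratio:.0%} of raw write speed")
```

With these midpoints the SAN restore lands at about 5% of the raw write speed, and the network restore is a little over three times faster than the SAN restore.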

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » May 05, 2015 12:57 pm

chjones wrote:I'll open a case with Veeam ... i'm dreading explaining this issue and the many, many tests that have been done. But fingers crossed.

Sounds good. Let's keep on it.

-Nick

chjones
Expert
Posts: 106
Liked: 27 times
Joined: Oct 30, 2012 7:53 pm
Full Name: Chris Jones

Re: Incredible slow Direc SAN restore

Post by chjones » May 15, 2015 4:15 am

I have support case 00911701 opened about this. Kinda struggling to get them to understand the problem.

The only response so far has been "The Direct SAN Access transport mode can be used to restore VMs with thick disks only. Before VM data is restored, the ESX(i) host needs to allocate space for the restored VM disk on the datastore".

This is fine and makes sense. My comment on this is that all of the VMs we back up are THICK EAGER ZEROED, yet when they are restored, even if we select KEEP SAME AS SOURCE for the disks, the VMs are always restored as THICK LAZY ZEROED, so I'm not sure why this would cause the restore speed to be one third that of a network restore. A network restore has to write to the same Datastore, so you would expect the speed to be roughly the same if that was the issue. Plus, a Network Restore has to write via the hypervisor, whereas a SAN Restore writes directly to the SAN Volume, bypassing the hypervisor.

I understand there is an overhead on THICK EAGER disks, as the storage is zeroed for every block up front so this doesn't have to occur on first write, but I can create a 1TB THICK EAGER VMDK on the same Datastore from within vCenter and it will complete in a couple of minutes.

I can't understand why restoring via the SAN (40-50MB/sec) would be so much slower than restoring via the network (140-150MB/sec).

chjones
Expert
Posts: 106
Liked: 27 times
Joined: Oct 30, 2012 7:53 pm
Full Name: Chris Jones

Re: Incredible slow Direc SAN restore

Post by chjones » May 25, 2015 7:59 pm

Well, finally making some progress with this case. At least Veeam Support have now been able to reproduce the same issue.

Support had me download the VDDK, which contains the libraries used to access VMware storage, and run a few tests. If I write to a VMDK using network mode, the speed is over 250MB/sec. If I use SAN mode, the speed plummets to 60MB/sec. The VMDK was THICK LAZY ZEROED. However, if the VMDK I am writing to is THICK EAGER ZEROED, then the speed of the VDDK tests rivals the 250MB+/sec that I see with a network mode restore.

It's the same conclusion that we had come to ourselves: a thick eager VM is always restored as thick lazy by Veeam, and this is causing the slow SAN restore speed issues. I asked support the question and they confirmed that Veeam itself sets the disk to lazy regardless of what was backed up.

At least the case is making progress as I've been advised that Veeam are looking into the eager/lazy issue and have been able to reproduce the same results. Hopefully there is a resolution forthcoming.
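
If you want to check how a restored disk actually ended up provisioned, the vSphere API exposes two boolean flags on a flat VMDK backing (`thinProvisioned` and `eagerlyScrub` on `VirtualDiskFlatVer2BackingInfo`). The sketch below shows how those two flags map to the three provisioning types discussed here. `DiskBacking` is a hypothetical stand-in class so the example runs without a vCenter connection; it is not a pyVmomi type.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiskBacking:
    """Hypothetical stand-in for the two relevant vSphere backing flags."""
    thin_provisioned: Optional[bool]  # mirrors thinProvisioned (may be unset)
    eagerly_scrub: Optional[bool]     # mirrors eagerlyScrub (may be unset)


def provisioning_type(backing):
    """Classify a flat VMDK backing as thin, thick eager or thick lazy zeroed."""
    if backing.thin_provisioned:
        return "thin"
    if backing.eagerly_scrub:
        return "thick eager zeroed"
    # Neither flag set: space is allocated up front but zeroed on first write.
    return "thick lazy zeroed"


if __name__ == "__main__":
    source_disk = DiskBacking(thin_provisioned=False, eagerly_scrub=True)
    restored_disk = DiskBacking(thin_provisioned=False, eagerly_scrub=None)
    print("source:  ", provisioning_type(source_disk))
    print("restored:", provisioning_type(restored_disk))
```

Comparing the result for the source VM and the restored VM is a quick way to confirm the eager-to-lazy change described above on your own restores.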

SyNtAxx
Expert
Posts: 148
Liked: 15 times
Joined: Jan 02, 2015 7:12 pm

Re: Incredible slow Direc SAN restore

Post by SyNtAxx » May 26, 2015 1:35 pm

Good to hear, I wasn't able to make any progress with them as the tool wasn't working when they attempted the same tests.

-Nick

chjones
Expert
Posts: 106
Liked: 27 times
Joined: Oct 30, 2012 7:53 pm
Full Name: Chris Jones

Re: Incredible slow Direc SAN restore

Post by chjones » Jun 02, 2015 3:24 am

Veeam have now opened a case with VMware, as they see the same results internally, and I've given them permission to hand over my details to VMware if they wish to contact me regarding the issue. Fingers crossed.

dmitri-va
Enthusiast
Posts: 51
Liked: 3 times
Joined: Jun 01, 2015 1:28 pm
Full Name: Dmitri

Re: Incredible slow Direc SAN restore

Post by dmitri-va » Jun 22, 2015 5:46 pm

chjones wrote:Veeam have now opened a case with VMware, as they see the same results internally, and I've given them permission to hand over my details to VMware if they wish to contact me regarding the issue. Fingers crossed.
Just came across the same issue with my Direct SAN restore testing. Is there any update on your case?

Vitaliy S.
Product Manager
Posts: 23062
Liked: 1581 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Incredible slow Direc SAN restore

Post by Vitaliy S. » Jun 23, 2015 10:10 am

Hi Dmitri,

Can you please give us a few more details on your setup? What disks were you restoring? What was the connection, and what performance rates did you see?

Thanks!

dmitri-va
Enthusiast
Posts: 51
Liked: 3 times
Joined: Jun 01, 2015 1:28 pm
Full Name: Dmitri

Re: Incredible slow Direc SAN restore

Post by dmitri-va » Jun 23, 2015 2:50 pm

Vitaliy S. wrote:Hi Dmitri,

Can you please give us a few more details on your setup? What disks were you restoring? What was the connection, and what performance rates did you see?

Thanks!

I have VeeamB v8.0.0.2021 on a physical server (Dell R720) with a dedicated 10GbE connection to a Compellent SAN over iSCSI.

The average throughput for Direct SAN backups is ~300MB/s; however, the Direct SAN restore I tested of a 'thick eager zeroed' VM was only 75MB/s.

I can't compare it with restoring the same VM using network mode yet, but once I get a 10GbE connection for it, I will.

Vitaliy S.
Product Manager
Posts: 23062
Liked: 1581 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Incredible slow Direc SAN restore

Post by Vitaliy S. » Jun 23, 2015 4:15 pm

What about using hotadd mode for restoring the entire VM image? If you have a virtual proxy server, then you can run a restore job through it and then compare the restore job performance.

dmitri-va
Enthusiast
Posts: 51
Liked: 3 times
Joined: Jun 01, 2015 1:28 pm
Full Name: Dmitri

Re: Incredible slow Direc SAN restore

Post by dmitri-va » Jun 23, 2015 5:10 pm

Vitaliy S. wrote:What about using hotadd mode for restoring the entire VM image? If you have a virtual proxy server, then you can run a restore job through it and then compare the restore job performance.
I'd like to stay with the physical Veeam deployment to keep the backup infrastructure decoupled from the VMware cluster.

So, do I understand correctly that the slower Direct SAN restores are due to VMs being restored as thick lazy zeroed, no matter what the original disk was, and that this is a VMware limitation? Or is it something else?

foggy
Veeam Software
Posts: 18348
Liked: 1575 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: Incredible slow Direc SAN restore

Post by foggy » Jun 24, 2015 5:04 pm

dmitri-va wrote:So, do I understand correctly that the slower Direct SAN restores are due to VMs being restored as thick lazy zeroed, no matter what the original disk was, and that this is a VMware limitation?
That is correct except it is not a VMware limitation.

dmitri-va
Enthusiast
Posts: 51
Liked: 3 times
Joined: Jun 01, 2015 1:28 pm
Full Name: Dmitri

Re: Incredible slow Direc SAN restore

Post by dmitri-va » Jun 24, 2015 5:59 pm

Oh, OK. Somewhere earlier in the thread it was mentioned that once Veeam opened its own case, they were also opening a VMware case, so I figured that was because the issue had been traced to some VMware bug or limitation...

Are there any plans on the roadmap to fix it, or is the solution just to switch to hotadd or network mode for restores?
