Host-based backup of VMware vSphere VMs.
ferrus
Veeam ProPartner
Posts: 300
Liked: 44 times
Joined: Dec 03, 2015 3:41 pm
Location: UK
Contact:

Instant Restore performance

Post by ferrus »

I rolled out vPower NFS to our vCentre 5.1 farm last week, for the first time.
Generally everything is OK, but the performance is very, very bad.
I realise the limitations of Instant Restore/vPower connections, but for our architecture it seems to be falling far short of other users' reports on this forum.

Veeam server specs:
2x Intel E5-2660 CPU
96GB RAM
10x 6TB RAID 6 (Tier 1 DAS storage), 64k NTFS
2x 960GB RAID 1 (SSD storage), used for the NFS datastore
1Gbps dedicated vPower connection, connected to a 1Gbps vmkernel port, over a dedicated VLAN

When I tried an Instant Restore, it took over 10 minutes to boot the VM. The console was unusable, for a long time.
There's no deduplication or encryption on the backup storage, and only optimal compression.

Strangely, on subsequent reboots, the boot up time reduced to 7 minutes, and then 3.
My only explanation for this is that, as the VM's changes mount up, more of the data is being read from the SSD NFS datastore rather than the Tier 1 backup store. Is this correct?

I couldn't find anything to explain the original performance issue. This was the only job running on the Veeam server, all other VMs were removed from the ESX server for testing - so it's pretty much the only thing using the CPU, memory, network and disk resources on each end.
Still, even with the 3 minute boot time, the latency on the VM disk was 120-200ms during boot.
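
To put the 120-200ms latency figure in context, here is a rough back-of-the-envelope sketch. It assumes the guest issues its boot reads one at a time (queue depth 1), and the count of 2,000 reads is a made-up illustrative number, not something measured in this thread.

```python
def serial_iops(latency_ms: float) -> float:
    """I/O operations per second when only one request is outstanding at a time."""
    return 1000.0 / latency_ms


def serial_read_seconds(num_reads: int, latency_ms: float) -> float:
    """Total time for num_reads dependent reads, each costing latency_ms."""
    return num_reads * latency_ms / 1000.0


# Hypothetical assumption: the guest performs 2,000 dependent reads during boot.
ASSUMED_BOOT_READS = 2000

for latency_ms in (120, 200):
    minutes = serial_read_seconds(ASSUMED_BOOT_READS, latency_ms) / 60
    print(f"{latency_ms} ms per read -> {serial_iops(latency_ms):.1f} IOPS, "
          f"about {minutes:.1f} min for {ASSUMED_BOOT_READS} reads")
```

Under those assumptions, per-read latency alone would be enough to account for a multi-minute boot, regardless of how much bandwidth the 1Gbps link has.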

Then I read a post here that mentioned incremental chain size.
We use a 1x full/29x incremental, forever-forward Tier 1 strategy.
As I was restoring from the latest backup - this would use all 30 restore points.
Is this the likely bottleneck in the design?
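
The worry about chain length can be illustrated with a toy model. This is only a conceptual sketch of a naive newest-first lookup through a forward incremental chain; it is not how Veeam resolves blocks internally (real products may use index metadata to avoid a linear search), and all names here are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Toy restore point: block index -> block data.
# The full backup holds every block; each increment holds only changed blocks.
RestorePoint = Dict[int, bytes]


def read_block(chain: List[RestorePoint], block: int) -> Optional[Tuple[bytes, int]]:
    """Return (data, files_checked) for the newest copy of a block.

    The chain is ordered oldest (full) to newest; the search runs newest-first.
    """
    for checked, point in enumerate(reversed(chain), start=1):
        if block in point:
            return point[block], checked
    return None


# 1 full + 29 increments, mirroring the retention described above.
full: RestorePoint = {0: b"A0", 1: b"B0", 2: b"C0"}
increments: List[RestorePoint] = [{1: b"B%d" % i} for i in range(1, 30)]
chain = [full] + increments

print(read_block(chain, 1))  # recently changed block: found in the newest file
print(read_block(chain, 0))  # never-changed block: search falls through to the full
```

In this naive model a block that has not changed since the full backup costs a lookup in every file of the chain, which is the effect the question is asking about.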

If so, is there a way to create a quick restore/backup copy etc. to local storage (Tier 1 or SSD - NOT vCentre), before performing the instant restore?
I know this is an extra step, but it would almost certainly be quicker than the boot up time of the VM, and would allow Virtual Labs, DR in case of SAN failure etc.

I hope there is a way to do this, because at the moment - all the IR/SB/VL functionality is redundant with the current speeds.
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Instant Restore performance

Post by Gostev »

Hi, most likely there are some issues with your backup storage or its controller. I recall that even on my laptop lab with a single spindle of low-RPM HDD, VMs were booting in less than a minute with Instant VM Recovery - and that was before some massive vPower NFS optimizations in the past two B&R versions.

Have you tried backing up to some other storage for a test, and performing the same test there?

I keep remembering this feedback from the v5 beta, when someone actually had an instantly recovered VM boot to the login prompt faster from a backup file on a newly acquired SAN (that was waiting to be put into production) than the production VM itself (running off an older and busy SAN). It was something like 40 sec vs. 45 sec. And again, that was with the pre-release code and before recent optimizations.
ferrus
Veeam ProPartner
Posts: 300
Liked: 44 times
Joined: Dec 03, 2015 3:41 pm
Location: UK
Contact:

Re: Instant Restore performance

Post by ferrus »

Thanks for the reply.
I read that thread about your laptop setup, and a number of others that mentioned similar boot times.
I can't believe our storage build could be that far wrong - it's virtually the same as the Veeam Cisco C240 Best Practice build document.
That's why I wondered about the number of restore points.

I was planning to do a backup to the SSD disk tomorrow, so I'll post the results after that.
I'll do a similar backup to the Tier 1 storage as a control - to see whether it's storage or RP's.
Gostev
Chief Product Officer
Posts: 31816
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Instant Restore performance

Post by Gostev »

Yes, these would be perfect tests for a start - please keep us posted.
Indeed, Cisco C240 should provide excellent results really.
ferrus
Veeam ProPartner
Posts: 300
Liked: 44 times
Joined: Dec 03, 2015 3:41 pm
Location: UK
Contact:

Re: Instant Restore performance

Post by ferrus »

First test is quite surprising:

Full backup, no incrementals, stored on Veeam proxy local RAID 1 SSD disk.
Instant Restore boot up time - to login prompt: 12 minutes 5 seconds

I could carry on with the control test - but that was supposed to be the FAST result.
ferrus
Veeam ProPartner
Posts: 300
Liked: 44 times
Joined: Dec 03, 2015 3:41 pm
Location: UK
Contact:

Re: Instant Restore performance

Post by ferrus »

Been away for a week, but looking into this issue again.

My next test was a fresh VM from our Windows 2012 gold image. No Domain Policies, AD interference, or additional software services that could interfere with boot time.
Again - tested from an idle ESX server, connecting to an idle Veeam server.

Boot times ranged from 5 minutes 40 seconds, to 8 minutes 15 seconds.

Any suggestions to find the bottleneck?
DaveWatkins
Veteran
Posts: 370
Liked: 97 times
Joined: Dec 13, 2015 11:33 pm
Contact:

Re: Instant Restore performance

Post by DaveWatkins »

Random thought: you're using a dedicated vmkernel port on the ESX host. Since that wouldn't be used for the backups themselves, I'd start looking at that. Has it actually negotiated at 1Gb? Are there errors on that switch port? Is it set to jumbo frames incorrectly? That sort of thing. Seems a logical place to look anyway.
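
For the jumbo frames check, one common approach is a don't-fragment ping sized to exactly fill the MTU. A minimal sketch of the payload arithmetic, assuming IPv4 without options and ICMP echo (the function name is made up for illustration):

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


for mtu in (1500, 9000):
    print(f"MTU {mtu}: max ping payload {max_ping_payload(mtu)} bytes")
```

If a don't-fragment ping with the 9000-byte figure fails between the vmkernel port and the Veeam server while the 1500-byte figure succeeds, something on the path is probably not configured for jumbo frames end to end.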
ferrus
Veeam ProPartner
Posts: 300
Liked: 44 times
Joined: Dec 03, 2015 3:41 pm
Location: UK
Contact:

Re: Instant Restore performance

Post by ferrus » 1 person likes this post

I just want to report back, that this issue is resolved (don't know if there's a way of editing the subject to - Fixed).

A lot has changed since I originally posted the topic - Veeam 9, Veeam 9u2, Veeam 9.5, an extra Proxy + Repository. We've changed block sizes, compression, database - all kinds.
Just retested this following the recent v9.5 upgrade, and the Instant Restore VM boot up time - which was >12 minutes to the logon prompt, is now just under 2 minutes.
Once logged in - everything seems a lot more responsive, too.

I'm not sure which change has resolved the issue - v9.5 is the obvious answer - but the previous performance issues were severe for v8/9.
I'd have thought the change to per-VM backup files was the most likely fix (we had 30x Forever Incremental Restore Point backup chains, with >50 VMs and several TB files), but the test above showed poor performance with a single backup file.

I'd still like the boot times of <1 minute mentioned above, but factoring in domain policies etc., which slow down our boot up times anyway, we're happy with the performance.

Advice to others is to persevere and keep to the best practice - however long it takes to implement :D
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Instant Restore performance

Post by foggy »

Thanks for getting back with this. v9.5 has indeed introduced considerable restore performance improvements, including Instant Recovery, so that might be it.