Host-based backup of Microsoft Hyper-V VMs.
User0816
Lurker
Posts: 2
Liked: never
Joined: Sep 19, 2021 7:20 am
Full Name: Jürgen Mair
Contact:

Bad performance after migration to new physical machine (ReFS?)

Post by User0816 »

Hi,

We have a stand-alone Hyper-V server. On it there is a VM running Veeam B&R, which backs up all the VMs to a local NAS system.
The old machine had 15k SAS disks in a RAID 6 array; the new one has SSDs in a RAID 6 array. The new machine should be faster because of its newer CPUs, more memory, etc.

I wonder why a full backup on the new machine takes twice as long as the full backup did on the SAS disks.

So I went into the HPE SmartArray controller settings and played around, but found nothing significant. The Veeam backup job reports the bottleneck as Source.
I started to Google, and now I think it could have to do with the file system. On the old machine we used an NTFS volume to store all the VMs; the new one has almost the same setup, but we used a ReFS volume...

Is there something like a best practice for using Veeam B&R with Hyper-V Hosts using ReFS?
Is ReFS known to perform worse than NTFS in this setup?
We use B&R 9.5 U4.

regards
Jürgen
Egor Yakovlev
Veeam Software
Posts: 2537
Liked: 683 times
Joined: Jun 14, 2013 9:30 am
Full Name: Egor Yakovlev
Location: Prague, Czech Republic
Contact:

Re: Bad performance after migration to new physical machine (ReFS?)

Post by Egor Yakovlev »

Hi Jürgen,

You never mentioned the source Hyper-V host version, which is quite likely the reason for your performance issues with ReFS. There are something like 8 ReFS release versions as of today, each with its own tricks and guides. Microsoft has released quite a list of patches, fixes, and updates for ReFS, with some nasty issues being solved.
Please make sure your hosts are fully patched in the first place.

Since Veeam reads data from the Hyper-V host directly using the local transport service (I guess you are not using an Off-Host Proxy in your scenario?), you can also run a read stress test against the volume with any tool of your choice (MS has quite a list of options, and there are hundreds of third-party options for stress-testing reads).

/Cheers!
User0816
Lurker
Posts: 2
Liked: never
Joined: Sep 19, 2021 7:20 am
Full Name: Jürgen Mair
Contact:

Re: Bad performance after migration to new physical machine (ReFS?)

Post by User0816 »

Hi,

It is a brand-new machine and therefore a new Windows Server 2019 Std installation, with all standard patches applied...
We have Veeam in a VM on the Hyper-V machine and do not use an off-host proxy...

How should I do the read stress test: on the Hyper-V machine, reading from the datastore volume?

regards
Egor Yakovlev
Veeam Software
Posts: 2537
Liked: 683 times
Joined: Jun 14, 2013 9:30 am
Full Name: Egor Yakovlev
Location: Prague, Czech Republic
Contact:

Re: Bad performance after migration to new physical machine (ReFS?)

Post by Egor Yakovlev »

Yes, correct. You want to see how fast the Hyper-V host can read data from the volume where the VMs' disk files reside.
Since you are using a file share as a target, make sure the respective Backup Repository Gateway is set to the Hyper-V host rather than the default VBR server. Otherwise traffic will loop through the Veeam VM along the path [Production Volume] > [Hyper-V Host] > [Gateway: VBR VM] > [File share], instead of going straight from the host transport service to the file share. That does not affect Source read speeds in any way, but it will optimize the backup traffic flow anyway.
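For illustration only, a minimal Python sketch of such a read test, not a replacement for a proper disk benchmarking tool. It reads one file front to back and reports throughput. The function names are made up for this example, and it assumes the test file is larger than the host's RAM (or has not been read recently), since otherwise the Windows file cache will serve the reads and inflate the result.

```python
import time


def measure_sequential_read(path, block_size=1024 * 1024):
    """Read the file at `path` from start to end; return (bytes_read, seconds).

    Hypothetical helper for illustration. buffering=0 only disables Python's
    own buffer; the operating system cache can still serve the reads.
    """
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:  # empty bytes object means end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total, elapsed


def throughput_mb_per_s(total_bytes, seconds):
    """Convert a byte count and a duration into MiB per second."""
    if seconds <= 0:
        return float("inf")
    return total_bytes / (1024 * 1024) / seconds
```

Run it on the Hyper-V host against a large file on the VM volume, for example a copy of a VHDX, and pass the two returned values to `throughput_mb_per_s`. Running the same script against the same file on the old NTFS host gives a like-for-like number.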
PetrM
Veeam Software
Posts: 3264
Liked: 528 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Bad performance after migration to new physical machine (ReFS?)

Post by PetrM »

Adding my 2 cents: if the plan is to test read speed with a third-party tool, then it's worth testing on both the old and the new server, to see whether the tool shows the same pattern you see in Veeam (full backup 2x slower on the new server).

Thanks!
RGijsen
Expert
Posts: 124
Liked: 25 times
Joined: Oct 10, 2014 2:06 pm
Contact:

Re: Bad performance after migration to new physical machine (ReFS?)

Post by RGijsen » 2 people like this post

I'm trying to wrap my head around what you are saying here, but it's not clear. You have a new Hyper-V host on which the Veeam server is running as a VM. That host has SSD storage, I think is what you are saying? But you also say you back up to a NAS. That makes me believe you map a LUN on the NAS as an iSCSI device? Because otherwise the NAS hosts the filesystem, and that won't work with ReFS as far as I know.

But in the end, yes, ReFS is going to be slow as hell, especially on spinning disks. More and more in the digests I see Gostev actually advocating (more or less) spinning disks, but my personal experience is that it's terrible with ReFS, especially on standalone volumes (S2D is another story). The reason is that ReFS uses block cloning, which is great, but it makes the data extremely fragmented very quickly. And we all know that if there is one thing spindles are bad at, it's fragmented data.
I have a dedicated DL380 Gen9, 32 or 64GB RAM (don't remember exactly), a rather nice P840ar RAID controller and 12 10TB disks in RAID6. Now of course 12 spindles is not that great, but still it's a nice setup. Initially we had Veeam back up to ReFS on that, and initially it was working well. However, after only about two weeks, we started to get performance issues. And after two months, restoring a VM would be slow as hell, at about 30MB/sec max. Imagine having to restore a 5TB VM from that storage.
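To put that last figure in perspective, here is a quick back-of-the-envelope calculation in Python. It assumes decimal units (1 TB = 1,000,000 MB) and a constant 30 MB/s for the whole restore; the function name is just for this example.

```python
def restore_hours(size_tb, rate_mb_per_s):
    """Hours needed to move `size_tb` terabytes at a constant `rate_mb_per_s`."""
    size_mb = size_tb * 1000 * 1000  # assumption: decimal units, 1 TB = 1,000,000 MB
    return size_mb / rate_mb_per_s / 3600


print(round(restore_hours(5, 30), 1))  # 46.3
```

So a 5TB restore at that rate takes close to two full days.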
As 12x10TB gives me enough storage on my main repository for our environment anyway, we switched back to NTFS, and so far I must say I will probably never look back at ReFS. NTFS is just so much more mature. And that's another thing: ReFS is still rather flaky, with all its private patches, regkeys and whatnot to maybe get it stable. More than once we had corruption in a ReFS volume, and then ReFS tags the affected files as unusable, but also invisible. Even after a thorough case with MS, they simply couldn't manage to free up that space anymore, leaving us with over 20TB of simply unusable space on our disks. The only solution they came up with was to copy ALL data from that volume to another volume, which completely kills the block-cloning savings. There is practically no tooling available for ReFS apart from refsutil.exe, which is extremely limited.
ReFS is great when used in Storage Spaces, where it can actually self-heal in case of issues. On standalone volumes, though, I think it's a terrible filesystem, especially on spindles when using block cloning. Note that we run ReFS on our offsite backup, which is in fact SSD-powered. Even there we are noticing some performance degradation over time, but that's not really an issue for us, as we only need it in case of HUGE accidents where our primary store would be devastated. But remember that even SSDs are much faster with sequential data than with random data.
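The sequential-versus-random gap can be observed with a small Python sketch like the one below. It is only an illustration: the function names are invented for this example, and it assumes a test file much larger than the machine's RAM, because otherwise the first (sequential) pass simply warms the OS cache for the second (random) pass and the comparison means nothing. It reads the same set of blocks twice, once in file order and once in shuffled order, which roughly mimics reading a heavily fragmented file.

```python
import os
import random
import time


def read_blocks(path, offsets, block_size):
    """Read one block at each offset, in the given order; return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for off in offsets:
            f.seek(off)
            total += len(f.read(block_size))
    return total, time.perf_counter() - start


def compare_access_patterns(path, block_size=64 * 1024, seed=0):
    """Time the same blocks read in file order and in shuffled order."""
    size = os.path.getsize(path)
    offsets = list(range(0, size, block_size))
    shuffled = offsets[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps runs repeatable
    return {
        "sequential": read_blocks(path, offsets, block_size),
        "random": read_blocks(path, shuffled, block_size),
    }
```

Both passes read exactly the same bytes, so any difference in the timings comes from the access order alone.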

So just my 2 cents: if you care even a little bit about performance and you are backing up to spindles, just buy enough storage and use NTFS. If you back up to SSD storage, ReFS can be cool, but bear in mind you are totally on your own when you run into issues. And there are plenty of issues with ReFS. I know there are not a lot of people who agree with me on this, but I've had my share of ReFS problems and I just don't want anyone else to face the same.