MCU_Networking
Enthusiast
Posts: 38
Liked: 5 times
Joined: May 02, 2016 10:21 pm
Full Name: MCU Networking
Contact:

New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by MCU_Networking »

We are setting up new Veeam servers. We are starting from scratch and not carrying anything over: no jobs, no settings, and no backup files (for now).
The server is a Dell R540 with 21TB of self-encrypting SSDs in a RAID 5 on an H750 RAID controller, 64GB of memory, and a 10-core (20 virtual cores) processor. I made this my Off-Host Proxy.
I will most likely format the drive with a 64K allocation unit size, not 4K.
I plan on using Windows Deduplication, possibly only on files older than 30 days, to keep a hydrated landing zone. The idea is that it will only dedupe monthlies, older weeklies, and yearlies, which are unlikely to be modified again.
The Repo in VBR will be set to align blocks, not decompress when data hits the repo (I will lose some dedupe, and overall it will take more space, but the server will work less), and use per-VM files.
The backup jobs will use inline dedupe, for the same reason mentioned for decompression above, and will likely use Dedupe-Friendly or Optimal compression, skipping page files and deleted files of course.
I am not sure yet whether I am going to use an On-Host or Off-Host proxy for the jobs.
VBR server is virtualized.

I currently am backing up to a repo that is also the VBR server, using Extreme Compression (low available space), not uncompressing at repo, using inline dedupe, no windows dedupe. Bottleneck:Proxy on jobs is always the highest bottleneck and sits in high 90%'s.


I backed up one machine on the old server as a test control. It took 13 minutes and the backup size was 71.6GB.
The same virtual machine backed up on the new server took the same amount of time and the same amount of space.
I sadly saw no benefits from the SSD, because of the Proxy Bottleneck.
This was the same whether I used On-Host or Off-Host Proxy. The On-Host Proxy, aka the Hyper-V host, is a 32-core server with a couple hundred GB of free memory.
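
For a rough sense of scale, the control test works out to well under 100 MB/s landing on the repository. The following is a minimal sketch of that arithmetic, assuming 71.6GB is the size of the backup file on disk and that GB means decimal gigabytes (the post states neither); the function name is made up for illustration.

```python
def write_rate_mb_per_s(backup_size_gb: float, duration_min: float) -> float:
    """Average rate at which backup data landed on the repository, in MB/s."""
    size_mb = backup_size_gb * 1000  # assumption: decimal units (1 GB = 1000 MB)
    seconds = duration_min * 60
    return size_mb / seconds


# Figures from the control test: 71.6 GB in 13 minutes on both old and new servers.
rate = write_rate_mb_per_s(71.6, 13)
print(f"{rate:.1f} MB/s")  # prints "91.8 MB/s"
```

A write rate in that range is probably not stressing either the old SAS array or the new SSDs, which would be consistent with the job reporting Proxy rather than Target as the bottleneck.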






Question 1: is there something glaring I am missing that is preventing me from getting more performance out of the new servers (using NTFS)? I know I will get a more noticeable performance increase for on-repo tasks, such as synthetic operations, because of the higher IO bandwidth of SSDs versus the 15K SAS drives on the old server. I will also be able to dedupe files really quickly; I have already tested that. I am wondering if anyone has recently been in a similar situation and can provide some insight. Before anyone asks, I will likely not go with ReFS, because I already work too much as it is (not in a lazy way, more in a more-than-40-hours-a-week way), and I do not wish to deal with ReFS and Storage Spaces/Storage Spaces Direct. I have done about 20 to 30 hours of reading on ReFS vs NTFS, and so far I am more comfortable sticking with what I am used to, since I don't have enough information to convince me otherwise.
Now, if someone is using ReFS with hardware RAID, not using Storage Spaces at all, and can give me info on how that is going, that is something I might consider, since it is only one new thing. This would of course be ReFS without Deduplication, using Fast Clone with Veeam. Also, I would specifically want to know if ReFS will still delete corrupted data it finds automatically, if I don't have Storage Spaces Direct installed. This would ideally be similar to NTFS, except I get Fast Clone instead of Windows Deduplication.

Question 2 would be, why would the Proxy Bottleneck not improve when using On-Host, when going from Extreme Compression to None/Dedupe-Friendly/Optimal?
HannesK
Product Manager
Posts: 15598
Liked: 3443 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by HannesK » 1 person likes this post

Hello,
> I made this my Off-Host Proxy.
Good luck with that. Most customers cannot find a combination of Hyper-V patch level, storage firmware, hardware VSS provider and Windows patch level that is stable over years. That's why I always recommend on-host proxies for Hyper-V.
> I plan on using Windows Deduplication.
Although Windows deduplication seems to break fewer things today (I remember multiple cases of data corruption at customers), I would go with REFS (without storage spaces / storage spaces direct). If you want, go with NTFS. With SSDs, it sounds like a fit for you.
> VBR server is virtualized.
To avoid chicken-and-egg issues, I would put everything on one physical server. Technically you can virtualize it; I just would not do it.
> using Extreme Compression (low available space), not uncompressing at repo, using inline dedupe, no windows dedupe. Bottleneck:Proxy on jobs is always the highest bottleneck and sits in high 90%'s.
90% is expected behavior, yes.
> I sadly saw no benefits from the SSD, because of the Proxy Bottleneck.
Also expected, because of the compression setting :-)
> Also, I would specifically want to know if ReFS will still delete corrupted data it finds automatically,
Veeam has built-in health checks.
> why would the Proxy Bottleneck not improve when using On-Host, when going from Extreme Compression to None/Dedupe-Friendly/Optimal?
Probably not enough parallel tasks. What happens if you do the same test with 25 VMs in parallel?
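
The reasoning behind the parallel-task question can be sketched with a toy model: if a single task is limited by proxy CPU, total throughput should grow with the number of concurrent tasks until the cores or the repository run out. This is only an illustration; the function, its parameters, and every number in it are hypothetical and are not Veeam settings or measurements.

```python
def aggregate_throughput(tasks: int, per_task_mb_s: float,
                         proxy_cores: int, repo_limit_mb_s: float) -> float:
    """Toy model: each task is CPU-bound on one core until cores or the repo run out."""
    active = min(tasks, proxy_cores)  # assumption: one busy core per concurrent task
    return min(active * per_task_mb_s, repo_limit_mb_s)


# All numbers below are hypothetical, chosen only to show the shape of the curve.
for tasks in (1, 5, 25):
    total = aggregate_throughput(tasks, per_task_mb_s=90,
                                 proxy_cores=32, repo_limit_mb_s=1500)
    print(f"{tasks} task(s): {total} MB/s")
```

With these made-up inputs, one task is capped by the proxy, while 25 tasks run into the repository limit instead. That is the kind of shift a single-VM test cannot show.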

Best regards,
Hannes
Steve-nIP
Service Provider
Posts: 138
Liked: 68 times
Joined: Feb 06, 2018 10:08 am
Full Name: Steve
Contact:

Re: New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by Steve-nIP » 2 people like this post

Just to note, it's unreasonable to expect OK performance with extreme compression. I have had a machine with 24 cores and 48 threads and 256GB RAM max out all cores with ONE job, quickly becoming the bottleneck and killing job performance. I wouldn't ever consider it viable, and honestly high is rarely useful. Optimal is just so.. optimal.
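
The CPU cost of higher compression levels is easy to see in a small experiment. The sketch below uses Python's zlib purely as a stand-in: Veeam's compression levels use their own algorithms, so the absolute numbers will not match, and the sample data is synthetic rather than real VM data.

```python
import time
import zlib


def compare_levels(data: bytes, levels=(1, 6, 9)) -> dict:
    """Compress the same buffer at several zlib levels.

    Returns {level: (cpu_seconds, compressed_size_in_bytes)}.
    """
    results = {}
    for level in levels:
        start = time.process_time()
        compressed = zlib.compress(data, level)
        elapsed = time.process_time() - start
        # Sanity check: the data must survive a round trip.
        assert zlib.decompress(compressed) == data
        results[level] = (elapsed, len(compressed))
    return results


if __name__ == "__main__":
    # Synthetic, moderately compressible sample (an assumption, not real VM data).
    sample = b"".join(
        b"block %d of a pretend virtual disk\n" % (i % 5000) for i in range(200_000)
    )
    for level, (seconds, size) in compare_levels(sample).items():
        print(f"level {level}: {seconds:.3f}s CPU, {size / len(sample):.1%} of original size")
```

On most data, the highest level spends noticeably more CPU time for a fairly small gain in size, which is the same trade-off described above for extreme versus optimal.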
HannesK

Re: New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by HannesK » 4 people like this post

> Optimal is just so.. optimal.
I love that... and I added your quote to an official V12 deck ;-)
MCU_Networking

Re: New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by MCU_Networking »

Thank you for the responses.
"I would go with REFS (without storage spaces / storage spaces direct)" - HannesK
"Veeam has built-in health checks" - HannesK

I still have the question: if ReFS finds corrupted data, will it delete it automatically since there are no copies of the data (because no Storage Spaces), or can it not take any corrective action at all by itself (because no Storage Spaces Direct)? I am finding some people online saying it will and some saying it won't, with neither providing proof. Microsoft's main ReFS page doesn't really cover this situation, or at least I can't find it.


Another question, if a backup repo says to use per-vm files, will ReFS save data only in the per-vm backup chains, or will the ReFS Fast Clone/Block Clone feature work across the entire Job? So asked another way, is it per-backup chain or per-job?
soncscy
Veteran
Posts: 643
Liked: 314 times
Joined: Aug 04, 2019 2:57 pm
Full Name: Harvey
Contact:

Re: New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by soncscy » 1 person likes this post

ReFS has integrity streams, and you can monitor the event logs (event 133, I think); it will flag when it needs to remove data from the namespace. ReFS won't attempt repairs unless it sits on a resilient mirror or parity space:

https://docs.microsoft.com/en-us/window ... ty-streams
> If ReFS is mounted on a resilient mirror or parity space, ReFS will attempt to correct the corruption.
> If the attempt is successful, ReFS will apply a corrective write to restore the integrity of the data, and it will return the valid data to the application. The application remains unaware of any corruptions.
> If the attempt is unsuccessful, ReFS will return an error.
Veeam's Healthcheck can replace this, but in my experience it sometimes struggles on ReFS (this isn't about the health check itself; it's about ReFS being fragmented by design). With fast enough disks you likely won't notice it, but it's kind of a wash in my opinion. On SSDs, I think it should be acceptable for your most important servers; with my clients we have a very straightforward talk about classifying servers into categories, so they are set up for this from the beginning.

> Another question, if a backup repo says to use per-vm files, will ReFS save data only in the per-vm backup chains, or will the ReFS Fast Clone/Block Clone feature work across the entire Job? So asked another way, is it per-backup chain or per-job?

Block clone works per backup chain. So if you have a per-VM repo, each machine will only benefit from its own chain. For per-job repos, I suppose this means that you'll theoretically see bigger savings, but I am not fond of per-job chains for a few reasons (basically too many eggs in one basket, a lack of flexibility in shuffling backups around, and restores from tape are a beast because you sometimes have to pull the entire file back for just a single 30 GiB machine...).
MCU_Networking

Re: New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by MCU_Networking »

Thank you for the help.
DonZoomik
Service Provider
Posts: 378
Liked: 124 times
Joined: Nov 25, 2016 1:56 pm
Full Name: Mihkel Soomere
Contact:

Re: New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by DonZoomik » 1 person likes this post

A point about Extreme Compression: the only place where I've found it useful is Backup Copy jobs to remote sites, where the bandwidth limit is low enough to shift the bottleneck to the network.
MCU_Networking

Re: New Servers - NTFS SSD Repo no speed increase - Proxy Bottleneck

Post by MCU_Networking »

Backup jobs ended up being about 10% quicker on average. One job was about 100% faster. The copy jobs showed a slight performance increase; I am still working on the best settings for those.
Because we saw faster backup times, we now start the more important half of the jobs at 7:00pm and the other half at 7:30pm. They are usually all done before 8pm.
I haven't paid much attention to the health check or defrag times, and they haven't come up in every job yet. This is likely where I will see the largest speed increase, since it is mainly locally contained processing. I can try to report back in case someone in the future finds it helpful.

We ended up staying with NTFS 64K with Optimal compression on jobs, using inline deduplication and not decompressing when data hits the repo. We stayed with NTFS because there were too many negatives related to ReFS for us to go with it. We use per-VM backup files, so it looked like we were not going to get great savings from it anyway. We also did not want to go with the full-blown ReFS setup with Storage Spaces, relying on tried-and-true hardware RAID instead. For Hyper-V servers we ended up using On-Host proxies; for VMware we use the repository server as the proxy.