FrancWest
Veteran
Posts: 489
Liked: 93 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by FrancWest »

Hi,

In the past we had to use the DisableHtAsyncIo and UseUnbufferedAccess registry keys as a workaround for hanging merges on ReFS. Now that we have updated to v11a, are these registry keys still needed? And if so, does having them present affect the health check speed improvements that were introduced in v11a?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by Gostev »

Hard to say without trying on your specific storage device.

DisableHtAsyncIo was only ever needed for "untypical" storage devices which we did not encounter in our own testing or during the beta program. These devices did not tolerate async I/O well for whatever reason, which is why disabling the async I/O engine helped them. Most backup storage devices saw a significant performance boost from the V11 upgrade thanks to the async I/O engine.

UseUnbufferedAccess might be safer to remove, as there were some generic improvements on our end which should help any storage (including making "typical" storage even faster). But we did not make all the changes we wanted around metadata flushing: they were postponed to V12 due to their complexity and the additional testing required. Potentially, the changes we made in 11a might not be enough for "untypical" storage to perform as fast as it does with buffered access. Again, this is impossible to predict without testing with the specific storage device.

Health check performance improvements in 11a come solely from using the async I/O engine, so if you have that disabled via the registry key you won't see any differences. But since your storage apparently does not tolerate async I/O in any case, no improvements are possible in principle.
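To make "async I/O" a bit more concrete: as described later in this thread, it means reading multiple different parts of a file at the same time. The following is only a rough sketch of that access pattern, not Veeam's engine. Threads stand in for whatever mechanism the real engine uses, and the chunk size, worker count and function names are all made up for the illustration.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 64 * 1024  # illustrative block size, not a Veeam setting


def read_chunk(path, offset, size=CHUNK):
    """Read one region of the file through its own file handle."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def read_sequential(path, offsets):
    """One request at a time: the storage only ever sees a single read."""
    return [read_chunk(path, off) for off in offsets]


def read_concurrent(path, offsets, workers=8):
    """Several reads in flight at once; results come back in offset order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda off: read_chunk(path, off), offsets))


def demo():
    """Check that both access patterns return identical data."""
    # Build a small scratch file so the example is self-contained.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(os.urandom(CHUNK * 16))
        path = tmp.name
    try:
        offsets = [i * CHUNK for i in range(16)]
        return read_sequential(path, offsets) == read_concurrent(path, offsets)
    finally:
        os.remove(path)
```

Both functions return the same bytes; the only difference is how many requests the storage has to serve at once, which is exactly what some devices handle well and others do not.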
FrancWest
Veteran
Posts: 489
Liked: 93 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by FrancWest »

Thanks for the reply.

Our primary storage device is a locally attached HPE D3600 enclosure attached to a ProLiant DL360 Gen10. Other storage (for copy jobs) is a NetApp connected using iSCSI. I’ll remove the registry keys and monitor it.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by Gostev » 2 people like this post

Hmm... assuming you have a proper hardware RAID controller, I would not expect you to have any issues with your storage in the first place.

After going through all the communication with the devs regarding this again, here's my understanding of the issue and what those registry values do. I hope this helps you find the right combination of the values, or even pinpoint the actual issue so you could remove them completely.

UseUnbufferedAccess=0 fixes slow WRITE speed (when your backups became slower after upgrading to V11) by opening backup files for write in buffered mode. This primarily affects metadata flush operations while backup files are being populated, which are the main cause of the slowness. Why:

Most enterprise-grade storage devices do those metadata flushes way faster in unbuffered mode, as this allows the RAID controller to manage writes most efficiently, FAR more efficiently than the OS system cache can (so removing the system cache from the data path usually improves write speed by up to 2x after upgrading to V11). However, some storage devices (presumably those without a hardware RAID controller, or perhaps with "bad" RAID controller write cache settings) start to suffer instead. For these devices, re-enabling buffered access brings write performance back to V10 levels, because writes are again buffered in the system cache.

11a has optimizations for flushes in the unbuffered (default) mode, but we did not have a chance to implement the ultimate solution yet.

DisableHtAsyncIo=1 fixes slow OPEN and READ by disabling the async I/O engine for ALL operations. Note that when the async I/O engine is enabled, backup files are ALWAYS opened for read in unbuffered mode regardless of the UseUnbufferedAccess value, which is why both keys are usually used in tandem. Use of the async I/O engine results in the following issues (only on certain storage devices and for unknown reasons):
1. (Because of unbuffered access) Backup files in the incremental chain take a long time to open during an incremental run (they need to be opened for READ to load the metadata essential for performing an incremental run). This issue is typically described as "backups take longer to complete after upgrading to V11".
2. (Because of unbuffered access) All restore operations are slow to initialize for the same reason: it takes a long time to open all the backup files in the incremental chain, which normally happens instantly.
3. (Because of async I/O: this is likely specific to deduplication storage appliances, since they are built as tape replacements and are thus optimized for sequential I/O) Actual restore performance drops because the device can't handle async I/O well (async I/O = simultaneously reading multiple different parts of the file). This is in contrast to regular enterprise-grade raw storage devices with many spindles, which LOVE async I/O and see a performance boost of a few times specifically because async I/O allows leveraging their IOPS capacity more fully.
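The way the two values combine can be summarized as a small decision table. This is only a sketch of the explanation above, not product code: the function and class names are invented, the defaults (unbuffered access on, async I/O engine on when the values are absent) are inferred from this post, and the read mode with the async engine disabled is an assumption, since the post only says the keys are usually used in tandem.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IoModes:
    async_engine: bool  # is the async I/O engine in use?
    write_mode: str     # "buffered" or "unbuffered"
    read_mode: str      # "buffered" or "unbuffered"


def effective_io_modes(use_unbuffered_access=1, disable_ht_async_io=0):
    """Map the two registry values to the I/O modes described in the post.

    Defaults model the assumed behaviour when neither value is set.
    """
    async_engine = disable_ht_async_io != 1
    write_mode = "unbuffered" if use_unbuffered_access != 0 else "buffered"
    if async_engine:
        # Stated in the post: with the async engine enabled, reads are
        # always unbuffered regardless of UseUnbufferedAccess.
        read_mode = "unbuffered"
    else:
        # ASSUMPTION: with the engine disabled, reads follow the same
        # setting as writes. The post does not spell this out.
        read_mode = write_mode
    return IoModes(async_engine, write_mode, read_mode)
```

For example, `effective_io_modes(0, 0)` yields buffered writes but still unbuffered reads, which is why setting UseUnbufferedAccess=0 alone would not address the slow OPEN and READ symptoms.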
FrancWest
Veteran
Posts: 489
Liked: 93 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by FrancWest »

Thanks for the detailed explanation. The RAID controller is an HPE Smart Array P408e-p SR Gen10. We were advised to set both keys due to an issue with fast clone operations (synthetic fulls on ReFS) hanging at a certain point. I’ll remove the keys and see if the improvements in V11a also remove the need for them.
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by Gostev »

What you've been told makes sense and correlates perfectly with my explanation above. Backup file metadata is the ONLY new data that is physically written to disk during synthetic fulls on ReFS, while 100% of the data blocks of a synthetic full backup are reused from existing backup files. This is why synthetic fulls don't consume physical disk space on ReFS (well, aside from those small metadata banks).
FrancWest
Veteran
Posts: 489
Liked: 93 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by FrancWest »

OK then, let’s see what happens when I remove both keys, and whether the improvements you made in V11a are enough for this issue.

Thanks!
FECV
Enthusiast
Posts: 41
Liked: 7 times
Joined: Mar 24, 2016 2:23 pm
Full Name: Frederick Cooper V
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by FECV »

Interesting that this thread should come up for me. Support just asked me to set these keys for some "Transmission pipeline hanged" errors. My server and storage are Dell Compellent and ME arrays, which I would imagine do not fall into the untypical storage category. This only happens on jobs from one server and none of the other servers of similar size etc. I just upgraded to 11a after opening the case, and was hoping to see some of those health check improvements, as my health checks take between 1 and 2 weeks (my jobs are all around 50TB but my repository is spread out over 250 spinning drives).

Am I to understand that with these reg keys set, I will not see some of the new performance increases?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by Gostev »

Your error is unlikely to have anything to do with what is discussed above. The described issues can make processing slower than it was in V10, but never cause it to hang completely. In any case, there's no point in using these global values if it only happens on one server. Support will need to troubleshoot what is wrong with that particular server.
FrancWest
Veteran
Posts: 489
Liked: 93 times
Joined: Sep 17, 2017 3:20 am
Full Name: Franc
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by FrancWest »

Hi Anton,

We were also having the pipeline error during merge operations. We were advised to set those keys on our side and even on the side of the cloud provider. See case #05044281.
DanielJ
Service Provider
Posts: 200
Liked: 32 times
Joined: Jun 10, 2019 12:19 pm
Full Name: Daniel Johansson
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by DanielJ »

We have a setup with two HPE Apollos which were installed 6 months ago, connected to a server running VBR 11.0.0.837 P20210525. We were never happy with the performance, but a while ago it started to become a real problem, with the throughput down to a trickle while the servers were more or less idle, and jobs started failing with "Timed out requesting agent port" and similar (ticket 05184567). We were advised to implement the reg keys mentioned in this thread. We had previously (in September) been advised to implement AgentStartTimeoutSec = 900, MaxUserPort = 65534 and TcpTimedWaitDelay = 15 for the very same problem (the only workaround until then was to reboot the servers), and the situation improved after that, but only after the recent changes have we reached something that resembles the performance expected from the beginning.

Now we are looking at upgrading to 11a. How should we handle the above settings? Clear them for a fresh start or keep them (or some of them) and see what happens?
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by Gostev »

I highly recommend removing all registry hacks prior to upgrading to a new version, just because our QC labs don't have them implemented so their impact on the new build can be totally unpredictable. It is always safer to return them later as needed, although normally the newer build will have fixes for most support issues anyway (if they were caused by bugs in our code that is, as opposed to by environmental problems).
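One way to follow this advice without losing track of what was set is to record the workaround values before removing them, so that they can be returned one at a time later if needed. The sketch below works on a plain dictionary and is purely illustrative: the helper names and `SomeOtherSetting` are hypothetical, and reading or deleting the real values (for example with Python's `winreg` module on Windows) is left out because this thread does not give the registry path.

```python
import json


def plan_cleanup(current_values, workaround_names):
    """Split values into a JSON backup of workarounds and the values to keep."""
    to_remove = {k: v for k, v in current_values.items() if k in workaround_names}
    to_keep = {k: v for k, v in current_values.items() if k not in workaround_names}
    # Serialise the removed values so they can be re-applied later as needed.
    backup = json.dumps(to_remove, indent=2, sort_keys=True)
    return backup, to_keep


def restore_one(backup, name):
    """Return a single saved value, for re-adding workarounds one at a time."""
    return {name: json.loads(backup)[name]}


# Example with the two values discussed in this thread plus a made-up one.
current = {"UseUnbufferedAccess": 0, "DisableHtAsyncIo": 1, "SomeOtherSetting": 5}
backup, remaining = plan_cleanup(current, {"UseUnbufferedAccess", "DisableHtAsyncIo"})
```

Returning the values individually after the upgrade makes it easier to tell which one, if any, is still needed on the new build.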

BTW I think it was a bad recommendation from support for you to add the registry keys discussed in this topic in the first place. It seems at some point they were universally recommending them in case of bad performance with V11, without proper troubleshooting. But these keys naturally don't apply to your hardware, as what they do is revert the V11 changes that were made specifically so that backup repositories based on general-purpose servers like HPE Apollo perform better.
DanielJ
Service Provider
Posts: 200
Liked: 32 times
Joined: Jun 10, 2019 12:19 pm
Full Name: Daniel Johansson
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by DanielJ »

Well, we did see an improvement after implementing those keys, but maybe it was just a combination of wishful thinking and an effect of restarting the servers. After a few nights we were back to the same old problem again: first extremely reduced performance (not only our "normal reduced"), then failures with "Timed out requesting agent port". I have sent fresh logs to support and asked them to take another look at this seemingly never-ending problem (same ticket).
Gostev
Chief Product Officer
Posts: 31561
Liked: 6725 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by Gostev »

Honestly, I would be surprised if your issue has anything to do with the registry hacks this thread is about. I mean, with those data movers failing to start in the first place, clearly much more is going on than a changed disk I/O pattern of already started data movers.
eengland09
Influencer
Posts: 17
Liked: 1 time
Joined: Oct 07, 2021 5:38 pm
Full Name: Eric England
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by eengland09 »

Can these registry changes be implemented with backups running or do all processes need to be stopped first?
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by foggy »

Hi Eric, for the registry value changes to take effect, it is typically recommended to restart the Veeam Backup Service. While this can be done with backups running, the changes themselves will not take effect immediately, but only when the corresponding processes next start.
eengland09
Influencer
Posts: 17
Liked: 1 time
Joined: Oct 07, 2021 5:38 pm
Full Name: Eric England
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by eengland09 »

I did go ahead and apply the changes with everything halted, then rebooted each VM. So far things look better - there are 2 jobs that I think I'm going to have to run active full backups on first - have you seen that trigger the incremental to run correctly for those that were problematic before the registry changes? I'm in the middle of one right now - hoping it completes and then proceeds to do the incrementals.
foggy
Veeam Software
Posts: 21073
Liked: 2115 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by foggy »

Can't comment as we know nothing about your setup and the actual issues. Could you please share what was 'problematic' before applying the values and describe the setup in more detail?
eengland09
Influencer
Posts: 17
Liked: 1 time
Joined: Oct 07, 2021 5:38 pm
Full Name: Eric England
Contact:

Re: UseUnbufferedAccess, DisableHtAsyncIo and V11a

Post by eengland09 » 1 person likes this post

We had a number of backups running slow and into the next day, which was quite odd. We've since removed the registry changes and things have been working fine - so for us - I think the recommendations were not needed and a bit of a "fluke".