AuGL
Enthusiast
Posts: 51
Liked: 3 times
Joined: May 07, 2019 12:22 am
Full Name: Glenn
Contact:

Re: Windows 2019, large REFS and deletes

Post by AuGL »

@PeterC ... have you tried clearing the system working set (memory related), as per an earlier post on this page, or even rebooting to see if it perhaps performs normally again, even if only for a few days?

PeterC
Enthusiast
Posts: 25
Liked: 7 times
Joined: Apr 10, 2018 2:24 pm
Full Name: Peter Camps
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterC »

We cleared the system working set a few months ago as a test, but haven't tried it again recently. We did do several reboots, and also updated drivers, firmware and Windows updates. None of this helps at all.
We also tested with the block-clone-tool from Veeam and the results are very bad at the moment. The virtual repos with server 2016 are about 7 times faster with this tool.

mkretzer
Expert
Posts: 675
Liked: 156 times
Joined: Dec 17, 2015 7:17 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by mkretzer »

I still suspect 10a. I hope I am finally able to upload the 16 GB of logs I collected today....

PeterStam
Influencer
Posts: 13
Liked: 1 time
Joined: Feb 25, 2015 1:49 pm
Full Name: Peter Stam
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterStam »

When running the block-clone-spd.exe test tool, we noticed that the ‘modified’ memory (orange bar in Resource Monitor) grows exponentially. The tool writes 2 files and then performs a block clone to ‘merge’ the 2 files into a new file. During this merge the modified memory slowly drops as it is written to disk. During this process the disk totally freezes: we can’t even create a folder on the disk the test is running against (Explorer turns white). Once the modified memory is released, everything is suddenly responsive again. The strange thing is that the test still finishes…

We had another physical server with other specs (cpu, mem, array controller different) and installed the same Windows 2019 DataCenter build 1809 17763.1397, same ReFS version 10.0.17763.1369. This server has exactly the same issue.

Both servers have all the latest firmware, drivers and Windows Updates (including the January update that was supposed to fix this). During the night, when Veeam starts merging, the same symptoms occur (Explorer freezes, memory, etc.).

None of this is happening on a Windows 2016 DataCenter server. Same command with block-clone-spd.exe but we can see the modified memory drop so much faster (7 – 8 times faster) and the tool finishes.

Tried the following with no result:
Change pagefile
Change array controller cache
Change ReFS registry settings (think I tried them all, also the one for optimizing DPM, you never know…)

All this with no luck… hopefully MS will come up with something… we are out of options… unless… someone has an idea…?

(Another thing we are noticing is that the VeeamAgent.exe process is reporting
'One or more threads of VeeamAgent.exe are waiting to finish network I/O' (resource monitor Analyze wait chain))

koenteugels
Novice
Posts: 7
Liked: never
Joined: Jan 31, 2017 1:50 pm
Full Name: Koen Teugels
Contact:

Re: Windows 2019, large REFS and deletes

Post by koenteugels »

@PeterC
Can you give some more information on your environment? I'd like to compare whether the environment size is similar.
I need to back up 200 TB of VMware source data on an NVMe SSD SAN (FC) to 4 physical proxy servers (2x Intel Xeon Gold 5215, 128GB RAM), each with 100 TB of SAS SSD SAN backup storage (FC) for short retention, and copy it to NL SAN (FC) storage for long-term retention.

PeterStam
Influencer
Posts: 13
Liked: 1 time
Joined: Feb 25, 2015 1:49 pm
Full Name: Peter Stam
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterStam »

So PeterC installed an old HPE server with Server 2016 and we ran the
.\block-clone-spd.exe <drive>:\Temp\ 50
with the result

All block cloning took 4.436s.
Average speed: 11540.839 MiB/s

On the production Server 2019, same command (no jobs running)
.\block-clone-spd.exe <drive>:\Temp\ 50
with the result

All block cloning took 56.518s.
Average speed: 905.902 MiB/s

Can anybody please post some results for us from a 2019 server with the latest updates installed? Just curious what values you are getting....

Thank you
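
As a rough sanity check on the two runs above, here is a minimal Python sketch. It assumes the `50` argument means a 50 GiB clone (51200 MiB), which matches the tool output posted elsewhere in this thread ("Will create three files 50 GiB each"). The names `RUNS`, `CLONED_MIB`, `slowdown` and `implied_speed` are illustrative only and are not part of the tool.

```python
# Figures copied from the two block-clone-spd runs quoted in this post.
RUNS = {
    "Server 2016": {"seconds": 4.436, "mib_per_s": 11540.839},
    "Server 2019": {"seconds": 56.518, "mib_per_s": 905.902},
}

# Assumption: the argument 50 means one 50 GiB file is block cloned.
CLONED_MIB = 50 * 1024


def slowdown(fast, slow):
    """Return how many times longer the slow run took than the fast one."""
    return slow["seconds"] / fast["seconds"]


def implied_speed(run):
    """Throughput implied by the assumed clone size and the elapsed time."""
    return CLONED_MIB / run["seconds"]


if __name__ == "__main__":
    factor = slowdown(RUNS["Server 2016"], RUNS["Server 2019"])
    print(f"Server 2019 took {factor:.1f}x as long as Server 2016")
    for name, run in RUNS.items():
        print(f"{name}: reported {run['mib_per_s']:.0f} MiB/s, "
              f"implied {implied_speed(run):.0f} MiB/s")
```

On these numbers the Server 2019 run takes roughly 12.7 times as long, and the reported average speeds agree with 51200 MiB divided by the elapsed time to within about 2 MiB/s, so the two result lines are consistent with each other.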

AuGL
Enthusiast
Posts: 51
Liked: 3 times
Joined: May 07, 2019 12:22 am
Full Name: Glenn
Contact:

Re: Windows 2019, large REFS and deletes

Post by AuGL »

I can do this .. where do I get the block-clone-spd.exe?

Rick_M
Influencer
Posts: 12
Liked: never
Joined: Aug 29, 2016 10:54 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by Rick_M »

Gostev wrote: May 05, 2020 11:37 am In general, as you can see from the above feedback, we have multiple customers with much larger ReFS deployments happy with the performance - so the issue seems to be specific to your backup repository server configuration. Veeam support will not help you here indeed, as without the source code we cannot debug Windows or ReFS issues... you should open a case with Microsoft instead.

I would also suggest trying the following:
1. Make sure you have real-time antivirus protection disabled (including Windows Defender).
2. Reduce concurrent tasks on the repository significantly. In case lack of memory is the issue, it will be multiplied by not using the recommended ReFS registry settings you mentioned above. The RefsEnableLargeWorkingSetTrim key reduces memory pressure, and is highly recommended for RAM-restricted configurations.
3. Enable synthetic fulls in the jobs. I noticed you do forever incremental, which is a non-default setting, and is unusual to use for ReFS - and makes no sense as synthetic fulls are "free" on ReFS. So currently, the workload you're putting on ReFS is very different from what most customers do.

Finally, if you're still uncomfortable with ReFS, you can always switch to XFS - which provides the same exact capabilities.

Thanks!

I can confirm Windows Defender SmartScreen was causing us several performance issues, specifically around Fast Clone operations. All of them were resolved once it was disabled.

Rick_M
Influencer
Posts: 12
Liked: never
Joined: Aug 29, 2016 10:54 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by Rick_M »

PeterStam wrote: Aug 18, 2020 2:21 pm So PeterC installed an old HPE server with Server 2016 and we ran the
.\block-clone-spd.exe <drive>:\Temp\ 50
with the result

All block cloning took 4.436s.
Average speed: 11540.839 MiB/s

On the production Server 2019, same command (no jobs running)
.\block-clone-spd.exe <drive>:\Temp\ 50
with the result

All block cloning took 56.518s.
Average speed: 905.902 MiB/s

Can anybody please post some results for us from a 2019 server with the latest updates installed? Just curious what values you are getting....

Thank you

Sure,

block-clone-spd-0.3.1> .\block-clone-spd.exe T:\test 50
block-clone-spd utility, v0.3.1. Vsevolod Zubarev 2018-19.
Volume file system is ReFS.
Block cloning is available.
Cluster size is 65536 bytes.
Free space available: 262144.000 GiB.
Will create three files 50 GiB each, for a total of 150 GiB.
Writing random file "T:\test\01.data"...
███████████████████████████████████████████████████████████████████████████████████████████████████████████ 51200/51200
Writing random file "T:\test\02.data"...
███████████████████████████████████████████████████████████████████████████████████████████████████████████ 51200/51200
Writing new file "T:\test\cloned.data" via block cloning...
All block cloning took 7.196s.
Average speed: 7115.356 MiB/s

PeterStam
Influencer
Posts: 13
Liked: 1 time
Joined: Feb 25, 2015 1:49 pm
Full Name: Peter Stam
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterStam » 1 person likes this post

Thank you Rick_M for the results. As a test we had already removed Windows Defender; it didn't make a difference in our case.

karsten123
Service Provider
Posts: 28
Liked: 3 times
Joined: Apr 03, 2019 6:53 am
Full Name: Karsten Meja
Contact:

Re: Windows 2019, large REFS and deletes

Post by karsten123 »

Hi,
Where do I get this „magical“ block-clone-spd.exe?
TIA

Gostev
SVP, Product Management
Posts: 26706
Liked: 4277 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev » 1 person likes this post

It's an in-house tool created by Veeam support, so there's no public download currently. Maybe there should be, though, because it basically became the "industry standard" after all these years of ReFS troubleshooting. These days, even the ReFS team at Microsoft uses it ;)

popjls
Influencer
Posts: 16
Liked: 1 time
Joined: Jun 25, 2018 3:41 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by popjls »

Yes please if you can provide this block-clone tool that would be great.

Rick_M
Influencer
Posts: 12
Liked: never
Joined: Aug 29, 2016 10:54 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by Rick_M »

PeterStam wrote: Aug 23, 2020 8:55 pm Thank you Rick_M for the results. As a test we had already removed Windows Defender; it didn't make a difference in our case.
Just to confirm, you specifically disabled Smartscreen? (Windows Security > App & Browser Control > Check Apps & Files: Off)

PeterStam
Influencer
Posts: 13
Liked: 1 time
Joined: Feb 25, 2015 1:49 pm
Full Name: Peter Stam
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterStam »

Hello Rick_M,
Yes it's disabled, but didn't solve anything.
Cheers,
Peter

tillmann
Lurker
Posts: 2
Liked: never
Joined: Aug 31, 2020 11:10 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by tillmann »

Hi all,
I just registered on this forum because we still seem to have this problem and I do not know what else to do.
I had manufacturer support check the hardware: everything is OK.
Veeam Support (04326410) checked, and recommended in the end to create a Microsoft Ticket, which I did (but did not hear back from them yet).

We are running Veeam Backup & Replication 10a on a Windows Server 2019 Standard with SSU 08-2020. So the ReFS patch should be included.
I followed the recommendations ( fsutil behavior set DisableDeleteNotify ReFS 1; RefsEnableLargeWorkingSetTrim 1).
The backup server is a VM with 3 disks attached, all residing on an 18-disk NL-SAS storage:
Drive E is 60TB (ReFS), created April 2019
Drive F is 50TB (ReFS), created December 2019
Drive G is 10TB (ReFS), created last week

A few months ago we experienced issues with backup (copy) jobs running out of their scheduled window, and sometimes a job ran into timeouts after 900 sec. A fresh start got rid of this. We adjusted some exclusion rules for Windows Defender ATP, which resulted in no more timeouts.
But the speed issue still remains a huge problem.
I added another test drive (G) last week and ran several speed tests (IOMeter, diskspd). The write results were around 600MB/sec for G, and around 100MB/sec for E and F.
diskspd with the following settings (-c25G -b512k -w100 -r4k -Sh -d300) achieves around 220MB/sec on G and around 25MB/sec on E or F. All disks are based on the same NL-SAS, so they should achieve the same performance (and in the beginning, they did).

Do you have any recommendations or suggestions for what else I could do?

Gostev
SVP, Product Management
Posts: 26706
Liked: 4277 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

How does the manufacturer explain the 10x difference in diskspd results?

Just to be clear, the issues discussed in this topic have to do with block cloning, hence the references to the block-clone-spd.exe support tool above, while you seem to have issues with basic write performance?

tillmann
Lurker
Posts: 2
Liked: never
Joined: Aug 31, 2020 11:10 am
Contact:

Re: Windows 2019, large REFS and deletes

Post by tillmann »

Ah, sorry. I was under the impression that this is the same problem. The manufacturer said that it is (most likely) a problem on the virtual machine level. He wanted me to measure the performance via dd on the host. But as this is our live backup server and my knowledge of virtualization and Linux is not that good, I refrained from doing this and instead created a fresh volume on the same storage for the virtual machine. This newly created volume does not have any issues regarding write or read performance. This is why I suspect the issue is on the file-system level.

kte
Expert
Posts: 178
Liked: 7 times
Joined: Jul 02, 2013 7:48 pm
Full Name: Koen Teugels
Contact:

Re: Windows 2019, large REFS and deletes

Post by kte »

Any updates on this? For a new Veeam 10a installation:
Windows 2016 or Windows 2019 with ReFS? Are the latest patches more stable?
I ask because in Windows 2019 we have repair tools for ReFS.

Gostev
SVP, Product Management
Posts: 26706
Liked: 4277 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

Yes, both are stable with the latest patches.

karsten123
Service Provider
Posts: 28
Liked: 3 times
Joined: Apr 03, 2019 6:53 am
Full Name: Karsten Meja
Contact:

Re: Windows 2019, large REFS and deletes

Post by karsten123 »

And performance-wise?

Gostev
SVP, Product Management
Posts: 26706
Liked: 4277 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

In theory, 2019 should be a bit faster due to having more optimizations.

PeterC
Enthusiast
Posts: 25
Liked: 7 times
Joined: Apr 10, 2018 2:24 pm
Full Name: Peter Camps
Contact:

Re: Windows 2019, large REFS and deletes

Post by PeterC » 1 person likes this post

Gostev wrote: Sep 14, 2020 3:21 pm Yes, both are stable with the latest patches.
Sorry to say, but at the moment we are still having issues with ReFS on Server 2019 (1809). We are using Veeam B&R 10a.
We have a case with Veeam (04331174) and one with MS (120081425000237).

We can definitely see performance issues on our Apollo 4200 at the moment when several merges start. I/O from other backup jobs to this repo degrades rapidly, Explorer on the repo turns white and programs (perfmon) stop responding.
When the merges are finished, I/O goes up again and programs start responding again. I would definitely advise using Server 2016. Unfortunately we cannot reinstall the Apollo because we have no storage to move the backups to.
At the moment we are doing several performance tests for MS. Hope that one day a solution will be available.

Gostev
SVP, Product Management
Posts: 26706
Liked: 4277 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by Gostev »

Well - given our customer base size, there will always be a few customers who have some issues. But all of those I'm aware of have very different issues, and they are often inconsistent (happen on one day but not the other) - which would indicate environment-specific or configuration-specific issues, rather than some "global" ReFS issue. They will be investigated and resolved, and I have to say that the ReFS dev team at Microsoft has been extremely involved and helpful.

In general though, we haven't seen any common stability issues with ReFS on either 2016 or 2019 in the past few months, so by now I personally am confident recommending both. Remember we have about 200K ReFS backup repositories deployed out there, with at least one third on 2019... so, it is safe to bet that any real common issue would result in hundreds of identical support cases every day.

Having said that, it is hard to argue against 2016 being much more "proven" simply due to its age.

stefanbrun
Service Provider
Posts: 26
Liked: 4 times
Joined: Apr 26, 2011 7:36 am
Full Name: Stefan Brun | MSupport Networks AG
Location: Lengnau, Switzerland
Contact:

Re: Windows 2019, large REFS and deletes

Post by stefanbrun » 1 person likes this post

I have received a new Apollo 4200 G10. It has been assembled and configured using the reference architecture from HPE and Veeam.
The ReFS volume sits on 12 x 16TB disks in RAID 6 via the 816i controller.

Here is the comparison data with Server 2019 (ver 1809, build 17763.1490, with all Windows updates):

C:\install>block-clone-spd.exe v:\temp 50
block-clone-spd utility, v0.3.1. Vsevolod Zubarev 2018-19.
Volume file system is ReFS.
Block cloning is available.
Cluster size is 65536 bytes.
Free space available: 148303.033 GiB.
Will create three files 50 GiB each, for a total of 150 GiB.
Writing random file "v:\temp\01.data"...
███████████████████████████████████████████████████████████████████████████████████████████████████████████ 51200/51200
Writing random file "v:\temp\02.data"...
███████████████████████████████████████████████████████████████████████████████████████████████████████████ 51200/51200
Writing new file "v:\temp\cloned.data" via block cloning...
All block cloning took 5.547s.
Average speed: 9230.052 MiB/s

I hope this helps as another reference value.
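
To line up the results posted so far, a minimal Python sketch follows. It assumes the two result lines always have the form shown in the outputs above; the names `parse_result`, `POSTED` and `ranked` are made up for illustration and are not part of block-clone-spd.

```python
import re

# Patterns for the two result lines printed at the end of a run.
TIME_RE = re.compile(r"All block cloning took ([0-9.]+)s\.")
SPEED_RE = re.compile(r"Average speed: ([0-9.]+) MiB/s")


def parse_result(text):
    """Return (seconds, mib_per_s) from pasted tool output, or None."""
    time_match = TIME_RE.search(text)
    speed_match = SPEED_RE.search(text)
    if not (time_match and speed_match):
        return None
    return float(time_match.group(1)), float(speed_match.group(1))


# Result lines copied from the posts in this thread.
POSTED = {
    "PeterStam, Server 2016": "All block cloning took 4.436s.\nAverage speed: 11540.839 MiB/s",
    "PeterStam, Server 2019": "All block cloning took 56.518s.\nAverage speed: 905.902 MiB/s",
    "Rick_M": "All block cloning took 7.196s.\nAverage speed: 7115.356 MiB/s",
    "stefanbrun, Server 2019": "All block cloning took 5.547s.\nAverage speed: 9230.052 MiB/s",
}


def ranked(posted):
    """Sort the parsed results from fastest to slowest average speed."""
    parsed = {name: parse_result(log) for name, log in posted.items()}
    return sorted(parsed.items(), key=lambda item: item[1][1], reverse=True)


if __name__ == "__main__":
    for name, (seconds, speed) in ranked(POSTED):
        print(f"{name:26s} {seconds:8.3f} s {speed:12.3f} MiB/s")
```

Sorted this way, the affected production Server 2019 box is the clear outlier, while the other posted runs land within the same order of magnitude as the Server 2016 result.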

NightBird
Service Provider
Posts: 193
Liked: 38 times
Joined: Apr 28, 2009 8:33 am
Location: Strasbourg, FRANCE
Contact:

Re: Windows 2019, large REFS and deletes

Post by NightBird »

Hello all,

Could someone send me block-clone-spd.exe,
so I can try it on various repository configurations?

More particularly I have an all flash repository configuration I want to test ;)

Thank you very much.

dominic.stratmann
Lurker
Posts: 1
Liked: never
Joined: Sep 24, 2020 12:36 pm
Contact:

Re: Windows 2019, large REFS and deletes

Post by dominic.stratmann »

Hello,

We have a new HPE Apollo system with Server 2019 build 17763.1490.
System: 64GB, RAID6 with 10 x 12TB NL-SAS.
The ReFS volume for the repo is about 40TB.
I have tried registry keys and hotfixes without luck.
Veeam B&R 10a is running.
After a while of uptime, performance drops dramatically; after a reboot everything is OK.

Maybe someone can send the block-clone-spd tool?
