Win2016 ReFs repository server freeze during Back-up copy

by maja » Sat Feb 04, 2017 7:29 am

We are running Veeam 9.5 Update 1 and use Windows 2016 with ReFS for our backup repository servers. Sometimes during a backup or backup copy, a repository server freezes completely. The Win2016 repository server is a vSphere virtual machine, and when this happens the VMware Tools service stops running and I can no longer connect to the VM by any method (direct console, RDP, PowerShell Remoting, SMB, etc.).

Backups and copies show messages like this:

'Unable to allocate processing resources. Error: No scale-out repository extents are available. '

When I do a hard reset of the Win2016 repository VM, the machine comes back online and jobs usually continue, but the OS does not show any related cause in the event log.

Any ideas?
maja
Novice
 
Posts: 3
Liked: never
Joined: Wed Oct 03, 2012 2:38 pm
Full Name: Marco Janse

Re: Win2016 ReFs repository server freeze during Back-up cop

by Gostev » Sat Feb 04, 2017 5:40 pm

Large repository and 4KB cluster size used for ReFS?
Gostev
Veeam Software
 
Posts: 21396
Liked: 2350 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: Win2016 ReFs repository server freeze during Back-up cop

by maja » Fri Feb 24, 2017 1:19 pm

Yes, large repository. I logged a case with Veeam for this: 02060234.

Last week we changed the cluster size to 64K, and the freezes have not occurred since.
However, backup file sizes are getting a lot larger now... :(

Re: Win2016 ReFs repository server freeze during Back-up cop

by Gostev » Sat Feb 25, 2017 6:25 pm

Should be no more than 10% larger - good price for stability until Microsoft figures out those 4KB cluster size problems!

Re: Win2016 ReFs repository server freeze during Back-up cop

by davecla » Tue Jul 04, 2017 10:16 pm

I'm having exactly the problem described here.

Windows 2016 VM with ReFS as repository server. During a backup or backup copy, the repository server freezes (it seems to slow down, then freezes over a short period of time). The VMware Tools service stops running and I can no longer connect to the VM.
The repository is on an RDM. Lots of resources available to the server.

Have to power cycle to restart. Nothing in the logs....

I'm running Update 2 and the ReFS volume is using 64 KB clusters (I triple-checked).

Haven't logged a call yet. Hoping someone here has the answer.....
davecla
Influencer
 
Posts: 20
Liked: 2 times
Joined: Wed Feb 03, 2016 9:40 pm
Full Name: Dave Clarke

Re: Win2016 ReFs repository server freeze during Back-up cop

by mwvme » Wed Jul 05, 2017 11:22 pm

Hello there,
The fact that VMware Tools freezes makes this quite interesting. I would not expect that with a ReFS issue. I suggest you talk to support. You can start with us and I bet we can help.

Michael
mwvme
Veeam Software
 
Posts: 70
Liked: 18 times
Joined: Sat Dec 05, 2015 10:19 pm
Location: Calgary, Alberta Canada
Full Name: Michael White

Re: Win2016 ReFs repository server freeze during Back-up cop

by davecla » Thu Jul 06, 2017 3:54 am

I restarted the server and sat and watched the copy job progress and Resource Monitor for about 45 minutes today.

As the backup copy job ran, the server got noticeably slower from a UI perspective and then stopped. vSphere posted a virtual machine CPU usage error. While I watched Resource Monitor on the guest there was no sign of high CPU or memory usage.
I disabled the backup copy jobs and restarted the server. The small backup job running in the site completes OK. The backup copy job from that site to another site also completes OK.

I've got some new storage being installed at that site next week, so I will try moving the repository back to NTFS. There's about 50 TB, so the move will take some time :-(

Re: Win2016 ReFs repository server freeze during Back-up cop

by davecla » Thu Jul 06, 2017 9:52 pm

To add to the above, the server is still running happily today with the backup copy jobs disabled.

Re: Win2016 ReFs repository server freeze during Back-up cop

by mfirewalker » Mon Jul 10, 2017 4:17 pm

I have a similar issue with a physical Windows Server 2016 Standard machine used as a ReFS fast clone repository with a 4 KB allocation unit size (4.4 of 17.1 TB used). It previously happened only every week or so, but today I had to power cycle 4 times in a row because the server immediately locked up completely, as described above. For me it is related to backup jobs rather than backup copy jobs: I had to stop the running jobs and disable them to be able to do anything on the repository server again. With the jobs stopped, the server did not freeze. I am now unable to do backups. I am currently running Update 2 for 9.5 but was previously running Update 1 with the same issue.
mfirewalker
Novice
 
Posts: 3
Liked: never
Joined: Fri Jul 07, 2017 9:58 am

Re: Win2016 ReFs repository server freeze during Back-up cop

by kwinsor » Wed Jul 12, 2017 8:57 pm

Hi, we have the same issue. We have a physical server running Windows 2016, connected to Dell MD direct-attached storage, with ReFS on the repo drive. We use it as a target for copy jobs from our other datacenter. The server dies nearly every day, and it seems to happen with copy jobs that contain VMs with VMDKs of 2 TB or larger. If we disable the jobs with large VMDKs it's OK. Our allocation unit size is 64 KB. The server gets slower and slower until you can no longer RDP to it. We have to connect via iDRAC and power cycle it.
kwinsor
Influencer
 
Posts: 14
Liked: 1 time
Joined: Fri Oct 17, 2014 1:26 pm
Location: Toronto
Full Name: Kent Winsor

Re: Win2016 ReFs repository server freeze during Back-up cop

by tsightler » Wed Jul 12, 2017 10:14 pm

So my theory is that Veeam is attempting to write data to the filesystem faster than it can keep up with, which causes Windows to buffer these writes, and as memory is exhausted you get a hang. It is just a theory, but it fits the pattern and is similar to issues seen with NTFS in some cases, although in those cases the system doesn't hang completely, it just slows to a crawl. If it is correct, I have a few options that might be worth testing:

1) Use performance monitor to see the actual write rate to the ReFS volume during the backup copy, then set the write throttle in Veeam on the repository to about 80% of this value. This will throttle writes to the repo and slow down the copy job, but should keep it from needing memory for buffering.

2) If the above doesn't work, try the UseUnbufferedAccess registry key. I actually haven't tested this behavior yet, but I believe it will open all backup files with direct access, bypassing the OS buffer cache. I'll try to test the behavior in my lab in the next couple of weeks, but it might be worth a shot just to see if there is any impact.
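
A minimal sketch of the arithmetic behind option 1, assuming write rates have already been sampled with Performance Monitor and noted in MB/s. The function name, the sample values, and the use of a plain average are illustrative assumptions, not part of Veeam or Windows; only the 80% factor comes from the suggestion above.

```python
from statistics import mean


def suggested_throttle_mb_s(write_samples_mb_s, fraction=0.8):
    """Return a write throttle as a fraction of the average observed write rate.

    write_samples_mb_s: write rates (MB/s) noted by hand from Performance
    Monitor during a backup copy run (hypothetical input, not a Veeam API).
    fraction: share of the observed rate to allow; 0.8 matches "about 80%".
    """
    if not write_samples_mb_s:
        raise ValueError("need at least one write-rate sample")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    return mean(write_samples_mb_s) * fraction


if __name__ == "__main__":
    # Hypothetical samples, for illustration only.
    samples = [100.0, 120.0, 110.0, 90.0, 80.0]
    print(f"Suggested throttle: {suggested_throttle_mb_s(samples):.0f} MB/s")
```

The result would then be entered by hand as the repository's write throttle. If the volume's write rate dips sharply during the job, basing the figure on the lowest sustained sample rather than the average may be the safer choice.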
tsightler
Veeam Software
 
Posts: 4772
Liked: 1740 times
Joined: Fri Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

Re: Win2016 ReFs repository server freeze during Back-up cop

by mfirewalker » Thu Jul 13, 2017 7:30 am

I have now copied almost 9 TB of backup files in order to change the cluster size of the ReFS repository from 4K to 64K. I will now resume the backup jobs and see how it goes. However, reading other posts here, I expect the issues to continue (or to arise again as backup sizes increase). Unfortunately, the fast clone advantage is now gone for the existing files because I had to copy them. I will consider the steps described by tsightler; however, a deeper investigation into the technical details by Veeam would be much appreciated. For the record: the largest files on the affected repository are around 2 TB.

Re: Win2016 ReFs repository server freeze during Back-up cop

by kwinsor » Thu Jul 13, 2017 2:11 pm

We are not seeing the server run out of memory. Our monitoring software doesn't see it go above 50%.

Re: Win2016 ReFs repository server freeze during Back-up cop

by tsightler » Thu Jul 13, 2017 5:59 pm

Not sure what monitoring tool you are using, but I've found that many do not include Standby memory in the calculation for memory utilization. Regardless, it does not require a complete exhaustion of memory for the symptom to occur and still be related to memory buffering. I didn't see where you mention how much memory vs. storage you have, but a huge percentage of the performance and hang issues that I've seen in the field have been resolved by increasing the amount of memory available, regardless of whether the system seemed to be under memory pressure or not.
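
To illustrate why a monitoring figure can look low, here is a small sketch comparing utilization with and without Standby memory counted. All numbers and names are hypothetical, and how a given tool actually computes its figure is an assumption to verify against that tool.

```python
def memory_utilization(total_gb, in_use_gb, standby_gb, count_standby):
    """Return memory utilization as a fraction of total RAM.

    count_standby=False mimics a tool that reports only "in use" memory;
    count_standby=True also counts Standby (cached) pages as occupied.
    """
    if total_gb <= 0:
        raise ValueError("total_gb must be positive")
    used_gb = in_use_gb + (standby_gb if count_standby else 0.0)
    return used_gb / total_gb


if __name__ == "__main__":
    # Hypothetical server: 64 GB RAM, 30 GB in use, 28 GB Standby.
    without = memory_utilization(64.0, 30.0, 28.0, count_standby=False)
    with_standby = memory_utilization(64.0, 30.0, 28.0, count_standby=True)
    print(f"Without Standby: {without:.0%}, with Standby: {with_standby:.0%}")
```

With these made-up numbers, a tool that ignores Standby memory would report under 50% while most of the RAM is in fact holding data.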

Re: Win2016 ReFs repository server freeze during Back-up cop

by kwinsor » Thu Jul 13, 2017 7:40 pm

My server is in the process of dying as we speak. SNMP has tripped. Here's a snippet of Resource Monitor, and the server appears idle. It's not accepting any data. I cannot get Windows Explorer to show any contents of the repository drive. You can see in my image that the Veeam copy job was stuck for 4 hours on one HDD of a single server. If we kill the Veeam Data Mover Service, the server functions normally. We happened to log in at the right time today to catch this server before it became unresponsive. There's something wrong with Veeam on Windows 2016 with ReFS, or maybe 2016 in general. All my other local repos are Windows 2012 R2 with NTFS. I'm building another with ReFS on 2016 as a VM today and will test with a large transfer.

http://ibb.co/d8cBua
http://ibb.co/eYG4Ea
