Backup Copy Files Going Offline on Dell DR4300

by inayama » Thu Dec 08, 2016 12:22 am

Hello,

I am having an issue where the files copied by Backup Copy Jobs periodically go offline (shown with red X's) on a CIFS share on our Dell DR4300 (not Rapid CIFS). I can reconnect the files by rescanning the Backup Repo, but I have to disable all Backup Copy Jobs before I rescan, or I get errors stating that the files are locked by a running session. The Backup Copy Jobs themselves run fine (writes seem to succeed), but if I were to run FLR or a recovery against the files in the Backup Copy Jobs, I would get a read error and would have to rescan manually.

The support tech had me change the Maximum Number of Concurrent Connections on the Backup Repo (for the DR4300) from 4 to 2, but the files still ended up going offline.

We checked for dropped packets between the gateway server (the Veeam Backup server) and the DR4300, but did not find any in a 24-hour period.

I would also like to know if there is any way to automate the manual rescan via scripts so I can run it as a scheduled task.

Support Ticket # 01984964

Any help or suggestion is appreciated.

Thank you,
Tadashi
inayama
Influencer
 
Posts: 14
Liked: never
Joined: Thu Dec 08, 2016 12:02 am
Full Name: Tadashi Inayama

Re: Backup Copy Files Going Offline on Dell DR4300

by PTide » Thu Dec 08, 2016 11:16 am

Hi,

What are your gateway server settings for that repository?

Thank you
PTide
Veeam Software
 
Posts: 3019
Liked: 246 times
Joined: Tue May 19, 2015 1:46 pm

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Fri Dec 09, 2016 5:07 pm

Hello.

The gateway server for the Backup Repository is set as:

Following Server:
xxxxx (Backup Server)

It is the Veeam Backup Server.

Let me know if you need more info.

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Fri Dec 09, 2016 5:10 pm

Hello.

I also tried reducing the Maximum Number of Concurrent Connections to 1 and disabled the "Align backup file data blocks" option. The "Decompress backup data blocks before storing" option is enabled, the "This repository is backed up by rotated hard drives" option is disabled, and the "Use per-VM backup files" option is enabled.

The files ended up going offline again.

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by PTide » Fri Dec 09, 2016 5:24 pm

Do you have any other machine that is closer to the share than the backup server? If yes then please try to configure it as a gateway and see if that resolves the issue.

"I would also like to know if there is any way to automate the manual rescan via scripts so I can run it as a scheduled task."

Please check this PS cmdlet.
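
For anyone wanting to schedule the rescan, here is a minimal Python sketch that builds and runs a PowerShell script to disable the Backup Copy Jobs, rescan the repository, and re-enable the jobs. It assumes the Veeam PowerShell snap-in is installed on the backup server and that `Get-VBRJob`, `Disable-VBRJob`, `Enable-VBRJob`, `Get-VBRBackupRepository`, and `Rescan-VBREntity` behave this way in your version; please verify the cmdlet names and parameters against your installed release before relying on it. The helper names `build_rescan_script` and `run_rescan` are made up for illustration.

```python
import subprocess


def _ps_quote(text):
    """Single-quote a string for PowerShell, doubling any embedded quotes."""
    return "'" + text.replace("'", "''") + "'"


def build_rescan_script(repo_name, job_names):
    """Build a PowerShell script: disable jobs, rescan the repo, re-enable jobs.

    The cmdlet names are assumptions to check against your Veeam version.
    """
    if not job_names:
        raise ValueError("at least one job name is required")
    jobs = ", ".join(_ps_quote(name) for name in job_names)
    return "\n".join([
        "Add-PSSnapin VeeamPSSnapin",
        f"$jobs = Get-VBRJob -Name {jobs}",
        "$jobs | Disable-VBRJob",
        "try {",
        f"    $repo = Get-VBRBackupRepository -Name {_ps_quote(repo_name)}",
        "    Rescan-VBREntity -Entity $repo -Wait",
        "} finally {",
        # Re-enable the jobs even if the rescan fails.
        "    $jobs | Enable-VBRJob",
        "}",
    ])


def run_rescan(repo_name, job_names):
    """Run the generated script with powershell.exe; raises if it exits non-zero."""
    script = build_rescan_script(repo_name, job_names)
    return subprocess.run(
        ["powershell.exe", "-NoProfile", "-Command", script],
        check=True,
    )
```

A scheduled task could then call something like `run_rescan("codd06", ["Backup Copy Job 3"])`. Disabling a job may not stop a session that is already running, so the rescan could still report locked files in that case.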

Thank you.

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Fri Dec 09, 2016 5:58 pm

Hello,

The Veeam Backup Server is on the same VLAN and IP subnet as the DR4300. Both the Veeam Backup Server and the DR4300 have dual 10 Gbps connections on the same FEX. Physically, the DR4300 sits on top of the Veeam Backup Server.

I will also check the site you forwarded.

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Fri Dec 09, 2016 11:13 pm

Hello.

The Veeam Support Engineer suggested disabling parallel processing, so we did that, but the files for the Backup Copy Jobs went offline again.

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by foggy » Mon Dec 12, 2016 2:16 pm

Looks strange, indeed. Have you looked at the storage itself? Does it throw any related events/messages in its logs at the time the backups go offline?
foggy
Veeam Software
 
Posts: 14743
Liked: 1081 times
Joined: Mon Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Mon Dec 12, 2016 5:35 pm

Hello.

I did check the logs on Dell DR4300, and I did not see any corresponding errors on the DR4300. I also opened a case with Dell (SR: 940346626) and had a conf call with Veeam and Dell, but could not complete the root cause analysis or come to a resolution. Dell has the complete logs from DR4300, but I'm still waiting for a complete analysis.

Another interesting bit is that the files are not locked from the Windows standpoint: I can rename the Veeam backup files via the shared folders, but Veeam sees these files as offline. The "locking" is done only at the Veeam level, not at the Windows OS level.

Veeam support requested more logs, so I uploaded them.

Our Veeam Solutions Architect approved our plan to run the Backup Copy Jobs to a CIFS share presented by the Dell DR4300 (or any other NAS appliance), and we are currently backing up to Dell DR4200/DR4300 via NetBackup without any problems. This problem is stopping us from releasing/certifying Veeam to production, so I am looking for a quick resolution to avoid re-architecting the storage for Veeam.

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Tue Dec 13, 2016 9:18 pm

Hello.

So I got a reply from Dell Support:

Tadashi,

I received an update on the log review. We are not seeing any disconnects caused by the DR. However, there are two possible registry edits that could help. Microsoft posted them as a way to alleviate dropped connections on the Windows side. They would have to be applied to every Windows server that is attaching to the DR. Those servers would have to be rebooted. Here are the KBs:

http://support.microsoft.com/kb/102067 and http://support.microsoft.com/en-us/kb/170359.

1) Set SessTimeout to 3600 seconds. This value can be found (or created if not present) under

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanWorkstation\Parameters\
Value Name: SessTimeout
Data Type: REG_DWORD - Number
Value: 3600 (Decimal)

2) Set TcpMaxDataRetransmissions to 64. This value can be found (or created if not present) under

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters
Value Name: TcpMaxDataRetransmissions
Data Type: REG_DWORD - Number
Value: 64 (Decimal)
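
Since these two values have to be applied to every Windows server that attaches to the DR, it may help to generate a .reg file once and import it on each machine. The Python sketch below only renders the file content from the two settings listed above; the helper name `render_reg_file` is made up for illustration, and writing the file, importing it, and rebooting are left to the reader.

```python
# The two values from Dell's suggestion: (registry key, value name, decimal data).
SETTINGS = [
    (
        r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\LanmanWorkstation\Parameters",
        "SessTimeout",
        3600,
    ),
    (
        r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters",
        "TcpMaxDataRetransmissions",
        64,
    ),
]


def render_reg_file(settings=SETTINGS):
    """Render REG_DWORD settings as the text of a .reg file.

    .reg files store DWORD data as 8 hexadecimal digits, so the decimal
    values are converted here (3600 becomes 00000e10, 64 becomes 00000040).
    """
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key, name, value in settings:
        lines.append(f"[{key}]")
        lines.append(f'"{name}"=dword:{value:08x}')
        lines.append("")
    return "\r\n".join(lines)
```

Printing `render_reg_file()` and saving the output with a .reg extension gives a file that can be reviewed before it is imported anywhere.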

I will try these out and see how it goes.

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Wed Dec 14, 2016 4:39 pm

Hello,

The registry edits did not work; the files went offline again.

I also created two more Backup Repositories mounted to CIFS shares from two additional Dell DRs (DR4000 and DR4100), and they also went offline.

Not sure what to try out next.

Any suggestions would be appreciated.

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Wed Dec 14, 2016 5:16 pm

Hello,

From the logs:

[09.11.2016 03:27:18] Error Unable to investigate free space:
[09.11.2016 03:27:18] Error System.Exception: Heartbeat check failed for repo 'codd06'
[09.11.2016 03:27:18] Error at Veeam.Backup.ResourceScanner.CRepositoryScanProcessor.Execute()
[09.11.2016 03:27:18] Info [StorageDB] Marking storage \\CODD06\veeam\Backup Copy Job 3\qqcoqadwin511.vm-9682016-11-08T210000.vib id 67bcb217-d493-40f5-b010-85cfb86fb2b7 as RepositoryUnavailable (prev. value Available)
[09.11.2016 03:27:18] Info [StorageDB] Marking storage \\CODD06\veeam\Copy Prod Fri 9pm Daily Rvrs Incr\qqcoqadwin538.vm-12262016-11-08T000000.vib id 779ae21e-3b28-4d41-b8c6-588fbc1b5a80 as RepositoryUnavailable

So, keying in on why Veeam marks the files as offline:

Error Unable to investigate free space: [09.11.2016 03:27:18] Error System.Exception: Heartbeat check failed for repo 'codd06'

Does anyone have any insight into the "investigate free space" step? Is there a way to manually set the time or the interval for "investigate free space" so I can troubleshoot this step, or even to stop this step so the files do not go offline?
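
To see how often this happens and which files are affected, the log text can be scanned for the two message types shown above. This is a rough Python sketch based only on the excerpt in this post; the regular expressions and the `summarize` helper are my own and may need adjusting for other log lines.

```python
import re

# Matches the heartbeat failure line and captures its timestamp and repo name.
HEARTBEAT_RE = re.compile(
    r"\[(?P<ts>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2})\]\s+Error\s+"
    r"System\.Exception: Heartbeat check failed for repo '(?P<repo>[^']+)'"
)

# Matches each "Marking storage ... as RepositoryUnavailable" entry.
MARKED_RE = re.compile(
    r"Marking storage (?P<path>.+?) id (?P<id>[0-9a-f-]{36}) as RepositoryUnavailable"
)

SAMPLE = r"""
[09.11.2016 03:27:18] Error Unable to investigate free space:
[09.11.2016 03:27:18] Error System.Exception: Heartbeat check failed for repo 'codd06'
[09.11.2016 03:27:18] Info [StorageDB] Marking storage \\CODD06\veeam\Backup Copy Job 3\qqcoqadwin511.vm-9682016-11-08T210000.vib id 67bcb217-d493-40f5-b010-85cfb86fb2b7 as RepositoryUnavailable (prev. value Available)
[09.11.2016 03:27:18] Info [StorageDB] Marking storage \\CODD06\veeam\Copy Prod Fri 9pm Daily Rvrs Incr\qqcoqadwin538.vm-12262016-11-08T000000.vib id 779ae21e-3b28-4d41-b8c6-588fbc1b5a80 as RepositoryUnavailable
"""


def summarize(log_text):
    """Return (heartbeat failures, paths marked RepositoryUnavailable)."""
    failures = [(m["ts"], m["repo"]) for m in HEARTBEAT_RE.finditer(log_text)]
    offline = [m["path"] for m in MARKED_RE.finditer(log_text)]
    return failures, offline
```

Calling `summarize(SAMPLE)` returns one heartbeat failure for 'codd06' and the two .vib paths; the same function can be fed the contents of a full log file.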

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by foggy » Thu Dec 15, 2016 11:18 am

Tadashi, these kinds of questions are better addressed to the support engineer, so please keep the investigation going.

Re: Backup Copy Files Going Offline on Dell DR4300

by inayama » Sat Dec 17, 2016 3:07 am

Hello.

So, according to Veeam and Dell Support, it looks like there is a compatibility issue:

As part of the heartbeat process, Veeam queries the storage for free space (the "investigate free space" step). The Dell DR4XXX treats these queries as a "management process", which has lower priority than primary processes such as writing data to disk. So while Backup Copy Jobs are running, the DR4XXX answers the queries more slowly, sometimes too slowly for Veeam. Veeam then logs "Unable to investigate free space....Heartbeat check failed for repo..." and marks all the files in that Backup Repo as offline.

We asked Veeam Support if there was a way to increase the timeout for the "investigate free space" query. The engineer said that there is a registry key to change the frequency of the check, but could not find a way to increase the timeout.

Does anyone know how to increase the timeout for the "investigate free space" query, and for the other Backup Repo heartbeat checks in general?
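
One way to check this from the Windows side is to time a free-space query against the share while a Backup Copy Job is writing. The Python sketch below uses `shutil.disk_usage`, which is not how Veeam performs its heartbeat, so it only approximates the behaviour; the 5-second threshold is an arbitrary illustrative value, not Veeam's actual timeout.

```python
import shutil
import time


def time_free_space_query(path, slow_after_s=5.0):
    """Time one free-space query against `path`.

    slow_after_s is an arbitrary illustrative threshold, not Veeam's timeout.
    Returns (free_bytes, elapsed_seconds, was_slow).
    """
    start = time.perf_counter()
    usage = shutil.disk_usage(path)  # asks the OS for total/used/free space
    elapsed = time.perf_counter() - start
    return usage.free, elapsed, elapsed > slow_after_s


def sample(path, count=10, interval_s=30.0, slow_after_s=5.0):
    """Take `count` timings, `interval_s` apart; return the slow durations."""
    slow = []
    for i in range(count):
        free, elapsed, was_slow = time_free_space_query(path, slow_after_s)
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        print(f"{stamp} free={free} elapsed={elapsed:.3f}s")
        if was_slow:
            slow.append(elapsed)
        if i < count - 1:
            time.sleep(interval_s)  # wait before the next sample
    return slow
```

Running something like `sample(r"\\CODD06\veeam")` on the gateway server during and outside the copy window would show whether the query really slows down under write load.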

Thank you,
Tadashi

Re: Backup Copy Files Going Offline on Dell DR4300

by foggy » Fri Dec 30, 2016 1:54 pm

I'm not sure if there's a timeout specifically for this check. This should be an RPC call, so could you please verify whether the default RPC timeout value (RpcRequestTimeoutSec) has been redefined in your registry? The default is 3600 seconds, so it should be sufficient.
