Host-based backup of Microsoft Hyper-V VMs.
Simon_LBC
Enthusiast
Posts: 53
Liked: 4 times
Joined: Dec 11, 2018 3:15 pm
Full Name: Simon C.
Contact:

Slow NAS storage making my Veeam tasks time out and fail

Post by Simon_LBC » 1 person likes this post

Before you ask: I have already opened some tickets with Veeam support for this issue, and the bottom line was "your storage is slow, and that makes your Veeam tasks time out and fail". I am still posting the question to see how I can troubleshoot my slow storage further and to get the opinion of other Veeam users, and maybe other Veeam technicians too. Obviously, I first checked my NAS for storage integrity and everything is perfect. By the way, the NAS is brand new (3 months old), uses NAS PRO enterprise-level disks, is set to RAID 6, and has SSD caching enabled.

So let's talk about the setup. The destination server of the jobs is a super freaking fast HP ProLiant DL380-G8 running Windows Server 2012 R2. It is connected over 1 Gbps Ethernet to a QNAP enterprise NAS (TES-1885U), which is also connected at 1 Gbps because I have neither a 10 Gbps Ethernet switch nor 10 Gbps ports on the server. The QNAP is mounted on the HP server as an iSCSI LUN. The server handles two Veeam jobs: the first is a replication from my other (production) server to this (replica) server, and the second is a Veeam backup to a repository on this same replica server for data retention.

Both the replication and backup jobs have issues. The main issue with the replication job comes when it performs the "Merging Hyper-V snapshot" step on the host. The merge can take up to 3 hours, and it doesn't actually fail (it works fine on the host), but Veeam hits a "Timed Out" at some point and returns a "false" failure. By the way, the snapshots are 120 GB each on a 13 TB full volume, so yes, it's pretty huge. I keep 3 snapshots in addition to the full volume.

Now for the backup job, the problem is a bit trickier. The job itself works fine without any issue, but I set it to perform a "backup health check" once per month as suggested, and this part of the job always fails after exactly 12 hours of run time. The error given is Agent failed to process method {Signature.FullRecheckBackup}. Because the check always fails after precisely 12 hours, I think it's not related to a hardware error and that my backups are healthy; it looks more like a process timeout. I have worked in IT for 18 years, and usually when a job fails after precisely 12 hours, it sounds like a timeout! :lol:

At this point I think the bottleneck of my setup is probably the 1 Gbps connection between the server and the storage. But my goal is to understand the minimum storage speed required to run Veeam backup and replication jobs. I manually tested the speed between my server and the NAS by copying some large files, medium files, a bunch of small files, etc. back and forth, and I easily reach a stable 110-115 MB/s read and write speed between the two points, which is the expected maximum speed of a gigabit Ethernet connection. So what's the deal with Veeam and Windows processes? Do they need a 10 Gbps interface to work properly?
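As a rough sanity check on those numbers, here is a minimal Python sketch of the link arithmetic. The 0.94 efficiency factor is only an assumed allowance for protocol overhead (not a measured value), the function names are hypothetical, and the 13 TB figure is the full volume size mentioned above.

```python
# Back-of-the-envelope link arithmetic; all figures are illustrative assumptions.

def link_ceiling_mb_per_s(link_gbps: float, efficiency: float = 0.94) -> float:
    """Approximate usable payload rate of an Ethernet link in MB/s (decimal).

    `efficiency` is an assumed allowance for Ethernet/IP/TCP/iSCSI overhead.
    """
    return link_gbps * 1000 / 8 * efficiency


def hours_to_move(size_tb: float, rate_mb_per_s: float) -> float:
    """Hours needed to read or write `size_tb` (decimal TB) at a steady rate."""
    return size_tb * 1_000_000 / rate_mb_per_s / 3600


if __name__ == "__main__":
    for gbps in (1, 10):
        rate = link_ceiling_mb_per_s(gbps)
        print(f"{gbps:>2} Gbps: ~{rate:.0f} MB/s, "
              f"13 TB takes ~{hours_to_move(13, rate):.1f} h")
```

By the same arithmetic, 12 hours at a steady 115 MB/s moves a little under 5 TB, so any operation that has to read more than that over a gigabit link cannot finish inside a 12-hour window, no matter how healthy the storage is.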

Before purchasing a 10 Gbps infrastructure ($$$) on both ends between server and NAS, I would like to clarify the expected storage speed and also be sure that this investment will fix all backup and replication issues. :)
foggy
Veeam Software
Posts: 21139
Liked: 2141 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Slow NAS storage making my Veeam tasks time out and fail

Post by foggy »

Hi Simon, let me clarify a few things. First of all, do I understand correctly that you're running a replication job from another (production) Hyper-V host and the replica VMs land on an iSCSI volume residing on the NAS? And then the backup job uses these replica VMs as a source and sends backups to the Hyper-V host's local disk? What are the bottleneck stats reported by both jobs?

Post by Simon_LBC »

Hi Foggy, not exactly. I am running both (though not at the same time) a replication from my main hypervisor to this replica server, which has the iSCSI volume on the NAS. I also have a backup job that works pretty much the same way, between the main server and the replica server, which is also set up as the backup repository.

So both the replication job and the backup job, which share this same destination, fail or time out at some point.

Thank you and sorry for the misunderstanding.

Post by Simon_LBC »

Hi Foggy, I found (almost by accident) this KB from Veeam (https://www.veeam.com/kb2014), and I was stunned that the person I spoke with at Veeam technical support never pointed me to this article in the first place. That being said, I ran some benchmarks on my NAS storage with this MS utility, following the steps in the Veeam KB. I don't know whether these results are good, average or bad, but here's what I got:

When I simulate a "full backup or forward incremental", the result is:

Total IO
thread | bytes | I/Os | MiB/s | I/O per s | file
------------------------------------------------------------------------------
0 | 20280508416 | 38682 | 32.23 | 64.47 | D:\testfile.dat (1024MiB)
------------------------------------------------------------------------------
total: 20280508416 | 38682 | 32.23 | 64.47


Then, when I simulate "synthetic jobs, such as backup merge", the result is:

Total IO
thread | bytes | I/Os | MiB/s | I/O per s | file
------------------------------------------------------------------------------
0 | 39459487744 | 75263 | 62.72 | 125.44 | D:\testfile.dat (1024MiB)
------------------------------------------------------------------------------
total: 39459487744 | 75263 | 62.72 | 125.44

HOWEVER, for the second result, the Veeam KB says to divide the result by 2 because Veeam needs 2 I/Os at the same time, so in this case it comes out to pretty much the same figure as the first "full backup" simulation...

I would say that 32.23 MiB/s looks like low throughput; however, I am not sure what an "ideal" or recommended storage throughput for Veeam would be... two times faster, three times, four times, more?
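To put the two totals in perspective, here is a small Python sketch that derives the block size and run length each row implies, applies the halving mentioned above, and estimates how long a 13 TB write (the full volume size from the first post) would take at the measured rate. The helper names are hypothetical and the numbers are copied from the output above; treat it as a rough estimate, not a statement of how Veeam actually schedules its I/O.

```python
# Values copied from the two "total" rows quoted in the post.
MIB = 1024 * 1024


def describe_total(total_bytes: int, total_ios: int, mib_per_s: float) -> dict:
    """Derive the block size and run length implied by one benchmark total row."""
    return {
        "block_kib": total_bytes / total_ios / 1024,
        "duration_s": total_bytes / (mib_per_s * MIB),
    }


def hours_at_rate(size_tb: float, mib_per_s: float) -> float:
    """Hours to write `size_tb` decimal terabytes at a steady MiB/s rate."""
    return size_tb * 1e12 / (mib_per_s * MIB) / 3600


full_backup = describe_total(20280508416, 38682, 32.23)
synthetic = describe_total(39459487744, 75263, 62.72)
# Per the post, the KB says to halve the synthetic figure.
synthetic_effective_mib_s = 62.72 / 2

if __name__ == "__main__":
    print(full_backup, synthetic)
    print(f"effective synthetic rate: {synthetic_effective_mib_s:.2f} MiB/s")
    print(f"13 TB at 32.23 MiB/s: ~{hours_at_rate(13, 32.23):.0f} h")
```

Both rows work out to 512 KiB blocks over a run of about 600 seconds, and the measured 32.23 MiB/s is well under a third of the 110-115 MB/s seen with plain file copies over the same link.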

Post by foggy »

Hi Simon, the throughput is indeed not high, but to make sure the target storage is the culprit here, we need to review the job bottleneck stats.

Post by Simon_LBC »

Hi Foggy,

I finally took a chance and put a 10 Gbps network card in my server and directly attached the QNAP NAS (which is already 10 Gbps capable) to this card, and it significantly increased my overall Veeam job speed. Yes, the jobs still take very long, but I think this is also because they are VERY large: my backup job is 15 TB, and my replication job is 22 TB. Some specific jobs still take a while, like "backup compact", which takes around 40 hours to complete, but since I understand it re-creates a new VBK for the entire job (15 TB), that sounds like a normal duration.
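For reference, the average rate implied by that compact run can be worked out with a couple of lines of Python. This is only a sketch: the function name is hypothetical, and it counts just the 15 TB written, assuming decimal terabytes.

```python
def implied_rate_mb_per_s(size_tb: float, hours: float) -> float:
    """Average MB/s (decimal) needed to process `size_tb` terabytes in `hours`."""
    return size_tb * 1_000_000 / (hours * 3600)


if __name__ == "__main__":
    rate = implied_rate_mb_per_s(15, 40)
    print(f"15 TB in 40 h is about {rate:.0f} MB/s on average")
```

If the compact also has to read the old file from the same NAS while writing the new one (an assumption about how the operation behaves), the storage is handling roughly twice that amount of traffic.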