-
- Veteran
- Posts: 527
- Liked: 58 times
- Joined: Jun 06, 2018 5:41 am
- Full Name: Per Jonsson
- Location: Sweden
- Contact:
Backup copy throughput
Folks,
We have been having problems with how fast data can be written to our backup copy repositories. The backup copy jobs sometimes have to wait for the target server to write the cached data to disk. So I have now used the "Limit read and write data rates to" setting to throttle the data. With a limit of 200 MB/s on each repository the server manages to write all the incoming data to the disks fast enough, so that seems to be the optimal throughput. BUT, one of the jobs, which copies around 1 TB of data from 13 SQL servers each day and starts with "Merging oldest incremental backup into full backup file", is now MUCH slower. Without the data rate limit the merge takes around 6 minutes; with the 200 MB/s limit it takes around 55 minutes, which is a big difference.
I suppose that this is a side effect of the data rate limit setting, and a problem that cannot be solved? Perhaps I could increase the limit a little, but I don't want the target server to start "choking", which it does with no data rate limit. The best scenario would be if the limit were in effect during the copy process but not during the merge.
The target server has two repositories, each consisting of 11 spinning SAS disks, 10 TB each, using RAID 5.
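If the rate limit counts read and write together, the merge effectively runs at ~100 MB/s each way, which roughly fits the numbers I am seeing. A quick back-of-the-envelope in Python (the combined-cap behaviour is my assumption, not something confirmed anywhere):

```python
# Back-of-the-envelope from the numbers above. Assumption (not confirmed):
# the "Limit read and write data rates" cap applies to combined repository
# I/O, and a merge both reads and writes on the same repository, so each
# direction gets roughly half the cap.

MB = 1000**2  # decimal megabyte

limit = 200 * MB
per_direction = limit / 2            # ~100 MB/s each way during a merge

merge_limited_s = 55 * 60            # observed: ~55 min with the cap
merge_unlimited_s = 6 * 60           # observed: ~6 min without it

# Data volume the merge must rewrite, implied by the throttled run:
merged_bytes = per_direction * merge_limited_s

# What the RAID 5 set apparently sustains when unthrottled:
implied_rate = merged_bytes / merge_unlimited_s

print(f"implied merged data: ~{merged_bytes / 1000**3:.0f} GB")
print(f"implied unthrottled merge rate: ~{implied_rate / MB:.0f} MB/s per direction")
```

So the ~9x slowdown is about what the arithmetic predicts if the disks can do roughly 900 MB/s sequentially when left alone.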
PJ
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Backup copy throughput
Hi Per, have you tried playing with the max number of concurrent tasks instead of the ingestion rate? That way the merge throughput will not be limited, while the overall storage load will still be controlled. Btw, how many tasks are typically running in parallel, and do you have the per-VM chains setting enabled?
-
- Veteran
- Posts: 527
- Liked: 58 times
- Joined: Jun 06, 2018 5:41 am
- Full Name: Per Jonsson
- Location: Sweden
- Contact:
Re: Backup copy throughput
I have already played around with the number of concurrent tasks; I even tried setting it to 1. But it doesn't seem to help: the target server still gets too much to do, and the backup copy job must wait for it to write the cached data to disk, sometimes for several minutes. So I went back to the default, which is 4. Yes, I have the per-VM chains setting enabled.
-
- Veeam Software
- Posts: 856
- Liked: 154 times
- Joined: Feb 16, 2012 7:35 am
- Full Name: Rasmus Haslund
- Location: Denmark
- Contact:
Re: Backup copy throughput
Do you have a battery for your RAID controller? Which stripe size did you use for the RAID 5? Which file system did you format it with? Which block size did you use for the file system?
Rasmus Haslund | Twitter: @haslund | Blog: https://rasmushaslund.com
-
- Veteran
- Posts: 527
- Liked: 58 times
- Joined: Jun 06, 2018 5:41 am
- Full Name: Per Jonsson
- Location: Sweden
- Contact:
Re: Backup copy throughput
We use three HPE DL380 Gen 10 servers for our backup solution, each with these two controllers:
HPE Smart Array P408i-a SR Gen10 (This is the controller for the internal disks)
HPE Smart Array P408e-p SR Gen10 (This is the controller for the external disk enclosure)
The internal controller has a 2 GB flash-backed cache, and the external one has 4 GB. The file system is ReFS and the block size is 64 KB. I am trying to determine whether we have the HPE Smart Storage Battery in the servers. I don't know what the stripe size is, and I cannot seem to get that info from the Smart Storage Administrator. Where can I see that?
-
- Veteran
- Posts: 527
- Liked: 58 times
- Joined: Jun 06, 2018 5:41 am
- Full Name: Per Jonsson
- Location: Sweden
- Contact:
Re: Backup copy throughput
Yes, we have the HPE Smart Storage Battery in all three servers.
-
- Veteran
- Posts: 527
- Liked: 58 times
- Joined: Jun 06, 2018 5:41 am
- Full Name: Per Jonsson
- Location: Sweden
- Contact:
Re: Backup copy throughput
...and the stripe size, which I finally found in the logical device specs: "Stripe size / Full stripe size ... 256 KB / 2560 KB".
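That matches the geometry: with 11 disks in RAID 5, one disk's worth of each stripe holds parity, so 10 data blocks of 256 KB make up the full stripe. A quick sanity check:

```python
# Sanity-check of the reported logical drive geometry: an 11-disk RAID 5 set
# has 10 data blocks per stripe (one disk's worth of capacity goes to
# rotating parity), so full stripe = stripe size * data disks.

disks = 11
stripe_kb = 256
data_disks = disks - 1              # RAID 5: one parity block per stripe
full_stripe_kb = stripe_kb * data_disks

print(f"full stripe size: {full_stripe_kb} KB")  # matches the reported 2560 KB
```

(Side note: the ReFS 64 KB cluster is only a quarter of the 256 KB stripe element, so small scattered writes can trigger RAID 5 read-modify-write cycles; whether that matters for this workload is another question.)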
-
- Veteran
- Posts: 527
- Liked: 58 times
- Joined: Jun 06, 2018 5:41 am
- Full Name: Per Jonsson
- Location: Sweden
- Contact:
Re: Backup copy throughput
It seems that Network Traffic Throttling is the correct way to go for me. Then I can limit the amount of data being sent to the backup copy server without having to worry about the merge process being slow after all the data has been copied to the repositories. At first glance it seems to work smoothly.
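Conceptually, network throttling paces the data on the wire at the sending side, so local repository I/O such as the merge is never capped. A minimal token-bucket sketch of that mechanism (an illustration of the general idea only, not Veeam code):

```python
import time

# Token-bucket pacing: the sender spends "tokens" (bytes) that refill at a
# fixed rate, so traffic on the wire is capped while local disk I/O at the
# target runs at full speed.

class TokenBucket:
    def __init__(self, rate_bytes_per_s: float, burst_bytes: float):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic()

    def consume(self, n: int) -> None:
        """Block until n bytes' worth of tokens are available, then spend them."""
        while True:
            now = time.monotonic()
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= n:
                self.tokens -= n
                return
            time.sleep((n - self.tokens) / self.rate)

# Pace 10 MB of "copy traffic" at 5 MB/s with a 1 MB burst allowance:
bucket = TokenBucket(rate_bytes_per_s=5_000_000, burst_bytes=1_000_000)
start = time.monotonic()
for _ in range(10):
    bucket.consume(1_000_000)        # one 1 MB chunk per send
elapsed = time.monotonic() - start
print(f"sent 10 MB in {elapsed:.1f} s")
```

The first chunk goes out immediately from the burst allowance; the remaining 9 MB are paced at 5 MB/s, so the whole transfer takes just under two seconds.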