TitanDMS
Novice
Posts: 4
Liked: never
Joined: Sep 26, 2016 1:25 am
Full Name: Titan DMS
Contact:

Backup Merge Painfully Slow (V9)

Post by TitanDMS »

Hi All,
Just wondering if anyone has had any experience/success backing up large (5.5TB) virtual machines?
In our production environment, we are currently backing up 20 Virtual machines, with all of the corresponding jobs utilizing the forever forward incremental method.
We have no issues with 18 of the 20 machines, as these are all <500GB in size...
That brings me to our other 2 machines (the problem machines). These machines are our SQL servers that are 5.5TB in size, and these are the ones that I'm having issues with.
To give everyone an idea of the backup infrastructure we have set up:
3x VMware hosts, each connected to a 1000Mb switch with 4x connections in link aggregation.
1 VM dedicated to being the backup server (32GB RAM, 4 CPUs)
All NICs are using the VMware VMXNET3 card/driver and everything has jumbo frames enabled (switch, NICs etc.)
The backup device is a Synology 1815+ (running the latest DSM) which is physically connected 4x to the same switch and using link aggregation as well.
The backup server utilises iSCSI to connect to the Synology, which as such appears as a local drive on the backup server virtual machine.

Our issue is not the actual backup time itself, but rather the merge time.
The backup takes roughly 8 hours (which I'm fine with), but the merge (depending on the day) takes anywhere up to 127 hours or more to complete.

I have spoken with a few Veeam techs, and they have advised increasing the memory and CPU on the backup server (which I have done, to reach the point it is at now), to no avail.

Does anyone else out there back up servers this large without issue?

Any assistance would be appreciated.

(Case 01750796)
nmdange
Veteran
Posts: 527
Liked: 142 times
Joined: Aug 20, 2015 9:30 pm
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by nmdange » 1 person likes this post

I have some file server VMs that are that size or larger, and they back up very quickly, but there is minimal change data. I have a separate job specifically for our SQL Server data warehouse VMs. They do a daily ETL from our ERP system, and generate a lot of change data. However, even then the merge only takes 5 hours. 127 hours is way beyond the norm.

How big are your VIB files? What does the job say is the bottleneck? What about the number of disks, speed of the disks and RAID config in the Synology?

Merge time is heavily dependent upon the disk performance of your backup repository. You are not the first person I've seen complain about performance issues using a Synology. It could also be due to the 1Gbps switch. I realize you are using link aggregation, but the thing with link aggregation is that a single TCP stream can't use more than a single link, and depending on the load balancing algorithm, all traffic to a specific IP address might even be pinned to one link. So you may think you are getting 4Gbps by using 4 links, but you may actually still be limited to 1Gbps.
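To illustrate the point about link aggregation, here is a minimal Python sketch of hash-based link selection. It is an assumption-laden toy: real switches and NIC teams use vendor-specific hash functions, `zlib.crc32` only stands in for one, and the IP addresses and source ports are made up. Only port 3260 (the standard iSCSI port) is taken from reality.

```python
import zlib


def pick_link(src_ip, dst_ip, src_port, dst_port, n_links=4, policy="ip"):
    """Return the index of the LAG member link that a flow is pinned to.

    policy="ip"      hashes only the source/destination IP pair
    policy="ip+port" also hashes the TCP ports
    crc32 is a stand-in for the (vendor-specific) hash a real device uses.
    """
    key = f"{src_ip}-{dst_ip}"
    if policy == "ip+port":
        key += f"-{src_port}-{dst_port}"
    return zlib.crc32(key.encode()) % n_links


# One iSCSI session (hypothetical addresses): every packet hashes the same way.
flow = ("10.0.0.10", "10.0.0.20", 51000, 3260)
links_used = {pick_link(*flow, policy="ip+port") for _ in range(1000)}
print("links used by a single TCP stream:", len(links_used))

# Eight sessions between the same two hosts, differing only in source port.
ports = range(51000, 51008)
by_ip = {pick_link("10.0.0.10", "10.0.0.20", p, 3260, policy="ip") for p in ports}
by_port = {pick_link("10.0.0.10", "10.0.0.20", p, 3260, policy="ip+port") for p in ports}
print("links used with IP-only hashing:", len(by_ip))
print("links used with IP+port hashing:", len(by_port))
```

With IP-only hashing, the backup server and the Synology always land on the same member link no matter how many sessions are open, which is why 4 aggregated 1Gbps links can still behave like a single 1Gbps link for this traffic.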
dellock6
Veeam Software
Posts: 6137
Liked: 1928 times
Joined: Jul 26, 2009 3:39 pm
Full Name: Luca Dell'Oca
Location: Varese, Italy
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by dellock6 »

Hi,
we have customers protecting virtual machines 10x bigger than those without issues, but it depends on what you mean by "issues". On storage like the one you listed, I'm not expecting incredible performance, to be honest. Depending on the number of disks you have and the RAID that is configured, you may improve something, but not by something like 5 times.
How many disks do you have in it, what type, and grouped in what RAID set? How much memory is in the Synology itself?
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software

@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
TitanDMS
Novice
Posts: 4
Liked: never
Joined: Sep 26, 2016 1:25 am
Full Name: Titan DMS
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by TitanDMS »

Hi Guys,
The Synology has 8 disks in RAID5 and 4GB RAM.
I'm aware that the performance of this won't be 'great', but I would expect the merge to be much faster.
We don't have any issues with the backup performance (8 hours), but rather just the merge.

The bottleneck usually shows as Target, but sometimes it is the Source. When I had a Veeam tech connect to the machine and investigate, he said the logs didn't really indicate any one thing that stood out to him.
When the merge is happening, the network/memory/CPU on the Synology looks to be pretty much idle, not maxed or spiking anywhere, and the CPU/memory usage on the backup server is pretty low too (<12GB RAM, 20% CPU utilisation).
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by Gostev »

8 disks on RAID5 over 1Gb is not a good candidate for forever-incremental backup. Transforms are about trading performance for disk space, but you don't have any performance to trade, really. You should enable periodic fulls, and consider using ReFS once 9.5 is out.
TitanDMS
Novice
Posts: 4
Liked: never
Joined: Sep 26, 2016 1:25 am
Full Name: Titan DMS
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by TitanDMS »

Understood, however the part that I'm not quite understanding is why I would not be seeing the disk utilization or anything else being sky high whilst the merge is being performed.
I was under the impression that if the disk performance was the holdup in the merge process, I would be seeing 100% disk usage etc. (which I can understand).

To give you the full rundown, I have a merge that is at 85% complete now and still running. These are the stats I'm currently seeing:
Synology CPU - ~3%
Synology RAM - ~13%
Synology NIC - ~8.1MB/s
Synology Disk - ~25%
Synology iSCSI - ~10MB/s

Backup Server CPU - ~10%
Backup Server RAM - 50%

Nothing in that list stands out to me as being the bottleneck I would be expecting to see. Is there something that I'm not fully understanding here?
tsightler
VP, Product Management
Posts: 6009
Liked: 2843 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by tsightler »

It's a little difficult to know, but I'd be particularly interested in the request latency as seen from the Windows repository during the merge. Probably the easiest way to do this on Windows is to use the "Disk" tab in Resource Monitor. Look in the disk activity window and sort it by activity. You should see a VeeamAgent process performing read/write I/O, and the "Response Time" metric is the value I'm looking for there.

A screenshot of the memory usage from Resource Monitor and some information about the backup size and the size of the incremental backups would also be useful.
nmdange
Veteran
Posts: 527
Liked: 142 times
Joined: Aug 20, 2015 9:30 pm
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by nmdange »

I'd look at the following performance counters in Performance Monitor on the backup server for each physical disk that represents a backup repository:
Current Disk Queue Length
Avg. Disk Write Queue Length
Avg. Disk Read Queue Length
Avg. Disk sec/Read
Avg. Disk sec/Write
Disk Read Bytes/sec
Disk Reads/sec
Disk Write Bytes/sec
Disk Writes/sec
Split IO/Sec

These counters will give you a clear picture as to whether the repository is able to keep up with the IO demand from Veeam during the merge. In particular the disk queue length is a really good indication of a disk bottleneck.

Since this is from the perspective of Windows, the bottleneck could be in the iSCSI connection, the Synology disk controller, in the RAID5 overhead, or in the physical disks themselves.
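As a rough way to read those counters together, the sketch below applies Little's law (average items in the system = arrival rate x average time in the system) to a queue-length sample. The sample figures are hypothetical, not measurements from this thread, and the thresholds (about 2 outstanding requests per spindle, about 25 ms per transfer) are common rules of thumb that I am assuming here, not Veeam or Microsoft guidance.

```python
def implied_latency_ms(avg_queue_length, transfers_per_sec):
    """Little's law: average time in system = items in system / throughput."""
    if transfers_per_sec <= 0:
        raise ValueError("transfers_per_sec must be positive")
    return avg_queue_length / transfers_per_sec * 1000.0


def looks_disk_bound(avg_queue_length, avg_sec_per_transfer, spindles,
                     max_queue_per_spindle=2.0, max_latency_sec=0.025):
    """Flag a sample when queue depth AND latency both exceed the assumed limits.

    The two default thresholds are rule-of-thumb assumptions.
    """
    queue_per_spindle = avg_queue_length / spindles
    return (queue_per_spindle > max_queue_per_spindle
            and avg_sec_per_transfer > max_latency_sec)


# Hypothetical sample: queue length 64 while the disk completes 400 transfers/sec.
latency = implied_latency_ms(avg_queue_length=64, transfers_per_sec=400)
print(f"implied latency: {latency:.0f} ms per request")
print("disk bound:", looks_disk_bound(64, latency / 1000.0, spindles=8))
```

A deep queue combined with a modest transfer rate means each request waits a long time, which can happen even when throughput in MB/s and CPU usage look idle.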
TitanDMS
Novice
Posts: 4
Liked: never
Joined: Sep 26, 2016 1:25 am
Full Name: Titan DMS
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by TitanDMS »

Hmm interesting.
Looks like the Read performance is lacking. I'm seeing the average Read queue length hovering around 90+
joechay
Novice
Posts: 3
Liked: 1 time
Joined: Sep 15, 2016 1:09 am
Full Name: Joe Chay
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by joechay » 1 person likes this post

I have recently changed my backup repository from a Dell MD3200i (1Gb iSCSI setup with 1TB 7200 RPM SATA drives in RAID 5) to a Dell PS6210 (10Gb iSCSI setup with 1.2TB 10K SAS drives in RAID 50).

Previously, the merge for my 3.3TB daily incremental backup took 3-4 hours (after the backup completed); it is now down to 1 hour 40 minutes. My most noticeable gain was the daily full offload to tape via an iSCSI bridge, which went from 13-14 hours to 8 hours.

Nothing else changed: my Veeam Backup server is still hosted on a separate ESXi host from the 3 production ESXi hosts, running with only 16GB RAM on older hardware.
sakatam
Lurker
Posts: 1
Liked: never
Joined: Dec 18, 2012 4:51 pm
Full Name: Dave
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by sakatam »

Just my 2 pennies,

Seems to me there is a link somewhere that is dropping jumbo frames. IMO, I would drop jumbo frames (JF) and drop all link aggregation. Start with a simple 1Gb link at MTU 1500 or 1492. From VM to repo, test, record the speed. Add a link, no JF at this point, test, record. Etc... With link aggregation, ensure GOOD cable hygiene: clean, and the same length if possible. In my experience we saw a better merge speed with no JF, about 60/100MB/s over a 2Gb link. QNAP 4 dual 1Gb, over SMB.

Can be a lot of tiny things, like solar flares, earthquakes, shaking of the rack, a loose internal head in a NAS HD, ants, dirt, space aliens.

Backtrace and see where performance stops going forward; it may be a limitation of switch saturation. Port utilization on the switch will also let you know where an issue can be.
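One way to check whether some hop is dropping jumbo frames is a don't-fragment ping sized to exactly fill the MTU. The Python sketch below only computes the payload size and builds the command line; it does not send anything. The host name `synology.example` is a placeholder, and the sizes assume plain IPv4 with a 20-byte header and no options.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def max_icmp_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def df_ping_command(host, mtu, windows=True):
    """Build a ping command with the don't-fragment bit set.

    Windows ping uses -f (don't fragment) and -l (payload size);
    Linux iputils ping uses -M do and -s.
    """
    size = str(max_icmp_payload(mtu))
    if windows:
        return ["ping", "-f", "-l", size, host]
    return ["ping", "-M", "do", "-s", size, host]


for mtu in (1492, 1500, 9000):
    print(mtu, " ".join(df_ping_command("synology.example", mtu)))
```

If the 1500-byte test succeeds but the 9000-byte one fails between the backup server and the NAS, something in the path is not passing jumbo frames end to end.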

Good luck!

Good luck, sorry to jk.
PedroVDP
Service Provider
Posts: 19
Liked: 2 times
Joined: Jan 18, 2012 9:10 am
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by PedroVDP »

Two things that I know for a fact have helped are:

- doing regular compacts. If you're using either forever incremental or incremental with synthetic fulls, your full backup file gets fragmented. By doing either an active full or a compact operation (which can now be scheduled), you can see dramatic time wins in your merges. We've gone to 1 hour, coming down from 23-28 hours, on a particularly large SQL server. Over time the merge time will drift up again, so it's about finding the right interval for your infrastructure.

- using scale-out repositories to separate your fulls from your incrementals. That way even low-cost storage isn't bogged down doing reads and writes on the same hardware when you merge or transform.
newfirewallman
Enthusiast
Posts: 35
Liked: 2 times
Joined: Jan 20, 2015 12:08 pm
Full Name: Blake Forslund
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by newfirewallman » 1 person likes this post

The key is quick storage and utilizing Veeam's scale-out repository. If you put all the fulls on one RAID array and the incrementals on other arrays, disk IO is greatly reduced. Before scale-out we had the same issue, and when you dig into it, slow spinning disk HATES random IO: you can watch the average response time on the array jump to tens of thousands of milliseconds. We back up 35TB (SQL and file servers are the majority, some over 7.5TB). Once we were able to create arrays (spindles) for fulls and arrays (spindles) for incrementals, performance increased exponentially.

Segregate and keep spindles specific where it makes sense, to keep random read/write down, and it will get better.
CatellaFI
Lurker
Posts: 1
Liked: never
Joined: Apr 02, 2015 3:42 pm
Full Name: Mika Melonen
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by CatellaFI »

I have exactly the same Synology device as a home vSphere test system. When I got it this spring I set it up as you have: 4 NICs and iSCSI for ESXi hosts. The performance was always bad with this device, even when I set up MPIO and tweaked all the possible settings I could find (MTU/Jumbo/MPIO/RR with different packet sizes etc.). I never got more than perhaps 20-30MB/s per NIC using iSCSI, and the IOmeter results inside a VM were not promising either, especially IOPS. Then I found threads on the Synology forums about this poor iSCSI performance, at least on ESXi, with suggestions to change to NFS. When I did that, things changed big time: much more IOPS and less lag on VMs, so I could actually run VMs on this. The bad thing of course is the max 1Gb LAN transfer speed, but I never got any better even with 4 NICs and MPIO, and IOPS is so much better. I set up the disks as RAID10 and set 2x 240GB SSDs as read and write cache, and have no problems anymore for a home system (perhaps 5 running Windows servers). I don't know if the iSCSI problem comes with ESXi only; I believe that this particular Synology device simply doesn't do iSCSI well.
sentenza
Novice
Posts: 5
Liked: 3 times
Joined: May 27, 2015 7:41 pm
Full Name: Olivier Tarnus
Contact:

Re: Backup Merge Painfully Slow (V9)

Post by sentenza » 1 person likes this post

Hello,

We're also running quite big jobs with VMs >4TB, on a not so slow storage, and I can only confirm what has already been said: IOPS is the key for backup merge, and you won't see a big throughput in your stats. Looking at the description of your setup, I can only assume your disks are 7200 RPM SATA in one big RAID5 array. This is more or less the worst setup you can get because:

- SATA 7200 RPM is slow, very slow in terms of IOPS.
- RAID5 should always be created in a 4+1 or 8+1 setup, or you just add additional IOPS for any write (you have either 7+1 or 6+1+hot spare)
- The additional IOPS on write will impact your reads, because the disks are simply saturated and you're reaching system limits
- Also consider that no write cache should be enabled at HDD level in a properly configured RAID system, unless you're ready to corrupt your repository data at the first power outage
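To put the IOPS argument in rough numbers, here is a back-of-the-envelope Python sketch using the usual write-penalty model (4 back-end I/Os per random write on RAID5, 2 on RAID10). The 75 IOPS per 7200 RPM disk and the 50/50 read/write mix are assumptions picked for illustration, not measurements from this thread, and controller caches are ignored.

```python
# Back-end I/Os generated by one front-end random write.
WRITE_PENALTY = {"raid5": 4, "raid10": 2}


def effective_iops(disks, iops_per_disk, raid_level, write_fraction):
    """Estimate front-end random IOPS for an array.

    raw IOPS are divided by the average back-end cost of one front-end I/O:
    reads cost 1, writes cost the RAID write penalty.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    raw = disks * iops_per_disk
    penalty = WRITE_PENALTY[raid_level]
    cost_per_io = (1.0 - write_fraction) + penalty * write_fraction
    return raw / cost_per_io


# Assumed figures: 8 disks, ~75 random IOPS each, merge modelled as 50% writes.
raid5 = effective_iops(8, 75, "raid5", write_fraction=0.5)
raid10 = effective_iops(8, 75, "raid10", write_fraction=0.5)
print(f"RAID5 : ~{raid5:.0f} IOPS")
print(f"RAID10: ~{raid10:.0f} IOPS")
```

Under these assumptions the same 8 spindles deliver noticeably fewer usable random IOPS in RAID5 than in RAID10, and either figure is small for a workload that reads and rewrites blocks scattered across a multi-terabyte file.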

Veeam is quite smart, but nothing comes for free, and as already stated you've chosen to trade performance for capacity by choosing forever incremental. You've certainly reached the limits of your disk system in this specific configuration.

PS: I didn't know the separation of fulls and incrementals on different volumes was that beneficial and will definitely look into it, so thanks for the tip ;)