-
- Enthusiast
- Posts: 37
- Liked: 3 times
- Joined: Jun 26, 2019 3:28 pm
- Full Name: Filip Smeets
- Contact:
Backup Copy DataDomain to DD taking longer and longer
Hi
Let me start by explaining the setup:
Backup server and proxies are installed at HQ. Still running on version 9.5.
Proxies are installed at the ROBOs.
A DataDomain is installed at each location for local backups: 49 restore points with weekly synthetic fulls.
A second DataDomain is installed at HQ for Backup Copy of the ROBOs, also with 49 restore points plus GFS (7 weekly).
All the proxy servers are also gateway servers for DDBoost, and all of them are virtual machines. Transport mode: Virtual appliance. (A migration to vSAN is planned.)
WAN links between HQ and the ROBOs are MPLS and mostly poor in bandwidth: 5, 10, 15 or 50 Mbps.
For the copy jobs, we use the proxy/gateway server at the source to leverage DDBoost. This limits the amount of bandwidth needed.
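To put these link speeds in perspective, here is a rough sketch of how long a given amount of data takes to cross each link. The 20 GB figure is a made-up example, not a number from this environment, and the calculation ignores protocol overhead and any reduction from DDBoost.

```python
def transfer_hours(data_gb: float, link_mbps: float) -> float:
    """Hours needed to move data_gb (decimal gigabytes) over a link of link_mbps megabits/s."""
    megabits = data_gb * 8 * 1000  # 1 GB = 8000 megabits (decimal units)
    return megabits / link_mbps / 3600


if __name__ == "__main__":
    example_gb = 20  # hypothetical daily change volume
    for mbps in (5, 10, 15, 50):
        print(f"{mbps:>2} Mbps: {transfer_hours(example_gb, mbps):.1f} h for {example_gb} GB")
```

On the slowest links, even a modest amount of extra traffic per run can eat a large part of a nightly window.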
The solution is implemented by following these best practices:
https://www.veeambp.com/repository_serv ... integrated
https://www.veeambp.com/readme-1/datado ... calability
Everything was running fine until we got around 40 restore points for the backup copy jobs. Backup Copy job runtime increased from 2 hours to 7 hours and even worse for some sites. The amount of changes stayed the same.
There is a support case running with Veeam (# 04149627) and also Dell EMC.
Veeam support explained the behaviour as follows:
Veeam requests Change Block tracking information from all the files in the copy chain. In our case - CBT data is downloaded from all 40+ files. What is more - it happens over WAN.
They suggested lowering the restore points for the copy job to 7 with 7 weekly (GFS) to shorten the chain, and upgrading to V10, where reading CBT from copy jobs should be a bit faster.
They also suggested implementing a gateway server at the target, but this would negate the bandwidth savings of DDBoost, which is why we went for this setup in the first place.
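If that explanation is right, the extra time per run should grow roughly linearly with the number of files in the chain. A toy model of that effect follows. The per-file metadata size (50 MB) is purely an assumption for illustration, since the real figure is not known here, and latency and random-read penalties are ignored.

```python
def metadata_read_minutes(chain_files: int, metadata_mb_per_file: float, link_mbps: float) -> float:
    """Minutes spent pulling per-file metadata across the WAN, under a simple linear model."""
    total_megabits = chain_files * metadata_mb_per_file * 8
    return total_megabits / link_mbps / 60


if __name__ == "__main__":
    assumed_mb_per_file = 50  # hypothetical value, not measured
    for files in (7, 14, 40, 49):
        minutes = metadata_read_minutes(files, assumed_mb_per_file, link_mbps=10)
        print(f"{files:>2} files in chain: {minutes:.1f} min on a 10 Mbps link")
```

Under this model, shortening the chain from 40 to 7 files cuts the metadata overhead by the same ratio, which matches the direction of the advice from support.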
Dell EMC support saw the same behaviour in their logs. Their response:
It is a well known issue that VEEAM does not implement MFR for DDBOOST.
It is also well known that the way VEEAM reads the images is slow. VEEAM performs random readings, which kills the performance. On our DDBOOST integration guide we say in an explicit way that the readings must be sequential to leverage on the cache and speed up readings that can be provided by DD.
Is anybody else running a similar setup or experiencing the same behaviour?
Any solutions?
Thx.
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
Hi Filip, did the bottleneck stats change when the job started to run slower? The recommendations from our engineers make sense to me. Even though DDBoost works fine over WAN, I would at least try a gateway for the target DataDomain located closer to the device. In this case, data will flow compressed over the WAN and then be decompressed on the target prior to being written to the storage (make sure the corresponding setting is enabled in the repository settings).
-
- Enthusiast
- Posts: 37
- Liked: 3 times
- Joined: Jun 26, 2019 3:28 pm
- Full Name: Filip Smeets
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
Hi Alexander. Thanks for taking the time to reply.
I can only see the bottleneck stats for the latest run, which show Throttling. That's probably because we have configured a blackout window during business hours. The percentages, with 96% for network, also make sense as we are going over a slow WAN link.
Is there a way to look up the bottleneck for past runs?
We did run a test with a gateway at the target DataDomain. It ran faster but used more bandwidth. This would work for some sites where there is enough bandwidth, but definitely not for all.
Doing it like this would also make the investment in the DataDomains a bit useless. We chose this solution because it would minimize the needed bandwidth via DDBoost. We already had to upgrade several WAN links before we could start, so asking for more is not really an option now.
Do you think adding WAN accelerators would give the same benefits?
Thx.
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
Throttling being the bottleneck means that either you have network rules configured between the servers running source and target data movers or repository read and write data rates are limited in the repository wizard. Is either of these the case? To switch to the previous job runs, use the arrow keys in the job stats window.
WAN acceleration is worth a try on such a slow link; it could shift the load to the gateway servers.
-
- Enthusiast
- Posts: 37
- Liked: 3 times
- Joined: Jun 26, 2019 3:28 pm
- Full Name: Filip Smeets
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
Just wanted to let you know that we have switched our Backup Copy Job method from synthetic to active fulls by enabling 'Read the entire restore point from source instead of synthesizing it from increments'.
Explained here: https://helpcenter.veeam.com/archive/ba ... modes.html
We are seeing some positive results but it's too soon to know for sure. I will report back after a week or so.
Interestingly, we asked Veeam support about this and they discouraged it. The documentation also states that this is not recommended for Data Domains. But as we need so many restore points, it would break up our backup chain and could have a positive impact.
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
In this case, much more data goes across the link. If that is not an issue and it gives a performance increase, it is OK not to follow the best practices. However, I'm a bit confused by this, considering the bottleneck stats you've mentioned above.
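A simplified sketch of why the data volume changes: with synthetic fulls only increments are read from the source each week, while an active full re-reads the whole restore point. The sizes below are hypothetical, and the model counts data read at the source before any DDBoost source-side reduction, so the actual WAN traffic may be lower.

```python
def weekly_source_read_gb(full_gb: float, daily_incr_gb: float, active_full: bool) -> float:
    """GB read from the source per week under a simplified one-full-per-week model."""
    if active_full:
        # Six incremental days plus one day that reads the entire restore point.
        return 6 * daily_incr_gb + full_gb
    # Synthetic full: seven incremental days, the full is built on the target.
    return 7 * daily_incr_gb


if __name__ == "__main__":
    full_gb, incr_gb = 500, 20  # hypothetical sizes
    print("synthetic:", weekly_source_read_gb(full_gb, incr_gb, active_full=False), "GB")
    print("active:   ", weekly_source_read_gb(full_gb, incr_gb, active_full=True), "GB")
```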
-
- Expert
- Posts: 128
- Liked: 14 times
- Joined: Jul 02, 2010 2:57 pm
- Full Name: Chad
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
When we were using NetBackup, we had minimal issues writing to the primary DD, and the replicated data on the secondary DD was always in sync. Now that we are using Veeam, I can't say the same thing. Writing to the primary DD takes longer than before, and we gave up altogether on replicating to the secondary DD.
-
- Enthusiast
- Posts: 37
- Liked: 3 times
- Joined: Jun 26, 2019 3:28 pm
- Full Name: Filip Smeets
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
We are not experiencing any problems with our local backups.
We did notice that the long runtimes for the copy jobs are always caused by VMs with SQL on them. We even opened a case with Veeam for this issue: #04130276
They said it was caused by DDBoost not being able to handle the compressed SQL backups on the VM. So we disabled compression on the SQL backups, which did reduce the backup window, but still not enough.
We then moved the SQL backups to Veeam with log shipping, but we still can't complete our backups within a 6-hour window.
I'm wondering if there are more people experiencing this with SQL servers and if there are some optimizations we can implement.
Thanks.
-
- Expert
- Posts: 128
- Liked: 14 times
- Joined: Jul 02, 2010 2:57 pm
- Full Name: Chad
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
We use a share off the Data Domain to hold all our SQL dumps (uncompressed). We do not quiesce our vm guests when doing backups either. We have a separate network for use with the backup traffic that is not routed. We are using DDBOOST well.
-
- Enthusiast
- Posts: 37
- Liked: 3 times
- Joined: Jun 26, 2019 3:28 pm
- Full Name: Filip Smeets
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
https://photos.app.goo.gl/xGjx33o1S6hq8FjY6
I noticed the Backup Copy Job for some disks is running very slow compared to others. What could be the cause of this? And how can I troubleshoot this?
-
- Chief Product Officer
- Posts: 31816
- Liked: 7302 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
Perhaps the "fast" ones have large contiguous segments of free space?
-
- Enthusiast
- Posts: 37
- Liked: 3 times
- Joined: Jun 26, 2019 3:28 pm
- Full Name: Filip Smeets
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
Wouldn't I expect this to have more of an effect on the normal backup job? It would also mean that this free space is in our normal backup job, as the copy job gets its data from the normal backup job.
-
- Enthusiast
- Posts: 37
- Liked: 3 times
- Joined: Jun 26, 2019 3:28 pm
- Full Name: Filip Smeets
- Contact:
Re: Backup Copy DataDomain to DD taking longer and longer
We came to an agreement with the client to lower the number of restore points to 7 with 4 weekly GFS points. We therefore also switched back to synthetic fulls. All SQL server backups have now also been migrated to Veeam.
Implementing all these changes brought us back to a respectable backup window, except for some sites. We are going to try to solve those with the network team.
The ability to define a gateway server at the source site for the actual backup copy and another gateway server at the target location solely for the merge operations would have been a nice feature. The merge operations were almost 5x faster when using a gateway at the target site.
Just a small remark, as I see I forgot to mention it in the initial post: we are also using SilverPeak on the SD-WAN links. That is another reason why we didn't look into the Veeam WAN accelerators.