DataDomain replication performance issue

by ferrus » Fri Jul 15, 2016 2:21 pm

I've been using a pair of EMC DD2500 units + DDboost, since Veeam was installed last year.

Backup Copy jobs are saved to the first device, then DD replication (not Veeam) is used to create another copy on the second DD2500 at a remote site.
This has been working well since installation. There's often a few TB (pre-compression) remaining when I check on a morning, but this is usually cleared by the afternoon.
Occasionally, the replication lag spans >24 hours before clearing.

For over a week now, the size of the replication set has been building up and up - and shows no sign of reducing.
Currently it stands at >400TB (pre-compression).

As the replication is a purely DD -> DD operation, with no Veeam involvement - I wouldn't usually post here.
But the sudden drop on replication performance occurred immediately after our Veeam v8 -> v9 upgrade.

So before opening a support call with EMC, I was just wondering if anyone had noticed similar behaviour after their upgrade.
I've noticed some settings that look a little different in the backup job configurations, eg:

"Local Target (legacy 8MB block size)" Legacy storage optimization setting left by upgrade process. Please switch to another setting, and initiate Active Full

Could there be a change in the way the backups are stored on the primary DD device post v9, which would affect replication to the other?
ferrus
Veeam ProPartner
 
Posts: 137
Liked: 20 times
Joined: Thu Dec 03, 2015 3:41 pm
Location: UK

Re: DataDomain replication performance issue

by ferrus » Fri Jul 15, 2016 3:23 pm

This graph - from the DD system manager, shows the jump in size after v9 was installed

Image

I could understand a one day jump - with the post-upgrade test backups we took, but that seems to be consistently bigger.

Re: DataDomain replication performance issue

by adb98 » Thu Jul 21, 2016 5:12 pm

Thought I might throw some hints your way. The first one is to check your NIC speeds on the DD. Ensure that you are running at full 1Gb or 10Gb if you're a lucky bastard. I found that my replication started falling behind to the point it would never have synced, and it was due to the NICs being bonded and stuck at 100Mb.

The other thought is to make sure you are following the DD best practice guide. A few things have changed in Veeam 9. Also check your historical stats and see if you are still getting good dedup rates and compression. If these are going down, it more than likely means a bad setting in Veeam, which in turn causes more data to have to be sent.
https://www.veeam.com/kb1956

Lastly, on the settings note, I wonder if using "Use per-VM Backup file" may be causing a difference. This makes a backup file for each VM when it's backed up. Shouldn't be an issue, but you might want to experiment with that.

Hope this helps a little.
adb98
Influencer
 
Posts: 18
Liked: 1 time
Joined: Thu Jul 21, 2016 5:03 pm
Full Name: Aaron B

Re: DataDomain replication performance issue

by ferrus » Fri Jul 22, 2016 11:46 am

Thanks for the reply.

I'm pretty sure we're following most of the best practice. We haven't switched on the per-VM backup file option yet, although it's something I'd like to do once this issue is resolved.
The only setting I'm not sure of is the number of concurrent tasks for the DD repos.
EMC recommend setting it to half of the maximum your DD model can deliver - but I can't find any published values of what that should be for a DD2500.

NIC speed is set to 1Gbps, but our network team mention the DDboost replication was peaking at around 100Mbps over the WAN - and this amount hasn't changed before or after the backlog appeared.

For info, I've opened support calls with both Veeam and EMC.
Veeam support pointed to the resultant size of the Backup Copy Jobs, which appear not to have changed since the upgrade.
On the DD, the post-compression sizes are also similar either side of the upgrade.

The only thing to have changed is the pre-compression (and replication size), when it's first written to the disk.
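
For anyone comparing the figures, here is a rough Python sketch of how the compression factor and space savings follow from the pre- and post-compression sizes. The function names and the sample sizes are made up for illustration and are not taken from the DD system manager. It shows why a jump in pre-compression size with a flat post-compression size pushes the reported compression factor up.

```python
def compression_factor(pre_comp_tb: float, post_comp_tb: float) -> float:
    """Ratio of logical (pre-compression) size to physical (post-compression) size."""
    if pre_comp_tb < 0 or post_comp_tb <= 0:
        raise ValueError("pre_comp_tb must be >= 0 and post_comp_tb must be > 0")
    return pre_comp_tb / post_comp_tb


def space_savings_pct(pre_comp_tb: float, post_comp_tb: float) -> float:
    """Percentage of the logical data that needed no physical storage."""
    if pre_comp_tb <= 0 or post_comp_tb < 0:
        raise ValueError("pre_comp_tb must be > 0 and post_comp_tb must be >= 0")
    return (1.0 - post_comp_tb / pre_comp_tb) * 100.0


# Hypothetical sizes: post-compression stays flat while pre-compression doubles.
before = compression_factor(100.0, 10.0)
after = compression_factor(200.0, 10.0)
```

With these made-up numbers the factor doubles even though the physical footprint is unchanged, which is the pattern described above.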

One interesting point to note, is that EMC only support DD OS v5.6 - v5.7 with Veeam v9. Veeam support permits a wider OS range - v5.4 - v5.7.
So our next step is to upgrade to v5.6 on the DDs, which may align better with the DDboost v3 compatibility of Veeam v9.
At the very least, it will allow us to receive further EMC support.

Re: DataDomain replication performance issue

by nefes » Fri Jul 22, 2016 2:39 pm

Could you please check whether Compact Full is scheduled for your backup copy jobs?
If not, the full backup will grow over time, thus increasing replication time. (It looks like this is the reason why your Pre-Comp size increased.)

Be aware that Compact is a time-consuming operation and puts a certain load on your device, so it should not be scheduled for the same time for all your backup copy jobs.
nefes
Veeam Software
 
Posts: 543
Liked: 128 times
Joined: Mon Dec 10, 2012 8:44 am
Full Name: Nikita Efes

Re: DataDomain replication performance issue

by ferrus » Fri Jul 22, 2016 3:18 pm

Thanks for the reply.

The compact full backup option is greyed out, with an alert:

"Maintenance is not required when periodic full backups are enabled"

We use a split strategy: 30 RP Forever-Incremental on Tier 1 / GFS 7 days, 4 weeks, 18 months on Tier 2 DD.

Going to upgrade the DD OS tomorrow.

Interestingly, while investigating the settings - I found a couple of backup copy jobs that had been disabled - because of a sync issue with the DD repository.
It recommended doing a manual Repository Rescan, then re-enabling the job.
I've done that, and everything is OK again. Don't know if that would cause any write issues.

Re: DataDomain replication performance issue

by rreed » Fri Jul 22, 2016 7:59 pm

Block size changed a bit w/ v9, w/ what size are you writing? In v8 we noticed cranking them up ("1TB Larger Files") yielded much faster restores; I've left it that way in v9 though I think the block size was halved. I would also recommend enabling per-VM chains, for a lot of reasons. One I think is that dedupe devices might be able to better dedupe blocks of individual VM files vs. large consolidated files. I've left on Veeam dedupe and set compression to none in my jobs. Overall, in v8 we saw real-world savings of ~86-88% w/ our DD's and Dell DR's. After v9 and all our v8 restore points purged away we've seen ~90% space savings on both devices. I disabled limiting concurrent tasks since DD's can handle I think around 150-300 or so concurrent connections - you'd have to enable per-VM chains, have a lot of concurrent jobs, and a lot of VM's firing off all at once to saturate that. Default in Veeam I think is four?

Have your network guys monitor and report DD --> DD traffic, it should be saturating your WAN link if it's that far behind. If traffic is slim to none, make sure your DD replication throttling didn't get turned on, your network guys didn't accidentally QOS down your replication traffic, etc.
VMware 6
Veeam B&R v9
Dell DR4100's
EMC DD2200's
EMC DD620's
Dell TL2000 via PE430 (SAS)
rreed
Expert
 
Posts: 354
Liked: 72 times
Joined: Tue Jun 30, 2015 6:06 pm

Re: DataDomain replication performance issue

by barresi » Fri Jul 29, 2016 6:30 am

Hello,
what is the solution for this? The OS update for the DD?
Regards barresi
barresi
Lurker
 
Posts: 2
Liked: never
Joined: Tue Nov 22, 2011 2:52 pm
Full Name: Matthias Barre

Re: DataDomain replication performance issue

by rreed » Fri Jul 29, 2016 1:42 pm

Your graph is showing that you're writing a LOT of dedupable/compressible data to the DD, might check your job settings to make sure Veeam is doing some of the dedupe and compression work (Enable inline deduplication, Compression level in Job -> Advanced -> Storage tab). Storage optimization might have an effect, but I set mine to largest size to help w/ restores, and it certainly seems to. Smaller block size yielded EXCRUCIATINGLY slow restore times, once I cranked it up they became usable. That was v8, I haven't experimented in v9 to see what difference block size makes. Ours is all still working well, our dedupe devices keep up, compress well, restores are acceptable, so I try to leave well enough alone.

Try having Veeam take off some of the load if you aren't already, and I can confirm when two DD's suddenly have some data to replicate, they will drive up their link to each other. Ours saturates a 500Mb link after a backup job has run.

Re: DataDomain replication performance issue

by ferrus » Tue Aug 23, 2016 8:17 am

Sorry - I've been away for a few weeks and forgot about this thread.
This issue still isn't resolved, and I have quite a bit to post later.

Before then - could someone give a quick answer to the following:

If you have Direct Attached Storage for your Primary Backups, and from there take Backup Copy Jobs to Data Domains - which Storage Optimization is recommended?
I've found advice for Local Target 16TB+ for Dedup appliances, and Local Target for DAS - but they both have to be the same for Backup Copy Jobs. So which is the preferred option?

Re: DataDomain replication performance issue

by foggy » Tue Aug 23, 2016 12:58 pm

The recommendation in this case is to have the 'Local Target 16TB+' setting in the original job, since backup copy will then use the same block size to store data on the dedupe device. This is, however, not a "hard" recommendation: a smaller block (the 'Local target' setting) will result in a slightly slower backup, but some types of restores will be faster, especially in v9.5.
foggy
Veeam Software
 
Posts: 15078
Liked: 1110 times
Joined: Mon Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

Re: DataDomain replication performance issue

by victorB » Thu Oct 27, 2016 4:28 pm

Hi, I was wondering if a solution was found to this problem, as we have been experiencing the exact same issues straight after our environment was upgraded from v8 to v9, while also using a pair of EMC DD2500 units + DDboost.
victorB
Lurker
 
Posts: 2
Liked: never
Joined: Thu Oct 27, 2016 4:21 pm
Full Name: Victor Brown

Re: DataDomain replication performance issue

by ferrus » Fri Oct 28, 2016 1:04 pm

I'm away from the office this week - but just noticed an e-mail from EMC support attempting to close the original support call - for the second time.
I'm absolutely no further on this.

We've raised the WAN link speed from 10Mbps to 1Gbps - and that seemed to provide a temporary speed burst, but the comms link is nowhere near saturated and we're back to having 100-200TB backlog.
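
As a back-of-envelope check, the sketch below estimates how long a link would take to drain a backlog. It is only a rough Python model: the function name, the reduction factor (how much smaller the data on the wire is than the pre-compression figure) and the utilisation value are assumptions for illustration, not numbers from the DD or from EMC.

```python
def days_to_drain(backlog_precomp_tb: float,
                  reduction_factor: float,
                  link_mbps: float,
                  utilisation: float = 1.0) -> float:
    """Estimate days needed to send a replication backlog over a link.

    backlog_precomp_tb: backlog in decimal terabytes, pre-compression
    reduction_factor:   assumed ratio of pre-compression size to bytes sent
    link_mbps:          link speed in megabits per second
    utilisation:        assumed fraction of the link replication can use (0-1]
    """
    if backlog_precomp_tb < 0 or reduction_factor <= 0 or link_mbps <= 0:
        raise ValueError("backlog must be >= 0; factor and link speed must be > 0")
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    wire_bytes = backlog_precomp_tb * 1e12 / reduction_factor
    bytes_per_second = link_mbps * 1e6 / 8 * utilisation
    return wire_bytes / bytes_per_second / 86400  # 86400 seconds per day


# Hypothetical case: 150 TB pre-compression backlog, assumed 10x reduction, 1 Gbps link.
estimate = days_to_drain(150.0, 10.0, 1000.0)
```

If an estimate like this comes out at a day or two while the real backlog never clears, the link itself is unlikely to be the limiting factor.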

The support call went round a few different EMC engineers and departments - but they kept insisting it was a replication issue only, and suggesting tweaks, when I see the issue as before that - and the replication a symptom.
The day after the Veeam v8 to v9 upgrade the pre-compression data (and resulting compression rate) rocketed up, and hasn't come down since. If I turned off replication altogether - I can't see those values changing.

They've provided me with graphs showing gentle linear increases of various counters over the last year - while consistently ignoring the graph above with the massive one day change.

I couldn't keep the configuration static for all this time - so in the last few weeks I've started rearranging our Veeam backups and amended some of the configs. We now have per-VM backups, compression, in-line dedup, a smaller block size, fewer higher density jobs. None seem to have made a difference to the DD performance - they're more to yield better Tier1 DAS efficiency.

If you find anything out - I'd be grateful if you let me know, as we're no further forward. The Veeam 9 upgrade has been successful in every other way, but the DD2500 performance has worsened dramatically.

Re: DataDomain replication performance issue

by victorB » Fri Oct 28, 2016 4:21 pm

Hi, I have been informed by EMC support that there is a problem with Data Domain replication using Veeam version 9. There has been a change in Veeam version 9 in the way it keeps base file relationships, which breaks the Data Domain VSR (Virtual Synthetic Replication) capability; it is essentially turned off during Data Domain replication. This means that Veeam backups will take longer to replicate from version 9 onwards (on DD).

I have also been informed that EMC have investigated this behaviour and their conclusion is that this needs to be corrected on the Veeam software side and not in the DDOS code.

The only recommendation to counter this has been to create additional MTrees and create replication contexts for each additional MTree created. I have done this and it has helped but not completely resolved the problem; we still suffer from replication lag, but nowhere near the levels we were experiencing.

Hope this helps.

Re: DataDomain replication performance issue

by ferrus » Fri Oct 28, 2016 4:53 pm

Victor - that's a great help.
Almost none of the above has been passed on to us from EMC support - apart from a comment in the support call close request a couple of days ago, that we might want to consider using additional MTree's - to allow more replication streams.
That's very useful information.
Do you know if EMC have published a KB article on the issue?

Anyone from Veeam like to comment on the issue? I suppose it's too late for v9.5 ....