-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
New Dell Compellent = slower replication?
We recently installed new Dell Compellent disk storage systems at both our main and secondary sites. Prior to Compellent, we had been using an old HP MSA2000, with each ESXi host directly connected via SAS cables. Now with Compellent, we have a dedicated 10 Gb switch at each site, through which all the local ESXi hosts communicate with the site's new Dell storage via iSCSI. We still have the HP MSA installed as well for the time being.
One of the main VMs that we replicate from the main to the secondary site is our file server. The file server is around 3 TB total. Shortly after the Compellent installation, we migrated the file server's replica in the secondary site from the HP MSA to the Dell storage.
Ever since we did this, we have noticed that the time replication takes to complete has increased. Looking at the bottleneck history, I can see that the target percentage has increased by 13 - 25 percentage points. All the other pieces of the bottleneck chain remain relatively the same. The communication between the sites has not changed, which is reflected by the consistent 27% (+-2%) in the bottleneck logs. The performance rate logs from EM show that before Dell, it was around 13 MB/s. Now we are getting around 6 MB/s.
We have confirmed that the replication does not overlap with the Compellent Data Progression or its built-in snapshot schedule. I did discover that the Server OS preference setting on each of the Dell systems was set to Other Single Path. I have since changed them to VMware ESXi 6.0. No change. Compellent does not have deduplication as far as I'm aware either.
Please advise, and thanks!
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: New Dell Compellent = slower replication?
Am I getting it right that the primary bottleneck for this replication job is the target? What transport modes are used by the source and target proxy servers?
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: New Dell Compellent = slower replication?
foggy wrote: ...the primary bottleneck for this replication job is target?
Actually, the bottleneck seems to ping-pong between the source and the target now. The source has always been 77% (+-, of course) and the target was 50% - 55%. Now the target is 68% - 80%.
foggy wrote: What transport modes are used by the source and target proxy servers?
Not certain I understand what you are asking.
Compression level = Optimal (recommended)
Exclude swap file blocks = checked
Enable VMware Tools quiescence = unchecked
Use changed block tracking data = checked
Enable CBT for all protected VMs automatically = checked
Data Transfer = Direct (only option available)
Target Proxy = Automatic selection (we do have a proxy setup at each site)
Does the above answer your question?
Thanks
-
- Veteran
- Posts: 370
- Liked: 97 times
- Joined: Dec 13, 2015 11:33 pm
- Contact:
Re: New Dell Compellent = slower replication?
You talk about pathing. Have you set the ESX hosts to use round robin for the new iSCSI LUNs?
Jumbo frames setup for the iSCSI traffic?
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: New Dell Compellent = slower replication?
I was talking about the transport mode used to populate the target datastore. You can look it up in the job session window, if you select the particular VM in the left pane and locate the proxy server selected for processing to the right.
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: New Dell Compellent = slower replication?
DaveWatkins wrote: You talk about pathing, have you set the ESX hosts to use round robin for the new iSCSI LUNs? Jumbo frames setup for the iSCSI traffic?
Looking at the "Path Selection Policy", it looks like they are set to "Most Recently Used" and not "Round Robin".
I'm also seeing that the vSwitches are set for 9000 MTU. However, we must have forgotten to adjust the NICs themselves, since I see they are still set to 1500 MTU.
Is it OK to make these types of changes on a live system, or do I need to vMotion VMs off, reboot hosts, etc.?
Thanks
PS: We are still pretty new to iSCSI, coming from a direct SAS connection environment.
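To illustrate why the mismatch above matters: jumbo frames only help if every hop on the iSCSI path accepts them, so the usable frame size is the smallest MTU along the path. Below is a minimal Python sketch of that rule. The hop names are hypothetical labels, and only the two values mentioned in this post (9000 on the vSwitch, 1500 on the NIC) are used; the physical switch and array port settings are not stated in this thread.

```python
def effective_mtu(hop_mtus):
    """Return the largest frame size that every hop on the path can carry."""
    if not hop_mtus:
        raise ValueError("need at least one hop")
    # A frame must fit through every hop, so the smallest MTU wins.
    return min(hop_mtus.values())


# Hypothetical hop labels; the values are the ones reported in the post.
path = {"vswitch": 9000, "nic": 1500}

mtu = effective_mtu(path)
print(f"effective MTU: {mtu}")
print(f"jumbo frames usable end to end: {mtu >= 9000}")
```

With the values from the post, the path still behaves like a standard 1500-byte path until the remaining hop is raised to 9000.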
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: New Dell Compellent = slower replication?
foggy wrote: I was talking about the transport mode used to populate the target datastore. You can look it up in the job session window, if you select the particular VM in the left pane and locate the proxy server selected for processing to the right.
The Transport Mode for the Proxy is "Automatic Selection".
Should it be something else?
Thanks
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: New Dell Compellent = slower replication?
I mean the transport mode effectively selected by the proxy server during VM processing. You can look it up in the job session log, as I've described above.
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: New Dell Compellent = slower replication?
Sorry foggy, I'm not sure I'm finding what you are asking for.
Here's where I'm going:
- Backup and Replication
Jobs
Then select the job on the right
I then select the VM on the left, as described, in the bottom pane.
- Replicating restore point....
Queued for processing....
Required Backup infrastructure resources have been assigned
VM processing started...
VM size...
Discovering replica VM
Preparing replica VM
Processing configuration
Creating helper snapshot
Using target proxy <name> for disk Hard disk 2 [hotadd]
Hard disk 2 ...read at 5 MB/s
Using target proxy <name> for disk Hard disk 1 [hotadd]
--- Continues this for all the disks for the VM ---
Deleting helper snapshot
Finalizing
Busy: Source 76% > Proxy.....
Primary bottle neck....
Network traffic verification detected no corrupted blocks
Process finished...
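The transport mode foggy is asking about is the bracketed tag at the end of each "Using target proxy" line in the listing above. As a rough sketch, assuming the session log has been copied out as plain text in the shape shown in this post, a few lines of Python can pull out the mode per disk. The function name and regular expression are illustrative only and are not part of any Veeam tooling.

```python
import re

# Matches lines such as:
#   Using target proxy <name> for disk Hard disk 2 [hotadd]
PROXY_LINE = re.compile(
    r"Using target proxy (?P<proxy>.+?) for disk (?P<disk>.+?) \[(?P<mode>\w+)\]"
)


def transport_modes(log_text):
    """Map each disk label to the transport mode shown in square brackets."""
    return {m.group("disk"): m.group("mode") for m in PROXY_LINE.finditer(log_text)}


sample = """Creating helper snapshot
Using target proxy <name> for disk Hard disk 2 [hotadd]
Hard disk 2 ...read at 5 MB/s
Using target proxy <name> for disk Hard disk 1 [hotadd]
Deleting helper snapshot
"""

for disk, mode in transport_modes(sample).items():
    print(f"{disk}: {mode}")
```

For the excerpt in this post, both disks report hotadd.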
-
- Veeam Software
- Posts: 21139
- Liked: 2141 times
- Joined: Jul 11, 2011 10:22 am
- Full Name: Alexander Fogelson
- Contact:
Re: New Dell Compellent = slower replication?
Hotadd is the transport mode in this case. If the target is the primary bottleneck, it looks like the target storage is the issue.
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: New Dell Compellent = slower replication?
I have a couple of follow-up questions on this topic.
1. What does "Processing rate" actually mean? I see that the Throughput Speed can be quite a bit faster than the processing rate.
2. When there is an incremental replication going on, is there a lot of random searches and writes at the destination? Mostly curious if there is a lot of comparing going on between the latest source and what's at the destination.
3. What is the path of the data during offsite replication? Does it go through vCenter at all? My assumption is Source -> local Veeam Server -> WAN -> Offsite Proxy -> Destination.
Thanks
Brendan
-
- Veteran
- Posts: 1943
- Liked: 247 times
- Joined: Dec 01, 2016 3:49 pm
- Full Name: Dmitry Grinev
- Location: St.Petersburg
- Contact:
Re: New Dell Compellent = slower replication?
Hi,
1. I'd recommend reviewing the thread called "Interpreting real-time statistics", where you'll find detailed descriptions of every statistic used by jobs.
2. All data changes since the last job run are written to the snapshot delta file, and the snapshot delta file acts as a restore point. You'll find more details in "Replication chain".
3. The off-site replication data flow looks like this: Source -> Source Proxy -> WAN -> Offsite Proxy -> Destination. You can also see it in the User Guide article "Replication Scenarios". Thanks!
-
- Expert
- Posts: 164
- Liked: 9 times
- Joined: Jan 28, 2014 5:41 pm
- Contact:
Re: New Dell Compellent = slower replication?
After a long process of speaking with Dell and Veeam on this issue, we have FINALLY got the problem resolved.
The target proxy transport mode was set to "Automatic". When the replication ran, it would choose HotAdd. When we changed the transport mode to "Network", the job showed NBD instead of HotAdd. The result? HotAdd was getting between 4 - 10 MB/s. NBD is getting 34 - 50 MB/s! Huge improvement.
Thought I'd share our findings in case others are noticing similar throughput issues with replication.
Thanks
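To put those rates in perspective, here is a small Python sketch that converts a transfer rate into wall-clock time. The 100 GB change set is a made-up figure for illustration, not a number from our jobs, and it treats the reported MB/s values as sustained throughput, which is a simplification.

```python
def hours_to_transfer(size_gb, rate_mb_s):
    """Hours needed to move size_gb gigabytes at rate_mb_s megabytes per second."""
    if rate_mb_s <= 0:
        raise ValueError("rate must be positive")
    # 1 GB is taken as 1024 MB here.
    return size_gb * 1024 / rate_mb_s / 3600


changed_gb = 100  # hypothetical amount of changed data per run

# Rate ranges are the ones quoted in this post.
for mode, rates in {"HotAdd": (4, 10), "NBD": (34, 50)}.items():
    slow, fast = (hours_to_transfer(changed_gb, r) for r in rates)
    print(f"{mode}: {fast:.1f} to {slow:.1f} hours")
```

The same function can be used with the 13 MB/s and 6 MB/s figures from the first post to compare the old and new storage.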
-
- Veteran
- Posts: 1943
- Liked: 247 times
- Joined: Dec 01, 2016 3:49 pm
- Full Name: Dmitry Grinev
- Location: St.Petersburg
- Contact:
Re: New Dell Compellent = slower replication?
Hi B.F.,
Thank you for following up; that could be useful for future readers.
Also, I would recommend reading this post by Tom, which explains in depth the difference between the Hotadd and Network mode processes. Thanks!
-
- Lurker
- Posts: 1
- Liked: never
- Joined: Feb 06, 2018 9:32 pm
- Full Name: Paul Mawdsley
- Contact:
Re: New Dell Compellent = slower replication?
Quote: Looking at the "Path Selection Policy", it looks like they are set for "Most Recently Used" and not "Round Robin".
Just a note (and an apology for resurrecting a thread...)
We have experienced some VERY serious outages with the MPIO set to "Most Recently Used" as is default. I would recommend, if not already done, you set this to "Round Robin" ASAP. We have ... NOW Please contact Dell support if unsure.
It was a very annoying day off for me when an MRU pathed volume went offline with ~50% of our servers on it... Took 3 days to get SharePoint behaving again.
-
- Enthusiast
- Posts: 80
- Liked: 7 times
- Joined: Aug 11, 2015 9:10 am
- Full Name: Bilal AHmed
- Contact:
Re: New Dell Compellent = slower replication?
As far as I am aware, the PSP for Compellent is RR, not MRU. The newer versions of SCOS show as SATP_ALUA in ESXi, and this has a default PSP of MRU. Compellent support and their docs recommend that you change the PSP for ALUA to RR to be fully supported.
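For anyone wanting to script the change, the sketch below only builds the esxcli command lines as strings so they can be reviewed before anything is run on a host. It assumes the SATP appears under its ESXi name VMW_SATP_ALUA and that round robin is VMW_PSP_RR; the device identifier is a placeholder, not a real volume. As I understand it, changing the SATP default affects devices claimed afterwards, so existing volumes are listed per device. Check the result against Dell's best-practice document for your SCOS and ESXi versions before applying it.

```python
def psp_commands(device_ids, satp="VMW_SATP_ALUA", psp="VMW_PSP_RR"):
    """Build (but do not run) esxcli commands that switch the path policy."""
    # Default policy for devices claimed by this SATP from now on.
    cmds = [f"esxcli storage nmp satp set --satp={satp} --default-psp={psp}"]
    # Explicit per-device change for volumes that are already presented.
    cmds += [
        f"esxcli storage nmp device set --device={dev} --psp={psp}"
        for dev in device_ids
    ]
    return cmds


# Placeholder identifier; real ones come from `esxcli storage nmp device list`.
for cmd in psp_commands(["naa.EXAMPLE0001"]):
    print(cmd)
```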