barchas
Enthusiast
Posts: 26
Liked: 1 time
Joined: Dec 13, 2011 6:43 pm
Contact:

Slow replication v6 even locally?

Post by barchas »

I am running a very simple lab test to evaluate v6 (not yet a customer).
Because of extreme slowness between hosts, I have cut my test down to this:
1 ESXi 5 host.
1 Windows Server 2008 R2 VM which runs Veeam (4 GB, 4 CPU, vmxnet3 adapter).
1 Windows 7 VM (17.5 GB).

The target storage is local, as is the VM storage.

I am doing nothing at all on the Windows 7 VM; it is just sitting idle.
I set up a replication job to run continuously.

The first run of course takes a while, about 45 minutes.
But the follow-up runs are taking about 30 minutes each.
This is the history from the most recent completed replication run.

12/13/2011 9:55:26 AM :: Job started at 12/13/2011 9:55:17 AM
12/13/2011 9:55:26 AM :: Building VM list
12/13/2011 9:55:42 AM :: VM size: 50.0 GB (17.4 GB used)
12/13/2011 9:55:42 AM :: Changed block tracking is enabled
12/13/2011 9:55:45 AM :: Preparing next VM for processing
12/13/2011 9:55:45 AM :: Processing 'win7'
12/13/2011 10:25:41 AM :: All VMs have been processed
12/13/2011 10:25:42 AM :: Load: Source 6% > Proxy 50% > Network 98% > Target 99%
12/13/2011 10:25:42 AM :: Primary bottleneck: Target
12/13/2011 10:25:42 AM :: Job finished at 12/13/2011 10:25:42 AM

I get the same behavior when replicating to a different host as well.

Any thoughts? I assume I'm missing something.
Gostev
Chief Product Officer
Posts: 31809
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Slow replication v6 even locally?

Post by Gostev »

As per the bottleneck analysis stats, it is clear that the speed issue is write speed to the target storage. Select the VM in the real-time statistics window, and make sure the target proxy is using hot add mode (otherwise all writes go over the network, which does not provide good speed). When replicating to another host, make sure you use a target proxy running on that host (if that host uses local storage), again to allow for hot add.

If your proxies are in fact using hot add, then the only other cause I can suspect is a hardware issue on the hosts affecting storage write speed, for example a RAID controller without the write-back cache setting enabled.
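
For anyone who wants to pull these numbers out of a saved job log, here is a minimal Python sketch that parses a statistics line like the one posted above and picks the busiest stage. The function names are made up for illustration, and treating the highest percentage as the primary bottleneck is an assumption that merely agrees with the log in this thread; it is not a statement of how Veeam computes it.

```python
import re

def parse_load(line):
    """Return {stage: percent} from a 'Load:' or 'Busy:' statistics line."""
    # Matches pairs such as "Source 6%" or "Target 99%".
    pairs = re.findall(r"(\w+) (\d+)%", line)
    return {stage: int(pct) for stage, pct in pairs}

def primary_bottleneck(line):
    """Assumption: the stage with the highest busy percentage is the bottleneck."""
    loads = parse_load(line)
    return max(loads, key=loads.get)

line = "Load: Source 6% > Proxy 50% > Network 98% > Target 99%"
print(parse_load(line))
print(primary_bottleneck(line))
```

Note that on a tie `max` returns the first stage listed, so a tied line needs a closer look by hand.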
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Slow replication v6 even locally?

Post by tsightler »

It's a little suspicious that your target shows as a 99% bottleneck, but overall it's very difficult to tell from this log, since this is the overall job log and not the VM-specific portion. If you click on the VM, it will provide more details about the time spent at the various steps.
barchas
Enthusiast
Posts: 26
Liked: 1 time
Joined: Dec 13, 2011 6:43 pm
Contact:

Re: Slow replication v6 even locally?

Post by barchas »

Gostev:
I suppose I wasn't clear about my environment, or got too blabbery.
Everything is running on the same single host to eliminate the network as a bottleneck in my testing (because it was slow between hosts).
The 1 Veeam instance, the 1 replication subject (Windows 7), and the replica are all on the same machine, same DAS array.
4 disks RAID 10 - 15k SAS - 1 GB cache - hp410 RAID controller - DAS

Are proxies required? I assumed the main Veeam server acts as its own proxy.

tsightler: where is the information with more details you mention? I cannot seem to find anything more detailed than this HTML report.

Thanks for helping out, guys.
Gostev
Chief Product Officer
Posts: 31809
Liked: 7302 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Slow replication v6 even locally?

Post by Gostev »

Yes, this is exactly how I understand your environment. For the VM processing log Tom is asking about, select the VM in the real-time processing window (the same window where you got the stats you posted in your first post).
barchas
Enthusiast
Posts: 26
Liked: 1 time
Joined: Dec 13, 2011 6:43 pm
Contact:

Re: Slow replication v6 even locally?

Post by barchas »

Ah, OK. I didn't realize that I could click over there to get more info, duh.

12/13/2011 9:55:43 AM :: Queued for processing at 12/13/2011 9:55:43 AM
12/13/2011 9:55:43 AM :: Required resources have been assigned
12/13/2011 9:55:46 AM :: VM processing started at 12/13/2011 9:55:45 AM
12/13/2011 9:55:46 AM :: VM size: 50.0 GB (17.4 GB used)
12/13/2011 9:55:50 AM :: Using source proxy VMware Backup Proxy [hotadd;nbd]
12/13/2011 9:55:53 AM :: Using target proxy VMware Backup Proxy [hotadd;nbd]
12/13/2011 9:55:54 AM :: Discovering replica VM
12/13/2011 9:55:55 AM :: Preparing replica VM
12/13/2011 9:57:40 AM :: Preparing guest for hot backup
12/13/2011 9:57:56 AM :: Creating snapshot
12/13/2011 9:58:04 AM :: Releasing guest
12/13/2011 9:59:17 AM :: Processing configuration
12/13/2011 9:59:31 AM :: Creating helper snapshot
12/13/2011 9:59:39 AM :: Hard Disk 1 (50.0 GB)
12/13/2011 10:23:44 AM :: Deleting helper snapshot
12/13/2011 10:24:09 AM :: Truncating transaction logs
12/13/2011 10:24:16 AM :: Removing snapshot
12/13/2011 10:25:25 AM :: Swap file blocks skipped: 7.0 MB
12/13/2011 10:25:26 AM :: Finalizing
12/13/2011 10:25:26 AM :: Applying retention policy
12/13/2011 10:25:37 AM :: Busy: Source 6% > Proxy 50% > Network 98% > Target 99%
12/13/2011 10:25:37 AM :: Primary bottleneck: Target
12/13/2011 10:25:37 AM :: Processing finished at 12/13/2011 10:25:37 AM
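
To see where the time goes in a per-VM log like this one, a short Python sketch can subtract consecutive timestamps. It assumes that the gap between a line and the next one approximates how long that step took, which the log itself does not guarantee; `step_durations` is a hypothetical helper, and the log lines are copied from the post above.

```python
from datetime import datetime

LOG = """\
12/13/2011 9:55:55 AM :: Preparing replica VM
12/13/2011 9:57:40 AM :: Preparing guest for hot backup
12/13/2011 9:57:56 AM :: Creating snapshot
12/13/2011 9:58:04 AM :: Releasing guest
12/13/2011 9:59:17 AM :: Processing configuration
12/13/2011 9:59:31 AM :: Creating helper snapshot
12/13/2011 9:59:39 AM :: Hard Disk 1 (50.0 GB)
12/13/2011 10:23:44 AM :: Deleting helper snapshot
12/13/2011 10:24:09 AM :: Truncating transaction logs
12/13/2011 10:24:16 AM :: Removing snapshot
12/13/2011 10:25:25 AM :: Swap file blocks skipped: 7.0 MB
"""

def step_durations(log_text):
    """Return [(step, seconds)] using the gap to the next logged line."""
    entries = []
    for line in log_text.splitlines():
        stamp, _, message = line.partition(" :: ")
        entries.append((datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p"), message))
    # Pair each entry with its successor; the last line has no duration.
    return [
        (message, int((nxt - start).total_seconds()))
        for (start, message), (nxt, _) in zip(entries, entries[1:])
    ]

# Show the three longest steps.
for step, seconds in sorted(step_durations(LOG), key=lambda item: -item[1])[:3]:
    print(f"{seconds:5d} s  {step}")
```

On this log the "Hard Disk 1" step accounts for 1445 seconds, roughly 24 of the 30 minutes, so the disk transfer itself is where the run spends its time.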
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Slow replication v6 even locally?

Post by tsightler »

Does your HP410 RAID controller have the battery-backed cache upgrade, and do you have write-back caching turned on? This is quite important for write performance on VMFS. The fact that the target shows 99% of the wait tells me this is the problem.

That being said, it would be interesting to see the amount of data read/transferred. Even a very slow disk (for example my laptop lab running nested VMs in ESX5 inside of Workstation 8) would seem to be able to perform this replication in 25 minutes unless CBT is not working at all. Is the VM somehow not a HW V7 VM?
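
As a rough back-of-the-envelope check, here is a small Python sketch of the worst case: suppose CBT did nothing and all 17.4 GB of used space were transferred during the "Hard Disk 1" step of the posted log (9:59:39 AM to 10:23:44 AM). Both the full-read scenario and the 1 GB = 1024 MB conversion are assumptions; the thread does not report the amount of data actually read.

```python
def implied_rate_mb_s(gb_transferred, seconds):
    """Average transfer rate in MB/s, assuming 1 GB = 1024 MB."""
    return gb_transferred * 1024 / seconds

# Duration of the disk step, taken from the two log timestamps.
disk_step_seconds = (10 * 3600 + 23 * 60 + 44) - (9 * 3600 + 59 * 60 + 39)

print(disk_step_seconds)                                      # 1445
print(round(implied_rate_mb_s(17.4, disk_step_seconds), 1))   # 12.3
```

Even in that worst case the implied rate is only about 12.3 MB/s, and a working CBT pass on an idle VM should need to move far less data than that.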
ThomasMc
Veteran
Posts: 293
Liked: 19 times
Joined: Apr 13, 2011 12:45 pm
Full Name: Thomas McConnell
Contact:

Re: Slow replication v6 even locally?

Post by ThomasMc »

I could maybe chip in with this one:

A Little History
The old ESXi replica host was a DL120 (X3330) with 2x 1 TB WD Black SATA 2 disks sitting on an old LSI card I got with a tape drive :D (unfortunately we're on an SMB budget, LOL!)
The new ESXi replica host is an ML110 G6 (X3430) with 4x 1 TB WD Black SATA 2 disks sitting on a P400 with 512 MB BBWC (75/25)

Full replica pre new host for ca01.lab.local
VM size: 15.0 GB (6.0 GB used)
SProxy (Veeam02) hotadd;nbd | TProxy (Veeam02) nbd
HD 1 (15.0 GB) 14.3 GB read at 85 MB/s
Load: Source 36% > Proxy 74% > Network 67% > Target 86%

Full replica post new host for ca01.lab.local
VM size: 15.0 GB (6.0 GB used)
SProxy (Veeam02) san;nbd | TProxy (Veeam03) hotadd;nbd
HD 1 (15.0 GB) 14.3 GB read at 7 MB/s
Busy: Source 4% > Proxy 51% > Network 97% > Target 99%

I've run HDTune and IOMeter on Veeam03 and it's definitely able to achieve more, and on the network side the ports connected to the new host are hardly twitching.
ThomasMc
Veteran
Posts: 293
Liked: 19 times
Joined: Apr 13, 2011 12:45 pm
Full Name: Thomas McConnell
Contact:

Re: Slow replication v6 even locally?

Post by ThomasMc »

I ripped the P400 out after tampering with it all day trying to get decent speeds out of it, and re-ran the jobs on the old controller with the same disks. The old controller doesn't have any BBWC and still wipes the floor with it :)

Run 1 Full for ca01.lab.local
VM size: 15.0 GB (6.0 GB used)
SProxy (Veeam02) san;nbd | TProxy (Veeam03) hotadd;nbd
HD 1 (15.0 GB) 14.3 GB read at 93 MB/s
Busy: Source 92% > Proxy 71% > Network 27% > Target 92%

Run 2 Full for ca01.lab.local
VM size: 15.0 GB (6.0 GB used)
SProxy (Veeam02) san;nbd | TProxy (Veeam02) nbd
HD 1 (15.0 GB) 14.3 GB read at 91 MB/s
Busy: Source 49% > Proxy 73% > Network 87% > Target 95%
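
To put the reported rates side by side, a quick Python sketch converts each one into an approximate time to read the 14.3 GB. It assumes the reported figure is the average rate over the whole read and that 1 GB = 1024 MB; the run labels are just shorthand for the four results posted in this thread.

```python
def read_minutes(gb_read, rate_mb_s):
    """Approximate minutes to read gb_read at an average of rate_mb_s."""
    return gb_read * 1024 / rate_mb_s / 60

# (label, reported read rate in MB/s) for the four full runs of ca01.lab.local
runs = [
    ("old host, old LSI card", 85),
    ("new host, P400 BBWC", 7),
    ("new host, old controller, hotadd target", 93),
    ("new host, old controller, nbd target", 91),
]

for label, rate in runs:
    print(f"{label:42s} {rate:3d} MB/s  ~{read_minutes(14.3, rate):.1f} min")
```

Under those assumptions the P400 run works out to roughly 35 minutes for the same data the other three runs read in about 3.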
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Slow replication v6 even locally?

Post by tsightler »

Hi Thomas, I'm assuming you've done all of this, but I have to ask: have you verified that the P400 has the latest firmware, that the battery is actually charged and showing "normal" state, and that the write-back cache is actually enabled? HP seems to recommend a 50/50 split on the read/write cache. The VMware forums are pretty full of issues regarding write performance with this card.
ThomasMc
Veteran
Posts: 293
Liked: 19 times
Joined: Apr 13, 2011 12:45 pm
Full Name: Thomas McConnell
Contact:

Re: Slow replication v6 even locally?

Post by ThomasMc »

tsightler wrote: Hi Thomas, I'm assuming you've done all of this, but I have to ask: have you verified that the P400 has the latest firmware, that the battery is actually charged and showing "normal" state, and that the write-back cache is actually enabled? HP seems to recommend a 50/50 split on the read/write cache. The VMware forums are pretty full of issues regarding write performance with this card.

Hi Tom;

Battery was at 100% and BBWC was active (tried 75/25 | 50/50 | 25/75); when it was 75% on write I got a small increase of 1 MB/s.
Firmware was updated to the latest as part of my fiddling today, with no noticeable difference.
I even flipped the disk caches on as well. One thing I did notice is that the reaction time when starting and stopping VMs was pretty weird, and sometimes when running the storage tests it would just lock up.

e.g.
http://dl.dropbox.com/u/4304771/HDTune%20ET%20test.png
http://dl.dropbox.com/u/4304771/HDTune.png

I know HDTune isn't really that good, but if you look at the 8 MB random seek time you'll see what I mean.
StarRefrigeration
Lurker
Posts: 1
Liked: never
Joined: Sep 30, 2011 9:43 am
Full Name: Colin Wright
Contact:

Re: Slow replication v6 even locally?

Post by StarRefrigeration »

Experiencing the same issue with a DL580 G5 running ESXi 5: P400 with BBWC (latest firmware, 25/75) and 10 x 500 GB SAS Midline disks, getting between 1 and 3 MB/s on a local replication job (99% Target bottleneck). Has anyone got any further ideas?
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: Slow replication v6 even locally?

Post by Yuki »

Had this issue on a Dell PowerEdge R720xd with an H700 RAID controller with 1 GB cache (BBWC) and 12 x 3 TB NL-SAS disks.

It took 9 hours to clean up snapshots with only 91 GB of data transferred on the last replication.
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Slow replication v6 even locally?

Post by tsightler »

I don't think this thread has anything to do with removing snapshots, so you'll have to clarify what you mean. Do you mean the snapshots on the source VM, or the restore point on the target? You might actually want to start a different thread.
Yuki
Veeam ProPartner
Posts: 252
Liked: 26 times
Joined: Apr 05, 2011 11:44 pm
Contact:

Re: Slow replication v6 even locally?

Post by Yuki »

Well, I believe the slow replication others have seen may have been due to a similar issue where the "finalizing" stage takes a long while. You can only see that if you click on the actual VM being processed, but not in the overall job view.
tsightler
VP, Product Management
Posts: 6035
Liked: 2860 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Slow replication v6 even locally?

Post by tsightler »

The thread above specifically talks about transfer rate, including statistics showing the bottleneck and the transfer rate per disk. This doesn't have anything to do with snapshot commit, as that factor is not part of the bottleneck or per-disk transfer rate calculations, so they can't be related. If you are interested in discussing your issue, it's no problem, but I'd prefer not to mix a completely different issue into this thread, as that will simply confuse future searchers.