Is my Replication job behaving normally?

Post by **mplep** » Mar 09, 2011 8:41 am

Hi,

I recently seeded a 300GB VM (130GB thin provisioned), which took about 48 hours to complete. Since then I've been taking incrementals and it's worked well. It's an Exchange 2010 server, and an incremental takes about 2 hours on average to replicate across a 10MB leased line. Some replications are quicker, some slower, as you'd expect.

I had two VSS-related failures earlier today which caused the job to fail. A few minutes later the job restarted and began replicating the VMDK again, this time without reporting the VSS problems. But the job seems terribly slow now, like it's seeding again. I'm watching the "Processed Size" crawl from 512MB upwards very slowly. It's currently at 4GB processed after 2 hours of running, but the VRB and VBK are quite small at 89MB and 43MB respectively and, after monitoring them for about an hour, don't seem to be growing. The modified timestamps on these files are changing, however. There shouldn't be much changed data to sync, and the small VRB file seems to indicate that, but the files seem a bit too static and the job too slow.

I have other VMs from the same host which are replicating to the same target nice and quickly, and these have all been running together previously without issue. This issue has only occurred since the VSS error.

I have Veeam 5.0.1 running in NBD mode from an ESXi 4.0 source to an ESXi 4.1 target. Veeam is running on a VM at the target site. I'm aware NBD isn't the best method, but comms prevent connecting to the SAN at the source site directly. The job is reporting that CBT is enabled.

Before I contact support I wanted to understand whether the behaviour I'm seeing is unusual or is what normally happens after a VSS failure on a replication job. Is there anywhere I can look to better understand what it's actually doing? I'm praying it is not re-seeding and does not require a full replication.

Thanks in advance,

MPLEP

Post by **Vitaliy S.** » Mar 09, 2011 11:12 am

Hello,

If your replication job has failed, then most likely it is being automatically repaired now. It is not re-seeding, but the repair process can take some time, so what you are seeing is expected behaviour. You can also send the replication job log to our support team to have a look.

Thank you!

Post by **mplep** » Mar 11, 2011 11:29 am

Thanks Vitaliy,

The retry has been running for about 42 hours (the full seed was 48) and seems to have stopped. It's showing 130GB of 250GB processed. The actual VMDK size is about 120GB as it's thin provisioned. The modified times on the VRB/VBK are about 3 hours old now.

My question is: what now? I HAVE to get this sorted this weekend, either by being more patient or by doing a re-seed over the weekend. If I log a support call I doubt I'll get into the meat of things until early next week. I'll probably log one anyway, but if a re-seed will sort it for Monday then I'll have to start that immediately (it takes 48 hours). Is it still likely to be repairing? The repair seems to take as long as the seeding anyway.

Suggestions welcomed

Thanks, MPLEP

Post by **mplep** » Mar 11, 2011 1:43 pm

Update: It's moved up to 150GB processed after sitting at 130GB for a few hours. I'll wait, but I would appreciate knowing whether this is a "repair" or something else going on, particularly when it's taking about the same time as a full seed. We are providing Veeam as a hosted service and I really need to understand these types of processes, and in particular their impact on SLAs etc.

Post by **Gostev** » Mar 11, 2011 1:50 pm

Our support should be able to say what the job is doing by looking at the most current full log files. But if I am not mistaken, the repair process is reflected with the corresponding status in the real-time job statistics window.

Post by **mplep** » Mar 15, 2011 8:35 am

Hi,

I am in contact with support on ID#594354. I'm a bit confused about a couple of things: firstly, why CBT got disabled (nobody touched that setting), and secondly, why the follow-up job after a failure was going to take over double the time of a seed before I killed it off. If I have to perform a reseed (or a very, very lengthy incremental) after a failure, then maintaining the SLA will become difficult.

If you get a chance to look at the support call I would appreciate it.

Thanks

MPLEP

Post by **lewisdrummond** » Mar 15, 2011 3:56 pm

We noticed the exact same thing today with Exchange 2007: long replication job times after VSS failures. Keep me posted on what you learn.

Post by **mplep** » Mar 16, 2011 5:20 am

After the VSS failure the replication had been running for 60+ hours before I stopped it, and the sync was still only about 3/4 complete at that point. A reseed took 34 hours. This is all across a 10MB leased line with nothing else running on that line. Something went very wrong somewhere. I don't want to reseed after a sync failure; the seeds are bad enough with Veeam on the target side. Additionally, I'm sure looking forward to v6, which indicates some significant additions to the replication technology.
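
As a rough sanity check on the figures quoted in this thread, the average throughput of each run can be estimated from the data size and the elapsed time. The sketch below is only back-of-envelope arithmetic: it assumes "10MB leased line" means roughly 10 Mbit/s, treats GB as decimal gigabytes, and ignores compression, CBT and protocol overhead. The function names are made up for illustration and are not part of any Veeam tooling.

```python
def effective_mbit_per_s(gigabytes: float, hours: float) -> float:
    """Average throughput in Mbit/s for moving `gigabytes` (decimal GB) in `hours`."""
    bits = gigabytes * 1e9 * 8
    return bits / (hours * 3600) / 1e6


def hours_at_line_rate(gigabytes: float, line_mbit_per_s: float) -> float:
    """Hours needed to move `gigabytes` at a sustained line rate (no overhead)."""
    bits = gigabytes * 1e9 * 8
    return bits / (line_mbit_per_s * 1e6) / 3600


# Figures taken from the posts above; the 10 Mbit/s line rate is an assumption.
initial_seed = effective_mbit_per_s(130, 48)   # first seed: ~130GB in 48 hours
reseed = effective_mbit_per_s(120, 34)         # later reseed: ~120GB in 34 hours
best_case = hours_at_line_rate(130, 10)        # 130GB at an assumed 10 Mbit/s

print(f"initial seed: {initial_seed:.2f} Mbit/s")
print(f"reseed:       {reseed:.2f} Mbit/s")
print(f"best case for 130GB at 10 Mbit/s: {best_case:.1f} hours")
```

Under those assumptions the two seeds average roughly 6 and 8 Mbit/s, which is plausible for a 10 Mbit/s line, whereas a post-failure run that is still only three-quarters done after 60+ hours is moving data well below that rate.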
