Discussions specific to the VMware vSphere hypervisor
Post Reply
harleyd
Novice
Posts: 9
Liked: 2 times
Joined: Jan 31, 2013 8:24 pm
Contact:

A couple of ideas

Post by harleyd » Jan 31, 2013 8:50 pm

Hi There,

I've just recently started using B&R for local and remote replication with pretty good results. Like most people, however, I tend to struggle with a lack of bandwidth and tight backup windows. I have a couple of suggestions that may improve performance in my situation. Please find these below in order of preference.

Fully Cached Replication
-As Veeam processes servers in a job, ALL changes are stored on a fast local repository for transmission. Transmission begins as soon as this cache starts to fill. Cache is cleared at the end of the job. This allows the job to remove the VMware snapshot sooner and bandwidth is used more efficiently as there is no waiting for the job to process.

Two-Point Replication (Staging)
-A replica is updated locally and remotely in the same job. At the moment this can be done by creating a replica of a replica. The downside to doing so is that CBT cannot be used.
-This request also ties into the Fully Cached Replication request above, however the local replica could be used for Cache rather than tying up the production environment.
-WAN Data duplication and packet size reduction could be built in, as the local replica and changes could be compared locally before transmission.
-I assume doing so would assist in adding a resume function for remote replication as the local replica would not have changed since retry.

Check to see if a specific job is running
-A scheduling option to allow B&R to check if a specific job is running before starting another scheduled job. This request is based purely on the need to create replicas of replicas.

Remove Replicas from the Database without deleting
-A button to remove replicas from the Replicas > Ready screen without deleting them from disk. It is sometimes necessary to keep a copy of the data on the original failed VM and bring the replica VM into production. The process right now requires that you clone the entire replica VM to a new VM in order to change the ID of the machine, and then use the delete-from-disk function to clean up the DB.

Use Multiple TCP/IP connections per job based on subnet.
-As per the title this option would be available for change based on subnet rules, not as a blanket global policy.

Limit bandwidth per job
-Allow you to set bandwidth limits at the job level in addition to the subnet level. This helps with prioritising more/less important replicas.


Please understand I have no expectation that any of these requests will be introduced into B&R, but I want to call them out rather than sit on them in case someone else finds them a good idea.

Thanks for your time,
Derek

Vitaliy S.
Product Manager
Posts: 22862
Liked: 1538 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: A couple of ideas

Post by Vitaliy S. » Feb 03, 2013 3:10 pm

Hi Derek,

Thanks for the feedback, much appreciated. I've got a couple of follow up questions for you:
harleyd wrote:Fully Cached Replication
-As Veeam processes servers in a job, ALL changes are stored on a fast local repository for transmission. Transmission begins as soon as this cache starts to fill. Cache is cleared at the end of the job. This allows the job to remove the VMware snapshot sooner and bandwidth is used more efficiently as there is no waiting for the job to process.
Hmm... but what if your source storage is the bottleneck? Having a fast local repository will not help in this case, right?
harleyd wrote:Check to see if a specific job is running
-A scheduling option to allow B&R to check if a specific job is running before starting another scheduled job. This request is based purely on the need to create replicas of replicas.
If this request is based on the need to replicate replicas, then why not use the post-backup job scheduling options? Once the original replication job is over, start the second one to replicate the replica. Or is there another factor I should keep in mind?
harleyd wrote:Remove Replicas from the Database without deleting
-A button to remove replicas from the Replicas > Ready screen without deleting them from disk. It is sometimes necessary to keep a copy of the data on the original failed VM and bring the replica VM into production. The process right now requires that you clone the entire replica VM to a new VM in order to change the ID of the machine, and then use the delete-from-disk function to clean up the DB.
You'd want to see this for individual VMs, correct? The option you're referring to is already available, though for the entire job only...

Thanks!

harleyd
Novice
Posts: 9
Liked: 2 times
Joined: Jan 31, 2013 8:24 pm
Contact:

Re: A couple of ideas

Post by harleyd » Feb 03, 2013 11:01 pm

Hi Vitaliy,

Thanks for getting back to me, it’s great to see Veeam actively looks at these types of posts.
Hmm... but what if your source storage is a bottleneck? Having fast local repository will not help in this case, right?
Totally agree, if your source is a bottleneck then this functionality would slow things down, but my intention was for this to be an option for WAN replication. The source is rarely the bottleneck when it comes to WAN replication of servers with a lot of data and changes. Most of the time I pull between 50-120 MB/s (CBT turned off) when the job is transmitting little to no changes or replicating to a local NAS. What I do see is that the job slows right down at the point it starts uploading to the remote replication site, then continues to process at the rate above once these uploads complete. What I'm suggesting is some concurrency, which would mean the job could finish processing and remove the snapshot sooner, while the WAN link stays saturated as the changes become available sequentially.

Also, customers with high speed sources and low speed destinations on site may also see some improvement by using a repository with SSDs.
If this request is based on the need to replicate replicas, then why not use the post-backup job scheduling options? Once the original replication job is over, start the second one to replicate the replica. Or is there another factor I should keep in mind?
I do use the post-backup job option at the moment and it works very well. This is more about when the replication job goes for 24 hours or longer and the jobs overlap. Because I use different Proxies (mixture of SAN/NBD/HOTADD for speed) on these two jobs, the original job can fire again before the second has even completed.
Do you want to see this for individual VMs, correct? The option you're referring to is already available, though for the entire job only....
Would prefer individual, but if I can do it another way, then I'm happy. Any chance you can point me to some details on how to do this?

Thanks & Regards,
Derek

veremin
Product Manager
Posts: 16778
Liked: 1405 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: A couple of ideas

Post by veremin » Feb 04, 2013 7:34 am

Hi, Derek.
I do use the post-backup job option at the moment and it works very well. This is more about when the replication job goes for 24 hours or longer and the jobs overlap. Because I use different Proxies (mixture of SAN/NBD/HOTADD for speed) on these two jobs, the original job can fire again before the second has even completed.
As a potential workaround you could implement a PowerShell script, run via Windows Scheduler, which would first check the status of the second job and start the first one only if that status isn't "Working".
Any chance you can point me to some details on how to do this?
Click “Replicas” -> Right-click necessary “Job name” -> "Remove from replicas".

[screenshot of the "Remove from replicas" context menu]

Hope this helps.
Thanks.

Vitaliy S.
Product Manager
Posts: 22862
Liked: 1538 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: A couple of ideas

Post by Vitaliy S. » Feb 04, 2013 11:23 am

Thanks for the clarifications, much appreciated!
harleyd wrote:I do use the post-backup job option at the moment and it works very well. This is more about when the replication job goes for 24 hours or longer and the jobs overlap. Because I use different Proxies (mixture of SAN/NBD/HOTADD for speed) on these two jobs, the original job can fire again before the second has even completed.
Got it, but there shouldn't be any issues when replication and backup jobs overlap. Or are you mostly concerned that the backup job will not be saving the latest VM state?

harleyd
Novice
Posts: 9
Liked: 2 times
Joined: Jan 31, 2013 8:24 pm
Contact:

Re: A couple of ideas

Post by harleyd » Feb 05, 2013 5:09 am

Hi Vladimir,
v.Eremin wrote:As a potential workaround you could implement a PowerShell script, run via Windows Scheduler, which would first check the status of the second job and start the first one only if that status isn't "Working".
I'm not very handy with PowerShell, but I'll take a look. It just seems like a logical function to have in Veeam, as I would prefer to leverage the scheduling functionality there rather than building my own messy version under Windows Scheduler.
v.Eremin wrote:Click “Replicas” -> Right-click necessary “Job name” -> "Remove from replicas".
Thanks for the tip.

Regards,
Derek

harleyd
Novice
Posts: 9
Liked: 2 times
Joined: Jan 31, 2013 8:24 pm
Contact:

Re: A couple of ideas

Post by harleyd » Feb 05, 2013 5:16 am

Hi Vitaliy,
Vitaliy S. wrote:Thanks for the clarifications, much appreciated! Got it, but there shouldn't be any issues when replication and backup jobs overlap. Or are you mostly concerned that the backup job will not be saving the latest VM state?
As I'm replicating replicas, I'm concerned the local replication job will update the source VM of the long-running remote replication job. I've never actively let this happen, so I don't know what the outcome would be, but I assume it would cause an issue?

E.g.
1. Job 1 replicates VM1 locally as VM1_replica
2. Job 2 replicates VM1_replica remotely as VM1_replica_dr
3. Job 1 runs again while job 2 is still running and attempts to update VM1_replica


Cheers,
Derek

Vitaliy S.
Product Manager
Posts: 22862
Liked: 1538 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: A couple of ideas

Post by Vitaliy S. » Feb 05, 2013 7:35 am

There won't be any issues: the backup/replication job captures VM data from the snapshot taken previously, and a new replication job will create its own, independent snapshot.

harleyd
Novice
Posts: 9
Liked: 2 times
Joined: Jan 31, 2013 8:24 pm
Contact:

Re: A couple of ideas

Post by harleyd » Feb 05, 2013 8:02 am

Sounds great! Would there be any issue if the local replica job completes before the long-running WAN replication job, such as the retention policy attempting to delete the original snapshot?

Vitaliy S.
Product Manager
Posts: 22862
Liked: 1538 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov
Contact:

Re: A couple of ideas

Post by Vitaliy S. » Feb 05, 2013 8:15 am

This shouldn't have any effect either.

veremin
Product Manager
Posts: 16778
Liked: 1405 times
Joined: Oct 26, 2012 3:28 pm
Full Name: Vladimir Eremin
Contact:

Re: A couple of ideas

Post by veremin » Feb 05, 2013 10:02 am

I'm not very handy with PowerShell, but I'll take a look. It just seems like a logical function to have in Veeam, as I would prefer to leverage the scheduling functionality there rather than building my own messy version under Windows Scheduler.
You can take the following script as a rough example, which should give you a better understanding of how to script this process:

Code:

Add-PSSnapin VeeamPSSnapin

# Look up both jobs by name
$firstjob = Get-VBRJob -Name "Name of your first job"
$secondjob = Get-VBRJob -Name "Name of your second job"

# If the second job isn't running, start the first one right away
If ($secondjob.GetLastState() -ne "Working") {
    Start-VBRJob $firstjob
}
Else {
    # Otherwise poll every 15 minutes until the second job finishes
    do {
        Start-Sleep -Seconds 900
        $status = $secondjob.GetLastState()
    } while ($status -eq "Working")
    Start-VBRJob $firstjob
}
In general, this script first checks the status of your second job, and if the status isn't "Working", it starts the first job.

Otherwise, it keeps checking the status at a specified interval (in the example above the interval is 15 minutes = 900 seconds), and once the second job finally stops, it starts the first one.
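For completeness, one way to run such a script via Windows Scheduler is to register a scheduled task that launches it. The task name, schedule, and script path below are placeholders for illustration only, so adjust them for your environment:

```powershell
schtasks /Create /TN "Start first replication job" /TR "powershell.exe -NoProfile -ExecutionPolicy Bypass -File C:\Scripts\StartFirstJob.ps1" /SC DAILY /ST 22:00
```

The task launches the script daily at 22:00; the script itself then waits until the second job is no longer running before starting the first.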

As always, don’t forget to test it before implementing.

Hope this helps.
Thanks.
