Host-based backup of VMware vSphere VMs.
eiskra
Enthusiast
Posts: 25
Liked: never
Joined: Mar 07, 2012 11:54 pm
Full Name: Edward Iskra

Re: Optimizing Exchange 2010 backups

Post by eiskra »

Second question, re: optimization - with our 2-server DAG environment, we'd expect much better deduplication than we're getting (typically 1.1 on nightly incrementals.)

We have both servers in a single job in order to get dedupe. The first server's incremental backup runs for just over 3 hours. Then the second kicks off. It occurs to me that the two servers would be more similar if they were both snapshotted and frozen at the same time, rather than 3 hours apart.

I could put them in separate jobs with separate proxies and force them to start at the same time, but then there would be no dedupe.

Is there any way to get Veeam to snap both servers at the start of the job? (Technically, to snap them consecutively, then back up each server in turn, and then delete the snapshots consecutively.) I realize that the second server backed up will have a larger snapshot delta and take longer to consolidate, but we'd prefer that situation if we can reduce the size of the backup - and perhaps improve the overall speed, if more is getting skipped via dedupe.
Vitaliy S.
VP, Product Management
Posts: 27055
Liked: 2710 times
Joined: Mar 30, 2009 9:13 am
Full Name: Vitaliy Safarov

Re: Issue with VMware & Exchange 2010 DAG

Post by Vitaliy S. »

Hi Edward,
eiskra wrote:Second question, re: optimization - with our 2-server DAG environment, we'd expect much better deduplication than we're getting (typically 1.1 on nightly incrementals.)
We've got an existing topic with the same question, please take a look: Poor Dedupe on Exchange DAG
eiskra wrote:Is there any way to get Veeam to snap both servers at the start of the job? (Technically, to snap them consecutively, then back up each server in turn, and then delete the snapshots consecutively.) I realize that the second server backed up will have a larger snapshot delta and take longer to consolidate, but we'd prefer that situation if we can reduce the size of the backup - and perhaps improve the overall speed, if more is getting skipped via dedupe.
No, that's not possible.

Thanks!
Butha
Enthusiast
Posts: 39
Liked: 20 times
Joined: Oct 03, 2012 10:59 am
Full Name: Butha van der Merwe

Question about increasing cluster timeouts RE: Exchange 2010

Post by Butha »

Hi Everybody,

I have a question about applying the following to improve the backups of our Exchange DAG with Veeam. There are a lot of links saying that applying the following will help:
cluster /prop SameSubnetDelay=2000:DWORD

cluster /prop CrossSubnetDelay=4000:DWORD

cluster /prop CrossSubnetThreshold=10:DWORD

cluster /prop SameSubnetThreshold=10:DWORD

The one question I cannot get an answer to, though, is: does it require a restart of any services or servers (Mailbox, Frontend)? The only information is right at the end of the MS article: http://technet.microsoft.com/en-us/libr ... 10%29.aspx

Point 11:
"Take the clustered service or application offline and bring it back online, using the method that you are most familiar with. For example, to use the Failover Cluster Management snap-in, under Services and Applications, right-click the service or application and click Take this service or application offline, then right-click again and click Bring this service or application online."

Can anybody who has applied the settings perhaps comment?

Thanks!
Cokovic
Veteran
Posts: 295
Liked: 59 times
Joined: Sep 06, 2011 8:45 am
Full Name: Haris Cokovic

Re: Issue with VMware & Exchange 2010 DAG

Post by Cokovic »

Hi Butha,

These changes only prevent a database failover during a vMotion or when a snapshot is created/committed in a virtualised Exchange DAG environment. The settings take effect immediately, and there is no need to restart the server or any services within your DAG cluster. If you open Failover Cluster Manager, you will see under the Services and applications node that there are no services specified in a DAG cluster.
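
For reference, here is a rough sketch of what the four values amount to. It assumes the cluster marks a node as down after Threshold consecutive missed heartbeats, with one heartbeat sent every Delay milliseconds. The 1000 ms / 5 defaults are the Windows Server 2008 R2 defaults as far as I know; treat them as an assumption and check your own cluster. The function names are made up for illustration.

```python
# Rough estimate of how long a node may stay silent before the cluster
# treats it as down: heartbeat interval (ms) x missed-heartbeat threshold.

def tolerance_seconds(delay_ms: int, threshold: int) -> float:
    """Seconds of missed heartbeats tolerated before a node is marked down."""
    return delay_ms * threshold / 1000.0

# Assumed defaults (Windows Server 2008 R2); verify on your own cluster.
DEFAULTS = {"SameSubnet": (1000, 5), "CrossSubnet": (1000, 5)}

# Values from the cluster /prop commands quoted in this thread.
TUNED = {"SameSubnet": (2000, 10), "CrossSubnet": (4000, 10)}

def summarize(settings: dict) -> dict:
    """Map each scope (same/cross subnet) to its tolerance in seconds."""
    return {name: tolerance_seconds(d, t) for name, (d, t) in settings.items()}

if __name__ == "__main__":
    print("default:", summarize(DEFAULTS))
    print("tuned:  ", summarize(TUNED))
```

Under these assumptions, the tuned values give a node 20 seconds of silence on the same subnet and 40 seconds across subnets before a failover is triggered.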
Andreas Neufert
VP, Product Management
Posts: 6707
Liked: 1401 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany

Re: Issue with VMware & Exchange 2010 DAG

Post by Andreas Neufert » 4 people like this post

For those of you who had problems with cluster failovers at snapshot commit: the following solution worked at one of my customers (8500 mailboxes, 13 TB).

The customer checked Exchange DAG datastore performance and did everything they could to move Exchange to very fast disks.
Then they disabled all background scanning/maintenance by Exchange and other tools during the backup.
The customer starts the Exchange backup when no other backup is running (no I/O load on the disk system).
The customer raised the DAG heartbeat settings to the maximum => see the discussion in this forum topic.

This helped them a lot, but there were still some cluster failovers.

VMware Support told them that they can use the following (undocumented?) VM setting:
snapshot.maxConsolidateTime = "1" (default = 6; the value is in seconds)
I googled it but didn't find any documentation.

Warning: please add this to your VMs only after you have spoken with VMware Support and discussed the consequences.
VMware told my customer to monitor continuously, on a daily basis, that all snapshots are deleted successfully.

I think that, basically, VMware moves the changes to production storage in small pieces at snapshot commit and commits them. This process can take 6 seconds, which is too long for the max. DAG cluster heartbeat of 2 seconds.

At my customer this helped to solve the cluster failover problems at snapshot commit.


If you have DAG failover problems at snapshot creation:
Use the fastest disks you can.
Limit the number of volumes (datastores). Feedback from VMware is that the more disks are attached to a VM, the longer the snapshot takes to create.
Run backups in free time slots with no other operations on storage, vCenter or backup systems.
Check your vCenter: is snapshot creation much faster directly on an ESX server? (Add the ESX host directly to the backup infrastructure and try a backup with the VM selected through the ESX host.) If so, check with VMware why the snapshot takes much longer through vCenter.

Just my 2 cents - hope it helps you a bit... Please add feedback, whether this was helpful or not, or if you are facing problems with it.

CU Andy
jeremy.otten
Lurker
Posts: 1
Liked: never
Joined: Sep 11, 2012 7:59 am
Full Name: Jeremy Otten

Re: Issue with VMware & Exchange 2010 DAG

Post by jeremy.otten »

Just had a call with VMware.

They do not advise using the undocumented feature snapshot.maxConsolidateTime = "1".

They only use it in engineering for debugging, and if you want to use it, it's at your own risk.

They also state that they cannot confirm that this “hidden feature” will stay there when ESXi is updated or the VM itself is changed in any way.

Just great! I think you really need to check this out with the higher levels of VMware, because this is a biggie.

Veeam's support for Exchange and SQL in an HA solution is very much at risk, because neither Microsoft nor VMware advises you to use it.

But Veeam created an official KB for it, http://www.veeam.com/kb1744, and they call it a TIP... pfff.

You guys state: "Please note that this is an undocumented vmx alteration, and should be validated by VMware support prior to using."

Well, I contacted them, and they say: we do not advise using it.

I really do not like this, because I do not want to use agent backups inside VMs. That goes against the entire concept of VMware.
Andreas Neufert

Re: Issue with VMware & Exchange 2010 DAG

Post by Andreas Neufert » 1 person likes this post

Hi Jeremy,

Thank you very much for this feedback.

I will share here my experience from the field with big Exchange/Veeam customers.
In one situation, one of my customers told me that VMware Support helped them with that undocumented (and, as you said, unsupported) snapshot.maxConsolidateTime = "1" setting.

Because Veeam uses the APIs from VMware, we can only use what we get from that side. Please keep in mind that Hyper-V backup works in a different way, without VM snapshots (it uses volume snapshots).

What we can do from the Veeam perspective is optimize everything we can to reduce snapshot lifetime, and we did this with v7.

For example, with v7 we process VM disks in parallel (if you upgrade an existing installation, you need to enable it manually). Your Exchange servers likely have many disks, and this reduces snapshot lifetime => reduces the number of blocks to commit because of fewer changes over time => reduces snapshot commit load. Theoretical example: 10 disks at 6 minutes per disk => 1 h backup window with v6.5 => ~6 minutes with v7 if you have enough proxy resources.
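
The theoretical example above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes every disk takes the same time, that disks are processed in waves of at most parallel_tasks disks, and that proxies and storage are not a bottleneck. The function name and parameters are made up for illustration and are not part of any Veeam API.

```python
import math

def snapshot_lifetime_minutes(disks: int, minutes_per_disk: float,
                              parallel_tasks: int) -> float:
    """Estimate how long the VM snapshot stays open while all disks are
    backed up in waves of at most `parallel_tasks` disks at a time."""
    if disks < 0 or parallel_tasks < 1:
        raise ValueError("disks must be >= 0 and parallel_tasks >= 1")
    waves = math.ceil(disks / parallel_tasks)
    return waves * minutes_per_disk

if __name__ == "__main__":
    # Sequential processing: one disk after another.
    print("sequential:", snapshot_lifetime_minutes(10, 6, 1), "min")
    # Fully parallel: enough proxy task slots for every disk.
    print("parallel:  ", snapshot_lifetime_minutes(10, 6, 10), "min")
    # Partially parallel: only 4 task slots available.
    print("4 slots:   ", snapshot_lifetime_minutes(10, 6, 4), "min")
```

With only 4 task slots the estimate is 3 waves, i.e. 18 minutes, which shows why "enough proxy resources" matters for getting close to the 6-minute figure.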

With Enterprise Plus and HP LeftHand/3PAR and the VSA (can be used with other storage vendors), we can create a physical snapshot right after the VM snapshot and release the VM snapshot after a few seconds. This reduces the snapshot lifetime and the changed data to a few seconds' worth => 1000 times less changed data => fast commit.

You will see other optimizations in v7. So we hear you.


Regarding MS support and snapshot-based backups: Microsoft states in their documentation that they do not support this because that sort of backup is not application-aware. Because of our own VSS integration, ours is. In the end, MS will only support their own backup products themselves when push comes to shove.


My very personal list of tips to optimize Exchange DAG backups at this point:

1) Increase the DAG heartbeat time (no reboot needed)
cluster /prop SameSubnetDelay=2000:DWORD
cluster /prop CrossSubnetDelay=4000:DWORD
cluster /prop CrossSubnetThreshold=10:DWORD
cluster /prop SameSubnetThreshold=10:DWORD
2) Use the new Veeam storage snapshot feature (LeftHand/3PAR/VSA) if you can (after the v7 release).
3) Back up only a DAG member that holds only inactive databases (no cluster failover, because there are no active databases). Log file truncation will be replicated by Exchange across the whole DAG.
4) Try to avoid any changes during the backup window (users, background processes, antivirus, ...). Also try to avoid them on all LUNs of the storage system itself (faster writes at snapshot commit).
5) Use forward incremental or, if you need space, forward incremental with daily transform into rollbacks. Do not use reverse incremental (it takes a bit longer than the other backup methods).
6) Use the new parallel processing with enough resources to back up all of your disks at the same time (after the v7 release).
7) Use Direct SAN mode with the minimum necessary disks connected to the selected proxy. If that is not possible, use NBD mode with 10GbE. (Do not run the proxy in autoselect mode.) Disable VDDK logging for Direct SAN mode if your backups themselves run stably (ask support for the registry key and the consequences).
8) Use current VMware versions (newest VADP/VDDK kits with a lot of updates in them) and current Veeam versions (newer VDDK integration).
9) Use at minimum VMware vSphere 5.0 because of changes to snapshot placement and background handling.
10) Use fast disks for all of the VM disks (including the OS disk).
11) Fewer disks can help to reduce snapshot creation (and commit?) time.
12) Check your vCenter load and optimize it (or use direct ESX(i) connections for Veeam VM selection).
13) Maybe another option (unsupported by VMware) is to change the VM setting snapshot.maxConsolidateTime = "1" (see discussion above).

Hope this information is helpful.

CU Andy
mchang
Lurker
Posts: 2
Liked: 2 times
Joined: Oct 05, 2011 7:54 pm
Full Name: Mike Chang

Re: Issue with VMware & Exchange 2010 DAG

Post by mchang » 2 people like this post

Just wanted to post my experiences with this issue after lurking in this thread for some time. Hopefully it will be helpful to some people.

We started using Veeam in our single-server Exchange 2003 environment and, not surprisingly, had no issues. When we initially started deploying our 2-server Exchange 2010 DAG setup, we began seeing sporadic failures and VSS timeout issues, but they were infrequent enough that it was not a huge issue, and the jobs would eventually succeed after a retry or two. However, once we had fully deployed Exchange 2010 and stretched our DAG across 2 datacenters, it reached the point (I don't remember specifically which changes triggered what) that we could no longer complete a backup most of the time due to snapshot timeouts. In addition, the backups would frequently trigger cluster node failover events in Windows.

We were able to completely fix these issues with a couple of the best practices many people have already suggested in this thread. Primarily:
1) Increase Exchange 2010 DAG timeouts - this completely solved the triggering of unintended failover events
2) We shuffled around some servers/storage groups so Exchange had much better storage performance from our EMC array (I love Storage vMotion). This completely solved our snapshot timeout issues.
3) Increase CPU and memory resources for vCenter moderately

Just doing the above 3 things completely solved our issues.

BTW - I found out the hard way that Cisco ASA firewall inspection of DCERPC does *not* work well with Exchange DAG traffic. I tried many things, but finally had to turn it off completely to get DAG working properly.
Andreas Neufert

Re: Issue with VMware & Exchange 2010 DAG

Post by Andreas Neufert »

Thank you for your feedback, Mike.