- Posts: 3
- Liked: never
- Joined: Feb 03, 2012 8:16 am
- Full Name: Jonas Carlsson
We have an environment with two Exchange 2010 CAS servers, load balanced with Windows NLB.
Everything is running on ESX 4.1.
When the Veeam backup runs, it seems to freeze the current CAS for a while (is a standard VMware snapshot being taken?), causing a failover to the other CAS. It takes some time before NLB sorts it out and the CAS service is up again. In practice it means we have a mail outage of 10 minutes every night...
What is the best practice for backing up such a CAS setup?
Thanks in advance
- VP, Product Management
- Posts: 26649
- Liked: 2624 times
- Joined: Mar 30, 2009 9:13 am
- Full Name: Vitaliy Safarov
- VP, Product Management
- Posts: 6499
- Liked: 1357 times
- Joined: May 04, 2011 8:36 am
- Full Name: Andreas Neufert
- Location: Germany
Thanks for your enquiry.
I think there are two problems:
1. When a snapshot is deleted, the VM freezes.
2. Because of the VM freeze, you have problems with the Windows NLB cluster heartbeat.
So this is not an Exchange problem and, as you said, it also happens when you do this manually, so it is not a Veeam problem either. It is an infrastructure problem.
Solution for Problem 1:
@all with snapshot freeze problems.
@all with DAG cluster pans
NFS datastores => Install the VMware fixes (symptom: the VM freezes at snapshot delete)
SAN datastores => Install the latest VMware versions and check whether your HBA/datastore access profile suits your SAN storage (Dedicated/Round Robin/...)
iSCSI datastores => Install the latest VMware versions and check whether your HBA/datastore access profile suits your SAN storage (Dedicated/Round Robin/...) + use a dedicated enterprise switch for iSCSI VMware traffic
Update your SAN/iSCSI/NAS firmware (a VMware snapshot commit/delete generates a large amount of random writes). I have seen a lot of old firmware versions that have problems with that.
Do you use disk-system-based synchronous mirroring?
To check whether this is the problem, disable the storage system's synchronous mirroring (I have seen some systems that perform poorly because of firmware bugs).
To find out whether your disk/network environment is the problem, you can test with local disks (Storage vMotion all volumes to them).
And use NTP servers for time sync on each VMware host and VM:
http://kb.vmware.com/selfservice/micros ... nalId=1318
For problem 2, if problem 1 cannot be solved:
Extend the heartbeat timeout:
http://technet.microsoft.com/en-us/libr ... S.10).aspx
You can find the entry here: "NLB assumes that a host is functioning normally within a cluster as long as it participates in the normal exchange of heartbeat messages between it and the other hosts. If the other hosts do not receive a message from a host for several periods of heartbeat exchange, they initiate convergence. The number of missed messages required to initiate convergence is set to five by default (but can be changed)."
http://technet.microsoft.com/de-de/libr ... S.10).aspx
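To make the heartbeat arithmetic concrete, here is a rough sketch of how long a node can stay frozen before the other hosts start convergence, and how many missed messages you would have to tolerate to ride out a given freeze. The default of five missed messages comes from the text quoted above; the 1000 ms heartbeat period and the names AliveMsgPeriod and AliveMsgTolerance are my assumptions about the NLB settings involved, so verify them against the Microsoft documentation before changing anything.

```python
# Sketch only: estimates NLB convergence timing from two heartbeat settings.
# ASSUMPTION: heartbeat period of 1000 ms (believed to be AliveMsgPeriod).
# The tolerance of 5 missed messages is the default quoted in the post
# (believed to be AliveMsgTolerance).
DEFAULT_PERIOD_MS = 1000
DEFAULT_TOLERANCE = 5


def convergence_delay_ms(period_ms=DEFAULT_PERIOD_MS, tolerance=DEFAULT_TOLERANCE):
    """Approximate freeze duration after which the other hosts start convergence."""
    return period_ms * tolerance


def tolerance_needed(freeze_ms, period_ms=DEFAULT_PERIOD_MS):
    """Smallest missed-message count whose window is strictly longer than the freeze."""
    return freeze_ms // period_ms + 1


if __name__ == "__main__":
    print("Default window (ms):", convergence_delay_ms())
    # Hypothetical example: the VM is frozen for 8 seconds during snapshot removal.
    print("Tolerance needed for an 8 s freeze:", tolerance_needed(8000))
```

Measure the real freeze duration first (for example from ping loss during a manual snapshot delete) and only then pick a value, since a larger tolerance also delays the detection of a host that is genuinely down.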
In my life before Veeam I saw a lot of problems with NLB unicast mode. If you use it, I recommend changing it to IGMP multicast together with your network specialist, because it requires some changes in your network.
Windows NLB is maybe not the best way to cluster CAS servers, because NLB is not service (Exchange) aware. It only looks at the network, not at whether Exchange CAS is actually running behind it.
A hardware load balancer also checks service availability.
Let me say again that this is an infrastructure problem, not a Veeam Backup & Replication software problem. Veeam uses standard VMware snapshots for the backup. If these snapshots don't work, I recommend analysing this together with VMware, your storage vendor and your infrastructure service contractor.
I hope this information helps you fix your problem.