Host-based backup of VMware vSphere VMs.
habibalby
Veteran
Posts: 391
Liked: 32 times
Joined: Jul 18, 2011 9:30 am
Full Name: Hussain Al Sayed
Location: Bahrain
Contact:

Re: VSS timeout with Exchange 2010

Post by habibalby »

jdmarchand wrote:I have been closely looking at this thread for a while now.
We run a VMware farm with several Exchange 2010 VMs, all on disjoint networks.

We wanted to keep using Veeam backups over ViX so that there is no direct connectivity to our backup infrastructure.

We tried everything in:
http://www.veeam.com/kb1680

My results are that usually after restarting COM+ and the Information store in order to re-register the Exchange VSS writer, backups would work and fail again afterwards. Only thing I was reluctant to try is direct network connectivity, since I wanted to keep everything nicely separated and not introduce a layer of complexity (routing + firewalling my backup network)

Today, I decided to test direct network connectivity to all the tenants' Exchange servers. All the Exchange servers worked on the first try, which had never happened before.

Clearly, there is a latency involved in making the VSS go from Veeam to vCenter -> host -> tools -> VSS and back breaking the 20 seconds hardcoded limit.

Hope this helps,

JD
My Exchange 2007 backup had been failing for a long time, and I had tried every configuration workaround to make it succeed. Surprisingly, it started working after I migrated my Veeam Backup Server to a new physical server. The new physical server has four connections: two to the iSCSI network targets, one to the vSphere management network, and one to our production network, because this machine is joined to the domain and I require domain authentication for VSS. My Exchange server is also on the production network. I configured the new physical machine as a backup proxy as well and purposely left the proxy mode on Automatic. Since then the Exchange backup always succeeds, and the proxy mode is always Network [nbd].

Thanks,
jdmarchand
Novice
Posts: 6
Liked: 2 times
Joined: Jan 27, 2012 6:14 pm
Full Name: JD
Contact:

Re: VSS timeout with Exchange 2010

Post by jdmarchand »

habibalby wrote: I left the proxy mode to Automatic. After that the Exchange backup always success and the proxy mode always as Network [nbd].

Thanks,
In our case, in order to have a decent backup window, we need to use SAN mode; nbd was not fast enough. Our proxy servers are on the iSCSI fabric.
artecu
Influencer
Posts: 23
Liked: 2 times
Joined: Mar 23, 2012 6:52 am
Full Name: Andrew R Tilbury
Contact:

Re: VSS timeout with Exchange 2010

Post by artecu »

It's a long shot, I know, but when I fixed my lab problem it coincidentally fixed MY Exchange backup timeouts as well. Please refer to my topic http://forums.veeam.com/viewtopic.php?f=24&t=15729 for our solution (dvSwitch code level).
habibalby
Veteran
Posts: 391
Liked: 32 times
Joined: Jul 18, 2011 9:30 am
Full Name: Hussain Al Sayed
Location: Bahrain
Contact:

Re: VSS timeout with Exchange 2010

Post by habibalby » 1 person likes this post

jdmarchand wrote: In our case, in order to have a decent backup window, we need to use SAN mode, nbd was not fast enough. Our proxy servers are on the iSCSI fabric
If your vSphere management network is different from the network where Exchange resides, create a proxy with two interfaces, one on the vSphere network and the other on the Exchange production network. Keep the proxy in Auto mode and keep the job in Auto proxy mode. I'm sure the job will pick up the proxy that has two interfaces :) I can vouch for that; it works really well in my environment.

Thanks,
jdgs
Influencer
Posts: 16
Liked: 2 times
Joined: Oct 11, 2012 8:42 am
Full Name: Jack
Contact:

Re: VSS timeout with Exchange 2010

Post by jdgs »

This problem is absolutely driving me mad at present. In our virtual environment we currently have 3 or 4 different Exchange 2010 deployments, ranging from single server through to a six server fully redundant exchange. We are having recurring issues with a single exchange server environment failing with this error, whilst all others succeed every time. What makes this all the more frustrating is that the backup will fail, I will hit retry, it usually fails again, then I hit retry again (without changing anything), and it will complete successfully. I have logged this a few times with Veeam, however every time I seem to log it, they get it "working", only for the same behaviour to emerge again a few days later. :evil:

I meant to add, I have tried everything in the Veeam kb article, and have opened up direct communication as well.
jdmarchand
Novice
Posts: 6
Liked: 2 times
Joined: Jan 27, 2012 6:14 pm
Full Name: JD
Contact:

Re: VSS timeout with Exchange 2010

Post by jdmarchand » 1 person likes this post

It's all about getting it to run in the "timeout" window. That's why you get it to work sometimes. If you are running a virtual environment, I suggest using a virtual router like a vyatta to route your Veeam servers to the Exchange server subnet (This could also be done with a dedicated physical router, but vyatta is free and fast).

Once I did this, never any issues again.
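
A rough way to picture the "timeout window": the VSS freeze has to cover every hop the request passes through plus the snapshot creation itself. The sketch below is only an illustration. The function name, hop names and timings are made up for the example; only the 20-second limit comes from this thread.

```python
# Illustrative model only: hop names and timings are hypothetical assumptions.
VSS_FREEZE_LIMIT_SECONDS = 20.0  # the hardcoded limit discussed in this thread


def fits_in_freeze_window(hop_seconds, snapshot_seconds,
                          limit=VSS_FREEZE_LIMIT_SECONDS):
    """Return (total, ok): time spent inside the freeze and whether it fits."""
    total = sum(hop_seconds.values()) + snapshot_seconds
    return total, total <= limit


# Hypothetical timings for a request relayed through vCenter, host and tools.
via_vcenter = {"vcenter": 6.0, "host": 4.0, "tools": 3.0}
# Hypothetical timings with direct (routed) network connectivity to the guest.
direct = {"network": 1.0}

print(fits_in_freeze_window(via_vcenter, snapshot_seconds=9.0))  # (22.0, False)
print(fits_in_freeze_window(direct, snapshot_seconds=9.0))       # (10.0, True)
```

With the same snapshot time, the only difference between failing and passing in this toy model is the overhead of the path, which would also explain why a plain retry sometimes gets through.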
McClane
Expert
Posts: 106
Liked: 11 times
Joined: Jun 20, 2009 12:47 pm
Contact:

Re: VSS timeout with Exchange 2010

Post by McClane »

Do the servers run on SP3 RU1 already? I had problems with SP2 servers (not virtualized) and VSS timeouts. They seem to be gone after SP3 RU1.
jdmarchand
Novice
Posts: 6
Liked: 2 times
Joined: Jan 27, 2012 6:14 pm
Full Name: JD
Contact:

Re: VSS timeout with Exchange 2010

Post by jdmarchand »

You can get this on any VSS snapshot. I get the error on domain controllers, SQL, 2008R2, 2012.

Exchange is simply more "plagued" because the VSS snap takes more time on IO contended volumes.
jdgs
Influencer
Posts: 16
Liked: 2 times
Joined: Oct 11, 2012 8:42 am
Full Name: Jack
Contact:

Re: VSS timeout with Exchange 2010

Post by jdgs »

Yes, that is true. However, it seems strange to me that in the last month this is the ONLY server on which this has occurred, and it is nowhere near the most heavily loaded box.
jdgs
Influencer
Posts: 16
Liked: 2 times
Joined: Oct 11, 2012 8:42 am
Full Name: Jack
Contact:

Re: VSS timeout with Exchange 2010

Post by jdgs »

McClane wrote: Do the servers run on SP3 RU1 already? I had problems with SP2 servers (not virtualized) and VSS timeouts. They seem to be gone after SP3 RU1.
It is indeed running SP2. Thank you very much for your input. I am going to schedule an upgrade to SP3, hopefully this will help.
jdgs
Influencer
Posts: 16
Liked: 2 times
Joined: Oct 11, 2012 8:42 am
Full Name: Jack
Contact:

Re: VSS timeout with Exchange 2010

Post by jdgs »

Unfortunately I have upgraded the server to SP3 RU1 and the issue still persists. Anyone else have suggestions?
Jamie Pert
Enthusiast
Posts: 68
Liked: 2 times
Joined: Jun 14, 2012 10:56 am
Full Name: Jamie Pert
Location: twitter.com/jam1epert
Contact:

Re: VSS timeout with Exchange 2010

Post by Jamie Pert »

Faster storage, or storage under less load, is key. I have been plagued by this issue on multiple servers in multiple environments. The fact is that if we put the Exchange server on its own SAN just for testing purposes, it wouldn't happen: because the SAN's load is low, the snapshot won't hit any snags and will complete within the 20-second window.

The trouble is that most clients have multiple VMs on one SAN (which is obviously what's meant to happen). Exchange gets busier as time goes on, and on the whole the I/O load on the SAN grows and grows. I guess the key is to get the job to run at the time when the SAN is as 'free' as it can be.

What annoys me is that I reckon if the 20-second window could be hacked to a 30-second window, this thread on the forum would be less than half its size.

Once in a while the Exchange servers are fine, but there's no obvious pattern; it's obviously just when the SAN can cope with the jump in I/O associated with snapshotting.
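
To make the "run when the SAN is as free as it can be" idea concrete, here is a minimal sketch that picks a job start hour from hourly load samples. Everything in it is an assumption for illustration: the function name is hypothetical, and the IOPS numbers are invented, not measured on any real SAN.

```python
def quietest_start_hour(iops_by_hour, job_hours=1):
    """Return the start hour (0-23) whose job window has the lowest total IOPS.

    iops_by_hour: 24 average IOPS values, index = hour of day.
    job_hours: how many consecutive hours the backup job is expected to run.
    """
    if len(iops_by_hour) != 24:
        raise ValueError("expected 24 hourly samples")

    def window_load(start):
        # Wrap around midnight so a job starting at 23:00 is scored correctly.
        return sum(iops_by_hour[(start + i) % 24] for i in range(job_hours))

    return min(range(24), key=window_load)


# Invented sample data: a busy SAN with a quiet spell in the early morning.
samples = [900] * 24
samples[2], samples[3], samples[4] = 300, 250, 400

print(quietest_start_hour(samples, job_hours=2))  # 2
print(quietest_start_hour(samples, job_hours=1))  # 3
```

In practice the samples would have to come from whatever performance counters your storage exposes, collected over a few weeks rather than a single day.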
@jam1epert on Twitter
Gostev
Chief Product Officer
Posts: 31641
Liked: 7132 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: VSS timeout with Exchange 2010

Post by Gostev »

Jamie Pert wrote: The trouble is that most clients have multiple VMs on one SAN (which is obviously what's meant to happen). Exchange gets busier as time goes on, and on the whole the I/O load on the SAN grows and grows.
I totally agree. We have also accepted this as the fact. We have built a prototype of a solid workaround for this issue, something that will work even with slowest production storage. But it touches quite a few things around restores, so we cannot release this as a patch... hopefully our next release timelines will give us enough time to include this fix.
Andreas Neufert
VP, Product Management
Posts: 6958
Liked: 1472 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: VSS timeout with Exchange 2010

Post by Andreas Neufert »

I saw an article on TechNet regarding Exchange VSS Event Log error 1296.
Maybe this is helpful for some of you:
External Link: Resolving error code 1296 during Exchange backups
larry
Veteran
Posts: 387
Liked: 97 times
Joined: Mar 24, 2010 5:47 pm
Full Name: Larry Walker
Contact:

Re: VSS timeout with Exchange 2010

Post by larry » 3 people like this post

I posted on this thread long ago, and our fix was to add more memory and CPU to the Exchange server. We used to get this randomly, but after adding resources we haven't had it in a year.
mrt
Enthusiast
Posts: 53
Liked: 2 times
Joined: Feb 10, 2011 7:27 pm
Contact:

Re: VSS timeout with Exchange 2010

Post by mrt »

Gostev wrote: I totally agree. We have also accepted this as the fact. We have built a prototype of a solid workaround for this issue, something that will work even with slowest production storage. But it touches quite a few things around restores, so we cannot release this as a patch... hopefully our next release timelines will give us enough time to include this fix.
I don't suppose the fix you mention here is part of v7 R2?
soylent
Enthusiast
Posts: 61
Liked: 7 times
Joined: Aug 01, 2012 8:33 pm
Full Name: Max
Location: Fort Lauderdale, Florida
Contact:

Re: VSS timeout with Exchange 2010

Post by soylent »

mrt wrote: I don't suppose the fix you mention here is part of v7 R2?
There is this bullet point in the release notes:

Added ability for application-aware processing logic to detect passive Microsoft Exchange DAG database present on the VM, and process it accordingly.
Gostev
Chief Product Officer
Posts: 31641
Liked: 7132 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: VSS timeout with Exchange 2010

Post by Gostev »

No, that was to address a different issue with item-level recovery. As I've noted, the fix that we have developed for the issues discussed in this thread cannot be shipped as a part of the update due to the scope of changes required. And, this issue has nothing to deal with DAG anyway, and can affect standalone Exchange servers as well.
james575
Novice
Posts: 8
Liked: never
Joined: Jun 18, 2013 4:07 pm
Contact:

Re: VSS timeout with Exchange 2010

Post by james575 »

KB1680 says that you have only 20 seconds to snapshot the VM between the start of the freeze and the unfreezing by design. If I do a regular snapshot creation in vCenter of our single Exchange 2010 SP3 server, it takes around 5 minutes, while the KB is suggesting it should take a couple seconds. I am dumbfounded by this--are people here finding snapshots typically take only a few seconds to create??? I have only worked with this one vSphere environment so figured a 5 minute snapshot is normal.
lobo519
Veteran
Posts: 315
Liked: 38 times
Joined: Sep 29, 2010 3:37 pm
Contact:

Re: VSS timeout with Exchange 2010

Post by lobo519 » 1 person likes this post

james575 wrote:KB1680 says that you have only 20 seconds to snapshot the VM between the start of the freeze and the unfreezing by design. If I do a regular snapshot creation in vCenter of our single Exchange 2010 SP3 server, it takes around 5 minutes, while the KB is suggesting it should take a couple seconds. I am dumbfounded by this--are people here finding snapshots typically take only a few seconds to create??? I have only worked with this one vSphere environment so figured a 5 minute snapshot is normal.
Do you have the option "Snapshot virtual machine's memory" selected when you do this manually?
james575
Novice
Posts: 8
Liked: never
Joined: Jun 18, 2013 4:07 pm
Contact:

Re: VSS timeout with Exchange 2010

Post by james575 »

lobo519 wrote: Do you have the option "Snapshot virtual machine's memory" selected when you do this manually?
Yes. I have always selected this for every snapshot I think I have ever made. After reading your question, I tried snapshotting the Exchange server with that option not selected, and it took two seconds! I always chose the memory option because I thought that would be the safer/better method, but it looks like I need to read up on that some. Thank you!

(As an informational aside for anyone reading this thread, our VSS writer error was fixed by turning on the log truncating for Exchange in the Veeam replication job.)
massimiliano.rizzi
Service Provider
Posts: 211
Liked: 27 times
Joined: Jan 24, 2012 7:56 am
Full Name: Massimiliano Rizzi
Contact:

Re: VSS timeout with Exchange 2010

Post by massimiliano.rizzi »

Hello experts,

we are facing the exact same problem when processing a Windows SBS 2008 VM using Application-Aware Image Processing:

==================================================
Image
==================================================

Based on my understanding, in our scenario the time-out occurs before VMware even tries to create the snapshot, thus before the 20 seconds timeout kicks in.

Is my understanding correct?

BTW, I also have opened a Support Case (# 00482742) in order to take a deeper look into this.

Thank you in advance for your support.
elliott
Influencer
Posts: 11
Liked: 7 times
Joined: Jul 06, 2011 12:43 am
Contact:

Re: VSS timeout with Exchange 2010

Post by elliott » 1 person likes this post

Yes, the problem is the VSS snapshot timeout.

To save you some time here, just turn off the requirement for successful application-aware processing on that VM (so the job will complete with a warning and you will have a crash-consistent backup). You can try all the suggestions in the thread, but nothing has worked for our clients. (About 80% of our clients who use SBS and have bought Veeam have this problem.) I've played with the v7 trial and it doesn't fix it.

I hope one day Veeam will patch this. There is plenty of blame game (it's all Microsoft's fault), but at the end of the day other backup software doesn't have this problem, so Veeam needs to find a way to work around the issue.
Jamie Pert
Enthusiast
Posts: 68
Liked: 2 times
Joined: Jun 14, 2012 10:56 am
Full Name: Jamie Pert
Location: twitter.com/jam1epert
Contact:

Re: VSS timeout with Exchange 2010

Post by Jamie Pert »

Annoyingly, we had to turn on circular logging on the Exchange server and then disable application-aware processing and log truncation. Far from ideal, but at least we get backups.

In my experience, Veeam backing up Exchange 2010 servers is not a great combination. I wish I had more time and more resources to get a full understanding of the point at which the 20-second timeout occurs. It's so frustrating to think that a 20-second window causes all this; a 5 or 10 second increase to the allowance and we could all be happy.
@jam1epert on Twitter
Gmc85
Novice
Posts: 9
Liked: never
Joined: Dec 28, 2011 1:05 pm
Full Name: George
Contact:

Re: VSS timeout with Exchange 2010

Post by Gmc85 »

On our exchange mailbox servers we were encountering the vss_ws_failed_at_freeze error.

Our mailbox databases were sitting on RDMs. We've since changed them over to normal VMware hard disks, and I've not had a failed backup since (Hurrah!).
tfloor
Veteran
Posts: 270
Liked: 15 times
Joined: Jan 03, 2012 2:02 pm
Full Name: Tristan Floor
Contact:

Re: VSS timeout with Exchange 2010

Post by tfloor »

I have a support ticket open because I see the same thing with a remote SQL server.
yaroon
Novice
Posts: 5
Liked: never
Joined: Jun 16, 2011 8:08 am
Full Name: Jeroen Beekhuis

Re: VSS timeout with Exchange 2010

Post by yaroon »

Our Exchange 2010 test servers were having problems completing the snapshot creation within 20 seconds. It turns out that the number of vdisks in our config greatly influences the time it takes to create a snapshot. Our config has 4 standard disks (OS, App&Data, Pagefiles, WSB-backup, the last now obsolete) and 8 Exchange data disks (all VMDK). Taking a snapshot in vCenter takes 20+ seconds, and even when the VM is powered down it takes 15 seconds.
When virtual disks are removed from the VM, this time is reduced. So I reshuffled the database partitions from 1 database per disk to 4 databases per disk (using 4 partitions), ending up with 2 database disks instead of 8. As this is our Exchange test environment, database size was not an issue (100GB), but with the new vSphere 5.5, which breaks the 2TB VMDK limit, this could work for larger databases as well.
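
If you want to compare disk layouts the same way, a small timing wrapper is enough. This is a minimal sketch, not tied to any particular API: `create_snapshot` is a placeholder for whatever callable actually triggers the snapshot in your environment, and the helper name is hypothetical.

```python
import time


def time_snapshot(create_snapshot, limit_seconds=20.0):
    """Run create_snapshot() and report (elapsed_seconds, within_limit)."""
    start = time.perf_counter()
    create_snapshot()  # placeholder callable; blocks until the snapshot exists
    elapsed = time.perf_counter() - start
    return elapsed, elapsed <= limit_seconds


# Stand-in for a real snapshot call, so the sketch runs on its own.
elapsed, ok = time_snapshot(lambda: time.sleep(0.05))
print(f"snapshot took {elapsed:.2f}s, within 20s window: {ok}")
```

Running it once per layout (for example 12 disks versus 6) gives numbers you can compare directly against the 20-second window.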
Andreas Neufert
VP, Product Management
Posts: 6958
Liked: 1472 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: VSS timeout with Exchange 2010

Post by Andreas Neufert »

james575 wrote: Yes. I have always selected this for every snapshot I think I have ever made. After reading your question, I tried snapshotting the Exchange server with that option not selected, and it took two seconds! I always chose the memory option because I thought that would be the safer/better method, but it looks like I need to read up on that some. Thank you!

(As an informational aside for anyone reading this thread, our VSS writer error was fixed by turning on the log truncating for Exchange in the Veeam replication job.)

Backup systems do not use the "Snapshot the virtual machine's memory" option. The idea is to bring the disks into a consistent state (VSS).
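
For anyone scripting a manual test snapshot, the difference is a single flag. The sketch below assumes pyVmomi, where `vm` would be a `vim.VirtualMachine` object and `CreateSnapshot_Task` takes `name`, `description`, `memory` and `quiesce`. The helper names are hypothetical, and this is not how Veeam itself drives VSS; it only mirrors the manual test with the memory option switched off.

```python
def disk_only_snapshot_args(name, description="", quiesce=False):
    """Arguments for a snapshot that skips the VM's memory.

    memory=False is the setting that took about two seconds in the test
    described in this thread; quiesce is left as a caller choice.
    """
    return {
        "name": name,
        "description": description,
        "memory": False,   # do not dump guest RAM into the snapshot
        "quiesce": quiesce,
    }


def create_disk_only_snapshot(vm, name):
    """Start the snapshot task on a VM object (assumed to be pyVmomi's)."""
    return vm.CreateSnapshot_Task(**disk_only_snapshot_args(name))


print(disk_only_snapshot_args("manual-test"))
```

Obtaining `vm` (connecting to vCenter or a host and looking up the VM) is left out on purpose, since that part depends entirely on your environment.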
Andreas Neufert
VP, Product Management
Posts: 6958
Liked: 1472 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: VSS timeout with Exchange 2010

Post by Andreas Neufert » 2 people like this post

Hello Everybody.

Just want to add again my 2 cents here.

Feedback from my customers was that in most cases adding more vCPUs and RAM to the server fixed this problem, because Exchange can then process its consistency faster and you do not run into the 20-second timeout (see above).

Reducing the number of vdisks can help snapshots process faster.
If your vCenter is too slow, it can help to add the ESXi host as a managed server and select the Exchange server through this ESXi host.
All Exchange disks need to stay on fast disks. Do not put the OS disk on a datastore with many other OS disks from different VMs.

Here are my updated general recommendations for Exchange/Exchange DAG on VMware. (Not all of these are related to the 20-second VSS timeout):

1. Increase the DAG heartbeat time to avoid cluster failover (no reboot or service restart needed):
cluster /prop SameSubnetDelay=2000:DWORD
cluster /prop CrossSubnetDelay=4000:DWORD
cluster /prop CrossSubnetThreshold=10:DWORD
cluster /prop SameSubnetThreshold=10:DWORD
2. Use the new Veeam Storage Snapshot feature (StoreVirtual/3PAR/VSA) if you can (after the v7 release). This reduces snapshot lifetime to a few seconds, so there is no load and no problem at commit because there is less data. (This option can be counterproductive if you experience the 20-second VSS timeout.)
3. If you have problems with cluster failover during backup, one option is to back up DAG member(s) that hold only inactive databases (no cluster failover because there are no active databases). Log file truncation will be replicated by Exchange across the whole DAG. This also gives you the option to restart the server or services, after which Exchange processes VSS consistency faster. If you restart the services, take care to wait long enough afterwards for the Exchange VSS writers to come up again before you back up.
4. To reduce snapshot commit time (and to reduce data in the snapshot), try to avoid any changes during the backup window (users, background processes, antivirus, ...). Also try to avoid them on all LUNs on the storage system itself (faster writes at snapshot commit).
5. If you cannot avoid many block-level changes during your backup window, use forward incremental or, if you need space, forward incremental with daily transform into rollbacks. Reverse incremental takes a bit longer than the other backup methods, which means a longer snapshot lifetime and more changes in the snapshot to commit.
6. To reduce snapshot lifetime and the amount of data to commit, use the new parallel processing with enough resources to back up all of your disks at the same time (after the v7 release).
7. To reduce the backup window and snapshot lifetime, use Direct SAN mode with the minimum needed disks connected to the selected proxy. If that is not possible, use NBD mode with 10GbE. (Do not run the proxy in Autoselect mode.) Disable VDDK logging for Direct SAN mode if your backups themselves run stably (ask support for the registry key and consequences).
8. Use current VMware versions (the newest VADP/VDDK kits contain a lot of updates) and current Veeam versions (newer VDDK integration). And install current ESXi/vCenter patches!
9. Use at least VMware vSphere 5.0 because of changes in snapshot placement and background processing.
10. If you are still facing problems, use faster disks for all of the VM disks (also the OS disk!!!).
11. Fewer VM disks can help to reduce snapshot creation (and commit?) time.
12. To avoid VSS timeouts (hardcoded 20 seconds in the Exchange VSS writer), check your vCenter load and optimize it (or use direct ESX(i) connections for Veeam VM selection), so that snapshot creation takes less time.
13. If you are facing VM downtime because of snapshot commit, another option (unsupported by VMware) may be to change the VM setting snapshot.maxConsolidateTime = "1" (in seconds) (see discussion above).
14. If you see Exchange VSS timeout EventLog 1296, change the log setting: Set-StorageGroup -Identity "<yourstoragegroup>" -CircularLoggingEnabled $false
15. VSS timeout problems: add more CPU/memory to the VM.
16. Check the health and configuration of Exchange itself. I have seen some installations where various problems ended up causing high CPU utilization in the indexing service, which prevented VSS from working correctly. Also check all the mail transport cache settings. Sometimes the caches replicate shadow copies of the mails over and over again and nothing commits them (if you have multiple transport services with firewalls between them).
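
As a quick sanity check on item 1, the heartbeat values can be turned into a tolerance in seconds. This is a sketch under one assumption: that a node is only declared down after Threshold consecutive heartbeats, sent every Delay milliseconds, have been missed. The helper name is hypothetical; the values are the ones from the cluster commands in item 1.

```python
def heartbeat_tolerance_seconds(delay_ms, threshold):
    """Seconds of missed heartbeats tolerated before a failover is triggered.

    Assumes tolerance = delay between heartbeats x number of misses allowed.
    """
    return delay_ms * threshold / 1000.0


# (delay in ms, threshold) as set by the cluster /prop commands in item 1.
settings = {
    "SameSubnet": (2000, 10),
    "CrossSubnet": (4000, 10),
}

for name, (delay_ms, threshold) in settings.items():
    print(f"{name}: {heartbeat_tolerance_seconds(delay_ms, threshold)} s")
# SameSubnet: 20.0 s
# CrossSubnet: 40.0 s
```

Under that assumption, the same-subnet tolerance comes out at 20 seconds, so a short stun during snapshot creation or commit should not by itself trigger a failover.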
cliffm
Enthusiast
Posts: 41
Liked: 4 times
Joined: Jun 03, 2011 12:41 am
Full Name: Cliff Meakin
Contact:

Re: VSS timeout with Exchange 2010

Post by cliffm »

We have had this problem for years and have spent many weekends on it. We bought Veeam with 5 years of support, so obviously we are sticking with it. I went through the whole process of upgrading the entire infrastructure to faster, bigger, better physical hardware and networks. No dice. The only thing I have ever found that works is to turn vCenter Server off. If there is a vCenter Server running anywhere on the networks, the VSS times out. This is strange because none of the Veeam backups or replicas I have configured use vCenter Server VIX connections; they are all configured directly through standalone hosts. So now my only use of vCenter Server is to turn it on to do host upgrades. Then I turn it off. Then my backups and replicas work.