We recently upgraded to Exchange 2010 (single server, 18 users, nothing fancy).
At first, Veeam with VSS was backing up just fine. Then, I migrated all 18(!) users from 2007 to 2010.
From that moment on, the first backup always failed on VSS, but the first retry always went okay. So, not really a big problem.
However, after several days backups started failing on VSS completely.
So, I rebooted the machine and the backup went okay once.
So, several days later I rebooted once more and the backup kept failing. I retried a gazillion times and the millionth time the backup went okay again.
Now, two days in a row the backup went okay the first time (without retry)... And now it's failing completely (on VSS) again.
Rebooting the machine or restarting the Exchange services sometimes helps, but mostly it doesn't.
So, in 1% of the cases the backup goes okay and 99% it fails...
This is really annoying me. I also tried increasing the memory assigned to Exchange (a suggestion I read somewhere on the internet), but that didn't help either.
The host machine is really very powerful and only hosts a handful of VMs.
The error reported by Veeam is:
"Freezing guest operating system
Unfreeze error: [Backup job failed.
Cannot create a shadow copy of the volumes containing writer's data.
A VSS critical writer has failed. Writer name: [Microsoft Exchange Writer]. Class ID: [{76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}]. Instance ID: [{0db23250-4d1e-42c1-8d14-2be32f448184}]. Writer's state: [VSS_WS_FAILED_AT_FREEZE]. Error code: [0x800423f2].] "
The Windows event logs have the following to say
(each item is a new entry in the event log, with only 0 to 90 seconds between them):
- Exchange VSS Writer (instance dd3a9020-cd8a-410b-b91c-605388a497f6) has prepared for backup successfully.
- Information Store (3460) Shadow copy instance 14 starting. This will be a Full shadow copy. For more information, click http://www.microsoft.com/contentredirect.asp.
- Exchange VSS Writer (instance 14) has successfully prepared the database engine for a full or copy backup of database 'Public Folders 2010'.
- Exchange VSS Writer (instance 14) has successfully prepared the database engine for a full or copy backup of database 'Mailbox Database 2058872270'.
- Exchange VSS Writer (instance dd3a9020-cd8a-410b-b91c-605388a497f6:14) has prepared for Snapshot successfully.
- Information Store (3460) Shadow copy instance 14 freeze started.
- Information Store (3460) Mailbox Database 2058872270: Shadow copy instance 14 freeze started.
- Information Store (3460) Public Folders 2010: Shadow copy instance 14 freeze started.
- Exchange VSS Writer (instance dd3a9020-cd8a-410b-b91c-605388a497f6:14) has frozen the database(s) successfully.
- Information Store (3460) Shadow copy instance 14 freeze ended.
- Information Store (3460) Shadow copy instance 14 aborted. (EventID 2007)
(note: the event log entries above are not the most recent; the VSSAdmin output below is, which is why the instance IDs differ)
Writer name: 'Microsoft Exchange Writer'
Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
Writer Instance Id: {0db23250-4d1e-42c1-8d14-2be32f448184}
State: [9] Failed
Last error: Timed out
So, it only takes a few seconds to go from 'working' to 'timed out'. It's not trying for that long, so I wonder if a timeout is the real reason for the error.
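To see whether the writer is already in a failed state before a job starts (rather than only after Veeam reports it), the VSSAdmin output can be checked from a script. Below is a minimal Python sketch, not a tested tool: it assumes the output looks like the block pasted above (a "Writer name:" line followed by "key: value" lines). The function names parse_writers, failed_writers and list_writers are made up for illustration, and SAMPLE is just the output from this post.

```python
import subprocess

# Sample text copied from the VSSAdmin output quoted in this post.
SAMPLE = """\
Writer name: 'Microsoft Exchange Writer'
   Writer Id: {76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}
   Writer Instance Id: {0db23250-4d1e-42c1-8d14-2be32f448184}
   State: [9] Failed
   Last error: Timed out
"""


def parse_writers(text):
    """Turn 'vssadmin list writers' style text into a list of dicts.

    Assumption: each writer starts with a 'Writer name:' line and is
    followed by 'key: value' lines, as in the output shown above.
    """
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue  # banner or blank line
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Writer name":
            current = {"name": value.strip("'")}
            writers.append(current)
        elif current is not None:
            current[key] = value
    return writers


def failed_writers(writers):
    """Return only the writers whose State text contains 'Failed'."""
    return [w for w in writers if "Failed" in w.get("State", "")]


def list_writers():
    """Run vssadmin on the local machine (needs an elevated prompt)."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"],
        capture_output=True, text=True, check=True,
    )
    return parse_writers(result.stdout)


def report(writers):
    """Print one line per failed writer."""
    for w in failed_writers(writers):
        print(f"{w['name']}: {w.get('State')} / {w.get('Last error')}")
```

Calling report(parse_writers(SAMPLE)) works on the pasted text; on the Exchange server itself, report(list_writers()) could be run before each backup attempt to see whether the writer was already failed going in.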