Dave338
Enthusiast
Posts: 40
Liked: 5 times
Joined: Jan 27, 2015 12:21 pm
Full Name: David
Contact:

Exchange DAG node restore

Post by Dave338 »

Hello.

We updated our 2-node Exchange 2016 DAG cluster to CU19. The first node (the passive one, which gets a full-machine Veeam backup) went fine, but the second node (our usual active node, where we only back up drive C to preserve the OS) is broken. Something went wrong during the update and now some Exchange services don't start, although the replication service starts and the databases apparently continue to replicate well.

So what is the correct method to recover from this type of issue?

Can I restore an OS backup from before the updates were applied? (C: (OS) is the only drive I can restore, because this is normally the node that holds all the active copies of the databases.) Or do I need to remove this node from the DAG, build a new machine (with the same name and IP address), and install Exchange in recovery mode (so a new node from scratch)?

I'm in a bit of a hurry with this... I have spent nearly all day upgrading the server and now it is dead, thanks MS for your great updates.

Best Regards.
Dave.
HannesK
Product Manager
Posts: 14839
Liked: 3085 times
Joined: Sep 01, 2014 11:46 am
Full Name: Hannes Kasparick
Location: Austria
Contact:

Re: Exchange DAG node restore

Post by HannesK »

Hello,
What about a full VM restore? post402808.html?hilit=exchange%20dag%20restore#p402808

The link from Andreas has more details.

Or are you stuck in the middle of an upgrade and the backup is from before?

Best regards,
Hannes
Andreas Neufert
VP, Product Management
Posts: 7077
Liked: 1510 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: Exchange DAG node restore

Post by Andreas Neufert »

I am not sure what you actually want to do.
Are you saying that you want to restore the boot OS with the Exchange installation to an older state without touching the databases on the same VM?
I would advise against it, as it would bring the databases and the Exchange patch level out of sync.

If you have the space, leave the situation as is, create a new Exchange server, and replicate the databases from the working server to it.
Once the data is synced there, you can discuss with Microsoft how to remove the server that is not working well or how to repair it.

Another important point: do not just "remove" the server manually and clean up manually within DNS/ADFS/AD, as this is not supported and can leave your environment in a completely unsupported state (AD not supported anymore). The removal needs to follow documented procedures.

In general, this is a good example of why you should always back up the databases together with the server. In that case you could "just" recover the whole server to an older state: the restore would bring up the VM with its network and place the databases in a VSS recovery state. Exchange will detect this situation and replicate updates to the system.
Dave338

Re: Exchange DAG node restore

Post by Dave338 »

Well, I'll explain what I've done today. :)

As I said yesterday, the broken node ("B") is usually the active node (all database copies mounted), and with Veeam we only back up the "C" disk (OS + Exchange installation) of this machine.
The passive node ("A") has a full VM backup. We back up both nodes in separate jobs every 4 hours.

After the failure of node "B", I left it powered off last night.
This morning, I restored the OS drive of the failing node from the last backup taken while it was still working (yesterday at 8am), leaving the other drives (swap, databases) untouched. The server booted up normally, did some AD sync while Windows was starting, and after a few minutes had synced all databases from node "A" correctly.
At this point everything was working OK, but as Andreas stated, the server version shown in the console was the one that caused the disaster (CU19 build), while the actual server was on the CU15 build.
I repeated the upgrade procedure, but in a different order: first the CU19 update plus the security patch for CU19, and afterwards the other Windows patches. The version in the administrative center now matches the real version again.
Everything is working perfectly, the server is the active node again, and it is accepting client connections as always (I can see it in the KEMP load balancer).
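
The version mismatch described above (console showing CU19 while the restored OS drive was back on CU15) can be checked with a small comparison of build strings. This is only a minimal sketch in Python: the function names are hypothetical, both build strings have to be read off manually (one from the console, one from the server itself), and the build numbers in the usage lines are example inputs that should be verified against Microsoft's published Exchange build table.

```python
from typing import Tuple


def parse_build(build: str) -> Tuple[int, ...]:
    """Turn a dotted build string such as '15.1.2176.2' into a tuple of ints.

    Leading zeros are ignored, so '15.01.2176.002' parses to the same tuple.
    """
    return tuple(int(part) for part in build.strip().split("."))


def compare_builds(recorded: str, installed: str) -> str:
    """Compare the build shown in the console with the build actually installed.

    Returns 'match', 'installed-older' or 'installed-newer'.
    """
    recorded_t = parse_build(recorded)
    installed_t = parse_build(installed)
    if installed_t == recorded_t:
        return "match"
    # Tuples compare element by element, which matches dotted-version ordering.
    return "installed-older" if installed_t < recorded_t else "installed-newer"


# Example inputs (assumed values, entered by hand):
console_build = "15.1.2176.2"   # build the console reports (CU19 in this thread)
server_build = "15.1.1913.5"    # build on the restored OS drive (CU15 in this thread)
print(compare_builds(console_build, server_build))
```

A result of 'installed-older' corresponds to the state right after the OS drive restore, where re-running the update brought both values back in line.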

So for me, restoring the server from a previous restore point was a good solution, and considerably faster than configuring a new server from scratch plus manually deleting the old server from the DAG.
So far I haven't encountered any drawback from doing a machine restore...

Best Regards.