Veeam B&R v5 recovery of a domain controller

Re: Veeam B&R v5 recovery of a domain controller

by tsightler » Thu Aug 01, 2013 3:08 am (5 people like this post)

The criticism regarding this is fair. I actually did spend some time working on such a document (and yes, I'm a Veeam "greenie"), and have continued to work on it during available time, so there will be a document forthcoming. Publishing a document isn't just a matter of creating it; you must actually test all of those procedures across a range of potential platforms (2003, 2008, 2012), each with its own quirks, and verify that the procedures work. It's a time-consuming process, and even then, environmental uniqueness can still cause issues.

However, in the interim, I thought I'd share some information that might be useful. For one thing, the SureBackup process is indeed quite different from a normal restore, and there's a very good reason for that. The Veeam "automatic" DC restore process performs a non-authoritative restore of the DC because, in most cases, DCs are restored when there are still other functional DCs in the environment. In this case the non-authoritative restore process should truly be 100% "automatic". In other words, if I have 3 DCs, and after Windows updates last night one BSODs, restoring that single DC VM should be a 100% automatic process. The automatic recovery should also work for environments with only a single DC.

Most customers, however, have multiple DCs in production, yet it's quite common that only a single DC is started in the SureBackup environment. This means the server must effectively stand on its own. It's in an isolated environment, so it's not going to find its replication partners, and we don't want it to waste a lot of time attempting to do so. To prevent this, during the "Configuring DC" phase of the SureBackup process, Veeam makes some changes to the registry of the DC prior to powering it on in the SureBackup lab. These changes force an authoritative restore of the DC, and specifically of SYSVOL. You can see the exact changes being made by looking in the SureBackup job log in the Veeam log directory. Open up the logs and search for "PrepareDC" and you'll quickly find the section where this "magic" is performed during SureBackup, including all of the gory details of the registry entries, but in summary it's the following:

1. Set BurFlags = D4 to force an authoritative SYSVOL restore for systems using FRS for SYSVOL replication (i.e. Windows 2003 and older, and potentially upgraded AD environments where users have yet to migrate SYSVOL to DFS-R)
2. Set NTDS\Repl Perform Initial Synchronizations = 0 to keep the DC from wasting 15 minutes (or more) attempting to contact replication peers which simply aren't going to be there
3. Set DFSR\Restore\"SYSVOL" = authoritative to force an authoritative restore of SYSVOL for systems using DFS-R (hopefully all Windows 2008R2 DCs and newer)
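For readers who want the three changes side by side, here is a minimal Python sketch that only renders them as .reg-style text and never touches a registry. The post gives the value names but not the full key paths; the paths below are the commonly documented locations and are an assumption here, so check them against the "PrepareDC" section of your own SureBackup job log. The names `RegChange`, `SUREBACKUP_DC_CHANGES` and `render_reg` are made up for illustration.

```python
from dataclasses import dataclass

# Assumption: standard service key root; confirm against your PrepareDC log.
SERVICES = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services"


@dataclass(frozen=True)
class RegChange:
    key: str        # full registry key path (assumed, see note above)
    name: str       # value name as given in the post
    kind: str       # "dword" or "sz"
    value: object   # int for dword, str for sz
    purpose: str    # why the change is made, per the post


SUREBACKUP_DC_CHANGES = [
    RegChange(SERVICES + r"\NtFrs\Parameters\Backup/Restore\Process at Startup",
              "BurFlags", "dword", 0xD4,
              "FRS: force authoritative SYSVOL restore"),
    RegChange(SERVICES + r"\NTDS\Parameters",
              "Repl Perform Initial Synchronizations", "dword", 0,
              "skip the wait for replication partners that are not there"),
    RegChange(SERVICES + r"\DFSR\Restore",
              "SYSVOL", "sz", "authoritative",
              "DFS-R: force authoritative SYSVOL restore"),
]


def render_reg(changes):
    """Render the changes as .reg file text for review (nothing is applied)."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for c in changes:
        if c.kind == "dword":
            data = f"dword:{c.value:08x}"
        else:
            data = f'"{c.value}"'
        lines += [f"; {c.purpose}", f"[{c.key}]", f'"{c.name}"={data}', ""]
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_reg(SUREBACKUP_DC_CHANGES))
```

Keeping the changes as data rather than applying them directly makes it easy to compare them line by line with what the job log reports.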

So now we get to the 100% full DC recovery scenario, i.e. "I've lost all my DCs". When you restore a DC using either full or instant recovery (as stated earlier, there should be no difference other than performance, which might increase the time it takes to come online), the automatic recovery process performs a non-authoritative restore, reboots, and then starts looking for its other DCs to sync up. Because all DCs are gone and no other partners are available, replication may take 15 minutes (or longer) to even start (in the absence of the Repl Perform Initial Synchronizations key used by SureBackup). This is why restoring all DCs together will generally work better, as this timeout to start SYSVOL replication is avoided. However, since you have restored all DCs non-authoritatively, they'll likely all be waiting around hoping that one of them claims to be authoritative for SYSVOL so that they can start replicating. You'll need to designate one of these DCs as authoritative for SYSVOL, and the procedure to do this varies slightly depending on whether you are using FRS or DFSR for SYSVOL replication.

So if I were doing a complete disaster restore, I'd restore two of the original DCs, power them on, wait for their reboot, and force one to become authoritative for SYSVOL, then restore the other DCs and they should recover automatically.
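The ordering in that last paragraph can be written down as a small checklist generator. This is only a sketch of the sequence described above, not a Veeam procedure; `plan_full_recovery` is a hypothetical helper, and the choice of which of the first two DCs becomes authoritative is left to the caller.

```python
def plan_full_recovery(dcs, authoritative=None):
    """Return the ordered steps for the 'all DCs lost' case described above.

    dcs: DC names in the order you intend to restore them.
    authoritative: which of the first two DCs to force authoritative
                   (defaults to the first one; this default is an assumption).
    """
    if len(dcs) < 2:
        raise ValueError("this ordering assumes at least two DCs")
    first_pair, rest = dcs[:2], dcs[2:]
    authoritative = authoritative or first_pair[0]
    if authoritative not in first_pair:
        raise ValueError("the authoritative DC must be one of the first two restored")

    steps = [
        f"restore {first_pair[0]} and {first_pair[1]}",
        f"power on {first_pair[0]} and {first_pair[1]}",
        "wait for the automatic reboot",
        f"force {authoritative} authoritative for SYSVOL (FRS or DFSR procedure)",
    ]
    steps += [f"restore {dc} (expected to recover automatically)" for dc in rest]
    return steps


if __name__ == "__main__":
    for number, step in enumerate(plan_full_recovery(["DC1", "DC2", "DC3"]), start=1):
        print(number, step)
```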
tsightler
Veeam Software
 
Posts: 4830
Liked: 1779 times
Joined: Fri Jun 05, 2009 12:57 pm
Full Name: Tom Sightler

by Unison » Thu Aug 01, 2013 5:53 am (1 person likes this post)

Thanks for this post, Tom. Very informative, and good to hear from a greenie.
I understand that it would take quite a bit of effort/time to release a document like this. It's very good news to hear that a document like this is 'incoming'. It will be appreciated. Your post alone puts down some very good detail and helps clear things up!

From the information you provided, it is clear why people have been seeing successful restores of their DCs/AD environment when using SureBackup: Veeam is actually forcing one of the DCs to be authoritative.

But now I am wondering about the 100% full DC recovery scenario...
How am I (and I assume I'm not alone...) able to successfully recover all DCs and the AD environment without manually forcing one of the recovered DCs into 'authoritative mode'? On recovery of all my DCs at the same time, they both go into Directory Services Restore Mode...
then they both reboot...
then they both sit at the Ctrl+Alt+Del screen for a while... looking for each other... waiting for someone to become authoritative...
then they both reboot again...
when they come back online, they don't reboot again. They are all happy and talking to each other and AD/replication is functioning... all without me doing anything.

Is it possible for the OSes/DCs to successfully elect an authoritative DC on their own during that 'wait period'? Or does this just never happen automatically, with any OS? Is there any other explanation for how a 100% full DC recovery scenario can be successful without manually setting an authoritative DC?
Unison
Enthusiast
 
Posts: 80
Liked: 16 times
Joined: Fri Feb 17, 2012 6:02 am
Full Name: Gav

by Fiskepudding » Thu Aug 01, 2013 6:28 am

Gav.

This is the CORE:
Unison wrote: I don't see how Veeam could make the DC recovery process any easier/better, but maybe at the very least, Veeam could release an extensive technical document/OS that could walk through step by step in detail exactly what SHOULD happen when you recover a DC (what the Veeam developers expect through the whole process). It should show what Veeam is in control of, when Veeam is in control, what Veeam changes in the OS, at what points those changes are made, when changes are changed back if they are changed back, at what point the OS takes over DC recovery, what the OS is looking for, what changes the OS makes (like setting new identifiers), when reboots should happen, how many reboots should happen, why a reboot happens, rough time periods between events etc etc etc. With a detailed document like this, at least we could have a fighting chance of tracking down what is NOT happening when a DC recovery is unsuccessful.



I will try to answer all your questions. But first, the gateway on the NICs is also retained.
In my case it currently points to an IP that does not exist in my test environment, so I have suspected this might be an issue from the get-go.
But since both DCs are on the same subnet, the gateway should not even be used.
I have tried removing the IP as well (of course after the restore and reboots...).
I will reconfigure my network and put up a router with the same IP, so I can have a working gateway in my tests. I will need some fibre converters, so it won't happen very soon.
My test environment is not running with instant restore anymore, because that stops my nightly jobs…
But I tested the shares using FQDN; it did not help.
Yes, both servers point to each other as primary DNS, i.e.:
DC1 DNS
Primary = DC2
Secondary = DC1

DC2 DNS
Primary = DC1
Secondary = DC2
I really do not want to change this, because that should be best-practice config.
I bring ALL servers up with the DCs in SureBackup: Exchange, certificate server, web server, SQL server, Autodesk Vault server, and more. Everything works.
No, I have only two DNS servers (the DCs).

When the DCs are running in SureBackup they work just like in production; they can access their own and each other's SYSVOL shares.
Fiskepudding
Expert
 
Posts: 213
Liked: 26 times
Joined: Wed Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen

by Fiskepudding » Thu Aug 01, 2013 6:40 am (1 person likes this post)

Great info Tom! :)

This explains a lot.
There may be more to the SureBackup job than you reveal, but we always bring back BOTH DCs in the SureBackup job. Do both go into authoritative restore?
If that works with two DCs in SureBackup, why is non-authoritative restore the default for normal/instant restores? I see no issues with the DCs when they are running in the SureBackup lab.

What would be ideal would be to restore 1 DC first… and be able to choose the restore mode (for the first one you select, authoritative restore).
I guess when you pick restore on a VM, Veeam can check some metadata and see if it is a DC, and give you these choices during the restore dialogue.
Even if Veeam can't check whether it is a DC during the restore dialogue, there could maybe be an Advanced option containing these DC restore choices:

There are no other DCs running in your environment (authoritative restore) ( ) Radio button
There is at least one running DC in your environment (non-authoritative restore) ( ) Radio button


Feature request?


I can agree that most of the time (maybe not 99%, as Gostev previously stated) you want a non-authoritative restore.
But it is 99 times more important to get the DCs up and running FAST if you have none running. And then Veeam just makes the decision to do a non-authoritative restore, without even asking.
Unless you have been reading the forums, you have no idea what is actually happening...

by Unison » Thu Aug 01, 2013 7:09 am

Tom has cleared up why SureBackup is working for you: Veeam's SureBackup is making one of your DCs recover in authoritative mode for you, so everything comes back online as it should.
In my last post, I am trying to get Tom to comment on why/how/if it should even be possible that I am able to recover both DCs at the same time in what would be a non-authoritative restore for both DCs and still have them eventually restore successfully. From his info I would not expect to be able to restore the way I do... but I am able to without manually making a DC authoritative. They seem to elect one on their own.
How can a non-authoritative restore of both DCs work in one environment, but not in another?

It might be worth a shot to change the DNS config on your NICs. Depending on who you talk to, best practice is to have DCs point to themselves as primary DNS; others say not to point to themselves as primary. Changing the NIC config for DNS on your DCs, especially short term, won't cause you any issues because just the order is changing; both DNS servers have the same data anyway. I've made the DCs point to themselves for primary DNS purely for this purpose. Could it be the difference? Maybe it helps that when the DC comes online it will try itself for DNS first and eventually accept that, causing one of the DCs to become authoritative and then bringing everything else back online successfully.
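To make the suggested swap concrete, here is a tiny Python sketch that takes the "each DC points at the other first" layout posted earlier in the thread and reorders it so each DC lists itself first. It only manipulates a dictionary; `prefer_self` is a hypothetical helper, and whether this ordering actually changes recovery behaviour is the open question in this post, not something the code settles.

```python
def prefer_self(dns_order):
    """Reorder each DC's DNS server list so the DC itself comes first.

    dns_order maps a DC name to its DNS servers in preference order.
    Only the order changes; no server is added or removed.
    """
    flipped = {}
    for dc, servers in dns_order.items():
        others = [s for s in servers if s != dc]
        flipped[dc] = ([dc] if dc in servers else []) + others
    return flipped


# Layout as posted earlier in the thread: primary = the other DC.
current = {"DC1": ["DC2", "DC1"], "DC2": ["DC1", "DC2"]}

if __name__ == "__main__":
    print(prefer_self(current))
```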

by Unison » Thu Aug 01, 2013 7:11 am (1 person likes this post)

Fiskepudding wrote: What would be ideal would be to restore 1 DC first… and be able to choose the restore mode (for the first one you select, authoritative restore).
I guess when you pick restore on a VM, Veeam can check some metadata and see if it is a DC, and give you these choices during the restore dialogue.
Even if Veeam can't check whether it is a DC during the restore dialogue, there could maybe be an Advanced option containing these DC restore choices:

There are no other DCs running in your environment (authoritative restore) ( ) Radio button
There is at least one running DC in your environment (non-authoritative restore) ( ) Radio button


Feature request?

I think this would be an excellent option if this is indeed what has to happen.
Veeam is already doing this for DCs in SureBackup, so when we are restoring a VM, give us the choice to say that the VM being restored is a DC, and give us the choice to say whether this is the FIRST and only DC being recovered at this point, i.e. make it authoritative. The Veeam admin will understand this.

by Fiskepudding » Thu Aug 01, 2013 7:24 am

I think some quote tags went wrong in the above post, Gav :)

Unison wrote: It might be worth a shot to change the DNS config on your NICs. Depending on who you talk to, best practice is to have DCs point to themselves as primary DNS; others say not to point to themselves as primary. Changing the NIC config for DNS on your DCs, especially short term, won't cause you any issues because just the order is changing; both DNS servers have the same data anyway. I've made the DCs point to themselves for primary DNS purely for this purpose. Could it be the difference? Maybe it helps that when the DC comes online it will try itself for DNS first and eventually accept that, causing one of the DCs to become authoritative and then bringing everything else back online successfully.


I can see that having the primary DNS set to itself could explain why the recovery works even with a NON-AUTHORITATIVE restore. But should I configure my DCs for recovery or for production? :)
It's just that I have never seen anyone recommend setting the DC's own IP as the primary DNS... I'm a bit sceptical, but I agree, it should not make THAT MUCH difference.

But it would be way better to control what type of restore you want, right there in the restore process.

Doing a NON-AUTHORITATIVE restore when there are actually NO running DCs available is just... well... wrong... :?

by tsightler » Thu Aug 01, 2013 10:52 am (1 person likes this post)

Unison wrote: How am I (and I assume I'm not alone...) able to successfully recover all DCs and the AD environment without manually forcing one of the recovered DCs into 'authoritative mode'?

So my honest answer to that question is, I'm not 100% sure, but I think I know the answer (at least, what appears to be the answer based on my own lab testing of dozens of recoveries). The simple version is that I believe DFS-R will eventually recover automatically, while customers using legacy FRS replication will always require intervention (BTW, we still see a LOT of customers that failed to migrate to DFS-R during AD upgrades and thus are still using FRS even with Windows 2008R2). I also believe that the reason for this is that FRS has no robust self-healing, so after being told to perform a non-authoritative restore, somebody has to claim to be authoritative, and thus the BurFlags setting is required to recover.

On the other hand, DFS-R has advanced self-healing capabilities that, in theory, should eventually recover using conflict resolution algorithms even in the absence of a true "authoritative" claim from one of the partners. I suspect that's where your "second" reboot happens: DFS-R finally syncs up, exports SYSVOL, and the system finally reboots into a normal working state. However, the process is actually much faster if you force one of the DFS-R nodes to become authoritative.

On a side note, for customers that are still using FRS in newer DC environments, please consider migrating to DFS-R as soon as possible. Here's a great link on the reasons why.

http://blogs.technet.com/b/askds/archiv ... -dfsr.aspx

by Unison » Fri Aug 02, 2013 2:49 am

This might very well be the simple answer as to why some see fully automatic DC recovery and others do not (provided nothing else mentioned in these posts is your issue).

- If your DCs are running DFSR, then DC recovery will be fully automatic (with a bit of waiting for the automatic second reboot).
- If your DCs are running FRS, then DC recovery will include some manual steps. You will need to recover your first DC in authoritative mode (by setting BurFlags etc.) and then recover your other DCs after that.
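The rule of thumb above can be written as a small decision function. This is a sketch of the working theory from this thread (Tom himself says he is not 100% sure of it), not documented Veeam or Windows behaviour; `SysvolEngine` and `recovery_expectation` are hypothetical names.

```python
from enum import Enum


class SysvolEngine(Enum):
    FRS = "FRS"
    DFSR = "DFSR"


def recovery_expectation(engine, other_dcs_online):
    """Summarise what the thread expects after a non-authoritative restore."""
    if other_dcs_online:
        # Normal case: surviving partners provide SYSVOL, nothing to force.
        return "automatic: the restored DC syncs from the surviving DCs"
    if engine is SysvolEngine.DFSR:
        # Thread theory: DFS-R self-heals, seen as a delayed second reboot.
        return ("likely automatic after a wait and a second reboot; "
                "faster if one DC is forced authoritative")
    # FRS with no partners: someone has to claim to be authoritative.
    return "manual: make one DC authoritative (BurFlags = D4), then restore the rest"


if __name__ == "__main__":
    for engine in SysvolEngine:
        print(engine.value, "->", recovery_expectation(engine, other_dcs_online=False))
```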


I've just done another DC recovery test to try and get some times together.
I actually have 3 DCs in my environment:
DC1 = 08R2
DC2 = 08R2
DC3 = 03 (about to be decommissioned)

I just recovered DC1 and DC2 with instant recovery.
I changed their network details in VMware, then powered them on at the same time.
They both came up in Directory Services Restore Mode, then rebooted within 3 mins of being powered on.
Both then came to the Ctrl+Alt+Del screen.
3 mins later, DC1 rebooted on its own, then came back to the Ctrl+Alt+Del screen.
5 mins later, DC2 rebooted on its own, then came back to the Ctrl+Alt+Del screen.
About 15 mins later I logged into both of them (I was busy with something else but still had both DCs visible on my screen, so I would have noticed if they did anything else...)

I then checked all the usual stuff: shares were OK, replication was working, AD was working, DNS was working, etc.
I also checked the logs, and there were entries in both the DFSR and FRS logs. Both showed that there was trouble contacting other partners, but at the times that matched the reboots at 3 and 5 mins above, both DCs' logs showed success contacting each other and the wonderful message we all want to see: "no longer preventing this system from becoming a DC". The logs showed that there is no reason to wait to log in after the DCs do their second automatic reboot; after that time, the logs show everything is working.
As you suspected, it seems that DFSR is running through its self-healing algorithms and nominating an authoritative server, resulting in everything coming back online. From the beginning of recovery in Veeam to the point in time where the logs showed everything was OK was about 13 minutes. Less than 15 mins for total DC recovery seems pretty good to me 8)

I hope that this sort of information about DFSR and FRS is included in the document on DC recovery with Veeam. I think it is highly important and might be the reason why some see consistent recoveries and others are left manually bringing their DCs back together, worrying about the validity of their DC recovery.

Tom, do you think this document might still be months/years away based on where you got up to with it? A lot of detail on this subject can now be found in these posts, but it would take someone a long time to read and patch it together.
When it does come along, will you post about it here and maybe even get Gostev to release it in his newsletter emails? Thanks for your efforts, Tom! :)

by Unison » Fri Aug 02, 2013 3:02 am

Espen, do Tom's comments about DFSR and FRS apply to you? When you recover your DCs via instant recovery and then log in to them, are the DFSR logs in Event Viewer telling you anything? Or do you only see FRS events?
If you are seeing DFSR events like "having trouble contacting/replicating..." and they never change from that, it might be worth a shot to change your DNS settings around to see if this solves the problem. A Google search on "DNS settings for DCs" yields many opinions for both options, but I have never heard of anyone having issues with setting their DCs as themselves for primary DNS.

by tsightler » Fri Aug 02, 2013 3:28 am (1 person likes this post)

Unison wrote: Tom, do you think this document might still be months/years away based on where you got up to with it? A lot of detail on this subject can now be found in these posts, but it would take someone a long time to read and patch it together.
When it does come along, will you post about it here and maybe even get Gostev to release it in his newsletter emails? Thanks for your efforts, Tom! :)


So the rebirth of this thread has reignited my efforts to create this document. I'll see about putting together a draft in the next couple of weeks as long as I can find enough time to test the recovery scenarios. It may take a little time from that point to become an officially published doc, which would be my goal, but I'll provide something in the interim. Ideally I can coax a few people in this thread to test the procedures I write up as that's part of the challenge. It's easy enough to create a lab and test things, but real world environments have a tendency to shake out additional issues.

If you don't see a draft posted in this thread in the next couple of weeks, ask for an update or shoot me a PM. I think the bulk of the information is here in the thread now, so it's just a matter of formalizing it and testing the documented procedure.

by Fiskepudding » Fri Aug 02, 2013 5:55 am

Fiskepudding wrote: Different AD tools work for periods, then don't work for periods, then work again...
Lots of errors (Netlogon, DFS service and group policy) in Event Viewer.



I posted the above message a few days ago. I see DFS but NOT DFSR.


I have multiple of these:
[Image]

It seems to end up with this about 30 min later:
[Image]


Might sound like a stupid question, but, is there any other way to know if my 2008R2 servers use DFSR or FRS?
Both our DCs are 2008R2 and domain functional level is also 2008R2.

I also get this (The DFS Namespace service successfully initialized cross forest trust information on this domain controller):
http://www.cia.as/www/Veeam/1_2.png

SYSVOL still does not work. Currently no AD tools are working either.


Other messages related to the domain:
http://www.cia.as/www/Veeam/1_2.png
http://www.cia.as/www/Veeam/1_3.png
http://www.cia.as/www/Veeam/1_4.png
http://www.cia.as/www/Veeam/1_5.png
http://www.cia.as/www/Veeam/1_6.png
http://www.cia.as/www/Veeam/1_7.png
http://www.cia.as/www/Veeam/2_2.png
http://www.cia.as/www/Veeam/2_3.png

by Unison » Fri Aug 02, 2013 5:59 am

tsightler wrote:So the rebirth of this thread has reignited my efforts to create this document. I'll see about putting together a draft in the next couple of weeks as long as I can find enough time to test the recovery scenarios. It may take a little time from that point to become an officially published doc, which would be my goal, but I'll provide something in the interim. Ideally I can coax a few people in this thread to test the procedures I write up as that's part of the challenge. It's easy enough to create a lab and test things, but real world environments have a tendency to shake out additional issues.

If you don't see a draft posted in this thread in the next couple of weeks, ask for an update or shoot me a PM. I think the bulk of the information is here in the thread now, so it's just a matter of formalizing it and testing the documented procedure.

Excellent! :D
I for one would be happy to look at your drafts and test the procedures that align with my environment. I'm sure there will be others here who will be happy to test too. I don't mind looking at a very raw copy, so it's not an issue if it's not in an official format at this early stage.
Agreed, pretty much everything needed is in here. The correct stuff just needs to be pulled together and put in the right order, followed by some stepped-out, documented procedures for different scenarios.
I will wait for an alert from this thread or a PM from you if you want me to look at/test anything specific. I will check in with you in a couple of weeks if nothing like that pops up.
Thanks for taking on this project again, Tom. It will be very handy to have and should make it into the DRP of any shop running Veeam!

by Fiskepudding » Fri Aug 02, 2013 6:06 am

I would also gladly help and test more in our environment.
The first thing might be to change the DNS the other way around :)

And maybe the gateway, which points to a non-existent IP at the moment.

by Unison » Fri Aug 02, 2013 6:24 am

If you flip your DNS settings and test, post the results back here so we know if that helped :)
If that second reboot doesn't come within 15 mins of getting to the Ctrl+Alt+Del screen, then you can pretty much guarantee that it's not coming and your DCs are still broken.

At that point, it could be that you don't have DFS Replication running on your DCs... but this seems strange, as I thought DFSR was installed/enabled by default when you build an 08R2 DC. You're even running a higher domain functional level than me; I'm still on functional level 2003 because I still have one 2003 DC left.

*wish it was easier to post/upload an image here* ... but in Event Viewer on the DC, under "Applications and Services Logs", you don't have a "DFS Replication" log in there, just a "File Replication Service" log?
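The two checks suggested in this post (which replication logs are listed, and whether the second reboot is overdue) can be sketched as a small triage helper. The log names are the ones quoted above and the 15-minute limit is the rule of thumb from this post; `triage` is a hypothetical function, and the presence of a log is only a hint, not proof of which engine replicates SYSVOL.

```python
def triage(event_log_names, minutes_at_login_screen, second_reboot_seen,
           limit_minutes=15):
    """Return plain-text findings for a restored DC, per the post's two checks."""
    findings = []
    names = {name.strip().lower() for name in event_log_names}

    # Check 1: which replication logs appear in Event Viewer.
    if "dfs replication" in names:
        findings.append("DFS Replication log present: check it for partner events")
    elif "file replication service" in names:
        findings.append("only a File Replication Service log: SYSVOL may still be on FRS")
    else:
        findings.append("neither replication log found")

    # Check 2: rule of thumb, no second reboot within the limit means stalled.
    if not second_reboot_seen and minutes_at_login_screen >= limit_minutes:
        findings.append("no second reboot within the limit: assume recovery has stalled")
    return findings


if __name__ == "__main__":
    for finding in triage(["File Replication Service"], 20, second_reboot_seen=False):
        print(finding)
```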