ori
Enthusiast
Posts: 65
Liked: 1 time
Joined: Apr 28, 2012 9:51 pm
Full Name: Ori Besser
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by ori » Jul 23, 2013 5:38 am

I have 3 domain controllers. I powered 2 of them on at the same time yesterday, but there are still no SYSVOL and NETLOGON shares. They are both in the same isolated network.
I think Veeam should somehow implement the BurFlags registry change automatically in the recovery process.

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Jul 23, 2013 6:26 am

ori wrote: I have 3 domain controllers. I powered 2 of them on at the same time yesterday, but there are still no SYSVOL and NETLOGON shares. They are both in the same isolated network.
I think Veeam should somehow implement the BurFlags registry change automatically in the recovery process.
I tried to find some info in this thread on your setup/details but couldn't, unless I didn't look hard enough.

What OS are your DCs?
Are they using VMXNET3 NICs?
Are those 2 DCs global catalogs?
Do you have VSS enabled in your DC backup jobs?
As soon as your DCs come online, can you ping them from each other by both IP and DNS name?
Is the OS firewall on?
Do you have AV installed on the DCs?

ori
Enthusiast
Posts: 65
Liked: 1 time
Joined: Apr 28, 2012 9:51 pm
Full Name: Ori Besser
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by ori » Jul 23, 2013 1:50 pm

Thanks for trying to help..

2003 r2
vmxnet3
one of them GC
vss is enabled (when vss is disabled there is no problem)
both pingable by ip and name
os firewall is off
mcafee av


I have 3 DCs in total. I also tried starting all of them at the same time, but it didn't help. The third one is also a GC.

james575
Novice
Posts: 8
Liked: never
Joined: Jun 18, 2013 4:07 pm
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by james575 » Jul 23, 2013 3:43 pm

Unison wrote:Are any of your DCs 08R2? If so, are they using VMXNET3 NICs?
Yes, I saw the conversation above about the 2008 R2/vmxnet3, thanks. The two DCs I am testing are running 2012 with vmxnet3. One of them is the FSMO master. Both are GC and AD-integrated DNS.

I am testing them by just powering up on their own network, not using instant recovery. According to Veeam this should work.

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Jul 23, 2013 10:41 pm

As far as I know, the VMXNET3 issue (really an OS issue) does not impact Server 2012; it is something that only causes a problem for 08R2. Maybe you can confirm this: when you recover your 2012 DCs together, go into their NIC properties as soon as they boot. Do you see the IP information that you are expecting to see, or are all the IP fields blank and the NIC set to DHCP?
How about all the items in my last post - do any of them apply to you? E.g. do you have VSS enabled on your DC backup jobs, firewalls running, AV etc.?

Getting your DCs to recover is SO important - I couldn't rest until I was confident/happy with being able to reliably recover the DCs/AD. Hopefully we can work out what is causing the problem in your shop.

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Jul 23, 2013 10:55 pm

ori wrote:Thanks for trying to help..

2003 r2
vmxnet3
one of them GC
vss is enabled (when vss is disabled there is no problem)
both pingable by ip and name
os firewall is off
mcafee av


I have 3 DCs in total. I also tried starting all of them at the same time, but it didn't help. The third one is also a GC.

No worries. I know it's not a good feeling when you're not confident in the backup system's ability to restore a DC - I am hopeful the cause of your issues can be found.
When your 03R2 DCs recover and boot into the OS, if you check their NIC details, do you see the IP information that you are expecting to see, or are the NICs set to DHCP so that you need to set the NIC details manually?
When you power up the restored DCs for the first time, do all 3 of them boot into 'safe mode' first (i.e. Directory Services Restore Mode)? Do all 3 then do the following....
# reboot on their own
# boot back to the normal ctrl alt del screen - and you DO NOT login to any of them.....let them sit at the ctrl alt del screen for a while....
# reboot again all on their own (after sitting at the ctrl alt del screen for about 10-30mins)

If one or all of them are not doing the above, then it might indicate where the problem is.

Also, you mentioned that when you DON'T have VSS enabled for your DC backup jobs, they recover perfectly fine, but when VSS is enabled, you are not able to recover your DCs - did I read that wrong? When you do enable VSS on the jobs, what AD account are you specifying? Are you certain it has sufficient permissions and is entered correctly?

Also, you could try disabling McAfee on all of your DCs (disable the real-time service in the services manager etc.), then capture new Veeam images/increments, and then test your recoveries with those new increments. McAfee will be disabled in those new recovery points, so that would rule out AV getting in the way.

Andreas Neufert
Veeam Software
Posts: 3820
Liked: 687 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Andreas Neufert » Jul 25, 2013 8:35 am

Be careful... disabling VSS means that the DC will not go into non-authoritative restore mode at restore. => Bad idea unless you manually boot the VM in safe mode and do this on your own (at restore).

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Jul 30, 2013 11:23 am

We have been running SureBackup for some time now, checking our backups 2 times a week on a schedule.
They pass every time, including the DCs and their scripts.

But after reading this thread I was curious, and did a full restore of both our DCs.
Actually I first tried Instant VM Recovery, but that did not work.
I had high hopes for a full VM restore of both DCs.
The rest I describe is a FULL restore of two VMs (2 DCs).

Since there is no documentation on what to expect, and what NOT to do, I followed Gostev's advice, "basically do nothing". Here are the steps I took, and the observations I made.

0 Restore of VM finished
1 Power on VM (pressed play on VM)
2 "Start Windows Normally" was selected as default.
3 Applying Computer settings....
(I guess it is here Veeam does its "magic")
4 Stopping services....
5 Shutting down....
6 VM reboots..
7 Applying Computer settings....
8 Logon screen

The only time I did something was to power on the VM, and log on at the end.
Everything in between is just observation.


I log on successfully. But the domain is not 100% functional.
I WAS able to open "Active Directory Users and Computers". I added a user on one DC, and it replicated to the other one fine.
repadmin /syncall gives no errors.

But changes to SYSVOL are not replicated. Maybe BurFlags needs to be used?
There are several DFS Namespace errors in Event Viewer (14550).
And I am not able to start "Active Directory Sites and Services"; I get the error "The specified domain does not exist or could not be contacted".
After a while I was no longer able to open ADUC.

Both DCs can ping each other by hostname and IPv4 address.
Even though networking appears to function, server 1 cannot access its own shares
\\SERVER1\SYSVOL
\\SERVER1\c$
The same goes for server 2. And they cannot access each other's shares either.

Then after a manual reboot of one server I was able to use all domain-related tools. They could also access each other's and their own shares...
If I wait some more, maybe 30 min, they stop working again. For example ADSI Edit might work but ADUC might say that "The specified domain does not exist or could not be contacted". Shares cannot be accessed (all services report they are running fine).
Firewall is OFF
There is NO antivirus software on these DCs, and there never has been.
Both are Win 2008 R2 servers with SP1 and all the latest patches.
I use VMXNET3, the MS hotfix was installed ages ago, and the original IP and DNS settings were retained after the restore.
All these things do of course function on the production servers.

So I have some open questions...
Do I restore and power on 1 DC at a time, letting it finish everything and show the logon screen before I restore/power on additional DCs?
Should I let it stay on "Start Windows Normally" at the first boot? (As far as I can remember, it was said in this thread that that was OK, because Veeam does its own magic regardless.)
There are not always other operational DCs available when doing the first DC restore. Are there any considerations for restoring additional DCs?
Different steps for the first, second, third....?

"Do nothing" sounds fine, but it is always good to know what is expected to happen. Even if the manual just contained some steps on what to expect, that would be reassuring, and would probably keep people from going into safe mode and doing stuff on their own.

Veeam version is 6.5.0.144

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Jul 30, 2013 10:48 pm

The one thing I can see here that might be causing your problem with instant recovery: after the server comes back online at your point number 8, how long do you let it sit at the ctrl alt del screen before you log in to it?
Your recovered DCs should actually reboot twice on their own during the recovery process.....it should go like this....

I have copied your steps and modified them below....

0 Restore of VM finished
1 Power on VM (pressed play on VM)
2 "Start Windows Normally" was selected as default.
3 Applying Computer settings....
(I guess it is here Veeam does its "magic")
(SERVER WOULD HAVE GONE INTO ACTIVE DIRECTORY RESTORE MODE HERE)
4 Stopping services....
5 Shutting down....
6 VM reboots.. (THIS IS THE FIRST TIME THE SERVER REBOOTS ON ITS OWN)
7 Applying Computer settings....
8 Logon screen
9 SERVER SHOULD SIT HERE FOR A WHILE AT THE LOGIN SCREEN (DON'T LOG IN TO IT!!!) AND AFTER 10-30 MINS IT SHOULD REBOOT AGAIN - FOR A SECOND TIME.
10 SERVER COMES BACK TO CTRL ALT DEL SCREEN.....AND AT THIS POINT YOU SHOULD BE ABLE TO LOG IN AND EVERYTHING WILL BE RUNNING FINE.
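
To make the rule in steps 6 to 10 concrete, here is a minimal sketch in Python. It only encodes the sequence as I have observed it (two unattended reboots, no logon before the second one); it is not something Veeam documents. The event labels ("reboot", "logon", and so on) and the function name safe_to_log_in are made up for illustration, and you would have to note the events yourself by watching the VM console.

```python
from typing import List

# Number of unattended reboots described in the steps above (an observation
# from this thread, not a documented Veeam value).
EXPECTED_REBOOTS = 2


def safe_to_log_in(events: List[str]) -> bool:
    """Return True once two automatic reboots were seen with no early logon.

    `events` is a hand-recorded list of hypothetical labels such as
    "power_on", "reboot", "logon_screen" and "logon".
    """
    reboots = 0
    for event in events:
        if event == "reboot":
            reboots += 1
        elif event == "logon" and reboots < EXPECTED_REBOOTS:
            # Someone logged on before the second automatic reboot.
            return False
    return reboots >= EXPECTED_REBOOTS


if __name__ == "__main__":
    observed = ["power_on", "reboot", "logon_screen", "reboot", "logon_screen"]
    print(safe_to_log_in(observed))
```

If the function says no, the only thing to do is keep waiting at the ctrl alt del screen.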

Apparently if you log in to the DC too soon - i.e. before it has a chance to do its second auto reboot - then you can interrupt the recovery process and end up with an unsuccessful recovery. Your DCs should reboot twice on their own before you ever log in to them.

Also, you ask if you should recover one DC first, wait for it to come online, then recover the second DC.....don't do that. I have found that this never works. Recover at least two of your DCs together. Recover them both with instant recovery and then push the play button on them both at the same time (or close to it) so that they come up together and can talk to each other during the recovery.

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Jul 31, 2013 5:42 am

Thanks for the answer, Gav.
Unison wrote: Also, you ask if you should recover one DC first, wait for it to come online, then recover the second DC.....don't do that. I have found that this never works. Recover at least two of your DCs together. Recover them both with instant recovery and then push the play button on them both at the same time (or close to it) so that they come up together and can talk to each other during the recovery.
I read this in your previous posts, and thought that this was what you needed to do if you did an Instant VM Recovery, not a full restore like I just did (although there should not be much difference).
I did not wait long when it stood at the login screen, maybe 5 min.

No offence!.. Really!...
But Veeam users should not rely 100% on what a "random guy" on the Veeam forum had success with after experimenting a lot... It just works.. after xxx reboots..?? What is up with that?
And Veeam only says it works automatically, don’t worry... Well, it does not work automatically.
And if it is supposed to have 2 auto reboots, then that should be documented somewhere.

Anyway, I am fishing for an official answer from a Veeam rep here.. hint hint.. since they do not provide any info on what to do, ...except that it is all automatic...
If I am presented with the logon screen after 1 auto reboot, I would expect Veeam to be finished with its magic, and I would try to log on. Especially if this was a real disaster, I would want to have the DCs up and running and verified as soon as possible.

I think this really proves that some guidance is needed regardless of how automatic Veeam's DC restore process is.

Since you have greater success when restoring/powering on multiple DCs at once, may I ask what backup times they have? Is it important that the backups of both VMs start at about the same time?

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Jul 31, 2013 6:25 am

Previous post came out a bit harsh. That was not my intent!

But DC recovery is one of the most important recoveries you might do, and it needs to work 100% before restoring other VMs.
So there should be very little guesswork in that process.. I guess that is my point :)

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Jul 31, 2013 7:17 am 3 people like this post

Hey Espen....
Hmm, I never got an alert from Veeam when you first replied - otherwise I would have written back sooner :)
Don't worry - I was not at all offended when I read through your post....I totally understand what you're getting at, and sometimes when I post things here, I want to get confirmation from one of the 'greens' (read: Veeam employees!!)......not just randoms like you and me.
Although I have seen Veeam employees commenting here - and I would hope that if I did post something here that was WRONG, they would delete my post, correct me or offer the right info.

You are not the first person to request official documentation from Veeam about DC recovery - but I don't think it will ever exist in the form you're expecting because, according to Veeam, the process is fully automatic.....so their extensive and official documentation for recovery of DCs would be something like a one-line, one-page document: "Start your DCs together - and the DC recovery process will happen automatically - the end."
People (me included) have discovered issues with the process, as you can read here....but every issue I have run into with DC recovery has been caused by something other than Veeam.

The fact that you didn't wait long enough for the second reboot might have been what caused your issue with the instant recovery test - logging in before the second reboot could have interrupted the recovery process and resulted in the unsuccessful attempt. I know and fully agree that it is very unintuitive to just sit around and wait for DCs to reboot twice before accessing them when doing a recovery - and it would be really nice if the recovery was completed as soon as we saw the first ctrl alt del screen. However, I don't believe the second reboot is initiated or even caused by Veeam - I think it happens because of the OS.....as part of the process when a DC detects that it has been recovered from a backup. Maybe a Veeam official can chime in here, but I believe that the first reboot is part of the automatic Veeam process (or might be a part of the directory restore mode process?) and the second reboot happens as a part of the DC recovery process from the perspective of the OS. Because one or both reboots are out of Veeam's hands, there is not really much Veeam can do with regard to a 'confirmation/success' feedback solution. Veeam initiates and facilitates the recovery of DCs, but it's the Windows OS and DC roles that end up getting your AD working again (and triggering the reboot/s).

Regarding your question to me, "how much time is between my DC backups if I want to restore them together all the time" - the answer is that the DC backups don't have to be captured at the exact same time as you might think. This was a question I once raised here in this forum. I was of the belief that you would have to restore DCs from backups captured at the exact same time so as to prevent AD roll-back errors....but I was wrong, and was corrected by Gostev, from memory. Your DCs don't need to be backed up close together......I have tested DC recoveries where the DC1 backup was captured hours before DC2's and the recovery process still worked perfectly - you can see in the event logs of the DCs (after they have recovered successfully) that they both recognise that they have been recovered from a backup, they both recognise that each DC has a different identifier than they were expecting - and then they negotiate different identifiers with each other so they can begin syncing again.


I agree that DC recoveries are extremely important - if you can't get them back online in a test......how can you expect to do it for real! This is why I am very happy now, as I am happy with the process and confident in it......once I worked out the kinks! :)

Thanks for coming back to make sure there was no offence taken - appreciate that - but as mentioned, I was not offended and completely understand your comments.


I wonder, if you test your recovery again now - but wait for the second reboot as mentioned here.....is it successful for you? I would be interested to hear :) (PM me if you respond....because sometimes Veeam doesn't alert me to new posts!!! grrr)

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Jul 31, 2013 7:46 am

We are on the same page, Gav, and I totally agree with you.
However, if your process is the way to go, it is not very intuitive. (You also tried many things to end up with this.)
You don't want to spend much time experimenting with DC restore in a real-life disaster.
So the process should be documented!

Hypothetical DC recovery section in the manual:

Recover ALL DCs at the same time!!
Make sure they reboot 2 times (first wait for the automatic reboot, without a logon screen. Next, wait xx min for an automatic reboot that should happen at the next logon screen).
0 Restore of VM finished
1 Power on VM (pressed play on VM)
2 "Start Windows Normally" was selected as default.
3 Applying Computer settings....
(I guess it is here Veeam does its "magic")
(SERVER WOULD HAVE GONE INTO ACTIVE DIRECTORY RESTORE MODE HERE)
4 Stopping services....
5 Shutting down....
6 VM reboots.. (THIS IS THE FIRST TIME THE SERVER REBOOTS ON ITS OWN)
7 Applying Computer settings....
8 Logon screen
9 SERVER SHOULD SIT HERE FOR A WHILE AT THE LOGIN SCREEN (DON'T LOG IN TO IT!!!) AND AFTER 10-30 MINS IT SHOULD REBOOT AGAIN - FOR A SECOND TIME.
10 SERVER COMES BACK TO CTRL ALT DEL SCREEN.....AND AT THIS POINT YOU SHOULD BE ABLE TO LOG IN AND EVERYTHING WILL BE RUNNING FINE.


Because what you DON'T want in a disaster situation is to not know when to log on.. How long am I supposed to wait?.. How many reboots should I wait for?...
If this will vary, then say so. Then people like me will not log on after the first reboot, because I don't know if the Veeam auto magic was all done in the first reboot or not :)

I will try once more with a recovery.
Btw Gav, have you observed different behaviour between full recovery and instant recovery?

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Jul 31, 2013 8:39 am

I did a restore according to your instructions. After the second reboot (CTRL-ALT-DEL screen), I waited another 5 minutes before logging in.
I see the same behaviour now as before.

Different AD tools work for periods, then don't work for periods, then work again....
Lots of errors (Netlogon, DFS service and group policy) in Event Viewer.

SYSVOL/NETLOGON shares do not work. Default shares ($) and manually created shares work.

I will try rebooting a few times and see how it goes :)

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Jul 31, 2013 11:26 pm 2 people like this post

I too have observed that the DC recovery process is not very intuitive at all - Veeam might one day put out a guide/document about it, but I don't think it will be a very long/extensive document if they ever do, because they really do believe that DC recovery is a simple and automatic process.

I dont see how veeam could make the DC recovery process any easier/better, but maybe at the very least, Veeam could release an extensive technical document/OS that could walk through step by step in detail exactly what SHOULD happen when you recover a DC (what the veeam developers expect through the whole process).....it should show what Veeam is in control of, when veeam is in control, what veeam changes in the OS, at what points those changes are made, when changes are changed back if they are changed back, at what point the OS takes over DC recovery, what the OS is looking for, what changes the OS makes (like setting new identifiers), when reboots should happen, how many reboots should happen, why a reboot happens, rough time periods between events etc etc etc.....with a detailed document like this, at least we could have a fighting chance of tracking down what is NOT happening when a DC recovery is unsuccessful.

(LIKE THIS POST IF YOU WANT VEEAM TO CREATE A TECHNICAL DOCUMENT AS DESCRIBED AND/OR SUBMIT A REQUEST TO VEEAM - maybe we can drum up interest/demand - Gostev could put this in his regular email newsletter to Veeam subscribers to ask others in the community if they want this kind of document from Veeam)

All my DC recovery (and other servers) tests are done with instant recovery - i don't think there is any difference between instant/full but i believe there is a difference when doing a surebackup recovery of a DC as veeam has more control during that process.
With a tech document from veeam as requested above, that would clearly show the differences in the process - instant/full compared to surebackup. It is strange how your DCs recover 100% successful when using surebackup but not when doing an instant/full.

From your post above - it seems like something networking/dns is getting in the way when you try an instant and DC's/AD is not going to recover properly until that issue is sorted out.
When you recover via instant - your DCs retain their IP details on the NICs so that means you have installed the right hotfix, you have no firewall/AV running, you recover both DCs together from backups that are fairly close together and you only have 2 DCs in your environment.....but when you can finally login to them you are not able to access the network shares from each other....even though you CAN ping each other via DNS name and IPv4 address.
I assume that if you leave the DCs online when doing a surebackup test - you can login to them and test accessing shares from each other and that is successful?

Do your two DCs have the same default gateway address in their NIC or is the gateway empty when you recover? if you still have your two DCs running now in instant recovery (or if you do another test), when you try to access shares from each other, could you try accessing the shares of DC2 from DC1 with this (\\DC2.mydomain.com.au\c$ instead of just \\DC2\c$) and see if you can get to the shares using the FQDN for the DCs?

Also, with your DCs NIC settings......do both DCs point to each other for DNS or do they point to some other server? They should point to themselves as the primary and then to each other as the secondary - is this how you have your DNS settings on both your DCs?
i.e.
DC1 DNS
Primary = DC1
Secondary = DC2

DC2 DNS
Primary = DC2
Secondary = DC1

If your settings are not like this - could you change them to that - back them up again, then try another instant recovery with those settings in place?
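
As a quick way to compare your NIC settings against the layout above, here is a minimal sketch in Python. It only checks the pattern I described (each DC lists itself as primary DNS and another DC as secondary); whether that pattern is the right one for your environment is a separate question. The function name check_dns_order and the DC names are made up for illustration, and you would fill in the dictionary by hand from each DC's NIC properties.

```python
from typing import Dict, List, Tuple


def check_dns_order(dns: Dict[str, Tuple[str, str]]) -> List[str]:
    """Report DCs whose DNS order differs from 'self first, partner second'.

    `dns` maps a DC name to its (primary, secondary) DNS server names.
    """
    problems = []
    for dc, (primary, secondary) in dns.items():
        if primary != dc:
            problems.append(f"{dc}: primary DNS is {primary}, expected {dc}")
        if secondary == dc or secondary not in dns:
            problems.append(f"{dc}: secondary DNS is {secondary}, expected another DC")
    return problems


if __name__ == "__main__":
    settings = {"DC1": ("DC1", "DC2"), "DC2": ("DC2", "DC1")}
    for line in check_dns_order(settings) or ["DNS order matches the layout above"]:
        print(line)
```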

When you do a SureBackup test and it works, what other servers do you bring online at the same time in the SureBackup test with your DCs? Does your SureBackup test that works only include your two DCs, or are there other servers in there too? This is where I was going with DNS - do you have more DNS servers in your network besides the two DCs?

tsightler
VP, Product Management
Posts: 5421
Liked: 2243 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by tsightler » Aug 01, 2013 3:08 am 5 people like this post

The criticism regarding this is fair. I actually did spend some time working on such a document (and yes, I'm a Veeam "greenie"), and have continued to work on it during available time, so there will be a document forthcoming. Publishing a document isn't just a matter of creating it, you must actually test all of those procedures across a range of potential platforms (2003, 2008, 2012), each having their uniqueness, and verify that the procedures work. It's a time consuming process, and even then, environmental uniqueness can still cause issues.

However, in the interim, I thought I'd share some information that might be useful. For one thing, the Surebackup process is indeed quite different than a normal restore, and there's a very good reason for that. The Veeam "automatic" DC restore process performs a non-authoritative restore of the DC because, in most cases DCs are restored when there are still other functional DCs in the environment. In this case the non-authoritative restore process should truly be 100% "automatic". In other words, if I have 3 DCs, and after Windows updates last night one BSODs, restoring that single DC VM should be a 100% automatic process. The automatic recovery should also work for environments with only a single DC.

Most customers, however, have multiple DCs, but when running Surebackup it's quite common that, while the environment may have multiple DCs in production, only a single DC is started in the Surebackup environment. This means the server must effectively stand on its own. It's in an isolated environment, so it's not going to find its replication partners, etc., and we don't want it to waste a lot of time attempting to do so. To prevent this, during the "Configuring DC" phase of the Surebackup process, Veeam makes some changes to the registry of the DC prior to powering it on in the Surebackup lab. These changes force an authoritative restore of the DC, and specifically of SYSVOL. You can see the exact changes being made by looking in the Surebackup job log in the Veeam log directory. Open up the logs and search for "PrepareDC" and you'll quickly find the section where this "magic" is performed during Surebackup, including all of the gory details of the registry entries, but in summary it's the following:

1. Set Burflags = D4 to force authoritative SYSVOL restore for systems using FRS for SYSVOL replication (i.e. Windows 2003 and older and potentially upgraded AD environments where users have yet to migrate SYSVOL to DFS-R)
2. Set NTDS\Repl Perform Initial Synchronizations = 0 to keep the DC from wasting 15 minutes (or more) attempting to contact replication peers which simply aren't going to be there
3. Set DFSR\Restore\"SYSVOL" = authoritative to force authoritative restore of SYSVOL for systems using DFS-R (hopefully all Windows 2008R2 DCs and newer)
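
For reference, the three changes above can be written down as data. This is a minimal sketch in Python that only builds and prints a list; it does not touch the registry. The full key paths under HKLM\SYSTEM\CurrentControlSet\Services, the value types, and the names RegistryChange and surebackup_dc_changes are assumptions added for illustration and are not copied from a Surebackup log, so check the "PrepareDC" section of your own log for the exact entries.

```python
from typing import List, NamedTuple, Union

# Assumed common parent key for the three services involved.
SERVICES = r"HKLM\SYSTEM\CurrentControlSet\Services"


class RegistryChange(NamedTuple):
    key: str
    value_name: str
    value_type: str
    data: Union[int, str]
    purpose: str


def surebackup_dc_changes() -> List[RegistryChange]:
    """Describe the three changes from the summary; nothing is applied."""
    return [
        RegistryChange(
            SERVICES + r"\NtFrs\Parameters\Backup/Restore\Process at Startup",
            "BurFlags", "REG_DWORD", 0xD4,
            "authoritative SYSVOL restore when SYSVOL replicates over FRS",
        ),
        RegistryChange(
            SERVICES + r"\NTDS\Parameters",
            "Repl Perform Initial Synchronizations", "REG_DWORD", 0,
            "skip the wait for replication peers that are not there",
        ),
        RegistryChange(
            SERVICES + r"\DFSR\Restore",
            "SYSVOL", "REG_SZ", "authoritative",
            "authoritative SYSVOL restore when SYSVOL replicates over DFS-R",
        ),
    ]


if __name__ == "__main__":
    for change in surebackup_dc_changes():
        print(f"{change.key}\\{change.value_name} = {change.data!r} ({change.purpose})")
```

Keeping the changes as plain data makes it easy to compare them line by line with what the log shows before deciding whether to apply anything by hand.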

So now we get to the 100% full DC recovery scenario, i.e. "I've lost all my DCs". When you restore a DC using either full or instant recovery (as stated earlier, there should be no difference other than performance, which might increase the time it takes to come online), the automatic recovery process performs a non-authoritative restore, reboots, and then starts looking for its other DCs to sync up. Because all DCs are gone and there are no other partners available, the replication may take 15 minutes (or longer) to even start (in the absence of the Repl Perform Initial Synchronizations key used by Surebackup). This is why restoring all DCs together will generally work better, as this timeout to start SYSVOL replication is avoided. However, since you have restored all DCs non-authoritatively, they'll likely all be waiting around hoping that one of them claims to be the authoritative SYSVOL so that they can start replicating. You'll need to designate one of these DCs as authoritative for SYSVOL, and the procedure to do this varies slightly based on whether you are using FRS or DFSR for SYSVOL replication.

So if I were doing a complete disaster restore, I'd restore two of the original DCs, power them on, wait for their reboot, and force one to become authoritative for SYSVOL, then restore the other DCs and they should recover automatically.
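
The order described above can be sketched as a small Python helper that just lists the steps for a given set of DCs. It is an illustration of the sequence only, not a tested procedure: the function name full_disaster_plan is made up, treating the first DC in the list as the authoritative one is an arbitrary assumption, and the actual "force authoritative" step still has to be done by hand using the FRS or DFSR procedure.

```python
from typing import List


def full_disaster_plan(dcs: List[str]) -> List[str]:
    """Return the restore order described above as a list of steps.

    Assumption for this sketch: the first DC in `dcs` is the one that
    will be made authoritative for SYSVOL.
    """
    if len(dcs) < 2:
        raise ValueError("this sketch assumes at least two DCs are being restored")
    first, second, rest = dcs[0], dcs[1], dcs[2:]
    steps = [
        f"restore and power on {first} and {second} together",
        f"wait for {first} and {second} to finish their automatic reboot",
        f"force {first} to become authoritative for SYSVOL (FRS or DFSR procedure)",
    ]
    # Remaining DCs are restored afterwards and should recover automatically.
    steps += [f"restore {dc} and let it recover automatically" for dc in rest]
    return steps


if __name__ == "__main__":
    for number, step in enumerate(full_disaster_plan(["DC1", "DC2", "DC3"]), start=1):
        print(number, step)
```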

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Aug 01, 2013 5:53 am 1 person likes this post

Thanks for this post Tom - very informative! and good to hear from a greenie.
I understand that it would take quite a bit of effort/time to release a document like this - it's very good news to hear that a document like this is 'incoming'. It will be appreciated. Your post alone puts down some very good detail and helps clear things up!

From the information you provided - it is clear why people have been seeing successful restores of their DCs/AD environment when using surebackup.....veeam is actually forcing one of the DCs to be authoritative.

But now i am wondering about the 100% full DC recovery scenario....
How am i (and assume im not alone...) able to successfully recover all DCs and the AD environment without manually forcing one of the recovered DCs into 'authoritative mode'? On recovery of all my DCs at the same time, they both go into directory services restore mode....
then they both reboot......
then they both sit at the ctrl alt del screen for a while.....looking for each other....waiting for someone to become authoritative....
then they both reboot again....
when they come back on line, they dont reboot again - they are all happy and talking to each other and AD/replication is functioning.....all without me doing anything.

Is it possible for the OS's/DCs to successfully elect an authoritative DC on their own during that 'wait period'? or this just doesnt happen automatically....ever...with any OS? Is there any other explanation for how a 100% full DC recovery scenario can be successful without manually setting an authoritative DC.

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Aug 01, 2013 6:28 am

Gav.

This is the CORE:
Unison wrote: I dont see how veeam could make the DC recovery process any easier/better, but maybe at the very least, Veeam could release an extensive technical document/OS that could walk through step by step in detail exactly what SHOULD happen when you recover a DC (what the veeam developers expect through the whole process).....it should show what Veeam is in control of, when veeam is in control, what veeam changes in the OS, at what points those changes are made, when changes are changed back if they are changed back, at what point the OS takes over DC recovery, what the OS is looking for, what changes the OS makes (like setting new identifiers), when reboots should happen, how many reboots should happen, why a reboot happens, rough time periods between events etc etc etc.....with a detailed document like this, at least we could have a fighting chance of tracking down what is NOT happening when a DC recovery is unsuccessful.

I will try to answer all your questions. But first, the gateway settings on the NICs are also retained.
In my case they currently point to an IP that does not exist in my test environment, so I have suspected this might be an issue from the get-go.
But since both DCs are on the same subnet, the gateway should not even be used.
I have tried removing the IP as well (of course after the restore and reboots..).
I will reconfigure my network and put up a router with the same IP, so I can have a working gateway in my tests. I will need some fibre converters, so it won't happen very soon.
My test environment is not running with instant restore anymore because that stops my nightly jobs…
But I tested the shares using the FQDN; it did not help.
Yes, both servers point to each other, but as primary DNS, so
i.e.
DC1 DNS
Primary = DC2
Secondary = DC1

DC2 DNS
Primary = DC1
Secondary = DC2
I really do not want to change this, because that should be the best-practice config.
I bring ALL servers up with the DCs in SureBackup: Exchange, certificate server, web server, SQL server, Autodesk Vault server..+++ Everything works.
No, I have only two DNS servers (the DCs).

When the DC’s are running in SureBackup they work just like in production, they can access their own and each other’s sysvol shares.

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Aug 01, 2013 6:40 am 1 person likes this post

Great info Tom! :)

This explains a lot.
There may be more to the SureBackup job than you reveal, but we always bring back BOTH DCs in the SureBackup job. Do both go into authoritative restore?...
If that works with two DCs in SureBackup, why is non-authoritative restore the default for normal/instant restores? Because I see no issues with the DCs when they are running in the SureBackup lab..

What would be ideal would be to restore 1 DC first… and be able to choose the restore mode (the first one you select, authoritative restore).
I guess when you pick restore on a VM, Veeam can check some metadata and see if it is a DC, and give you these choices during the restore dialog.
Even if Veeam can't check if it is a DC during the restore dialog, there could maybe be an Advanced option containing these DC restore choices??...:

No other DCs are running in your environment (authoritative restore) ( ) Radio button
At least one DC is running in your environment (non-authoritative restore) ( ) Radio button


Feature request?


I can agree that most of the time... maybe not 99% as Gostev previously stated, you want a non-authoritative restore.
But it is 99 times more important to get the DCs up and running FAST if you have none running. And then Veeam just makes the decision to do a non-authoritative restore, without even asking.
Unless you have been reading the forums, you have no idea what is actually happening...

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Aug 01, 2013 7:09 am

Tom has cleared up why SureBackup is working for you - Veeam's SureBackup is making one of your DCs recover in authoritative mode for you....so everything comes back online as it should.
In my last post, I am trying to get Tom to comment on why/how/if it should even be possible that I am able to recover both DCs at the same time in what would be a non-authoritative restore for both DCs and still have them eventually restore successfully. From his info I would not expect to be able to restore the way I do......but I am able to without manually making an authoritative DC....they seem to elect one on their own.
How can a non-authoritative restore of both DCs work in one environment, but not in another...

It might be worth a shot to change the DNS config on your NICs - depending on who you talk to, best practice is to have DCs point to themselves as primary DNS.....others say not to point to themselves as primary. Changing the NIC config for DNS on your DCs, especially short term, won't cause you any issues because just the order is changing - both DNS servers have the same data anyway. I've made the DCs point to themselves for primary DNS purely for this purpose. Could it be the difference... maybe it helps that when the DC comes online it will try itself for DNS first and eventually accept that....causing one of the DCs to become authoritative and then bringing everything else back online successfully.

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Aug 01, 2013 7:11 am 1 person likes this post

Fiskepudding wrote:What would be ideal would be to restore 1 DC first… and be able to choose the restore mode (the first one you select, authoritative restore).
I guess when you pick restore on a VM, Veeam can check some metadata and see if it is a DC, and give you these choices during the restore dialog.
Even if Veeam can't check if it is a DC during the restore dialog, there could maybe be an Advanced option containing these DC restore choices??...:

No other DCs are running in your environment (authoritative restore) ( ) Radio button
At least one DC is running in your environment (non-authoritative restore) ( ) Radio button


Feature request?
I think this would be an excellent option if this is indeed what has to happen.
Veeam is already doing this for DCs in SureBackup - so when we are restoring a VM, give us the choice to say that the VM being restored is a DC....and give us the choice to say if this is the FIRST and only DC being recovered at this point....i.e. make it authoritative. The Veeam admin will understand this.

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Aug 01, 2013 7:24 am

I think some Quote tags went wrong in the above post Gav :)
Unison wrote: It might be worth a shot to change the DNS config on your NICs - depending on who you talk to, best practice is to have DCs point to themselves as primary DNS.....others say not to point to themselves as primary. Changing the NIC config for DNS on your DCs, especially short term, won't cause you any issues because just the order is changing - both DNS servers have the same data anyway. I've made the DCs point to themselves for primary DNS purely for this purpose. Could it be the difference... maybe it helps that when the DC comes online it will try itself for DNS first and eventually accept that....causing one of the DCs to become authoritative and then bringing everything else back online successfully.
I can see that having the primary DNS set to itself could explain why the recovery works even with a NON-AUTHORITATIVE restore. But should I configure my DCs for recovery or for production? :)
It's just that I have never seen anyone recommend setting the DC's own IP as the primary DNS... a bit sceptical.. but I agree, it should not make THAT MUCH difference..

But it would be way better to control what type of restore you want, right there in the restore process.

Instead of doing a NON-AUTHORITATIVE restore when there are actually NO running DCs available, that's just..... well.. wrong... :?

tsightler
VP, Product Management
Posts: 5421
Liked: 2243 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by tsightler » Aug 01, 2013 10:52 am 1 person likes this post

Unison wrote:How am i (and assume im not alone...) able to successfully recover all DCs and the AD environment without manually forcing one of the recovered DCs into 'authoritative mode'?
So my honest answer to that question is, I'm not 100% sure, but I think I know the answer (at least, what appears to be the answer based on my own lab testing of dozens of recoveries). The simple version is that I believe DFS-R will eventually automatically recover, while customers using legacy FRS replication will always require intervention (BTW, we still see a LOT of customers that failed to migrate to DFS-R during AD upgrades and thus are still using FRS even with Windows 2008R2). I also believe that the reason for this is that FRS has no robust self-healing, so after being told to perform a non-authoritative restore, somebody has to claim to be authoritative, and thus the BurFlags setting is required to recover.

On the other hand, DFS-R has advanced self-healing capabilities that, in theory, should eventually recover using conflict resolution algorithms even in the absence of a true "authoritative" claim from one of the partners. I suspect that's where your "second" reboot happens, DFS finally syncs up, exports SYSVOL, and the system finally reboots into a normal working state. However, the process is actually much faster if you force one of the DFS-R nodes to become authoritative.
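For the FRS case, the BurFlags setting mentioned above is a registry value read by the File Replication Service at startup. The following is only a minimal sketch, not a Veeam procedure: it builds the `reg add` command text for the two documented values (D4 for authoritative, D2 for non-authoritative) and does not touch the registry itself. The function name `burflags_command` is made up for illustration.

```python
# Sketch only: builds the command text, it does not modify the registry.
# Registry location of BurFlags for the File Replication Service (NtFrs).
FRS_BACKUP_KEY = (
    r"HKLM\SYSTEM\CurrentControlSet\Services\NtFrs\Parameters"
    r"\Backup/Restore\Process at Startup"
)

# BurFlags values: 0xD4 = authoritative, 0xD2 = non-authoritative.
BURFLAGS = {"authoritative": 0xD4, "non-authoritative": 0xD2}


def burflags_command(mode: str) -> str:
    """Return a `reg add` command line that sets BurFlags for `mode`.

    Hypothetical helper for illustration; the value is written in decimal
    (0xD4 == 212, 0xD2 == 210).
    """
    if mode not in BURFLAGS:
        raise ValueError(f"unknown restore mode: {mode!r}")
    value = BURFLAGS[mode]
    return f'reg add "{FRS_BACKUP_KEY}" /v BurFlags /t REG_DWORD /d {value} /f'


if __name__ == "__main__":
    # On the first recovered DC when no other DC is running:
    print(burflags_command("authoritative"))
```

The value only takes effect when the File Replication Service starts, so the usual sequence is to stop the service, set the value, and start the service again on the DC that should become authoritative.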

On a side note, for customers that are still using FRS in newer DC environments, please consider migrating to DFS-R as soon as possible. Here's a great link on the reasons why.

http://blogs.technet.com/b/askds/archiv ... -dfsr.aspx

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Aug 02, 2013 2:49 am

This might very well be the simple answer as to why some see fully automatic DC recovery and why others do not (providing nothing else mentioned in these posts is your issue).

:arrow: If your DCs are running DFSR - then DC recovery will be fully automatic (with a bit of waiting for the auto second reboot).
:arrow: If your DCs are running FRS - then DC recovery will include some manual processes. You will need to recover your first DC into authoritative mode (by setting burflags etc) then recover your other DCs after that.
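The two cases above, plus the default non-authoritative behaviour discussed earlier in the thread, can be written down as a small decision helper. This is just a sketch of the reasoning in these posts, not an official procedure; the function name `recovery_steps` and the wording of the steps are made up for illustration.

```python
from typing import List


def recovery_steps(engine: str, other_dc_running: bool) -> List[str]:
    """Summarise the recovery path discussed in this thread.

    engine: "DFSR" or "FRS" (how SYSVOL is replicated).
    other_dc_running: True if at least one healthy DC is still online.
    """
    engine = engine.upper()
    if engine not in ("DFSR", "FRS"):
        raise ValueError(f"unknown replication engine: {engine!r}")

    if other_dc_running:
        # The default non-authoritative restore is what you want here.
        return [
            "Restore the DC (non-authoritative, the default).",
            "Let it replicate from the DC that is still running.",
        ]

    if engine == "DFSR":
        return [
            "Restore the DCs and power them on together.",
            "Wait for the automatic second reboot.",
            "Optional: force one DC authoritative to speed things up.",
        ]

    # FRS with no DC left running: manual intervention is needed.
    return [
        "Restore the first DC.",
        "Set BurFlags to authoritative (D4) on that DC.",
        "Restore the remaining DCs afterwards.",
    ]


if __name__ == "__main__":
    for step in recovery_steps("FRS", other_dc_running=False):
        print("-", step)
```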


I've just gone and done another DC recovery test now to try and get some timings together.
I actually have 3 DCs in my environment:
DC1 = 08R2
DC2 = 08R2
DC3 = 03 - about to be decommissioned

I just recovered DC1 and DC2 with instant recovery.
Changed their network details in VMware - then powered them on at the same time.
They both came up in Directory Services Restore Mode, then rebooted within 3 mins of being powered on.
Both then came to the Ctrl+Alt+Del screen.
3 mins later - DC1 rebooted on its own - then came back to the Ctrl+Alt+Del screen.
5 mins later - DC2 rebooted on its own - then came back to the Ctrl+Alt+Del screen.
About 15 mins later I logged into both of them (I was busy with something else but still had both DCs visible on my screen so I would notice if they did anything else...)

I then checked all the usual stuff: shares were OK, replication was working, AD was working, DNS was working etc etc.
I also checked the logs and there were entries in both the DFSR and FRS logs......both showed that there was trouble contacting other partners. But at the times matching the 3 and 5 min reboots above, both DCs' logs showed success contacting each other and the wonderful message we all want to see.... "no longer preventing this system from becoming a DC". The logs showed that there is no reason to wait to log in after the DCs do their second auto reboot....after that time, the logs show everything is working.
As you suspected, it seems that DFSR is running through its self-healing algorithms and nominating an authoritative server - resulting in everything coming back online. From the beginning of recovery in Veeam to the point in time where the logs showed everything was OK was about 13 minutes. Less than 15 mins for total DC recovery seems pretty good to me 8)

I hope that this sort of information about DFSR and FRS is included in the document on DC recovery with Veeam - I think it is highly important and might be the reason why some see consistent recoveries and others are left manually bringing their DCs back together.....worrying about the validity of their DC recovery.

Tom, do you think this document might still be months/years away based on where you got up to with it? A lot of detail on this subject can now be found in these posts - but that would take someone so long to read and patch together.
When it does come along - will you post about it here and maybe even get Gostev to release it in his newsletter emails? Thanks for your efforts Tom! :)

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Aug 02, 2013 3:02 am

Espen, do Tom's comments about DFSR and FRS apply to you? When you recover your DCs via instant recovery, then log in to them - are your DFSR logs in Event Viewer saying anything to you? Or do you only see FRS events?
If you are seeing DFSR events like "having trouble contacting/replicating....." and they never change from that - it might be worth a shot to change your DNS settings around to see if this solves the problem. A Google search on "DNS settings for DCs" yields many opinions for both options, but I have never heard of anyone having issues with setting their DCs as themselves for primary DNS.

tsightler
VP, Product Management
Posts: 5421
Liked: 2243 times
Joined: Jun 05, 2009 12:57 pm
Full Name: Tom Sightler
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by tsightler » Aug 02, 2013 3:28 am 1 person likes this post

Unison wrote:Tom, do you think this document might still be months/years away based on where you got up to with it? A lot of detail on this subject can now be found in these posts - but that would take someone so long to read and patch together.
When it does come along - will you post about it here and maybe even get Gostev to release it in his newsletter emails? Thanks for your efforts Tom! :)
So the rebirth of this thread has reignited my efforts to create this document. I'll see about putting together a draft in the next couple of weeks as long as I can find enough time to test the recovery scenarios. It may take a little time from that point to become an officially published doc, which would be my goal, but I'll provide something in the interim. Ideally I can coax a few people in this thread to test the procedures I write up as that's part of the challenge. It's easy enough to create a lab and test things, but real world environments have a tendency to shake out additional issues.

If you don't see a draft posted in this thread in the next couple of weeks, ask for an update or shoot me a PM. I think the bulk of the information is here in the thread now, so it's just a matter of formalizing it and testing the documented procedure.

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Aug 02, 2013 5:55 am

Fiskepudding wrote: Different AD tools works for periodes, then dont work for periodes, then work again....
Lots of errors (Netlogon,DFS service and group policy) in event viewer.

I posted the above message a few days ago. I see DFS but NOT DFSR.


I have multiple of these:
Image

Seems to end up with this about 30 min later:
Image


Might sound like a stupid question, but is there any other way to know if my 2008R2 servers use DFSR or FRS?
Both our DCs are 2008R2 and the domain functional level is also 2008R2.

I also get this ("The DFS Namespace service successfully initialized cross forest trust information on this domain controller"):
http://www.cia.as/www/Veeam/1_2.png

SYSVOL still does not work. Currently no AD tools are working either.


Other messages, related to the domain:
http://www.cia.as/www/Veeam/1_2.png
http://www.cia.as/www/Veeam/1_3.png
http://www.cia.as/www/Veeam/1_4.png
http://www.cia.as/www/Veeam/1_5.png
http://www.cia.as/www/Veeam/1_6.png
http://www.cia.as/www/Veeam/1_7.png
http://www.cia.as/www/Veeam/2_2.png
http://www.cia.as/www/Veeam/2_3.png

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Aug 02, 2013 5:59 am

tsightler wrote:So the rebirth of this thread has reignited my efforts to create this document. I'll see about putting together a draft in the next couple of weeks as long as I can find enough time to test the recovery scenarios. It may take a little time from that point to become an officially published doc, which would be my goal, but I'll provide something in the interim. Ideally I can coax a few people in this thread to test the procedures I write up as that's part of the challenge. It's easy enough to create a lab and test things, but real world environments have a tendency to shake out additional issues.

If you don't see a draft posted in this thread in the next couple of weeks, ask for an update or shoot me a PM. I think the bulk of the information is here in the thread now, so it's just a matter of formalizing it and testing the documented procedure.
Excellent! :D
I for one would be happy to look at your drafts and test the procedures that align with my environment - I'm sure there will be others here too who will be happy to test. I don't mind looking at a very raw copy, so it's not an issue if it's not in an official format at this early stage.
Agreed, pretty much everything needed is in here - just the correct stuff needs to be pulled together and put in the right order....then some stepped-out documented procedures for different scenarios.
I will wait for an alert from this post or a PM from you if you want me to look at/test anything specific here. Will check in with you in a couple of weeks if nothing like that pops up.
Thanks for taking on this project again Tom. It will be very handy to have and should make it into the DRP of any shop running Veeam!

Fiskepudding
Expert
Posts: 213
Liked: 26 times
Joined: Feb 01, 2012 7:24 am
Full Name: Espen Dykesteen
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Fiskepudding » Aug 02, 2013 6:06 am

I would also gladly help, and test more in our environment.
First thing might be to change the DNS the other way around :)

And maybe the gateway that points to a non-existent IP at the moment.

Unison
Enthusiast
Posts: 95
Liked: 16 times
Joined: Feb 17, 2012 6:02 am
Full Name: Gav
Contact:

Re: Veeam B&R v5 recovery of a domain controller

Post by Unison » Aug 02, 2013 6:24 am

If you flip your DNS settings and test - post back the results here so we know if that helped :)
If that second reboot doesn't come within 15 mins of getting to the Ctrl+Alt+Del screen then you can pretty much guarantee that it's not coming and your DCs are still broken.
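If you would rather not watch the console, the 15 minute wait can be scripted by polling for the SYSVOL and NETLOGON shares. This is a rough sketch only: the host name `dc1` is a placeholder, the 15 minute default just mirrors the rule of thumb above, and checking the UNC paths with `os.path.isdir` assumes the script runs on a Windows machine in the same isolated network.

```python
import os
import time
from typing import Callable


def shares_present(host: str) -> bool:
    """True if both \\\\host\\SYSVOL and \\\\host\\NETLOGON are reachable."""
    return all(
        os.path.isdir("\\\\" + host + "\\" + share)
        for share in ("SYSVOL", "NETLOGON")
    )


def wait_for_shares(
    shares_ready: Callable[[], bool],
    timeout_s: float = 15 * 60,   # rule of thumb from this thread
    poll_s: float = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `shares_ready` until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if shares_ready():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


if __name__ == "__main__":
    # "dc1" is a hypothetical host name; replace it with a restored DC.
    ok = wait_for_shares(lambda: shares_present("dc1"))
    print("shares are up" if ok else "no shares after 15 minutes")
```

A `False` result here only says the shares never appeared within the window; the event logs on the DC are still the place to find out why.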

At that point - it could be that you don't have DFS Replication running on your DCs.....but this seems strange as I thought DFSR was installed/enabled by default when you build up an 08R2 DC. You're even running a higher domain functional level than me.....I'm still on functional level 2003 because I still have one 2003 DC left.

*wish it was easier to post/upload an image here*.....but in your Event Viewer on the DC, under "Applications and Services Logs" - you don't have a "DFS Replication" log in there....just a "File Replication Service" log?
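Besides looking at which event logs exist, another way to check which engine replicates SYSVOL is the SYSVOL migration state reported by `dfsrmig /getglobalstate` on a DC. The sketch below only parses text that you paste in or capture yourself; the exact output wording it expects (`Current DFSR global state: 'Eliminated'`) is an assumption, and the function name `sysvol_engine` is made up for illustration.

```python
import re

# What each SYSVOL migration state means for replication.
STATE_MEANING = {
    "Start": "FRS replicates SYSVOL (migration not begun)",
    "Prepared": "FRS still replicates SYSVOL; DFSR replicates a copy",
    "Redirected": "DFSR replicates the live SYSVOL; FRS copy still kept",
    "Eliminated": "DFSR only; FRS is no longer used for SYSVOL",
}


def sysvol_engine(dfsrmig_output: str) -> str:
    """Return "DFSR" or "FRS" based on the reported global state.

    Assumes the output contains a line such as:
        Current DFSR global state: 'Eliminated'
    """
    match = re.search(r"global state:\s*'(\w+)'", dfsrmig_output, re.IGNORECASE)
    if not match:
        raise ValueError("could not find a global state in the output")
    state = match.group(1).capitalize()
    if state not in STATE_MEANING:
        raise ValueError(f"unexpected state: {state!r}")
    # SYSVOL is served from the DFSR copy once the state is Redirected.
    return "DFSR" if state in ("Redirected", "Eliminated") else "FRS"


if __name__ == "__main__":
    sample = "Current DFSR global state: 'Eliminated'\nSucceeded."
    print(sysvol_engine(sample))
```

If the result is `FRS`, the manual authoritative step discussed earlier in the thread applies; if it is `DFSR`, the automatic second reboot is the behaviour to expect.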
