SureBackup and 2 DC's: One works, the other won't

by YoMarK » Wed Mar 09, 2016 3:54 pm

Case 01704398

So I have two Windows Server 2008 R2 DCs. Both have comparable network configurations and are in the same subnet.
Each has a single NIC, and VMware Tools are up to date.
Both have the same gateway (in the SureBackup job, this is the Veeam proxy appliance).
One works fine, the other does not, and I can't think of a single reason why.

Unfortunately, I really need the one that isn't working for other SureBackup jobs, because it's the main DNS server, global catalog, DHCP server, et cetera.

The relevant part (I think) of the Veeam SureBackup job log:
Power on VM stabilization engine. Persistent snapshot presented: False
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] Algorithm 'Stable IP' has been created
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > Enabled = True
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > SlotName = PowerOnVm.StableIp
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > Mode = Analyze
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > UseHibernation = True
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > HibernateTime = 1 minute(s)
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > UseApipa = True
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > StabilizationFactor = 2
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > ShutdownFactor = 5
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > ApipaFactor = 3
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > SleepDelay = 5000 millisecond(s)
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > ShowChangesOnly = True
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] > SmartDetect = True
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] Waiting for IP-address during 600 second(s)
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] ===========================================================================================================================================================================
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | VM                                           | VM IP           | Real IP         | Device ID   | Power State | VM State     | Tools Status | Version Status | Heartbeat |
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] ===========================================================================================================================================================================
[09.03.2016 11:35:29] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | ???.???.???.??? | ???.???.???.??? |           0 | PowerOn     | NotRunning   | NotRunning   | Current        | Gray      |
[09.03.2016 11:36:21] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | ???.???.???.??? | ???.???.???.??? |           0 | PowerOn     | Running      | NotRunning   | Current        | Gray      |
[09.03.2016 11:36:42] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | ???.???.???.??? | ???.???.???.??? |           0 | PowerOn     | NotRunning   | NotRunning   | Current        | Gray      |
[09.03.2016 11:36:52] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | ???.???.???.??? | ???.???.???.??? |           0 | PowerOn     | NotRunning   | NotRunning   | Current        | Red       |
[09.03.2016 11:37:55] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | 10.1.0.1        | 10.1.0.1        |           0 | PowerOn     | Running      | Ok           | Current        | Red       |
[09.03.2016 11:38:21] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | 10.1.0.1        | 10.1.0.1        |           0 | PowerOn     | Running      | Ok           | Current        | Yellow    |
[09.03.2016 11:38:21] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] IP-address has been detected in 00:02:52
[09.03.2016 11:38:21] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] Waiting for stable IP-address during 86 second(s)
[09.03.2016 11:38:52] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | ???.???.???.??? | ???.???.???.??? |           0 | PowerOn     | Running      | NotRunning   | Current        | Green     |
[09.03.2016 11:38:52] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] Either VM or VMware Tools have been restarted in 00:00:31 seconds(s)
[09.03.2016 11:38:52] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] Waiting for IP-address during 860 second(s)
[09.03.2016 11:39:24] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | ???.???.???.??? | ???.???.???.??? |           0 | PowerOn     | Running      | NotRunning   | Current        | Red       |
[09.03.2016 11:39:34] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | ???.???.???.??? | ???.???.???.??? |           0 | PowerOn     | NotRunning   | NotRunning   | Current        | Red       |
[09.03.2016 11:40:11] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | 10.1.0.1        | 10.1.0.1        |           0 | PowerOn     | Running      | Ok           | Current        | Red       |
[09.03.2016 11:40:11] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] IP-address has been detected in 00:01:18
[09.03.2016 11:40:11] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] Waiting for stable IP-address during 39 second(s)
[09.03.2016 11:40:42] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | 10.1.0.1        | 10.1.0.1        |           0 | PowerOn     | Running      | Ok           | Current        | Green     |
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] IP-address was stable during 00:00:42
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] ===========================================================================================================================================================================
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] Dump stabilizator engine
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] > StabilizatorAlgorithm = WaitForStableIp
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] > TimeOut = 600 second(s)
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] Dump stabilization point
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] > CreatedAt = 9-3-2016 10:35:29
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] > FinishedAt = 9-3-2016 10:40:53
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] > ProcessDuration = 00:05:24.1700780
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [PowerOnVm] > FixedPoint = True
[09.03.2016 11:40:53] <01> Warning  [SureBackup] [srvdomain07] [PowerOnVm] Results: cannot detect IP address
[09.03.2016 11:40:53] <01> Warning  [SureBackup] [srvdomain07] [PowerOnVm] Summary: OS booted up successfully
[09.03.2016 11:40:53] <01> Warning  [SureBackup] [srvdomain07] [PowerOnVm] End 'Powering on'
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] Begin 'Heartbeat test'
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] Dump operation
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] > InstalledVmTools = True
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] > HeartbeatStatus = Green
[09.03.2016 11:40:53] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] Begin 'Heartbeat status: analysing...'
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] End 'Heartbeat status: green'
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] Results: heartbeat is green, passed
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] Summary: 100% total pass rate
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [HeartbeatTest] End 'Heartbeat test'
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [PingTest] Begin 'Running ping test(s)'
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [PingTest] Dump operation
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [PingTest] > UpdateExternalIpAddress = True
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [PingTest] > NoVmNetworks = False
[09.03.2016 11:40:54] <01> Info     [SureBackup] [srvdomain07] [PingTest] > InstalledVmTools = True
[09.03.2016 11:40:54] <01> Info     [Soap] Loading 'vm-95758:VirtualMachine' hierarchy
[09.03.2016 11:40:55] <01> Info     [Soap] Loaded 29 elements
[09.03.2016 11:40:55] <01> Info     [Soap] Present 29 hierarchy objects from "srvvsphere.domain.local", 1 Datacenter(s), 1 HostSystem(s), 1 VirtualMachine(s), 17 Datastore(s), 2 ResourcePool(s).
[09.03.2016 11:40:55] <01> Info     [Soap] Outgoing connection 'srvvsphere.domain.local:443:administrator:False::0:1'.
[09.03.2016 11:40:55] <01> Info     [Soap] Connection 'srvvsphere.domain.local:443:administrator:False::0:1' is provided from the cache. Used: 2
[09.03.2016 11:40:55] <01> Info     [Soap] Connection 'srvvsphere.domain.local:443:administrator:False::0:1' is disposing.
[09.03.2016 11:40:55] <01> Info     [Soap] Loading 'vm-95758:VirtualMachine' hierarchy
[09.03.2016 11:40:55] <01> Info     [Soap] Loaded 29 elements
[09.03.2016 11:40:55] <01> Info     [Soap] Present 29 hierarchy objects from "srvvsphere.domain.local", 1 Datacenter(s), 1 HostSystem(s), 1 VirtualMachine(s), 17 Datastore(s), 2 ResourcePool(s).
[09.03.2016 11:40:55] <01> Info     [Soap] Outgoing connection 'srvvsphere.domain.local:443:administrator:False::0:1'.
[09.03.2016 11:40:55] <01> Info     [Soap] Connection 'srvvsphere.domain.local:443:administrator:False::0:1' is provided from the cache. Used: 2
[09.03.2016 11:40:55] <01> Info     [Soap] Connection 'srvvsphere.domain.local:443:administrator:False::0:1' is disposing.
[09.03.2016 11:40:55] <01> Warning  [SureBackup] [srvdomain07] [PingTest] Network adapter 1: name , not mapped
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] Dump collector
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] =======================================================
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] | # | IP | Masquerade IP   | Ping State | Fail Reason |
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] =======================================================
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] | - | -  | -               | -          | -           |
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] =======================================================
[09.03.2016 11:40:55] <01> Warning  [SureBackup] [srvdomain07] [PingTest] No successful ping(s), waiting for maximum boot time...
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] Note: operation will be repeated at 9-3-2016 11:46:19
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] [WaitForMaxBoot] Algorithm 'Wait' has been created
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] [WaitForMaxBoot] > Enabled = True
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] [WaitForMaxBoot] > SlotName = PingTest.WaitForMaxBoot
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] [WaitForMaxBoot] > SleepDelay = 5000 millisecond(s)
[09.03.2016 11:40:55] <01> Info     [SureBackup] [srvdomain07] [PingTest] [WaitForMaxBoot] Waiting for 324 more seconds...
[09.03.2016 11:46:21] <01> Info     [SureBackup] [srvdomain07] [PingTest] [WaitForMaxBoot] Waiting is finished


The issue seems to be that it cannot detect the IP address:
"Results: cannot detect IP address"
The strange thing is that it HAS detected the IP address (it is also shown in vCenter), and it even says it was stable: "IP-address was stable during 00:00:42".

During this time I have a stable ping going to the masqueraded IP from the Veeam server. At the same time I can log in to the server and have a stable ping going to the Veeam proxy appliance (the VM's gateway). A few minutes later the SureBackup job still fails because the ping test is unsuccessful.
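
For anyone reading a log like this, here is a minimal sketch (not a Veeam tool) of how the rows of the 'Stable IP' table could be reduced to the moments where the reported IP appears or disappears. The regular expression and the name ip_transitions are assumptions based only on the column layout in the excerpt above; the sample rows are copied from that excerpt with the column padding collapsed.

```python
import re

# One data row of the 'Stable IP' table. The layout is inferred from the
# excerpt above; other Veeam versions may format it differently (assumption).
ROW = re.compile(
    r"\[(?P<ts>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2})\].*?\[StableIp\] \|"
    r"\s*(?P<vm>\S+)\s*\|\s*(?P<vm_ip>\S+)\s*\|\s*(?P<real_ip>\S+)\s*\|"
    r"\s*\d+\s*\|\s*(?P<power>\S+)\s*\|\s*(?P<state>\S+)\s*\|"
    r"\s*(?P<tools>\S+)\s*\|\s*(?P<version>\S+)\s*\|\s*(?P<heartbeat>\S+)\s*\|"
)


def ip_transitions(lines):
    """Yield (timestamp, event) whenever the reported VM IP appears or disappears."""
    had_ip = False
    for line in lines:
        m = ROW.search(line)
        if not m:
            continue  # header, separator, or non-table line
        has_ip = "?" not in m["vm_ip"]  # '???.???.???.???' means no IP reported
        if has_ip != had_ip:
            event = "IP reported: " + m["vm_ip"] if has_ip else "IP lost"
            yield m["ts"], event
            had_ip = has_ip


# Rows taken from the log above (padding collapsed).
PREFIX = ("<01> Info [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] "
          "| srvdomain07_da74c82abad346b08cee9ad82d515959 | ")
NO_IP = "???.???.???.??? | ???.???.???.??? | 0 | PowerOn | "
IP = "10.1.0.1 | 10.1.0.1 | 0 | PowerOn | "
SAMPLE = [
    "[09.03.2016 11:35:29] " + PREFIX + NO_IP + "NotRunning | NotRunning | Current | Gray |",
    "[09.03.2016 11:37:55] " + PREFIX + IP + "Running | Ok | Current | Red |",
    "[09.03.2016 11:38:21] " + PREFIX + IP + "Running | Ok | Current | Yellow |",
    "[09.03.2016 11:38:52] " + PREFIX + NO_IP + "Running | NotRunning | Current | Green |",
    "[09.03.2016 11:40:11] " + PREFIX + IP + "Running | Ok | Current | Red |",
    "[09.03.2016 11:40:42] " + PREFIX + IP + "Running | Ok | Current | Green |",
]

for ts, event in ip_transitions(SAMPLE):
    print(ts, event)
```

On these rows it reports the IP at 11:37:55, loses it at 11:38:52 (the point where the log says "Either VM or VMware Tools have been restarted"), and reports it again at 11:40:11.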

Does anybody have any idea?

Thanks in advance!
YoMarK
Enthusiast
 
Posts: 35
Liked: 2 times
Joined: Mon Jul 13, 2009 12:50 pm
Location: The Netherlands
Full Name: Mark

Re: SureBackup and 2 DC's: One works, the other won't

by YoMarK » Thu Mar 10, 2016 4:39 pm

Veeam support says that it's not OK for the domain controller to reboot once, but I have found numerous threads on this forum (and on blogs) saying that this is normal behavior for domain controllers. That conflicts with what support says, especially since the other DC, where SureBackup works great, also reboots once.
But Veeam support insists that this is not normal behavior.

I don't want to be stubborn, but I want this problem fixed as soon as possible without causing problems in production.
The "fix" suggested by Veeam support consists of replacing the VeeamVSSsupport dll files with versions much older than the files that are already there (from Veeam B&R 9). I'm not ready to do that unless I'm sure that it does not impact other production backups and won't bite me in the ass later. I also can't see how this would improve IP detection in SureBackup.

My question: is it normal for a DC in a SureBackup job to do one reboot after first boot?

I think that all this is beside the point. From the log (first it says it has a stable IP, then in the same second it says it cannot detect the IP):
[09.03.2016 11:40:42] <01> Info [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] | srvdomain07_da74c82abad346b08cee9ad82d515959 | 10.1.0.1
[09.03.2016 11:40:53] <01> Info [SureBackup] [srvdomain07] [PowerOnVm] [StableIp] IP-address was stable during 00:00:42
[09.03.2016 11:40:53] <01> Warning [SureBackup] [srvdomain07] [PowerOnVm] Results: cannot detect IP address

The above is all AFTER the reboot. So: it detects the right IP, says it's stable, and then says it cannot detect the IP.
(Meanwhile, I can ping the masquerade IP from the Veeam server, am logged on to the DC via the console, and can ping the helper VM from the DC.)

Ideas?

Re: SureBackup and 2 DC's: One works, the other won't

by Gostev » Fri Mar 11, 2016 12:54 am

I do know for sure that DCs are handled differently in SureBackup jobs due to the fact that they are started up in the test environment by themselves (without other DCs present). SureBackup job does reconfigure some parameters in their registry before booting them up to account for that. I don't remember specific details though, will need to check with the devs.
Gostev
Veeam Software
 
Posts: 21390
Liked: 2349 times
Joined: Sun Jan 01, 2006 1:01 am
Location: Baar, Switzerland

Re: SureBackup and 2 DC's: One works, the other won't

by YoMarK » Mon Mar 14, 2016 2:53 pm

Yes, I can find this in numerous posts (on this forum) and in the documentation as well, yet the Veeam support engineer insists that the ("any") reboot is not OK (without responding to my claims and links to topics saying that the reboots ARE normal for DCs.... AARGH).
I want to try and test a lot of fixes, but I do NOT want problems with my production systems or with the backups of any of our 300 other VMs.

Re: SureBackup and 2 DC's: One works, the other won't

by foggy » Tue Mar 15, 2016 4:52 pm

Looks like, for some reason, the DC boots very fast the first time and gets a short reboot timeout value (it is half of the time required for the IP to stabilize). This results in the actual reboot happening at the time when the ping and other tests start to execute, so they expectedly fail. I believe support will be able to set a higher reboot timeout value to avoid this.
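
The "half of the time" relation can be checked against the log in the first post. Below is a minimal sketch that pairs each "IP-address has been detected in ..." line with the "Waiting for stable IP-address during N second(s)" line that follows it. Treating that wait as the reboot timeout described here, and linking it to the logged StabilizationFactor = 2, is an assumption on my part; the function names are hypothetical.

```python
import re


def to_seconds(hms):
    """Convert 'HH:MM:SS' to a number of seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def detect_and_wait_pairs(lines):
    """Return (detection_seconds, stable_wait_seconds) pairs in log order."""
    pairs = []
    detected = None
    for line in lines:
        m = re.search(r"IP-address has been detected in (\d{2}:\d{2}:\d{2})", line)
        if m:
            detected = to_seconds(m.group(1))
            continue
        m = re.search(r"Waiting for stable IP-address during (\d+) second", line)
        if m and detected is not None:
            pairs.append((detected, int(m.group(1))))
            detected = None
    return pairs


# Message parts copied from the log in the first post.
SAMPLE = [
    "[StableIp] IP-address has been detected in 00:02:52",
    "[StableIp] Waiting for stable IP-address during 86 second(s)",
    "[StableIp] IP-address has been detected in 00:01:18",
    "[StableIp] Waiting for stable IP-address during 39 second(s)",
]

for detected, wait in detect_and_wait_pairs(SAMPLE):
    # Assumption: the stable-IP wait is detection time divided by 2.
    print(f"detected after {detected}s, waited {wait}s, ratio {detected / wait:.1f}")
```

Both pairs in this log come out at a ratio of 2 (172 s and 86 s, then 78 s and 39 s), so a faster first boot does lead to a shorter wait.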

Also, it seems you have the MaxBootTime timeout set to 600 (instead of the default 1800); any reason for that?

As a side note, when you're restoring two DCs in a virtual lab, you'd better disable the DC and GC roles for one of them and set it to start some time after the other one. SureBackup performs an authoritative restore of the DC in the application group, so having two of them there and designating both as DCs makes them compete for supremacy and leads to undesired consequences.
foggy
Veeam Software
 
Posts: 14742
Liked: 1079 times
Joined: Mon Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson

