rbrambley wrote: Correct me if I misunderstand the copied paragraphs below and your obvious Exchange expertise
No expert here - I've just come across it before.
Did a DR simulation last month, so the process for DAG switchovers is quite familiar.
rbrambley wrote: In terms of minimizing the headaches of backing up Exchange DAGs and doing restores with Veeam (vPower, vLabs, U-AIR, etc.), odd node DAGs appear to be the way to go. So backing up a single, passive node of a 3 member DAG would allow the VM to boot and mount by itself (assuming AD, DNS, etc. are there too).
I think the number of copies in a DAG should really be based on uptime requirements and capacity sizing, as well as site topology. While an odd number of copies does do away with the FSW requirement, the FSW really only matters in a DR scenario, and it doesn't add a lot of complexity to the switchover steps given everything else that would be going on at the time.
rbrambley wrote: A 5 member DAG would require 2 members to be backed up and restored.
Even node DAGs require a witness VM too, so a 2 member DAG needs 1 VM, 4 needs 2, etc. + the witness VM
Almost. A 5 member DAG would require 3 available copies to form a quorum (majority). 2 wouldn't cut it, as Exchange would assume the other 3 are online, talking to each other and have formed a quorum, so it would dismount these 2 to avoid split brain.
Upshot is the number of available copies required for various DAG configs:
- 8 Node DAG -> Requires 4 copies + FSW
- 7 Node DAG -> Requires 4 copies
- 6 Node DAG -> Requires 3 copies + FSW
- 5 Node DAG -> Requires 3 copies
- 4 Node DAG -> Requires 2 copies + FSW
- 3 Node DAG -> Requires 2 copies
- 2 Node DAG -> Requires 1 copy + FSW
- 1 Node DAG -> Is a happy chappy.
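For anyone who wants to check the list above for other DAG sizes, here is a rough sketch of the arithmetic in Python. It assumes plain majority voting where each DAG member has one vote and an even-sized DAG gets one extra vote from the FSW. The function name is my own and has nothing to do with any Exchange or Veeam API.

```python
from typing import Tuple


def quorum_requirement(nodes: int) -> Tuple[int, bool]:
    """Return (dag_copies_needed, fsw_needed) for a DAG with `nodes` members.

    Assumption: simple majority voting, one vote per member, plus one
    vote from the file share witness (FSW) when the member count is even.
    """
    if nodes < 1:
        raise ValueError("a DAG needs at least one member")
    fsw_needed = nodes % 2 == 0
    total_votes = nodes + (1 if fsw_needed else 0)
    majority = total_votes // 2 + 1
    # When the FSW is one of the majority votes, one fewer DAG copy is needed.
    copies = majority - (1 if fsw_needed else 0)
    return copies, fsw_needed


if __name__ == "__main__":
    for n in range(8, 0, -1):
        copies, fsw = quorum_requirement(n)
        print(f"{n} Node DAG -> Requires {copies} copies" + (" + FSW" if fsw else ""))
```

This only covers the vote counting. It says nothing about DAC mode or how quickly the members have to see each other after boot.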
So to be able to bring up a live environment automatically using Veeam (vPower etc.), you need the number of servers listed above to form the quorum - BUT even that's only possible if DAC mode is disabled (not recommended).
Even with DAC mode disabled it may still not work, given Veeam brings the servers up sequentially. I'm not sure of the tolerance period, but if the copies are not all up with network connectivity to each other within a short period, then Exchange will assume they are isolated and dismount.
Again, there are manual steps that can be run to force a DAG failover after bootup, so I assume the PowerShell commands could be injected into a lab.
To be honest - I've never really tried it. We have a 3 DAG cluster (across two sites), and only back up a single copy (plus the CAS). It's an active copy, but backed up at night, so there's no impact on production as long as it doesn't overlap with the Exchange maintenance window (default between 1am and 5am from memory).
For restores I just bring up this copy and use other tools to mount the EDB (for the time being anyway). Even this is rarely required, as the built-in Exchange retention policy (aka dumpster) provides us sufficient short-term retention to cover 99% of helpdesk calls.
So - after all that... What's the best practice? My view is that DAGs themselves effectively give you a DR solution built in. By definition you have an active/up-to-date copy, presumably at another location, ready to go. It's one of those (few) areas where Veeam doesn't prove to be as useful for replication.
For retrieval of longer term items Veeam is still a must, but you're going to struggle to get a live environment automatically restored and DBs mounted into a lab (for SureBackup etc.) in anything but a standalone or simple DAG environment, unless you're willing to back up the majority (quorum) of your DAG copies.
In that case it doesn't really matter whether you have an odd or even number of copies, as the 'deciding' vote is still effectively an additional server to bring up, whether that is the FSW or the (n/2+1)'th DAG copy - unless you have your FSW on a server that is already being brought up in the lab (e.g. DC or CAS).
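To put numbers on that last point, here is a small Python sketch counting how many extra VMs the lab needs for quorum. It assumes simple majority voting with an FSW vote only for even-sized DAGs. The function and parameter names are made up for illustration, and DCs/CAS servers are not counted since they would be in the lab anyway.

```python
def lab_vms_for_quorum(nodes: int, fsw_on_lab_server: bool = False) -> int:
    """Count the VMs (DAG copies plus a separate FSW, if any) needed for quorum.

    Assumption: one vote per DAG member, plus one FSW vote for even-sized
    DAGs. `fsw_on_lab_server` means the FSW share lives on a server that is
    already in the lab (e.g. a DC or CAS), so it costs no extra VM.
    """
    if nodes < 1:
        raise ValueError("a DAG needs at least one member")
    if nodes % 2 == 1:
        # Odd: the deciding vote is the (n/2+1)'th DAG copy.
        return nodes // 2 + 1
    # Even: half the copies, plus the FSW as the deciding vote.
    copies = nodes // 2
    return copies if fsw_on_lab_server else copies + 1


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(n, lab_vms_for_quorum(n), lab_vms_for_quorum(n, fsw_on_lab_server=True))
```

So a 4 node and a 5 node DAG both come out at 3 VMs, and the even-sized DAG only gets cheaper when the FSW sits on something already in the lab.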