While failing over is quite straightforward and does not take very long (the original VM is synced while running, then powered down, a last-minute sync is done and the replica is powered on), failing back that very same machine takes at least 3 times as long as the failover, if not hours or days.
Now imagine I want to do maintenance on the primary Hyper-V host where some or all VMs are running. Well yes, let's perform a failover, install some gigabytes of MS updates and then fail the VMs back.
No way! I tried this with a test VM that has a dynamic hard drive with 117G of occupied space, shown as 130G in Windows. The VHD file itself is 125G.
While failover took 8 minutes, failback took three quarters of an hour. Okay, I opened a support call and asked what was going wrong. The answer was that nothing is going wrong.
It seems like failback works/worked like this:
- take a snapshot of the running replica
- read the original machine to check for changes
- calculate the differences and throw the result in trash
- copy back the entire replica (130G) anyway
- then write another 130G, because the dynamic VHD contains unpartitioned space of another 130G (no idea where the non-existent data is written to, maybe to some single dummy sector?)
- and so on and so on
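To put rough numbers on the steps above, here is a minimal sketch in Python. It only models what I described, not how the product really works internally. The function name and the 5G of changed data are made up for illustration; only the 130G disk size comes from my test.

```python
def failback_volume_gb(disk_size_gb, changed_gb, use_change_tracking):
    """Rough estimate of how many GB get written to the original VM on failback."""
    if use_change_tracking:
        # Wished-for behaviour: only the blocks changed on the replica go back.
        return changed_gb
    # Behaviour as described above: the whole replica is copied back,
    # followed by a second pass of the same size.
    return 2 * disk_size_gb


# 130G is the disk size from my test VM; 5G of changes is a hypothetical value.
full_copy = failback_volume_gb(130, 5, use_change_tracking=False)
changes_only = failback_volume_gb(130, 5, use_change_tracking=True)
print(f"full copy: {full_copy}G, changes only: {changes_only}G")
```

Under these assumptions the full copy moves 260G, while a change-based failback would only move whatever was actually modified while the replica was in production.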
Now what I'd like to have is a fast and straightforward failback like this:
- the amount of time when neither the replica nor the original VM is available must be as short as possible, so please sync it online and, if possible, based on CBT
- there must be an option whether failback shall power down the replica and power up the original automatically or supervised (I have no desire to stare at a progress bar for eight hours or so)
- if supervised failback is selected, the replica must be kept running until the "do it now" button is pushed
- when that button is pushed, the productive replica(s) are powered down, the remaining data is synced and the original VM(s) are powered on
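The wish list above could be sketched as a small state machine. This is just my idea of the workflow in Python, not anything the product offers; all class, state and method names are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    REPLICA_RUNNING = auto()
    SYNCING_ONLINE = auto()
    WAITING_FOR_GO = auto()
    FINAL_SYNC = auto()
    ORIGINAL_RUNNING = auto()


class Failback:
    """Hypothetical failback workflow with an automatic and a supervised mode."""

    def __init__(self, supervised):
        self.supervised = supervised
        self.state = State.REPLICA_RUNNING
        self.log = []

    def start(self):
        # The bulk of the data is synced while the replica stays in production.
        self._step(State.SYNCING_ONLINE, "sync changed blocks, replica keeps running")
        if self.supervised:
            # Replica keeps running until somebody pushes the "do it now" button.
            self._step(State.WAITING_FOR_GO, "waiting for operator")
        else:
            self._switch_over()

    def confirm(self):
        """The "do it now" button; only valid in supervised mode after start()."""
        if self.state is not State.WAITING_FOR_GO:
            raise RuntimeError("nothing to confirm in state " + self.state.name)
        self._switch_over()

    def _switch_over(self):
        # The only phase in which neither VM is available.
        self._step(State.FINAL_SYNC, "power down replica, sync remaining data")
        self._step(State.ORIGINAL_RUNNING, "power on original VM")

    def _step(self, state, note):
        self.state = state
        self.log.append((state.name, note))
```

In supervised mode, `Failback(supervised=True).start()` stops in `WAITING_FOR_GO` and nothing is powered down until `confirm()` is called, so the downtime is limited to the final sync and the power-on.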
I do hope that somebody understands that there seems to be room for improvement in failback. And please, planned failover and planned failback are enterprise options. While I am not talking about the enterprise edition, I would bet several pizzas that the enterprise edition suffers from slow failback as well.