The support case has now been escalated within Microsoft support. They did suggest that this may be by design, in which case I expect they won't do anything to try to resolve it. Microsoft have performed the same tests on our environment that Anatoly performed, so they know the issue is VSS, although they did try to pass it off as an issue with the VSS writers, along with various other excuses.
I have managed to reduce the pause to the VMs by doing the following:
1. Make sure all VMs in a job reside on the same node.
2. Make sure the CSV that the VMs in the backup job reside on is owned by the node the VMs reside on.
The two changes above have reduced the pause to a couple of seconds, but in the long term this would just not be manageable once we onboard the other 320 VMs. I've also tried reducing the size of the CSVs, which made no difference at all.
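For anyone wanting to check the two placement rules above across a lot of jobs, here is a rough sketch in Python. It assumes you have already exported an inventory of which node each VM runs on, which CSV it lives on, and which node owns each CSV. The `VmPlacement` class, the `check_job_placement` function and all the sample names are made up for illustration and are not part of any Veeam or Microsoft tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmPlacement:
    """Hypothetical inventory record for one VM in a backup job."""
    name: str
    host_node: str  # cluster node currently running the VM
    csv_name: str   # CSV that holds the VM's disks


def check_job_placement(vms, csv_owner):
    """Return a list of problems; an empty list means both rules hold.

    vms       -- iterable of VmPlacement for a single backup job
    csv_owner -- dict mapping CSV name -> node that currently owns it
    """
    vms = list(vms)
    problems = []

    # Rule 1: all VMs in the job should run on the same node.
    nodes = sorted({vm.host_node for vm in vms})
    if len(nodes) > 1:
        problems.append("VMs are spread across nodes: " + ", ".join(nodes))

    # Rule 2: each VM's CSV should be owned by the node the VM runs on.
    for vm in vms:
        owner = csv_owner.get(vm.csv_name)
        if owner != vm.host_node:
            problems.append(
                f"{vm.name}: CSV {vm.csv_name} is owned by {owner}, "
                f"but the VM runs on {vm.host_node}"
            )
    return problems


if __name__ == "__main__":
    # Sample data only; node, CSV and VM names are invented.
    job = [
        VmPlacement("vm-a", "node1", "csv1"),
        VmPlacement("vm-b", "node2", "csv1"),
    ]
    for line in check_job_placement(job, {"csv1": "node1"}):
        print(line)
```

This only reports mismatches. It does not move VMs or change CSV ownership, so it would not remove the manual effort, just show where a job breaks the two rules.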
I've also tested backing up a VM using the Veeam Agent for Windows, and this backs up the VM without causing any pause at all, but again this would not be a manageable solution going forward.
So we are still stuck in a position where we are not able to move our customer base over to Veeam, yet we still have a minimum commit per month on the Veeam licensing.
I'm going to try getting the software from Dell as a trial that will allow us to perform backups using off-host proxies, to see if that resolves the issue. Again, not ideal, as this will introduce another 30k cost which we hadn't factored in.
Very, very frustrating.
I also found this post from a guy called Jason, who seems to have experienced the exact same issues we are having: microsoft-hyper-v-f25/hourly-backups-performance-t30967.html