Discussions specific to the VMware vSphere hypervisor
willjohnson
Novice
Posts: 4
Liked: 2 times
Joined: Jul 11, 2018 8:02 am
Contact:

Increase in Hot Add backup time after installing Update 3a

Post by willjohnson » Jul 11, 2018 2:51 pm 2 people like this post

[UPDATE] August 24, 2018
Solution is to install this hotfix > KB2711


Hi,

Is anyone else experiencing a huge increase in backup or replication times since updating to U3a? (In addition to the SQL problem).

e.g. Backup job normally taking 3.5hrs now taking over 21hrs since updating.

After the update, the logs have additional [ViProxyEnvironment] lines, including "The proxy has NBD mode", while the Veeam GUI still says HDDs are being backed up using [hotadd].

The logs also have new [ViProxyEnvironment] entries: "The proxy cannot be used for write" and "The proxy has not SAN mode".

Before updating, logs had no mention of NBD mode etc. or [ViProxyEnvironment].

Case # 03096521

Cheers,

Will

foggy
Veeam Software
Posts: 18291
Liked: 1568 times
Joined: Jul 11, 2011 10:22 am
Full Name: Alexander Fogelson
Contact:

Re: Increase in backup/replication times after U3a installat

Post by foggy » Jul 12, 2018 8:53 am

Hi Will, these messages seem to be vSphere 6.7 related. Please continue investigating with your support engineer.

Re: Increase in backup/replication times after U3a installat

Post by willjohnson » Jul 12, 2018 8:56 am

Hi foggy,

Messages weren't in logs before U3a update, regardless of vSphere 6.7.

Cheers.

Re: Increase in backup/replication times after U3a installat

Post by foggy » Jul 12, 2018 8:58 am

Yes, because vSphere 6.7 support was added in U3a update.

Re: Increase in backup/replication times after U3a installat

Post by willjohnson » Jul 12, 2018 9:00 am

We're on 6.5.

Waiting for support to respond.

Cheers

Aron.Stocker
Lurker
Posts: 1
Liked: never
Joined: Jul 11, 2018 8:20 am
Full Name: Aron Stocker
Contact:

Re: Increase in backup/replication times after U3a installat

Post by Aron.Stocker » Jul 16, 2018 5:05 am

Hi, we have the same issue (much longer backups and replications).
We're interested too, please post any reply you receive from support.
Thanks

Re: Increase in backup/replication times after U3a installat

Post by foggy » Jul 16, 2018 12:18 pm

I would recommend contacting support directly, since the reasons for that might be different and depend on a particular environment.

Gostev
SVP, Product Management
Posts: 24812
Liked: 3573 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Increase in backup/replication times after U3a installat

Post by Gostev » Jul 16, 2018 1:07 pm

It looks like the issue is caused by the hot add process taking too long (20-30 min), and it appears to stem from some change in the latest VDDK version that Update 3a uses: in the original case, the workaround was to replace the VDDK libraries with the ones used in Update 3, which brought hot add times back down to normal (1-2 min).

We also already know that the issue does not affect every environment, so there's something else to it.

We will be opening a support case with VMware on this.

omegagx
Enthusiast
Posts: 57
Liked: 3 times
Joined: May 09, 2017 6:33 pm
Full Name: Michael Gorn
Contact:

Re: Increase in backup/replication times after U3a installat

Post by omegagx » Jul 16, 2018 1:59 pm

OK, please keep us updated on this. We will hold off on installing Update 3a until this is resolved.

humbertoz
Novice
Posts: 5
Liked: never
Joined: Jul 16, 2018 10:45 pm
Full Name: Robert Zarate
Contact:

[MERGED] Issue with update 3a Hotadd and VMs with many disks

Post by humbertoz » Jul 16, 2018 11:46 pm

I know a lot of people have been fighting with the SQL issues that "appeared" with this update 3a. In reality, it is just how Veeam decided to change how the SQL backups run. The way they are doing it now is better, but everyone who got caught not having the exact "documentation" permissions in their SQL servers had their jobs break when they were running fine in the past.

Unfortunately, we ran into a big issue with one customer who has a couple of VMs with lots of drives (as in 8+ each). We have not observed this issue with any of our other customers, but everybody else has VMs with 4 drives max. When their jobs run, all lower disk count VMs process perfectly as they always have. Once one of the VMs with high disk counts starts to process, all currently processing VMs take from 30 minutes to two hours to connect each hotadd disk. Even disks that are already connected and transferring data pause when a disk hotadd is in its limbo state. Basically, the entire job pauses during the disk hotadd timeout. Once the high disk VM has finally finished backing up, all remaining VMs in the job run as expected. This has caused their incremental jobs to go from 25 minutes to 2-4 hours.

We have a ticket with Veeam and they have confirmed a bug with update 3a and high disk count VMs. They do not see the issue in debug against the VDDK, but when the software runs, obviously the issue is there. They have confirmed several other tickets coming in with the same symptoms. They are not seeing this issue with jobs configured with NBD (network mode).

This customer runs on a very fast, three server cluster with 20Gb vSAN. It runs incredibly well and hotadd is perfect in their scenario. NBD is much slower for them since their VMs are on several different networks/DMZs and their firewalls are 1Gb. Their internet is 1Gb also, so they do not have a huge need for 10Gb or 20Gb firewalls, which would speed up their jobs if they were to switch to NBD. Hotadd allows the drives to be ripped at 20Gb regardless of network design/complexity, which is beautiful.

Just a warning for everybody out there. Since you have to have VMs with lots of drives (we do not know the exact number, but at least more than four), luckily it will not affect a huge number of people, but for the ones it does affect, it will be painful.

FYI, Veeam ticket # 03095693

Re: Increase in Hot Add backup time after installing Update

Post by Gostev » Jul 17, 2018 12:03 am

Thanks for the hint on VMs with a large number of drives being the culprit. If this is confirmed and the issue is indeed with the latest VDDK version, a simple temporary hotfix could be for us to automatically force Network transport for VMs with more than X disks. We will be trying to reproduce the issue internally now.

Re: Increase in backup/replication times after U3a installat

Post by humbertoz » Jul 17, 2018 12:39 am

My bad... I looked around everywhere for hotadd and high disk count posts, but never found this one since the title was worded a little bit differently. Luckily, my post was merged into this one to help out everybody already here.

As additional info, the customer that is affected by this was on vCenter and ESXi 6.5. After upgrading to 3a, they were hit by these issues. Since they were on 3a now, we were able to upgrade their vCenter and ESXi from 6.5 to 6.7. Unfortunately, their problems persist, so the issue is in the 6.7 VDDK from VMware that Veeam is using in 3a to be compatible with vCenter and ESXi 6.7. Since the customer was previously on vCenter and ESXi 6.5, it would seem that the VMware 6.7 VDDK also talks the same bad language to 6.5.

I'm not sure if the root of this issue is in how Veeam interacts with the VDDK, or if it is how the VDDK interacts with vCenter and the hosts after getting all its instructions from Veeam. If the VDDK just needs to be instructed differently by Veeam, then Veeam will be able to fix this with another patch. If the VDDK needs to interact with vCenter and the hosts differently, then the fix will have to come from VMware as a patch to the VDDK and then Veeam will have to integrate the newly patched VDDK into a patch 3b for all of us.

Veeam support, please correct any of this, or provide any further insight into how we will be able to resolve this issue. Unfortunately, failover for high disk count VMs to NBD would not be a solution for this customer, because then backups would "slow" down to 1Gb. The additional time to process the VMs versus the "broke" hotadd would probably be the same or more for their monthly full backups. For incrementals, it would take longer than before ("working" hotadd), but I'm positive it would still be faster than the current "broke" hotadd. Our support rep mentioned that this is basically the biggest issue they are working on right now coming out of 3a. They said it is in tier 3 and R&D, so hopefully it is getting a lot of traction over there.

Luckily, out of all our customers, only one has VMs with lots of drives and got hit by this bug.
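
To put rough numbers on the 1Gb versus 20Gb point above, here is a back-of-the-envelope sketch in Python. It assumes decimal units and ideal line rate with no protocol, firewall, or storage overhead, and the 1 TB size is purely illustrative; `transfer_hours` is a made-up helper name. Real jobs will be slower on both paths, so treat the output as a ratio rather than a prediction.

```python
def transfer_hours(size_tb: float, link_gbps: float) -> float:
    """Ideal time in hours to move size_tb terabytes over a link_gbps link.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Gb/s = 1e9 bits/s)
    and zero overhead, which no real backup job achieves.
    """
    bits = size_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9)
    return seconds / 3600


if __name__ == "__main__":
    size_tb = 1.0  # illustrative amount of data, not a figure from the ticket
    for label, gbps in (("NBD through a 1Gb firewall", 1.0),
                        ("hotadd on a 20Gb vSAN", 20.0)):
        hours = transfer_hours(size_tb, gbps)
        print(f"{label}: about {hours:.2f} h per {size_tb} TB")
```

The only point of the sketch is that the same data takes 20 times longer at 1Gb than at 20Gb in the ideal case, which is why forcing NBD is not an attractive workaround for this customer's full backups.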

Re: Increase in Hot Add backup time after installing Update

Post by Gostev » Jul 17, 2018 10:56 am

First tests are done and we could not confirm an issue using multiple VMs with 11 disks. To be continued...

spiritie
Enthusiast
Posts: 76
Liked: 7 times
Joined: Mar 01, 2016 10:16 am
Full Name: Gert van Niekerk
Location: Denmark
Contact:

Re: Increase in Hot Add backup time after installing Update

Post by spiritie » Jul 17, 2018 12:53 pm

We've also seen a huge increase in backup time, and are not able to meet our backup window anymore.

Busy adding another proxy to see if it helps.

Running VMware 6.5, and all proxies are VMs using hotadd.

We don't have many VMs with many disks, though, but we do have some large jobs.

EDIT:
Just did an inventory of the VMs in one of our vCenters. Out of around 400-500 VMs, only 9 VMs have more than 3 disks (ranging between 4 and 7 disks).

Around 70-75% of the VMs have 1 disk, and 20-25% have 2 disks; the rest have more.

Re: Increase in Hot Add backup time after installing Update

Post by humbertoz » Jul 17, 2018 1:54 pm

Gostev wrote:First tests are done and we could not confirm an issue using multiple VMs with 11 disks. To be continued...
Gostev,

There has to be something to this. You can look at our ticket, which will have logs uploaded. This customer has three backup jobs with 10, 13 and 26 VMs and two replication jobs with 9 and 10 VMs. There are only two VMs that go into the hotadd "pausing job" issue. One is a SQL server in job #1 with 10 VMs. The other is a backend Exchange server in job #2 with 13 VMs. The third job with 26 VMs is not affected. Neither are the two replication jobs, but as I mentioned, neither the two replication jobs nor the third backup job have those two VMs in them (or any other VMs with high disk count). No matter what order we move those two VMs in their jobs (first, middle, last), the jobs tank as soon as they reach those two VMs.

The issue has to be related to drive quantity. It could maybe be total drive sizes? Those two VMs are their largest VMs. If you add up all 8/9 drives, the SQL server is 2.2TB and the Exchange server is 3.5TB, but this is all spread out on 100GB, 200GB, 500GB drives for both VMs. They have three other VMs that are file servers with hundreds of thousands of files and they back up fine. They are 2.0TB and 1.7TB in size. Those three have only two drives, a small 60GB OS drive and then the 1.7TB to 2.0TB drive. I'll list things I can think of that might influence this.

#1 - VMs with high quantity of disks
#2 - All their VMs including the two affected ones use the VMware Paravirtual SCSI controller. Not sure if you tested this in conjunction with high quantity of disks.
#3 - VMs with large disks. Those two VMs are their biggest ones. Well over 2TB in total disks added up. Their next largest VMs are exactly 2.0TB and down. Not sure if there is a magic threshold of 2TB for this issue.
#4 - Application aware processing VMs. All their VMs run with this on. The two affected VMs are a SQL and backend Exchange that have it enabled for obvious transaction log reasons. Again, not sure if you tested this in conjunction with high quantity of disks.
#5 - The current alignment of planets and moon causing this. I'm at a loss with this issue. No matter what we do to the backup jobs, when it hits those two VMs, they blow up.

Hopefully you can get some insight in your testing. If you need any specific changes tested on our end, just let me know and we'll try it. Everybody just wants to get to the source of this evil.

Re: Increase in Hot Add backup time after installing Update

Post by humbertoz » Jul 17, 2018 2:05 pm

Spiritie,

Can you look at your jobs and see where the slowdown is, to maybe shed some light on commonalities with our setup? When we look at our jobs, we can easily see the issue. All VMs process correctly (with low incremental backup times of 5-10 minutes) until they hit those two "bad" VMs. Then the backup time for those VMs and any other VMs processing at the same time blows up to 2-4 hours. After those two VMs are processed, all remaining VMs are processed correctly again (5-10 minutes). You can literally see it in the backup summary report without even digging deeper. The "duration" column will be good, good, good, bad, bad, bad, then good, good again as soon as the "bad" VMs are done processing (good VMs are also affected while the "bad" VMs are processing concurrently, but everything goes back to normal as soon as the bad VMs are done). This customer only has two VMs with high disk counts out of 60 VMs, but those two are causing a real headache. Hopefully, when you find which VMs are causing an issue, you can share specs on them.

Re: Increase in Hot Add backup time after installing Update

Post by Gostev » Jul 17, 2018 2:08 pm

@Gert for those on 6.5, the quick and confirmed fix is to roll VDDK back to the version used in Update 3 - feel free to contact our support for assistance with this.

Re: Increase in Hot Add backup time after installing Update

Post by spiritie » Jul 18, 2018 7:17 am

Gostev wrote: @Gert for those on 6.5, the quick and confirmed fix is to roll VDDK back to the version used in Update 3 - feel free to contact our support for assistance with this.
Hi Gostev, we are upgrading vCenter to 6.7 tomorrow (not ESXi). Is this bug in vCenter or in ESXi?

ottl05
Influencer
Posts: 18
Liked: 4 times
Joined: Oct 16, 2014 11:29 am
Contact:

Re: Increase in Hot Add backup time after installing Update

Post by ottl05 » Jul 18, 2018 7:20 am

We have the same issue: VMs on 6.5, hotadd, and an increase in backup time.

What's the best way to solve this?

mcz
Expert
Posts: 301
Liked: 53 times
Joined: Jul 19, 2016 8:39 am
Full Name: Michael
Contact:

Re: Increase in Hot Add backup time after installing Update

Post by mcz » Jul 18, 2018 7:24 am

ottl05 wrote: We have the same issue: VMs on 6.5, hotadd, and an increase in backup time.

What's the best way to solve this?
I think rolling back VDDK, as Anton Gostev wrote:
@Gert for those on 6.5, the quick and confirmed fix is to roll VDDK back to the version used in Update 3 - feel free to contact our support for assistance with this.

Re: Increase in Hot Add backup time after installing Update

Post by spiritie » Jul 18, 2018 7:37 am

humbertoz wrote: Spiritie,

Can you look at your jobs and see where the slowdown is to maybe shed some commonalities with our setup? When we look at our jobs, we can easily see the issue. All VMs process correctly (with low incremental backup times 5-10 minutes) until they hit those two "bad" VMs. Then the backup time for those VMs and any other VMs processing at the same time blows up to 2-4 hours. After those two VMs are processed, all remaining VMs are processed correctly again (5-10 minutes). You can literally see it in the backup summary report without even digging deeper. The "duration" column will be good, good, good, bad, bad, bad, then good , good again as soon as the "bad" VMs are done processing (good VMs are also affected while the "bad" VMs are processing concurrently, but everything goes back to normal as soon as the bad VMs are done). This customer only has two VMs with high disk counts out of 60 VMs, but those two are causing a real headache. Hopefully, when you find which VMs are causing an issue, you can share specs on them.
I've looked through them, but cannot find any common thread among the "bad" VMs. I have 1 job with 27 VMs in it, and the only 2 VMs that were slow (a bit over 1 hour each) were tiny VMs that only had 40 GB and 1 disk on them.

But this seems to be random, because I have a lot of VMs that worked fine yesterday but "failed" last night.

All the slow VMs look like this in the log: each gets stuck hot adding the disk, then fails over to NBD mode and is processed fairly quickly:

17-07-2018 21:35:24 :: Using backup proxy VMware Backup Proxy for disk Hard disk 1 [hotadd]
17-07-2018 22:47:36 :: Unable to hot add source disk, failing over to network mode...
17-07-2018 22:47:38 :: Hard disk 1 (40,0 GB) 1,1 GB read at 22 MB/s [CBT]
17-07-2018 22:48:47 :: Removing VM snapshot
17-07-2018 22:49:08 :: Finalizing

Gostev has reported that there is a fix for VMware vSphere 6.5
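
The gap between the first two log lines above is the time spent waiting on hot add. A minimal Python sketch for pulling that number out of a job log follows, assuming the `DD-MM-YYYY HH:MM:SS :: message` layout shown in the excerpt (other locales may format the timestamp differently). `parse` and `hotadd_wait` are illustrative helper names, not part of any Veeam tooling.

```python
from datetime import datetime

# First three lines of the excerpt posted above, copied verbatim.
LOG = """\
17-07-2018 21:35:24 :: Using backup proxy VMware Backup Proxy for disk Hard disk 1 [hotadd]
17-07-2018 22:47:36 :: Unable to hot add source disk, failing over to network mode...
17-07-2018 22:47:38 :: Hard disk 1 (40,0 GB) 1,1 GB read at 22 MB/s [CBT]
"""


def parse(line):
    """Split one log line into (timestamp, message)."""
    stamp, _, message = line.partition(" :: ")
    return datetime.strptime(stamp, "%d-%m-%Y %H:%M:%S"), message


def hotadd_wait(log_text):
    """Return the time between the [hotadd] proxy line and the failover line.

    Returns None if the log never fails over to network mode.
    """
    started = None
    for line in log_text.splitlines():
        if " :: " not in line:
            continue  # skip anything that is not a timestamped entry
        when, message = parse(line)
        if message.endswith("[hotadd]"):
            started = when
        elif "failing over to network mode" in message and started is not None:
            return when - started
    return None


print(hotadd_wait(LOG))  # 1:12:12
```

For the excerpt above this gives 1 hour 12 minutes of waiting before a read that itself takes about a minute.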

Re: Increase in Hot Add backup time after installing Update

Post by mcz » Jul 18, 2018 7:42 am

I had such issues in the past when a proxy's bios.uuid wasn't unique within the same vCenter. This would be the case if, e.g., someone replicated proxies to another host but still within the same vCenter. Has anybody checked whether this is the case?
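
If you can export a list of proxy names and their bios.uuid values (from the vSphere client, PowerCLI, or however you prefer; the export step is left open here), a duplicate check is only a few lines of Python. The names and UUIDs below are made up for illustration, and `duplicate_uuids` is a hypothetical helper.

```python
from collections import defaultdict


def duplicate_uuids(proxies):
    """Map each bios.uuid that occurs more than once to the VM names sharing it.

    `proxies` is an iterable of (vm_name, bios_uuid) pairs. UUIDs are
    compared case-insensitively with surrounding whitespace ignored.
    """
    by_uuid = defaultdict(list)
    for name, uuid in proxies:
        by_uuid[uuid.strip().lower()].append(name)
    return {uuid: names for uuid, names in by_uuid.items() if len(names) > 1}


# Made-up inventory: proxy03 is a replica that kept the uuid of proxy01.
inventory = [
    ("proxy01", "564d1a2b-0000-0000-0000-000000000001"),
    ("proxy02", "564d1a2b-0000-0000-0000-000000000002"),
    ("proxy03", "564D1A2B-0000-0000-0000-000000000001"),
]

for uuid, names in duplicate_uuids(inventory).items():
    print(f"{uuid} is shared by: {', '.join(names)}")
```

An empty result means every proxy in the list has its own bios.uuid.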

Re: Increase in Hot Add backup time after installing Update

Post by humbertoz » Jul 18, 2018 8:19 am

Spiritie,

Thanks for the info. Unfortunately, it looks like your issue is a little different from our customer's. Your hotadd takes a long time to process, like theirs, but in the end it fails and switches over to NBD (network) mode. Our customer's two "bad" VMs take a long time for the hotadd process, just like you are seeing, but the hotadd never fails. It finally completes and then the VM begins the drive read and backup process until it pauses again. Their "bad" VMs are never random. Yours seem to be a little random. I do not see anything in common between your issue and ours except that hotadd is having problems. Hopefully, Gostev is able to track some issues down and find a common source for all our ailments.

Re: Increase in Hot Add backup time after installing Update

Post by ottl05 » Jul 20, 2018 3:22 pm 1 person likes this post

I have an open ticket (#03106424 ) about rollback vddk version, but nothing is happening :-(

Re: Increase in Hot Add backup time after installing Update

Post by ottl05 » Jul 23, 2018 6:20 am

ottl05 wrote:I have an open ticket (#03106424 ) about rollback vddk version, but nothing is happening :-(
I changed the vddk-files and now the backup times are back to "normal".

thanks.

DominikM
Influencer
Posts: 11
Liked: 1 time
Joined: Nov 19, 2014 12:44 pm
Full Name: Dominik Meier
Contact:

Re: Increase in Hot Add backup time after installing Update

Post by DominikM » Jul 23, 2018 7:25 am

I've been having the same problem since I upgraded to 9.5 U3a. Since we're still on vSphere 6.5, I've requested a downgrade of the VDDK version.

Re: Increase in Hot Add backup time after installing Update

Post by omegagx » Jul 26, 2018 9:34 pm

So this issue doesn't occur on ESX 6.0 U2 ?

AlexWhit
Enthusiast
Posts: 40
Liked: never
Joined: Jul 21, 2011 10:10 am
Full Name: Alex Whittaker
Contact:

Re: Increase in Hot Add backup time after installing Update

Post by AlexWhit » Jul 30, 2018 7:33 am

Hi, I have seen this and got told by Veeam support that it is normal.

Andreas Neufert
Veeam Software
Posts: 3834
Liked: 687 times
Joined: May 04, 2011 8:36 am
Full Name: Andreas Neufert
Location: Germany
Contact:

Re: Increase in Hot Add backup time after installing Update

Post by Andreas Neufert » Jul 30, 2018 8:33 am

omegagx wrote:So this issue doesn't occur on ESX 6.0 U2 ?
Hi,

The issue is not within the hypervisor; it is within the VMware Virtual Disk Development Kit (VDDK) that backup vendors need to add to their products.
For vSphere 6.7 compatibility reasons we upgraded to VDDK 6.7, which includes the issue.

Customers who have not upgraded to vSphere 6.7 can stay on our Update 3 (older VDDK) or contact support to downgrade the VDDK in Update 3a. Let's hope that VMware fixes the VDDK soon.

ChuckS42
Expert
Posts: 109
Liked: 13 times
Joined: Apr 24, 2013 8:53 pm
Full Name: Chuck Stevens
Contact:

Re: Increase in Hot Add backup time after installing Update

Post by ChuckS42 » Jul 30, 2018 9:39 pm 1 person likes this post

Has VMware acknowledged the issue and started working on a fix?
