Host-based backup of VMware vSphere VMs.
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Single 15TB volume running at 100MB/sec

Post by kbr »

Hi,

We have a new vSphere 7 environment running on HPE ProLiant hardware with iSCSI-attached MSA storage. We have a Windows 2019 VM with a single 15TB disk attached to it. We see backup performance of around 100MB/sec for this disk, which means it will take about 40 hours to back up. With a brand new setup using Veeam 10 we were hoping for much quicker backups. Can somebody give us reference numbers for transfer times of large disks, so we can see whether this is expected behavior? The network is 10Gb, the MSA is attached over 10Gb, ESXi is connected over 10Gb, and the transport mode is NBD.
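
For context, the 40-hour figure is simply disk size divided by read rate; a rough sketch of that arithmetic (decimal TB/MB, ignoring snapshot creation and merge overhead):

Code:

def backup_hours(size_tb: float, rate_mb_s: float) -> float:
    """Hours needed to read size_tb at a sustained rate_mb_s."""
    seconds = (size_tb * 1_000_000) / rate_mb_s   # TB -> MB (decimal), then MB / (MB/s)
    return seconds / 3600

print(round(backup_hours(15, 100), 1))   # ~41.7 h at the 100 MB/s we observe
print(round(backup_hours(15, 500), 1))   # ~8.3 h if a single stream reached 500 MB/s

So even a five-fold improvement would still leave a full read of this disk at roughly 8 hours.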
Egor Yakovlev
Product Manager
Posts: 2578
Liked: 707 times
Joined: Jun 14, 2013 9:30 am
Full Name: Egor Yakovlev
Location: Prague, Czech Republic
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Egor Yakovlev »

Greetings, Karl.

- Is the Veeam server physical or virtual?
- Did you add additional proxy servers or stick with the default?
- What does the job's Bottleneck counter show?

/Thanks!
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

We have a virtual Veeam server (but it's just the management server; it doesn't do the processing). The proxies are physical HPE Apollo nodes with 2x10Gb networking, Windows 2019 and HDD spindles.

The jobs are pointed at the physical proxies.

The Bottleneck counter shows the following:
Source 99%
Proxy 8%
Network 2%
Target 0%
Egor Yakovlev
Product Manager
Posts: 2578
Liked: 707 times
Joined: Jun 14, 2013 9:30 am
Full Name: Egor Yakovlev
Location: Prague, Czech Republic
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Egor Yakovlev »

Thanks for update!

I guess that would be the limit of the ESXi management interface.
Can you try switching to the Direct SAN Access transport mode instead, by presenting the iSCSI LUN directly to the proxy server in read-only mode?

/Thanks!
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

Hi, we were just testing that :-)
But it doesn't help: still around 100MB/sec, maybe 110, but not significantly faster!
Gostev
Chief Product Officer
Posts: 31802
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Gostev »

This means that your storage performance is the limiting factor, which the bottleneck analysis clearly indicates: Veeam spends 99% of the time waiting for the storage to return the requested data, while all other components in the data processing pipeline are just sitting idle.

If the performance does not sound right, then you should engage HPE storage support next.
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

Is there any way we can get this 100% clear? If we back up a VM with multiple disks we see throughput of 600MB/sec, so the storage can deliver. It just won't do it for a single disk, it seems.
Egor Yakovlev
Product Manager
Posts: 2578
Liked: 707 times
Joined: Jun 14, 2013 9:30 am
Full Name: Egor Yakovlev
Location: Prague, Czech Republic
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Egor Yakovlev »

To be 100% sure, it would be best to ask our Support team for assistance with troubleshooting.
iSCSI performance has too many variables to consider, so we can only guess without logs and performance tests.

/Thanks!
Gostev
Chief Product Officer
Posts: 31802
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Gostev »

kbr wrote: Nov 04, 2020 1:43 pm Is there any way we can get this 100% clear? If we back up a VM with multiple disks we see throughput of 600MB/sec, so the storage can deliver. It just won't do it for a single disk, it seems.
This is exactly what you should troubleshoot with HPE: why a single read stream from their storage is limited to 100 MB/s. Maybe there are some settings they can tweak in the array, considering that multiple read streams deliver much better total throughput.
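
As a starting point for that conversation, even a crude test like the sketch below can show whether one read stream is capped while several scale. The file path and sizes are placeholders, and a proper IOMeter or fio run with support involved is still the way to make it official:

Code:

import threading, time

PATH = r"D:\iotest.bin"       # hypothetical test file on the MSA-backed volume,
                              # much larger than proxy RAM so the OS cache doesn't skew results
BLOCK = 1024 * 1024           # 1 MiB reads
PER_STREAM = 2 * 1024**3      # 2 GiB read by each stream

def read_stream(offset: int) -> None:
    """Sequentially read PER_STREAM bytes starting at the given offset."""
    with open(PATH, "rb", buffering=0) as f:
        f.seek(offset)
        remaining = PER_STREAM
        while remaining > 0:
            chunk = f.read(min(BLOCK, remaining))
            if not chunk:
                break
            remaining -= len(chunk)

def run(streams: int) -> None:
    start = time.time()
    workers = [threading.Thread(target=read_stream, args=(i * PER_STREAM,))
               for i in range(streams)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    total_mb = streams * PER_STREAM / 1e6
    print(f"{streams} stream(s): {total_mb / (time.time() - start):.0f} MB/s")

run(1)   # if this sits near 100 MB/s...
run(4)   # ...while this approaches the ~600 MB/s seen with multi-disk jobs,
         # the per-stream limit is on the storage or fabric side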
jamcool
Enthusiast
Posts: 67
Liked: 11 times
Joined: Feb 02, 2018 7:56 pm
Full Name: Jason Mount
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by jamcool » 1 person likes this post

From my experience, NBD (network mode) on the proxy server is very slow for backups and will report Source as the bottleneck even if your VM host is not busy at all. We get at most maybe 40-60 MBps per disk, and even with only one disk backing up it is still 40-60 MBps. This appears to be a limit of NBD. If we switch to Virtual Appliance (Hot Add) for the proxy server, we get into the 300-600 MBps range. This is great for big storage and a small number of servers. If you are dealing with lots of small servers or incremental backups, NBD is the better solution, as it does not have to do all the hot-add operations, which can be very time consuming. We are 10 Gbps networking throughout, with local (SAS) storage on the repository server and several proxy servers in VMware (with VMXNET3 NICs).
PetrM
Veeam Software
Posts: 3622
Liked: 608 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by PetrM »

Hello,

@kbr I think it's a very good idea to check Hot Add performance while you're working with HPE on the slow single-stream read issue.
As far as I understand, this is the only mode that has not been tested yet.

Please don't forget to provide us with a case ID if you're already in touch with our support team.

Thanks!
Gostev
Chief Product Officer
Posts: 31802
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Gostev »

Indeed, Hot Add will always perform better in situations when the storage has slow read performance per stream, but scales linearly when more read streams are added. This is because in the Hot Add transport mode (as well as Direct NFS and Backup from Storage Snapshots) we can read data from VMDK ourselves - and we do this in a way that is optimized to perform best on enterprise-grade storage. For NBD and Direct SAN transport modes, reads are performed directly by VMware VDDK, and we cannot control how it does this. So for those stuck with using the latter transport modes, working with a storage vendor to optimize read performance per stream is the only way to improve performance.
JeroenL
Influencer
Posts: 18
Liked: 12 times
Joined: Feb 03, 2020 2:20 pm
Full Name: Jeroen Leeflang
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by JeroenL »

What is the storage configuration of the MSA?
I presume it is an MSA2050, hopefully an MSA2052 with the performance storage license in place and SSD drives as a performance tier?
What are the disk group configurations?

Is this 15TB volume on the same disk group as the other, smaller VMs?
And is there really a need for such a large vdisk? In all restore cases it is worse to restore a single 15TB volume than fifteen 1TB volumes. When there's a copy to tape and a single file needs to be restored, you will wish this disk wasn't this big, AND you need a minimum of 15TB of free space on the backup server to be able to restore from this disk.
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

Hi,

We have tried the Virtual Appliance (Hot Add) option; this also didn't help (speed less than 100MB/sec). I have also logged a call with HPE, and they did an IOMeter test and told me to go back to Veeam :-( From their perspective everything is fine and they see 500MB/sec throughput. I followed the guidelines in the Veeam / HPE document "HPE Reference Architecture for Veeam Availability Suite with HPE Apollo backup target". The case number is Veeam Support - Case # 04475499.

As for the 15TB volume, the customer can't change it to a smaller size because of an application that's not able to split its files across multiple disks. And indeed it is an MSA2052 with a performance tier.
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

Currently an incremental is running on the partially filled file server with no users on it, so there is no changed data. This is the current status:

Code:

Queued for processing at 7-11-2020 18:00:14 	
Required backup infrastructure resources have been assigned 	15:46:32
VM processing started at 8-11-2020 09:46:48 	
VM size: 21,2 TB (10,9 TB used) 	
Getting VM info from vSphere 	00:04
Using guest interaction proxy xxxxxxx (Different subnet) 	
Inventorying guest system 	00:16
Preparing guest for hot backup 	00:55
Releasing guest 	00:00
Creating VM snapshot 	00:00
Getting list of guest file system local users 	00:00
Saving [SAN01-A-VMFS02] xxxxx.vmx 	00:00
Saving [SAN01-A-VMFS02] xxxxx.vmxf 	00:00
Saving [SAN01-A-VMFS02] xxxxx.nvram 	00:00
Using backup proxy xxxxx for disk Hard disk 1 [nbd] 	00:00
Using backup proxy xxxxx for disk Hard disk 2 [nbd] 	00:00
Using backup proxy xxxxx for disk Hard disk 3 [nbd] 	00:00
Using backup proxy xxxxx for disk Hard disk 4 [nbd] 	00:00
Hard disk 1 (100 GB) 21,6 GB read at 75 MB/s [CBT]	04:58
Hard disk 4 (550 GB) 402,9 GB read at 58 MB/s [CBT]	01:59:41
Hard disk 3 (5,6 TB) 4 TB read at 81 MB/s [CBT]	14:29:48
Hard disk 2 (15 TB) 6,2 TB read at 75 MB/s [CBT]	23:59:11
As you can see, the disk is being read at 75MB/sec and only 3.2GB has actually been transferred so far, so it has been busy for over 24 hours just reading unchanged data.
Gostev
Chief Product Officer
Posts: 31802
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Gostev »

kbr wrote: Nov 09, 2020 8:39 am We have tried the Virtual Appliance (Hot Add) option; this also didn't help (speed less than 100MB/sec). I have also logged a call with HPE, and they did an IOMeter test and told me to go back to Veeam :-( From their perspective everything is fine and they see 500MB/sec throughput.
This means they are not using the correct IOMeter parameters, which obviously makes a huge difference. For example, they may have been testing streaming rather than random I/O (while incremental backup is random I/O). You should connect them directly with your Veeam support engineer, so they can repeat the test correctly and agree on where the issue is. You should not be in the middle of this; at the very least it is not efficient.
kbr wrote: Nov 09, 2020 8:51 am it has been busy for over 24 hours just reading unchanged data
Unchanged blocks are never read: thanks to vSphere changed block tracking, we only read blocks that the ESXi host has written data into since the previous backup. From your statistics, it would appear that you have lots of blocks that were overwritten with the same content as they had before... it is a strange workload you have running there.
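
To illustrate how much the access pattern alone changes the numbers, here is a minimal sequential-versus-random read comparison. The file path, 1 MiB block size and queue depth of 1 are assumptions for illustration only, not the exact parameters our support engineers will ask HPE to use:

Code:

import os, random, time

PATH = r"D:\iotest.bin"    # hypothetical test file, several GiB and larger than RAM
BLOCK = 1024 * 1024        # 1 MiB per read; real backup I/O sizes differ
COUNT = 2048               # 2 GiB worth of reads per pass

def throughput(randomize: bool) -> float:
    """MB/s for COUNT reads of BLOCK bytes, sequential or at random offsets."""
    slots = os.path.getsize(PATH) // BLOCK
    offsets = random.sample(range(slots), COUNT) if randomize else list(range(COUNT))
    start = time.time()
    with open(PATH, "rb", buffering=0) as f:
        for slot in offsets:
            f.seek(slot * BLOCK)
            f.read(BLOCK)
    return COUNT * BLOCK / 1e6 / (time.time() - start)

print(f"sequential: {throughput(False):.0f} MB/s")
print(f"random:     {throughput(True):.0f} MB/s")   # usually far lower on spinning disks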
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

Hi Gostev.

I agree that HPE and Veeam should look at this together; the question is how we can make that happen. HPE is simply stating that everything is working fine. I suggested the same as you did: we should feed IOMeter with the correct parameters for our environment (Veeam backup / vSphere). Now I'm looking for a way to accomplish that.

As for the unchanged data, as you can see in the example above:

Hard disk 2 (15 TB) 6,2 TB read at 75 MB/s [CBT] 23:59:11

This is a disk that has zero changes; the complete job only transferred 3GB of data. But still it ran for 24 hours...
Gostev
Chief Product Officer
Posts: 31802
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Gostev » 1 person likes this post

Just give your Veeam support engineer the HPE support case number, and ask them to work directly. They have the means to do so, and it's actually a very common situation for them to work directly with our partners. Tell them not to come back to you until they agree with each other on where the bottleneck is. It's very easy to determine with basic storage performance tests anyway; they just need to use the correct parameters.

We can only be sure it's not a Veeam issue. Not only are all other customers able to achieve at least a few times faster performance with the exact same code, but our performance statistics also clearly show that our engine spends 99% of the time doing nothing, simply waiting for the storage to return the requested data. This means it can only be a storage or fabric issue... and we just need to demonstrate this to HPE engineers without Veeam in the picture, so they can move forward with solving the real issue.
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

HPE says they have no way to contact Veeam. We have a meeting with them at 11:00am, but with no Veeam presence :-(.

The Veeam engineer said the following:
I have referenced the HPE case number in our case details and submitted a request for assistance from the next tier.
Unfortunately, it may take some time for the next-tier engineer to review the information and establish communications with the vendor.
So hopefully he will be able to get in contact with someone at HPE; the question remains how long this will take.

The incremental from last night ran better:

Hard disk 2 (15 TB) 1,7 TB read at 100 MB/s [CBT] 04:55:24

But again it hits what seems like a hard limit of 100 MB/s.

The copy job from one site to the other (from HPE Apollo to HPE Apollo) did the following:

Hard disk 2 (15 TB) 1,5 TB read at 1016 MB/s 27:18

So from slow disks to slow disks it runs at 1016 MB/s... that's ten times faster.
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

Hi, just an update: both cases are still ongoing (HPE / Veeam), and we still haven't seen any interaction between the two vendors.

Veeam is trying to get a vixDiskLib tool to run, but that isn't working at the moment.

Still not a lot of progress, unfortunately.
PetrM
Veeam Software
Posts: 3622
Liked: 608 times
Joined: Aug 28, 2013 8:23 am
Full Name: Petr Makarov
Location: Prague, Czech Republic
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by PetrM »

Let's wait a while; I'm sure that our engineers will perform all the necessary tests to demonstrate the real issue to our partners and to simplify the interaction with them.

Thanks!
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

Hi guys, I have a follow-up question. One of the HPE engineers found an article stating that with a 10Gb network, iSCSI and NBD as the transport method, the maximum throughput you can get is around 150MB/sec. I haven't seen confirmation of that yet, but if that's the case we are not going to fix the issue with NBD; we would then need to switch to another transport method, either Direct SAN or Virtual Appliance. Is anybody able to elaborate on the following?

Maximum possible speed when using iSCSI on a 10Gb network for a single-thread (single VMDK) backup:

NBD xxxMB/sec
SAN xxxMB/sec
Virtual Appliance xxxMB/sec

This would be of great help, because if there is no transport method that can reach, let's say, 500MB/sec with a single thread on 10Gb iSCSI, then we have come to an unfortunate but valid conclusion.
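
For reference, my back-of-the-envelope math on what a single 10Gb link could ever deliver (the 85% efficiency factor is just my own assumption for TCP/iSCSI overhead):

Code:

link_gbit = 10
wire_mb_s = link_gbit * 1000 / 8   # 1250 MB/s raw line rate for one 10GbE link
usable = wire_mb_s * 0.85          # roughly what survives protocol overhead (assumed)
print(f"wire: {wire_mb_s:.0f} MB/s, usable: ~{usable:.0f} MB/s")

So the wire itself leaves plenty of headroom above 500MB/sec; the question is how much of it each transport mode can actually use for a single stream.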
Gostev
Chief Product Officer
Posts: 31802
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Gostev » 1 person likes this post

You won't be able to get 500 MB/s with NBD for sure, because of current VMware limitations; however, with the other backup modes it should be quite realistic to reach 500 MB/s using a single thread on 10Gb Ethernet. Especially with Hot Add, where Veeam is able to read data directly from the VMDK by itself, in a fashion that is optimized for enterprise-grade storage arrays. VDDK is not as good at this in Direct SAN mode.
kbr
Enthusiast
Posts: 25
Liked: never
Joined: Oct 09, 2020 7:36 am
Full Name: Karl
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by kbr »

Hi Gostev, we changed to Virtual Appliance mode and the backup now runs at much higher speeds (500MB/sec plus), and a complete full backup now takes 6 hours instead of 24, so that's great news!

BUT :-), our replication job now uses Hot Add on the source side (Virtual Appliance mode) and NBD on the destination side. That's running slow again; is that because of the Hot Add / NBD combination? Do we need a Virtual Appliance proxy on the destination side to get it to go faster? The read speed is very good (I guess due to Hot Add), the transfer speed is at 56MB/sec, and the data is already in sync (I presume):

16-11-2020 06:19:59 :: Hard disk 1 (100 GB) 21,8 GB read at 312 MB/s [CBT]
16-11-2020 06:19:59 :: Hard disk 4 (550 GB) 402,9 GB read at 984 MB/s [CBT]
16-11-2020 06:20:07 :: Hard disk 2 (15 TB) 8,4 TB read at 507 MB/s [CBT]
Gostev
Chief Product Officer
Posts: 31802
Liked: 7298 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by Gostev »

Yes, of course.
FedericoV
Technology Partner
Posts: 36
Liked: 38 times
Joined: Aug 21, 2017 3:27 pm
Full Name: Federico Venier
Contact:

Re: Single 15TB volume running at 100MB/sec

Post by FedericoV » 2 people like this post

Veeam can be even faster if you run it on the right hardware configuration. In my lab, a single vSphere VM backup over FC from a storage snapshot runs constantly above 2.3GB/s for data with just 2:1 compression. For this test, my proxy and backup target was an HPE Apollo 4200.
With more VMs, or rather more write streams, it is possible to go faster. On the HPE Apollo 4510 I have tested full backups constantly above 5.5GB/s with 2:1 compression. Very likely it is possible to go faster still, but every lab has its own limitations.