Host-based backup of VMware vSphere VMs.
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

Is anybody running vSphere 6.5 with VMFS 6 datastores who can confirm whether the snapshot grain size is still being created at 8 sectors (4 KB), as is currently the case for >2 TB LUNs on VMFS 5?

We're potentially looking at rolling out a new environment with VMFS 6 datastores, but I don't want all our replication jobs to start running at a snail's pace because of the default change to SESparse.
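
For anyone who wants to check what an existing snapshot is actually using, one rough way is to read the createType field out of the snapshot delta descriptors; note this only tells you the format, not the grain size, and it's just a sketch rather than an official method. A minimal Python example, with the VM folder path being a placeholder:

Code:

# Rough check of which sparse format existing snapshot deltas use, by reading
# the createType line from the small text descriptor .vmdk files.
# The folder path is a placeholder - point it at a real VM directory.
import glob
import re

VM_FOLDER = "/vmfs/volumes/datastore1/testvm"  # placeholder path

for path in sorted(glob.glob(f"{VM_FOLDER}/*.vmdk")):
    # Skip the binary data extents; we only want the small text descriptors.
    if any(tag in path for tag in ("-delta", "-sesparse", "-flat")):
        continue
    with open(path, errors="ignore") as f:
        text = f.read(4096)  # descriptors are tiny text files
    match = re.search(r'createType="([^"]+)"', text)
    if match:
        # Expect roughly "vmfsSparse" for legacy deltas, "seSparse" for SESparse.
        print(f"{path}: {match.group(1)}")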
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

Hoping this might be covered by Gostev's talk on vSphere 6.5.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by Gostev »

Yes, I believe it remains 4KB.

But how, in your opinion, can SESparse with 4 KB grains be worse than legacy vmfsSparse with 512-byte grains? Especially considering the metadata storage and caching optimizations in the SESparse format.

I am checking with QC to see whether they have seen any performance issues with replication jobs to VMFS 6 datastores.
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

With legacy vmfsSparse I see 64 KB IOs; I think something happens with snapshots on legacy VMFS which grows them in 64 KB chunks. If I replicate a LUN bigger than 2 TB with legacy VMFS, I see 4 KB IOs instead and the replication runs extremely slowly. That was my concern with VMFS 6.
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

Forgot to add: I only see the 4 KB IOs on 2 TB+ LUNs if I use hot-add. With NBD they are 64 KB again.
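
To put rough numbers on it: if each 1 MB Veeam block simply gets split into IOs of the observed size on its way to the datastore (an assumption based on what I see, not a statement of how the sparse formats actually batch writes), the difference looks like this:

Code:

# Back-of-the-envelope: how many backend writes one 1 MB Veeam block becomes
# if it is simply split into IOs of the observed size (an assumption from
# observation, not a description of how vmfsSparse/SESparse coalesce writes).
VEEAM_BLOCK = 1024 * 1024  # 1 MB "Local" block size

for label, io_size in [("64 KB IOs (legacy VMFS / NBD)", 64 * 1024),
                       ("4 KB IOs (>2 TB disk with hot-add)", 4 * 1024)]:
    print(f"{label}: {VEEAM_BLOCK // io_size} writes per block")
# 64 KB IOs (legacy VMFS / NBD): 16 writes per block
# 4 KB IOs (>2 TB disk with hot-add): 256 writes per block

Sixteen times the number of writes for the same amount of changed data is roughly where the slowdown comes from.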
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by Gostev »

Apparently this also maps to what our QC has been seeing (no replication performance impact from VMFS 6 when using NBD, and a significant impact when using hot-add). I've asked the devs to dig deeper into this, as I have one idea about what could be causing it.
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

Awesome, glad this is finally getting some traction.

It's definitely down to the IO size. My replicas are sitting on a Linux NFS server, and using iostat I can see the difference in IO sizes between <2 TB and >2 TB disks when using hot-add. If you guys find a way to set the grain size on snapshot creation to match the IO size that the Veeam proxy submits, I will be a very happy man, as performance would go through the roof at all VMDK sizes.
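
In case anyone wants to reproduce the observation, the average write size on the NFS server's backing device can be watched with something like the Python sketch below, which samples /proc/diskstats and works out the same figure iostat reports as the average request size. The device name is a placeholder:

Code:

# Minimal sketch: watch the average write size hitting a block device by
# sampling /proc/diskstats (same idea as iostat's average request size).
# DEVICE is a placeholder - use whatever device backs the NFS export.
import time

DEVICE = "sdb"  # placeholder device name


def read_stats(dev):
    with open("/proc/diskstats") as f:
        for line in f:
            parts = line.split()
            if parts[2] == dev:
                # field 8 = writes completed, field 10 = sectors written (512 B each)
                return int(parts[7]), int(parts[9])
    raise ValueError(f"device {dev!r} not found in /proc/diskstats")


prev_writes, prev_sectors = read_stats(DEVICE)
while True:
    time.sleep(5)
    writes, sectors = read_stats(DEVICE)
    dw, ds = writes - prev_writes, sectors - prev_sectors
    prev_writes, prev_sectors = writes, sectors
    if dw:
        print(f"{dw} writes, average write size {ds * 512 / dw / 1024:.0f} KB")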

FYI, here's a link showing that vmfsSparse uses a 128-sector / 64 KB grain for snapshots of disks <2 TB:
https://kb.vmware.com/selfservice/micro ... Id=1015180
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by Gostev »

obroni wrote: If you guys find a way to set the grain size on snapshot creation
There's no way to control that, for sure.
obroni wrote: FYI, here's a link showing that vmfsSparse uses a 128-sector / 64 KB grain for snapshots of disks <2 TB
https://kb.vmware.com/selfservice/micro ... Id=1015180
Strange, because here's what their CTO says in another article:

The vmfsSparse format, used by snapshots and linked clones, has a grain size of 512 bytes or 1 sector. The vmfsSparse format gets 16 MB chunks at a time from VMFS, but then allocates them 512 bytes at a time. This is the root cause of many of the performance/alignment complaints that we currently get with linked clones/snapshots, and what we are addressing with SE Sparse Disks.
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

I can 100% guarantee that the IOs submitted from ESXi to the storage are 64 KB when a snapshot is taken with vmfsSparse; I've traced them right down from the proxy to the storage. In Windows on the proxy VM I can see that 1 MB IOs are submitted by the Veeam proxy service, but as they go through the ESXi storage layer they turn into 64 KB IOs. The initial replica seed, which uses NFS direct, shows 1 MB IOs (or whatever replication block size is configured).
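
For completeness, one quick way to watch the write sizes from the Windows side of the proxy is the PhysicalDisk "Avg. Disk Bytes/Write" performance counter; the small sketch below drives it via the built-in typeperf tool. This is just an illustration, and the _Total instance is an example; a specific disk instance could be used to isolate the hot-added disk:

Code:

# Sketch: sample the average write size Windows sees on the proxy VM, using
# the built-in typeperf tool and the PhysicalDisk "Avg. Disk Bytes/Write"
# counter. _Total is an example instance; pick the hot-added disk instead
# to isolate the replica traffic.
import subprocess

COUNTER = r"\PhysicalDisk(_Total)\Avg. Disk Bytes/Write"

# -si 1: one-second samples; -sc 30: stop after 30 samples
subprocess.run(["typeperf", COUNTER, "-si", "1", "-sc", "30"], check=True)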

Are you sure the grain size can't be set? It would be so awesome to be able to have 1 MB IOs to replica snapshots. The article you linked hints that it might be possible:
With the introduction of SE Sparse disks, the grain size/block allocation unit size is now tuneable and can be set based on the preferences of a particular storage array or application. Note however that this full tuning capability will not be exposed in vSphere 5.1.
Oh, and here's the support case I logged last year, in case it has any interesting details for you. It didn't really make much progress and ended up with us using a dedicated proxy in NBD mode for VMs bigger than 2 TB.

01751221
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

Actually, I'm going to withdraw that statement about 64 KB IOs. It was correct when I had that case open last year, but I've since checked a couple of jobs with <2 TB disks and some are showing full-size IOs. Let me do some more digging into what is going on and I will report back.
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by Gostev »

obroni wrote: Are you sure the grain size can't be set
Positively, at least not for individual snapshots... there are just a few parameters you can specify:
http://pubs.vmware.com/vsphere-65/index ... 0_1_10_1_0

What you are quoting sounds like, even if there is such a setting, it would be global.
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

Hi Gostev,

As promised, here's my comprehensive report on the IO sizes I see during replication. This was done with a test Linux VM: I used dd with random data to make sure the replica job had something to copy each time, and then used ftrace on the Linux NFS server to capture the IO sizes going to the datastore. The filesystem is XFS, and the Veeam job is set to use 1 MB blocks (Local).
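
To make the raw trace output below easier to digest, here's the sort of quick parser one could run over the captured lines to bucket the data-write sizes. A rough sketch, not part of the original capture: it just counts the BYTES column for WS entries and skips the 32 KB WSM lines, which are the XFS journal writes.

Code:

# Rough helper to summarise captured trace lines like the ones pasted below:
# count how many data writes (WS) fall into each IO size bucket, skipping the
# WSM entries (the XFS journal writes) and the header line.
import sys
from collections import Counter

sizes = Counter()
for line in sys.stdin:
    parts = line.split()
    # Expected columns: COMM PID TYPE DEV BLOCK BYTES LATms
    if len(parts) < 7 or parts[2] != "WS":
        continue
    sizes[int(parts[5])] += 1

for size, count in sorted(sizes.items()):
    print(f"{size / 1024:7.0f} KB : {count} IOs")

Feeding each capture below through this gives a quick histogram of IO sizes per scenario.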

First up, the initial copy of the VM (a <2 TB disk) over to the replica ESXi host, using hot-add:

Code:

COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
kworker/1:2  404    WSM  248,0    4294868864   32768      1.49
kworker/1:2  404    WS   248,0    868360       1048576     9.55
kworker/1:2  404    WSM  248,0    4294868928   32768      1.42
kworker/1:2  404    WS   248,0    870408       1048576     8.46
kworker/0:1  25616  WSM  248,0    4294868992   32768      2.48
kworker/1:2  404    WS   248,0    872456       1048576     8.96
kworker/1:2  404    WSM  248,0    4294869056   32768      1.46
kworker/0:1  25616  WS   248,0    874504       1044480    10.07
kworker/1:2  404    WSM  248,0    4294869120   32768      1.44
As you can see, we are seeing 1 MB IOs, each followed by the XFS journal write.

Now let's run it again to see the effect of writing into a VMware snapshot:

Code:

COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
kworker/2:0  19     WS   248,0    3310504      1048576     8.25
kworker/3:2  3873   WSM  248,0    4295030336   32768      1.47
kworker/3:2  3873   WS   248,0    3310472      16384      1.37
kworker/3:2  3873   WSM  248,0    4295030400   32768      1.47
kworker/2:0  19     WS   248,0    3312552      1048576     7.75
kworker/3:2  3873   WSM  248,0    4295030464   32768      1.50
kworker/3:2  3873   WS   248,0    3310472      16384      1.38
kworker/3:2  3873   WSM  248,0    4295030528   32768      1.64
kworker/3:2  3873   WS   248,0    3314600      16384      1.34
kworker/3:2  3873   WSM  248,0    4295030592   32768      1.59
kworker/3:2  3873   WS   248,0    458752       4096       1.36
kworker/3:2  3873   WSM  248,0    4295030656   32768      1.74
kworker/3:2  3873   WS   248,0    3314632      1048576     8.16
As well as the 1 MB IOs, we now see an extra 16 KB IO. I'm not sure what this is for, but it must be something to do with how snapshots work.

And now let's do the same thing, but this time using NBD:

Code:

COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
kworker/2:0  19     WS   248,0    9226288      1048576    10.59
kworker/2:0  19     WSM  248,0    4295575488   32768      1.76
kworker/2:0  19     WS   248,0    9226256      16384      1.55
kworker/2:0  19     WSM  248,0    4295575552   32768      1.56
kworker/2:0  19     WS   248,0    9228336      16384      1.41
kworker/2:0  19     WSM  248,0    4295575616   32768      1.61
kworker/2:0  19     WS   248,0    3244072      4096       1.15
kworker/2:0  19     WSM  248,0    4295575680   32768      1.52
kworker/3:1  10321  WS   248,0    9228368      1048576     9.02
kworker/2:0  19     WSM  248,0    4295575744   32768      1.69
kworker/2:0  19     WS   248,0    9228336      16384      1.63
kworker/2:0  19     WSM  248,0    4295575808   32768      1.88
kworker/3:1  10321  WS   248,0    3244032      4096       1.13
kworker/0:1  25616  WS   248,0    6783848      16384      1.71
Pretty similar, but I think we see a couple of extra 16 KB IOs and maybe also an extra 4 KB IO.

Then I expand the disk to >2 TB and try again with NBD:

Code:

COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
kworker/1:0  28517  WS   248,0    18769064     1048576     7.99
kworker/0:1  25616  WSM  248,0    4296303808   32768      1.51
kworker/0:1  25616  WS   248,0    429208       12288      1.65
kworker/1:0  28517  WSM  248,0    4296303872   32768      1.65
kworker/3:1  10321  WS   248,0    18771112     1048576     8.63
kworker/2:0  19     WSM  248,0    4296303936   32768      1.51
kworker/2:0  19     WS   248,0    429240       8192       1.39
kworker/2:0  19     WSM  248,0    4296304000   32768      1.31
Similar to before: I'm still seeing 1 MB IOs, but now with some additional IOs which again seem to be related to snapshots.

And now the >2 TB disk with hot-add:

Code:

COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
kworker/2:0  19     WS   248,0    21716864     4096       1.13
kworker/2:0  19     WSM  248,0    4296809664   32768      1.48
kworker/3:2  23386  WS   248,0    21716872     4096       1.12
kworker/2:0  19     WSM  248,0    4296809728   32768      1.37
kworker/3:2  23386  WS   248,0    21716880     4096       1.12
kworker/2:0  19     WSM  248,0    4296809792   32768      1.46
kworker/3:2  23386  WS   248,0    21716888     4096       1.14
kworker/2:0  19     WSM  248,0    4296809856   32768      1.44
kworker/3:2  23386  WS   248,0    21716896     4096       1.30
kworker/2:0  19     WSM  248,0    4296809920   32768      1.37
kworker/3:2  23386  WS   248,0    21716904     4096       1.19
kworker/2:0  19     WSM  248,0    4296809984   32768      1.58
kworker/3:2  23386  WS   248,0    21716912     4096       1.11
kworker/2:0  19     WSM  248,0    4296810048   32768      1.45
kworker/3:2  23386  WS   248,0    21716920     4096       1.10
kworker/2:0  19     WSM  248,0    4296810112   32768      1.94
Wow! Tons and tons of tiny 4 KB IOs instead of 1 MB IOs.

And for completeness, this is the IO profile when Veeam applies the retention policy:

Code:

COMM         PID    TYPE DEV      BLOCK        BYTES     LATms
kworker/3:1  10321  WS   248,0    1786896      262144     2.83
kworker/3:1  10321  WSM  248,0    4295617920   32768      1.79
kworker/3:1  10321  WS   248,0    1787408      262144     2.61
kworker/3:1  10321  WSM  248,0    4295617984   32768      1.40
kworker/3:1  10321  WS   248,0    1787920      262144     2.59
kworker/3:1  10321  WSM  248,0    4295618048   32768      1.28
kworker/3:1  10321  WS   248,0    1788432      262144     2.70
kworker/3:1  10321  WSM  248,0    4295618112   32768      1.30
It looks like snapshot removal operations are performed in 256 KB chunks.

Hope that helps.

PS: Any idea what the 16 KB IOs are when writing to a snapshot?
obroni
Service Provider
Posts: 131
Liked: 22 times
Joined: Nov 21, 2014 10:50 pm
Full Name: Nick Fisk
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by obroni »

Hi Gostev, not sure if you got the notification for this; was it useful in any way?
cronosinternet
Influencer
Posts: 21
Liked: 9 times
Joined: Oct 31, 2012 1:05 pm
Full Name: Lee Christie
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by cronosinternet »

Hi guys

I've been sitting here scratching my head for a while. We've had a stack of things change at once: the destination SAN, a move from VMFS 5 to VMFS 6, updates to VMware, and Update 3 for Veeam 9.5.

Our replication speed has just fallen off the planet. Replication is fine for the initial run, where we're seeing 300 MB/s, but subsequent "top-up" replications are down to around 8 MB/s.

Having just seen this thread, I tracked back through the changes, and the first time I saw the speed drop was when I changed nothing but the filesystem format from VMFS 5 to VMFS 6.

Is this still an issue, then?
Gostev
Chief Product Officer
Posts: 31460
Liked: 6648 times
Joined: Jan 01, 2006 1:01 am
Location: Baar, Switzerland
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by Gostev »

Hello! I believe the current recommendation is to use NBD on the target side, along with 9.5 U3, which addresses the forced NBDSSL issue.

There's a long-standing case with VMware on this, and they recently suggested some possible workarounds that might improve performance. We will prototype around those once we have some available R&D resources, but even if it helps, the improvement is not expected to be an order of magnitude, and it requires significant re-architecture, so it's not a quick fix. Thus, using NBD is the best bet for the foreseeable future, especially if you have 10 Gb Ethernet for your management network.
cronosinternet
Influencer
Posts: 21
Liked: 9 times
Joined: Oct 31, 2012 1:05 pm
Full Name: Lee Christie
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by cronosinternet »

Hmm, we have a number of proxies in that datacentre which do 90% replication work (on the target side) but also some backup work. We keep our management network separate and have a dedicated Veeam transport network, so we would prefer to stick with hot-add.

I think I'll just wipe all the replicas and reformat with VMFS 5 then. It makes no obvious difference to us :)
stevil
Expert
Posts: 113
Liked: 2 times
Joined: Sep 02, 2010 2:23 pm
Full Name: Steve B
Location: Manchester, UK
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by stevil »

I've just logged a ticket for what seems to be exactly this issue. Our source is VMFS 5 and the target is VMFS 6. Does the slow replication speed go away once we upgrade our source datastores to VMFS 6? I presume this is still an outstanding issue with VMware.
stevil
Expert
Posts: 113
Liked: 2 times
Joined: Sep 02, 2010 2:23 pm
Full Name: Steve B
Location: Manchester, UK
Contact:

Re: vSphere 6.5 VMFS 6 Snapshot Grain size

Post by stevil » 1 person likes this post

Replying to myself here. After some testing, it seems only the target proxy needs to be in NBD mode; the source can stay in virtual appliance (hot-add) mode. A bit of a pain, but we can work around it.