-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
vSphere 6.5 VMFS 6 Snapshot Grain size
Is anybody running vSphere 6.5 with VMFS 6 datastores who can confirm whether the snapshot grain size is still 8 sectors (4KB), as is currently the case for >2TB LUNs on VMFS 5?
We're potentially looking at rolling out a new environment with VMFS 6 datastores, but I don't want all our replication jobs to start running at a snail's pace because of the default change to SESPARSE.
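(One rough way to confirm which format the delta actually gets created in is to look at the snapshot's descriptor file from the ESXi shell; this is only a sketch, and the datastore/VM/snapshot file names below are placeholders:)
Code: Select all
# Placeholder paths - point this at a delta created right after taking a snapshot
cat /vmfs/volumes/MyDatastore/MyVM/MyVM-000001.vmdk
# The extent line should show the delta type: VMFSSPARSE for a legacy redo-log
# delta (*-delta.vmdk) or SESPARSE for an SEsparse delta (*-sesparse.vmdk)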
-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Hoping this might be covered by Gostev's talk on vSphere 6.5
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Yes, I believe it remains 4KB.
But how can SESparse with 4KB grains be worse than the legacy vmfsSparse format with 512-byte grains, in your opinion? Especially considering the metadata storage and caching optimizations in the SESparse format.
I am checking with QC to see if they saw any performance issues with replication jobs to VMFS 6 datastores.
-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
With legacy vmfsSparse, I see 64KB IOs; something about snapshots on legacy VMFS grows them in 64KB chunks. If I replicate a LUN bigger than 2TB with legacy, then I see 4KB IOs and the replication goes extremely slowly. This was my concern with VMFS 6.
-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Forgot to add: I only see 4KB IOs with 2TB+ LUNs if I use Hot Add. With NBD they are 64KB again.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Apparently this also matches what our QC has been seeing (no replication performance impact from VMFS 6 when using NBD, and significant impact when using Hot Add). I've asked the devs to dig into this deeper; I have one idea of what could be causing it.
-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Awesome, glad this is finally getting some traction.
It's definitely down to the IO size. My replicas are sitting on a Linux NFS server, and I can see the difference between the <2TB and >2TB IO sizes using iostat when using Hot Add. If you guys find a way to set the grain size on snapshot creation to match the IO size that the Veeam proxy submits, then I will be a very happy man, as performance would go through the roof at all VMDK sizes.
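For reference, this is roughly what I'm watching on the NFS server (the device name is just a placeholder); on the sysstat version here, avgrq-sz is the average request size in 512-byte sectors, so ~2048 means 1MB writes, ~128 means 64KB and ~8 means 4KB:
Code: Select all
# Extended device stats every second for the datastore's backing device (sdb is a placeholder)
iostat -x sdb 1
# Watch the avgrq-sz column: average request size in 512-byte sectors
# (2048 ~ 1MB, 128 ~ 64KB, 8 ~ 4KB)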
FYI, here's a link showing that vmfsSparse uses a 128-sector/64KB grain for snapshots of <2TB disks:
https://kb.vmware.com/selfservice/micro ... Id=1015180
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
obroni wrote: If you guys find a way to set the grain size on snapshot creation
There's no way to control that, for sure.

obroni wrote: FYI, link showing vmfsSparse uses 128 sector/64KB grain for snapshots <2TB
https://kb.vmware.com/selfservice/micro ... Id=1015180
Strange, because here's what their CTO says in another article:
The vmfsSparse format, used by snapshots and linked clones, has a grain size of 512 bytes or 1 sector. The vmfsSparse format gets 16MB chunks at a time from VMFS, but then allocates them 512 bytes at a time. This is the root cause of many of the performance/alignment complaints that we currently get with linked clones/snapshots, and what we are addressing with SE Sparse Disks.
-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
I can 100% guarantee that the IOs submitted from ESXi to the storage are 64KB when a snapshot is taken with vmfsSparse. I've traced them right down from the proxy to the storage. I can see in Windows on the proxy VM that 1MB IOs are submitted by the Veeam proxy service, but as they go through the ESXi storage layer they turn into 64KB IOs. The initial replica seed, which uses NFS direct, shows 1MB IOs (or whatever replication block size is configured).
Are you sure the grain size can't be set? That would be so awesome, being able to have 1MB IOs to replica snapshots. The article you linked hints it might be possible:
With the introduction of SE Sparse disks, the grain size/block allocation unit size is now tuneable and can be set based on the preferences of a particular storage array or application. Note however that this full tuning capability will not be exposed in vSphere 5.1.

Oh, and here's the support case I logged last year, in case it has any interesting details for you. It didn't really make much progress and ended up with a dedicated proxy using NBD for VMs bigger than 2TB: 01751221
-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Actually, I'm going to withdraw that statement about 64KB IOs. It was correct when I had that call open last year, but I've checked a couple of jobs with <2TB disks and some are showing full-size IOs. Let me do some more digging into what is going on and I will report back.
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
obroni wrote: Are you sure the grain size can't be set
Positive, at least not for individual snapshots... there are just a few parameters you can specify:
http://pubs.vmware.com/vsphere-65/index ... 0_1_10_1_0
What you are quoting sounds like even if there is some setting, it would be global.
-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Hi Gostev,
As promised, here's my comprehensive report on the IO sizes I see during replication. This was done with a test Linux VM; I used dd with random data to make sure the replica job had something to copy each time, then used ftrace on the Linux NFS server to capture the IO sizes going to the datastore. The filesystem is XFS, and the Veeam job is set to use 1MB blocks (Local).
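For reference, the commands were roughly as follows. The file name, sizes and trace duration are just examples, and the tracer shown is the ftrace-based iosnoop script from Brendan Gregg's perf-tools, which matches the column format in the traces below:
Code: Select all
# Inside the test VM: dirty some blocks with random data so the next replica pass has changes to copy
dd if=/dev/urandom of=/tmp/testfile bs=1M count=1024 conv=fsync

# On the Linux NFS server: trace block IO sizes hitting the datastore's backing device for 30 seconds
git clone https://github.com/brendangregg/perf-tools
cd perf-tools
./iosnoop 30   # prints COMM PID TYPE DEV BLOCK BYTES LATms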
First up, the initial copy of the VM over to the replica ESXi host, for a <2TB disk using Hot Add:
Code: Select all
COMM PID TYPE DEV BLOCK BYTES LATms
kworker/1:2 404 WSM 248,0 4294868864 32768 1.49
kworker/1:2 404 WS 248,0 868360 1048576 9.55
kworker/1:2 404 WSM 248,0 4294868928 32768 1.42
kworker/1:2 404 WS 248,0 870408 1048576 8.46
kworker/0:1 25616 WSM 248,0 4294868992 32768 2.48
kworker/1:2 404 WS 248,0 872456 1048576 8.96
kworker/1:2 404 WSM 248,0 4294869056 32768 1.46
kworker/0:1 25616 WS 248,0 874504 1044480 10.07
kworker/1:2 404 WSM 248,0 4294869120 32768 1.44
As you can see, we get a 1MB IO followed by the XFS journal write.

Now let's run it again to see the effect of writing into a VMware snapshot:
Code: Select all
COMM PID TYPE DEV BLOCK BYTES LATms
kworker/2:0 19 WS 248,0 3310504 1048576 8.25
kworker/3:2 3873 WSM 248,0 4295030336 32768 1.47
kworker/3:2 3873 WS 248,0 3310472 16384 1.37
kworker/3:2 3873 WSM 248,0 4295030400 32768 1.47
kworker/2:0 19 WS 248,0 3312552 1048576 7.75
kworker/3:2 3873 WSM 248,0 4295030464 32768 1.50
kworker/3:2 3873 WS 248,0 3310472 16384 1.38
kworker/3:2 3873 WSM 248,0 4295030528 32768 1.64
kworker/3:2 3873 WS 248,0 3314600 16384 1.34
kworker/3:2 3873 WSM 248,0 4295030592 32768 1.59
kworker/3:2 3873 WS 248,0 458752 4096 1.36
kworker/3:2 3873 WSM 248,0 4295030656 32768 1.74
kworker/3:2 3873 WS 248,0 3314632 1048576 8.16
We now see, as well as the 1MB IO, an extra 16KB IO. I'm not sure what this is for, but it must be something to do with how snapshots work.

And now let's do the same thing, but this time using NBD:
Code: Select all
COMM PID TYPE DEV BLOCK BYTES LATms
kworker/2:0 19 WS 248,0 9226288 1048576 10.59
kworker/2:0 19 WSM 248,0 4295575488 32768 1.76
kworker/2:0 19 WS 248,0 9226256 16384 1.55
kworker/2:0 19 WSM 248,0 4295575552 32768 1.56
kworker/2:0 19 WS 248,0 9228336 16384 1.41
kworker/2:0 19 WSM 248,0 4295575616 32768 1.61
kworker/2:0 19 WS 248,0 3244072 4096 1.15
kworker/2:0 19 WSM 248,0 4295575680 32768 1.52
kworker/3:1 10321 WS 248,0 9228368 1048576 9.02
kworker/2:0 19 WSM 248,0 4295575744 32768 1.69
kworker/2:0 19 WS 248,0 9228336 16384 1.63
kworker/2:0 19 WSM 248,0 4295575808 32768 1.88
kworker/3:1 10321 WS 248,0 3244032 4096 1.13
kworker/0:1 25616 WS 248,0 6783848 16384 1.71
Pretty similar, but I think we see a couple of extra 16KB IOs and maybe also an extra 4KB IO.

Then I expand the disk to >2TB and try again with NBD:
Code: Select all
COMM PID TYPE DEV BLOCK BYTES LATms
kworker/1:0 28517 WS 248,0 18769064 1048576 7.99
kworker/0:1 25616 WSM 248,0 4296303808 32768 1.51
kworker/0:1 25616 WS 248,0 429208 12288 1.65
kworker/1:0 28517 WSM 248,0 4296303872 32768 1.65
kworker/3:1 10321 WS 248,0 18771112 1048576 8.63
kworker/2:0 19 WSM 248,0 4296303936 32768 1.51
kworker/2:0 19 WS 248,0 429240 8192 1.39
kworker/2:0 19 WSM 248,0 4296304000 32768 1.31
Similar to before: I'm still seeing 1MB IOs, but now with some additional IOs which again seem to be related to snapshots.

And now the >2TB disk with Hot Add:
Code: Select all
COMM PID TYPE DEV BLOCK BYTES LATms
kworker/2:0 19 WS 248,0 21716864 4096 1.13
kworker/2:0 19 WSM 248,0 4296809664 32768 1.48
kworker/3:2 23386 WS 248,0 21716872 4096 1.12
kworker/2:0 19 WSM 248,0 4296809728 32768 1.37
kworker/3:2 23386 WS 248,0 21716880 4096 1.12
kworker/2:0 19 WSM 248,0 4296809792 32768 1.46
kworker/3:2 23386 WS 248,0 21716888 4096 1.14
kworker/2:0 19 WSM 248,0 4296809856 32768 1.44
kworker/3:2 23386 WS 248,0 21716896 4096 1.30
kworker/2:0 19 WSM 248,0 4296809920 32768 1.37
kworker/3:2 23386 WS 248,0 21716904 4096 1.19
kworker/2:0 19 WSM 248,0 4296809984 32768 1.58
kworker/3:2 23386 WS 248,0 21716912 4096 1.11
kworker/2:0 19 WSM 248,0 4296810048 32768 1.45
kworker/3:2 23386 WS 248,0 21716920 4096 1.10
kworker/2:0 19 WSM 248,0 4296810112 32768 1.94
Wow!! Tons and tons of tiny 4KB IOs instead of 1MB IOs.

And for completeness, this is the IO profile when Veeam applies the retention policy:
Code: Select all
COMM PID TYPE DEV BLOCK BYTES LATms
kworker/3:1 10321 WS 248,0 1786896 262144 2.83
kworker/3:1 10321 WSM 248,0 4295617920 32768 1.79
kworker/3:1 10321 WS 248,0 1787408 262144 2.61
kworker/3:1 10321 WSM 248,0 4295617984 32768 1.40
kworker/3:1 10321 WS 248,0 1787920 262144 2.59
kworker/3:1 10321 WSM 248,0 4295618048 32768 1.28
kworker/3:1 10321 WS 248,0 1788432 262144 2.70
kworker/3:1 10321 WSM 248,0 4295618112 32768 1.30
It looks like the snapshot remove operations are performed in 256KB chunks.

Hope that helps.
PS. Any idea what the 16KB IOs are when writing into a snapshot?
-
- Expert
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Hi Gostev, not sure if you got the notification for this. Was it useful in any way?
-
- Influencer
- Posts: 21
- Liked: 9 times
- Joined: Oct 31, 2012 1:05 pm
- Full Name: Lee Christie
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Hi guys
I've been sitting here scratching my head for a while. We've had a stack of things change at once: the destination SAN, a move from VMFS5 to VMFS6, updates to VMware, and Update 3 for Veeam 9.5.
Our replication speed has just fallen off the planet. Replication is fine for the initial run, where we see 300MB/s, but subsequent "top up" replications are down to around 8MB/s.
Having just seen this thread, I tracked back through the changes, and the first time I saw the speed drop was when I changed just the filesystem format from VMFS5 to VMFS6.
Still an issue, then?
-
- Chief Product Officer
- Posts: 31806
- Liked: 7300 times
- Joined: Jan 01, 2006 1:01 am
- Location: Baar, Switzerland
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Hello! I believe the current recommendation is to use NBD on the target side, along with 9.5 U3, which addresses the forced NBDSSL issue.
There's a long-standing case with VMware on this, and they recently suggested some possible workarounds that might improve performance. We will prototype around those once we have some R&D resources available, but even if it helps, the improvement is not expected to be an order of magnitude, and it requires significant re-architecture, so it's not a quick fix. Thus, using NBD is the best bet for the foreseeable future, especially if you have 10Gb Ethernet for your management network.
-
- Influencer
- Posts: 21
- Liked: 9 times
- Joined: Oct 31, 2012 1:05 pm
- Full Name: Lee Christie
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Hmm, we have a number of proxies in that datacentre which do 90% replication work (on targets) but also some backup work. We keep our management network separate and have a dedicated Veeam transport network, so we would prefer to stick with Hot Add.
I think I'll just wipe all the replicas and reformat with VMFS5 then. It makes no obvious difference to us.
-
- Expert
- Posts: 114
- Liked: 3 times
- Joined: Sep 02, 2010 2:23 pm
- Full Name: Steve B
- Location: Manchester, UK
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
I've just logged a ticket which seems to be exactly this issue. Our source is VMFS5, the target is VMFS6. Does the slow replication speed go away once we upgrade our source datastores to VMFS6? I presume this is still an outstanding issue with VMware.
-
- Expert
- Posts: 114
- Liked: 3 times
- Joined: Sep 02, 2010 2:23 pm
- Full Name: Steve B
- Location: Manchester, UK
- Contact:
Re: vSphere 6.5 VMFS 6 Snapshot Grain size
Replying to myself here. After some testing, it seems only the target proxy needs to be in NBD mode; the source can stay in Virtual Appliance (Hot Add) mode. A bit of a pain, but we can work around it.