-
- Service Provider
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Veeam Replication IO Sizes
Can I just put a general query out to everyone to see if they are seeing the same thing as me?
When doing replication jobs, do you see the IO sizes on the source/target datastores happening as 128KB IOs, regardless of the actual block size setting of the job (e.g. WAN, LAN, Local)?
I'm just looking at esxtop and I can see data being written in what looks to be 128KB IOs, but my replication jobs are set to LAN block size. To get that figure I'm dividing the MBWRTN/s by WRITES/s.
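The arithmetic behind that figure is worth pinning down; a minimal sketch (the sample counters below are illustrative, not actual esxtop output):

```python
# Average write size implied by esxtop throughput counters:
# write throughput (MB/s) divided by write operations per second.
def avg_io_size_kb(mb_written_per_sec, writes_per_sec):
    """Return the average write size in KB implied by esxtop counters."""
    return (mb_written_per_sec * 1024) / writes_per_sec

# e.g. 45 MB/s at 360 writes/s works out to the 128 KB IOs described above
print(avg_io_size_kb(45.0, 360))  # 128.0
```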
-
- VeeaMVP
- Posts: 6139
- Liked: 1932 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Veeam Replication IO Sizes
Hi Nick,
I've never done such tests to be honest, but I guess it's because once Veeam sends data to the ESXi stack, it then writes data following the block size of the underlying VMFS file system, and even if a block created by Veeam is larger, it gets divided into smaller blocks before being written. No different from other file systems when writing blocks that are larger than the cluster size... But again, no direct experience, as this is just how VMFS works anyway, so no real interest in it for me...
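The splitting behaviour hypothesized above can be sketched in a few lines (a purely illustrative model, not actual VMFS behaviour):

```python
# Illustrative sketch: a large block handed to the storage stack being
# divided into smaller fixed-size writes, as hypothesized above.
def split_io(total_kb, max_io_kb):
    """Return the list of IO sizes a single large write would be split into."""
    full, rem = divmod(total_kb, max_io_kb)
    return [max_io_kb] * full + ([rem] if rem else [])

# A 1 MB Veeam block split into 128 KB IOs yields eight writes
print(split_io(1024, 128))  # [128, 128, 128, 128, 128, 128, 128, 128]
```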
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Service Provider
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: Veeam Replication IO Sizes
Hi Luca,
I'm not 100% convinced that's the case: if I run IOMeter in a Windows VM generating 1MB blocks, I see this get passed down all the way to esxtop. Windows 2008+ splits anything larger into 1MB blocks, but ESX itself should allow anything up to 32MB, I think. I've been doing some more digging, and I think Veeam is submitting the IOs at the correct size, from what I can see in perfmon on the proxy, but this isn't making its way down to the ESX storage. I'm wondering if this could potentially be linked to some sort of sector misalignment with the vmdk mounting driver... I will continue to have a hunt around and report back on what I find.
This is quite an important performance factor, as on a near-idle HP MSA array running the above IOMeter test I get:
128KB blocks = 45MB/s
1MB blocks = 85MB/s
Writing to our Ceph storage cluster reveals an even larger difference between block sizes.
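Those throughput figures imply very different IOPS rates at the two block sizes, which is why the IO size matters so much; a quick calculation using the MSA numbers quoted above:

```python
# IOPS rate implied by a given throughput at a given IO size.
# Figures are the MSA test results quoted above.
def iops_for_throughput(mb_per_sec, io_size_kb):
    """Return the IOPS needed to sustain a throughput at a fixed IO size."""
    return mb_per_sec * 1024 / io_size_kb

print(iops_for_throughput(45, 128))   # 360.0 IOPS for 45 MB/s at 128 KB
print(iops_for_throughput(85, 1024))  # 85.0 IOPS for 85 MB/s at 1 MB
```

In other words, the smaller IOs force the array to service over four times as many operations for roughly half the throughput.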
-
- Service Provider
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: Veeam Replication IO Sizes
Just a quick update: I ran IOMeter directly on one of our 2008 proxies and was seeing IO splitting as described previously. I have just in-place upgraded this VM to 2012 R2 and no longer see this behaviour, so something changed between 2008 and 2012 R2. I will kick off some replication jobs and see if the correct IO sizes are being passed down.
-
- VeeaMVP
- Posts: 6139
- Liked: 1932 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Veeam Replication IO Sizes
Uhm, now it's becoming interesting indeed. Thanks for the updates Nick, eager to see the next results.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Service Provider
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: Veeam Replication IO Sizes
Replication seems to be going a little faster with the 2012 proxy, but I'm still seeing much smaller IO sizes on the ESX side. I'm now wondering if this is due to the way Veeam handles retention points for replicas by using snapshots. Maybe the average IO size I'm seeing in esxtop is down to the way VMware handles IO when a VM disk has a snapshot. I will try some IOMeter tests on normal VMs with and without snapshots and see if this makes a difference.
-
- Service Provider
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: Veeam Replication IO Sizes
After taking a snapshot on a VM running IOMeter, I see similar results to what I see during replication jobs. So I'm pretty much convinced that snapshots are the cause of the smaller average IO seen in esxtop.
-
- VeeaMVP
- Posts: 6139
- Liked: 1932 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Veeam Replication IO Sizes
Interesting... thanks for sharing. Even if using snapshots is part of the supported way of doing things in VMware, I'm not sure how this can be solved. Maybe VVOLs will be better in the future; they have a different snapshot technology, but I have no details on the advantages they could bring in terms of IO size.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Service Provider
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: Veeam Replication IO Sizes
Luca, slightly unrelated, but you might find this blog article I wrote interesting as well:
http://www.sys-pro.co.uk/blog/2015/veea ... -patterns/
It plots disk seeks on a graph to show the difference between reverse and forward backup types.
-
- VeeaMVP
- Posts: 6139
- Liked: 1932 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Veeam Replication IO Sizes
Lovely!! Thanks for sharing Nick.
And indeed, in my paper about repository performance, I used direct IO in fio to avoid flushing and get "pure" results, even if there are always multiple caches in between working at different layers. Have you thought about using a "better" fs like XFS or Btrfs? I always use XFS in my deployments, for example... I don't feel confident enough yet to move to Btrfs.
Luca
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1
-
- Service Provider
- Posts: 131
- Liked: 22 times
- Joined: Nov 21, 2014 10:50 pm
- Full Name: Nick Fisk
- Contact:
Re: Veeam Replication IO Sizes
No problem, hope you found it an interesting read.
I've just read your paper: very comprehensive, a very good read. Can I just check whether the array you tested on had battery-backed write-back cache? The IO numbers, particularly for the forward incrementals, look a little low. If you can set your chunk size so that a Veeam block lands in the same region as a full stripe, I've found you can easily get into the GB/s range. Thanks to the fact that Veeam doesn't seem to request flushes, having large amounts of RAM in Linux massively helps the scheduler get all the data into a sequential pattern; I saw a massive improvement going from 12GB to 128GB of RAM. I'm not sure if Windows buffers as aggressively as this. Merges also go really fast, as hopefully most of the data is still in the page cache.
I did initially set the storage server up with XFS, but every couple of months I was getting soft kernel panics. I never managed to get to the bottom of it before I switched to EXT4, which has been performing without problems since. I probably do need to revisit different filesystems at some point, but it's working so well that I haven't managed to justify the time to look into it. I've been doing a lot of work with Ceph recently, trying to get erasure coding/cache tiering into a usable state, as particularly for Cloud Connect we see this as the best path forward for future expansion. We're currently using it for storing replicas, but due to the high latency Ceph brings, and without a front-end cache, it has a few performance limitations.
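The full-stripe alignment mentioned above comes down to simple arithmetic; a sketch (disk counts and chunk sizes here are illustrative, not from a specific array):

```python
# A RAID full stripe is chunk_size x number of data disks. If the Veeam
# block size is a multiple of the full stripe, writes can avoid the
# read-modify-write penalty. Figures below are illustrative.
def full_stripe_kb(chunk_kb, data_disks):
    """Return the full-stripe size in KB for a RAID set."""
    return chunk_kb * data_disks

def is_full_stripe_aligned(veeam_block_kb, chunk_kb, data_disks):
    """True if the Veeam block covers whole stripes exactly."""
    return veeam_block_kb % full_stripe_kb(chunk_kb, data_disks) == 0

# e.g. 8 data disks with a 128 KB chunk give a 1024 KB full stripe,
# which a 1 MB (Local target) Veeam block covers exactly
print(full_stripe_kb(128, 8))                # 1024
print(is_full_stripe_aligned(1024, 128, 8))  # True
```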
-
- VeeaMVP
- Posts: 6139
- Liked: 1932 times
- Joined: Jul 26, 2009 3:39 pm
- Full Name: Luca Dell'Oca
- Location: Varese, Italy
- Contact:
Re: Veeam Replication IO Sizes
I used a NetApp FAS2020, with a vmdk over it acting as a data disk for my repository, so indeed it has proper cache.
I've used XFS a lot and never had kernel panics, to be honest, but I agree that for a repository you need something that makes you confident, so if ext4 is the solution for you, that's fine.
Finally, for the RAM, actually the biggest difference is whether or not you use v8 Update 2 with the new caching system, more than anything else.
Luca Dell'Oca
Principal EMEA Cloud Architect @ Veeam Software
@dellock6
https://www.virtualtothecore.com/
vExpert 2011 -> 2022
Veeam VMCE #1